Motivation
Recently, I bought an Nvidia RTX 4060 Ti (8GB) graphics card. To make full use of it, I want to use it on Ubuntu for games, CUDA programming, deep learning, and more. However, getting an Nvidia graphics card to work well on Ubuntu takes some setup. This post records the settings I use.
Installing Nvidia Graphics Card Driver on Ubuntu
Check Graphics Card Information
First, we need to check our graphics card information. Open a terminal and list the graphics hardware that the system detects.
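A minimal sketch of such a check, assuming the lspci tool from the pciutils package is available. The helper name find_nvidia_gpus is only illustrative:

```shell
# Filter lspci-style output (read from stdin) down to NVIDIA display devices.
find_nvidia_gpus() {
    grep -i -E 'VGA compatible controller|3D controller|Display controller' \
        | grep -i 'nvidia'
}

# Query the real hardware only if lspci is installed.
if command -v lspci >/dev/null 2>&1; then
    lspci | find_nvidia_gpus || echo "No NVIDIA display device found"
fi
```

If the card is listed only by a numeric device ID, an out-of-date PCI ID database is a likely cause, and running sudo update-pciids may refresh the names.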
If you have an Nvidia graphics card on your computer, the output will contain a line that mentions NVIDIA Corporation. For some reason, my computer displays NVIDIA Corporation Device 2803 instead of RTX 4060 Ti, but it doesn’t matter. We just need to know that this is an Nvidia graphics card.
Install Nvidia Graphics Card Driver
- First, remove any Nvidia graphics card drivers that may have been installed.
- Install any missing dependencies.
- Add the Nvidia graphics card driver PPA repository and update the package list.
- Install the Nvidia graphics card driver.
- Restart the computer.
- Confirm that the graphics card driver is installed successfully.
If the driver reports your graphics card instead of an error, congratulations, your Nvidia graphics card driver is installed successfully.
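A minimal sketch of these steps, assuming the graphics-drivers PPA and the ubuntu-drivers tool are used to pick the recommended driver; other setups may need a specific nvidia-driver package instead. The run helper and the APPLY variable are illustrative names: commands are only printed unless APPLY=1 is set, so the sequence can be reviewed first.

```shell
# Print each command instead of running it unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

install_nvidia_driver() {
    # Remove previously installed NVIDIA driver packages.
    run sudo apt-get purge -y '^nvidia-.*'
    # Assumption: these packages provide the tools used below.
    run sudo apt-get install -y software-properties-common ubuntu-drivers-common
    # Add the graphics-drivers PPA and refresh the package index.
    run sudo add-apt-repository -y ppa:graphics-drivers/ppa
    run sudo apt-get update
    # Assumption: let ubuntu-drivers choose the recommended driver.
    run sudo ubuntu-drivers autoinstall
}

install_nvidia_driver
# After a reboot, nvidia-smi should list the card and the driver version.
```

Given my experience described in Troubleshooting, it is probably safer to disable Nouveau before the first reboot.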
Disable Nouveau Graphics Card Driver
Nouveau is an open-source driver for Nvidia graphics cards. Its performance is not as good as the official closed-source Nvidia driver, and as described in Troubleshooting, it can apparently keep the official driver from working, so we need to disable it.
- Check if the Nouveau graphics card driver is loaded.
If the output lists the nouveau module, the Nouveau graphics card driver is loaded.
- Disable the Nouveau graphics card driver.
- Update initramfs so that the change applies to the kernel modules loaded at boot time.
- Restart the computer.
- Confirm that the Nouveau graphics card driver is disabled.
If there is no output, it means that the Nouveau graphics card driver has been disabled.
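The steps above can be sketched as follows, assuming the usual approach of a modprobe blacklist file. The file name blacklist-nouveau.conf and the helper name nouveau_blacklist_conf are assumptions, not something the steps above prescribe:

```shell
# Print the modprobe configuration that stops nouveau from loading.
nouveau_blacklist_conf() {
    printf '%s\n' 'blacklist nouveau' 'options nouveau modeset=0'
}

# Report whether the nouveau kernel module is currently loaded.
if command -v lsmod >/dev/null 2>&1 && lsmod | grep -q '^nouveau'; then
    echo "nouveau is loaded"
else
    echo "nouveau does not appear to be loaded"
fi

# To apply the change (needs root), then reboot:
#   nouveau_blacklist_conf | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
#   sudo update-initramfs -u
```

The commands that modify the system are left as comments so that the check itself stays read-only.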
Test Graphics Card
We can use glmark2 to test the graphics card performance.
- Install glmark2.
- Run glmark2.
If the terminal prints the results of the individual tests, the graphics card performance test is working. A window will also pop up showing the content being tested. After the test is completed, the terminal will display the test score.
Games
Linux has traditionally not been a strong platform for games, but thanks to Steam, more and more games can run on Linux using Proton. Proton is a compatibility layer developed by Valve, based on Wine, that runs Windows games on Linux.
Install Steam
- Download the Steam installation package.
- Install Steam.
- Install any missing dependencies.
- Run Steam.
- Log in to your Steam account.
Install Proton
In Steam, select games that can run on Linux, then click Settings. In the Steam Play tab, check Enable Steam Play for supported titles and Enable Steam Play for all other titles, then select a Proton version from the Steam Play drop-down menu and click OK.
After Proton is installed, you can run Windows games on Linux.
Install Games
If you are using Steam for the first time and have not purchased any games, you can choose some free games to test, such as “Dota 2” or “Counter-Strike 2” (which replaced “Counter-Strike: Global Offensive”).
CUDA Programming
CUDA is a parallel computing platform and programming model developed by Nvidia that uses the parallel computing power of the GPU to accelerate compute-intensive applications. CUDA programming requires the Nvidia graphics card driver and the CUDA toolkit. Each driver version supports CUDA toolkits up to a certain version, so you need to choose a CUDA version that your installed driver supports.
Install CUDA
- Check the CUDA version supported by the Nvidia graphics card driver.
The CUDA Version line of the output shows the highest CUDA version that the installed driver supports, for example, CUDA Version: 12.5. I therefore install CUDA 12.5; a toolkit newer than this value generally needs a newer driver.
- Download the CUDA installation package:
Download the CUDA installation package from the Nvidia website, selecting the operating system, architecture, distribution, version, and installer type that match your machine.
Note that we need to install CUDA 12.5, but the website shows 12.6 by default. This is not a problem. Choose the network installer, and the website displays an installation guide. We only need to change the last line of the command to the CUDA version we need.
- Install CUDA.
- Configure environment variables:
If the installation is successful, you will see the CUDA installation files in the /usr/local/cuda-12.5 directory. We need to configure environment variables so that CUDA can be found.
If you are using bash, add the environment variable settings to the ~/.bashrc file. If you use another shell, add them to the corresponding configuration file.
- Test CUDA.
If the test reports the CUDA version you installed, the CUDA programming environment is set up successfully.
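The version check and the environment variables can be sketched as follows. This assumes that nvidia-smi prints a header containing "CUDA Version: X.Y" and that the toolkit lives under /usr/local/cuda-X.Y as described above; the helper names cuda_version_from_smi and cuda_env_lines are illustrative.

```shell
# Extract the "CUDA Version" value from nvidia-smi output (read from stdin).
cuda_version_from_smi() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Print the export lines for a given toolkit version, e.g. 12.5.
cuda_env_lines() {
    printf 'export PATH=/usr/local/cuda-%s/bin:$PATH\n' "$1"
    printf 'export LD_LIBRARY_PATH=/usr/local/cuda-%s/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}\n' "$1"
}

if command -v nvidia-smi >/dev/null 2>&1; then
    version=$(nvidia-smi | cuda_version_from_smi)
    if [ -n "$version" ]; then
        echo "Driver supports CUDA up to $version"
        # Review these lines, then append them to ~/.bashrc.
        cuda_env_lines "$version"
    fi
fi
```

After opening a new shell, nvcc --version should report the installed toolkit version.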
Compile CUDA Program
You can use the CUDA sample program provided by Nvidia to test the CUDA programming environment.
- Download the CUDA sample program.
- Compile the CUDA sample program.
- Run the CUDA sample program.
On my computer, the output shows that the graphics card is NVIDIA GeForce RTX 4060 Ti and the CUDA version is 12.5, along with other information such as memory size, CUDA core count, and GPU clock frequency. If you see similar output, it means that the CUDA programming environment is set up successfully.
Now you can start writing your own CUDA program.
Deep Learning
Deep learning is one of the most commonly used machine learning methods, which can be used to solve problems such as image recognition, natural language processing, recommendation systems, etc. Deep learning usually requires a large amount of data and computing resources, so using GPUs to accelerate deep learning training is very common.
Install Deep Learning Framework
Currently, popular deep learning frameworks include TensorFlow, PyTorch, Keras, etc., all of which support GPU acceleration. Before installing a deep learning framework, we need to install CUDA and cuDNN. (The prebuilt PyTorch packages ship their own CUDA runtime and cuDNN, so for PyTorch alone the system-wide installation is optional.)
- Install cuDNN:
cuDNN is a library of GPU-accelerated deep learning primitives provided by Nvidia, which frameworks use to speed up their operations. Download the cuDNN installation package that matches your CUDA version from the Nvidia website, and then install it.
- Install the deep learning framework, for example, PyTorch.
- Test the deep learning framework.
If you see True, it means that PyTorch is installed successfully and can use the GPU for acceleration.
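A minimal sketch of such a test, assuming PyTorch was installed for the python3 interpreter on the PATH. The helper name check_torch_cuda is illustrative; torch.cuda.is_available() is the PyTorch call that reports whether a usable GPU was found.

```shell
# Print True/False from torch.cuda.is_available(), or explain what is missing.
check_torch_cuda() {
    if ! command -v python3 >/dev/null 2>&1; then
        echo "python3 not found"
    elif ! python3 -c 'import torch' >/dev/null 2>&1; then
        echo "torch is not installed for python3"
    else
        python3 -c 'import torch; print(torch.cuda.is_available())'
    fi
}

check_torch_cuda
```

A result of False with a working nvidia-smi usually points to a CPU-only PyTorch build rather than a driver problem.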
Train Model
Now you can use the GPU to accelerate the training of deep learning models. For example, we can download an example from the official PyTorch examples repository to test. The example I used trains a Long Short-Term Memory (LSTM) network to predict a sine wave. You can modify the model and data according to your needs.
Docker Containers
I have run many applications using Docker before, but they were all running on the CPU. Now that I have an Nvidia graphics card, I want to use GPU acceleration in Docker containers, such as migrating the large language model I previously ran to the GPU.
Install Nvidia Container Toolkit
Nvidia Container Toolkit is a tool provided by Nvidia that allows Docker containers to access Nvidia graphics cards.
- Add the Nvidia Container Toolkit apt repository.
- Install Nvidia Container Toolkit.
- Restart the Docker service.
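A minimal sketch of the install and restart steps, assuming Nvidia's apt repository has already been added as described in the Nvidia Container Toolkit installation guide. The run helper and the APPLY variable are illustrative names: commands are only printed unless APPLY=1 is set.

```shell
# Print each command instead of running it unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

setup_nvidia_container_toolkit() {
    run sudo apt-get update
    run sudo apt-get install -y nvidia-container-toolkit
    # Register the NVIDIA runtime in Docker's daemon configuration.
    run sudo nvidia-ctk runtime configure --runtime=docker
    # Restart Docker so that the new runtime configuration takes effect.
    run sudo systemctl restart docker
}

setup_nvidia_container_toolkit
```

The nvidia-ctk step edits /etc/docker/daemon.json, which is why the Docker service has to be restarted afterwards.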
Use GPU Acceleration in Docker Containers
Now we can use GPU acceleration in Docker containers by requesting GPUs when a container is started.
I usually use docker-compose to manage Docker containers, and you can add the GPU configuration to the docker-compose.yml file.
After defining the docker-compose.yml file, you can start the container using the docker-compose command.
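A minimal sketch of a GPU-enabled Compose file, written out by a small shell helper. The service name gpu-test, the ubuntu image, and the helper name write_gpu_compose are placeholders; the deploy.resources.reservations.devices block is the Compose syntax for reserving GPUs.

```shell
# Write a minimal Compose file that reserves all NVIDIA GPUs for one service.
write_gpu_compose() {
    cat > "$1" <<'EOF'
services:
  gpu-test:
    image: ubuntu
    command: nvidia-smi
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
EOF
}

# Write the example into a scratch directory so that no existing file is touched.
example_dir=$(mktemp -d)
write_gpu_compose "$example_dir/docker-compose.yml"
echo "Wrote $example_dir/docker-compose.yml"

# One-off equivalent without Compose:
#   docker run --rm --gpus all ubuntu nvidia-smi
# With Compose:
#   docker-compose -f "$example_dir/docker-compose.yml" up
```

In both forms the container should print the same table that nvidia-smi shows on the host. Setting count to a number instead of all limits how many GPUs the service reserves.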
Troubleshooting
The first time I installed the Nvidia driver and restarted the computer, the computer could not enter the desktop. It was stuck at a text console, and I could not enter any commands. I suspect that the Nouveau graphics card driver had not been disabled, which kept the Nvidia graphics card driver from working properly.
I had apparently never set up a recovery mode, so I could not boot into it to fix the problem. I did not want to reinstall the system because there were many important things on it.
In the end, I booted Ubuntu from a USB flash drive, mounted the original system partition, entered the original system with the chroot command, uninstalled the Nvidia driver, and restarted the computer.
There were some twists and turns in this process. The Ubuntu on the USB flash drive was 24.04, while the installed system was 22.04. After I entered the original system through chroot and ran apt-get update, it seemed to install some wrong drivers: after restarting, the computer had no network, and neither wireless nor wired connections worked. The kernel had also been updated to a new version, which seemed to cause some compatibility issues.
So I created a new USB flash drive with Ubuntu 22.04, chrooted into the original system again, updated with apt-get update, added the extra kernel modules package (checking it with dpkg -s linux-modules-extra-$(uname -r) | grep status), and finally updated the initramfs with update-initramfs -u and GRUB with update-grub. After restarting the computer, everything was back to normal. Apart from the kernel being updated to a new version, everything else should have been restored to its original state.
Finally, I reinstalled the Nvidia graphics card driver. This time I did not restart right away: I first disabled the Nouveau graphics card driver, then updated the initramfs and GRUB, and only then restarted the computer. The Nvidia graphics card driver was finally installed successfully.
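The chroot rescue described above can be sketched as follows. The partition /dev/sdXN and the mount point /mnt/rescue are hypothetical placeholders, and the run helper and APPLY variable are illustrative names: commands are only printed unless APPLY=1 is set.

```shell
# Print each command instead of running it unless APPLY=1 is set.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Hypothetical values: replace with the real root partition and mount point.
ROOT_PART="${ROOT_PART:-/dev/sdXN}"
MNT="${MNT:-/mnt/rescue}"

enter_installed_system() {
    run sudo mkdir -p "$MNT"
    run sudo mount "$ROOT_PART" "$MNT"
    # Expose the live system's device and kernel interfaces inside the chroot.
    for fs in dev proc sys; do
        run sudo mount --bind "/$fs" "$MNT/$fs"
    done
    run sudo chroot "$MNT" /bin/bash
}

enter_installed_system
# Inside the chroot: remove the driver, then update-initramfs -u and update-grub.
```

Inside the chroot, uname -r reports the kernel of the live USB system rather than the installed one, so package names built from it may need to be adjusted by hand. This is one more reason to boot a USB system of the same Ubuntu release.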