Nvidia on Ubuntu: Difference between revisions

add notes
Add NVIDIA instructions
Line 1: Line 1:
Because I wanted to run a local [[Artificial Intelligence]] platform called [[Ollama]], I wanted to ensure that my GPU was fully utilized in the system since GPUs are the particular type of hardware best suited for these [[Vector database|Vector]] calculations. And, I have a 'decent' GPU - [[PC Build 2024#Video Card (GPU)|Nvidia GeForce RTX 4060]] (the best you could get in 2024). In trying to install the latest Nvidia driver, I set off on a week-long journey of learning, frustration and perseverance discovering the inner workings of Ubuntu 24.04, Xorg, the Linux kernel and kernel modules, DRM, Secure Boot, initramfs and more.  
Because I wanted to run a local [[Artificial Intelligence]] platform called [[Ollama]], I wanted to ensure that my GPU was fully utilized in the system since GPUs are the particular type of hardware best suited for these [[Vector database|Vector]] calculations<ref>https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#the-benefits-of-using-gpus</ref>. And, I have a 'decent' GPU - [[PC Build 2024#Video Card (GPU)|Nvidia GeForce RTX 4060]] (the best you could get in 2024). In trying to install the latest Nvidia driver, I set off on a week-long journey of learning, frustration and perseverance discovering the inner workings of Ubuntu 24.04, Xorg, the Linux kernel and kernel modules, DRM, Secure Boot, initramfs and more.  


I still do not have the Nvidia driver loaded - even after 40+ reboots and attempts. Instead I'm using the Nouveau driver but at least I have a working system and I believe now that I've finally figured out what needs to be done to disable Nouveau and install Nvidia - a project that I am approaching with greater scrutiny now. I'm documenting the things that I encounter in this journey.
Line 18: Line 18:


'''OpenGL version''' string: 4.3 (Compatibility Profile) Mesa 24.2.8-1ubuntu1~24.04.1
<code>lspci | grep VGA</code>
01:00.0 '''VGA''' compatible controller: NVIDIA Corporation AD107 [GeForce RTX 4060] (rev a1)
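
The Mesa string and the <code>lspci</code> line above show the card is detected, but not which kernel module is driving it. A minimal sketch for checking that, assuming the standard module names <code>nvidia</code> and <code>nouveau</code>; the function name <code>gpu_module_status</code> is my own invention for illustration, not part of any tool:

```shell
# Hypothetical helper: reads `lsmod`-style text on stdin and reports which
# GPU kernel module is loaded: "nvidia", "nouveau", "both" or "none".
gpu_module_status() {
    awk '
        $1 == "nvidia"  { nv = 1 }        # proprietary/open NVIDIA module
        $1 == "nouveau" { nouveau = 1 }   # open-source Nouveau module
        END {
            if (nv && nouveau) print "both"
            else if (nv)       print "nvidia"
            else if (nouveau)  print "nouveau"
            else               print "none"
        }'
}

# On a live system:
#   lsmod | gpu_module_status
# `lspci -k` also lists the "Kernel driver in use" for each device.
```

The function only matches the exact module name in the first column, so helper modules such as <code>nvidia_drm</code> do not count on their own.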


== GUI is stuck ==
Line 99: Line 103:


== NVidia ==
The installation guide (46 chapters) is at https://download.nvidia.com/XFree86/Linux-x86_64/570.153.02/README/  
Documentation for installing NVidia drivers is at https://docs.nvidia.com/datacenter/tesla/driver-installation-guide/
 
The installation guide for v570 of the driver (46 chapters) is at https://download.nvidia.com/XFree86/Linux-x86_64/570.153.02/README/


I've read the whole thing.  
Line 112: Line 118:
We explore these in more detail below.


Over at StackExchange, a user asked [https://unix.stackexchange.com/questions/352828/how-to-switch-nvidia-driver-from-nouveau-to-nvidia-proprietary how to switch nvidia driver from nouveau to nvidia proprietary] and succeeded in part by '''modifying the boot parameters in grub''' to deny nouveau.
Over at StackExchange, a user asked [https://unix.stackexchange.com/questions/352828/how-to-switch-nvidia-driver-from-nouveau-to-nvidia-proprietary how to switch graphics driver from nouveau to nvidia] and succeeded in part by '''modifying the boot parameters in grub''' to deny nouveau. Note that the boot parameters were used only while switching from one driver to the other. They are not a configuration that gives you two different boot menu entries in GRUB for choosing between two graphics drivers.
 
Over in the Manjaro Linux forums, a user asked a similar question: [https://forum.manjaro.org/t/how-do-i-switch-between-nvidia-and-nouveau-drivers-on-boot/92044 How do I switch between Nvidia and Nouveau drivers on boot?] They tried using
 
<code>modprobe.blacklist=nvidia systemd.setenv=GPUMOD=nouveau rd.driver.blacklist=nvidia nouveau.modeset=1 nvidia.modeset=0</code>
 
Ultimately, though, they had to install the OS twice, on different disk partitions, and boot one system or the other depending on which graphics driver they needed.
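
When experimenting with boot parameters like these, it helps to see which modules a given kernel command line actually denies. A minimal sketch, assuming only the two parameters <code>modprobe.blacklist=</code> and <code>rd.driver.blacklist=</code> quoted above matter; the function name <code>list_blacklisted</code> is hypothetical:

```shell
# Hypothetical helper: print "<parameter> <module>" for every module named in
# a modprobe.blacklist= or rd.driver.blacklist= kernel parameter.
# $1: a kernel command line, e.g. the contents of /proc/cmdline
list_blacklisted() {
    printf '%s\n' "$1" | tr ' ' '\n' | while IFS= read -r param; do
        case "$param" in
            modprobe.blacklist=*|rd.driver.blacklist=*)
                key=${param%%=*}      # text before the first "="
                value=${param#*=}     # comma-separated module list after it
                printf '%s\n' "$value" | tr ',' '\n' | while IFS= read -r mod; do
                    if [ -n "$mod" ]; then
                        printf '%s %s\n' "$key" "$mod"
                    fi
                done
                ;;
        esac
    done
}

# On a running system:
#   list_blacklisted "$(cat /proc/cmdline)"
```

Other parameters, such as <code>nouveau.modeset=1</code>, are ignored by this sketch because they configure a module rather than deny it.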


=== Denylist ===
Line 164: Line 176:
=== Module Signing ===
On systems with Secure Boot enabled (mine), you most likely need to sign the module. See [https://download.nvidia.com/XFree86/Linux-x86_64/570.153.02/README/installdriver.html#modulesigning Signing NVIDIA Kernel Module]. However, I didn't get an explicit message that signing was a problem; and I did see that the installation process signs the module with a generated key. I assume that the MOK process hooks into the trust system somehow.
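
Since I never saw an explicit signing error, one way to narrow things down is to check the Secure Boot state and whether the installed module carries a signer at all. A rough sketch, assuming <code>mokutil</code> is installed and prints a line such as <code>SecureBoot enabled</code>; the function name <code>secure_boot_state</code> is hypothetical:

```shell
# Hypothetical helper: classify the output of `mokutil --sb-state`
# (read from stdin) as "enabled" or "disabled-or-unknown".
secure_boot_state() {
    if grep -qi 'SecureBoot enabled'; then
        echo "enabled"
    else
        echo "disabled-or-unknown"
    fi
}

# On a live system (assumes mokutil and an installed nvidia module):
#   mokutil --sb-state | secure_boot_state
#   modinfo -F signer nvidia     # an empty result suggests an unsigned module
#   mokutil --list-enrolled      # keys the firmware currently trusts (MOK)
```

If Secure Boot is enabled and the signer printed by <code>modinfo</code> does not correspond to an enrolled key, the kernel would be expected to refuse the module, which is worth ruling out before looking elsewhere.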
== Tools and Troubleshooting ==
Ubuntu wants you to use the '[[Nvidia on Ubuntu/ubuntu-drivers|ubuntu-drivers]]' tool<ref>https://documentation.ubuntu.com/server/how-to/graphics/install-nvidia-drivers/</ref>.
NVIDIA seems to have just settled on a new mechanism<ref>https://docs.nvidia.com/datacenter/tesla/driver-installation-guide/</ref> in place of downloading the (former?) .run installers. The steps are, in order: <code>wget <nowiki>https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64/cuda-keyring_1.1-1_all.deb</nowiki></code>, then <code>dpkg -i cuda-keyring_1.1-1_all.deb</code>, then <code>apt update</code>, and finally <code>apt install nvidia-open</code>.
NVIDIA distributes a script called <code>nvidia-bug-report.sh</code> that you can and should run<ref>https://forums.developer.nvidia.com/t/if-you-have-a-problem-please-read-this-first/27131</ref> to collect detailed information about any problems.
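
The keyring-based install steps from NVIDIA's guide can be collected into one script. This is only a sketch of those same four commands with a dry-run wrapper of my own (<code>run</code>, <code>install_nvidia_open</code> and the <code>DRY_RUN</code> variable are hypothetical names), so the sequence can be reviewed before anything touches the system:

```shell
# Dry-run by default: prints each command instead of running it.
# Set DRY_RUN=0 (and run as root) to actually execute the commands.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

install_nvidia_open() {
    keyring_deb="cuda-keyring_1.1-1_all.deb"
    repo="https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2404/x86_64"
    run wget "$repo/$keyring_deb"     # fetch NVIDIA's repository keyring package
    run dpkg -i "$keyring_deb"        # register the repository and its key
    run apt update                    # refresh package lists
    run apt install nvidia-open       # install the open kernel module driver
}

install_nvidia_open
```

Given my 40+ reboots so far, I would keep the dry run as the default and only set <code>DRY_RUN=0</code> once the printed commands look right.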


== Interesting Notes ==