CUDA and Proprietary Nvidia GPU Drivers on RHEL 8
Managing Nvidia software comes with its own set of challenges. The most common cases are covered by our Puppet configuration and are discussed in the first chapter; further details follow below.
Hiera Configuration
Changes in Hiera are forwarded by Puppet to the node, but not applied immediately.
They are applied on the next reboot.
Alternatively, you can run /opt/pli/libexec/ensure-nvidia-software at a safe moment: no process may be using CUDA, and the desktop will be restarted.
I need CUDA
Set nvidia::cuda::install_software: true in Hiera. This automatically installs suitable Nvidia drivers and the newest possible CUDA version.
To enable nvidia_persistenced, additionally set nvidia::cuda::nvidia_persistenced::enable: true.
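As a minimal sketch, the two settings above could look like this in the Hiera data that applies to the node. Which Hiera file or hierarchy level to use depends on your setup and is not specified here.

```yaml
# Install CUDA together with suitable Nvidia drivers
nvidia::cuda::install_software: true
# Optional: also enable nvidia_persistenced
nvidia::cuda::nvidia_persistenced::enable: true
```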
I need a specific CUDA version
In addition to enabling CUDA, set nvidia::cuda::version to the desired version.
The version must be fully specified (all three numbers, with X.Y.0 for the GA version).
Note that newer CUDA versions do not support older drivers; for details, see Table 3 in the CUDA Release Notes.
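A sketch of pinning the CUDA version follows. The version number is only an illustrative value, and quoting it as a string is an assumption; check which versions are actually available before using it.

```yaml
nvidia::cuda::install_software: true
# Illustrative value only: all three numbers are required,
# X.Y.0 denotes the GA release of X.Y
nvidia::cuda::version: '12.4.0'
```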
I just need the Nvidia drivers
Nothing needs to be done: they are installed by default when Nvidia GPUs or accelerators are found.
I do not want the Nvidia drivers
Set nvidia::driver::enable: false in Hiera. Note that this setting is ignored if CUDA is enabled (see above).
Also note that drivers which are already installed are not removed automatically; you need to remove them by hand.
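For illustration, the corresponding Hiera entry might look like this:

```yaml
# Do not install the Nvidia drivers.
# Ignored when nvidia::cuda::install_software is true;
# already installed drivers stay in place.
nvidia::driver::enable: false
```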
I need the Nvidia drivers from a specific driver branch
The driver branch can be selected in Hiera with nvidia::driver::branch. It will then use the latest driver version of that branch. Note that only production branches are available in the PSI package repository.
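A sketch of selecting a branch is shown here. The value and its exact format are assumptions made for illustration; use a production branch that is actually present in the package repository.

```yaml
# Hypothetical branch value; the expected format is an assumption
nvidia::driver::branch: '535'
```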
I need an Nvidia driver of a given version
This is not recommended, but it is possible by setting the exact driver version (X.Y.Z, excluding the package iteration number) in Hiera with nvidia::driver::version.
If the driver version is too old, an older kernel version is installed along with it, and a second reboot is needed to activate that kernel.
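As a hedged example, pinning an exact driver version could look as follows. The version shown is purely illustrative and may not be available in the package repository; quoting it as a string is also an assumption.

```yaml
# Illustrative value only: X.Y.Z without the package iteration number
nvidia::driver::version: '535.129.03'
```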