document nvidia software management

2022-11-15 11:05:05 +01:00
parent b55f5e3c88
commit 901a037fff
3 changed files with 50 additions and 3 deletions
+2 -1
@@ -39,7 +39,8 @@ parts:
sections:
- file: rhel8/installation
- file: rhel8/packages
- file: rhel8/desktop
- file: rhel8/nvidia
- file: rhel8/kerberos
- file: rhel8/desktop
- file: rhel8/vendor_documentation
+1 -2
@@ -22,8 +22,6 @@ A lot works already out of the box, but no guarantee can be given unless your sp
## Major Known Issues
- sssd_kcm switches Kerberos credential cache midsession [PSILINUX-120](https://jira.psi.ch/browse/PSILINUX-120)
## Major Missing Features
- not all interesting 3rd-party packages available yet [PSILINUX-113](https://jira.psi.ch/browse/PSILINUX-113)
@@ -107,6 +105,7 @@ which is IMHO OK to not allow a normal user to do changes there.
* [Installation](installation)
* [Package Management](packages)
* [CUDA and Nvidia Drivers](nvidia)
* [Kerberos](kerberos)
* [Desktop](desktop)
* [Vendor Documentation](vendor_documentation)
+47
@@ -0,0 +1,47 @@
# CUDA and Proprietary Nvidia GPU drivers on RHEL 8
Managing Nvidia software comes with its own set of challenges.
The most common cases are covered by our Puppet configuration.
Those are discussed in the first chapter; more details follow further below.
## Hiera Configuration
Changes in Hiera are forwarded by Puppet to the node, but **not applied**.
They are applied on **reboot**.
Alternatively, you can run `/opt/pli/libexec/ensure-nvidia-software` at a safe moment: no process may be using CUDA, and the desktop will be restarted.
### I need CUDA
Set `nvidia::cuda::install_software: true` in Hiera. This automatically installs suitable Nvidia drivers and the newest possible CUDA version.
To enable `nvidia_persistenced` you additionally need to set `nvidia::cuda::nvidia_persistenced::enable: true`.
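
As a minimal sketch, the two settings above could appear together in a node's Hiera data like this (which file in your Hiera hierarchy holds them depends on your setup and is not covered here):

```yaml
# Install suitable Nvidia drivers and the newest possible CUDA version
nvidia::cuda::install_software: true

# Optional: additionally enable nvidia_persistenced
nvidia::cuda::nvidia_persistenced::enable: true
```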
### I need a specific CUDA version
In addition to `nvidia::cuda::install_software: true`, set `nvidia::cuda::version` to the desired version.
The version must be fully specified (all three numbers, with X.Y.0 for the GA version).
Note that newer CUDA versions do not support older drivers; for details, see Table 3 in the [CUDA Release Notes](https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html).
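
A sketch of pinning the CUDA version might look as follows. The version number `11.8.0` is only an illustrative value, and quoting it as a string is an assumption about what the Puppet code expects:

```yaml
# CUDA must be enabled for the version pin to have any effect
nvidia::cuda::install_software: true

# Fully specified version (all three numbers); example value only.
# A GA release is written as X.Y.0, so "11.8" alone would not be enough.
nvidia::cuda::version: '11.8.0'
```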
### I just need the Nvidia drivers
Nothing needs to be done; they are installed by default when Nvidia GPUs or accelerators are found.
### I do not want the Nvidia drivers
Set `nvidia::driver::enable: false` in Hiera. Note that this setting is ignored if CUDA is enabled (see above).
Drivers that are already installed are not removed automatically; you need to remove them by hand.
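
For illustration, a Hiera fragment that opts out of the drivers could look like this:

```yaml
# Do not install the Nvidia drivers.
# Ignored if nvidia::cuda::install_software is true.
nvidia::driver::enable: false
```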
### I need the Nvidia drivers from a specific driver branch
The driver branch can be selected in Hiera with `nvidia::driver::branch`. It will then use the latest driver version of that branch. Note that only production branches are available in the PSI package repository.
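
As a rough sketch, selecting a branch might look like the fragment below. Both the value `515` and its format (a quoted branch number) are assumptions made for illustration. Check the PSI package repository for the branches and naming actually available:

```yaml
# Hypothetical example value; the expected format of the branch
# name is not specified in this document.
nvidia::driver::branch: '515'
```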
### I need an Nvidia driver of a given version
This is not recommended, but it is possible by setting the exact driver version (X.Y.Z, excluding the package iteration number) in Hiera with `nvidia::driver::version`.
If the driver version is too old, an older kernel version is installed as well, and you will need a second reboot to activate it.
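
A sketch of such a pin, assuming the version is given as a quoted string. The value `515.65.01` is only an example of the X.Y.Z form and may not be present in the PSI package repository:

```yaml
# Exact driver version in X.Y.Z form, without the package
# iteration number; example value only.
nvidia::driver::version: '515.65.01'
```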
## Versioning Mess
## Manual Operation