diff --git a/_toc.yml b/_toc.yml
index 42d7109f..d328bb21 100644
--- a/_toc.yml
+++ b/_toc.yml
@@ -39,7 +39,8 @@ parts:
  sections:
  - file: rhel8/installation
  - file: rhel8/packages
- - file: rhel8/desktop
+ - file: rhel8/nvidia
  - file: rhel8/kerberos
+ - file: rhel8/desktop
  - file: rhel8/vendor_documentation
 
diff --git a/rhel8/index.md b/rhel8/index.md
index bd36a7cb..327ee8a8 100644
--- a/rhel8/index.md
+++ b/rhel8/index.md
@@ -22,8 +22,6 @@ A lot works already out of the box, but no guarantee can be given unless your sp
 
 ## Major Known Issues
 
-- sssd_kcm switches Kerberos credential cache midsession [PSILINUX-120](https://jira.psi.ch/browse/PSILINUX-120)
-
 ## Major Missing Features
 
 - not all interesting 3rd-party packages available yet [PSILINUX-113](https://jira.psi.ch/browse/PSILINUX-113)
@@ -107,6 +105,7 @@ which is IMHO OK to not allow a normal user to do changes there.
 
 * [Installation](installation)
 * [Package Management](packages)
+* [CUDA and Nvidia Drivers](nvidia)
 * [Kerberos](kerberos)
 * [Desktop](desktop)
 * [Vendor Documentation](vendor_documentation)
diff --git a/rhel8/nvidia.md b/rhel8/nvidia.md
new file mode 100644
index 00000000..21d45beb
--- /dev/null
+++ b/rhel8/nvidia.md
@@ -0,0 +1,47 @@
+# CUDA and Proprietary Nvidia GPU drivers on RHEL 8
+
+Managing Nvidia software comes with its own set of challenges.
+The most common cases are covered by our Puppet configuration.
+Those are discussed in the first chapter; more details follow further below.
+
+## Hiera Configuration
+Changes in Hiera are forwarded by Puppet to the node, but **not applied**.
+They are applied on **reboot**.
+Alternatively, you can run `/opt/pli/libexec/ensure-nvidia-software` at a safe moment (no process may be using CUDA, and the desktop will be restarted).
+
+### I need CUDA
+
+Set `nvidia::cuda::install_software: true` in Hiera; this automatically installs suitable Nvidia drivers and the newest possible CUDA version.
+
+To enable `nvidia_persistenced`, additionally set `nvidia::cuda::nvidia_persistenced::enable: true`.
+
+### I need a specific CUDA version
+
+In addition to the above, set `nvidia::cuda::version` to the desired version.
+The version must be fully specified (all three numbers, with X.Y.0 for the GA version).
+
+Note that newer CUDA versions do not support older drivers; for details, see Table 3 in the [CUDA Release Notes](https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html).
+
+### I just need the Nvidia drivers
+Nothing needs to be done; they are installed by default when Nvidia GPUs or accelerators are found.
+
+### I do not want the Nvidia drivers
+
+Set `nvidia::driver::enable: false` in Hiera. Note that this is ignored if CUDA is enabled (see above).
+
+Note that drivers which are already installed are not removed automatically. You would need to remove them by hand.
+
+### I need the Nvidia drivers from a specific driver branch
+
+The driver branch can be selected in Hiera with `nvidia::driver::branch`. The latest driver version of that branch is then used. Note that only production branches are available in the PSI package repository.
+
+### I need an Nvidia driver of a given version
+
+This is not recommended, but it is possible by setting the exact driver version (X.Y.Z, excluding the package iteration number) in Hiera with `nvidia::driver::version`.
+
+If the driver version is too old, an older kernel version is installed as well, and you will need a second reboot to activate it.
+
+## Versioning Mess
+
+## Manual Operation
+
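Note on the new `rhel8/nvidia.md` page: the Hiera keys it describes could be combined roughly as in the sketch below. This is only an illustration and not part of the patch. The key names are taken from the page, while the concrete values, their quoting, and the format of the branch value are assumptions that may differ from what the Puppet module actually expects.

```yaml
# Hypothetical Hiera data for one node; all values are illustrative placeholders.

# Install CUDA together with suitable Nvidia drivers.
nvidia::cuda::install_software: true

# Optionally enable nvidia_persistenced as well.
nvidia::cuda::nvidia_persistenced::enable: true

# Optionally pin CUDA; all three numbers are required (X.Y.0 for the GA version).
nvidia::cuda::version: '11.8.0'

# Optionally select a driver branch (value format is an assumption).
# nvidia::driver::branch: '535'

# Or pin an exact driver version X.Y.Z without the package iteration number
# (not recommended according to the page; value is illustrative).
# nvidia::driver::version: '535.104.05'
```

As the page states, such changes only take effect on reboot or after running `/opt/pli/libexec/ensure-nvidia-software` at a safe moment.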