cleanup
All checks were successful
Build and Deploy Documentation / build-and-deploy (push) Successful in 19s

This commit is contained in:
2025-09-03 16:39:16 +02:00
parent 2824765bc0
commit 2eb093e2ea
335 changed files with 0 additions and 14525 deletions

Binary file not shown.

View File

@@ -1,67 +0,0 @@
# HTTPS Certificates
We use DigiCert certificates.
## Request a Certificate
First create a certificate signing request (CSR) like this, replacing the values for `FQDN`
and `ALIASES`
```bash
ALIASES=xyz.psi.ch
FQDN=xyz00.psi.ch
cat >$FQDN.cnf <<EOF
FQDN = $FQDN
ORGNAME = Paul Scherrer Institut (PSI)
# subjectAltName entries: to add DNS aliases to the CSR, delete
# the '#' character in the ALTNAMES line, and change the subsequent
# 'DNS:' entries accordingly. Please note: all DNS names must
# resolve to the same IP address as the FQDN.
ALTNAMES = DNS:\$FQDN $ALIASES
# --- no modifications required below ---
[ req ]
default_bits = 2048
default_md = sha256
prompt = no
encrypt_key = no
distinguished_name = dn
req_extensions = req_ext
[ dn ]
C = CH
O = \$ORGNAME
CN = \$FQDN
OU = AWI
[ req_ext ]
subjectAltName = \$ALTNAMES
EOF
/usr/bin/openssl req -new -config $FQDN.cnf -keyout $FQDN.key -out $FQDN.csr
```
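Before submitting, you may want to verify that the generated CSR really contains the expected subject and `subjectAltName` entries. A self-contained sketch (it generates a throwaway key/CSR in `/tmp` so it can be tried anywhere; normally you would inspect the `$FQDN.csr` created above):

```bash
# Generate a throwaway CSR in /tmp (illustration only -- normally you
# inspect the CSR created with the configuration file above).
FQDN=xyz00.psi.ch
openssl req -new -newkey rsa:2048 -nodes \
    -subj "/C=CH/O=Paul Scherrer Institut (PSI)/CN=$FQDN" \
    -keyout /tmp/$FQDN.key -out /tmp/$FQDN.csr 2>/dev/null
# Print the subject and verify the CSR's self-signature.
openssl req -in /tmp/$FQDN.csr -noout -subject -verify
```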
Finally, [submit the CSR](https://www.digicert.com/secure/requests/products?guest_key=11dqrl7540p3t4jm4qhnvsnzjkvk).
Please note that the URL will only work when accessed from the PSI network (e.g. via VPN).
DigiCert will send an email including instructions on how to download the certificate.
Our team's practice is to always create a new private key and to back it up encrypted in GitLab, either
- in Hiera as [EYAML](https://linux.psi.ch/admin-guide/puppet/hiera.html#secret-values)
- for central infrastructure hosts GPG encrypted in their [bootstrap repository](https://git.psi.ch/linux-infra/bootstrap)
- for the rest in our [team secret store](https://git.psi.ch/linux-infra/core-linux-secrets)
## Renew Certificate
Using the same configuration file as above, generate a new private key and CSR,
and submit the CSR as before.
## Revoke Certificate
If you would like to revoke a DigiCert certificate, please send an e-mail to pki@psi.ch

View File

@@ -1,4 +0,0 @@
# Configuration Guides
```{tableofcontents}
```

View File

@@ -1,6 +0,0 @@
# Allowing/Limiting Access
These guides show how to allow users into the system or to allow them to become administrators.
```{tableofcontents}
```

View File

@@ -1,25 +0,0 @@
# Bastion Hosts
Access for the `root` user can be limited to be only allowed from certain bastion hosts.
By default this is enabled except for a few networks, see [responsible Puppet code](https://git.psi.ch/linux-infra/puppet/-/blob/preprod/code/modules/profile/manifests/networking/params.pp) for details.
You may alternatively control the use of bastion hosts yourself by setting in Hiera the boolean value `aaa::user_bastions`.
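For example (a sketch; that `false` disables the restriction is an assumption based on the key name):
```
aaa::user_bastions: false
```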
The bastion hosts can be listed in the Hiera key `aaa::bastions`:
```
aaa::bastions:
- 'x05la-gw.psi.ch'
```
which then will override the default value
```
aaa::bastions:
- 'wmgt01.psi.ch'
- '129.129.190.25' # IP of wmgt01.psi.ch
- 'wmgt02.psi.ch'
- '129.129.190.104' # IP of wmgt02.psi.ch
```
**Caution**: an empty list will allow unrestricted login again!

View File

@@ -1,84 +0,0 @@
# Eaccounts (Experiment Account)
The eaccounts are managed via the Digital User Office (DUO) and used for single experiments at the beamlines.
Eaccounts reside in the AD in the subtree `OU=users,OU=experiment,OU=it,DC=d,DC=psi,DC=ch` whereas normal accounts are found below `OU=Users,OU=PSI,DC=d,DC=psi,DC=ch`
Normally eaccounts start with `e` followed by their uid. Some eaccounts were used not just for single experiments but rather like GAC accounts, and have thus been renamed:
```
gac-alvra
gac-cristall
gac-bernina
gac-femto
gac-furka
gac-maloja
gac-slab
gac-x01dc
gac-x02da
gac-x03da
gac-x03ma
gac-x04db
gac-x04sa
gac-x05da
gac-x05la
gac-x06da
gac-x06sa
gac-x07da
gac-x07db
gac-x07ma
gac-x07mb
gac-x09la
gac-x09lb
gac-x10da
gac-x10sa
gac-x11ma
gac-x12sa
gac-x12saop
gac-x96sa
```
An eaccount is usually bound to exactly one beamline/endstation and is part of the beamline/endstation group BEAMLINE (e.g. X06DA)
## Allow Eaccounts
```
aaa::allow_experiment_accounts: true
```
in Hiera enables eaccounts on a system, default is `false`.
## Eaccounts and `override_homedir`
By default the `override_homedir` setting (in Hiera `aaa::override_homedir` or `base::local_homes`) is ignored for eaccounts.
The way this has been solved causes problems with group member lookups with eaccounts enabled.
Without eaccounts only normal users are found as part of the group (which is correct):
```
[root@lxdev04 ~]# getent group SARESA
SARESA:*:35184:babic_a,ebner,kapeller,...,huppert_m,carulla_m,schoel_m
[root@lxdev04 ~]#
```
but with eaccounts enabled only the eaccount members are listed:
```
[root@lxdev01 ~]# getent group SARESA
SARESA:*:35184:e21996,e21997,e21992,...,e17806,e17589,gac-alvra
[root@lxdev01 ~]#
```
If this is a problem for a system, and at the same time there is no need to ignore `override_homedir`, you may enable the eaccounts with
```
aaa::allow_experiment_accounts: true
aaa::enable_eaccounts::ignore_override_homedir: false
```
then you get both types of members:
```
[root@lxdev07 ~]# getent group SARESA
SARESA:*:35184:e21996,...,e17589,ext-kapetanaki_s,...,ext-tyrala_k
[root@lxdev07 ~]#
```
There is an [open case](https://access.redhat.com/support/cases/#/case/03912615) with Red Hat on how best to deal with this problem.

View File

@@ -1,22 +0,0 @@
# MFA - Multi Factor Authentication
MFA can be enabled on any standard system with following configuration:
```yaml
# disable kerberos authentication
ssh_server::enable_gssapi: false
# disable ssh key authentication
ssh_server::enable_public_key: false
aaa::radius_auth: true
aaa::radius_shared_secret: ENC[PKCS7,MIIBuQYJK...9Z82qA==]
aaa::radius_servers: [ 'nps01.psi.ch', 'nps02.psi.ch' ]
aaa::radius_timeout: 60
```
Besides this, ensure that `ChallengeResponseAuthentication yes` is set in your sshd config (this is the default configuration, so if no changes were made to sshd this should be OK!).
A prerequisite is that your server can reach the RADIUS servers (in the example nps01.psi.ch and nps02.psi.ch) and that you received a shared secret from the RADIUS admin
(at the time of writing the RADIUS servers are operated by group 9521).

View File

@@ -1,25 +0,0 @@
# SSH Host Hopping as Root (e.g. between cluster members)
This is to allow the user `root` on a given machine to log in as `root` onto another machine without using a password or a similar authentication.
The `ssh_server::root_host_trust` list in Hiera configures from which devices root is allowed to connect without special configuration:
```
ssh_server::root_host_trust:
- 'lxdev04.psi.ch'
- 'lxdev05.psi.ch'
```
From a security perspective these nodes should have the same or stricter security rules/setup than the target host.
To actually use host trust, the client also needs to request it when connecting, e.g. in Hiera:
```
ssh_client::try_host_trust: true
```
or spontaneously on the ssh command line with:
```
ssh -o HostbasedAuthentication=yes ...
```
or by setting `HostbasedAuthentication yes` in the appropriate place in the ssh configuration (e.g. `~/.ssh/config`).
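For example, in `~/.ssh/config` (the host pattern below is illustrative):
```
Host lxdev*.psi.ch
    HostbasedAuthentication yes
```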

View File

@@ -1,45 +0,0 @@
# SSH Server Configuration (sshd)
## Extra Configuration
Custom configuration can be added to the sshd config file via the `ssh_server::extra_config` key. The config will be added at the end of the `/etc/ssh/sshd_config` file.
### Force Command
To configure a force command use:
```yaml
# add force command
ssh_server::extra_config:
'Force command for non root users': |
Match User *,!root
ForceCommand /usr/bin/kpasswd
```
## Login Banner
A login banner can be configured as follows:
```yaml
# custom banner message on ssh login-prompt
ssh_server::banner_file: '/etc/sshgw/sshd_message'
files::files:
/etc/sshgw/sshd_message:
mode: '0644'
owner: 'root'
content: |
----
PAUL SCHERRER INSTITUTE
________________
| __ | ____| |
| ____|____ | |
|_| |______|__|
----
```
## SFTP Server
For how to enable/disable and configure an SFTP server, please refer to the [SFTP Server](../files/sftp_server) guide.

View File

@@ -1,11 +0,0 @@
# Custom sudo Rules
Custom sudo rules can be specified in Hiera as follows:
```yaml
aaa::sudo_rules:
- 'Defaults:telegraf !requiretty, !syslog'
- 'telegraf ALL=(root) NOPASSWD: /usr/lib/telegraf/scripts/nxserver_report.sh'
```
Besides that, you might also simply deploy a file to `/etc/sudoers.d`, e.g. via a technique described in [Distribute Files](../files/distribute_files).

View File

@@ -1,29 +0,0 @@
# Regular and Administrative User Access
## Regular Access
Access for ActiveDirectory (AD) user accounts or groups:
```
aaa::users:
- 'muster_h'
- '%unx-project_group'
```
To limit a login to the physical desktop/console only you can add a `:LOCAL` suffix like this:
```
aaa::users:
- 'muster_h:LOCAL'
```
Note that administrative users (see below) always have normal access without being explicitly listed in `aaa::users`.
## Administrative (root/sudo) Access
To give root access to AD user accounts or groups via sudo:
```
aaa::admins:
- 'muster_h'
- '%unx-project_group'
```

View File

@@ -1,4 +0,0 @@
# Basic System Configuration Guides
```{tableofcontents}
```

View File

@@ -1,73 +0,0 @@
# Configuration to Send/Relay Emails
## Sending Emails Via PSI Central Mail Gateway
To be able to send emails, the server needs to be registered on the PSI mail gateways.
This can be done by this ServiceNow request:
(Service Catalog > IT Systems & Data Storage > Register E-Mail Sender)
https://psi.service-now.com/psisp?id=psi_new_sc_cat_item&sys_id=68d60ca74f8833407f7660fe0310c7e3
The default PSI mail gateways for the different network segments are defined in the Puppet `common.yaml`. Depending on the machine's location a different default will take effect. The defaults are as follows:
```
mta::relays:
'default': 'smtpint.psi.ch'
'dmz': 'smtpdmz.psi.ch'
'extranet': 'smtpdmz.psi.ch'
'tier3': 'smtpdmz.psi.ch'
```
To send emails from a machine via one of the standard gateways, simply enable the flag `base::enable_mta`. No other configuration is needed.
```yaml
base::enable_mta: true
```
The default name of the sending mail server is the hostname of the machine running postfix. To rename it use `mta::myhostname`.
```yaml
mta::myhostname: 'foobar.psi.ch'
```
## Sending Emails Via Another SMTP Relay
If your machine is in the default network zone (i.e. PSI intranet), sending via a different mail gateway than the default can be done like this:
```yaml
base::enable_mta: true
mta::relays:
'default': 'test-smtp-relay.psi.ch'
```
Whether the email is accepted by the mail relay depends on the relay's configuration. Contact the relay admin to learn the rules for that gateway.
## Configure Server as Mail Relay
The following configuration is needed if you want to set up an email relay server accepting emails from clients.
Depending on where your relay should forward messages, your server/relay must be registered/authorized on the relay it forwards messages to (e.g. the PSI default mail relay; procedure see above).
```yaml
base::enable_mta: true
# if you want to use another forward relay than the PSI defaults
# mta::relays:
# 'default': 'test-smtp-relay.psi.ch'
# interfaces on which postfix should accept emails
mta::inet_interfaces: # array[string] default: loopback-only
# networks from which this relay should accept emails
mta::mynetworks: # default: undefined
```
Example (assuming the server's IP address is 10.1.2.110):
```yaml
mta::inet_interfaces:
- '10.1.2.110'
- 'localhost'
# mta::mynetworks_style: 'subnet'
mta::mynetworks:
- '10.1.2.0/24'
- '10.1.3.0/24'
```

View File

@@ -1,37 +0,0 @@
# Permanent Kerberos with gssproxy and Password Keytab
If there are accounts which run software permanently (e.g. for data collection) and shall always be able to write to Kerberos-protected network shares, you may provide the `gssproxy` service with a password keytab.
Afterwards, Kerberos for NFS and CIFS is handled transparently; there is no need to `kinit`, renew tickets or anything like this, because `gssproxy` handles that automatically in the background.
**Attention: The keytab file generated in this guide is like a cleartext password and needs to be protected the same!**
**Note: when the password of the user changes, a new keytab file with the new password needs to be created.**
First you need the user name (`$USER`) and the user ID (`$UID`) to prepare the password keytab:
```
$ # ensure it does not exist, else it gets extended
$ rm $UID.keytab
$ ktutil
ktutil: add_entry -password -k 0 -f -p $USER
Password for $USER@D.PSI.CH:
ktutil: wkt $UID.keytab
ktutil: exit
$
```
Note that inside `ktutil` variables are not interpolated as this is not `bash`.
To test if the keytab works as intended do
```
$ kinit -t $UID.keytab -k $USER
$
```
and if there is no output, it is working fine.
Then as root (`sudo`) make it known to `gssproxy`:
```
# cp $UID.keytab /var/lib/gssproxy/clients/
# chmod 600 /var/lib/gssproxy/clients/$UID.keytab
# chown root:root /var/lib/gssproxy/clients/$UID.keytab
```
If you want to [distribute the keytab with Puppet/Hiera](../files/distribute_files), ensure it is [stored in Hiera encrypted](../../puppet/hiera).

View File

@@ -1,58 +0,0 @@
# Host Renaming
Following steps are required to rename a host.
## Introduce New Hostname
### Sysdb/bob
For sysdb create a new node with bob as usual.
For reference do
```
OLD_FQDN=my-old-hostname.psi.ch
NEW_FQDN=my-new-hostname.psi.ch
bob node list -v $OLD_FQDN
```
then add the new node
```
bob node add $NEW_FQDN $SYSDB_ENV
bob node set-attr $NEW_FQDN [all attributes as listed above]
```
and the same for the MAC addresses:
```
for mac in $(bob node list-macs $OLD_FQDN); do echo bob node add-mac $NEW_FQDN $mac; done
```
### Hiera
In Hiera ensure that a host specific configuration file exists for the FQDN with the same content as for the old hostname.
## Actual Hostname Change
On the node as `root` run
```
hostnamectl set-hostname $NEW_FQDN
rm -rf /etc/puppetlabs/puppet/ssl
puppet agent -t
```
which changes the local hostname, removes the local Puppet certificate and updates the configuration with Puppet using the new hostname.
## Other Changes to Consider
- change DNS entry if a static IP address is assigned
- redo SSL/TLS certificates with new hostname
- if the host is monitored by icinga2 you need to remove icinga2 from the host before running Puppet, so that a new cert etc. is generated for the host
```
yum remove icinga2-common icinga2-bin icinga2
rm -rf /var/lib/icinga2
rm -rf /etc/icinga2
```
## Remove Old Hostname
- `bob node delete $OLD_FQDN`
- remove from Hiera
- inform the Linux Team (linux-eng@psi.ch) so they can remove the certificate and other data from the Puppet server and remove the computer object from the AD

View File

@@ -1,276 +0,0 @@
# Network Configuration
Our Puppet configuration management supports five types of network configuration:
- **auto**: NetworkManager does automatic configuration while respecting local user managed configuration
- **auto_static_ip**: static NetworkManager connection with IP from DNS and network information from DHCP
- **managed**: NetworkManager is fully managed via Hiera/Puppet
- **unmanaged**: network configuration (incl. DNS) is not touched by Puppet
- **legacy**: Puppet keeps network configuration untouched except for DNS configuration and applying `network::*` Hiera settings
Not all types are supported by all RedHat versions:
| Type | RHEL7 | RHEL8 | RHEL9 |
|----------------|---------|---------|---------|
| auto | \- | ✓ | Default |
| auto_static_ip | \- | ✓ | ✓ |
| managed | \- | ✓ | ✓ |
| unmanaged | \- | ✓ | ✓ |
| legacy | Default | Default | \- |
## Automatic Network Configuration
The automatic network configuration will just let NetworkManager do the work as it does it by default.
In Hiera you can select this option with
```
networking::setup: auto
```
And what does NetworkManager actually do by default? It attempts automatic configuration on all interfaces (DHCP, SLAAC). Additionally the user may add desired connections. This might be WiFi, VPN, but also normal Ethernet. Automatic configuration is only attempted if there is no such specific configuration.
DNS configuration as such is learned by autoconfiguration/manual connection configuration and will not be managed by Puppet.
Note that when changing to `auto` all legacy `ifcfg` files for network configuration are removed.
## Automatic Network Configuration with Static IP Address
In a setup where there is just one static IP address which can be resolved via DNS, Puppet can configure a static connection with
```
networking::setup: auto_static_ip
```
Note this only works if there is a DHCP server in that network which provides the network mask and the default gateway IP address.
## Managed Network Configuration
The network configuration can be fully managed in a fine-grained way from Hiera with
```
networking::setup: managed
```
and the configuration for the individual connections:
```
networking::connections:
- psi_network
- management_network
networking::connection::psi_network:
interface_name: 'eno0'
ipv4_method: 'manual'
ipv4_address: '129.129.241.66/24'
ipv4_gateway: '129.129.241.1'
ipv6_method: 'disabled'
networking::connection::management_network:
interface_name: 'eno1'
ipv4_method: 'manual'
ipv4_address: '192.168.71.10/24'
ipv6_method: 'disabled'
```
So there is the list `networking::connections` which selects the network connections which should be configured.
Then for each connection name listed there needs to be a hash in Hiera named `networking::connection::$CONNECTION_NAME`.
If you have a fixed IP address which is in the DNS, you might also interpolate the `ipv4_by_dns` or the `ipv6_by_dns` variable:
```
networking::connection::psi_network:
mac_address: '00:50:56:9d:bb:ad'
ipv4_method: 'manual'
ipv4_address: '%{ipv4_by_dns}/24'
ipv4_gateway: '129.129.187.1'
ipv6_method: 'manual'
ipv6_address: '%{ipv6_by_dns}'
ipv6_gateway: 'abcd:1234::1'
```
```{note}
The default value (if not specified) for `ipv6_method` is _auto_.
```
### Ethernet Connection Definition
The default connection type is `ethernet` (alias for `802-3-ethernet`).
The network connection hash needs to specify the NIC for the connection either by name with the key `interface_name` or by MAC address with the key `mac_address`.
Next you need to specify how the IPv4 configuration should be done. The key `ipv4_method` supports the values `auto`, `dhcp`, `manual`, `disabled`, `link-local`. All except `manual` need no further configuration. For `manual`, set the `ipv4_address` in the CIDR format "IP/network mask bits"; the default router has to be set with the key `ipv4_gateway`.
We did not look into IPv6 configuration yet and usually it is best to switch it off by setting `ipv6_method` to `disabled`.
To keep an interface down the setting `state` can be set to `down` (default is `up`).
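For example (a minimal sketch reusing the connection name from above; in practice the other connection keys stay as described):
```
networking::connection::management_network:
  interface_name: 'eno1'
  state: 'down'
```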
You may also add additional configuration like size of the ring buffer:
```
networking::connection::psi_network:
...
additional_config:
ethtool:
ring-rx: 8192
ring-tx: 8192
```
The first key level is the section in the NetworkManager configuration file (on the command line usually the first part before the dot), the second key level is the name of the value to be set.
### Infiniband Connection Definition
For infiniband connections the configuration is similar to ethernet. Additionally there is the `type: 'infiniband'` setting, accompanied by an `additional_config` key holding an `infiniband` key with the infiniband-specific options, as shown in the example below:
```
networking::connection::ipoib_network:
interface_name: 'ib0'
ipv4_method: 'manual'
ipv4_address: '192.168.1.16/24'
ipv4_gateway: '192.168.1.1'
ipv6_method: 'disabled'
type: 'infiniband'
additional_config:
infiniband:
mtu: 65520
transport-mode: 'connected'
```
### Other Connection Types
NetworkManager also supports other types like `wifi`, `vpn`, `bridge`, `vlan`, etc. Note that types other than `ethernet` and `infiniband` have so far not been tested. Please contact us if you managed to set up some other network type or need help to do so.
### Interface Bonding
For bonded interfaces there is a separate list named `networking::bondings` which defines which configurations defined in `networking::bonding::$NAME` hashes should be applied:
```
networking::bondings:
- bond0
networking::bonding::bond0:
ipv4_method: 'manual'
ipv4_address: '%{ipv4_by_dns}/24'
ipv4_gateway: '129.129.86.1'
ipv6_method: 'ignore'
slaves:
- 'eno5'
- 'eno6'
bond:
mode: '802.3ad'
xmit_hash_policy: 'layer2+3'
miimon: '100'
# optional
additional_config:
ethernet:
mtu: 9000
```
Feel free to extend the documentation here or provide a link to a detailed explanation of the bond options.
By default the name of the bonding interface is used as the connection name. Still, you can use a descriptive name for the connection instead if you specify the interface name explicitly:
```
networking::bondings:
- psi_network
networking::bonding::psi_network:
ifc_name: 'bond0'
ipv4_method: 'manual'
...
```
### Static Routes
Static routes can be added with the `additional_config` key in the connection settings.
Depending on the IP version there is an `ipv4` or `ipv6` subkey which then can contain multiple `routes` + number keys. So the first route is `routes1`, the second `routes2`, etc.
The minimal entry for `routesX` is the network and then, separated with a comma, the next hop IP. The optional third entry is the metric.
For each route you may also add an options key `routesX_options`, where multiple options in the form `name=value` can be added, again comma separated.
An example:
```
networking::connection::management_network:
interface_name: 'eno1'
ipv4_method: 'manual'
ipv4_address: '192.168.71.10/24'
ipv6_method: 'disabled'
additional_config:
ipv4:
routes1: '10.255.254.0/24,192.168.71.1'
routes2: '10.255.255.0/24,192.168.71.1,100'
routes2_options: 'lock-cwnd=false,lock-mtu=false'
```
All routing options can be found in the Red Hat NetworkManager documentation ([RHEL8](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/configuring_and_managing_networking/configuring-static-routes_configuring-and-managing-networking#how-to-use-the-nmcli-command-to-configure-a-static-route_configuring-static-routes), [RHEL9](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html/configuring_and_managing_networking/configuring-static-routes_configuring-and-managing-networking#how-to-use-the-nmcli-command-to-configure-a-static-route_configuring-static-routes)).
### Removing a Connection
A connection does **not get removed** when its configuration is removed from Hiera and the change is applied on the node.
To remove it you may do it manually or reboot.
Manual removal is done with `nmcli connection down $ID/$CONNECTION_NAME`:
```
[root@lx-test-dmz-01 ~]# nmcli connection
NAME UUID TYPE DEVICE
dmz_network f77611ac-b6e2-5a08-841e-8a1023eefaed ethernet ens33
ens35 f3ba4a81-8c9b-4aec-88ee-ddffd32f67fa ethernet ens35
[root@lx-test-dmz-01 ~]# nmcli connection down f3ba4a81-8c9b-4aec-88ee-ddffd32f67fa
Connection 'ens35' successfully deactivated (D-Bus active path: /org/freedesktop/NetworkManager/ActiveConnection/2)
[root@lmu-user-dmz-01 ~]# nmcli connection
NAME UUID TYPE DEVICE
dmz_network f77611ac-b6e2-5a08-841e-8a1023eefaed ethernet ens33
[root@lx-test-dmz-01 ~]#
```
### DNS Override
The internal nameservers are configured according to the network zone by Puppet.
If for some reason that is unsuitable, you might set your own in Hiera:
```
networking::nameservers_override:
- 192.33.120.5
- 192.33.121.5
```
### No Automatic Migration to Legacy Configuration
Note that when changing to `managed` all legacy `ifcfg` files and all NetworkManager connections not reflected in Hiera are removed. So if you want to be able to go back to legacy mode you need to back up these files first.
## Unmanaged Network Configuration
Here Puppet keeps its fingers off any network-related configuration such as interface configuration, DNS or routing.
In Hiera you can select this option with
```
networking::setup: unmanaged
```
When you change to unmanaged network configuration, the configuration on the node will stay as is.
## Legacy Network Configuration
In legacy mode Puppet does not configure network addresses and interfaces. This is usually done by the Kickstart file during OS installation and then not touched any more, or with manual changes.
Additionally the `network` Puppet module can be used for more complex setups. But as this module is not maintained any more, we phase it out with RHEL9 and suggest migrating away from it on RHEL8.
The legacy mode is selected by not setting `networking::setup` in Hiera.
### Custom Nameservers
The internal nameservers are configured according to the network zone by Puppet.
If for some reason that is unsuitable, you might set your own in Hiera:
```
networking::nameservers_override:
- 192.33.120.5
- 192.33.121.5
```
## Disable DNS Caching
Except for the `unmanaged` setup mode you may disable DNS caching with
```
networking::enable_dns_caching: false
```

View File

@@ -1,9 +0,0 @@
# Custom NTP Servers
You can add other NTP servers to your list by extending, in the Hiera key `ntp_client::servers`, the list of your network zone (most probably `default`):
```
ntp_client::servers:
'default':
- '172.16.1.15'
```

View File

@@ -1,9 +0,0 @@
# NTP Server
Your node can serve as an NTP server. To allow access you need to configure which networks/hosts are allowed to contact `chrony` in the Hiera list `ntp_server::allow`:
```
ntp_server::allow:
- '129.129.0.0/16'
- '10.10.10.10'
```

View File

@@ -1,51 +0,0 @@
# Puppet Agent Configuration
The Puppet Agent provides the basic system configuration as defined in Hiera.
## Automatic Puppet Agent Runs
By default the Puppet Agent runs daily, at some time between 5 and 8 AM.
The frequency can be changed in Hiera with the key `puppet_client::exec_frequency`.
Allowed parameters are
- 'halfhourly': every 30 minutes
- 'daily': once a day (default)
- 'weekly': every Monday
- 'boot_only': only shortly after bootup
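For example, to run the agent every 30 minutes:
```
puppet_client::exec_frequency: 'halfhourly'
```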
The actual automatic Puppet Agent run always happens at the same (randomly chosen) time (except for `boot_only`). Check `systemctl list-timers pli-puppet-run.timer` for the exact time on a specific node.
For `daily` and `weekly` the time window is configured in Hiera with `puppet_client::exec_time`, the default is:
```
puppet_client::exec_time: '05:00 -- 08:00'
```
The time format used is '24-hour clock' `HH:MM -- HH:MM`.
## Temporarily Disable Automatic Puppet Agent Runs
Puppet execution can be disabled for a certain amount of time with the
`/opt/pli/libexec/pli-puppet-disable` command:
```
# /opt/pli/libexec/pli-puppet-disable
puppet currently not disabled
# /opt/pli/libexec/pli-puppet-disable '1 week'
# /opt/pli/libexec/pli-puppet-disable
Puppet disabled until: Wed Nov 1 08:00:05 CET 2017
# /opt/pli/libexec/pli-puppet-disable stop
Stopping
# /opt/pli/libexec/pli-puppet-disable
puppet currently not disabled
#
```
The disabling time has to be in the `date` format (see date(1)).
## Manual Execution of Puppet Agent
At any time you might update the node configuration by running the Puppet Agent manually. To do so run as user `root` the following command:
```
puppet agent -t
```
If you just wish to see what it would change without doing the actual change on the system, run
```
puppet agent -t --noop
```

View File

@@ -1,17 +0,0 @@
sysctl
======
You may change individual sysctl values in Hiera with the hash `sysctl::values`:
```
sysctl::values:
net.ipv4.tcp_slow_start_after_idle:
value: '0'
net.core.rmem_max:
value: '83886080'
net.core.wmem_max:
value: '83886080'
```
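After the next Puppet run you can read individual values back to confirm they were applied (works without root; the key names are the ones from the example above):
```
sysctl -n net.core.rmem_max
sysctl -n net.core.wmem_max
```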
[https://www.kernel.org/doc/Documentation/sysctl/](https://www.kernel.org/doc/Documentation/sysctl/)
[https://www.kernel.org/doc/Documentation/networking/](https://www.kernel.org/doc/Documentation/networking/)

View File

@@ -1,10 +0,0 @@
# Verbose Boot
Verbose boot can be configured in Hiera with
```
base::enable_verbose_boot: true
```
By default it is disabled on workstation type systems.

View File

@@ -1,4 +0,0 @@
# Desktop Configuration Guides
```{tableofcontents}
```


View File

@@ -1,40 +0,0 @@
# Alternative Desktops/Window Managers
By default Gnome is installed, but sometimes other desktops or window managers are desired.
When changing the default desktop, please check the last chapter "Reset User Default Desktop".
## XFCE
XFCE is installed when `desktop::enable_xfce: true` is set in Hiera.
It is then also used as the default with `base::xfce_default: true` or `desktop::session_manager: xfce`.
## IceWM
IceWM is installed when `desktop::enable_icewm: true` is set in Hiera.
It is then also used as the default with `desktop::session_manager: icewm-session`.
## Other (e.g. KDE)
The respective Desktop needs to be installed, either manually or through Puppet.
The respective Session Manager can be set as system default in Hiera with `desktop::session_manager`.
The name of the Session Manager can be found in `/usr/share/xsessions/*.desktop` for `Xorg` and `/usr/share/wayland-sessions/*.desktop` for `Wayland` (not supported at PSI). For the `desktop::session_manager` Hiera setting use the respective file name without the `.desktop` suffix.
Example KDE:
```
base::pkg_group::kde:
- '@KDE Plasma Workspaces'
base::package_groups:
- 'kde'
desktop::session_manager: 'startplasma-x11'
```
## Reset User Default Desktop
Note that when changing the default desktop aka Session Manager, previous users will still get the one they used before. To reset that, you need to
- stop AccountsService (`systemctl stop accounts-daemon.service`)
- delete `/var/lib/AccountsService/users/*` (for `gdm`)
- delete `/var/cache/lightdm/dmrc/*.dmrc` (for `lightdm`)
- delete `/var/lib/lightdm/.cache/lightdm-gtk-greeter/state` (for `lightdm` with `lightdm-gtk-greeter`)
- start AccountsService (`systemctl start accounts-daemon.service`)

View File

@@ -1,15 +0,0 @@
# Autologin
To configure a user to be automatically logged into the Desktop at system startup, use the following Hiera keys:
- `desktop::autologin_enable`: enable/disable autologin
- `desktop::autologin_user`: user account who should be logged in
Example:
```
desktop::autologin_enable: true
desktop::autologin_user: 'proton'
```
This is only implemented for `gdm` (default) and not yet for `lightdm` (rarely used at PSI).

View File

@@ -1,18 +0,0 @@
# Banner Message
To show a specific message on the Desktop login screen, use the Hiera key `gdm::banner_message`:
```
gdm::banner_message: 'Good morning, this is a test banner message!'
```
Which then looks on RHEL8 like
![Banner Message on Login Screen](_static/banner_message.png)
The default is
```
Please contact the service desk (phone: 4800) in case you have problems logging in.
```
As the key suggests this is only implemented for `gdm` (default).

View File

@@ -1,54 +0,0 @@
# Enabling Desktop Environment
A desktop/graphical interface can be enabled on any kind of system (regardless of the Puppet role).
## Full Desktop Configuration
A full desktop configuration with Gnome, office software and printing is enabled for the Puppet roles `role::workstation` and `role::console` or, independent of the role, in Hiera with
```
base::is_workstation: true
```
## Customized Desktop Configuration
Desktop features can also be switched on or off individually.
For printing check out the [respective guide](printing).
The availability of the desktop is controlled with
```yaml
base::enable_desktop: true
```
By default this will install and enable Gnome as desktop and gdm as Display Manager. Without it, the options below do not have any effect!
The desktop configuration can be further refined and/or adapted by following keys:
```yaml
desktop::display_manager: gdm # available options: gdm (default), lightdm
# this will set the default session manager aka desktop
desktop::session_manager: gnome-xorg # available options: gnome-xorg (default on RHEL8), gnome-wayland (default on RHEL9), gnome-classic, xfce, ...
```
Individual desktops can be enabled/disabled via:
```yaml
desktop::enable_gnome: true # true (default)
desktop::enable_xfce: true # false (default)
desktop::enable_icewm: true # false (default)
```
The installation of office applications can be enforced with
```yaml
desktop::enable_office_apps: true
```
Further refinements can be done as documented in the other guides in this section.
Finally, here is a rough overview of the desktop profile structure:
![Structure desktop profile](_static/desktop_profile.svg)
[Structure desktop_profile](_static/desktop_profile.excalidraw)
# Gnome
## Gnome Tweaks
The Gnome desktop can be customized in various ways.
You can use [Gnome Tweaks](https://wiki.gnome.org/action/show/Apps/Tweaks) to modify the status bar, virtual desktops, etc.
![](_static/gnome_tweaks_start.png)
![](_static/gnome_tweaks_menu.png)
# Keyboard Layout
The default keyboard layout as well as the list of available layouts to select from can be configured with the `desktop::keyboard_layouts` key in Hiera. The value is a list of short keyboard layout identifiers.
Default is:
```
desktop::keyboard_layouts:
- 'us'
- 'de'
- 'ch'
```
You can find the available options in `/usr/share/X11/xkb/rules/base.lst` in the section `! layout`.
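For example, the identifiers can be listed with `awk` (shown against a small inline sample so the snippet is self-contained; on a real host, point it at `/usr/share/X11/xkb/rules/base.lst` instead):

```bash
# Print the first column of the "! layout" section.
sample=$(mktemp)
cat > "$sample" <<'EOF'
! layout
  us              English (US)
  de              German
  ch              German (Switzerland)
! variant
  intl            us: English (US, intl.)
EOF
awk '/^! layout/{f=1; next} /^! /{f=0} f && NF {print $1}' "$sample"
rm -f "$sample"
```

For the sample above this prints `us`, `de` and `ch`.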
# Printing
Printing at PSI usually goes via the Findme printing system. More details about this and which printers are available can be found at https://intranet.psi.ch/de/computing/pull-printing.
Each user can see their individual printing queue at https://printmgmt.psi.ch/user.
## Configuration
Enable printing on a system (printing is usually enabled by default on desktop systems):
```yaml
base::enable_printing: true
```
Special printing configuration:
```yaml
# Configure your own printers
printing::printers:
  'C5500':
    - 'Findme'
  'C400':
    - 'ODGA_006'
# set default printer - default if not specified: Findme
printing::default_printer: 'ODGA_006'
```
## Usage / Troubleshooting
```bash
# print a file (using the default queue)
[root@lx-test-02 ~]# lp test.txt
# list all print jobs
[root@lx-test-02 ~]# lpq -a
Rank    Owner   Job     File(s)                         Total Size
1st     root    3       test.txt                        1024 bytes
# delete a print job from the queue
[root@lx-test-02 ~]# lprm 3
# list all print jobs
[root@lx-test-02 ~]# lpq -a
no entries
```
# Gnome Program Starter: Dash, Dock or Panel
The default program starter on Gnome is called dash and is a bit strange for users of other operating systems. But there are extensions which provide a more conventional dock (like macOS) or a panel (like Windows).
The dock can be enabled in Hiera with
```
gnome::program_starter: dock
```
other valid values are `panel` and `dash`.
The dock is the default on RHEL9, as the panel currently has a [bug on RHEL9](https://github.com/home-sweet-gnome/dash-to-panel/issues/1891) which limits its usability.
Note that this will only set the default configuration which can be changed by the user.
To reset the user specific configuration run
```
# only reset gnome dconf settings
dconf reset -f /org/gnome
# reset all dconf settings
dconf reset -f /
# if the commands above do not work
rm ~/.config/dconf/user
```
## Known Problems
### Application Grid
Applications marked as favorites are not shown in the application grid. This is [expected behavior](https://gitlab.gnome.org/GNOME/gnome-shell/-/issues/4115).
### Dock
Favorite apps might not be shown when certain hardware is connected, see [upstream bug](https://github.com/micheleg/dash-to-dock/issues/941).
### Panel
Empty application grid on RHEL9, see [upstream bug](https://github.com/home-sweet-gnome/dash-to-panel/issues/1891).
# Screen Lock
To configure the automatic screen lock of the desktop session, use the following Hiera keys:
- `gnome::lock_enabled`: enable/disable screen lock (default `true`)
- `gnome::idle_delay`: idle time in seconds until the screen is switched off (default 5 minutes)
- `gnome::lock_delay`: time in seconds between the screen switching off and the session actually being locked (default `0`)
Example:
```
gnome::lock_enabled: true
gnome::idle_delay: 900
gnome::lock_delay: 15
```
This is only implemented for Gnome desktop (default) and not yet for XFCE or any other desktop/window manager.
# RDP Remote Access with XRDP
The basic configuration in Hiera is:
```
xrdp::enable: true
```
By default this allows creating new virtual desktop sessions or connecting to a local desktop session which is shared over VNC on port 5900.
Some more details can be adjusted when needed:
To disallow access to a shared desktop do
```
xrdp::shared_desktop::enable: false
```
whereas to disallow virtual desktop sessions there is
```
xrdp::virtual_desktop::enable: false
```
Often you may not want users to keep their desktop sessions open forever, so you may configure sessions to be closed after they have been disconnected for some time (in seconds). The default value is `0`, which disables this feature.
```
xrdp::disconnected_session_timeout: 3600
```
Also you may choose the backend for the virtual sessions, either `libxup.so` (default) or `libvnc.so`:
```
xrdp::virtual_desktop::backend: 'libvnc.so'
```
If you want to allow the same user to open a local and a remote session in parallel, you can enable the systemd nest feature:
```
xrdp::nest_systemd::enable: true
```
## Notes
Users that are only allowed to log in locally to the system (i.e. with an entry like `+:ebner-adm:LOCAL` in `/etc/security/access_users.conf`) cannot use RDP virtual sessions.
# Files, Volumes and Network Shares
```{tableofcontents}
```
# AFS
**Deprecation Note**
We plan to migrate away from AFS. We do not support AFS for RHEL9. Please contact the Linux Core Group for migration options.
Depending on the Puppet role, AFS may already be configured by default. Additionally it can be enabled or disabled in Hiera with `base::enable_afs`:
```
base::enable_afs: true
```
respectively to disable:
```
base::enable_afs: false
```
Following details can be modified, but are usually not required:
- `afs_client::mountpoint`
- `afs_client::root_volume`
- `afs_client::enable_dynroot`
- `afs_client::min_cache_size` (e.g. `8G`)
- `afs_client::files`
- `afs_client::dcache`
When disabling AFS the daemon is not automatically switched off. There is additional manual effort required on the host:
```
systemctl disable yfs-client.service
reboot
```
If you want to do it without reboot, first stop all processes using AFS. You might figure them out e.g. with `lsof | grep /afs`.
Then do
```
umount /afs
systemctl stop yfs-client.service
systemctl disable yfs-client.service
afsd -shutdown
```
# autofs
How to configure the `autofs` daemon.
## Daemon Configuration
In Hiera `base::enable_autofs` controls the `autofs` daemon. Start it with:
```
base::enable_autofs: true
```
or to keep it shut down
```
base::enable_autofs: false
```
or to keep Puppet from managing it
```
base::enable_autofs: null
```
Note that `base::enable_central_storage_mount: true` always enables `autofs` and `base::enable_autofs` will be ignored.
## Automatic NFS on /net
The automatic mount of exported shares of an NFS server below `/net/$SERVER` is controlled in Hiera with the `autofs::slash_net` flag.
On RHEL7 and RHEL8 this feature is enabled by default, on RHEL9 and later it is disabled.
To have it always enabled do
```
autofs::slash_net: true
```
## Configure own autofs Maps
For your own maps, place the `auto.master` part of the configuration in a unique file with the `.autofs` suffix in `/etc/auto.master.d/`. From there you reference your map files, which can be placed anywhere, often directly in `/etc`. To manage this via Puppet/Hiera you might check out the [Distribute Files Guide](../files/distribute_files).
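A minimal sketch of such a map (server name and paths are illustrative, not PSI defaults; the files are written to a scratch directory here so the example is self-contained, while on a real host the first file goes to `/etc/auto.master.d/data.autofs` and the second to `/etc/auto.data`):

```bash
dest=$(mktemp -d)
# master map entry: mountpoint and the map file describing it
cat > "$dest/data.autofs" <<'EOF'
/data /etc/auto.data
EOF
# the map itself: key (subdirectory), mount options, location
cat > "$dest/auto.data" <<'EOF'
shared -fstype=nfs,vers=4.2 fileserver.example.com:/export/shared
EOF
cat "$dest/data.autofs" "$dest/auto.data"
rm -rf "$dest"
# on a real host, afterwards: systemctl reload autofs
```

With this in place, accessing `/data/shared` would trigger the NFS mount.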
# Central Storage Mount (/psi.ch)
Mounts `/psi.ch`, which gives Kerberos-protected access to all network shares (NFS or CIFS/SMB/Windows) which have been configured/opened for this feature.
## Configuration
In Hiera enable it with
```
base::enable_central_storage_mount: true
```
or disable it with
```
base::enable_central_storage_mount: false
```
On workstation type systems this is enabled by default starting with RHEL9.
## Adding a Share
For a new or existing share find a suitable path below `/psi.ch/group` or `/psi.ch/project` and inform the [NAS Team](mailto:cits2-nas@psi.ch) or the [Linux Core Group](mailto:linux-eng@psi.ch).
## Kerberos and Permanent Running Software
Check out [Permanent Kerberos with gssproxy and Password Keytab](../basic/gssproxy_with_keytab) if you want to access this, e.g. with background processes, without having to type passwords (`kinit`) regularly.
## Debugging
Is autofs running and fine?
```
sudo systemctl status autofs
sudo journalctl -u autofs
```
Is the firewall blocking access to the file server?
For NFS shares, are there network access restrictions on server side for the share?
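The checks above can be scripted defensively (a sketch; each command falls back to a hint, so the script also runs where a tool is missing):

```bash
systemctl is-active autofs 2>/dev/null || echo "autofs not active (or systemctl unavailable)"
klist 2>/dev/null                      || echo "no Kerberos ticket - run kinit"
ls /psi.ch/group 2>/dev/null           || echo "/psi.ch not reachable"
```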
# Distribute Files
With Hiera it is possible to download files and git repositories as well as to create directories and symlinks.
## Create Directories
The `files::directories` hash specifies directories to be created. The keys of the hash are the absolute pathnames of the directories; the optional value is a hash with:
- `owner`: file owner (optional, default `root`)
- `group`: file group (optional, default `root`)
- `mode`: file permissions (optional, default `755`)
Parent directories are automatically created with default settings. If that is not desired, a custom definition for each parent directory is required.
Example:
```
files::directories:
  /etc/test1:
  /etc/test2/foo/bar:
    owner: 'buchel_k'
    group: 'unx-nogroup'
    mode: '775'
```
## Create Symlinks
The `files::symlinks` hash is used to configure symlinks. The keys of the hash are the absolute
pathnames of the symlinks, the values of the hash are the corresponding symlink
targets.
Example:
```
files::symlinks:
  '/opt/foo': '/var/lib/foo'
```
By default the symlink definitions are not merged over the full Hiera hierarchy; only the most specific definition is used. To allow merging, set
```
files::symlinks::merge: true
```
By default, existing files and symlinks are not overwritten. This can be changed with
```
files::symlinks::force: true
```
but this then applies to all symlink definitions.
## Create (Text) Files with Specific Content
Text files with specific content can be created as follows:
```yaml
files::files:
  /testdir/test/three:
    content: |
      hello world t
      this is a test
```
## Delete Files / Directories
Individual files and directories can be deleted as follows:
```yaml
files::files:
  /testdir/test/two:
    ensure: absent
    force: true
```
The option `force: true` is only needed for directories: without it, a directory will not be deleted (even if it is empty)!
## Download Git Repositories
To synchronize git repositories to the host, list them in the `files::git` hash. The key is the destination directory and the value is a hash with the following options:
- `url`: URL of the public git repository to clone
- `revision`: what branch, tag or commit-hash should be checked out (optional)
Example:
```
files::git:
  /var/test/container-images:
    ensure: latest
    url: 'https://git.psi.ch/linux-infra/container_images.git'
    revision: 'main'
```
If the `ensure` is missing, it will initialize the checkout with the default branch, but afterwards leave the checkout as is and not try to ensure that it is on a given revision (branch, tag or commit).
Possible values for `ensure` are: `present`, `bare`, `mirror`, `absent` and `latest`.
More details on the possible values of `ensure` can be found in [this documentation](https://forge.puppet.com/modules/puppetlabs/vcsrepo/reference#ensure).
Note that submodules are automatically initialized.
## Download Files
Files to download need to be placed in a git repository on `git.psi.ch` (internal) or `gitlab.psi.ch` (DMZ, Extranet, Tier3), where they need to be publicly available.
For configuration in Hiera there is the `filecopy::files` hash, where the key is the destination path of the file and the value is another hash with the following options:
- `repo`: the Git repository to download from
- `branch`: the Git branch in the repository (optional, default `master`)
- `path`: the file path inside the repository
- `owner`: file owner (optional, default `root`)
- `mode`: file permissions (optional, default `0644`)
Example:
```
filecopy::files:
  '/tmp/test1':
    repo: 'talamo_i/copy-file-test'
    path: 'abc'
    mode: '0600'
    owner: 'talamo_i'
```
Note that the `filecopy::files` hash is **not merged** over the hierarchy, so only the most specific one will apply.
This download functionality can be disabled with
```
base::enable_filecopy: false
```
# Mounting Volumes
Managing mount points of local or network volumes can also be done in Hiera.
For more automatic network data setups please look at
- [Windows Drives in Home Directory](windows_drives_in_home)
- [Central Storage Mount](central_storage_mount)
- [autofs](autofs)
- [AFS](afs)
## Managing Mountpoints in Hiera
The configuration in Hiera is done with two parts:
1. the definition of a mountpoint (`mounter::def::$NAME`)
2. the list of mount points actually configured on a system (`mounter::mounts`)
This way mountpoints can be prepared once at a high scope (e.g. for all systems in an environment), while the individual systems pick out whatever they require.
Example:
```
mounter::def::scratch:
  ensure: 'mounted'
  mountpoint: '/scratch'
  device: '/dev/vg_data/lv_scratch'
  type: 'xfs'
mounter::mounts:
  - 'scratch'
```
The directory of the mountpoint is automatically created when missing.
For auto-mounts, add another option to the mountpoint definition:
```
auto: true
```
## NFS
Remote NFS mountpoints can be defined as in following example:
```
mounter::def::data1:
  'ensure': 'mounted'
  'device': 'x01dc-fs-1:/export/X01DC/Data1'
  'mountpoint': '/sls/X01DC/Data1'
  'type': 'nfs'
  'options': 'nfsvers=4.2,sec=krb5'
mounter::def::controls:
  'ensure': 'mounted'
  'device': 'sls-hafs:/export/sls/controls'
  'mountpoint': '/gfa/.mounts/sls_controls'
  'type': 'nfs'
mounter::mounts:
  - 'data1'
  - 'controls'
```
Ideally NFSv4 (option `nfsvers=4.2`) and Kerberos authentication (option `sec=krb5`) are used. Of course the NetApp side needs to be prepared accordingly as well.
The following options are possible for `sec`:
- `sys`: client enforces access control (default on NFSv3)
- `krb5`: server enforces access control, client user authenticates with Kerberos
- `krb5i`: server enforces access control, client user authenticates with Kerberos and traffic is integrity checked
- `krb5p`: server enforces access control, client user authenticates with Kerberos and traffic is fully encrypted
NFS with Kerberos also needs ID mapping, which is automatically configured to the default domain `psi.ch`. Should a different domain be required, you may set it in Hiera:
```
nfs_idmap::domain: 'ethz.ch'
```
Check out [Permanent Kerberos with gssproxy and Password Keytab](../basic/gssproxy_with_keytab) if you want to access a Kerberos-protected share, e.g. with background processes, without having to type passwords (`kinit`) regularly.
## CIFS
### CIFS with Multiuser Option and Kerberos
Mounting a CIFS share with the `multiuser` option and Kerberos has the advantage that no password is needed and each user gets their personal access rights checked on the server side. However, similar to AFS, the user needs an appropriate Kerberos ticket. Additionally the option `cifsacl` is required to show the proper file owner.
```
mounter::def::scratch:
  ensure: 'mounted'
  device: '//scratch01.psi.ch/scratch'
  mountpoint: '/media/scratch'
  type: 'cifs'
  options: 'multiuser,sec=krb5,cifsacl'
mounter::mounts:
  - 'scratch'
```
For shares on NetApp (`fs00.psi.ch` to `fs03.psi.ch`) you can mount the `data` folder, which contains all shares on this server:
```
mounter::def::fs00:
  ensure: 'mounted'
  device: '//fs00.psi.ch/data'
  mountpoint: '/cifs/fs00'
  type: 'cifs'
  options: 'multiuser,sec=krb5,cifsacl'
mounter::mounts:
  - 'fs00'
```
This only works if `everybody` has read access to the share itself; this is not needed for the subfolders.
Otherwise you need a password as below or a keytab (feel free to ask the Linux Group for support).
Check out [Permanent Kerberos with gssproxy and Password Keytab](../basic/gssproxy_with_keytab) if you want to access a Kerberos-protected share, e.g. with background processes, without having to type passwords (`kinit`) regularly.
### CIFS with User and Password
Remote CIFS mountpoints can be defined as follows:
```
mounter::def::emf:
  ensure: 'mounted'
  device: '//172.23.75.16/Users'
  mountpoint: '/emf/jeol2200fs/k2'
  type: 'cifs'
  options: 'credentials=/etc/cifs-utils/cifs_mpc2375,uid=35667,gid=35270,forcegid,file_mode=0660,dir_mode=0770'
cifs:
  cifs_mpc2375:
    username: 'allowedWindowsUser'
    password: 'ENC[PKCS7,MIIBeQYJKoZIhvc...]'
mounter::mounts:
  - 'emf'
```
In the above example a `credentials` file is created with the content below the `cifs` -> `$NAME` parameter. The file is named after the key below `cifs`, is located in `/etc/cifs-utils` and contains the username and password allowed to mount the share. To be used, the file needs to be referenced via `credentials=` in the mount options.
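The resulting credentials file (here `/etc/cifs-utils/cifs_mpc2375`, mode `0600`) then has the shape below; the password value is the decrypted plaintext of the `ENC[...]` string, shown here only as a placeholder:
```
username=allowedWindowsUser
password=<decrypted password>
```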
## Systemd Automount
Adding the options `noauto,x-systemd.automount` prevents the mount at startup; instead, the volume is automounted on first use of the mountpoint.
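A sketch based on the scratch definition above (an assumption, not a verified configuration: this presumes the module passes `ensure` and `options` through to Puppet's `mount` resource, with `ensure: 'present'` managing only the `/etc/fstab` entry and leaving the mounting to systemd):
```
mounter::def::scratch:
  ensure: 'present'
  mountpoint: '/scratch'
  device: '/dev/vg_data/lv_scratch'
  type: 'xfs'
  options: 'noauto,x-systemd.automount'
mounter::mounts:
  - 'scratch'
```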
## Bind Mounts
Bind mounts can be defined as follows:
```
mounter::def::e10550:
  'ensure': 'mounted'
  'device': '/gpfs/perf/MX/Data10-pro/e10550'
  'mountpoint': '/sls/MX/Data10/e10550'
  'type': 'none'
  'options': 'bind,_netdev,x-systemd.requires-mounts-for=/gpfs/perf/MX/Data10-pro'
```
Note that besides the mandatory `bind` option:
- `_netdev` to be set when the directory to bind (`device`) is on a network volume
- `x-systemd.requires-mounts-for=$OTHER_MOUNTPOINT` ensures that systemd prepares the bind mount after the volume on which the directory to bind (`device`) is located
## Removing a Mount
Only removing a mount point definition from Hiera does not unmount it or remove it from the node. This can be done manually by unmounting it and removing it from `/etc/fstab`.
Alternatively, an `absent` mount definition as in the example below will automatically unmount it and remove the `/etc/fstab` entry:
```
mounter::def::scratch:
  ensure: 'absent'
  mountpoint: '/media/scratch'
mounter::mounts:
  - 'scratch'
```
This configuration can then be removed again after it has been rolled out once to all concerned nodes.
# NFS Server
Your node can serve as NFS server. This guide is for RHEL8 and later.
By default RHEL8 serves NFSv3 and NFSv4, whereas RHEL9 and later serve only NFSv4. NFSv3 can still be configured if required.
To enable the NFS server, set `base::enable_nfs_server` accordingly in Hiera:
```
base::enable_nfs_server: true
```
## Exports
Then the exports go below `nfs_server::exports`:
```
nfs_server::exports:
  '/home/meg/dcbboot':
    clients:
      - hosts: '*'
        options: 'rw,sync,no_root_squash,no_subtree_check'
  '/export/swissfel_athos/raw/maloja-staff':
    options: 'fsid=1012'
    clients:
      - hosts: 'sf-export-1[2-3]-100g'
        options: 'rw,async,no_root_squash'
```
For each directory you want to export, place its path as a key below `nfs_server::exports`. Below it you can set global options in `options` and a list of client-specific access restrictions and options below `clients`.
Possible client host definitions and options can be found in [`exports(5)`](https://man7.org/linux/man-pages/man5/exports.5.html).
## Non-Standard NFS Server Configuration
If you wish non-standard configuration in `/etc/nfs.conf`, you may set it in Hiera with key `nfs_server::nfs_conf`:
```
nfs_server::nfs_conf:
  nfsd:
    udp: 'y'
    vers3: 'n'
    threads: 32
```
For more details see [`nfs.conf(5)`](https://man7.org/linux/man-pages/man5/nfs.conf.5.html)
## Kerberos
Kerberos support for the NFS server is configured automatically by Puppet.
## Exporting from GPFS
If you want to export data backed by GPFS, please set
```
nfs_server::after_gpfs: true
```
to start the NFS server only after GPFS is ready.
# Partially Shared Home - Automatically Link Files in Home - Admin Guide
Shared homes create a set of problems, depending on how well a software handles running multiple instances of itself, possibly with different versions, on different hosts, possibly at the same time.
Still it is useful to have certain tool settings always with you.
Thus we suggest having a local home directory by default (`base::local_homes: true`) and sharing software configuration only selectively.
That configuration then needs to be on a file share, ideally the home drive (U:).
To ensure the tools find their configuration at the expected place, symlinks pointing to the shared version are created automatically.
The home drive (U:) also holds the personal configuration file of this feature (`linux_home_links.yaml` or `.linux_home_links.yaml`).
The mountpoints can be found in `~/network-drives`.
The `U:` drive is there named `home`, while the rest keeps their original share name.
This feature is enabled by default on workstation type systems starting with RHEL9.
It can be controlled in Hiera with:
```
user_session::automatic_links_in_home: true
```
See also the [User Guide](../../../user-guide/partially_shared_home).
This feature depends on the ["Windows Drives in Home Directory" feature](windows_drives_in_home).
Note this does not work for RHEL7.
# Partitioning
## Resize System Volumes
The sizes of the system volumes (inside the volume group `vg_root`) are set rather small at the initial installation.
Note that due to the limitations of XFS a volume can only be increased, not be shrunk.
### Get Available Space and Volume Sizes
To check how much space is still available, use `pvs` and look for the volume group `vg_root`:
```
[root@lxdev00 ~]# pvs
PV         VG      Fmt  Attr PSize   PFree
/dev/sda2  vg_root lvm2 a--  <62.94g <5.19g
[root@lxdev00 ~]#
```
Then `lvs` gives the sizes of the volumes inside:
```
[root@lxdev00 ~]# lvs
LV         VG      Attr       LSize   Pool Origin Data%  Meta%  Move Log Cpy%Sync Convert
lv_home    vg_root -wi-ao---- <31.44g
lv_openafs vg_root -wi-ao---- 2.00g
lv_root    vg_root -wi-ao---- 12.31g
lv_tmp     vg_root -wi-ao---- 2.00g
lv_var     vg_root -wi-ao---- 8.00g
lv_var_log vg_root -wi-ao---- 2.00g
[root@lxdev00 ~]#
```
### Extend System Volume
This can be done in Hiera with the `vgroot::path` key, where for each volume to be increased a new minimum size can be set:
```
vgroot::path:
  lv_root: 15GB
  lv_var_log: 3GB
```
This is then applied on the next Puppet run; you may trigger one as `root`:
```
puppet agent -t
```
## Custom Partitioning
To add a new volume to the system disk, you need to address the [lvm Puppet module](https://forge.puppet.com/modules/puppetlabs/lvm) directly in Hiera:
```yaml
lvm::volume_groups:
  vg_root:
    physical_volumes:
      - /dev/nvme0n1p3
    logical_volumes:
      lv_data:
        size: 3TB
        fs_type: 'xfs'
        mountpath: '/mnt/data'
        size_is_minsize: true
```
Please note that you also need to list the partition on which `vg_root` is located.
You can do the same to add volumes outside of the system disk, but here you need to define the full LVM volume group:
```yaml
lvm::volume_groups:
  vg_data:
    physical_volumes:
      - '/dev/sdb'
    logical_volumes:
      lv_usr_local:
        mountpath: '/usr/local'
        fs_type: 'xfs'
```
Sometimes the classical disk names `sda` and `sdb` are not stable or may change due to a connected USB stick.
You can also address a disk by ID or hardware path with one of the links below `/dev/disk`:
```yaml
lvm::volume_groups:
  vg_data:
    followsymlinks: true
    physical_volumes:
      - '/dev/disk/by-path/pci-0000:01:00.0-scsi-0:2:0:0'
    logical_volumes:
      lv_data:
        mountpath: '/mnt/vd0'
        fs_type: 'ext4'
```
but for the `lvm` Puppet module to accept this you need to set `followsymlinks: true`.
# Resize VM Disk
To increase the hard disk of a VM, go to vcenter.psi.ch, select the machine, right-click, and select __Edit Settings ...__
![](_static/resize_vm_disk_01.png)
Increase the hard disk size to the value you need:
![](_static/resize_vm_disk_02.png)
Now connect to the system and change to root.
List block devices:
```bash
[root@awi-ci-01 ~]# lsblk
NAME                   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                      8:0    0   42G  0 disk
├─sda1                   8:1    0  600M  0 part /boot/efi
├─sda2                   8:2    0    1G  0 part /boot
└─sda3                   8:3    0 32.4G  0 part
  ├─vg_root-lv_root    253:0    0   14G  0 lvm  /
  ├─vg_root-lv_var_tmp 253:1    0    2G  0 lvm  /var/tmp
  ├─vg_root-lv_var_log 253:2    0    3G  0 lvm  /var/log
  ├─vg_root-lv_var     253:3    0    8G  0 lvm  /var
  ├─vg_root-lv_tmp     253:4    0    2G  0 lvm  /tmp
  └─vg_root-lv_home    253:5    0    2G  0 lvm  /home
```
The command will probably not show the correct disk size yet. For this the VM either needs to be rebooted, or you can trigger a re-read of the disk size without a reboot with this command (replace `sda` with the name of the disk you resized):
```
echo 1 > /sys/block/sda/device/rescan
```
Extend partition:
```bash
[root@awi-ci-01 ~]# parted /dev/sda
GNU Parted 3.2
Using /dev/sda
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) print
Model: VMware Virtual disk (scsi)
Disk /dev/sda: 45.1GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:
Number  Start   End     Size    File system  Name                  Flags
 1      1049kB  630MB   629MB   fat32        EFI System Partition  boot, esp
 2      630MB   1704MB  1074MB  ext4
 3      1704MB  36.5GB  34.8GB                                     lvm
(parted) resizepart 3
End? [36.5GB]? -0
(parted) print
Model: VMware Virtual disk (scsi)
Disk /dev/sda: 45.1GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:
Number  Start   End     Size    File system  Name                  Flags
 1      1049kB  630MB   629MB   fat32        EFI System Partition  boot, esp
 2      630MB   1704MB  1074MB  ext4
 3      1704MB  45.1GB  43.4GB                                     lvm
(parted) quit
```
Resize physical volume:
```bash
[root@awi-ci-01 ~]# pvresize /dev/sda3
Physical volume "/dev/sda3" changed
1 physical volume(s) resized or updated / 0 physical volume(s) not resized
```
Now the actual volumes of the machine can be extended as documented in [Partitioning](partitioning.md).
# SFTP Server
By default SFTP is enabled in the `sshd` configuration. You may disable it with:
```
ssh_server::sftp::enable: false
```
or change it with e.g.:
```
ssh_server::sftp::server: '/usr/libexec/openssh/sftp-server -l INFO'
```
which configures more logging.
# systemd-tmpfiles
[`systemd-tmpfiles`](https://www.freedesktop.org/software/systemd/man/latest/systemd-tmpfiles.html) can be used to create, delete, and clean up files and directories.
The system-wide configuration is in [`/etc/tmpfiles.d/*.conf`](https://www.freedesktop.org/software/systemd/man/latest/tmpfiles.d.html).
## Hiera Configuration
In Hiera, below `profile::tmpfiles` you can define several use cases (here `podman`), each with its individual configuration in `content`, which contains the full configuration as documented in [`tmpfiles.d(5)`](https://www.freedesktop.org/software/systemd/man/latest/tmpfiles.d.html):
```
profile::tmpfiles:
  podman:
    content: |
      # This file is distributed by Puppet: profile::tmpfiles
      # See tmpfiles.d(5) for details
      # Remove podman temporary directories on each boot
      # https://github.com/containers/podman/discussions/23193
      R! /tmp/containers-user-*
      R! /tmp/podman-run-*
```
Undoing the above configuration requires:
```
profile::tmpfiles:
  podman:
    ensure: absent
```
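To apply the rules immediately instead of waiting for the next boot, the snippet written by Puppet (e.g. `/etc/tmpfiles.d/podman.conf`) can be passed to `systemd-tmpfiles` directly; guarded here so the command degrades gracefully where the file or tool is missing:

```bash
systemd-tmpfiles --remove /etc/tmpfiles.d/podman.conf 2>/dev/null \
    || echo "systemd-tmpfiles failed or is not available"
```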
# Windows Drives in Home Directory
The Windows shares which get automatically connected on a PSI Windows system can also be mounted automatically on login on a RHEL system.
The mountpoints can be found in `~/network-drives`.
The `U:` drive is there named `home`, while the rest keeps their original share name.
This feature is enabled by default on workstation type systems.
It can be controlled in Hiera with:
```
user_session::mount_cifs_shares: true
```
These mounts are created with the first user session and will end with the last session closed.
If for some reason they are not created (e.g. due to an offline login), you might execute `setup-network-drives` to bring them back again.
Note this does not work for RHEL7.
# Monitoring Configuration Guides
```{tableofcontents}
```
# Configure Central Logging to Elastic
To ship the logs of a system to the central logging service (Elastic), the following prerequisites are needed:
1. Have a space in Elastic to ship the logs to
2. Have a space API key
For both prerequisites, contact Michel Rebmann (michel.rebmann@psi.ch) / Group 9522, who will provide a configuration similar to the following:
```
{
  "id" : "${space_id}",
  "name" : "input_${space_name}",
  "api_key" : "${space_api_key}",
  "encoded" : "${space_encoded_key}"
}
```
Afterwards the log shipping can be configured as follows in Hiera:
```
base::enable_elastic: true
elastic::space: "${space_name}"
elastic::space_api_key: "${space_id}:${space_api_key}" # The resulting string should be encrypted
```
```{note}
Replace the space name as well as the space_api_key according to your setup.
```
Notice that `space` contains the `name` without the `input_` prefix, while `space_api_key` contains a `:`-separated value:
* the first part corresponds to the `id` of the space,
* the second part corresponds to the `api_key`.
* The resulting string `"${space_id}:${space_api_key}"` **should be encrypted** with [eyaml](https://linux.psi.ch/admin-guide/puppet/hiera.html?highlight=eyaml#encrypting-data-with-the-public-key)
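The encryption itself can be done with the `eyaml` CLI from hiera-eyaml (a sketch; the public-key path is an assumption depending on your repository layout, and the command is guarded so the example degrades gracefully where `eyaml` is not installed):

```bash
if command -v eyaml >/dev/null; then
    # prints the ENC[PKCS7,...] string to paste into Hiera
    eyaml encrypt --pkcs7-public-key=keys/public_key.pkcs7.pem \
        -s "${space_id}:${space_api_key}"
else
    echo "eyaml not installed - see the linked Hiera guide for setup"
fi
```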
By default __all__ journald logs are shipped to the central Elastic instance. If you want to __limit__ the logs to specific units, the units can be specified as follows:
```
elastic::systemd_units:
- 'sshd.service'
```
# Icinga2 Configuration
Icinga2 is in production, but checks are still being added:
- ✅ standard Linuxfabrik checks
- 🏗️ support for automatically installed Icinga1 checks by Puppet ([see issue](https://git.psi.ch/linux-infra/issues/-/issues/419))
- ✅ support for custom checks
You get the overview of your nodes in Icinga2 at [monitoring.psi.ch](https://monitoring.psi.ch), where you can handle alerts, create service windows, etc.
The configuration itself, however, is not done there, but in Hiera, and is propagated automatically.
## TL;DR
I, admin of xyz.psi.ch want ...
... **monitoring with e-mails during office hours**:
```
icinga2::enable: true
icinga2::agent::enable: true
icinga2::alerting::enable: true
```
... **monitoring with SMS all around the clock**:
```
icinga2::enable: true
icinga2::agent::enable: true
icinga2::alerting::enable: true
icinga2::alerting::severity: 1
```
... **just be able to check monitoring state on monitoring.psi.ch**:
```
icinga2::enable: true
icinga2::agent::enable: true
icinga2::alerting::enable: false
icinga2::alerting::severity: 5
```
... **no monitoring**:
```
icinga2::enable: false
```
## Basic Configuration
Enable monitoring with Icinga2 by
```
icinga2::enable: true
```
(which is `false` by default for RHEL7 and RHEL8, but `true` for RHEL9 and later).
This only does the ping test to check if the host is online on the network. For further checks on the host itself the agent needs to be started:
```
icinga2::agent::enable: true
```
(also here it is `false` by default for RHEL7 and RHEL8, but `true` for RHEL9 and later).
Still no alerts are generated; they are suppressed by a global infinite service window. If you want alerting, set
```
icinga2::alerting::enable: true
```
By default these alerts are sent to the admins during office hours. For further notification fine-tuning, check out the chapters Notifications and Check Customization.
Finally, if Icinga2 is to be managed without Puppet (not recommended except for Icinga2 infrastructure servers), set
```
icinga2::puppet: false
```
## Web Access
Users and groups in `aaa::admins` and `icinga2::web::users` will have access to these nodes on [monitoring.psi.ch](https://monitoring.psi.ch).
Prefix the group name with a `%` to distinguish them from users.
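A minimal sketch of such a configuration (the user and group names are made-up placeholders):

```
icinga2::web::users:
  - 'smithj'
  - '%unx-mygroup'
```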
## Notifications
### Notification Recipients
By default the notifications are sent to all admins, i.e. the users and groups listed in Hiera at `aaa::admins`, with the exception of the default admins from `common.yaml` and the group `unx-lx_support`. If the admins should not be notified, disable the sending of messages with
```
icinga2::alerting::notify_admins: false
```
In addition to or instead of the admins you can list the notification recipients in the Hiera list `icinga2::alerting::contacts`. You can list
- AD users by login name
- AD groups with `%` as prefix to their name
- plain e-mail addresses
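Putting the three recipient types together, a sketch with placeholder names could look like this:

```
icinga2::alerting::contacts:
  - 'smithj'
  - '%unx-mygroup'
  - 'oncall@example.com'
```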
### Notification Time Restrictions
Notifications for warnings and alerts are sent out by default during office hours, i.e. Monday to Friday 08:00 - 17:00.
This can be configured in Hiera with the `icinga2::alerting::severity` key, which is `4` by default. The following options are possible:
| node severity | media | time |
|---------------|------------------|--------------|
| `1` | SMS and e-mail | 24x7 |
| `2` | e-mail | 24x7 |
| `3` | e-mail | office hours |
| `4` | e-mail | office hours |
| `5` | no notifications | never |
(Currently `3` and `4` behave the same.)
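For example, to receive e-mail notifications around the clock (severity `2` in the table above), set:

```
icinga2::alerting::severity: 2
```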
Please note that for services where the `criticality` variable is set, the time when notifications are sent out is restricted as well:
| service criticality | time |
|---------------------|--------------|
| - | 24x7 |
| `A` | 24x7 |
| `B` | office hours |
| `C` | never |
The more restrictive setting wins, e.g. a service with criticality `C` will never cause a notification, independent of the node severity.
To receive notification messages over SMS, you need to register your mobile phone with Icinga2. You may request this by informing icinga2-support@psi.ch. Alternatively, you will get an e-mail with the request to do so when the first SMS would have been sent to you while your phone number is still missing.
## Default Checks
By default we already run a comprehensive set of checks. Some of them can be fine-tuned in Hiera.
Whenever you have a use case which is not covered yet, please talk to us.
## Check Customization
Most checks can take custom parameters. The variables you can adapt are listed as "Custom Variables" on the page of the given service. In Hiera, add below the key `icinga2::service_check::customize` a multi-level hash with the service name and, below it, the variable names with their new values.
### Example "CPU Usage"
Let's look at the example of the `CPU Usage` "service":
!["CPU Usage" service page](_static/icinga2_service_custom_variables.png)
If the machine is a number cruncher and the CPU may be fully utilized, you can make the check always report OK:
```
icinga2::service_check::customize:
'CPU Usage':
cpu_usage_always_ok: true
```
If, on the contrary, you want an immediate notification when the CPU is overused, the following snippet is more advisable:
```
icinga2::service_check::customize:
'CPU Usage':
criticality: A
```
If it is a Linuxfabrik plugin, you will find a link under "Notes" which points to the documentation of the check. This might shed more light on the effect of these variables.
### Example "Kernel Ring Buffer (dmesg)"
Another check which can easily have false alerts, but also has a big potential to signal severe kernel or hardware issues, is the check of the kernel log (dmesg).
If you conclude that a given message can safely be ignored, you may add it to the ignore list; a partial string match is enough for the message to be ignored in the future:
```
icinga2::service_check::customize:
'Kernel Ring Buffer (dmesg)':
'dmesg_ignore':
- 'blk_update_request: I/O error, dev fd0, sector 0'
- 'integrity: Problem loading X.509 certificate -126'
```
If you think that this log message can be globally ignored, please inform the Linux Team so we can ignore it by default.
Note that you can reset this check after dealing with it by executing on the node:
```
dmesg --clear
```
## Extra Checks
### TLS/SSL Certificate Expiration
To monitor the expiration of one or more certificates you need to give the node in Hiera the additional server role `ssl-cert` (except for `role::jupyterserver`):
```
icinga2::additional_server_role:
- 'ssl-cert'
```
Then list what certificate files you want to have checked:
```
icinga2::service_check::customize:
'TLS/SSL Certificate Expiration':
ssl_cert_files:
- '/etc/xrdp/cert.pem'
- '/etc/httpd/ssl/node.crt'
```
Besides the file list, you may set the warning time in days with the attribute `ssl_cert_warning` (`7` by default) and the critical time with the attribute `ssl_cert_critical` (`3` by default).
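For example, to be warned 14 days and alerted 3 days before expiration (the values and the file path are chosen for illustration):

```
icinga2::service_check::customize:
  'TLS/SSL Certificate Expiration':
    ssl_cert_files:
      - '/etc/httpd/ssl/node.crt'
    ssl_cert_warning: 14
    ssl_cert_critical: 3
```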
If you run your own PKI, you might also check a CA certificate for expiration with
```
icinga2::additional_server_role:
- 'ca-cert'
icinga2::service_check::customize:
'CA Certificate Expiration':
ssl_cert_files:
- '/etc/my_pki/ca.pem'
```
Here the warning is below 180 days and below 30 days is critical by default.
### Check for Systemd Service Status
To check if a daemon or service has been successfully started by `systemd` configure a custom service using the `st-agent-awi-lx-service-active` template:
```
icinga2::custom_service:
'XRDP Active':
template: 'st-agent-awi-lx-service-active'
vars:
criticality: 'A'
service_names:
- 'xrdp'
- 'xrdp-sesman'
```
The name (here `XRDP Active`) needs to be unique over all Icinga "services" of a single host.
The `service_names` variable needs to contain one or more names of `systemd` services to be monitored.
You can create multiple of these checks.
Alternatively a more detailed configuration of a systemd unit state check can be done with the `st-agent-lf-systemd-unit` template:
```
icinga2::custom_service:
'Last Puppet Run':
template: 'st-agent-lf-systemd-unit'
vars:
systemd_unit_unit: 'pli-puppet-run'
systemd_unit_activestate: ['active', 'inactive']
systemd_unit_unitfilestate: 'static'
criticality: 'A'
```
### External Connection Checks (Active Checks)
For this we have fully custom service checks.
Below example is for a RDP port:
```
icinga2::custom_service:
'RDP Access':
command: 'tcp'
agent: false
perf_data: true
vars:
criticality: 'A'
tcp_port: 3389
```
Possible commands are [`http`](https://icinga.com/docs/icinga-2/latest/doc/10-icinga-template-library/#plugin-check-command-http), [`tcp`](https://icinga.com/docs/icinga-2/latest/doc/10-icinga-template-library/#plugin-check-command-tcp), [`udp`](https://icinga.com/docs/icinga-2/latest/doc/10-icinga-template-library/#plugin-check-command-udp), [`ssl`](https://icinga.com/docs/icinga-2/latest/doc/10-icinga-template-library/#plugin-check-command-ssl), [`ssh`](https://icinga.com/docs/icinga-2/latest/doc/10-icinga-template-library/#plugin-check-command-ssh) or [`ftp`](https://icinga.com/docs/icinga-2/latest/doc/10-icinga-template-library/#plugin-check-command-ftp).
Note that if you want to reference the hostname, you can use a macro, e.g.:
```
http_vhost: '$host.name$'
```
Note that macros only work for check command arguments.
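As a sketch, an HTTP check using this macro could look as follows (the service name is a placeholder; whether further `http_*` variables are needed depends on your setup):

```
icinga2::custom_service:
  'Web Access':
    command: 'http'
    agent: false
    vars:
      criticality: 'B'
      http_vhost: '$host.name$'
```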
The actual service name is up to you; it only needs to be unique.
### Other Custom Checks
It is possible to create a fully custom check. Note, however, that the command or service template used needs to be available/configured by some other means on the Icinga Master, and the check plugin executed on the Icinga Satellite or by the Icinga agent also needs to be available or distributed by other means. So please reach out to the [Linux Team](mailto:linux-eng@psi.ch) to discuss how to do it best and to ensure that everything is in place.
```
icinga2::custom_service:
'My Service Check 1':
template: st-agent-lf-file-size
vars:
criticality: 'B'
file_size_filename: '/var/my/growing/file'
      file_size_warning: '100M'
      file_size_critical: '200M'
'My Service Check 2':
command: 'tcp'
agent: false
vars:
criticality: 'A'
tcp_port: 3389
perf_data: true
```
Below `icinga2::custom_service`, set the name of the service/service check as it will be seen in Icingaweb. The possible arguments are:
- `command` to issue a check command
- `template` to inherit from given service template
- `agent` whether the `command` shall run on the agent or the satellite (only if `template` is not set); default is `true`
- `vars` hash with arguments for the service check
- `perf_data` whether performance data should be recorded and a performance graph shown; default is `false`
You are free to choose the actual service name; it only needs to be unique.

# Journald Tuning
For the systemd journal, size restriction and rate limiting can be fine-tuned.
## Size Restriction
In Hiera, `log_client::journal_system_max_use` (default `50%`) limits the total size of the journal, whereas `log_client::journal_system_keep_free` (default `25%`) defines how much disk space is kept free in `/var/log` for other use. In addition to the syntax described in [journald.conf(5)](https://www.freedesktop.org/software/systemd/man/latest/journald.conf.html) (bytes, or with K, M, G, T, P, E as units) we also support percentages, i.e. `25%` means that the journal will use at most/keep free at least 25% of `/var/log`. Note that for percentage limits `/var/log` must be a separate partition; otherwise absolute values need to be set.
`log_client::journal_system_max_file_size` limits the size of an individual journal file. The default is `32M`.
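Putting the three keys together, a node with a small `/var/log` partition might be configured as follows (the values are examples only):

```
log_client::journal_system_max_use: '2G'
log_client::journal_system_keep_free: '30%'
log_client::journal_system_max_file_size: '64M'
```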
If there is no need for a persistent log at all, it can be disabled with
```
log_client::persistent_journal: false
```
## Rate Limiting
In Hiera, `log_client::journal_rate_limit_burst` defines how many messages of a service are at least logged within the interval period (default 30s); note that the [actual limit depends on the available disk space](https://www.freedesktop.org/software/systemd/man/latest/journald.conf.html#RateLimitIntervalSec=). The default is `10000` messages.
`log_client::journal_rate_limit_interval` defines the above-mentioned interval period. Allowed time units are `s`, `min`, `h`, `ms` and `us`; if no unit is specified, seconds are assumed.
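For example, to allow up to 20000 messages per service within a one-minute interval (the values are illustrative):

```
log_client::journal_rate_limit_burst: 20000
log_client::journal_rate_limit_interval: '1min'
```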
Rate limiting is disabled with
```
log_client::journal_rate_limit_interval: 0
```

# Metric Collections - Configuration Telegraf
There is a central metrics server at PSI that is accessible via https://metrics.psi.ch. All standard Linux systems are able to send metrics to this server once Telegraf metrics collection is enabled via Hiera.
The following statement enables the metrics collection:
```yaml
base::enable_telegraf: true
```
By default a number of metrics are collected, including cpu, disk usage, diskio, etc.
A detailed list with the defaults can be found in [common.yaml](https://git.psi.ch/linux-infra/puppet/-/blob/preprod/data/common.yaml#L855) of the puppet repository.
Custom metrics can also be added. (documentation to be done - please contact the Linux Core group if you need this).
Depending on the location of the system, Hiera/Puppet will configure it to send the data either directly (PSI intranet) or via a reverse proxy (DMZ, extranet, tier3) to the central metrics server.
If you run your own metrics server or want to explicitly override where data is sent to, you can do so as follows:
```yaml
telegraf::agent:
url: http://your-metric-server.psi.ch
```
If you want to tweak how metrics are collected, you can do that as well (the following are the defaults; only specify the keys you would like to override):
```yaml
telegraf::agent:
interval: '1m'
collection_jitter: '0s'
flush_interval: '1m'
flush_jitter: '10s'
metric_buffer_limit: 10000
```
By default Puppet will purge and recreate (if needed) all config files in `/etc/telegraf/telegraf.d`. If you want to deploy your own metrics collection scripts outside of Puppet/Hiera, you can disable the purging via:
```yaml
telegraf::config::purge: false
```
You can also configure your own metric to be collected via hiera as follows:
```yaml
telegraf::metrics:
'your_metric':
plugin: 'exec'
timeout: '30s'
interval: '1m'
data_format: 'influx'
commands: ['sudo /your/script/location/script.sh']
enable: true
```
This will only work if you have deployed the necessary script (in the example `/your/script/location/script.sh`) and the necessary sudo rule(s) beforehand. For this you might want to use the techniques described in [Distribute Files](../files/distribute_files) and/or [Custom sudo Rules](../access/sudo).
## Examples
### Custom Script
A custom telegraf collector can look something like this:
```bash
#!/bin/bash
CONNECTED=$(/usr/NX/bin/nxserver --history | awk '$7 == "Connected" {print}' | wc -l)
DISCONNECTED=$(/usr/NX/bin/nxserver --history | awk '$7 == "Disconnected" {print}' | wc -l)
FINISHED=$(/usr/NX/bin/nxserver --history | awk '$7 == "Finished" {print}' | wc -l)
# Provide data to telegraf
echo "nxserver open_sockets=$(lsof -i -n -P | wc -l),connected_sessions=${CONNECTED},disconnected_sessions=${DISCONNECTED},finished_sessions=${FINISHED}"
```
The first string of the `echo` output is the name of the series the data is written into. This name can be overridden in the metric config via `name_override = "nxserver_report"`.
### Custom Config File
A custom config file in `/etc/telegraf/telegraf.d` could look like this:
```
[[inputs.exec]]
name_override = "remacc_report"
timeout = "30s"
interval = "5m"
data_format = "influx"
commands = ["sudo /usr/lib/telegraf/scripts/remacc_report.sh"]
```

# Syslog Forwarding
To forward the system logs to a Syslog server, configure in Hiera `log_client::forward_to`:
```
log_client::forward_to:
- 'log1.psi.ch:1514'
- '@log2.psi.ch'
```
This sends to `log1.psi.ch` on the custom UDP port `1514`, while `log2.psi.ch` gets a copy on the TCP default port `514`.

# Software Management
Guides on how to provide/install and run software on a computer
```{tableofcontents}
```

# Citrix VDA Installation
There is an [installation guide](https://docs.citrix.com/en-us/linux-virtual-delivery-agent/current-release/installation-overview/manual-installation-overview/redhat.html) by Citrix for installing the Citrix VDA manually on Red Hat systems.
The following Hiera settings will bring the system into a state as requested in the installation guide:
```
# Citrix VDA specialities
hostname::short: true
networking::hostname_on_lo: true
aaa::sssd_cache_creds: false
aaa::default_krb_cache: "FILE:/tmp/krb5cc_%{literal('%')}{uid}"
```
Note that for `hostname -f` to work correctly with `hostname::short: true` you also need to set `networking::hostname_on_lo: true`, because glibc `getaddrinfo()` reads the first hostname in `/etc/hosts` to determine the fully qualified hostname.

# Cockpit
The Hiera example below will install and activate the [web-based management interface Cockpit](https://cockpit-project.org/), plus possibly some required modules:
```
base::package_groups:
- 'cockpit'
base::pkg_group::cockpit:
- 'cockpit'
- 'cockpit-...'
base::sockets:
cockpit:
enable: true
```
If you would like to disallow external access and only allow connections from the host itself, extend the socket configuration as below (the empty first `ListenStream` entry clears the previously configured listen addresses):
```
base::sockets:
cockpit:
enable: true
dropin: true
options:
Socket:
ListenStream: ['', '[::1]:9090']
```

# Conda / Anaconda
Conda / Anaconda (https://conda.org) is a package manager that can be used to easily create tailored Python environments without needing the C/... binaries required by some packages to be installed on the local system (i.e. conda takes care that the required binaries are installed into the environment).
```{warning}
Due to certain license restrictions, the usage of Anaconda at PSI is not allowed without obtaining a professional license from Anaconda.com!
However, the usage of the `conda` command and/or of packages from conda-forge is still possible - see below.
```
This guide explains how to install and configure conda in accordance with the Anaconda.com license terms.
## Installation
On a standard Linux system the conda package can simply be installed via:
```
yum install conda
```
To have Puppet install the package, you can use the following Hiera config:
```yaml
base::pkg_group::extra:
- 'conda'
base::package_groups:
- 'extra'
```
## Configuration
To override the default configuration of conda (which would violate the license, because packages would be installed from anaconda.org by default), the following config file needs to be deployed as `/etc/conda/condarc.d/base.yml` (or similar):
```yaml
channels:
- conda-forge
- nodefaults
# Show channel URLs when displaying what is going to be downloaded
# and in 'conda list'. The default is False.
show_channel_urls: True
always_copy: true
```
To place the file via Puppet/Hiera you can use the following configuration:
```yaml
files::files:
/etc/conda/condarc.d/base.yml:
content: |
channels:
- conda-forge
- nodefaults
# Show channel URLs when displaying what is going to be downloaded
# and in 'conda list'. The default is False.
show_channel_urls: True
always_copy: true
```
To be able to use the `conda` command, a user needs to source it into the current shell. This can be done like this:
```bash
source /opt/conda/etc/profile.d/conda.sh
```
Afterwards conda environments can be created in a license-compliant way as documented at https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html
## Usage
More details on the usage of the `conda` command can be found here:
https://docs.conda.io/projects/conda/en/stable/user-guide/index.html
### Important Info
It seems that the `conda list <package>` command still uses the default channel. However, `conda install` seems to respect the config above.
Probably related to this here:
https://stackoverflow.com/questions/67695893/how-do-i-completely-purge-and-disable-the-default-channel-in-anaconda-and-switch

# Flatpak Repositories/Remotes
## Flatpak Repository List
All flatpak remotes to be available on the system are listed in Hiera in the list `flatpak::repos::default`.
You can add remotes other than the default `flathub` by providing a configuration as shown in the next chapter and then referencing it here:
```
flatpak::repos::default:
  - 'cpt'
```
## Flatpak Repository Definition
An individual flatpak remote is configured in Hiera within the namespace `flatpak::repo::*`, like following example:
```
flatpak::repo::flathub: |
[Flatpak Repo]
Title=Flathub
Url=https://flathub.psi.ch/repo/
Homepage=https://flathub.org/
Comment=Central repository of Flatpak applications (PSI CachingMirror)
Description=Central repository of Flatpak applications (PSI CachingMirror)
Icon=https://flathub.psi.ch/repo/logo.svg
GPGKey=mQINBFl...
```
The content is a single string in the [`.flatpakrepo` format](https://docs.flatpak.org/en/latest/flatpak-command-reference.html#flatpakrepo).
Usually the Flatpak repository will provide such a file.

# LabView
We have a site license for LabView, the license server is `lic-ni.psi.ch:27020`.
## Installation of LabView
Select the desired version on the [NI Download Portal](https://www.ni.com/en/support/downloads/software-products/download.labview.html#544131).
It downloads a zip file which you need to extract; then install the rpm file matching your OS, e.g.:
```
sudo dnf install ni-labview-2024-pro-24.3.1.49155-0+f3-rhel9.noarch.rpm
```
This installs the package repository for the given LabView version. Now all that is left is to install it:
```
sudo dnf install ni-labview-2024-pro
```
Note that by default the package repository file will be removed by the next Puppet run.
To avoid this set `rpm_repos::purge: false` in Hiera or check the chapter "Installation via Puppet and Hiera".
## Installation of LabView Drivers
Again, select your desired version on the [NI Download Portal](https://www.ni.com/en/support/downloads/drivers/download.ni-linux-device-drivers.html#544344).
It again downloads a zip file you need to extract; then install the suitable rpm file, e.g.:
```
sudo dnf install ni-rhel9-drivers-2024Q3.rpm
```
This now installs the package repository for the drivers. Next install the needed drivers, e.g.
```
sudo dnf install ni-488.2
```
Now the kernel drivers need to be prepared
```
sudo dkms autoinstall
```
and finally reboot to prepare everything for showtime.
Note that by default the package repository file will be removed by the next Puppet run.
To avoid this set `rpm_repos::purge: false` in Hiera or check the next chapter "Installation via Puppet and Hiera".
## Installation via Puppet and Hiera
The above example can also be configured with Hiera:
```
rpm_repos::repo::labview:
name: 'labview'
descr: "NI LabVIEW 2024 Q3 pro"
baseurl: 'https://download.ni.com/ni-linux-desktop/LabVIEW/2024/Q3/f1/pro/rpm/ni-labview-2024/el9'
gpgkey: 'https://download.ni.com/ni-linux-desktop/stream/ni-linux-desktop-2019.pub'
disable: false
gpgcheck: false
repo_gpgcheck: true
rpm_repos::repo::labview_drivers:
name: 'labview_drivers'
descr: "NI Linux Software 2024 Q3"
baseurl: 'https://download.ni.com/ni-linux-desktop/2024/Q3/rpm/ni/el9'
gpgkey: 'https://download.ni.com/ni-linux-desktop/stream/ni-linux-desktop-2019.pub'
disable: false
gpgcheck: false
repo_gpgcheck: true
rpm_repos::default:
- 'labview'
- 'labview_drivers'
base::pkg_group::labview:
- 'ni-labview-2024-pro'
- 'ni-488.2'
base::package_groups:
- 'labview'
```
The main difficulty is figuring out the repository URLs, which is most easily done by manually downloading, unpacking and installing the rpm file, or by inspecting its content.
But with this Hiera configuration the setup can easily be replicated to other machines, or the current setup can be recreated.
What still needs to be done on the first installation and cannot be automated with Puppet is preparing the kernel drivers
```
sudo dkms autoinstall
```
and the subsequent reboot.

# Selecting Package Repositories
## Package Repository Lists
Also for configuring package repositories our configuration management works with lists containing the names of the repositories to be installed.
The default list (except for nodes with the `bootpc` and `appliances::lenovo::*` Puppet roles) is `rpm_repos::default`.
If repositories are managed in Hiera, feel free to add them to `rpm_repos::default` like
```
rpm_repos::default:
- 'gfa'
```
Note that repositories for different versions of RHEL can be added and only the fitting ones will be configured on the node.
If the package repositories are managed by a Puppet module, then it is good practice to define a specific package repository list in [`common.yaml`](https://git.psi.ch/linux-infra/puppet/-/blob/preprod/data/common.yaml) and to install it only when needed. An example is `profile::telegraf`, which only installs the repositories listed in `rpm_repos::influx` when needed.
## Package Repository Definition
An individual package repository is configured in Hiera within the namespace `rpm_repos::repo::*`, like following example:
```
rpm_repos::repo::epel_rhel8:
name: 'epel'
descr: "Extra Packages for Enterprise Linux 8"
baseurl: 'https://repos.psi.ch/rhel8/tags/$pli_repo_tag/epel/'
gpgkey: 'https://repos.psi.ch/rhel8/keys/epel.gpg'
disable: false
gpgcheck: true
osversion: 8
exclude:
- "slurm*"
```
### Package Repository Name
The reference name used in Hiera (the part after `rpm_repos::repo::`) should be globally unique. An unfortunate practice is to use the same name for different package repositories; a current example is the `gfa` repository, which has different URLs in different `sysdb` environments.
Note that the `name` attribute only has to be unique on the machine where the repository is installed. So if two repositories are defined to provide the same software for two different OS versions, it is fine for them to share the same name.
### Package Repository URL
Overriding the URL of a package repository definition at a stricter scope is considered bad practice. The URL defines the actual "identity" of the package repository definition, and it is confusing if it gets different meanings at different places. It is like one passport identifying different persons in different countries.
If different sources are needed, define and name them appropriately. They point to one given repository and the package repository lists are the place to select what should be applied on a given node.
Also feel free to define all your package repositories in [`common.yaml`](https://git.psi.ch/linux-infra/puppet/-/blob/preprod/data/common.yaml).
### Select Package Repository by OS Version
Usually a package repository only serves packages for one major OS version. This can be stated by the `osversion` attribute. When a package repository list is installed, only the repositories fitting the version of the OS installed on the node are selected and configured.
If the `osversion` attribute is not set, then it is always installed.
### Package Repository GPG Verification
GPG verification is optional, so `gpgkey` may not be defined and `gpgcheck` is `false` by default. But ideally the packages are signed and checked for tampering and corruption.
### Exclude Packages
If certain packages provided by given repository should be ignored on the nodes, then add them to the `exclude` list.
## Using Specific Package Repository Snapshot
Most of the externally sourced package repositories on https://repos.psi.ch/rhel7 (RHEL7), https://repos.psi.ch/rhel8 (RHEL 8) and https://repos.psi.ch/rhel9 (RHEL 9) have snapshots which can be used to freeze the available package versions to a given date.
The tags differ per major OS version and are defined in the Hiera hash `rpm_repos::tag`; below you see the defaults:
```
yum_client::repo_tag: 'prod'
rpm_repos::tag:
redhat7: "%{lookup('yum_client::repo_tag')}"
redhat8: 'rhel-8'
redhat9: 'rhel-9'
```
So for RHEL 7 the default is `prod` and can be overridden via `yum_client::repo_tag` (backwards compatibility) or via the `redhat7` attribute of `rpm_repos::tag`.
To fix to a specific snapshot on RHEL 8, the `redhat8` attribute has to be set on `rpm_repos::tag`, the default is `rhel-8` which points to the latest snapshot.
The available tags you find at
- [https://repos.psi.ch/rhel9/tags/](https://repos.psi.ch/rhel9/tags/) for RHEL 9
- [https://repos.psi.ch/rhel8/tags/](https://repos.psi.ch/rhel8/tags/) for RHEL 8 (note the `prod` tag will phase out)
- [https://repos.psi.ch/rhel7/tags/](https://repos.psi.ch/rhel7/tags/) for RHEL 7
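As a sketch, freezing a RHEL 8 node to a specific snapshot could look like this (the tag name is a placeholder; pick an existing one from the tag listing above):

```
rpm_repos::tag:
  redhat8: 'rhel-8-2024-06-01'
```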
## Package Repositories made Available by the Linux Group
Available on all systems are:
- RedHat [BaseOS](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/package_manifest/baseos-repository), [AppStream](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/package_manifest/appstream-repository) and [CodeReady](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/package_manifest/codereadylinuxbuilder-repository) repositories
- [Extra Packages for Enterprise Linux (EPEL) repositories](https://docs.fedoraproject.org/en-US/epel/)
- Puppet 7 repository
- Auristor repository for YFS and AFS related packages (RHEL 7 and 8 only)
- Google Chrome repository
- pli-misc (not tagged for RHEL7, but on RHEL 8/9)
- Code (Visual Studio Code from Microsoft)
- Microsoft Teams
- PowerShell et al. (Microsoft)
- HashiCorp (`vault`, `terraform`, `vagrant`, ...)
- Oracle Instant Client 19 and 21
- Opera
Predefined and used when needed are:
- Influx (`influxdb`, `telegraf`, ...)
- CUDA
- Nomachine
To be added/defined in [`common.yaml`](https://git.psi.ch/linux-infra/puppet/-/blob/preprod/data/common.yaml)?
- GPFS
- Epics (available for RHEL7)
### pli-misc Repository
A small list of packages managed by the Linux Team.
- *RHEL8*: make v4.3 from [CentOS](https://rpmfind.net/linux/RPM/centos-stream/9/baseos/x86_64/make-4.3-7.el9.x86_64.html), as v4.2.1 has been reported to cause trouble
- latest [Zoom client](https://zoom.us/download?os=linux)
- latest [Webex client](https://www.webex.com/downloads.html)
- latest [Slack client](https://slack.com/downloads/linux)
- latest [NoMachine Enterprise Client](https://downloads.nomachine.com/download/?id=11)
- latest [Real VNC Viewer](https://www.realvnc.com/en/connect/download/viewer/), recommended for VNC remote access to Windows machines
- `pli-assets` containing the PSI and the Customer Self Service logo, any hints about the source rpm are welcome
- *RHEL8*: [mod_gearman v4.0.1](https://mod-gearman.org/download/v4.0.1/rhel8/x86_64/)
- *RHEL8*: lightdm-gtk v2.0.8-3.pli, a patched lightdm-gtk-greeter ([SRPM](https://git.psi.ch/linux-infra/lightdm-gtk-rpm), [PR](https://github.com/Xubuntu/lightdm-gtk-greeter/pull/121)) which allows to limit the presented keyboard layouts
- Code Beamer Office plugin v9.5.0, managed by Gilles Martin
- storecli 007.2007.0000.0000 managed by Marc Caubet Serrabou
- [pam_single_kcm_cache PAM Module](https://github.com/paulscherrerinstitute/pam_single_kcm_cache) managed by Konrad Bucheli
- [nvidia-detect](http://elrepo.org/tiki/nvidia-detect) copied over from ElRepo to make it generally available
- [bob](https://git.psi.ch/linux-infra/bob)
## Package Repositories made Available by other PSI Groups
- `tivoli`, IBM backup software for Arema, managed by Datacenter and DB Services, AIT
- `nxserver` for NoMachine NX

# Automated Package Updates
The automatic updates are controlled in Hiera (excluding RHEL7):
| Hiera key | default | comments |
|-------------------------------------|------------|-------------------------------------------------------------------------------|
| `base::automatic_updates::interval` | `weekly` | valid are `daily`, `weekly`, `boot_only` and `never` which disables the automatic updates |
| `base::automatic_updates::type` | `security` | `security` installs only security updates whereas `all` installs all updates |
| `base::automatic_updates::reboot` | `never` | valid are `never`, `when-needed` (when an updated package requests a reboot) and `when-changed` (after every update) |
| `base::automatic_updates::exclude` | `[]` | list of packages not to update, wildcards like "*" are allowed |
| `base::automatic_updates::kernel` | `false` | define if new kernel packages also should be installed automatically |
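Combining the keys above, a node that installs all updates daily and reboots when a package requests it could be sketched as (the excluded package is just an example):

```
base::automatic_updates::interval: 'daily'
base::automatic_updates::type: 'all'
base::automatic_updates::reboot: 'when-needed'
base::automatic_updates::exclude:
  - 'mariadb*'
```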
For system-wide installed Flatpak packages there is a separate key for automatically updating them:
| Hiera key | default | comments |
|------------------------------|------------|-------------------------------------------------------------------------------|
| `flatpak::updates::interval` | `weekly` | valid are `daily`, `weekly`, `boot_only` and `never` which disables the automatic updates |
Note that the updates run at midnight, for `weekly` in the night from Sunday to Monday. By default there is no automatic reboot, e.g. for kernel updates.
---
**Important**
There will be no updates if you fix the package source to a snapshot/repo tag, i.e. if the `rpm_repos::tag` or `yum_client::repo_tag` setting in Hiera points to a specific snapshot.

# Package Installation
## Install Packages with Hiera Package Groups
The packages automatically installed onto a system by Puppet are managed in the Hiera list `base::package_groups`. It contains the names of the package groups to be installed. Items can be added at all levels of the Hiera hierarchy and are merged.
The package groups themselves are Hiera lists named `base::pkg_group::$USE_CASE`.
Each of them lists the packages to install.
Currently there exist the following package groups in the main [`common.yaml`](https://git.psi.ch/linux-infra/puppet/-/blob/preprod/data/common.yaml):
- `base::pkg_group::system_tools` (installed by default)
- `base::pkg_group::daq_buffer`
- `base::pkg_group::desktop_settings`
- `base::pkg_group::dev`
- `base::pkg_group::login_server`
- `base::pkg_group::qt5`
- `base::pkg_group::root`
but further ones can be created at lower levels of the Hiera hierarchy and added to `base::package_groups`, for example:
```
base::pkg_group::java:
- 'java-1.8.0-openjdk'
- 'java-11-openjdk'
- 'java-17-openjdk'
base::package_groups:
- 'java'
```
## Install a Group of Packages
To add a RedHat-predefined group of packages (check out `dnf grouplist --hidden`), prefix its name with `@`, e.g. for "Java Platform" it would be `@Java Platform`:
```
base::pkg_group::java:
- '@Java Platform'
```
## Install Latest Package Version
By default Puppet only checks whether a package is installed and installs it if missing.
To ensure that the latest available package version is always installed, append the `:latest` tag to the package name in the package group:
```
base::pkg_group::java:
- 'java-1.8.0-openjdk'
- 'java-11-openjdk'
- 'java-17-openjdk:latest'
```
## Install Packages only on Given OS Version
Certain packages are only needed on a given OS version. The `os=` tag with the OS name and major version selects a package only for that OS, whereas `os!` excludes the package on hosts with that OS, so it is not installed there.
```
base::pkg_group::java:
- 'java-1.8.0-openjdk:os=redhat7'
- 'java-11-openjdk'
- 'java-17-openjdk:os!redhat7'
```
Note that this tag can be combined with the `latest` and `absent` tags.
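Assuming the tags can be appended in sequence as described (the exact ordering shown here is an assumption), a combined example could look like this:

```yaml
base::pkg_group::java:
  - 'java-17-openjdk:latest:os=redhat9'  # assumed combined-tag ordering
```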
## Install Module Stream
RHEL 8 introduced the concept of [module streams](https://docs.pagure.org/modularity/).
A specific stream can be selected with the `stream` tag:
```yaml
base::pkg_group::nodejs:
- 'nodejs:stream=12'
```
## Install Python Modules with `pip`
The `pip` tag can be used to install a PYPI Python package using `pip`, e.g. `pygame-utility`:
```yaml
base::pkg_group::my_pip_modules:
- 'pygame-utility:pip'
```
To install a package for a specific Python version use the tag `:pip<version>` (example: `- 'numpy:pip3.12'`).
Note that packages installed with `pip` are not updated automatically!
## Install Software using Flatpak
Flatpak is available by default on workstation-type systems. Alternatively you may enable/disable Flatpak support in Hiera by setting `base::enable_flatpak` to `true` or `false`.
The `flatpak` tag can be used to install software via Flatpak. It is best to use the full Flatpak application ID, like `dev.zed.Zed`:
```yaml
base::pkg_group::my_flatpak_software:
- 'dev.zed.Zed:flatpak'
```
On [Flathub](https://flathub.org/) you can find the application ID by pressing the down arrow next to the `Install` button:
![Flathub install commands](packages/zed_on_flathub.png)
Note that Flatpak software is updated automatically by default, see the [guide "Automated Package Updates"](package_updates) for details.
The Flathub package "remote" is already installed, to add other "remotes" check out the [Configure Flatpak Remotes](flatpak_remotes) guide.
## Remove Packages
To remove an already installed package, append the `:absent` tag to the package name in the package group:
```
base::pkg_group::java:
- 'java-1.8.0-openjdk:absent'
- 'java-11-openjdk'
- 'java-17-openjdk'
```
## Ignore Packages
To make packages unavailable for installation, even though provided by the package repositories, add them in Hiera to the list `base::package_exclude`:
```
base::package_exclude:
- 'epics-base-7.0.6*'
```
This list is merged over the full Hiera hierarchy, so there is no need to copy exclusions from higher levels when creating an exclusion on a lower level.
This list can also be used to opt out of packages from other, possibly inherited package groups. But unlike the `:absent` tag in a package list, it will not uninstall a package that is already installed.
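For example, to opt out of a package that an inherited package group would otherwise install (the package name here is hypothetical):

```yaml
base::package_exclude:
  - 'netdata'
```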
### Version Lock
If you need to freeze a package at a given version, you can use `dnf versionlock`. First you need to install the `python3-dnf-plugin-versionlock` package (see above).
Then, on the node, run
```
dnf versionlock add $PACKAGE
```
for every package that should not receive updates any more. If newer packages are available, you can test with `dnf update --assumeno` that your software no longer shows up while other updates are still possible. Sometimes this can cause dependency resolution failures and you might need to add a version lock for one or more packages which depend on the locked package.
Afterwards, it is best to put the resulting `/etc/dnf/plugins/versionlock.list` in Hiera with [`files::files`](../files/distribute_files) for reproducibility.
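A sketch of distributing the lock list with `files::files` (the exact parameters of `files::files` are described in the linked guide; the key name and content here are hypothetical examples):

```yaml
files::files:
  '/etc/dnf/plugins/versionlock.list':
    content: |
      kernel-core-0:5.14.0-427.13.1.el9_4.*
```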
#### Kernel Version Lock
A full kernel version lock needs to include a number of packages. For RHEL8:
```
dnf versionlock add kernel kernel-core kernel-modules kernel-tools kernel-tools-libs kernel-headers kernel-devel
```
For RHEL9:
```
dnf versionlock add kernel kernel-core kernel-modules-core kernel-modules kernel-uki-virt kernel-tools kernel-tools-libs kernel-headers kernel-devel
```
and if AFS is configured
```
dnf versionlock add kmod-yfs
```
### Install Debuginfo Packages
The package repositories for debuginfo packages are disabled by default. To spontaneously install such a package, do
```
dnf --enablerepo '*_debug' install ...
```
## Missing Package
If a package does not exist in the repositories, Puppet fails with an error like:
```
Error: Execution of '/usr/bin/dnf -d 0 -e 1 -y install non-existing-package-for-test' returned 1: Error: Unable to find a match: non-existing-package-for-test
```
# Python
## Overview
```{note}
This guide only covers RHEL8/9 and newer. It also does __not__ cover Python 2 any more.
```
There are several versions of Python(3) available from RedHat. This guide shows how one can install specific/multiple versions and how to configure the default Python of a system.
The use of Python environments like [`venv`](https://docs.python.org/3/library/venv.html) or [Conda](https://docs.conda.io) is recommended if a single user needs one or multiple specific Python environments. But this is not part of this guide.
## Platform Python vs Default Python
As many system tools are written in Python, each RHEL8/9 system comes with a so-called Platform Python. This is a _fixed_ Python version that cannot be modified and usually is not available via the command line.
Besides that, you can install multiple Python versions on each system. One of these versions can then be set as the Default Python, i.e. the version that is invoked when you call `python` and/or `python3` on the command line or in a script.
To explicitly call a specific Python version always specify the full version as follows: `python<version>`, e.g. `python3.12`. The same applies to the `pip` command!
### Using the Platform Python
For system tools that should run consistently on all systems, you may decide to use the Platform Python instead of the Default Python. To do so, set the shebang of your script to
```bash
#!/usr/libexec/platform-python
```
## Python Versions
On __RHEL8__ the Platform (and usual Default) Python version is __3.6__. At the time of writing, the following Python versions are additionally available: _3.8_, _3.9_, _3.11_, _3.12_.
On __RHEL9__ the Platform (and usual Default) Python version is __3.9__. At the time of writing, the following Python versions are additionally available: _3.11_, _3.12_.
## Configuration
The following Hiera keys can be used to configure Python on a system.
### Installing and Setting a Default Python
To install and set a Default Python you can use:
```yaml
python::default_version: '3.11'
```
This makes sure that this Python version is installed on the system and available via the `python` and `python3` commands. The same applies to `pip` and `pip3`.
### Installing Additional Python Versions
To install additional Python versions on the system you can use:
```yaml
python::install_versions: ['3.6', '3.12']
```
This ensures that the necessary packages are installed on the system. These versions are then available via the version-specific command `python<version>` (e.g. `python3.12`).
### Installing PYPI Packages
You can install PYPI packages for the individual Python versions the same way as installing RPMS on a system (i.e. via the Puppet package module).
This can be done by adding a label after the PYPI package name:
```yaml
base::pkg_group::python_packages:
- 'numpy:pip' # installs package for the systems default python
- 'numpy:pip3.12' # installs numpy via PYPI for `python3.12`
base::package_groups:
- 'python_packages'
```
__IMPORTANT:__ Note that packages installed with `pip` are not updated automatically!
For additional information on how to install packages (including `pip`) please check out [this guide](packages).
```{note}
The packages prefixed with `python3-` are for the Default Python version. For newer versions RedHat uses a versioned prefix like `python38-` or `python3.11-`. Not every library is packaged for every available Python version; in that case you might need to install it with `pip`.
```
# SELinux Configuration
SELinux can be configured in Hiera.
For troubleshooting SELinux-related problems please have a look at the [SELinux Troubleshooting Guide](../../troubleshooting/selinux).
## Basic Settings
Set the SELinux mode with `base::selinux_mode`. Options:
* `enforcing`
* `permissive`
* `disabled`
Example:
```yaml
base::selinux_mode: 'disabled'
```
The default depends on the Puppet role, e.g. for servers it is `enforcing` while for workstations and consoles it is `disabled`.
The `permissive` option is useful for setting up a new server to see where SELinux would block if enabled.
## Logging Violations
To record such violations `auditd` needs to run:
```yaml
base::enable_auditd: true
```
On RHEL9 and later this is enabled by default if SELinux is `permissive` or `enforcing`.
Then `setroubleshootd` is very helpful to learn how to configure SELinux if an action is wrongly considered a violation:
```yaml
selinux::setroubleshootd: true
```
On RHEL9 and later this is enabled by default if SELinux is `permissive` or `enforcing`.
## Finetuning
### SELinux Booleans
Use NFS home directory:
```yaml
selinux::use_nfs_home_dirs: true
```
Set SELinux booleans:
```yaml
selinux::booleans: [ 'httpd_can_network_connect', 'domain_can_mmap_files']
```
### File Context (`fcontext`)
Set the fcontext for specific directories:
```yaml
selinux::fcontext:
logbook-data:
pathspec: '/var/www/html/logbook-data(/.*)?'
seltype: 'httpd_sys_rw_content_t'
logbook-data-local:
pathspec: '/var/www/html/logbook-data-local(/.*)?'
seltype: 'httpd_sys_rw_content_t'
```
A unique, arbitrary key name is needed for each entry.
If you wish to apply the same fcontext configuration as another path, use:
```yaml
selinux::fcontext::equivalence:
apache_ssl_conf:
path: '/srv/online/config/ssl.conf'
target: '/etc/httpd/conf/httpd.conf'
apache_index_html:
path: '/srv/online/config/index.html'
target: '/var/www/html/index.html'
apache_online_web:
path: '/srv/online/web'
target: '/var/www/html'
apache_offlinecheck:
path: '/srv/offlinecheck'
target: '/var/www/html'
```
A unique, arbitrary key name is needed here as well.
### Custom Module
Custom SELinux modules can also be added.
Such a module can be created from recorded violations with
```
ausearch --raw | audit2allow -r -m $CUSTOM_SELINUX_MODULE_NAME
```
Note that the `setroubleshootd` log output usually gives you a narrower search filter for `ausearch` for each recorded violation.
Each such module needs to be added with a unique key at the Hiera key `selinux::modules::te`. A full example is
```yaml
selinux::modules::te:
# SELinux is preventing /usr/local/bin/musrview from setattr access on the directory /usr/lib/fontconfig/cache
'musrview-font-cache': |
module musrview-font-cache 1.0;
require {
type lib_t;
type httpd_sys_script_t;
class dir setattr;
}
allow httpd_sys_script_t lib_t:dir setattr;
```
Do not forget to increase the version number if you update such a module.
# Managing Services with Systemd
Hiera can also be used to manage services and to automate recurring tasks with timers.
## Enabling/Starting a Service
If the software already comes with a systemd unit file, then it is sufficient to just enable it in Hiera by using the `base::services` key:
```
base::services:
netdata:
enable: true
```
The key inside is the `systemd` service name without the `.service` suffix.
## Disabling/Stopping a Service
To stop and disable an already running service, disable it in the `base::services` Hiera key with `enable: false`:
```
base::services:
netdata:
enable: false
```
## Manage a Socket Activated Service
For socket-activated services use `base::sockets` in Hiera instead:
```
base::sockets:
cockpit:
enable: true
```
## Manage Services with Custom Unit Files
It is also possible to provide a full systemd unit file if there is none already. For this, define the different sections and their content with subkeys below the `options` key, as in the example below:
```
# The following service stops users from accessing the node
# before the home directory is mounted
base::services:
'wait_for_home':
enable: true
options:
Unit:
Before: 'systemd-user-sessions.service'
Install:
WantedBy: 'multi-user.target'
RequiredBy: 'multi-user.target'
Service:
Type: 'oneshot'
ExecStart: '/opt/pli/libexec/waitformount -m /das/home'
RemainAfterExit: 'true'
```
If you need to set multiple values, then put the values into a list:
```
Service:
Environment:
- "FOO=bar"
- "BIZ=buz"
```
## Enhance a Service with a Dropin Unit File
It is possible to fine-tune already existing `systemd` unit files with dropins. These are placed as `.conf` files in `/etc/systemd/system/$SERVICE.service.d/`.
With the `dropin: true` setting, the content of the `options` parameter is written into the corresponding dropin directory:
```
base::services:
'name_of_enhanced_service':
enable: true
dropin: true
options:
...
```
Often this is done to start the service with different options; in that case you need to reset the original value with an empty entry:
```
base::services:
'name_of_enhanced_service':
enable: true
dropin: true
options:
Service:
ExecStart:
- ''
- '/usr/sbin/my_service --verbose'
```
If there are multiple dropins, you might also name them individually with the `dropin_name` parameter.
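A sketch with an individually named dropin (the service name and option values here are hypothetical):

```yaml
base::services:
  'my_service':
    enable: true
    dropin: true
    dropin_name: 'resource_limits'
    options:
      Service:
        MemoryMax: '2G'
```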
## Run Command on Startup
If a command should run only once at boot time you may create a `oneshot` service with `RemainAfterExit` and have one or more commands ready in `ExecStart`:
```
base::services:
tuned_setup:
enable: true
options:
Unit:
After: 'tuned.service'
Install:
WantedBy: 'multi-user.target'
Service:
Type: 'oneshot'
ExecStart:
- '/usr/sbin/tuned-adm active'
- '/usr/sbin/tuned-adm profile virtual-host'
RemainAfterExit: 'true'
```
# Systemd Timers for Regular Tasks
To have custom executables run regularly at a given time/interval, you may use the `base::timers` Hiera key:
```
base::timers:
'timer_test':
description: 'test timers'
command: '/usr/bin/logger foo'
on_calendar: '*:*:10'
timer_options:
persistence: false
```
For each timer the following keys are mandatory:
- `description` for a short explanation of what the timer is about
- `command` for the command to run
- `on_calendar` defining when it should run using the [`systemd` calendar event format](https://www.freedesktop.org/software/systemd/man/systemd.time.html#Calendar%20Events), (alternatively see also chapter "CALENDAR EVENTS" of `man systemd.date`)
Optional:
- `timer_options` for additional options. In the example it is `persistence`, which signals whether the timer should run immediately after boot if the node was switched off at the last scheduled run time (default is `false`).
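As a further sketch, a nightly job using the calendar format (the command and schedule here are hypothetical):

```yaml
base::timers:
  'scratch_cleanup':
    description: 'remove stale files from /scratch'
    command: '/usr/bin/find /scratch -mtime +30 -delete'
    on_calendar: '*-*-* 02:00:00'
    timer_options:
      persistence: true
```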
# Container
```{warning}
Although we enable you to use both container runtimes, Podman and Docker, it is important to note that only Podman is supported by RedHat. So if you have a critical application and need/want to rely on RedHat third-level support, you have to use Podman!
Also be aware that _compose_ files and _commands_ can vary between docker-ce and podman!
```
## Docker
Docker-CE always has the latest features of the Docker engine and Docker Compose.
The Docker repo is enabled by default and the packages can be installed using the following code in Hiera:
```
base::pkg_group::extra:
- 'docker-ce'
```
## Podman
The Podman engine runs "rootless" without any further configuration and is covered by RedHat Enterprise Support.
The Hiera configuration to install Podman would look like this:
```
base::pkg_group::extra:
- 'docker'
```
This will install podman from the appstream repository.
### Subuids and Subgids
To be able to run rootless containers with Podman, you need to define a subuid/subgid range in `/etc/subuid` and `/etc/subgid` for each user who should be able to launch containers with `podman`.
To keep these IDs consistent PSI-wide, there is a small central database/API to register and look up such IDs.
The API endpoint is `https://sysdb.psi.ch/subid/v1/config` and allows for one or more `user` parameters. Valid are numeric user ids (uid) or any username listed in our AD. It will then return the line(s) you need to add to `/etc/subuid` and `/etc/subgid`.
Note that the ID is then reserved for 2 years. Every lookup via the API renews the reservation. After it times out, the ID range is freed and may be reused by someone else.
**Examples:**
This will get the ID range (the same for both subuid and subgid) for your user.
```
curl "https://sysdb.psi.ch/subid/v1/config?user=$USER"
```
And for several users:
```
USER1=...
USER2=...
USER3=...
curl "https://sysdb.psi.ch/subid/v1/config?user=$USER1&user=$USER2&user=$USER3"
```
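The API returns ready-made lines for `/etc/subuid` and `/etc/subgid`. As a sketch of the line format (`user:range_start:range_count`, with a hypothetical range start — the real one is assigned by the sysdb API):

```shell
# hypothetical values - the real range start comes from the sysdb API
uid=12345
range_start=1310720
range_count=65536
# a line of this shape goes into /etc/subuid and /etc/subgid
line="${uid}:${range_start}:${range_count}"
echo "$line"
```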
**Future Work**:
- Puppet integration
- automatic refresh on login
# Custom RPM Repositories
It is possible to manage your own RPM repositories which are accessible inside PSI.
If you need one please contact the Linux Core team (linux-eng@psi.ch).
To set up your custom repository/repositories we need the following information:
- Name of the repository
- Do you want to have your repository available for __all__ RHEL major versions (i.e. all RHEL major versions will see the same packages) or do you need a repository for each RHEL major version (i.e. you can have different packages for each RHEL major version)?
## Usage
On `lxsup.psi.ch` you can easily access this data at `/packages_misc`.
The backend for these custom repos is an NFS4 share. It can be mounted from __lx-fs:/packages_misc__.
On __Linux__ systems you should be able to mount the share like this
```bash
mount -t nfs4 -o sec=krb5 lx-fs.psi.ch:/packages_misc /mnt
```
On __MacOS__ you could mount and access the share like this:
```bash
mount_nfs -o sec=krb5,nfsvers=4 lx-fs.psi.ch:/packages_misc ~/some_directory
```
(On __Windows__ >= 10 it should also be possible to mount the NFS4 share, see https://learn.microsoft.com/en-us/windows-server/administration/windows-commands/mount - details to be added here.)
Once your folders are created and you are authorized to write to them, you can place RPMs and files there.
__Important__: Once mounted, you need a valid Kerberos ticket for your PSI user to be able to change things on the filesystem.
```
kinit <username>@D.PSI.CH
# or on standard PSI linux systems
kinit
```
```{note}
In case you are in a firewall protected zone, make sure that there is a rule to access lx-fs on TCP port 2049!
```
## Overview
Custom repositories are provided by the central repository server as all other repositories. They will be available on the URL
```
https://repos.psi.ch/<os>/latest/<repository_name>
```
The custom repositories are snapshotted and tagged as any other repository.
Inside __hiera__ please use the URL:
```
https://$lx_reposerver/<os>/$lx_repo_tag/<repository_name>
```
This ensures that the systems always use the correct repository server URL to access the repositories (e.g. in the DMZ/extranet the DNS hostname of the repository server is different).
The content of the custom repositories is managed on a dedicated NFS4 share. On this share, groups have their own folders where they place/manage their RPMs. These RPMs are periodically synced to the main repository share (every 15 minutes), so what you see on `packages_misc` is not necessarily what is currently on the repository server. Upon sync the repository metadata is automatically rebuilt via the `createrepo` command.
The structure of the share is usually as follows:
```
<base>/<repository name>          # content directly here, in the case of one repo for all RHEL major versions
<base>/<repository name>/rhel7
<base>/<repository name>/rhel8    # in the case of one repo per RHEL major version
<base>/<repository name>/rhel9
```
The access control is done via a Unix group, usually named `unx-<something>_adm` (most of the time this is a group that also gives access to Hiera repos).
## Preparation for new Custom RPM Repository (by Linux Team)
The `packages_misc` share is only root-writable from lx-sync-01.psi.ch. Therefore the initial setup needs to be done there.
Creation of a new repo (on lx-sync-01 in `/packages_misc`):
```
cd /packages_misc
mkdir <reponame>
chgrp 35588 <reponame> ## We need to use the numeric group id here as lx-sync is not directly connected to AD
chmod g+w <reponame>
chmod g+s <reponame>
```
To sync the repo and make it available there needs to be a sync config added to https://git.psi.ch/linux-infra/rpm-repo-utils/-/tree/main/etc?ref_type=heads
(inside the __*-misc__ directories)
After adding this config the changes need to be deployed on lx-sync-01.psi.ch.
(either bootstrap/ansible or a manual `git pull` in `/opt/rpm-repo-utils/`)
# Deployment
```{tableofcontents}
```
# Basic Installation
Linux systems are installed using PXE and Kickstart. The Kickstart
configuration is auto-generated based on the configuration stored in sysdb/bob.
```{note}
When PXE boot is not an option, e.g. in restricted networks, it is possible to start iPXE from a USB stick or other media.
```
The general process for an installation is:
1. (Hardware) Register the machine in Package-Fence: https://enter.psi.ch:1443 to give it access to the PSI network
1. (Hardware machine / static IP address) Register a static IP in https://qip.psi.ch/qip (or via a request to the Network Team via ServiceNow) for the system
1. Register and configure the system with sysdb
2. Tell sysdb to perform an installation on the next boot
3. Reboot the system and trigger a PXE boot (usually by pressing F12 during
POST)
The default way to interact with sysdb is to use [bob](https://git.psi.ch/linux-infra/bob). `bob` is already set up on `lxsup.psi.ch` for general use. Remember that you need a valid Kerberos ticket before modifying a sysdb entry via `bob`.
Alternatively you may have a local copy of `bob` on your workstation. This can be done by installing the RPM from the pli-misc repository (https://repos.psi.ch/rhel8/latest/pli-misc/) or by installing the Python package manually. More details regarding `bob` can be found [here](https://git.psi.ch/linux-infra/bob).
## Hardware/VM Requirements
For hardware-based systems please check the hardware compatibility page for:
- [RHEL8](rhel8/hardware_compatibility.md)
- [RHEL9](rhel9/hardware_compatibility.md)
In any case these are the **minimal** system requirements:
- RHEL8
- RAM: 4GB
- Harddisk: 33GB
- RHEL9
- RAM: 4GB
- Harddisk: 64GB
## Sysdb Configuration
Register node:
```bash
bob node add $FQDN $ENV netboot
```
To be able to PXE boot we need to configure at least one MAC address for the new node:
```bash
FQDN=test.psi.ch
MAC_ADDRESS=00:00:00:00:00:00 # get this from the hardware or vcenter console
bob node add-mac $FQDN $MAC_ADDRESS
```
Finally we need to configure the installer to use, and the Puppet-related parameters:
```bash
bob node set-attr $FQDN ipxe_installer=rhel8install
bob node set-attr $FQDN puppet_role=role::server
```
and **optional**:
```bash
# static IP address (options: static, dhcp)
bob node set-attr $FQDN network=static
# if you want to use hiera groups and sub-groups
bob node set-attr $FQDN puppet_group=cluster
bob node set-attr $FQDN puppet_subgroup=compute
# use a different puppet environment
bob node set-attr $FQDN puppet_env=prod
```
### Example
Minimal example:
```bash
bob node add test.psi.ch lx netboot
bob node add-mac test.psi.ch 00:00:00:00:00:00
bob node set-attr test.psi.ch ipxe_installer=rhel8install puppet_role=role::server
# show the configuration
bob node list -v test.psi.ch
# start network boot on the machine
```
### Special Settings
#### Custom Kernel Commandline Arguments
For custom kernel commandline arguments for the installer (e.g. to provide drivers) the sysdb attribute `kernel_cmdline` can be used:
```bash
bob node set-attr lx-test-02.psi.ch kernel_cmdline=inst.dd=https://linuxsoft.cern.ch/elrepo/dud/el8/x86_64/dd-megaraid_sas-07.725.01.00-1.el8_9.elrepo.iso
```
#### Custom/Fixed System Disk
By default the whole space available on the first block device is used and any existing partition is removed.
Alternatively you might set the sysdb attribute `system_disk` with the device name of the disk which should be used instead:
```bash
bob node set-attr $FQDN system_disk=md126
```
The ordering of disks (`sda`, `sdb`, ...) might not always be stable.
To explicitly select a disk, you can use one of the links below `/dev/disk`, like
```bash
bob node set-attr $FQDN system_disk=disk/by-path/pci-0000:a1:00.0-ata-1
```
#### Custom Partitioning
System partitions are configured with a standard LVM-based schema, so that they can be changed afterwards if needed.
It is also possible to customize the partitioning by using the `partitions` attribute on sysdb. See https://git.psi.ch/linux-infra/bob for more details.
## BIOS / UEFI Boot
All systems should boot via UEFI! BIOS-based boot should only be used where UEFI is not an option.
### UEFI
__NOTE:__ After the installation the boot order is changed back to local boot! So if you reinstall, make sure that you reset the boot order via the EFI menu or the command line: https://linux.die.net/man/8/efibootmgr
```bash
[root@lx-test-02 ~]# efibootmgr
BootCurrent: 0004
BootOrder: 0004,0002,0000,0001,0003
Boot0000* EFI Virtual disk (0.0)
Boot0001* EFI VMware Virtual SATA CDROM Drive (0.0)
Boot0002* EFI Network
Boot0003* EFI Internal Shell (Unsupported option)
Boot0004* Red Hat Enterprise Linux
[root@lx-test-02 ~]# efibootmgr --bootorder 2,4,0,1,3
```
(there is no need to include the leading zeros)
# Console Installation
## Overview
A console is a multi-user system (ideally running on standard hardware) with a graphical desktop. The individual users do not have admin rights on the system, and all configuration and packages must be deployed by Puppet (ensuring reproducibility and fast re-installation in case of hardware failures, etc.).
Consoles are, for example, used at experimental stations, beamlines, and endstations.
The standard naming of a console is: __&lt;group&gt;-cons-&lt;two digit number&gt;__
For various reasons these systems __must__ have a static IP assigned.
## Installation Workflow
1. Register the machine in Package-Fence: https://enter.psi.ch:1443 to give it access to the PSI network
1. Register a static ip in https://qip.psi.ch/qip (or via request to the Network Team (via ServiceNow)) for the console
2. Create the necessary bob entries for the machine:
```bash
bob node add <your-console.psi.ch> <hiera environment without "data-"> netboot
bob node add-mac <your-console.psi.ch> xx:xx:xx:xx:xx:xx
bob node set-attr <your-console.psi.ch> network=static
bob node set-attr <your-console.psi.ch> ipxe_installer=rhel8install
bob node set-attr <your-console.psi.ch> puppet_role=role::console
bob node set-attr <your-console.psi.ch> puppet_env=prod
bob node set-attr <your-console.psi.ch> puppet_group=default # replace default if needed
# Optional
bob node set-attr <your-console.psi.ch> puppet_subgroup=collector
```
3. Create a host specific file (`<your-console.psi.ch>.yaml`) in the respective hiera repository/directory with the following content:
```yaml
networking::setup: auto_static_ip
```
4. Ensure that the UEFI/BIOS is set to netboot
5. Kickstart the machine
# DMZ Installation
The deployment in the DMZ is basically the same as [internally](basic_installation), but there are a few points to consider:
- a firewall rule for puppet is needed
- the commissioning can only be done in the special DMZ commissioning network
Because of this commissioning network, we suggest that the DMZ VM gets two interfaces for commissioning: a "front-door" in the actual network where it will finally provide its service and a "back-door" in the commissioning network. After successful setup the back-door interface is removed.
## Preparation
- get a static IP address for the "front-door" interface
- For Puppet you need to [order a firewall rule](https://psi.service-now.com/psisp?id=psi_new_sc_cat_item&sys_id=faccb8644fe58f8422b0119f0310c7f7) from your machine to `puppet01.psi.ch` using TCP port 8140.
- (let) the VM be set up with two interfaces, the first one in the final network ("front-door") and the second one attached to `172.23.206.0/24` ("back-door")
- get both MAC addresses
- prepare the node in Sysdb/`bob` with the "back-door" MAC address
- in Hiera the following network configuration is suggested, which keeps the "front-door" interface disabled at the start:
```yaml
networking::setup: managed
networking::connections:
- dmz_network
- commissioning_network
networking::connection::dmz_network:
mac_address: '00:50:56:9d:47:eb'
ipv4_method: 'disabled'
ipv6_method: 'disabled'
networking::connection::commissioning_network:
mac_address: '00:50:56:9d:c7:fe'
ipv4_method: 'auto'
ipv6_method: 'disabled'
```
## Commissioning/Kickstart
- commission/kickstart the node via network boot
- for SSH access, get the assigned IP address from VMware, Puppet facts, or QIP
- at this point Puppet will fail; provide the IP address to your friendly Core Linux Team member to manually finish the first boot
- if the configuration is fully ready, configure the "front-door" interface:
```yaml
networking::setup: managed
networking::connections:
- dmz_network
- commissioning_network
networking::connection::dmz_network:
mac_address: '00:50:56:9d:47:eb'
ipv4_method: 'manual'
ipv4_address: '192.33.120.60/24'
ipv4_gateway: '192.33.120.1'
ipv6_method: 'disabled'
networking::connection::commissioning_network:
mac_address: '00:50:56:9d:c7:fe'
ipv4_method: 'auto'
ipv6_method: 'disabled'
```
## Cleanup
- check if you still have management access (`ssh`) over the front door interface
- remove the configuration of the "back-door" interface:
```yaml
networking::setup: managed
networking::connections:
- dmz_network
networking::connection::dmz_network:
mac_address: '00:50:56:9d:47:eb'
ipv4_method: 'manual'
ipv4_address: '192.33.120.60/24'
ipv4_gateway: '192.33.120.1'
ipv6_method: 'disabled'
```
- remove the "back-door" interface from the VM
# Re-Installation
Basically, a reinstall requires nothing more than the PXE boot, but there are some caveats to consider:
__Netboot__
After the initial installation the boot mode is reset from netboot to local, so the machine will then always boot from the local disk. For a redeployment the netboot needs to be set anew (on UEFI based systems netboot also always needs to be selected in the UEFI menu):
```bash
bob node netboot $FQDN
```
__Puppet Certificates__
The Puppet client certificate is saved on the Puppet server. By default the kickstart script tries to preserve the corresponding certificate on the client. If you do a new install to a blank drive while the Puppet server still has a certificate saved for the host, the client will generate a new certificate but the server will not, so the certificates on the two sides will not match and will never work. In this case both sides need to be cleaned up before a new Puppet run is attempted.
Puppet client certs can be deleted at https://puppet.psi.ch/ and on that page the command to delete the client cert is specified.
To access https://puppet.psi.ch you need to authenticate with your username/password. The server uses an invalid HTTPS certificate that is no longer accepted by modern Safari/Chrome. Use Firefox as a workaround.
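A sketch of the two-sided cleanup, assuming a Puppet 6+ server (the authoritative command is shown on https://puppet.psi.ch/; `myhost.psi.ch` is a placeholder):

```
# On the Puppet server (or via the command shown on https://puppet.psi.ch/):
puppetserver ca clean --certname myhost.psi.ch

# On the client: remove the locally cached certificates, then re-run Puppet
# so a fresh certificate request is generated and signed.
rm -rf /etc/puppetlabs/puppet/ssl
puppet agent -t
```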

@startuml
title
**Desktop Authentication**
Needs a shared credential cache with //systemd --user// as it is used to start some processes
and the TGT needs to be updated on reauthentication when unlocking the screen.
end title
actor user
box gdm
participant gdm
participant libpam
participant "pam_sssd.so" as pam_sssd
participant "pam_systemd.so" as pam_systemd
participant "pam_single_kcm_cache.so" as pam_single_kcm_cache
end box
participant sssd
participant "systemd --user" as systemd
box KCM
participant "sssd-kcm" as sssd_kcm
participant "credential cache KCM:$UID:61555" as default_cache
participant "credential cache KCM:$UID:desktop" as shared_cache
end box
box Gnome
participant "gnome-session-binary" as gnome_session
participant "gnome-shell" as gnome_shell
participant Firefox as firefox
participant "gnome-terminal" as gnome_terminal
end box
box Active Directory
participant KDC as kdc
end box
== authentication ==
user -> gdm : authenticates with password
gdm -> libpam : authenticate user
libpam -> pam_sssd : //pam_sm_setcred()//
pam_sssd -> sssd : authenticate
sssd -> kdc : authenticate and get TGT
sssd -> sssd_kcm : get default cache
sssd -> default_cache : place TGT
libpam -> pam_single_kcm_cache : //pam_sm_setcred()//
pam_single_kcm_cache -> sssd_kcm : iterate all suitable caches to find newest TGT
note right: the default cache may change in between
pam_single_kcm_cache -> default_cache: get TGT
pam_single_kcm_cache -> sssd_kcm : create new shared cache if it does not exist yet
create shared_cache
sssd_kcm -> shared_cache: create
pam_single_kcm_cache -> shared_cache: place newest TGT
pam_single_kcm_cache -> libpam: set //KRB5CCNAME=KCM:$UID:desktop//
gdm -> libpam : setup session
libpam -> pam_systemd : //pam_sm_open_session()//
create systemd
pam_systemd -> systemd: start if not running yet
== starting the desktop ==
create gnome_session
gdm -> gnome_session : start Gnome session
gnome_session -> systemd : start some Gnome services
gnome_session -> gnome_session: start more Gnome services
create gnome_shell
gnome_session -> gnome_shell: start Gnome Shell
== starting programs ==
user -> gnome_shell: open browser
create firefox
gnome_shell -> firefox : start
user -> gnome_shell : open terminal
gnome_shell -> systemd: start gnome-terminal
create gnome_terminal
systemd -> gnome_terminal: start
== screen lock and unlock ==
user -> gnome_shell : lock screen
gnome_shell -> gdm : lock screen
user -> gdm : authenticates with password
gdm -> libpam : authenticate user
libpam -> pam_sssd : //pam_sm_setcred()//
pam_sssd -> sssd : authenticate
sssd -> kdc : authenticate and get TGT
sssd -> sssd_kcm : get default cache
sssd -> default_cache : place TGT
libpam -> pam_single_kcm_cache : //pam_sm_setcred()//
pam_single_kcm_cache -> sssd_kcm : iterate all suitable caches to find newest TGT
note right: the default cache may change in between
pam_single_kcm_cache -> default_cache: get TGT
pam_single_kcm_cache -> sssd_kcm : get shared cache
pam_single_kcm_cache -> shared_cache: place newest TGT
note over gdm : no session setup step
gdm -> gnome_shell : screen unlocked
@enduml

@startuml
title
**SSH with Password Authentication**
Provide every shell session an individual and isolated credential cache in KCM.
end title
hide footbox
actor user
box sshd
participant sshd
participant libpam
participant "pam_sssd.so" as pam_sssd
participant "pam_systemd.so" as pam_systemd
participant "pam_single_kcm_cache.so" as pam_single_kcm_cache
end box
participant sssd
participant "systemd --user" as systemd
box KCM
participant "sssd-kcm" as sssd_kcm
participant "credential cache KCM:$UID:61555" as default_cache
participant "credential cache KCM:$UID:sitmchszro" as random_cache
end box
participant bash
box Active Directory
participant KDC as kdc
end box
user -> sshd : connects using //ssh//\nwith authentication method //password//
sshd -> libpam : authenticate user
libpam -> pam_sssd : //pam_sm_setcred()//
pam_sssd -> sssd : authenticate
sssd -> kdc : authenticate and get TGT
sssd -> sssd_kcm : get default cache
sssd -> default_cache : place TGT
sshd -> libpam : setup session
libpam -> pam_systemd : //pam_sm_open_session()//
create systemd
pam_systemd -> systemd: start if not running yet
libpam -> pam_single_kcm_cache : //pam_sm_open_session()//
pam_single_kcm_cache -> sssd_kcm : iterate all suitable caches to find newest TGT
note right: the default cache may change in between
pam_single_kcm_cache -> default_cache: get TGT
pam_single_kcm_cache -> sssd_kcm : create new random cache
create random_cache
sssd_kcm -> random_cache: create
pam_single_kcm_cache -> random_cache: place newest TGT
pam_single_kcm_cache -> libpam: set //KRB5CCNAME=KCM:$UID:sitmchszro//
create bash
sshd -> bash : start
@enduml

@startuml
title
**SSH with TGT Delegation**
Provide every shell session an individual and isolated credential cache in KCM.
end title
hide footbox
actor user
box sshd
participant sshd
participant libpam
participant "pam_systemd.so" as pam_systemd
participant "pam_single_kcm_cache.so" as pam_single_kcm_cache
end box
participant "systemd --user" as systemd
box KCM
participant "sssd-kcm" as sssd_kcm
participant "credential cache KCM:$UID:61555" as new_cache
participant "credential cache KCM:$UID:sitmchszro" as random_cache
end box
participant bash
user -> sshd : connects using //ssh//\nwith //GSSAPIDelegateCredentials=yes//\nand authentication method //gssapi-with-mic//
note right: authentication is done without libpam
sshd -> sssd_kcm : get new cache
create new_cache
sssd_kcm -> new_cache : create
sshd -> new_cache : place delegated TGT
sshd -> libpam : setup session
libpam -> pam_systemd : //pam_sm_open_session()//
create systemd
pam_systemd -> systemd: start if not running yet
libpam -> pam_single_kcm_cache : //pam_sm_open_session()//
pam_single_kcm_cache -> sssd_kcm : iterate all suitable caches to find newest TGT
note right: the default cache might be KCM:$UID:61555 or not
pam_single_kcm_cache -> new_cache: get TGT
pam_single_kcm_cache -> sssd_kcm : create new random cache
create random_cache
sssd_kcm -> random_cache: create
pam_single_kcm_cache -> random_cache: place newest TGT
pam_single_kcm_cache -> libpam: set //KRB5CCNAME=KCM:$UID:sitmchszro//
create bash
sshd -> bash : start
@enduml

@startuml
title
**Startup of Systemd User Instance**
One single //systemd --user// instance spans from the start of the first session
to the end of the last session and has access to the same credential cache as the desktop.
end title
hide footbox
box Systemd User Instance
participant "systemd --user" as systemd
participant libpam
participant "pam_single_kcm_cache.so" as pam_single_kcm_cache
end box
box KCM
participant "sssd-kcm" as sssd_kcm
participant "credential cache KCM:$UID:61555" as default_cache
participant "credential cache KCM:$UID:desktop" as shared_cache
end box
note over systemd : no authentication step
systemd -> libpam : setup session
libpam -> pam_single_kcm_cache : //pam_sm_open_session()//
pam_single_kcm_cache -> sssd_kcm : iterate all suitable caches to find newest TGT
note right: the default cache may change in between
pam_single_kcm_cache -> default_cache: get TGT
pam_single_kcm_cache -> sssd_kcm : create shared cache if it does not exist yet
create shared_cache
sssd_kcm -> shared_cache: create
pam_single_kcm_cache -> shared_cache: place newest TGT
pam_single_kcm_cache -> libpam: set //KRB5CCNAME=KCM:$UID:desktop//
@enduml

# Desktop on RHEL 8
## Many Servers and Managers
The following software is involved in getting the desktop on Linux up and running.
- **Display Server** paints the image onto the screen
- **Xorg**: good ol' Unix X Server with network redirection
- **Wayland**: new and modern
- **Display Manager** shows up at startup to authenticate the user and then start the desktop session
- **gdm** Gnome Display Manager is default on RHEL 8
- **lightdm** is very flexible, but automatic Gnome screen lock does not work with it; manual locking would be needed (`dm-tool lock`)
- **sddm** the Simple Desktop Display Manager from the KDE world fails due to a kernel bug on RHEL 8.6
- **Greeter**: user interface part of the display manager; e.g. for `lightdm` it is exchangeable
- **Accounts Service** (`accounts-daemon`) used by `gdm` to learn about/store user information (last desktop session, profile image, etc)
- **Session Manager** starts the actual desktop. The installed options are found in `/usr/share/wayland-sessions/` for `Wayland` and `/usr/share/xsessions/` for `Xorg`.
- **gnome-session** normal Gnome starter (for `Xorg` and `Wayland`)
- **gnome-session-custom-session** to select a specific saved Gnome session
- **icewm-session** IceWM starter, `Xorg` only
- **startxfce4** XFCE starter, `Xorg` only
- **startplasma-x11** KDE Plasma starter for `Xorg`
- **startplasma-wayland** KDE Plasma starter for `Wayland`
Out of the box RHEL 8 starts Gnome using `gdm` and `Wayland`. `Xorg` is also supported. Others can be installed from EPEL, but there is no support from Red Hat.
## PSI Specific Desktop Settings
Per default Puppet starts `gdm` which then starts Gnome with `Xorg` using `/usr/share/xsessions/gnome-xorg.desktop`.
Normally the Display Managers offer the user a selection of the available Desktop Sessions (`*.desktop` files in `/usr/share/wayland-sessions/` and `/usr/share/xsessions/`). This has been disabled, as at PSI this is typically set per system and not per user.
In Hiera the actual Desktop Session to be started can be selected/overridden by setting `desktop::session_manager` to one of the `.desktop` files in the directories listed above. Set it e.g. to `gnome-wayland` to test `Wayland`. It will then end up as the default session manager in `/etc/accountsservice/user-templates/standard`.
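For example, testing `Wayland` on a host would look like this in Hiera:

```yaml
desktop::session_manager: 'gnome-wayland'
```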
Note that when changing the default Session Manager, previous users will still get the one they used before. To reset that:
- stop AccountsService (`systemctl stop accounts-daemon.service`)
- delete `/var/lib/AccountsService/users/*` (for `gdm`)
- delete `/var/cache/lightdm/dmrc/*.dmrc` (for `lightdm`)
- delete `/var/lib/lightdm/.cache/lightdm-gtk-greeter/state` (for `lightdm` with `lightdm-gtk-greeter`)
- start AccountsService (`systemctl start accounts-daemon.service`)
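The steps above can be sketched as a small script (run as root; paths as listed, missing files are simply skipped):

```
systemctl stop accounts-daemon.service
rm -f /var/lib/AccountsService/users/*                    # gdm
rm -f /var/cache/lightdm/dmrc/*.dmrc                      # lightdm
rm -f /var/lib/lightdm/.cache/lightdm-gtk-greeter/state   # lightdm-gtk-greeter
systemctl start accounts-daemon.service
```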
### XFCE
XFCE is installed when `base::enable_xfce: true` is set in Hiera.
It is then also used as the default with `base::xfce_default: true` or `desktop::session_manager: xfce`.
### IceWM
IceWM is installed when `base::enable_icewm: true` is set in Hiera.
It is then also used as the default with `desktop::session_manager: icewm-session`.
### Using a different Desktop (e.g. KDE)
The respective Desktop needs to be installed, either manually or through Puppet.
The respective Session Manager can be set as system default in Hiera with `desktop::session_manager`.
If a different Display Manager is needed, or `lightdm` for other use cases, then changes in our Puppet code are required.

# Hardware Compatibility
This page documents hardware tests with RHEL 8 on standard PSI hardware.
Generally speaking, Linux has rather good compatibility with PC hardware; usually the older the hardware, the better the support.
## Desktop Hardware
### HP Elite Mini 800 G9 Desktop PC
- ✔ NIC
- ✔ GPU
- ✔ HDMI
- ✔ DP
- ✔ USB
- ✔ Audio
### HP Z2 Tower G9 Workstation Desktop PC
- ✔ NIC
- ✔ GPU
- ✔ DP
- ✔ USB
- ✔ Audio
## Mobile Hardware
### HP ZBook Studio 16 inch G10 Mobile Workstation PC
- ❌/✔ Installation only with extra steps
- ✔ NIC (HP USB-C to RJ45 Adapter G2)
- ✔ WLAN
- ✔ GPU
- ✔ USB
- ❌/✔ Audio (manual firmware binary install from https://github.com/thesofproject/sof-bin/ required); microphone and headphones work, but not the speakers
- ✔ Webcam
- ✔ Bluetooth
- SIM slot (not tested)
- fingerprint scanner (not tested)
#### Installation
- Nouveau driver fails on the Nvidia chip, so modesetting needs to be disabled:
```
bob node set-attr pcXYZ.psi.ch "kernel_cmdline=nomodeset nouveau.modeset=0"
```
- Register the system "Pass Through" MAC address (you find it in the BIOS).
- Installation did not work with the "HP USB-C to RJ45 Adapter G2": use another, registered adapter or dock instead.
### HP ZBook Studio 16 inch G11 Mobile Workstation PC
- ✔ NIC (HP USB-C to RJ45 Adapter G2)
- ✔ WLAN
- ❌/✔ GPU
- ✔ USB
- ❌ Monitor via USB C
- ✔ Audio
- ✔ Webcam
- ❌ Bluetooth
- SIM slot (not tested)
- fingerprint scanner (not tested)
#### Installation
The device has two GPUs, an Intel and an Nvidia:
```
# lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation Meteor Lake-P [Intel Arc Graphics] (rev 08)
01:00.0 VGA compatible controller: NVIDIA Corporation Device 28b9 (rev a1)
#
```
With the proprietary Nvidia driver this does not work, but with the open source Nouveau driver it is fine, except that it cannot handle more than the primary monitor.
To enable the Nouveau driver, put `nvidia::driver::enable: false` in Hiera, run `puppet agent -t` as root, remove all Nvidia related driver packages, then reboot.
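A sketch of these steps; the package name pattern is an assumption, so review what `dnf` would remove before confirming:

```
# after setting nvidia::driver::enable: false in Hiera
puppet agent -t
dnf remove '*nvidia*'   # review the resolved package list before confirming
reboot
```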
### HP EliteBook 840 14 inch G10 Notebook PC
- ✔ WLAN
- ✔ GPU
- ✔ HDMI
- ✔ USB
- ✔ Audio
- ✔ Webcam
- ✔ Bluetooth
- SIM slot (not tested)
- Card reader (not tested)
## Test Details
- network card by installation via network
- GPU by running graphical desktop, for Nvidia GPUs check if Nvidia drivers are used
- HDMI
- DP
- USB by mouse, keyboard
- notebook:
- wifi
- bluetooth (headset working?)
- microphone
- speaker
- webcam
- Dock
- network
- USB by mouse, keyboard
- HDMI

# Red Hat Enterprise Linux 8
## Production Ready
The central infrastructure (automatic provisioning, upstream package synchronisation and Puppet) is stable and production ready.
The configuration management is done with Puppet like for RHEL 7. RHEL 7 and RHEL 8 hosts can share the same hierarchy in Hiera and thus also the "same" configuration. In cases where the configuration for RHEL 7 or RHEL 8 differs, the idea is to have both in parallel in Hiera and Puppet shall select the right one.
Please still consider also implementing following two migrations when moving to RHEL 8:
- migrate from Icinga1 to [Icinga2](../../configuration/monitoring/icinga2.md), as Icinga1 will be decommissioned by end of 2024
- explicit [network configuration in Hiera](../../configuration/basic/networking.md) with `networking::setup`, especially if you have static IP addresses or static routes
Bugs and issues can be reported in the [Linux Issues project](https://git.psi.ch/linux-infra/issues).
## Documentation
* [Installation](../basic_installation.md)
* [CUDA and Nvidia Drivers](nvidia)
* [Kerberos](kerberos)
* [Desktop](desktop)
* [Hardware Compatibility](hardware_compatibility)
* [Vendor Documentation](vendor_documentation)
## Disk Layout
The default partition schema for RHEL8 is:
- create one primary ``/boot`` partition of 1 GB;
- create the ``vg_root`` Volume Group that uses the rest of the disk;
- on ``vg_root`` create the following logical volumes:
  - ``lv_root`` of 14 GB size for ``/``;
  - ``lv_home`` of 2 GB size for ``/home``;
  - ``lv_var`` of 8 GB size for ``/var``;
  - ``lv_var_log`` of 3 GB size for ``/var/log``;
  - ``lv_var_tmp`` of 2 GB size for ``/var/tmp``;
  - ``lv_tmp`` of 2 GB size for ``/tmp``.
## Caveats
### Missing or Replaced Packages
[List of packages removed in RHEL 8](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html-single/considerations_in_adopting_rhel_8/index#removed-packages_changes-to-packages)
| RHEL 7 | RHEL 8 | remarks |
| --- | --- | --- |
| `a2ps` | recommends to use `enscript` instead | [`enscript` upstream](https://www.gnu.org/software/enscript/) [`a2ps` upstream](https://www.gnu.org/software/a2ps/) |
| `blt` | - | [`blt` upstream](http://blt.sourceforge.net/), does not work with newer Tk version ([source](https://wiki.tcl-lang.org/page/BLT)) |
| `gnome-icon-theme-legacy` | - | used for RHEL 7 Icewm |
| ... | ... | here I stopped research, please report/document further packages |
| `devtoolset*` | `gcc-toolset*` | |
| `git-cvs` | - | `cvs` itself is not supported by RHEL8, but available through EPEL. Still missing is the support for `git cvsimport`. |
### Missing RAID Drivers
#### Missing RAID Drivers during Installation
For RHEL 8 Red Hat phased out some hardware drivers; here is an [official list](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/considerations_in_adopting_rhel_8/hardware-enablement_considerations-in-adopting-rhel-8#removed-adapters_hardware-enablement), but I also found missing drivers not listed there.
Installation with an unsupported RAID adapter then fails as the installer does not find a system disk to use.
To figure out which driver you need, it is best to go to the installer shell or boot a rescue Linux over the network, and there check the PCI Device ID of the RAID controller with
```
$ lspci -nn
...
82:00.0 RAID bus controller [0104]: 3ware Inc 9750 SAS2/SATA-II RAID PCIe [13c1:1010] (rev 05)
...
```
The ID is in the rightmost square brackets. Then check if there are drivers available.
I will now focus on [ElRepo](https://elrepo.org/), which provides drivers no longer supported by Red Hat. Check the PCI Device ID against their [list of device IDs](https://elrepo.org/tiki/DeviceIDs). If you find a driver, there are also [driver disks provided](https://linuxsoft.cern.ch/elrepo/dud/el8/x86_64/).
There are two options for providing this driver disk to the installer:
1. Download the according `.iso` file, extract it onto a USB stick labelled `OEMDRV` and have it connected during installation.
2. Extend the kernel command line with `inst.dd=$URL_OF_ISO_FILE`, e.g. with a custom Grub config on the [boot server](https://git.psi.ch/linux-infra/network-boot) or with the sysdb/bob attribute `kernel_cmdline`.
([Red Hat documentation of this procedure](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html-single/performing_an_advanced_rhel_8_installation/index#updating-drivers-during-installation_installing-rhel-as-an-experienced-user))
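For option 2, a hypothetical example using the sysdb/bob attribute; the hostname and `.iso` filename are placeholders and must be replaced with the actual driver disk for your controller:

```
bob node set-attr myhost.psi.ch \
    "kernel_cmdline=inst.dd=https://linuxsoft.cern.ch/elrepo/dud/el8/x86_64/dd-example.iso"
```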
At the end do not forget to enable the ElRepo RPM package repository in Hiera to also get new drivers for updated kernels:
```
# enable 3rd-party drivers from ElRepo
rpm_repos::default:
- 'elrepo_rhel8'
```
#### Missing RAID Drivers on Kernel Upgrade
If the machine does not boot after provisioning or after a kernel upgrade with
```
Warning: /dev/mapper/vg_root-lv_root does not exist
Warning: /dev/vg_root/lv_root does not exist
```
after a lot of
```
Warning: dracut-initqueue timeout - starting timeout scripts
```
then it could be that support for the RAID controller was removed with the new kernel; e.g. for the LSI MegaRAID SAS there is a [dedicated article](https://access.redhat.com/solutions/3751841).
For the LSI MegaRAID SAS there is still a driver available in ElRepo, so it can be installed during provisioning by Puppet.
To do so add to Hiera:
```
base::pkg_group::....:
- 'kmod-megaraid_sas'
rpm_repos::default:
- 'elrepo_rhel8'
```
### AFS cache partition not created due to existing XFS signature
It can happen when upgrading an existing RHEL 7 installation that the puppet run produces
```
Error: Execution of '/usr/sbin/lvcreate -n lv_openafs --size 2G vg_root' returned 5: WARNING: xfs signature detected on /dev/vg_root/lv_openafs at offset 0. Wipe it? [y/n]: [n]
```
This needs to be fixed manually:
- run the complaining command and approve (or use `--yes`)
- run `puppet agent -t` to finalize the configuration
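For the example above, the manual fix boils down to:

```
lvcreate --yes -n lv_openafs --size 2G vg_root
puppet agent -t
```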
### Puppet run fails to install KCM related service/timer on Slurm node
The Puppet run fails with
```
Notice: /Stage[main]/Profile::Aaa/Systemd::Service[kcm-destroy]/Exec[start-global-user-service-kcm-destroy]/returns: Failed to connect to bus: Connection refused
Error: '/usr/bin/systemctl --quiet start --global kcm-destroy.service' returned 1 instead of one of [0]
Error: /Stage[main]/Profile::Aaa/Systemd::Service[kcm-destroy]/Exec[start-global-user-service-kcm-destroy]/returns: change from 'notrun' to ['0'] failed: '/usr/bin/systemctl --quiet start --global kcm-destroy.service' returned 1 instead of one of [0] (corrective)
Notice: /Stage[main]/Profile::Aaa/Profile::Custom_timer[kcm-cleanup]/Systemd::Timer[kcm-cleanup]/Exec[start-global-user-timer-kcm-cleanup]/returns: Failed to connect to bus: Connection refused
Error: '/usr/bin/systemctl --quiet start --global kcm-cleanup.timer' returned 1 instead of one of [0]
Error: /Stage[main]/Profile::Aaa/Profile::Custom_timer[kcm-cleanup]/Systemd::Timer[kcm-cleanup]/Exec[start-global-user-timer-kcm-cleanup]/returns: change from 'notrun' to ['0'] failed: '/usr/bin/systemctl --quiet start --global kcm-cleanup.timer' returned 1 instead of one of [0] (corrective)
```
This is caused by the use of KCM as default Kerberos credential cache in RHEL8:
- for RHEL8 it was recommended to use the KCM provided by sssd as Kerberos Credential Cache.
- a major issue of this KCM is that it does not remove outdated caches
- this leads to a denial-of-service situation: when all 64 slots are filled, new logins start to fail (this is persistent, a reboot does not help)
- we fix this issue by regularly running a cleanup script in user context
- this "user context" is handled by the `systemd --user` instance, which is started on the first login and keeps running until the last session ends.
- that systemd user instance is started by `pam_systemd.so`
- `pam_systemd.so` and `pam_slurm_adopt.so` conflict because both want to set up cgroups
- because of this there is no `pam_systemd.so` configured on Slurm nodes thus there is no `systemd --user` instance
I see two options to solve this issue:
- do not use KCM
- get somehow systemd user instance running
#### do not use KCM
Can be done in Hiera; to get back to the RHEL 7 behavior set

```yaml
aaa::default_krb_cache: "KEYRING:persistent:%{literal('%')}{uid}"
```

then there will be no KCM magic any more.
We could also make this automatically happen in Puppet when Slurm is enabled.
#### get somehow systemd user instance running
`pam_systemd.so` does not want to take its hands off cgroups:
https://github.com/systemd/systemd/issues/13535
But there it is documented how to get (part of?) the `pam_systemd.so` functionality running with Slurm:
https://slurm.schedmd.com/pam_slurm_adopt.html#PAM_CONFIG
(the Prolog, TaskProlog and Epilog part).
I wonder if that also starts a `systemd --user` instance or not. Or if it is possible to somehow integrate the start of it therein.
### Workstation Installation Takes Long and Seems to Hang
On the very first Puppet run the command to install the GUI packages takes up to 10 minutes and it looks like it
is hanging. Usually this happens after the installation of `/etc/sssd/sssd.conf`. Just give it a bit of time.
### "yum/dnf search" Gives Permission Denied as Normal User
It works fine apart from the error message below:
```
Failed to store expired repos cache: [Errno 13] Permission denied: '/var/cache/dnf/x86_64/8/expired_repos.json'
```
which is IMHO fine, as a normal user should not be allowed to make changes there.

# Kerberos on RHEL 8
This document describes the state of Kerberos on RHEL 8.
This includes the current open issues, a user guide and how we solved the KCM (Kerberos Cache Manager) issues.
At the bottom you find sequence diagrams showing the interactions concerning authentication and Kerberos.
## Open Problems
- cleanup of caches, else we might end up in a DoS situation. Best we do this managed by `systemd --user`.
- Kerberos with Firefox does not work yet.
## User Guide
### Manage Ticket for Admin User
If you need a TGT for your admin user (e.g. `buchel_k-adm`) for administrative operations, then do
```
OLD_KRB5CCNAME=$KRB5CCNAME
export KRB5CCNAME=KCM:$(id -u):admin
kinit $(id -un)-adm
```
and after you are done do
```
kdestroy
export KRB5CCNAME=$OLD_KRB5CCNAME
```
to delete your administrative tickets and to get back to your normal credential cache.
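The switch can also be wrapped in a small shell function (the name `kadm` and the cache name `admin` are just conventions of this sketch) that runs a single command against the admin cache without touching the session's `KRB5CCNAME`:

```shell
# kadm: run one command with the admin credential cache, leaving the
# session's normal credential cache untouched.
kadm() {
    KRB5CCNAME="KCM:$(id -u):admin" "$@"
}

# Usage (hypothetical):
#   kadm kinit "$(id -un)-adm"   # authenticate the admin principal once
#   kadm klist                   # inspect the admin cache
#   kadm kdestroy                # drop the admin tickets when done
```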
### Update TGT on Long Running Sessions
The TGT will be automatically renewed for 7 days.
Note that a screen unlock or a new connection with NoMachine NX will update the credential cache with a new TGT.
But also manual reauthentication is possible. Inside the session you can do
```
kinit
```
Outside of the session you first need to figure out the credential cache used.
First get the process ID of the process which needs authentication, then
```
$ strings /proc/$PID/environ | grep KRB5CCNAME
KRB5CCNAME=KCM:44951:iepgjskbkd
$
```
and then a
```
KRB5CCNAME=KCM:44951:iepgjskbkd kinit
```
will update the given credential cache.
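Both steps can be combined into one line (a sketch; `$PID` as determined above):

```
KRB5CCNAME=$(strings "/proc/$PID/environ" | sed -n 's/^KRB5CCNAME=//p') kinit
```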
Note that AFS will look in all caches for a valid TGT, so logging in on the desktop, or via ssh with password or ticket delegation, is sufficient to make AFS access work for another week.
### List all Credential Caches
```
KRB5CCNAME=KCM: klist -l
```
lists all caches and
```
KRB5CCNAME=KCM: klist -A
```
also the tickets therein.
## Kerberos Use and Test Cases
- ssh authentication (authentication method `gssapi-with-mic`)
- ssh TGT (ticket granting ticket) delegation (with `GSSAPIDelegateCredentials yes`)
- AFS authentication (`aklog`)
- AFS administrative operations where the user switches to a separate admin principal (e.g. `buchel_k-adm`)
- long running sessions with `nohup`, `tmux` and `screen`
- local desktop: get new TGT on login
- local desktop: TGT renewal after reauthentication on lock screen
- remote desktop with NoMachine NX: get new TGT on login
- remote desktop with NoMachine NX: TGT renewal after reconnection
- website authentication (`SPNEGO` with Firefox, Chrome)
## KCM (Kerberos Cache Manager)
In RHEL 7 we are using the `KEYRING` (kernel keyring) cache,
whereas for RHEL 8 there came early the wish to use KCM instead,
which also is the [new default](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/considerations_in_adopting_rhel_8/identity-management_considerations-in-adopting-rhel-8#kcm-replace-keyring-default-cache_considerations-in-adopting-RHEL-8).
The Kerberos documentation contains a [reference for all available cache types]( https://web.mit.edu/kerberos/www/krb5-latest/doc/basic/ccache_def.html).
The KCM cache is provided by a dedicated daemon, for RHEL8 this is `sssd_kcm` which has been programmed by Red Hat itself.
### Advantages of KCM
The advantage of KCM is that the caches are permanent and survive daemon restarts and system reboots without the need to fiddle around with files and file permissions. This simplifies daemon and container use cases.
It also automatically renews tickets which is handy for every use case.
### User Based vs Session Based
Intuitively I would expect that something as delicate as authentication is managed per session (ssh, desktop, console login, ...).
Apparently with KCM this is not the case. It provides a default cache which is supposed to be the optimal one for you, and that can change at any time.
Problems I see with this are
- a user may change their principal, e.g. for admin operations (`kinit buchel_k-adm`), which is then used by all sessions
- user may destroy the cache (it is good security practice to have a `kdestroy` in `.bash_logout` to ensure nobody on the machine can use your tokens after logging out)
- software may put tokens into the cache which suddenly are not there any more
- the magic/heuristic used to select the default cache might not work optimally for all use cases (as we see below, `sssd-kcm` fails horribly...)
So if we have more than one session on a machine (e.g. people connecting via remote desktop and ssh at the same time), the cross-session side-effects can cause unexpected behaviour.
In contrast, for AFS token renewal having access to new tokens is helpful, as this allows prolonging the time a `PAG` (group of processes authenticated against AFS) keeps working, as long as there is at least one valid ticket available.
Or even to recover when a new ticket comes available again.
A way to get KCM out of the business of selecting the "optimal" cache is to select it yourself and provide the session/software one specific cache by setting the `KRB5CCNAME` environment variable accordingly (e.g. `KCM:44951:66120`). Note that when it is set to `KCM:`, the default cache will be whatever KCM believes should be the default cache, and that can change for whatever reason.
### Problems of `sssd_kcm`
To check the Kerberos credential cache, you can use `klist` to look at the current default cache and `klist -l` to look at all available caches. Note that the first listed cache is the default cache. Of course that is only valid when no `KRB5CCNAME` environment variable is set or it is set to `KCM:`.
#### No Cleanup of Expired Caches
The most obvious and [well known problem](https://github.com/SSSD/sssd/issues/3593) of `sssd-kcm` is that it does not remove expired tokens and credential caches. I agree that this should not have an impact, as it is mostly cosmetic. But that is only true when everything can cope with it...
By default it is limited to 64 caches, and when that limit was hit, it was not possible any more to authenticate on the lock screen:
```
Okt 05 14:57:11 lxdev01.psi.ch krb5_child[43689]: Internal credentials cache error
```
So this causes a denial-of-service problem we need to deal with somehow, e.g. by regularly removing expired caches. Note that these caches are persistent and do not get removed on reboot.
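A minimal cleanup sketch, assuming `klist -l` marks stale caches with a trailing `(Expired)` (as in the listings in this document) and relying on `kdestroy -c` accepting a cache name:

```shell
# Print the names of all expired KCM credential caches of the current user.
expired_kcm_caches() {
    KRB5CCNAME=KCM: klist -l 2>/dev/null | awk '/\(Expired\)$/ { print $2 }'
}

# Destroy each expired cache.
expired_kcm_caches | while read -r cache; do
    kdestroy -c "$cache"
done
```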
#### Use of Expired Credential Caches
In the example below you see that on the ssh login I got a new default cache. But after a few minutes (there was a Desktop login from my side and maybe an automatic AFS token renewal in between), I get an expired cache as the default cache.
```
$ ssh lxdev01.psi.ch
Last login: Tue Oct 4 09:50:33 2022
[buchel_k@lxdev01 ~]$ klist -l
Principal name Cache name
-------------- ----------
buchel_k@D.PSI.CH KCM:44951:42923
buchel_k@D.PSI.CH KCM:44951:12312 (Expired)
buchel_k@D.PSI.CH KCM:44951:42199 (Expired)
buchel_k@D.PSI.CH KCM:44951:40168
buchel_k@D.PSI.CH KCM:44951:8914 (Expired)
buchel_k@D.PSI.CH KCM:44951:62275 (Expired)
buchel_k@D.PSI.CH KCM:44951:27078 (Expired)
buchel_k@D.PSI.CH KCM:44951:73924 (Expired)
buchel_k@D.PSI.CH KCM:44951:72006
buchel_k@D.PSI.CH KCM:44951:64449 (Expired)
buchel_k@D.PSI.CH KCM:44951:60061 (Expired)
buchel_k@D.PSI.CH KCM:44951:36925 (Expired)
buchel_k@D.PSI.CH KCM:44951:48361 (Expired)
buchel_k@D.PSI.CH KCM:44951:49651 (Expired)
buchel_k@D.PSI.CH KCM:44951:76984 (Expired)
buchel_k@D.PSI.CH KCM:44951:54227 (Expired)
buchel_k@D.PSI.CH KCM:44951:85800 (Expired)
[buchel_k@lxdev01 ~]$ klist -l
Principal name Cache name
-------------- ----------
buchel_k@D.PSI.CH KCM:44951:12312 (Expired)
buchel_k@D.PSI.CH KCM:44951:42199 (Expired)
buchel_k@D.PSI.CH KCM:44951:40168
buchel_k@D.PSI.CH KCM:44951:8914 (Expired)
buchel_k@D.PSI.CH KCM:44951:62275 (Expired)
buchel_k@D.PSI.CH KCM:44951:27078 (Expired)
buchel_k@D.PSI.CH KCM:44951:73924 (Expired)
buchel_k@D.PSI.CH KCM:44951:72006
buchel_k@D.PSI.CH KCM:44951:64449 (Expired)
buchel_k@D.PSI.CH KCM:44951:60061 (Expired)
buchel_k@D.PSI.CH KCM:44951:36925 (Expired)
buchel_k@D.PSI.CH KCM:44951:48361 (Expired)
buchel_k@D.PSI.CH KCM:44951:42923
buchel_k@D.PSI.CH KCM:44951:49651 (Expired)
buchel_k@D.PSI.CH KCM:44951:76984 (Expired)
buchel_k@D.PSI.CH KCM:44951:54227 (Expired)
buchel_k@D.PSI.CH KCM:44951:85800 (Expired)
[buchel_k@lxdev01 ~]$
```
Note that the automatic AFS token renewal was introduced after we experienced this issue.
#### Busy Loop of `goa-daemon`
If [GNOME Online Accounts](https://wiki.gnome.org/Projects/GnomeOnlineAccounts) encounters a large number of Kerberos credential caches, it goes into a busy loop and causes `sssd-kcm` to consume 100% of one core. Happily ignored bugs at [Red Hat](https://bugzilla.redhat.com/show_bug.cgi?id=1645624#c113) and [Gnome](https://gitlab.gnome.org/GNOME/gnome-online-accounts/-/issues/79).
#### Zombie Caches by NoMachine NX
On a machine with remote desktop access via NoMachine NX I have seen the following cache list in the log:
```
# /usr/bin/klist -l
Principal name Cache name
-------------- ----------
fische_r@D.PSI.CH KCM:45334:73632 (Expired)
buchel_k@D.PSI.CH KCM:45334:55706 (Expired)
fische_r@D.PSI.CH KCM:45334:44226 (Expired)
fische_r@D.PSI.CH KCM:45334:40904 (Expired)
fische_r@D.PSI.CH KCM:45334:62275 (Expired)
fische_r@D.PSI.CH KCM:45334:89020 (Expired)
buchel_k@D.PSI.CH KCM:45334:25061 (Expired)
buchel_k@D.PSI.CH KCM:45334:35168 (Expired)
fische_r@D.PSI.CH KCM:45334:73845 (Expired)
fische_r@D.PSI.CH KCM:45334:47508 (Expired)
fische_r@D.PSI.CH KCM:45334:34317 (Expired)
fische_r@D.PSI.CH KCM:45334:52058 (Expired)
fische_r@D.PSI.CH KCM:45334:16150 (Expired)
fische_r@D.PSI.CH KCM:45334:84445 (Expired)
fische_r@D.PSI.CH KCM:45334:69076 (Expired)
buchel_k@D.PSI.CH KCM:45334:87346 (Expired)
fische_r@D.PSI.CH KCM:45334:57070 (Expired)
```
or on another machine in my personal list:
```
[buchel_k@pc14831 ~]$ klist -l
Principal name Cache name
-------------- ----------
buchel_k@D.PSI.CH KCM:44951:69748
buchel_k@D.PSI.CH KCM:44951:18506 (Expired)
buchel_k@D.PSI.CH KCM:44951:5113 (Expired)
buchel_k@D.PSI.CH KCM:44951:52685 (Expired)
buchel_k@D.PSI.CH KCM:44951:13951 (Expired)
PC14831$@D.PSI.CH KCM:44951:43248 (Expired)
PC14831$@D.PSI.CH KCM:44951:58459 (Expired)
buchel_k@D.PSI.CH KCM:44951:14668 (Expired)
buchel_k@D.PSI.CH KCM:44951:92516 (Expired)
[buchel_k@pc14831 ~]$
```
Both show principals which I am very sure have not been added manually by the user. So somewhere there is a security issue, either in `sssd-kcm` or in NoMachine NX.
In another experiment I logged into a machine with `ssh` and did `kdestroy -A`, which should destroy all caches:
```
[buchel_k@mpc2959 ~]$ kdestroy -A
[buchel_k@mpc2959 ~]$ klist -l
Principal name Cache name
[buchel_k@mpc2959 ~]$
```
After I logged in via NoMachine NX, I got a cache that had expired more than two months ago:
```
[buchel_k@mpc2959 ~]$ klist -l
Principal name Cache name
buchel_k@D.PSI.CH KCM:44951:16795 (Expired)
buchel_k@D.PSI.CH KCM:44951:69306
[buchel_k@mpc2959 ~]$ klist
Ticket cache: KCM:44951:16795
Default principal: buchel_k@D.PSI.CH
Valid starting Expires Service principal
13.07.2022 11:35:51 13.07.2022 21:26:19 krbtgt/D.PSI.CH@D.PSI.CH
renew until 14.07.2022 11:26:19
[buchel_k@mpc2959 ~]$ date
Do Sep 22 08:37:41 CEST 2022
[buchel_k@mpc2959 ~]$
```
Note that a non-expired cache is available, but NoMachine NX explicitly sets `KRB5CCNAME` to a specific KCM cache. And it contains a ticket/cache which is supposed to be gone.
So there is a security bug in `sssd-kcm`: it does not fully destroy tickets when being told to. And there is another security issue in the NoMachine NX -> `sssd-kcm` interaction. I assume that it talks with the KCM as root and somehow retrieves (or has saved somewhere) old caches and moves them over into the user context. But the cache may originally not have belonged to the user...
I have not found much concerning Kerberos on the NoMachine website.
## Solution Attempts
Ideally we would get to a solution which can do the following:
- interactive user sessions are isolated and do not interfere with each other
- AFS can get hold of new tickets and inject them into the PAGs as long as the user somehow regularly authenticates
- `systemd --user`, which resides outside the interactive user sessions, is happy as well
- `goa-daemon` sees only one cache
- expired caches get somehow cleaned up
### Only One Cache
`sssd-kcm` limits the number of caches to 64 by default, but that can be changed to 1 with the `max_uid_ccaches` option.
There would then be only one cache, shared by all sessions, but at least the KCM cannot serve anything but the latest one.
However, some logins no longer work when the maximum number of caches is hit, as already documented above in the chapter "No Cleanup of Expired Caches".
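A sketch of the corresponding `sssd.conf` fragment, assuming the `[kcm]` section and `max_uid_ccaches` option as documented in `sssd-kcm(8)`:

```ini
[kcm]
max_uid_ccaches = 1
```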
### renew-afstoken Script/Daemon
For AFS we (Achim and I) made the script `renew-afstoken`, which is started as a per-PAG daemon by PAM upon login.
Out of the available KCM caches it selects a suitable one to regularly get a new AFS token.
This now works very robustly and can also recover from expiration when a new ticket becomes available.
### Setup Shared or Isolated Caches with KRB5CCNAME in own PAM Module
The self-made PAM module `pam_single_kcm_cache.so` improves the situation by setting
- `KRB5CCNAME=KCM:$UID:desktop` to use a shared credential cache for desktop sessions and `systemd --user`
- `KRB5CCNAME=KCM:$UID:$RANDOM_LETTERS` for text sessions to provide session isolation
and providing a working TGT in these caches.
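The naming scheme can be sketched in shell (illustrative only; the real module is written in C and sets the variable through PAM):

```shell
# Shared cache for desktop sessions and `systemd --user`; $1 is the UID
desktop_ccname() { printf 'KCM:%s:desktop' "$1"; }

# Random suffix for text sessions to achieve session isolation
text_ccname() {
    suffix=$(LC_ALL=C tr -dc 'a-z' </dev/urandom | head -c 8)
    printf 'KCM:%s:%s' "$1" "$suffix"
}
```

For example, `desktop_ccname "$(id -u)"` yields something like `KCM:1000:desktop`.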
So far I have identified two cases of the PAM program flow to manage:
- **TGT delegation** as done by `sshd` with authentication method `gssapi-with-mic`, where a new cache is created by `sshd` and then filled with the delegated ticket
- **TGT creation** as done by `pam_sss.so` upon password authentication, where a new TGT is created and placed into the KCM-managed default cache.
Now there is no simple and bullet-proof way to determine where the TGT ends up in the KCM.
It might be the KCM-designated default cache, or not.
To work around this, the module iterates through all credential caches provided by the KCM and copies a TGT which is younger than 10 s and has a principal matching the username.
Note that the reason for `systemd --user` to use the same credential cache as the desktop sessions is that at least Gnome uses it to start the user programs like Evolution or Firefox.
The code is publicly available on [Github](https://github.com/paulscherrerinstitute/pam_single_kcm_cache).
## Diagrams about Kerberos related Interactions
The diagrams below show how PAM, and especially `pam_single_kcm_cache.so`, interacts with the KCM in different use cases.
### Login with SSH using Password Authentication
![Login with SSH and Password Authentication](_static/kerberos_sshd_password_only.png)
That is kind of the "common" authentication case where all the important work is done in PAM. This is the same for login on the virtual console or when using `su` with a password. At the end there is a shell session with a credential cache which is not used by any other session (unless the user shares it somehow manually). This way, session isolation is achieved.
### Login with SSH using Kerberos Authentication and TGT Delegation
![Login with SSH and Kerberos Authentication with TGT Delegation](_static/kerberos_sshd_tgt_delegation.png)
This is a bit simpler, as all the authentication is done in `sshd` and only the session setup is done by PAM. Note that `sshd` does not use the default cache, but instead always creates a new one with the delegated TGT.
### Systemd User Instance
In the diagrams above we see how `systemd --user` is being started. It also uses PAM to set up its own session, but it does not do any authentication.
![Systemd User Instance](_static/kerberos_systemd_user.png)
Here we use a predefined name for the credential cache so it can be shared with the desktop sessions. The next diagram shows in more detail how `systemd --user` and the Gnome desktop interact.
### Gnome Desktop
This is the most complex use case:
![Gnome Desktop](_static/kerberos_desktop.png)
At the end we have a well-known shared credential cache between Gnome and `systemd --user`. This is needed because `systemd --user` is used extensively by Gnome. It is important that the Kerberos setup already happens at the authentication phase, as there is no session setup phase for screen unlock: the user returns to an already existing session.
With NoMachine NX this is configured similarly.
## PS
There is an advantage in the broken `sssd-kcm` default cache selection: it forces us to make our stuff robust against KCM glitches, which might also occur with a better manager, just way less often, and then they would be harder to explain and to track down.
# CUDA and Proprietary Nvidia GPU Drivers on RHEL 8
Managing Nvidia software comes with its own set of challenges.
The most common cases are covered by our Puppet configuration.
Those are discussed in the first chapter; more details follow further below.
## Hiera Configuration
Changes in Hiera are forwarded by Puppet to the node, but **not applied**.
They are applied on **reboot**.
Alternatively you might execute `/opt/pli/libexec/ensure-nvidia-software` in a safe moment (no process may be using CUDA, and the desktop will be restarted).
### I just need the Nvidia GPU drivers
Nothing needs to be done, they are installed by default when Nvidia GPUs or accelerators are found.
### I need CUDA
Set `nvidia::cuda::enable: true` in Hiera and it will automatically install the suitable Nvidia drivers and the newest possible CUDA version.
The `nvidia_persistenced` service is automatically started. If you do not want it, set `nvidia::cuda::nvidia_persistenced::enable: false`.
### I need a specific CUDA version
Then you can additionally set `nvidia::cuda::version` to the desired version.
The version must be fully specified (all three numbers, with X.Y.0 for the GA version).
Note that newer CUDA versions do not support older drivers, for details see Table 3 in the [CUDA Release Notes](https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html).
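A Hiera data sketch (the version string is illustrative; pick one listed in the release notes):

```yaml
nvidia::cuda::enable: true
nvidia::cuda::version: '11.4.0'
```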
### I do not want the Nvidia drivers
Set in Hiera `nvidia::driver::enable: false`. Note this will be ignored if CUDA is enabled (see above).
Note that the drivers do not get removed automatically when already installed. That you would need to do by hand.
### I need the Nvidia drivers from a specific driver branch
The driver branch can be selected in Hiera with `nvidia::driver::branch`. It will then use the latest driver version of that branch. Note that only production branches are available in the PSI package repository.
### I need a Nvidia driver of a given version
This is not recommended, still it is possible to do so by setting the exact driver version (X.Y.Z, excluding the package iteration number) in Hiera with `nvidia::driver::version`.
If the driver version is too old, it will install an older kernel version and you will need a second reboot to activate it.
### My hardware is very old
The oldest driver branch packaged by Nvidia for RHEL 8 is `470`. For hardware only supported by older drivers, it falls back to ElRepo-packaged drivers. You might also do that on purpose in Hiera by setting `nvidia::driver::branch: elrepo` (or, when you want a specific ElRepo branch: `nvidia::driver::branch: 390xx`).
Or you might just live with the fallback to Nouveau (`nvidia::driver::enable: false` in Hiera).
Alternatively you might also just download and install the Nvidia driver manually.
Go to their [Download page](https://www.nvidia.de/Download/index.aspx), select and download the appropriate installer, and run it.
You best keep Puppet off your driver by setting `nvidia::driver::enable: false` in Hiera.
## Versioning Mess
I did not find much information about Nvidia's driver version structure and policy. Still, I concluded that they use the following pattern.
### Driver Branches
Their drivers are organized in driver branches, as you see for example in their [Unix Driver Archive](https://www.nvidia.com/en-us/drivers/unix/), noted as e.g. `470.xx series`.
There are `Production` and `New Feature` branches (and, on the above linked page, a `Beta Version` which is not linked to any of the above branches (yet?)).
Such a branch can be considered a major release, with new branches adding support for new hardware or removing support for old hardware.
The drivers within a branch are maintained for quite a long time. Individual drivers in a branch get increasing version numbers which start with the same first "branch" number.
In the RPM repo there are more branches available than listed in the [Unix Driver Archive](https://www.nvidia.com/en-us/drivers/unix/). It is not possible to find out retrospectively what type of branch one belongs to. My guess is that the "Legacy" section lists only the production/long-term-support branches.
Also, it is not possible to find out from the package meta information whether a driver is considered beta or not. That you only find out by googling "Nvidia $DRIVER_VERSION" and looking at the respective driver page. In my experience, the first few driver versions of a branch are usually "beta".
### What Driver \[Branch] for which Hardware
The most authoritative way is to check [Appendix A of the README of a recent driver](http://us.download.nvidia.com/XFree86/Linux-x86_64/525.78.01/README/supportedchips.html).
There, search for your model or PCI ID, then check at the top of the respective table which legacy driver still supports it.
Or it might be the current driver.
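To find the PCI ID to search for, `lspci` can be used. The helper below filters for Nvidia's PCI vendor ID `10de` (a sketch, assuming the bracketed `[vendor:device]` output format of `lspci -nn`):

```shell
# Extract the [10de:xxxx] vendor:device IDs of Nvidia devices
# from `lspci -nn` style output
find_nvidia_ids() { grep -o '\[10de:[0-9a-f]*\]'; }

# On a real machine: lspci -nn | find_nvidia_ids
```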
Another more automated option to figure out the driver is the third-party tool [`nvidia-detect`](http://elrepo.org/tiki/nvidia-detect) by ElRepo. It tells which driver package from ElRepo it suggests, but it can also be used to figure out which production/long term support branch can be used.
### CUDA - Driver Compatibility
A CUDA version needs a suitably new driver version, but old CUDA versions are supported by newer driver versions (drivers are backwards-compatible). To figure out up to which CUDA version runs on your installed driver, check out "Table 3" of the [CUDA release notes](https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html). For each driver branch there is a major 11.x.0 release with possible further bugfix releases.
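A small sketch for such checks: compare the installed driver version against a required minimum. The `470.57.02` value below is illustrative only; take the real minimum from Table 3 of the release notes.

```shell
# True if version $1 >= version $2, using natural version sort
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical check: is the installed driver new enough for a given CUDA?
if command -v nvidia-smi >/dev/null 2>&1; then
    installed=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
    version_ge "$installed" "470.57.02" && echo "driver new enough"
fi
```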
## Manual Operation
Instead of using Puppet/Hiera, you may also manage the drivers manually.
Note that the drivers made available by default are curated, that means only non-beta production drivers are included. If you want all drivers available, you need to use `https://repos.psi.ch/rhel8/sources/cuda8/` as the URL for the package repository.
### Select the Driver Branch
In the RPM package repository the driver branches are mapped to module streams, so there are different streams for different branches and `dnf module list nvidia-driver` will tell you what is available:
```
# dnf module list nvidia-driver
Last metadata expiration check: 2:37:29 ago on Mon 28 Nov 2022 09:15:57 AM CET.
CUDA and drivers from Nvidia
Name Stream Profiles Summary
nvidia-driver latest default [d], fm, ks, src Nvidia driver for latest branch
nvidia-driver latest-dkms [d] default [d], fm, ks Nvidia driver for latest-dkms branch
nvidia-driver open-dkms default [d], fm, ks, src Nvidia driver for open-dkms branch
nvidia-driver 418 default [d], fm, ks, src Nvidia driver for 418 branch
nvidia-driver 418-dkms default [d], fm, ks Nvidia driver for 418-dkms branch
nvidia-driver 440 default [d], fm, ks, src Nvidia driver for 440 branch
nvidia-driver 440-dkms default [d], fm, ks Nvidia driver for 440-dkms branch
nvidia-driver 450 default [d], fm, ks, src Nvidia driver for 450 branch
nvidia-driver 450-dkms default [d], fm, ks Nvidia driver for 450-dkms branch
nvidia-driver 455 default [d], fm, ks, src Nvidia driver for 455 branch
nvidia-driver 455-dkms default [d], fm, ks Nvidia driver for 455-dkms branch
nvidia-driver 460 default [d], fm, ks, src Nvidia driver for 460 branch
nvidia-driver 460-dkms default [d], fm, ks Nvidia driver for 460-dkms branch
nvidia-driver 465 default [d], fm, ks, src Nvidia driver for 465 branch
nvidia-driver 465-dkms default [d], fm, ks Nvidia driver for 465-dkms branch
nvidia-driver 470 default [d], fm, ks, src Nvidia driver for 470 branch
nvidia-driver 470-dkms [e] default [d] [i], fm, ks Nvidia driver for 470-dkms branch
nvidia-driver 495 default [d], fm, ks, src Nvidia driver for 495 branch
nvidia-driver 495-dkms default [d], fm, ks Nvidia driver for 495-dkms branch
nvidia-driver 510 default [d], fm, ks, src Nvidia driver for 510 branch
nvidia-driver 510-dkms default [d], fm, ks Nvidia driver for 510-dkms branch
nvidia-driver 515 default [d], fm, ks, src Nvidia driver for 515 branch
nvidia-driver 515-dkms default [d], fm, ks Nvidia driver for 515-dkms branch
nvidia-driver 515-open default [d], fm, ks, src Nvidia driver for 515-open branch
nvidia-driver 520 default [d], fm, ks, src Nvidia driver for 520 branch
nvidia-driver 520-dkms default [d], fm, ks Nvidia driver for 520-dkms branch
nvidia-driver 520-open default [d], fm, ks, src Nvidia driver for 520-open branch
Hint: [d]efault, [e]nabled, [x]disabled, [i]nstalled
#
```
The first try would be to pick the number of the desired branch. Currently, `520*` and `latest` are empty because the drivers were removed.
The "number only" module streams contain precompiled drivers for some kernels. Note that for older branches or older drivers, the driver may not be precompiled for the latest kernel version. For older branches, my experience is that the `*-dkms` module stream works better for newer kernels. But I did not manage to do "real" DKMS with them, that means compiling the translation layer of any given driver version for whatever kernel. Feel free to update this guide or to tell the Core Linux Team if you found a working procedure.
Finally, the `*-open` module streams contain the new open source drivers, which currently do not provide the full feature set of the proprietary ones.
### Install a Driver
It works best to install the whole module stream:
```
dnf module install "nvidia-driver:$STREAM"
```
Alternatively the module stream might be enabled first (`dnf module enable "nvidia-driver:$STREAM"`) and the packages installed individually after, but then you have to figure out yourself what all is needed.
If the installation command is rather unhappy and complains a lot about `is filtered out by modular filtering`, then there is already a module stream enabled and some driver installed. To clean that up, do:
```
dnf remove cuda-driver nvidia-driver
dnf module reset nvidia-driver
```
Note that this will also remove installed CUDA packages.
### Install CUDA
It is not recommended to install the `cuda` meta-package directly, because that requires the latest drivers from the "new feature" branch. It is better to install the `cuda-11-x` meta-package instead, which installs the CUDA version suitable for your driver and then keeps it updated with bugfix releases to this specific major release. Check out Table 3 in the [CUDA Release Notes](https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html) for details.
The `cuda` meta-package is by default excluded as explained above. If you still want to use it, do
```
dnf --disableexcludes cuda install cuda
```
After manual CUDA installation you should think about enabling and starting `nvidia-persistenced`:
```
systemctl enable nvidia-persistenced
systemctl start nvidia-persistenced
```
## Regular Tasks by the Core Linux Team
- classify new driver branches and beta versions in the [snapshot preparation script](https://git.psi.ch/linux-infra/rpm-repo-utils/-/blob/main/bin/fix-snapshot/20_remove_nvidia_beta_drivers#L90)
- update the latest production branch in the [Puppet managed Nvidia software installation script](https://git.psi.ch/linux-infra/puppet/-/blob/preprod/code/modules/profile/files/nvidia/ensure-nvidia-software#L17)
- add more production/long term support branches supported by [`nvidia-detect`](http://elrepo.org/tiki/nvidia-detect) to the [Puppet managed Nvidia software installation script](https://git.psi.ch/linux-infra/puppet/-/blob/preprod/code/modules/profile/files/nvidia/ensure-nvidia-software#L62)
- update the [driver version to CUDA version mapping script](https://git.psi.ch/linux-infra/puppet/-/blob/preprod/code/modules/profile/files/nvidia/suitable_cuda_version#L21) according to new entries in the [CUDA Release Notes](https://docs.nvidia.com/cuda/cuda-toolkit-release-notes/index.html)
# Vendor Documentation
## User Documentation
* [Using the desktop environment in RHEL-8](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/using_the_desktop_environment_in_rhel_8/)
## Administrator Documentation
* [Configuring basic system settings](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/configuring_basic_system_settings/)
* [Administration and configuration tasks using System Roles in RHEL](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/administration_and_configuration_tasks_using_system_roles_in_rhel/)
* [Deploying different types of servers](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/deploying_different_types_of_servers/)
* [Using SELinux](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/using_selinux/)
## Product Documentation
### Red Hat Enterprise Linux 8
* [Red Hat Enterprise Linux 8](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/)
### Red Hat Ansible Tower (AWX)
* [Red Hat Ansible Tower (AWX)](https://docs.ansible.com/ansible-tower/)
* [Red Hat Ansible Tower (AWX) Installation and Reference Guide](http://docs.ansible.com/ansible-tower/latest/html/installandreference/index.html)
* [Red Hat Ansible Tower (AWX) User Guide](http://docs.ansible.com/ansible-tower/latest/html/userguide/index.html)
* [Red Hat Ansible Tower (AWX) Administration Guide](http://docs.ansible.com/ansible-tower/latest/html/administration/index.html)
### Red Hat Ansible Engine
* [Ansible Documentation Overview](https://docs.ansible.com/)
* [Ansible Engine](https://docs.ansible.com/ansible/latest/index.html)
* [Ansible User Guide](https://docs.ansible.com/ansible/latest/user_guide/index.html)
### Red Hat Satellite
* [Red Hat Satellite Documentation](https://access.redhat.com/documentation/en-us/red_hat_satellite/6.9/)
* [Administering Red Hat Satellite](https://access.redhat.com/documentation/en-us/red_hat_satellite/6.9/html/administering_red_hat_satellite/)
* [Managing Hosts](https://access.redhat.com/documentation/en-us/red_hat_satellite/6.9/html/managing_hosts/)
* [Provisioning Guide](https://access.redhat.com/documentation/en-us/red_hat_satellite/6.9/html/provisioning_guide/)
* [Content Management Guide](https://access.redhat.com/documentation/en-us/red_hat_satellite/6.9/html/content_management_guide/)
* [Adding Custom RPM Repositories](https://access.redhat.com/documentation/en-us/red_hat_satellite/6.9/html/content_management_guide/importing_custom_content#Importing_Custom_Content-Creating_a_Custom_RPM_Repository)
* [Uploading Content to Custom RPM Repositories](https://access.redhat.com/documentation/en-us/red_hat_satellite/6.9/html/content_management_guide/importing_custom_content#uploading-content-to-a-custom-rpm-repository)
# Hardware Compatibility
Here, hardware tests with RHEL9 on standard PSI hardware are documented.
Generally speaking, Linux has rather good hardware compatibility with PC hardware; usually, the older the hardware, the better the support.
## Desktop Hardware
### HP Elite Mini 800 G9 Desktop PC
- ✔ NIC
- ✔ GPU
- ✔ HDMI
- ✔ DP
- ✔ USB
- ✔ Audio
### HP Z2 Tower G9 Workstation Desktop PC
- ✔ NIC
- ✔ GPU
- ✔ DP
- ✔ USB
- ✔ Audio
## Mobile Hardware
### HP ZBook Studio 16 inch G10 Mobile Workstation PC
- ❌/✔ Installation only with extra steps
- ✔ NIC (HP USB-C to RJ45 Adapter G2)
- ✔ WLAN
- ✔ GPU
- ✔ USB
- ❌/✔ Audio (manual firmware binary install from https://github.com/thesofproject/sof-bin/ required); microphone and earphones work, but not the speaker
- ✔ Webcam
- ✔ Bluetooth
- SIM slot (not tested)
- fingerprint scanner (not tested)
#### Installation
- Nouveau driver fails on the Nvidia chip, so modesetting needs to be disabled:
```
bob node set-attr pcXYZ.psi.ch "kernel_cmdline=nomodeset nouveau.modeset=0"
```
- Register the system "Pass Through" MAC address (you find it in the BIOS).
- Installation did not work with the "HP USB-C to RJ45 Adapter G2": use another, registered adapter or dock instead.
- At the final boot the Nvidia drivers are installed automatically, but they cannot be activated and the screen turns black. Power off and on again; then the Nvidia drivers get loaded properly on the next boot.
### HP ZBook Studio 16 inch G11 Mobile Workstation PC
- ✔ NIC (HP USB-C to RJ45 Adapter G2)
- ✔ WLAN
- ✔ GPU
- ✔ USB
- ✔ Monitor via USB C
- ✔ Audio
- ✔ Webcam
- ✔ Bluetooth
- SIM slot (not tested)
- fingerprint scanner (not tested)
### HP EliteBook 840 14 inch G10 Notebook PC
- ✔ WLAN
- ✔ GPU
- ✔ HDMI
- ✔ USB
- ❌/✔ Audio (manual firmware binary install from https://github.com/thesofproject/sof-bin/ required)
- ✔ Webcam
- ✔ Bluetooth
- SIM slot (not tested)
- Card reader (not tested)
## Test Details
- network card by installation via network
- GPU by running graphical desktop, for Nvidia GPUs check if Nvidia drivers are used
- HDMI
- DP
- USB by mouse, keyboard, monitor (USB C)
- notebook:
- wifi
- bluetooth (headset working?)
- microphone
- speaker
- webcam
- Dock
- network
- USB by mouse, keyboard
- HDMI
# Red Hat Enterprise Linux 9
## Hardware Compatibility
- [Hardware Compatibility](hardware_compatibility.md)
## Alpha Testing
We encourage you to install RHEL9 on testing systems and tell us what we may have broken, what bugs you run into, and what features you are missing. Please be aware of certain things we changed in how we want the base operating system to work, listed below.
Bugs and issues can be reported in the [Linux Issues project](https://git.psi.ch/linux-infra/issues).
Additional resource: [Considerations in adopting RHEL 9](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html-single/considerations_in_adopting_rhel_9/index#doc-wrapper)
## Install Documentation
* [Installation Documentation](../basic_installation.md)
## Changes for Base Installation
### No Support for AFS
The future support model for AFS, or for the service and functionality it provides, is currently under consideration. Consequently, it has been decided not to support AFS in the PSI RHEL9 distribution for the time being.
### Network Configuration
By default, the network configuration is handed over to NetworkManager, which does automatic configuration.
If you wish different behaviour, e.g. for static IP addresses, please check out the [Network Configuration guide](../../configuration/basic/networking.md).
### Disk Layout
The provided disk size for virtual machines is 64GB.
| Name | Path | Size | Type | LVM |
| --- | --- | --- | --- | --- |
| root | / | 16GB | xfs | yes |
| home | /home | 2GB | xfs | yes |
| tmp | /tmp | 2GB | xfs | yes |
| var | /var | 16GB | xfs | yes |
| log | /var/log | 4GB | xfs | yes |
| boot | /boot | 1GB | ext4 | no |
| swap | - | 4GB | swap | no |
### Workstation Package Groups
We changed the amount of packages installed by default on a workstation installation. See the comparison below:
| RHEL 7&8 | RHEL 9 |
| --- | --- |
| <ul><li>VMware platform specific packages (platform-vmware)</li><li> Container Management (container-management)</li><li> Internet Browser (internet-browser)</li><li> GNOME (gnome-desktop)</li><li> Headless Management (headless-management)</li><li> Server product core (server-product)</li><li> Hardware Monitoring Utilities (hardware-monitoring)</li><li> base-x (base-x)</li><li> core (core)</li><li> fonts (fonts)</li><li> guest-desktop-agents (guest-desktop-agents)</li><li> hardware-support (hardware-support)</li><li> input-methods (input-methods)</li><li> multimedia (multimedia)</li><li> networkmanager-submodules (networkmanager-submodules)</li><li> print-client (print-client)</li><li> standard (standard)</li></ul> | <ul><li>Internet Browser</li><li>VMware platform specific packages (platform-vmware)</li><li>GNOME (gnome-desktop)</li><li>Core (core)</li><li> fonts (fonts)</li><li> multimedia (multimedia)</li><li> office-suite (office-suite)</li><li> print-client (print-client)</li></ul> |
## Caveats
### Missing or Replaced Packages
[List of packages removed in RHEL 9](https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/9/html-single/considerations_in_adopting_rhel_9/index#removed-packages_assembly_changes-to-packages)
| RHEL 8 | RHEL 9 | Remarks |
| --- | --- | --- |
| `mailx` | `s-nail` | S-nail is MIME capable and has extensions for line editing, S/MIME, SMTP, IMAP, POP3, and more. |
| `platform-python, python2 (python27:2.7), python36 (python36:3.6), python38 (python38:3.8), python39 (python39:3.9)` | `python3` | As for all python* packages |
| `pulseaudio` | `pipewire-pulseaudio` | The pulseaudio server implementation has been replaced by the pipewire-pulseaudio implementation. Note that only the server implementation has been switched. The pulseaudio client libraries are still in use. |
| `inkscape1` | `inkscape` | Also affects `inkscape-docs` and `inkscape-view` |
# Guidelines
## General Directives
Regardless of the content of this documentation, all PSI directives have precedence and need to be applied/followed. Among others these are some of the most relevant ones regarding IT systems:
- [AW-95-06-01](https://DLS01P.PSI.CH/documents/jsp/qv?id=PSIcgc_fi20220000526102) / [AW-95-06-01e](https://DLS01P.PSI.CH/documents/jsp/qv?id=PSIcgc_fi20220000526095) Usage and Monitoring of IT Ressources at PSI
- [AA-9500-142](https://DLS01P.PSI.CH/documents/jsp/qv?pri=PSIcgc&ft=cgcDocument@STORE_MAIN_CGC&q_cgcId=ID22023011616445798386)/[AA-9500-142e](https://DLS01P.PSI.CH/documents/jsp/qv?id=PSIcgc_fi20230003212302) Handling of Software Updates
## Version Control
Everything must be in version control before being used on production systems. In particular, scripts and other software, SPEC files for packages, relevant documentation, Puppet code, etc.
## Hiera
The naming of the variables inside Hiera depends on the usage scope of the variables. Variables being used only inside one specific class will be named `base_class_name::variable`, where `base_class_name` is the last part of the class name, without the part before the last `::` separator. Example: the `permit_root_login` variable for the `profile::ssh_server` class will be named `ssh_server::permit_root_login`.
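The convention can be illustrated with a Hiera data sketch (the value shown is illustrative):

```yaml
# Variable for the profile::ssh_server class
ssh_server::permit_root_login: false
```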
## External Code
Although the installation infrastructure makes heavy use of external code, the system has to avoid depending on the availability of external services as much as possible. A WAN outage or a remote HTTP server failure should not influence the installation system. For this reason, all external code is mirrored internally in specific git repositories.
## Servers and services
Every server should support exactly one service, e.g. Puppet or FTP. This makes the services more independent (e.g. for downtimes), simplifies the structure of the corresponding Puppet code, makes it easier to reason about the environment, and prevents conflicts regarding certain configuration settings.
### Naming Convention Nodes / Servers
Node/server names have the form `lx-purpose-[0-9][0-9].psi.ch`, where `purpose` is the purpose of the server or the service provided by it. Example: `lx-boot-01.psi.ch` is the **boot** server.
The production server always has a DNS alias `purpose.psi.ch`, and clients should always use this alias to connect to the server.
When putting system names into configuration files, we always use lower case and the fully qualified domain name.
## Software Update Policy
It is the responsibility of the owner/administrator of a system to define the software update policy and ensure it is applied!
### Automatic Updates
By default, security updates are applied automatically once a week (in the night from Sunday to Monday). Other updates, including kernel updates, need to be installed manually.
This is [configurable](configuration/software/package_updates.md): you may switch it off completely, have it run daily, or have it install all updates.
Reboots are never done automatically.
For software installed from sources other than RPM package repositories (e.g. via `pip` or manually), there is no automatic update procedure.
### Snapshots
On specially protected systems where stability is more important than being up to date, the provided RPM package versions can be frozen to a specified date. This can also be [configured in Hiera](configuration/software/package_repositories.md) (see the chapter "Using Specific Package Repository Snapshot"). If a system is pinned to a specific snapshot via such a "Repo Tag", the update procedure cannot go beyond the given state.
Again, this should only be done for nodes in protected networks, e.g. with access restrictions through an [ssh gateway](../services/admin-guide/ssh_gateways.md), and requires the consent of IT Security.
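A minimal sketch of such a pin in Hiera might look like this (the key name and date are hypothetical; see the linked page for the actual parameter):

```yaml
# Hypothetical key: freeze the package repositories to the snapshot of this date
package_repositories::repo_tag: "2024-06-01"
```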

# Admin Guide
This is documentation relevant for system admins.
```{tableofcontents}
```

# Puppet
```{tableofcontents}
```

# Hiera
Please refer to the [Hiera documentation](https://docs.puppet.com/hiera/) for a general introduction.
Our current hierarchy has seven levels (earlier levels take precedence during value lookup):
- nodes (FQDN)
- subgroup (optional, `puppet_subgroup` attribute in sysdb)
- group (`puppet_group` attribute in sysdb)
- sysdb environments
- Puppet server specific
- global
- common
The first four layers can be edited by the admin in the respective Hiera git repository. The common layer (default values) and the server-specific layer (differences between test and prod) are part of the Puppet code repository. Finally, the global layer contains a few configurations which are managed by the Core Linux Group outside of the normal Puppet release process, e.g. for license management.
The values can be stored as classical YAML values or with [encrypted yaml](https://github.com/TomPoulton/hiera-eyaml) for secrets.
The filesystem structure is as follows (the last three cannot be controlled by a common admin):
1. `%{::sysdb_env}/%{::group}/%{::fqdn}.yaml` or `%{::sysdb_env}/%{::group}/%{::subgroup}/%{::fqdn}.yaml`
2. `%{::sysdb_env}/%{::group}/%{::subgroup}.yaml`
3. `%{::sysdb_env}/%{::group}.yaml`
4. `%{::sysdb_env}/%{::sysdb_env}.yaml`
5. `%{::environment}/data/server_%{server_facts.servername}.yaml`
6. `/srv/puppet/data/global/global.yaml`
7. `%{::environment}/data/common.yaml`

Depending on whether a subgroup is defined, the node-specific YAML sits at a different level in the filesystem hierarchy.
The `%{variable}` notation is Hiera specific.
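To illustrate the lookup order, suppose the same key is defined at both the group and the node level (file paths and values here are illustrative):

```yaml
# hpc/hpc-login.yaml (group level)
ssh_server::permit_root_login: "no"

# hpc/hpc-login/lx-login-01.psi.ch.yaml (node level -- consulted first, so it wins)
ssh_server::permit_root_login: "without-password"
```

A lookup on `lx-login-01.psi.ch` then returns `without-password`.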
## Repositories
Hiera data are organized in different repositories. These repositories are located at: https://git.psi.ch/linux-infra/hiera
Each __sysdb environment__ has a dedicated Hiera repository called `data-<sysdbenv>`, e.g. [data-hpc](https://git.psi.ch/linux-infra/hiera/data-hpc).
The first four levels of the filesystem structure shown above correspond to the files inside these repositories.
Any change to the repo automatically triggers a redeployment of the new version of its content on the Puppet master within a few seconds of the push.
## Configuration
### Secrets
Secrets and clear-text values can be mixed inside the same YAML file, e.g.:
```yaml
ntp_client::servers:
- pstime1.psi.ch
- pstime2.psi.ch
- pstime3.psi.ch
secret_key: ENC[PKCS7,MIIBiQYJKoZIhvcNA...AMA==]
```
The encrypted values are decrypted transparently by Hiera (on a host having the proper Hiera key):
```bash
[root]# hiera secret_key
this is a secret value
```
You can edit encrypted data inside any YAML file with the command `/opt/puppetlabs/puppet/bin/eyaml edit common.yaml`; the encrypted values then appear in clear text inside the editor.
### Encrypt Data
To encrypt data, use the public key named `eyaml_public_key.pem` from your Hiera (`data-*`) git repository.
For the lower layers (global, server or data) it is on the Puppet server at [`/etc/puppetlabs/keys/eyaml/public_key.pkcs7.pem`](https://git.psi.ch/linux-infra/bootstrap/-/blob/prod/instcode/puppet/puppet_server/files/crypto/public_key.pkcs7.pem).
Besides this key, you also need the `hiera-eyaml` tool installed on your system.
```bash
eyaml encrypt --pkcs7-public-key=eyaml_public_key.pem -s secret_string
```
A complete file can be encrypted with:
```bash
eyaml encrypt --pkcs7-public-key=eyaml_public_key.pem -f secret_file
```
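If you want to see the PKCS7 mechanism that eyaml uses without installing `hiera-eyaml`, the round trip can be sketched with plain `openssl smime` and a throwaway key pair (the file names here are illustrative, not part of our setup):

```bash
# Throwaway self-signed key pair just for this demonstration
openssl req -x509 -newkey rsa:2048 -nodes -subj "/CN=eyaml-demo" \
    -keyout demo_key.pem -out demo_cert.pem 2>/dev/null
# Encrypt a string against the certificate (roughly what 'eyaml encrypt' does)
echo -n "this is a secret value" | \
    openssl smime -encrypt -aes-256-cbc -outform PEM demo_cert.pem > secret.pem
# Decrypt with the private key (roughly what Hiera does on the Puppet server)
openssl smime -decrypt -inform PEM -in secret.pem -inkey demo_key.pem
# prints: this is a secret value
```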
#### Example
To encrypt a password for a system, you can proceed like this:
```bash
# openssl passwd -6 | eyaml encrypt --pkcs7-public-key=eyaml_public_key.pem --stdin
Password:
Verifying - Password:
string: ENC[PKCS7,MIIBxxxxxxxx...xxxxxxxx]
OR
block: >
ENC[PKCS7,MIIBxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
...
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx]
#
```
Then place either the string or the block at the appropriate spot in your Hiera YAML.
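The resulting Hiera entry then looks like this (the key name is hypothetical; the payload is abbreviated as in the output above):

```yaml
root_password_hash: ENC[PKCS7,MIIBxxxxxxxx...xxxxxxxx]
```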
## Hiera Variable Interpolation
Within Hiera, variable interpolation may be used to include other Hiera keys, facts, etc. in the values.
For details, check out the [Puppet documentation](https://www.puppet.com/docs/puppet/7/hiera_merging.html#interpolation_functions).
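For example, a value can pull in facts (the key name is hypothetical):

```yaml
motd::content: "Welcome to %{facts.networking.fqdn} (%{facts.os.name} %{facts.os.release.full})"
```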
Since such an interpolation starts with `%{`, some key or file content (especially in Apache configuration) might be interpreted as a variable interpolation, resulting in parts of the text disappearing.
Or it might simply fail the Puppet run with `Syntax error in string` if Puppet cannot parse what it considers an interpolation.
To escape a `%` you can write `%{literal('%')}` instead.
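For example, an Apache log format is full of literal `%` characters, each of which has to be escaped this way (the key name is hypothetical):

```yaml
apache::log_format: "%{literal('%')}h %{literal('%')}l %{literal('%')}u %{literal('%')}t"
```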
