diff --git a/_toc.yml b/_toc.yml
index 4ea974d3..aabe6a23 100644
--- a/_toc.yml
+++ b/_toc.yml
@@ -22,8 +22,9 @@ parts:
   - file: rhel8/vendor_documentation
   - file: rhel8/design_guiding_principles
-- caption: RHEL8 Guides
+- caption: RHEL8 Guides (Beta)
   chapters:
   - file: rhel8-guides-beta/developer_guide
   - file: rhel8-guides-beta/installation_guide
+  - file: rhel8-guides-beta/admin_guide
\ No newline at end of file
diff --git a/rhel8-guides-beta/admin_guide.md b/rhel8-guides-beta/admin_guide.md
new file mode 100644
index 00000000..48a2bafc
--- /dev/null
+++ b/rhel8-guides-beta/admin_guide.md
@@ -0,0 +1,995 @@
+# Admin Guide
+
+## Introduction
+> This guide can be copy-pasted for the next release and changed accordingly
+
+This document explains to PSI Linux administrators how to configure *PSI's Red Hat Enterprise Linux 8* through *Ansible Inventory* settings. Use cases with configuration examples illustrate what can be achieved.
+
+The settings presented here can be applied to a single host, to a group of hosts, or to groups of groups.
+
+An intermediate understanding of Ansible is a prerequisite.
+
+**Only important use cases are covered; other cases can be discussed with the respective Ansible Role owner**
+
+## Table of Contents
+
+* [Examples](#examples)
+* Ansible Inventories
+  * [RHEL-8 PSI Defaults](#rhel-8-psi-defaults)
+  * [AIT](#ait)
+  * [CPT](#cpt)
+  * [GFA](#gfa)
+  * [HPCE](#hpce)
+* Use Cases
+  * [System Information and Responsibility](#system-information-and-responsibility)
+  * [Network Configuration](#network-configuration)
+  * [Storage Configuration](#storage-configuration)
+  * [Icinga/NRPE/SNMP](#icinga-client-nrpe-and-snmp)
+  * [System Registration](#system-registration)
+  * [System Security](#system-security)
+  * [Systemd Services](#systemd-services)
+  * [System Time](#system-timentp)
+  * [User Management](#user-management)
+  * [Software Management](#software-management)
+  * [AFS](#afs)
+
+## Examples
+
+Simple and easy-to-understand examples are available under [PSI RHEL-8 RC1 Examples](https://git.psi.ch/linux/engineering/ansible/inventories/psi-rhel-8-rc1-examples/tree/master)
+
+## Ansible Inventories
+
+### RHEL-8 PSI Defaults
+
+This repository hosts the PSI-wide defaults inventory. It automatically assigns systems from Satellite to the AIT, CPT, GFA or HPCE groups. Link [here](https://git.psi.ch/linux/engineering/ansible/inventories/rhel-8-psi-defaults).
+
+### AIT
+
+[Ansible Inventory Git Repository](https://git.psi.ch/linux/engineering/ansible/inventories/ait) for AIT.
+
+### CPT
+
+[Ansible Inventory Git Repository](https://git.psi.ch/linux/engineering/ansible/inventories/cpt) for CPT.
+
+### GFA
+
+[Ansible Inventory Git Repository](https://git.psi.ch/linux/engineering/ansible/inventories/gfa) for GFA.
+
+### HPCE
+
+[Ansible Inventory Git Repository](https://git.psi.ch/linux/engineering/ansible/inventories/hpce) for HPCE.
+
+## Use Cases
+
+### System Information and Responsibility
+Owned by @kapeller
+
+The system's `/etc/motd` can be customized with settings such as
+
+```yaml
+psi_motd_ou: CPT
+psi_motd_contact: Gilles Martin / +41 56 310 36 90
+```
+
+or
+
+```yaml
+psi_motd_ou: AIT
+psi_motd_contact_list: true
+psi_motd_contact:
+  - Alvise Dorigo / +41 56 310 55 67
+  - Thomas Klar / +41 56 310 39 56
+  - Leonardo Sala / +41 56 310 33 69
+  - Sascha Spreitzer / +41 56 310 37 55
+```
+
+Additional information can be provided as well:
+
+```yaml
+psi_motd_ou: AIT
+psi_motd_contact_list: true
+psi_motd_contact:
+  - Alvise Dorigo / +41 56 310 55 67
+  - Thomas Klar / +41 56 310 39 56
+  - Leonardo Sala / +41 56 310 33 69
+  - Sascha Spreitzer / +41 56 310 37 55
+psi_motd_additional: |
+  Please be careful with this system.
+  It is very sensitive.
+```
+
+
+---
+### Network Configuration
+Owned by @caubet_m
+
+#### Configuring bonding re-using existing IP and interface
+
+First, one needs to remove the connection **"System eth0"** created during the installation, which is the active interface. Then, one can create the bonding with a master interface (e.g. `bond0`) and the slave interface with a new name (e.g. `eth0`; when using NetworkManager, this will generate a new `connection.id`). We ensure that the *state* is *up*, we *allow network restart* to apply changes on the fly, and we make the changes *persistent*.
+
+**Note:** It is important to make the connection persistent, set its state to *up* and enable `network_allow_restart` when applying **online** changes that affect a connected interface; otherwise the network service (or the machine) needs to be restarted.
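+
+The use cases below are expressed as Ansible variables. In a standard Ansible inventory layout these would typically be placed in `group_vars` or `host_vars` files of the inventories above; a minimal sketch (the file name and group name are illustrative):
+
+```yaml
+# inventory/group_vars/cpt.yml - applies to all hosts of a hypothetical 'cpt' group
+psi_motd_ou: CPT
+```
+
+Host-specific overrides would go into `host_vars/<hostname>.yml` in the same way.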
+
+```yaml
+- hosts: all,rhel-8-dev-7a95e9bb.psi.ch
+  vars:
+    network_allow_restart: yes
+    network_connections:
+      - name: "System eth0"
+        persistent_state: absent
+        state: down
+      - name: bond0
+        type: bond
+        interface_name: bond0
+        bond:
+          mode: 'active-backup'
+          miimon: 100
+        persistent_state: present
+        ip:
+          address: "{{ ansible_default_ipv4.address }}/24"
+          dns:
+            - 129.129.190.11
+            - 129.129.230.11
+          dns_search:
+            - psi.ch
+          gateway4: '{{ ansible_default_ipv4.gateway }}'
+        state: up
+      - name: eth0
+        type: ethernet
+        interface_name: eth0
+        persistent_state: present
+        mac: "{{ ansible_default_ipv4.macaddress }}"
+        master: bond0
+        slave_type: bond
+        state: up
+  roles:
+    - linux-system-roles.network
+```
+
+#### DHCP interfaces
+
+Adding a new interface `eth1` that obtains its IP address via *DHCP*:
+
+```yaml
+- hosts: all,rhel-8-dev-7a95e9bb.psi.ch
+  vars:
+    network_allow_restart: yes
+    network_connections:
+      - name: eth1
+        type: ethernet
+        interface_name: eth1
+        persistent_state: present
+        mac: "0A:0B:0C:0D:0E:0F"
+        ip:
+          dhcp4: yes
+        state: up
+  roles:
+    - linux-system-roles.network
+```
+
+#### Using ethtool for changing interface settings
+
+One can change network-specific settings on an interface with *ethtool*. For example, suppose we want to disable `scatter-gather`:
+
+```shell
+[root@rhel-8-dev-7a95e9bb ~]# ethtool -k eth0 | grep scatter-gather
+scatter-gather: on
+        tx-scatter-gather: on
+        tx-scatter-gather-fraglist: off [fixed]
+```
+
+We can change this setting through the *ethtool* options of the interface as follows:
+
+```yaml
+- hosts: all,rhel-8-dev-7a95e9bb.psi.ch
+  vars:
+    network_allow_restart: yes
+    network_connections:
+      - name: eth0
+        type: ethernet
+        interface_name: eth0
+        persistent_state: present
+        mac: "{{ ansible_default_ipv4.macaddress }}"
+        ip:
+          dhcp4: yes
+        state: up
+        ethtool:
+          features:
+            tx_scatter_gather: no
+  roles:
+    - linux-system-roles.network
+```
+
+As a result, `scatter-gather` is disabled:
+
+```shell
+[root@rhel-8-dev-7a95e9bb ~]# ethtool -k eth0 | grep scatter-gather
+scatter-gather: off
+        tx-scatter-gather: off
+        tx-scatter-gather-fraglist: off [fixed]
+```
+
+---
+### Icinga client (NRPE) and SNMP
+
+#### NRPE
+
+To enable the Nagios client together with NRPE, it is necessary to have EPEL on the system (either enabled or disabled). Also, one needs to enable `psi_icinga_client_configure_nrpe`. In case EPEL is not available on the system, one can enable the installation of the repository from the role itself (by enabling `psi_icinga_client_configure_epel`), which takes it from the official EPEL repositories.
+
+Important parameters are:
+* `psi_icinga_client_nrpe_allowed_hosts` (`String`), a comma-separated list of allowed hosts. Usually this will be updated centrally from the default variables inventory; however, when a new Nagios worker or server is set up, it might be useful to override this setting until it is changed centrally.
+* `psi_icinga_client_nrpe_dont_blame` (`Boolean`), which determines whether or not the NRPE daemon will allow clients to specify arguments to commands that are executed. Since this option is a security risk, it is disabled by default. However, there are many cases where it is needed, which is why it is provided (under the administrator's responsibility).
+* `psi_icinga_client_nrpe_allow_bash_command_substitution` (`Boolean`), which determines whether or not the NRPE daemon will allow clients to specify arguments that contain bash command substitutions of the form $(...). Since this is also a security risk, it is disabled by default.
+* Icinga checks, which are spread over three different variables. The reason is that Ansible is not capable of merging variables across levels, and this is the way to work around it. Each setting is a `Hash` where:
+  * the item name is the file name that will be placed in `include_dir` (usually `/etc/nrpe.d/`).
+  * For each item:
+    * one or more `commands` can be specified; they will be placed in the same file
+    * all commands specified in that file might need sudo or not. One can enable `sudo` for that file, which will place the proper sudoers rules in the default sudoers location (usually `/etc/sudoers.d/`).
+* The three variables are:
+  * `psi_icinga_client_nagios_include_dir_checks` (`Hash`)
+  * `psi_icinga_client_nagios_include_dir_checks_common` (`Hash`)
+  * `psi_icinga_client_nagios_include_dir_checks_extra` (`Hash`)
+
+An example for setting Icinga alarms is the following:
+
+```yaml
+# Allow different Icinga hosts (PSI workers)
+psi_icinga_client_nrpe_allowed_hosts: "emonma00.psi.ch,vemonma00.psi.ch,wmonag00.psi.ch,emonag00.psi.ch,eadmin00.psi.ch,wadmin00.psi.ch,monaggfa.psi.ch,monaggfa2.psi.ch,monagxbl.psi.ch,wmonagcpt.psi.ch,vwmonagcpt.psi.ch,monagmisc.psi.ch,wmonagnet.psi.ch,vwmonagnet.psi.ch,monagsfel.psi.ch"
+
+# Allow arguments: NRPE Don't Blame
+psi_icinga_client_nrpe_dont_blame: True
+
+# Allow arguments: Bash Command Substitution
+psi_icinga_client_nrpe_allow_bash_command_substitution: True
+
+# Define NRPE checks with and without "sudo"
+psi_icinga_client_nagios_include_dir_checks:
+  system_checks:
+    commands:
+      - command: "check_disk"
+        path: "{{ psi_icinga_client_nagios_plugins_dir }}/check_disk"
+        arguments: "$ARG1$"
+      - command: "check_load"
+        path: "{{ psi_icinga_client_nagios_plugins_dir }}/check_load"
+        arguments: "$ARG1$"
+
+psi_icinga_client_nagios_include_dir_checks_common: {}
+
+psi_icinga_client_nagios_include_dir_checks_extra:
+  gpfs_checks:
+    sudo: True
+    commands:
+      - command: "check_gpfs_health"
+        path: "{{ psi_icinga_client_nagios_plugins_dir }}/check_gpfs_health"
+        arguments: "--unhealth --ignore-tips"
+```
+
+#### SNMP
+
+To enable SNMP, one needs to enable `psi_icinga_client_configure_snmp`. Once enabled, the default settings should be fine for most use cases.
However, it is important to update at least:
+* `psi_icinga_client_snmpd_syscontact` (which defaults to *servicesdesk@psi.ch*)
+* `psi_icinga_client_snmpd_rocommunity`, which by default contains only the *PSI public network* (129.129.0.0/16) and *localhost*. Hence, one needs to specify extra networks if necessary.
+
+An example for configuring SNMP:
+
+```yaml
+# Configure SNMP
+psi_icinga_client_configure_snmp: True
+psi_icinga_client_snmpd_dontLogTCPWrappersConnects: true
+psi_icinga_client_snmpd_trapcommunity: psi
+psi_icinga_client_snmpd_syslocation: PSI
+psi_icinga_client_snmpd_syscontact: marc.caubet@psi.ch
+psi_icinga_client_snmpd_sysservices: 76
+psi_icinga_client_snmpd_rocommunity:
+  - community: psi
+    network: 172.21.0.0/16
+    oid: .1.3.6.1
+  - community: psi
+    network: 129.129.0.0/16
+    oid: .1.3.6.1
+  - community: psi
+    network: 192.168.1.0/24
+    oid: .1.3.6.1
+  - community: psi
+    network: localhost
+    oid: .1.3.6.1
+```
+---
+### Storage Configuration
+Owned by @dorigo_a
+
+#### Configuring a partition
+Define the following variable:
+```yaml
+psi_local_storage_physical_volumes:
+  - /dev/<device>
+```
+This tells Ansible which device (or partition) must be used for the creation/modification of a volume group.
+Multiple entries can be used; for example:
+```yaml
+psi_local_storage_physical_volumes:
+  - /dev/sdb1
+  - /dev/sdb2
+  ...
+  - /dev/sdb5
+```
+
+
+`<device>` can be either a block device (`sda`, `sdb`, …) or a partition previously (and manually) created on a block device using `fdisk`/`parted` (`sda1`, `sdc3`, ...).
+
+#### Configuring a volume group
+```yaml
+psi_local_storage_physical_name: <vg_name>
+```
+`<vg_name>` is the name of a new volume group, or the name of an existing volume group in which one wants to create/modify logical volumes.
+If the volume group already exists, the role will simply add the new physical volumes specified in the previous variable `psi_local_storage_physical_volumes` to it; no action is taken if the volume group is already built on top of the same physical volumes.
+
+#### Configuring a logical volume
+```yaml
+psi_local_storage_logical_volumes:
+  - name: <lv_name>
+    size: N              # size in units of GB
+    fstype: ext4         # or xfs
+    mount_point: <mount_point>
+    createfs: <true|false>
+```
+The above configuration behaves differently depending on the existence of `<lv_name>`. Please note that `psi_local_storage_logical_volumes` is a list of dictionaries, meaning that one can create/modify multiple logical volumes:
+##### `<lv_name>` does not exist
+A logical volume named `<lv_name>` is created inside the volume group specified above (`<vg_name>`). Its size will be N GB. If `createfs` is true, a filesystem of type `fstype` will be created on the device `/dev/<vg_name>/<lv_name>` and mounted persistently on `<mount_point>`.
+##### `<lv_name>` already exists and a filesystem is already present on it
+The logical volume `<lv_name>` is expanded (together with the filesystem) to `N` GB (only if `N` is greater than the current size). The filesystem is always expanded to fill up the entire logical volume.
+##### `<lv_name>` already exists and a filesystem is not present on it
+The logical volume `<lv_name>` is expanded to `N` GB (only if `N` is greater than the current size); if `createfs` is true (or `yes`), a filesystem of type `fstype` is created and mounted persistently on `<mount_point>`.
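+
+Putting the three variables together, a minimal sketch that creates a single logical volume on a new volume group might look as follows (the device name, volume group name, size and mount point are illustrative and must be adapted to your system):
+
+```yaml
+# Hypothetical example: build volume group 'vg_data' on /dev/sdb
+psi_local_storage_physical_volumes:
+  - /dev/sdb
+psi_local_storage_physical_name: 'vg_data'
+# Create one 10 GB logical volume with an XFS filesystem mounted on /data
+psi_local_storage_logical_volumes:
+  - name: 'data'
+    size: 10
+    fstype: 'xfs'
+    mount_point: '/data'
+    createfs: true
+```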
+
+#### Use Case 1: extending an existing partition after system first installation
+ASSUMPTION 1: you have just installed a new system with the following LVM-based partition scheme:
+```
+[vagrant@your_server ~]$ df -h
+Filesystem                    Size  Used Avail Use% Mounted on
+devtmpfs                      393M     0  393M   0% /dev
+tmpfs                         410M     0  410M   0% /dev/shm
+tmpfs                         410M  5.7M  404M   2% /run
+tmpfs                         410M     0  410M   0% /sys/fs/cgroup
+/dev/mapper/cl-root            50G  2.6G   48G   6% /
+/dev/sda1                     976M  183M  726M  21% /boot
+/dev/mapper/cl-home            27G  225M   27G   1% /home
+tmpfs                          82M     0   82M   0% /run/user/1000
+/dev/mapper/myvg_root-first  1014M   40M  975M   4% /mnt/first
+/dev/mapper/myvg_root-second  976M  2.6M  907M   1% /mnt/second
+/dev/mapper/myvg_root-third  1014M   40M  975M   4% /mnt/third
+```
+ASSUMPTION 2: the partition scheme was not created by you (or by the automatic Tower system) using a specific playbook. It is just there and you do not like the current size of `/mnt/first` and/or `/mnt/second` and/or `/mnt/third`.
+
+ASSUMPTION 3: the volume group/physical devices supporting the logical volumes `myvg_root-[first,second,third]` do have some extra unallocated space you can use to extend the partitions mounted on `/mnt/[first,second,third]`.
+
+Write the following playbook (named `extend.yml`), assuming that you want `/mnt/second`, `/mnt/third` and `/mnt/first` resized to 4, 6 and 2 GB respectively:
+```
+---
+- name: Extend storage
+  hosts: your_server.psi.ch
+  roles:
+    - name: psi.local_storage
+      psi_local_storage_resizefs:
+        - path: '/mnt/second'
+          size: 4
+        - path: '/mnt/third'
+          size: 6
+        - path: '/mnt/first'
+          size: 2
+...
+``` +Execute it with usual `ansible-playbook` command: +``` +[vagrant@control ~]$ ansible-playbook extend.yml + +PLAY [Extend storage] ****************************************************************************************************************************** + +TASK [Gathering Facts] ***************************************************************************************************************************** +[DEPRECATION WARNING]: Distribution centos 8.2.2004 on host your_server should use /usr/libexec/platform-python, but is using /usr/bin/python for +backward compatibility with prior Ansible releases. A future Ansible release will default to using the discovered platform python for this host. +See https://docs.ansible.com/ansible/2.9/reference_appendices/interpreter_discovery.html for more information. This feature will be removed in +version 2.12. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg. +ok: [your_server] + +TASK [psi.local_storage : Check that user specified a fstype which is supported] ******************************************************************* + +TASK [psi.local_storage : Ensure lmv2 package is installed] **************************************************************************************** +skipping: [your_server] + +TASK [psi.local_storage : Create VG '' on physical volume '[]'] ************************************************************************************ +skipping: [your_server] + +TASK [psi.local_storage : Create logical volume(s) on ''] ****************************************************************************************** + +TASK [psi.local_storage : Create not mounted filesystem(s)] **************************************************************************************** + +TASK [psi.local_storage : Mount filesystem(s)] ***************************************************************************************************** + +TASK [psi.local_storage : Resize Filesystem] 
*******************************************************
+changed: [your_server] => (item={'path': '/mnt/second', 'size': 4})
+changed: [your_server] => (item={'path': '/mnt/third', 'size': 6})
+changed: [your_server] => (item={'path': '/mnt/first', 'size': 2})
+
+PLAY RECAP *****************************************************************************************************************************************
+your_server                : ok=2    changed=1    unreachable=0    failed=0    skipped=6    rescued=0    ignored=0
+```
+#### Use Case 2: extending an existing partition after system first installation using additional device
+
+ASSUMPTION 1: you have just installed a new system with the following LVM-based partition scheme:
+```
+[vagrant@your_server ~]$ df -h
+Filesystem                    Size  Used Avail Use% Mounted on
+devtmpfs                      393M     0  393M   0% /dev
+tmpfs                         410M     0  410M   0% /dev/shm
+tmpfs                         410M  5.7M  404M   2% /run
+tmpfs                         410M     0  410M   0% /sys/fs/cgroup
+/dev/mapper/cl-root            50G  2.6G   48G   6% /
+/dev/sda1                     976M  183M  726M  21% /boot
+/dev/mapper/cl-home            27G  225M   27G   1% /home
+tmpfs                          82M     0   82M   0% /run/user/1000
+/dev/mapper/myvg_root-first  1014M   40M  975M   4% /mnt/first
+/dev/mapper/myvg_root-second  976M  2.6M  907M   1% /mnt/second
+/dev/mapper/myvg_root-third  1014M   40M  975M   4% /mnt/third
+```
+ASSUMPTION 2: the partition scheme was not created by you (or by the automatic Tower system) using a specific playbook. It is just there and you do not like the current size of `/mnt/first`.
+
+ASSUMPTION 3: you have a new HDD/SSD attached to your node, identified by the device `/dev/sdc`, and you want to use it to expand the volume group and its logical volumes, so that you are free to enlarge your partitions.
+
+This procedure requires a bit more know-how of Linux and logical volume management, but the following explanation will try to guide you as much as possible.
+
+##### Step1 - Get Volume Group and physical devices
+Identify the volume group associated with the partition `/mnt/first` you want to expand, by executing this:
+```
+[root@your_server ~]# theVG=$(lvdisplay $(df -h /mnt/first|grep /mnt/first|awk '{print $1}')|grep "VG Name"|awk '{print $NF}')
+```
+
+Identify the physical volumes on which `$theVG` is built:
+```
+[root@your_server ~]# vgdisplay -v $theVG |grep "PV Name"|awk '{print $NF}'
+/dev/sdb1
+/dev/sdb2
+/dev/sdb3
+```
+Take note of these three physical devices and remember that you will have to add your new device `/dev/sdc` to this list.
+
+##### Step2 - Get Logical volume name and filesystem type
+
+Execute the following command to get the logical volume name:
+```
+[root@your_server ~]# theLV=$(lvdisplay $(df -h /mnt/first|grep /mnt/first|awk '{print $1}')|grep "LV Name"|awk '{print $NF}')
+[root@your_server ~]# echo $theLV
+first
+```
+Identify the filesystem type by executing this command:
+```
+[root@your_server ~]# theFS=$(mount|grep "/mnt/first"|awk '{print $5}')
+[root@your_server ~]# echo $theFS
+xfs
+```
+##### Step3 - Prepare the ansible playbook
+
+Write the following playbook (named `extend.yml`), taking care to use the correct volume group name (`theVG`), logical volume name (`theLV`) and filesystem type (`theFS`); of course, also keep the mount point unchanged:
+```
+---
+- name: Extend storage
+  hosts: your_server
+  roles:
+    - name: psi.local_storage
+      psi_local_storage_physical_volumes:
+        - /dev/sdb1
+        - /dev/sdb2
+        - /dev/sdb3
+        - /dev/sdc  # no partitioning of sdc is needed; LVM is able to cope with raw devices as well
+      psi_local_storage_physical_name: 'myvg_root'
+      psi_local_storage_logical_volumes:
+        - name: 'first'
+          size: 3
+          fstype: 'xfs'
+          mount_point: '/mnt/first'
+          createfs: true
+...
+```
+Note that we have put in the playbook the three devices that we found above (output of `vgdisplay`: `/dev/sdb[1,2,3]`) plus the new one `/dev/sdc`.
+
+Also note that the original size of the `/mnt/first` filesystem was 1 GB; in the playbook we now put 3 (the implicit unit is GB).
+
+
+Execute your playbook (assuming you called it `extend.yml`):
+```
+[vagrant@control ~]$ ansible-playbook extend.yml
+
+PLAY [Extend storage] **********************************************************************************************************************************************************************************************
+
+TASK [Gathering Facts] *********************************************************************************************************************************************************************************************
+
+TASK [psi.local_storage : Check that user specified a fstype which is supported] ***********************************************************************************************************************************
+skipping: [your_server] => (item={'name': 'first', 'size': 3, 'fstype': 'xfs', 'mount_point': '/mnt/first', 'createfs': True})
+
+TASK [psi.local_storage : Ensure lmv2 package is installed] ********************************************************************************************************************************************************
+ok: [your_server]
+
+TASK [psi.local_storage : Create VG 'myvg_root' on physical volume '['/dev/sdb1', '/dev/sdb2', '/dev/sdb3']'] ******************************************************************************************************
+ok: [your_server]
+
+TASK [psi.local_storage : Create logical volume(s) on 'myvg_root'] *************************************************************************************************************************************************
+changed: [your_server] => (item={'name': 'first', 'size': 3, 'fstype': 'xfs', 'mount_point': '/mnt/first', 'createfs': True})
+
+TASK [psi.local_storage : Create not mounted filesystem(s)]
********************************************************************************************************************
+ok: [your_server] => (item={'name': 'first', 'size': 3, 'fstype': 'xfs', 'mount_point': '/mnt/first', 'createfs': True})
+
+TASK [psi.local_storage : Mount filesystem(s)] *********************************************************************************************************************************************************************
+ok: [your_server] => (item={'name': 'first', 'size': 3, 'fstype': 'xfs', 'mount_point': '/mnt/first', 'createfs': True})
+
+TASK [psi.local_storage : Resize XFS filesystem(s)] ****************************************************************************************************************************************************************
+changed: [your_server] => (item={'name': 'first', 'size': 3, 'fstype': 'xfs', 'mount_point': '/mnt/first', 'createfs': True})
+
+TASK [psi.local_storage : Resize EXT filesystem(s)] ****************************************************************************************************************************************************************
+skipping: [your_server] => (item={'name': 'first', 'size': 3, 'fstype': 'xfs', 'mount_point': '/mnt/first', 'createfs': True})
+
+PLAY RECAP *********************************************************************************************************************************************************************************************************
+your_server                : ok=7    changed=2    unreachable=0    failed=0    skipped=2    rescued=0    ignored=0
+```
+
+And now check the size of `/mnt/first` on the your_server node:
+```
+[root@your_server ~]# df -h
+Filesystem                    Size  Used Avail Use% Mounted on
+devtmpfs                      393M     0  393M   0% /dev
+tmpfs                         410M     0  410M   0% /dev/shm
+tmpfs                         410M  5.7M  404M   2% /run
+tmpfs                         410M     0  410M   0% /sys/fs/cgroup
+/dev/mapper/cl-root            50G  2.6G   48G   6% /
+/dev/sda1                     976M  183M  726M  21% /boot
+/dev/mapper/cl-home            27G  225M   27G   1% /home
+tmpfs                          82M     0   82M   0% /run/user/1000
+/dev/mapper/myvg_root-first   3.0G   55M  3.0G   2% /mnt/first
+/dev/mapper/myvg_root-second  976M  2.6M  907M   1% /mnt/second
+/dev/mapper/myvg_root-third  1014M   40M  975M   4% /mnt/third
+
+```
+
+#### Mount mountpoints
+Define the following dictionary:
+```yaml
+psi_mounts_mounts:
+  - fstype: <fstype>
+    mount_point: <mount_point>
+    device: /dev/<device>
+    options: <options>
+    state: mounted|unmounted|absent|present|remounted
+  - fstype: <fstype>
+    mount_point: <mount_point>
+    device: /dev/<device>
+    options: <options>
+    state: mounted|unmounted|absent|present|remounted
+  - [...]
+```
+and run the corresponding playbook.
+Note that the block devices (`/dev/<device>`) must already exist and the filesystem must already be created.
+ * `mounted/unmounted` have the obvious meaning
+ * `absent/present` concern the presence (or not) of the mount directive in `/etc/fstab`, so that the mount is done automatically at every boot of the system
+ * `remounted` means that you want to remount the partition because, for example, you changed some parameter or option.
+
+---
+### System Registration
+Owned by @spreitzer_s
+
+Your system should automatically receive the default values that are supplied globally, from the PSI Default Inventory in GitLab and AWX, to successfully register it with Satellite, so that it has access to software repositories.
+
+However, the following settings can be overridden:
+
+```yaml
+psi_subscription_manager_activation_key: RHEL8-GFA
+psi_subscription_manager_org: PSI
+psi_subscription_manager_server: satint.psi.ch
+psi_subscription_manager_force_register: 'False'
+```
+
+#### psi_subscription_manager_activation_key
+The Satellite activation key to use. Usually something that starts with `RHEL8-`.
+
+#### psi_subscription_manager_org
+The Satellite Organization to use. Usually `PSI`.
+
+#### psi_subscription_manager_server
+The Satellite to register with. Usually `satint.psi.ch`.
+
+#### psi_subscription_manager_force_register
+Whether or not to force (re-)registration with subscription-manager. Usually `true` to ensure a system has software access.
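+
+To check the outcome on a host, the standard `subscription-manager` CLI can be queried directly (a quick sanity check, to be run as root):
+
+```
+[root@your_server ~]# subscription-manager identity
+[root@your_server ~]# subscription-manager status
+```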
+
+---
+### System Security
+Owned by @caubet_m
+
+This documentation shows how to manage SELinux with the Ansible *selinux* role. More examples can be found on the official **['linux-system-roles' GitHub project page](https://github.com/linux-system-roles/selinux)**
+
+#### Enabling/Disabling SELinux
+
+The defaults at PSI are:
+
+```yaml
+selinux_state: enforcing
+selinux_policy: targeted
+```
+* Allowed values for **`selinux_state`** are `disabled`, `enforcing` and `permissive`.
+* Allowed values for **`selinux_policy`** are `targeted` and `mls`.
+
+#### (Un)Setting SELinux booleans
+
+Common examples for setting SELinux booleans are the following:
+* Enabling the `use_nfs_home_dirs` Boolean to allow the usage of NFS-based home directories, and making it persistent across machine reboots.
+* Enabling the `httpd_use_nfs` Boolean to allow *httpd* to access and share NFS volumes (labeled with the `nfs_t` type).
+
+```yaml
+selinux_booleans:
+  - name: use_nfs_home_dirs
+    state: on
+    persistent: 'yes'
+  - name: httpd_use_nfs
+    state: on
+```
+
+#### Set SELinux file contexts
+
+In this example, we label the `/tmp/test_dir` directory tree with the `user_home_dir_t` context.
+
+```yaml
+selinux_fcontexts:
+  - target: '/tmp/test_dir(/.*)?'
+    setype: user_home_dir_t
+    ftype: d
+    state: present
+```
+
+#### Set SELinux Ports
+
+In the example below, we allow SSH to use TCP port 22100, so that we can tell *sshd* to listen on the non-standard port 22100 instead of the standard port 22. For that, we would also need to update `/etc/ssh/sshd_config` by changing `Port 22` to `Port 22100`.
+
+```yaml
+selinux_ports:
+  - ports: '22100'
+    proto: tcp
+    setype: ssh_port_t
+```
+
+#### Set linux user to SELinux mapping
+
+When `selinux_policy: mls`, one would need to update the mapping of Linux users to SELinux users.
+
+In the example, we remove `feichtinger` from `staff_u`, and we map a new user `caubet_m`, as well as a generic username `staff`, to the SELinux user `staff_u` (`caubet_m` has more security privileges than a generic `staff` user; this is defined with `serange`). On the other hand, we specify that any otherwise unmapped user (`__default__`) should be mapped to the SELinux user `user_u`. Users mapped in that way have a very low security level (`s0`, which is the lowest).
+
+```yaml
+selinux_logins:
+  - login: feichtinger
+    seuser: staff_u
+    state: absent
+  - login: caubet_m
+    seuser: staff_u
+    serange: 's0-s15:c0.c1023'
+  - login: staff
+    seuser: staff_u
+    serange: 's2:c100'
+  - login: __default__
+    seuser: user_u
+    serange: 's0-s0'
+```
+
+The resulting mapping:
+
+```bash
+[root@hpc-rhel8devel01 home]# semanage login -l
+
+__default__          user_u               s0-s0                *
+caubet_m             staff_u              s0-s15:c0.c1023      *
+root                 root                 s0-s15:c0.c1023      *
+staff                staff_u              s2:c100              *
+sysadm               staff_u              s0-s15:c0.c1023      *
+system_u             system_u             s0-s15:c0.c1023      *
+```
+
+#### Restorecon
+
+Run `restorecon` on filesystem trees to apply the SELinux policies:
+
+```yaml
+selinux_restore_dirs:
+  - /var
+  - /tmp
+```
+
+---
+### Systemd Services
+Owned by @caubet_m
+
+By default this role creates **systemd** *service* units; however, it is also possible to configure other unit types such as **slice**, **socket**, **timer**, **mount**, etc.
+
+Full examples for the **systemd** Ansible role can be found on the official **['0x0I' GitHub project page](https://github.com/0x0I/ansible-role-systemd#role-variables)**.
The example below shows how to create different *systemd* units: *service*, *socket*, *mount*, *target* and *timer*:
+
+```yaml
+unit_config:
+  - name: "test-service"
+    Unit:
+      Description: "This is a test service unit which listens at port 1234"
+      After: network-online.target
+      Wants: network-online.target
+      Requires: test-service.socket
+    Service:
+      User: 'kitchen'
+      Group: 'kitchen'
+      ExecStart: '/usr/bin/sleep infinity'
+      ExecReload: '/bin/kill -s HUP $MAINPID'
+    Install:
+      WantedBy: 'multi-user.target'
+  - name: "test-service"
+    type: "socket"
+    Unit:
+      Description: "This is a test socket unit which specifies the test-service 'socket' unit type"
+    Socket:
+      ListenStream: '0.0.0.0:1234'
+      Accept: 'true'
+    Install:
+      WantedBy: 'sockets.target'
+  - name: "tmp-stdin"
+    type: "mount"
+    path: "/run/systemd/system"
+    Unit:
+      Description: "This is a test mount unit which overrides the default unit path"
+    Mount:
+      What: '/dev/stdin'
+      Where: '/tmp/stdin'
+    Install:
+      WantedBy: 'mount.target'
+  - name: "test-target"
+    type: "target"
+    path: "/etc/systemd/system"
+    Unit:
+      Description: This is an example unit Target
+      Wants: test-service.service test-service.socket tmp-stdin.mount
+      PartOf: test-service.service
+  - name: dnf-makecache
+    type: timer
+    Unit:
+      Description: "This is a test timer unit which refreshes dnf cache"
+    Timer:
+      OnBootSec: 10min
+      OnUnitInactiveSec: 1h
+      Unit: dnf-makecache.service
+    Install:
+      WantedBy: multi-user.target
+```
+
+---
+### System Time/NTP
+Owned by @caubet_m
+
+This section describes how to configure system time on RHEL8-based systems.
The current *defaults* should fit most use cases at PSI:
* The recommended service for configuring the *system time* on RHEL8 systems is **`chrony`**
* PSI provides several NTP servers which should be reachable from most PSI subnets:
  * `pstime1.psi.ch`
  * `pstime2.psi.ch`
  * `pstime3.psi.ch`
* We usually apply custom Chrony settings for logging and for rapid clock correction during boot:
  * We set `initstepslew` to `60` seconds: if the system clock error is found to be 60 seconds or less, a slew is used to correct it; if the error is larger, a step is used.
  * We log several metrics that help to debug time synchronization problems: `measurements statistics tracking`

Example of the default configuration at PSI, to be adapted to your needs:

```yaml
# linux-system-roles.timesync settings
timesync_chrony_custom_settings:
  - "# Allow chronyd to make a rapid measurement of the system and correct clock error at boot time"
  - "initstepslew 60 pstime1.psi.ch pstime2.psi.ch pstime3.psi.ch"
  - "# Select which information is logged."
  - "log measurements statistics tracking"
timesync_ntp_provider: chrony
timesync_ntp_servers:
  - hostname: pstime1.psi.ch
  - hostname: pstime2.psi.ch
  - hostname: pstime3.psi.ch
```

---
### User Management
Owned by @spreitzer_s

User management is divided into two parts:
* PSI Active Directory
* Local system

**In general, users, groups and group memberships must be managed in Active Directory!** Please consult the PSI Service Catalog to request users, groups and group memberships, as well as their removal: http://css.psi.ch/psisp

*Use `*_common` for inventory group variables and `*_extra` for host variables.*

#### psi_aaa_allow_groups{_common,_extra}

List of groups that are allowed to log in to a system.

```yaml
psi_aaa_allow_groups_extra:
  - unx-ait
  - unx-sls
```

#### psi_aaa_allow_user{_common,_extra}

List of users that are allowed to log in to a system. *Prefer using groups over users!*

```yaml
psi_aaa_allow_user_extra:
  - kapeller
  - klar_t
  - spreitzer_s
```

### Local User Management (Do not use, prefer Active Directory)

#### psi_aaa_local_sudo_rules{_common,_extra}

Manage local sudo rules via lists of entries (name, content and state). Be very cautious with sudo rules, as one faulty rule will break sudo for the whole system.

```yaml
psi_aaa_local_sudo_rules_extra:
  - name: sspreitz-root-nopasswd
    content: "sspreitz ALL=(ALL) NOPASSWD: ALL\n"
  - name: group-wheel-root-nopasswd
    content: "%wheel ALL=(ALL) NOPASSWD: ALL\n"
  - name: linuxsupport-root-nopasswd
    content: |
      jill ALL=(ALL) NOPASSWD: ALL
      joe ALL=(ALL) NOPASSWD: ALL
      jack ALL=(ALL) NOPASSWD: ALL
      tom ALL=(ALL) NOPASSWD: ALL
  - name: sam-root-nopasswd
    state: absent
```

#### psi_aaa_local_groups{_common,_extra}

Manage local groups by a list of Ansible group definitions: https://docs.ansible.com/ansible/latest/collections/ansible/builtin/group_module.html

```yaml
psi_aaa_local_groups_extra:
  - name: group1
    gid: 30000
  - name: group2
  - name: support
    system: yes
  - name: group3
    state: absent
```

#### psi_aaa_local_users{_common,_extra}

Manage local users by a list of Ansible user definitions:
https://docs.ansible.com/ansible/latest/collections/ansible/builtin/user_module.html

```yaml
psi_aaa_local_users_extra:
  - name: guest
  - name: joe
    uid: 1000
    group: group1
    groups:
      - wheel
      - staff
      - audio
    home: /home/joe
    shell: /bin/fish
    # mkpasswd -m sha512crypt joe
    password: '$6$Mrq9msM24W$boAK1IYwuG6ze1qgk.HpqMqvj/zRThT2fTrb80kJTAiMg1CNXjbEEMH7A8KwAeKQJZuF14KRrpOK5NXxYvqqn1'
  - name: jill
    state: absent
    remove: yes
```

#### psi_aaa_local_authorized_keys{_common,_extra}

Manage local SSH authorized keys by a list of Ansible authorized key definitions: https://docs.ansible.com/ansible/latest/collections/ansible/posix/authorized_key_module.html

```yaml
psi_aaa_local_authorized_keys_extra:
  - user: sspreitz
    key: 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC9gU640HBk4m0OA4b2ziTCnVP6QYhs2Zs/LJWTN85+vCudgZfiMip2MAAR0OlOVtB4JYXJh83Rihj0REA13ei3akAPzgG+B4Qlk3QYA2Bf2YDjRGqwgpmhVlTNgJy+l9lS9rn5kPheXTi1GOgGVKi4jd5f6TuYhMBmSl64oCtWnanIwXd/u6teStTd7V0HKgev+GbAvTJPFoxOHFSV51mMvFkkW0s0cPTwLvekAPsnjw4ztEoX8Ar72U+KOnt6YLOEuKB0bKZ4PKTEz7woltDcXKzN9g5HKSY+RgSk9APrOol+HVgs841/1KChri7xPao4J1OzU0Ap6wkG+GfqPVc/ sspreitz@redhat.com'
  - user: evil
    key: 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC9gU640HBk4m0OA4b2ziTCnVP6QYhs2Zs/LJWTN85+vCudgZfiMip2MAAR0OlOVtB4JYXJh83Rihj0REA13ei3akAPzgG+B4Qlk3QYA2Bf2YDjRGqwgpmhVlTNgJy+l9lS9rn5kPheXTi1GOgGVKi4jd5f6TuYhMBmSl64oCtWnanIwXd/u6teStTd7V0HKgev+GbAvTJPFoxOHFSV51mMvFkkW0s0cPTwLvekAPsnjw4ztEoX8Ar72U+KOnt6YLOEuKB0bKZ4PKTEz7woltDcXKzN9g5HKSY+RgSk9APrOol+HVgs841/1KChri7xPao4J1OzU0Ap6wkG+GfqPVc/ mrevil@example.com'
    state: absent
```

---
### Software Management
Owned by @klar_t

#### psi_packer_repo

A merged dictionary of yum repository definitions.

#### psi_packer_inst

A merged list of RPM packages to be installed.

#### psi_packer_rem

A merged list of RPM packages to be removed.

#### psi_packer_update

`true` or `false`, depending on whether to update all packages on each Ansible run.

#### Important

The `psi_packer_repo` and `psi_packer_inst` variables are merged. It is a wildcard merge, so any suffix can be used, but it is recommended to use the group or host name so that there is no accidental overlap. Only repositories that appear in the enabled or disabled lists are added to the repo file; otherwise a repo may be defined but will be ignored. These two lists are also wildcard-merged.

```yaml
- hosts: servers
  vars:
    psi_packer_update: true
    psi_packer_repo_group:
      myrepo:
        description: This is my repo
        baseurl: http://example.com/repos/myrepo/
        gpgcheck: yes
        gpgkey: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-myrepo
    psi_packer_repo_host:
      myotherrepo:
        description: This is my other repo
        baseurl: http://example.com/repos/myotherrepo/
        gpgcheck: yes
        gpgkey: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-myrepo
    psi_packer_enabled_repos_group:
      - myrepo
    psi_packer_disabled_repos_host:
      - myotherrepo
    psi_packer_inst_group:
      - httpd
      - mariadb
    psi_packer_inst_host:
      - mc
      - nano
    psi_packer_rem_group:
      - matlab
      - office
    psi_packer_rem_host:
      - kernel-devel
      - afs
  roles:
    - psi.packer
```
---
### AFS
Owned by @klar_t

Note: AFS and AFS homes are not enabled by default in RHEL-8.

PSI Linux Engineering does not support AFS and AFS homes. Contact Achim Gsell if you need them.

LVM partitioning and free space on the root VG are necessary to use this role. (The VG is selected based on where the root file system is located; the actual name does not matter.)

#### psi_yfs_size

Default: `2147483648`

Cache LV size, strictly in bytes.

#### psi_yfs_remove

Default: `false`

Set this to `true` to remove everything the role would have installed.

#### Example Playbook

An example of how to use this role (with variables passed in as parameters):

```yaml
- hosts: servers
  roles:
    - role: psi.yfs
      psi_yfs_size: 2147483648
```

---

diff --git a/rhel8-guides-beta/developer_guide.md b/rhel8-guides-beta/developer_guide.md
index c6bbd4e5..4d78719f 100644
--- a/rhel8-guides-beta/developer_guide.md
+++ b/rhel8-guides-beta/developer_guide.md
@@ -1,4 +1,4 @@
-# Developer's Guide of RHEL-8
+# Developer Guide
 
 **This guide is under heavy development and just drafted, expect frequent changes and check back every now and then**