add beta admin guide

This commit is contained in:
2021-08-04 16:53:22 +02:00
parent ad9722bf5d
commit f957f2f419
3 changed files with 998 additions and 2 deletions

@@ -22,8 +22,9 @@ parts:
- file: rhel8/vendor_documentation
- file: rhel8/design_guiding_principles
- caption: RHEL8 Guides
- caption: RHEL8 Guides (Beta)
chapters:
- file: rhel8-guides-beta/developer_guide
- file: rhel8-guides-beta/installation_guide
- file: rhel8-guides-beta/admin_guide

@@ -0,0 +1,995 @@
# Admin Guide
## Introduction
> This guide can be copy-pasted for the next release and changed accordingly
This document describes, for PSI Linux administrators, how to configure *PSI's Red Hat Enterprise Linux 8* with *Ansible Inventory* settings. Use cases with configuration examples explain what can be achieved.
The settings presented here can be applied to a single host, to a group of hosts, or to a group of groups.
Intermediate understanding of Ansible is a prerequisite.
**Only important use cases are covered; for others, ask the respective Ansible role owner.**
## Table of Contents
* [Examples](#examples)
* Ansible Inventories
  * [RHEL-8 PSI Defaults](#rhel-8-psi-defaults)
  * [AIT](#ait)
  * [CPT](#cpt)
  * [GFA](#gfa)
  * [HPCE](#hpce)
* Use Cases
  * [System Information and Responsibility](#system-information-and-responsibility)
  * [Network Configuration](#network-configuration)
  * [Storage Configuration](#storage-configuration)
  * [Icinga/NRPE/SNMP](#icinga-client-nrpe-and-snmp)
  * [System Registration](#system-registration)
  * [System Security](#system-security)
  * [Systemd Services](#systemd-services)
  * [System Time](#system-timentp)
  * [User Management](#user-management)
  * [Software Management](#software-management)
  * [AFS](#afs)
## Examples
Easy, simple and understandable examples are available under [PSI RHEL-8 RC1 Examples](https://git.psi.ch/linux/engineering/ansible/inventories/psi-rhel-8-rc1-examples/tree/master).
## Ansible Inventories
### RHEL-8 PSI Defaults
This repository hosts the PSI-wide defaults inventory. It automatically groups systems from Satellite into AIT, CPT, GFA or HPCE. Link [here](https://git.psi.ch/linux/engineering/ansible/inventories/rhel-8-psi-defaults).
### AIT
[Ansible Inventory Git Repository](https://git.psi.ch/linux/engineering/ansible/inventories/ait) for AIT.
### CPT
[Ansible Inventory Git Repository](https://git.psi.ch/linux/engineering/ansible/inventories/cpt) for CPT.
### GFA
[Ansible Inventory Git Repository](https://git.psi.ch/linux/engineering/ansible/inventories/gfa) for GFA.
### HPCE
[Ansible Inventory Git Repository](https://git.psi.ch/linux/engineering/ansible/inventories/hpce) for HPCE.
## Use cases
### System Information and Responsibility
Owned by @kapeller
The system `/etc/motd` can be customized with settings such as
```yaml
psi_motd_ou: CPT
psi_motd_contact: Gilles Martin <gilles.martin@psi.ch> / +41 56 310 36 90
```
or
```yaml
psi_motd_ou: AIT
psi_motd_contact_list: true
psi_motd_contact:
- Alvise Dorigo <alvise.dorigo@psi.ch> / +41 56 310 55 67
- Thomas Klar <thomas.klar@psi.ch> / +41 56 310 39 56
- Leonardo Sala <leonardo.sala@psi.ch> / +41 56 310 33 69
- Sascha Spreitzer <sascha.spreitzer@psi.ch> / +41 56 310 37 55
```
Additional information can be provided as well:
```yaml
psi_motd_ou: AIT
psi_motd_contact_list: true
psi_motd_contact:
- Alvise Dorigo <alvise.dorigo@psi.ch> / +41 56 310 55 67
- Thomas Klar <thomas.klar@psi.ch> / +41 56 310 39 56
- Leonardo Sala <leonardo.sala@psi.ch> / +41 56 310 33 69
- Sascha Spreitzer <sascha.spreitzer@psi.ch> / +41 56 310 37 55
psi_motd_additional: |
  Please be careful with this system.
  It is very sensitive.
```
---
### Network Configuration
Owned by @caubet_m
#### Configuring bonding re-using existing IP and interface
First, remove the **"System eth0"** connection created during the installation, which is the active interface. Then create the bond with a master interface (e.g. `bond0`) and add the slave interface under a new connection name (e.g. `eth0`; when using NetworkManager, a new `connection.id` is generated). The example ensures that the *state is up*, *allows a network restart* so that changes are applied on the fly, and makes the changes *persistent*.
**Note:** To apply changes **online** to an interface that is currently connected, the connection must be persistent, its state must be *up*, and `network_allow_restart` must be enabled. Otherwise the network service must be restarted (or the machine rebooted).
```yaml
- hosts: all,rhel-8-dev-7a95e9bb.psi.ch
  vars:
    network_allow_restart: yes
    network_connections:
      - name: "System eth0"
        persistent_state: absent
        state: down
      - name: bond0
        type: bond
        interface_name: bond0
        bond:
          mode: 'active-backup'
          miimon: 100
        persistent_state: present
        ip:
          address: "{{ ansible_default_ipv4.address }}/24"
          dns:
            - 129.129.190.11
            - 129.129.230.11
          dns_search:
            - psi.ch
          gateway4: '{{ ansible_default_ipv4.gateway }}'
        state: up
      - name: eth0
        type: ethernet
        interface_name: eth0
        persistent_state: present
        mac: "{{ ansible_default_ipv4.macaddress }}"
        master: bond0
        slave_type: bond
        state: up
  roles:
    - linux-system-roles.network
```
#### DHCP interfaces
Adding a new interface `eth1` with *dhcp* protocol for getting the IP address:
```yaml
- hosts: all,rhel-8-dev-7a95e9bb.psi.ch
  vars:
    network_allow_restart: yes
    network_connections:
      - name: eth1
        type: ethernet
        interface_name: eth1
        persistent_state: present
        mac: "0A:0B:0C:0D:0E:0F"
        ip:
          dhcp4: yes
        state: up
```
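For comparison, the same interface could be given a static address instead of DHCP. The following is only a sketch using the `ip` keys that already appear in the bonding example; the address `192.0.2.10/24` comes from a documentation range and is purely illustrative, not a PSI value.

```yaml
# Sketch only: static IPv4 address on eth1 (illustrative address, not a PSI value)
network_allow_restart: yes
network_connections:
  - name: eth1
    type: ethernet
    interface_name: eth1
    persistent_state: present
    ip:
      dhcp4: no                # do not request an address via DHCP
      address:
        - "192.0.2.10/24"      # placeholder; replace with the real address
    state: up
```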
#### Using ethtool for changing interface settings
Interface-specific network settings can be changed with ethtool. For example, suppose we want to disable `scatter-gather`, which is currently enabled:
```shell
[root@rhel-8-dev-7a95e9bb ~]# ethtool -k eth0 | grep scatter-gather
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
```
The *ethtool* section of the connection changes this setting as follows:
```yaml
- hosts: all,rhel-8-dev-7a95e9bb.psi.ch
  vars:
    network_allow_restart: yes
    network_connections:
      - name: eth0
        type: ethernet
        interface_name: eth0
        persistent_state: present
        mac: "{{ ansible_default_ipv4.macaddress }}"
        ip:
          dhcp4: yes
        state: up
        ethtool:
          features:
            tx_scatter_gather: no
```
As a result, `scatter-gather` is disabled:
```shell
[root@rhel-8-dev-7a95e9bb ~]# ethtool -k eth0 | grep scatter-gather
scatter-gather: off
tx-scatter-gather: off
tx-scatter-gather-fraglist: off [fixed]
```
---
### Icinga client (NRPE) and SNMP
#### NRPE
To enable the Nagios client together with NRPE, EPEL must be available on the system (either enabled or disabled). In addition, `psi_icinga_client_configure_nrpe` must be enabled. If EPEL is not available on the system, the role itself can install the repository when `psi_icinga_client_configure_epel` is enabled; it is then taken from the official EPEL repositories.
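A minimal sketch of the two switches named above. It assumes that both are plain Booleans, like `psi_icinga_client_configure_snmp` in the SNMP example of this guide; check the role defaults before relying on it.

```yaml
# Sketch only: enable NRPE and let the role install the EPEL repository
psi_icinga_client_configure_nrpe: true   # assumed to be a Boolean
psi_icinga_client_configure_epel: true   # only needed when EPEL is not yet available
```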
Important parameters are:
* `psi_icinga_client_nrpe_allowed_hosts` (`String`): a comma-separated list of allowed hosts. Usually this is updated centrally in the default variables inventory; however, when a new Nagios worker or server is set up, it might be useful to override this setting until it has been changed centrally.
* `psi_icinga_client_nrpe_dont_blame` (`Boolean`): determines whether or not the NRPE daemon allows clients to specify arguments to the commands that are executed. Since this option is a security risk, it is disabled by default. However, it is needed in many cases, which is why it is provided (enabling it is the administrator's responsibility).
* `psi_icinga_client_nrpe_allow_bash_command_substitution` (`Boolean`): determines whether or not the NRPE daemon allows clients to specify arguments that contain bash command substitutions of the form `$(...)`. Since this is also a security risk, it is disabled by default.
* Icinga checks, which are split over three different variables. The reason is that Ansible does not merge hash variables across inventory levels by default (a more specific definition replaces the more general one), and this is the workaround. Each setting is a `Hash` where:
  * The item name is the file name that will be placed in `include_dir` (usually `/etc/nrpe.d/`).
  * For each item:
    * one or more `commands` can be specified; they are all placed in the same file
    * the commands in that file may or may not need sudo. Enabling `sudo` for that file places the proper sudoers rules in the default sudoers location (usually `/etc/sudoers.d/`).
  * The three variables are:
    * `psi_icinga_client_nagios_include_dir_checks` (`Hash`)
    * `psi_icinga_client_nagios_include_dir_checks_common` (`Hash`)
    * `psi_icinga_client_nagios_include_dir_checks_extra` (`Hash`)
An example for setting Icinga alarms is the following:
```yaml
# Allow different Icinga hosts (PSI workers)
psi_icinga_client_nrpe_allowed_hosts: "emonma00.psi.ch,vemonma00.psi.ch,wmonag00.psi.ch,emonag00.psi.ch,eadmin00.psi.ch,wadmin00.psi.ch,monaggfa.psi.ch,monaggfa2.psi.ch,monagxbl.psi.ch,wmonagcpt.psi.ch,vwmonagcpt.psi.ch,monagmisc.psi.ch,wmonagnet.psi.ch,vwmonagnet.psi.ch,monagsfel.psi.ch"
# Allow arguments: NRPE Don't Blame
psi_icinga_client_nrpe_dont_blame: True
# Allow arguments: Bash Command Substitution
psi_icinga_client_nrpe_allow_bash_command_substitution: True
# Define NRPE checks with and without "sudo"
psi_icinga_client_nagios_include_dir_checks:
  system_checks:
    commands:
      - command: "check_disk"
        path: "{{ psi_icinga_client_nagios_plugins_dir }}/check_disk"
        arguments: "$ARG1$"
      - command: "check_load"
        path: "{{ psi_icinga_client_nagios_plugins_dir }}/check_load"
        arguments: "$ARG1$"
psi_icinga_client_nagios_include_dir_checks_common: {}
psi_icinga_client_nagios_include_dir_checks_extra:
  gpfs_checks:
    sudo: True
    commands:
      - command: "check_gpfs_health"
        path: "{{ psi_icinga_client_nagios_plugins_dir }}/check_gpfs_health"
        arguments: "--unhealth --ignore-tips"
```
#### SNMP
To enable SNMP, enable `psi_icinga_client_configure_snmp`. Once it is enabled, the default settings should be fine for most use cases. However, it is important to update at least:
* `psi_icinga_client_snmpd_syscontact` (which defaults to *servicesdesk@psi.ch*)
* `psi_icinga_client_snmpd_rocommunity`, which by default contains only the *PSI public network* (129.129.0.0/16) and *localhost*. Hence, one needs to specify extra networks if necessary.
An example for configuring SNMP:
```yaml
# Configure SNMP
psi_icinga_client_configure_snmp: True
psi_icinga_client_snmpd_dontLogTCPWrappersConnects: true
psi_icinga_client_snmpd_trapcommunity: psi
psi_icinga_client_snmpd_syslocation: PSI
psi_icinga_client_snmpd_syscontact: marc.caubet@psi.ch
psi_icinga_client_snmpd_sysservices: 76
psi_icinga_client_snmpd_rocommunity:
  - community: psi
    network: 172.21.0.0/16
    oid: .1.3.6.1
  - community: psi
    network: 129.129.0.0/16
    oid: .1.3.6.1
  - community: psi
    network: 192.168.1.0/24
    oid: .1.3.6.1
  - community: psi
    network: localhost
    oid: .1.3.6.1
```
---
### Storage Configuration
Owned by @dorigo_a
#### Configuring a partition
Define the following variable:
```yaml
psi_local_storage_physical_volumes:
  - /dev/<device>
```
This tells Ansible which device (or partition) must be used for the creation/modification of a volume group.
Multiple devices can be listed; for example:
```yaml
psi_local_storage_physical_volumes:
  - /dev/sdb1
  - /dev/sdb2
  # ...
  - /dev/sdb5
```
`<device>` can be either a block device (`sda`, `sdb`, …) or a partition previously (and manually) created on a block device using `fdisk`/`parted` (`sda1`, `sdc3`, …).
#### Configuring a volume group
```yaml
psi_local_storage_physical_name: <vg_name>
```
`<vg_name>` is the name of a new volume group or the name of an existing volume group in which one wants to create/modify logical volumes.
If the volume group already exists the role will simply add to it the new physical volumes specified in the previous variable `psi_local_storage_physical_volumes`, or no action is taken if the volume group is already built on top of the same physical volumes.
#### Configuring a logical volume
```yaml
psi_local_storage_logical_volumes:
  - name: <lv_name>
    size: N # size in unit of GB
    fstype: ext4 # or xfs
    mount_point: <somepath>
    createfs: <boolean_value>
```
The above configuration does different things depending on whether `<lv_name>` already exists. Please note that `psi_local_storage_logical_volumes` is a list of dictionaries, meaning that one can create/modify multiple logical volumes:
##### `<lv_name>` does not exist
A logical volume named `<lv_name>` is created inside the volume group specified above (`<vg_name>`). Its size will be `N` GB. If `<boolean_value>` is true, a filesystem of type `fstype` is created on the device `/dev/<vg_name>/<lv_name>` and mounted persistently on `<somepath>`.
##### `<lv_name>` already exists and a filesystem is already present in it
The logical volume `<lv_name>` is expanded (together with the filesystem) to `N` GB (only if `N` is greater than the current size). The filesystem is always expanded to fill up the entire logical volume.
##### `<lv_name>` already exists and a filesystem is not present in it
The logical volume `<lv_name>` is expanded to `N` GB (only if `N` is greater than the current size); if `<boolean_value>` is true (or `yes`), a filesystem of type `fstype` is created and mounted persistently on `<somepath>`.
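The three variables of this section could be combined as in the following sketch. The device `/dev/sdb1`, the names `vg_data` and `lv_data`, the size and the mount point `/data` are placeholders chosen for illustration, not PSI defaults.

```yaml
# Sketch only: one physical volume, one volume group, one logical volume
psi_local_storage_physical_volumes:
  - /dev/sdb1                              # placeholder device
psi_local_storage_physical_name: vg_data   # placeholder volume group name
psi_local_storage_logical_volumes:
  - name: lv_data                          # placeholder logical volume name
    size: 10                               # GB
    fstype: xfs
    mount_point: /data
    createfs: true                         # create the filesystem if it is missing
```

Following the rules described above, running the role again with a larger `size` would grow the logical volume and its filesystem, while a smaller value would leave them untouched.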
#### Use Case 1: extending an existing partition after the first system installation
ASSUMPTION 1: you have just installed a new system with the following LVM-based partition scheme:
```
[vagrant@your_server ~]$ df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 393M 0 393M 0% /dev
tmpfs 410M 0 410M 0% /dev/shm
tmpfs 410M 5.7M 404M 2% /run
tmpfs 410M 0 410M 0% /sys/fs/cgroup
/dev/mapper/cl-root 50G 2.6G 48G 6% /
/dev/sda1 976M 183M 726M 21% /boot
/dev/mapper/cl-home 27G 225M 27G 1% /home
tmpfs 82M 0 82M 0% /run/user/1000
/dev/mapper/myvg_root-first 1014M 40M 975M 4% /mnt/first
/dev/mapper/myvg_root-second 976M 2.6M 907M 1% /mnt/second
/dev/mapper/myvg_root-third 1014M 40M 975M 4% /mnt/third
```
ASSUMPTION 2: the partition scheme was not created by you (or by the automatic Tower system) using a specific playbook. It is just there and you do not like the current size of `/mnt/first` and/or `/mnt/second` and/or `/mnt/third`.
ASSUMPTION 3: the volume group/physical devices supporting the logical volumes `myvg_root-[first,second,third]` have some unallocated space that you can use to extend the partitions mounted on `/mnt/[first,second,third]`.
Write the following playbook (named `extend.yml`), assuming that you want `/mnt/second`, `/mnt/third` and `/mnt/first` resized to 4, 6 and 2 GB respectively:
```
---
- name: Extend partition
  hosts: your_server.psi.ch
  roles:
    - name: psi.local_storage
      psi_local_storage_resizefs:
        - path: '/mnt/second'
          size: 4
        - path: '/mnt/third'
          size: 6
        - path: '/mnt/first'
          size: 2
...
```
Execute it with the usual `ansible-playbook` command:
```
[vagrant@control ~]$ ansible-playbook extend.yml
PLAY [Extend storage] ******************************************************************************************************************************
TASK [Gathering Facts] *****************************************************************************************************************************
[DEPRECATION WARNING]: Distribution centos 8.2.2004 on host your_server should use /usr/libexec/platform-python, but is using /usr/bin/python for
backward compatibility with prior Ansible releases. A future Ansible release will default to using the discovered platform python for this host.
See https://docs.ansible.com/ansible/2.9/reference_appendices/interpreter_discovery.html for more information. This feature will be removed in
version 2.12. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
ok: [your_server]
TASK [psi.local_storage : Check that user specified a fstype which is supported] *******************************************************************
TASK [psi.local_storage : Ensure lmv2 package is installed] ****************************************************************************************
skipping: [your_server]
TASK [psi.local_storage : Create VG '' on physical volume '[]'] ************************************************************************************
skipping: [your_server]
TASK [psi.local_storage : Create logical volume(s) on ''] ******************************************************************************************
TASK [psi.local_storage : Create not mounted filesystem(s)] ****************************************************************************************
TASK [psi.local_storage : Mount filesystem(s)] *****************************************************************************************************
TASK [psi.local_storage : Resize Filesystem] *******************************************************************************************************
changed: [your_server] => (item={'path': '/mnt/second', 'size': 4})
changed: [your_server] => (item={'path': '/mnt/third', 'size': 6})
changed: [your_server] => (item={'path': '/mnt/first', 'size': 2})
PLAY RECAP *****************************************************************************************************************************************
your_server : ok=2 changed=1 unreachable=0 failed=0 skipped=6 rescued=0 ignored=0
```
#### Use Case 2: extending an existing partition after the first system installation using an additional device
ASSUMPTION 1: you have just installed a new system with the following LVM-based partition scheme:
```
[vagrant@your_server ~]$ df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 393M 0 393M 0% /dev
tmpfs 410M 0 410M 0% /dev/shm
tmpfs 410M 5.7M 404M 2% /run
tmpfs 410M 0 410M 0% /sys/fs/cgroup
/dev/mapper/cl-root 50G 2.6G 48G 6% /
/dev/sda1 976M 183M 726M 21% /boot
/dev/mapper/cl-home 27G 225M 27G 1% /home
tmpfs 82M 0 82M 0% /run/user/1000
/dev/mapper/myvg_root-first 1014M 40M 975M 4% /mnt/first
/dev/mapper/myvg_root-second 976M 2.6M 907M 1% /mnt/second
/dev/mapper/myvg_root-third 1014M 40M 975M 4% /mnt/third
```
ASSUMPTION 2: the partition scheme was not created by you (or by the automatic Tower system) using a specific playbook. It is just there and you do not like the current size of `/mnt/first`.
ASSUMPTION 3: you have a new HDD/SSD attached to your node, identified as device `/dev/sdc`, and you want to use it to expand the volume group and its logical volumes so that you can freely enlarge your partitions.
This procedure requires a bit more know-how about Linux and logical volume management, but the following explanation tries to guide you as much as possible.
##### Step1 - Get Volume Group and physical devices
Identify the volume group associated with the partition `/mnt/first` you want to expand, by executing this:
```
[root@your_server ~]# theVG=$(lvdisplay $(df -h /mnt/first|grep /mnt/first|awk '{print $1}')|grep "VG Name"|awk '{print $NF}')
```
Identify the physical volumes on which the volume group is built:
```
[root@your_server ~]# vgdisplay -v $theVG |grep "PV Name"|awk '{print $NF}'
/dev/sdb1
/dev/sdb2
/dev/sdb3
```
Take note of these three physical devices and remember that you will have to add your new device `/dev/sdc` to this list.
##### Step2 - Get Logical volume name and filesystem type
Execute the following command to get the logical volume name:
```
[root@your_server ~]# theLV=$(lvdisplay $(df -h /mnt/first|grep /mnt/first|awk '{print $1}')|grep "LV Name"|awk '{print $NF}')
[root@your_server ~]# echo $theLV
first
```
Identify the filesystem type by executing this command:
```
[root@your_server ~]# theFS=$(mount|grep "/mnt/first"|awk '{print $5}')
[root@your_server ~]# echo $theFS
xfs
```
##### Step3 - Prepare the ansible playbook
Write the following playbook (named `extend.yml`), taking care to use the correct volume group name (`theVG`), logical volume name (`theLV`) and filesystem type (`theFS`); keep the mount point unchanged:
```
---
- name: Extend storage
  hosts: your_server
  roles:
    - name: psi.local_storage
      psi_local_storage_physical_volumes:
        - /dev/sdb1
        - /dev/sdb2
        - /dev/sdb3
        - /dev/sdc # it doesn't matter that you didn't do any partition inside sdc; LVM is able to cope with RAW devices as well
      psi_local_storage_physical_name: 'myvg_root'
      psi_local_storage_logical_volumes:
        - name: 'first'
          size: 3
          fstype: 'xfs'
          mount_point: '/mnt/first'
          createfs: true
...
```
Note that the playbook lists the three devices found above (output of `vgdisplay`: `/dev/sdb[1,2,3]`) plus the new one, `/dev/sdc`.
Also note that the original size of the `/mnt/first` filesystem was 1 GB; the playbook now sets it to 3 (the implicit unit is GB).
Execute your playbook `extend.yml`:
```
[vagrant@control ~]$ ansible-playbook extend.yml
PLAY [Extend storage] **********************************************************************************************************************************************************************************************
TASK [Gathering Facts] *********************************************************************************************************************************************************************************************
TASK [psi.local_storage : Check that user specified a fstype which is supported] ***********************************************************************************************************************************
skipping: [your_server] => (item={'name': 'first', 'size': 3, 'fstype': 'xfs', 'mount_point': '/mnt/first', 'createfs': True})
TASK [psi.local_storage : Ensure lmv2 package is installed] ********************************************************************************************************************************************************
ok: [your_server]
TASK [psi.local_storage : Create VG 'myvg_root' on physical volume '['/dev/sdb1', '/dev/sdb2', '/dev/sdb3']'] ******************************************************************************************************
ok: [your_server]
TASK [psi.local_storage : Create logical volume(s) on 'myvg_root'] *************************************************************************************************************************************************
changed: [your_server] => (item={'name': 'first', 'size': 3, 'fstype': 'xfs', 'mount_point': '/mnt/first', 'createfs': True})
TASK [psi.local_storage : Create not mounted filesystem(s)] ********************************************************************************************************************************************************
ok: [your_server] => (item={'name': 'first', 'size': 3, 'fstype': 'xfs', 'mount_point': '/mnt/first', 'createfs': True})
TASK [psi.local_storage : Mount filesystem(s)] *********************************************************************************************************************************************************************
ok: [your_server] => (item={'name': 'first', 'size': 3, 'fstype': 'xfs', 'mount_point': '/mnt/first', 'createfs': True})
TASK [psi.local_storage : Resize XFS filesystem(s)] ****************************************************************************************************************************************************************
changed: [your_server] => (item={'name': 'first', 'size': 3, 'fstype': 'xfs', 'mount_point': '/mnt/first', 'createfs': True})
TASK [psi.local_storage : Resize EXT filesystem(s)] ****************************************************************************************************************************************************************
skipping: [your_server] => (item={'name': 'first', 'size': 3, 'fstype': 'xfs', 'mount_point': '/mnt/first', 'createfs': True})
PLAY RECAP *********************************************************************************************************************************************************************************************************
your_server : ok=7 changed=2 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
```
And now check the size of `/mnt/first` on your your_server node:
```
[root@your_server ~]# df -h
Filesystem Size Used Avail Use% Mounted on
devtmpfs 393M 0 393M 0% /dev
tmpfs 410M 0 410M 0% /dev/shm
tmpfs 410M 5.7M 404M 2% /run
tmpfs 410M 0 410M 0% /sys/fs/cgroup
/dev/mapper/cl-root 50G 2.6G 48G 6% /
/dev/sda1 976M 183M 726M 21% /boot
/dev/mapper/cl-home 27G 225M 27G 1% /home
tmpfs 82M 0 82M 0% /run/user/1000
/dev/mapper/myvg_root-first 3.0G 55M 3.0G 2% /mnt/first
/dev/mapper/myvg_root-second 976M 2.6M 907M 1% /mnt/second
/dev/mapper/myvg_root-third 1014M 40M 975M 4% /mnt/third
```
#### Mount mountpoints
Define the following list of dictionaries:
```yaml
psi_mounts_mounts:
  - fstype: <ext4|xfs>
    mount_point: <somepath>
    device: /dev/<somedevice_or_lvm>
    options: <opts-to-be-put-in-fstab>
    state: mounted|unmounted|absent|present|remounted
  - fstype: <ext4|xfs>
    mount_point: <somepath2>
    device: /dev/<somedevice_or_lvm>
    options: <opts-to-be-put-in-fstab>
    state: mounted|unmounted|absent|present|remounted
  # [...]
```
and run the playbook.
Note that the block devices `/dev/something` must already exist and their filesystems must already be created.
* `mounted/unmounted` has a clear meaning
* `absent/present` concerns the presence or absence of the mount directive in `/etc/fstab`, which makes the mount happen automatically at every boot of the system
* `remounted` means that you want to remount the partition because, for example, you changed some parameter or option.
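As a concrete illustration of the template above, a single persistent mount could be declared as follows. The device path, mount point and options are placeholders and assume that the logical volume and its filesystem already exist.

```yaml
# Sketch only: mount an existing xfs filesystem and keep it in /etc/fstab
psi_mounts_mounts:
  - fstype: xfs
    mount_point: /data                 # placeholder mount point
    device: /dev/vg_data/lv_data       # placeholder device, must already exist
    options: defaults,noatime          # written to the options column of /etc/fstab
    state: mounted
```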
---
### System Registration
Owned by @spreitzer_s
By default, your system receives globally supplied values from the PSI Default Inventory in GitLab and AWX, so that it registers successfully with Satellite and automatically gets access to the software repositories.
However, the following settings can be overridden:
```yaml
psi_subscription_manager_activation_key: RHEL8-GFA
psi_subscription_manager_org: PSI
psi_subscription_manager_server: satint.psi.ch
psi_subscription_manager_force_register: 'False'
```
#### psi_subscription_manager_activation_key
The Satellite activation key to use. Usually something that starts with `RHEL8-`.
#### psi_subscription_manager_org
The Satellite Organization to use. Usually `PSI`.
#### psi_subscription_manager_server
The Satellite to register with. Usually `satint.psi.ch`.
#### psi_subscription_manager_force_register
Whether or not to enforce the registration with subscription-manager. Usually `true` to ensure a system has software access.
---
### System Security
Owned by @caubet_m
This section shows how to manage SELinux with the Ansible *selinux* role. More examples can be found on the **['linux-system-roles' official GitHub project page](https://github.com/linux-system-roles/selinux)**.
#### Enabling/Disabling SELinux
The defaults at PSI are:
```yaml
selinux_state: enforcing
selinux_policy: targeted
```
* Allowed values for **`selinux_state`** are `disabled`, `enforcing` and `permissive`.
* Allowed values for **`selinux_policy`** are `targeted`, and `mls`.
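For instance, a host could temporarily be switched to permissive mode while SELinux denials are being investigated. This is only a sketch; placing it in the host variables of the affected system is an assumption about how the inventory is organised.

```yaml
# Sketch only: log SELinux denials instead of enforcing them on one host
selinux_state: permissive
selinux_policy: targeted
```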
#### (Un)Setting SELinux booleans
Common examples for setting SELinux booleans are the following:
* Enabling the `use_nfs_home_dirs` Boolean to allow the usage of NFS-based home directories, and making it persistent across machine reboots.
* Enabling the `httpd_use_nfs` Boolean to allow *httpd* to access and share NFS volumes (labeled with the `nfs_t` type).
```yaml
selinux_booleans:
  - name: use_nfs_home_dirs
    state: on
    persistent: 'yes'
  - name: httpd_use_nfs
    state: on
```
#### Set SELinux file contexts
In this example, we label the `/tmp/test_dir` directories with the `user_home_dir_t` context.
```yaml
selinux_fcontexts:
  - target: '/tmp/test_dir(/.*)?'
    setype: user_home_dir_t
    ftype: d
    state: present
```
#### Set SELinux Ports
In the example below, we allow SSH to use TCP port 22100, so that *sshd* can listen on the non-standard port 22100 instead of the standard port 22. For that, `/etc/ssh/sshd_config` also needs to be updated by changing `Port 22` to `Port 22100`.
```yaml
selinux_ports:
  - ports: '22100'
    proto: tcp
    setype: ssh_port_t
```
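The matching change in `/etc/ssh/sshd_config` is not part of the *selinux* role. One possible way to automate it is sketched below with the standard `lineinfile` and `service` modules; the task names and the registered variable `sshd_port` are made up for this illustration, and the tasks are not taken from any PSI role.

```yaml
# Sketch only: tasks for a playbook, not part of the selinux role
- name: Let sshd listen on port 22100
  lineinfile:
    path: /etc/ssh/sshd_config
    regexp: '^#?Port '
    line: 'Port 22100'
    validate: '/usr/sbin/sshd -t -f %s'   # refuse a broken configuration file
  register: sshd_port                     # hypothetical variable name

- name: Restart sshd only when the port line changed
  service:
    name: sshd
    state: restarted
  when: sshd_port is changed
```

Running these tasks after the SELinux port definition has been applied avoids a window in which *sshd* is not allowed to bind to the new port.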
* (Persistent file contexts, `semanage fcontext`, if you have time)
#### Set Linux user to SELinux user mapping
With `selinux_policy: mls`, the mapping of Linux users to SELinux users needs to be updated.
In the example, we remove `feichtinger` from `staff_u` and map a new user `caubet_m`, as well as a generic user name `staff`, to the SELinux user `staff_u` (`caubet_m` has more security privileges than the generic `staff` user; this is defined with `serange`). In addition, any user that is not mapped explicitly (`__default__`) is mapped to the SELinux user `user_u`. Users mapped in that way have a very low security level (`s0`, which is the lowest).
```yaml
selinux_logins:
  - login: feichtinger
    seuser: staff_u
    state: absent
  - login: caubet_m
    seuser: staff_u
    serange: 's0-s15:c0.c1023'
  - login: staff
    seuser: staff_u
    serange: 's2:c100'
  - login: __default__
    seuser: user_u
    serange: 's0-s0:'
```
The resulting mapping can be checked with `semanage login -l`, for example:
```bash
[root@hpc-rhel8devel01 home]# semanage login -l
__default__ user_u s0-s0 *
caubet_m staff_u s0-s15:c0.c1023 *
root root s0-s15:c0.c1023 *
staff staff_u s2:c100 *
sysadm staff_u s0-s15:c0.c1023 *
system_u system_u s0-s15:c0.c1023 *
```
#### Restorecon
Run `restorecon` on filesystem trees for applying `selinux` policies:
```yaml
selinux_restore_dirs:
- /var
- /tmp
```
---
### Systemd Services
Owned by @caubet_m
By default this role creates **systemd** **service** *units*; however, it is also possible to configure other *unit* types such as **slice**, **socket**, **timer**, **mount**, etc.
Full examples for the **systemd** Ansible role can be found on the **['0x0I' official GitHub project page](https://github.com/0x0I/ansible-role-systemd#role-variables)**. The example below shows how to create different *systemd* units: *service*, *socket*, *mount*, *target* and *timer*.
```yaml
unit_config:
  - name: "test-service"
    Unit:
      Description: "This is a test service unit which listens at port 1234"
      After: network-online.target
      Wants: network-online.target
      Requires: test-service.socket
    Service:
      User: 'kitchen'
      Group: 'kitchen'
      ExecStart: '/usr/bin/sleep infinity'
      ExecReload: '/bin/kill -s HUP $MAINPID'
    Install:
      WantedBy: 'multi-user.target'
  - name: "test-service"
    type: "socket"
    Unit:
      Description: "This is a test socket unit which specifies the test-service 'socket' unit type"
    Socket:
      ListenStream: '0.0.0.0:1234'
      Accept: 'true'
    Install:
      WantedBy: 'sockets.target'
  - name: "tmp-stdin"
    type: "mount"
    path: "/run/systemd/system"
    Unit:
      Description: "This is a test mount unit which overrides the default unit path"
    Mount:
      What: '/dev/stdin'
      Where: '/tmp/stdin'
    Install:
      WantedBy: 'mount.target'
  - name: "test-target"
    type: "target"
    path: "/etc/systemd/system"
    Unit:
      Description: This is an example unit Target
      Wants: test-service.service test-service.socket tmp-stdin.mount
      PartOf: test-service.service
  - name: dnf-makecache
    type: timer
    Unit:
      Description: "This is a test timer unit which refreshes dnf cache"
    Timer:
      OnBootSec: 10min
      OnUnitInactiveSec: 1h
      Unit: dnf-makecache.service
    Install:
      WantedBy: multi-user.target
```
---
### System Time/NTP
Owned by @caubet_m
This section describes how to configure the system time on RHEL8-based systems. The current *defaults* should fit most cases at PSI:
* The recommended service on RHEL8 systems for configuring *system time* is **`chrony`**
* PSI provides several NTP servers which should be accessible from most of the PSI subnets:
  * `pstime1.psi.ch`
  * `pstime2.psi.ch`
  * `pstime3.psi.ch`
* We usually apply custom Chrony settings for logging and for rapid clock measurement at boot time:
  * We set `initstepslew` to `60` seconds. That is, if the system clock error is found to be 60 seconds or less, a slew is used to correct it; if the error is above 60 seconds, a step is used.
  * We log several metrics which help to debug time synchronization problems: `measurements statistics tracking`
Example of default configuration at PSI, which should be adapted according to your needs:
```yaml
# linux-system-roles.timesync settings
timesync_chrony_custom_settings:
  - "# Allow chronyd to make a rapid measurement of the system and correct clock error at boot time"
  - "initstepslew 60 pstime1.psi.ch pstime2.psi.ch pstime3.psi.ch"
  - "# Select which information is logged."
  - "log measurements statistics tracking"
timesync_ntp_provider: chrony
timesync_ntp_servers:
  - hostname: pstime1.psi.ch
  - hostname: pstime2.psi.ch
  - hostname: pstime3.psi.ch
```
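
As a hedged illustration of adapting these defaults, the sketch below lowers the `initstepslew` threshold and enables `iburst` for a faster initial synchronization. The threshold of `30` seconds and the use of `iburst` are assumptions made for this example, not PSI recommendations; `iburst` is an option of the `timesync_ntp_servers` entries in `linux-system-roles.timesync`.

```yaml
# Hypothetical host- or group-level override; all values are illustrative only
timesync_ntp_provider: chrony
timesync_chrony_custom_settings:
  - "# Step the clock at boot only if the error exceeds 30 seconds (assumed value)"
  - "initstepslew 30 pstime1.psi.ch pstime2.psi.ch pstime3.psi.ch"
  - "log measurements statistics tracking"
timesync_ntp_servers:
  - hostname: pstime1.psi.ch
    iburst: yes   # send a burst of requests on start to synchronize faster
  - hostname: pstime2.psi.ch
    iburst: yes
  - hostname: pstime3.psi.ch
    iburst: yes
```

After the Ansible run, `chronyc sources` and `chronyc tracking` on the host show which servers are in use and the current offset.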
---
### User Management
Owned by @spreitzer_s
User management is divided into two parts:
* PSI Active Directory
* Local system
**Users, groups and group memberships must be managed in Active Directory!** Please consult the PSI Service Catalog to request users, groups and group memberships, as well as their removal: http://css.psi.ch/psisp
*Use `*_common` for inventory group variables and `*_extra` for host variables.*
#### psi_aaa_allow_groups{_common,_extra}
List of groups that are allowed to log in to a system.
```yaml
psi_aaa_allow_groups_extra:
  - unx-ait
  - unx-sls
```
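
The `*_common` / `*_extra` convention can be sketched as follows. The file names and the inventory group and host they imply are hypothetical, and the sketch assumes the role combines both lists into the effective set of allowed groups.

```yaml
# group_vars/sls.yml (hypothetical inventory group file)
# Applies to every host in the group
psi_aaa_allow_groups_common:
  - unx-sls

# host_vars/sls-console-01.yml (hypothetical host file)
# Adds a further group for this single host only
psi_aaa_allow_groups_extra:
  - unx-ait
```

Keeping group-wide entries in `*_common` and host-specific additions in `*_extra` avoids a host variable silently replacing the list defined for its group.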
#### psi_aaa_allow_user{_common,_extra}
List of users that are allowed to log in to a system. *Prefer using groups over users!*
```yaml
psi_aaa_allow_user_extra:
  - kapeller
  - klar_t
  - spreitzer_s
```
### Local User Management (Do not use, prefer Active Directory)
#### psi_aaa_local_sudo_rules{_common,_extra}
Manage local sudo rules with a list of entries (name, content and state). Be very cautious with sudo rules, as one faulty rule will break sudo for the whole system.
```yaml
psi_aaa_local_sudo_rules_extra:
  - name: sspreitz-root-nopasswd
    content: "sspreitz ALL=(ALL) NOPASSWD: ALL\n"
  - name: group-wheel-root-nopasswd
    content: "%wheel ALL=(ALL) NOPASSWD: ALL\n"
  - name: linuxsupport-root-nopasswd
    content: |
      jill ALL=(ALL) NOPASSWD: ALL
      joe ALL=(ALL) NOPASSWD: ALL
      jack ALL=(ALL) NOPASSWD: ALL
      tom ALL=(ALL) NOPASSWD: ALL
  - name: sam-root-nopasswd
    state: absent
```
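
Given how much damage a faulty rule can do, a narrower rule is often preferable to `ALL`. The following is a minimal sketch with a hypothetical rule name and command; it assumes the role writes `content` unchanged into a sudoers drop-in file.

```yaml
psi_aaa_local_sudo_rules_extra:
  # Hypothetical rule: members of wheel may restart chronyd as root, nothing else
  - name: group-wheel-restart-chronyd
    content: "%wheel ALL=(root) NOPASSWD: /usr/bin/systemctl restart chronyd\n"
```

Checking the rule text with `visudo -cf <file>` on a test system before rollout is one way to catch syntax errors early.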
#### psi_aaa_local_groups{_common,_extra}
Manage local groups with a list of Ansible group definitions. See https://docs.ansible.com/ansible/latest/collections/ansible/builtin/group_module.html
```yaml
psi_aaa_local_groups_extra:
  - name: group1
    gid: 30000
  - name: group2
  - name: support
    system: yes
  - name: group3
    state: absent
```
#### psi_aaa_local_users{_common,_extra}
Manage local users with a list of Ansible user definitions. See https://docs.ansible.com/ansible/latest/collections/ansible/builtin/user_module.html
```yaml
psi_aaa_local_users_extra:
  - name: guest
  - name: joe
    uid: 1000
    group: group1
    groups:
      - wheel
      - staff
      - audio
    home: /home/joe
    shell: /bin/fish
    # mkpasswd -m sha512crypt joe
    password: '$6$Mrq9msM24W$boAK1IYwuG6ze1qgk.HpqMqvj/zRThT2fTrb80kJTAiMg1CNXjbEEMH7A8KwAeKQJZuF14KRrpOK5NXxYvqqn1'
  - name: jill
    state: absent
    remove: yes
```
#### psi_aaa_local_authorized_keys{_common,_extra}
Manage local SSH authorized keys with a list of Ansible authorized key definitions. See https://docs.ansible.com/ansible/latest/collections/ansible/posix/authorized_key_module.html
```yaml
psi_aaa_local_authorized_keys_extra:
  - user: sspreitz
    key: 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC9gU640HBk4m0OA4b2ziTCnVP6QYhs2Zs/LJWTN85+vCudgZfiMip2MAAR0OlOVtB4JYXJh83Rihj0REA13ei3akAPzgG+B4Qlk3QYA2Bf2YDjRGqwgpmhVlTNgJy+l9lS9rn5kPheXTi1GOgGVKi4jd5f6TuYhMBmSl64oCtWnanIwXd/u6teStTd7V0HKgev+GbAvTJPFoxOHFSV51mMvFkkW0s0cPTwLvekAPsnjw4ztEoX8Ar72U+KOnt6YLOEuKB0bKZ4PKTEz7woltDcXKzN9g5HKSY+RgSk9APrOol+HVgs841/1KChri7xPao4J1OzU0Ap6wkG+GfqPVc/ sspreitz@redhat.com'
  - user: evil
    key: 'ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC9gU640HBk4m0OA4b2ziTCnVP6QYhs2Zs/LJWTN85+vCudgZfiMip2MAAR0OlOVtB4JYXJh83Rihj0REA13ei3akAPzgG+B4Qlk3QYA2Bf2YDjRGqwgpmhVlTNgJy+l9lS9rn5kPheXTi1GOgGVKi4jd5f6TuYhMBmSl64oCtWnanIwXd/u6teStTd7V0HKgev+GbAvTJPFoxOHFSV51mMvFkkW0s0cPTwLvekAPsnjw4ztEoX8Ar72U+KOnt6YLOEuKB0bKZ4PKTEz7woltDcXKzN9g5HKSY+RgSk9APrOol+HVgs841/1KChri7xPao4J1OzU0Ap6wkG+GfqPVc/ mrevil@example.com'
    state: absent
```
---
### Software Management
Owned by @klar_t
#### psi_packer_repo
A merged dictionary of yum repository definitions
#### psi_packer_inst
A merged list of rpm packages to be installed
#### psi_packer_rem
A merged list of rpm packages to be removed
#### psi_packer_update
`true` or `false` on whether to update all packages on each ansible run
#### Important
The `psi_packer_repo` and the `psi_packer_inst` variables are merged.
It is a wildcard merge, so any suffix can be used, but it is recommended to use the group name or hostname as the suffix so that there is no accidental overlap.
Only repos named in the list of enabled repos or the list of disabled repos are added to the repo file. Otherwise a repo may be defined but will be ignored.
These two lists are also wildcard merged.
```yaml
- hosts: servers
  vars:
    psi_packer_update: true
    psi_packer_repo_group:
      myrepo:
        description: This is my repo
        baseurl: http://example.com/repos/myrepo/
        gpgcheck: yes
        gpgkey: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-myrepo
    psi_packer_repo_host:
      myotherrepo:
        description: This is my other repo
        baseurl: http://example.com/repos/myotherrepo/
        gpgcheck: yes
        gpgkey: file:///etc/pki/rpm-gpg/RPM-GPG-KEY-myrepo
    psi_packer_enabled_repos_group:
      - myrepo
    psi_packer_disabled_repos_host:
      - myotherrepo
    psi_packer_inst_group:
      - httpd
      - mariadb
    psi_packer_inst_host:
      - mc
      - nano
    psi_packer_del_group:
      - matlab
      - office
    psi_packer_del_host:
      - kernel-devel
      - afs
  roles:
    - psi.packer
```
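
The wildcard merge can be pictured with a smaller sketch. The suffixes `webservers` and `web01` are hypothetical group and host names, and the result shown in the comments assumes the role concatenates every `psi_packer_inst_*` list it finds; the order of the merged entries is not specified here.

```yaml
# Defined at group level (suffix taken from a hypothetical group name)
psi_packer_inst_webservers:
  - httpd

# Defined at host level (suffix taken from a hypothetical hostname)
psi_packer_inst_web01:
  - nano

# Assumed effective install list for host web01 after the wildcard merge:
#   - httpd
#   - nano
```

Using the group name or hostname as the suffix keeps each variable unique, so a definition at one level does not overwrite a definition at another.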
---
### AFS
Owned by @klar_t
Note: AFS and AFS homes are not enabled by default in RHEL-8.
PSI Linux Engineering does not support AFS and AFS homes. Contact Achim Gsell if you need AFS and AFS homes.
LVM partitioning and free space on the root VG are necessary to use this role. (The VG is selected based on where the root file system is located; the actual name does not matter.)
#### psi_yfs_size
Default: `2147483648`
Cache LV size, strictly in bytes
#### psi_yfs_remove
Default: `false`
Set this to `true` to remove everything the role would have installed
#### Example Playbook
An example of how to use this role (with variables passed in as parameters).
```yaml
- hosts: servers
  roles:
    - role: psi.yfs
      psi_yfs_size: 2147483648
```
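
Because `psi_yfs_size` is given strictly in bytes, a larger cache has to be converted by hand. A minimal sketch, assuming a 4 GiB cache is wanted and that the variables are set as group or host variables rather than as role parameters:

```yaml
# 4 GiB = 4 * 1024 * 1024 * 1024 bytes (illustrative value, not a PSI default)
psi_yfs_size: 4294967296
# Leave the role's installation in place; true would remove what it installed
psi_yfs_remove: false
```

The default of `2147483648` corresponds to 2 GiB by the same calculation.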
---

View File

@@ -1,4 +1,4 @@
# Developer's Guide of RHEL-8
# Developer Guide
**This guide is under heavy development and just drafted, expect frequent changes and check back every now and then**