add docs from infrastructure doc

2021-05-05 15:05:43 +02:00
parent aa373939e7
commit a9d6d0a819
18 changed files with 734 additions and 24 deletions
+5 -24
@@ -6,28 +6,9 @@
- part: Admin Guide
chapters:
- file: admin-guide/index
- part: Infrastructure Guide
chapters:
- file: infrastructure-guide/home
# sections:
# - file: admin-guide/architecture
# sections:
# - file: admin-guide/architecture/overview
# - file: admin-guide/architecture/accounts-and-groups
# - file: admin-guide/architecture/authentication-authorization
# - file: admin-guide/architecture/networking
# - file: admin-guide/architecture/services-cron-etc
# - file: admin-guide/architecture/version-control
# - file: admin-guide/architecture/security
# - file: admin-guide/architecture/active-directory
# - file: admin-guide/architecture/certificates
# - file: admin-guide/guidelines
# sections:
# - file: admin-guide/guidelines/conventions
# - file: admin-guide/deployment
# sections:
# - file: admin-guide/deployment/ipxe
# - file: admin-guide/deployment/kickstart
# - file: admin-guide/deployment/partitioning
# - file: admin-guide/deployment/sample
# - file: admin-guide/deployment/infrastructure
# - file: admin-guide/deployment/workflow
+9
@@ -0,0 +1,9 @@
https://git.psi.ch/linux-infra/sysdb/ is pulled into /var/www/sysdb/app/ (no automation, just by hand)
httpd runs the service and needs to be restarted after pulling changes
Access rights are granted at the environment level (bob env list). At this time, most users and groups come from the AD, except for sysdb-admins, which is defined locally; see /etc/group
Detailed documentation of the software is at:
http://linux-infra.gitpages.psi.ch/admin-guide/index.html
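Putting this together, the manual update procedure boils down to the following (the same steps appear in the newver procedure further down):
```
cd /var/www/sysdb/app/
git pull
systemctl restart httpd
```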
+46
@@ -0,0 +1,46 @@
List of systems and their primary role:
* [pxeserv01](pxeserv01) - 129.129.190.59 - TFTP server for PXE booting
* [boot00](boot00) - 129.129.160.210 - Runs sysdb, providing the dynamic iPXE, Grub and kickstart files
* [puppet00](puppet00) - 129.129.160.211 - Runs the puppet server for the RHEL7 infra
* [repo00](repo00) - 129.129.160.212 - RPM/Yum repository server for RHEL7
* [reposync](reposync) - 129.129.161.222 - RPM/Yum repository server for RHEL8
* [lxweb00](lxweb00) - 129.129.190.46 - Exports further repositories from AFS
* [login](login) - 129.129.190.131 129.129.190.132 129.129.190.133 - Shell login service for users
* [influx00](influx00) - 129.129.190.225 - Influx database server
* [metrics00](metrics00) - 129.129.190.226 - Grafana frontend for Influx
* [rocket](rocket) - 129.129.161.234 - Rocket chat server
* [lxsup00](lxsup00) - 129.129.190.24 - Shell for linux support, primarily to run bob
* [satint](satint) - 129.129.160.114 - PSI Satellite server
There is a KeePass file with passwords (ask Heinz or Edgar)
Access to the redhat.com knowledge base:
Login: kbaccess
Password: Kb4cc3ss
**Procedures**
* [Adding a new RHEL version to the RHEL7 install mechanism](newver)
* [How to grant access to RHEL7 infrastructure](https://git.psi.ch/linux-infra/user-ca/blob/master/README.md#automated-with-ansible-for-pli-infrastructure-systems-of-rhel-7)
* [Grant a new person rights for bob/sysdb](newbob)
* [How to reinstall a machine](howtoreinstall)
**Tools**
* [SSH config](sshconf)
# Metrics
* [Overview Infrastructure](https://metrics.psi.ch/d/1SL13Nxmz/gfa-linux-tabular?orgId=1&from=now-6h&to=now&refresh=30s&var-env=telegraf_pli&var-host=boot00.psi.ch&var-host=influx00.psi.ch&var-host=lxweb00.psi.ch&var-host=metrics00.psi.ch&var-host=puppet00.psi.ch&var-host=pxeserv01.psi.ch&var-host=repo00.psi.ch&var-host=reposync.psi.ch)
+15
@@ -0,0 +1,15 @@
# How to reinstall a machine
Generally speaking, a reinstall requires nothing more than the PXE boot itself, but there are some caveats to consider:
- the puppet server certificate is saved on the puppet server
- the puppet client certificate is saved by the kickstart script (which obviously can only happen if the machine is reinstalled to the same drive with an intact file system)
- if you do a new install to a blank drive but the puppet server has a certificate saved for the host, the client will generate a new cert while the server will not, so the certificates saved on the two sides will not match and will never work. In this case both sides need to be cleaned up before a new puppet run is attempted.
- somewhat unrelated to the other points, but a similar case is the ssh server keys, which are stored on the puppet server and put in place by the puppet agent, so they remain unchanged under all reinstall scenarios
It is already documented (https://git.psi.ch/linux-infra/docs/wikis/puppet00) how puppet server certs can be deleted at https://puppet00.psi.ch/; that page also specifies the command to delete the client cert.
To access https://puppet00.psi.ch you need to authenticate with your username/password. The server uses an invalid https certificate that is no longer accepted by modern Safari/Chrome; use Firefox as a workaround.
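As a rough sketch, assuming a Puppet 5-era stack and a placeholder hostname, the cleanup of both sides could look like this (the server side can also be done via the https://puppet00.psi.ch/ web app mentioned above; exact commands vary by Puppet version):
```
# on the puppet server: remove the stale certificate for the host (placeholder name)
puppet cert clean client00.psi.ch    # Puppet >= 6: puppetserver ca clean --certname client00.psi.ch
# on the client: wipe the local SSL state and request a fresh certificate
rm -rf /etc/puppetlabs/puppet/ssl
puppet agent -t
```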
+101
@@ -0,0 +1,101 @@
This is a RHEL7 machine and is puppet managed:
https://git.psi.ch/linux-infra/data-pli/blob/master/default/influx00.psi.ch.yaml
Runs the influxdb backend for the metrics.psi.ch service, as part of the Telegraf, InfluxDB and Grafana stack.
Influx version installed:
```
[root@influx00 ~]# rpm -qf /usr/bin/influxd
influxdb-1.8.3-1.x86_64
```
Open ports on this server are:
```
[root@influx00 influxdb]# ss -tln
State Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0 128 *:22 *:*
LISTEN 0 128 127.0.0.1:8088 *:*
LISTEN 0 100 127.0.0.1:25 *:*
LISTEN 0 5 *:5666 *:*
LISTEN 0 128 *:111 *:*
LISTEN 0 128 [::]:8086 [::]:*
LISTEN 0 128 [::]:22 [::]:*
LISTEN 0 100 [::1]:25 [::]:*
LISTEN 0 5 [::]:5666 [::]:*
LISTEN 0 128 [::]:111 [::]:*
```
There is no firewall running on this machine.
Note: Do not update to InfluxDB 2.x. The new version requires authentication by the clients, which is not implemented in puppet / Telegraf.
Data is stored at `/var/lib/influxdb` "locally" on the virtual machine.
The influx configuration can be found at `/etc/influxdb/influxdb.conf`.
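For a quick health check, the InfluxDB 1.x CLI on the host can be used (a sketch; assumes the stock `influx` CLI talking to localhost:8086):
```
systemctl status influxdb           # service state
influx -execute 'SHOW DATABASES'    # list databases via the 1.x CLI
```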
# Questions
- Is there a more detailed documentation/script/playbook that describes the setup of this server?
- Beyond the hiera config, the machine is set to use the influxdb role, which in turn applies the influxdb profile: https://git.psi.ch/linux-infra/puppet/blob/preprod/code/modules/profile/manifests/influxdb.pp
- How was the influxdb package installed on that machine?
- From the profile and now it is locked via the versionlock yum plugin.
- The storage for the data is "locally" to the virtual machine?
- Yes, all the data is stored on the VM disk image.
- The configuration file `/etc/influxdb/influxdb.conf` reports that it is managed via puppet. However, I don't see anything in the puppet configuration https://git.psi.ch/linux-infra/data-pli/blob/master/default/influx00.psi.ch.yaml. Where does this file come from?
```
########################################################################
#
# THIS FILE IS MANAGED BY PUPPET - DO NOT MODIFY!
#
########################################################################
```
Is it this one: https://git.psi.ch/linux-infra/puppet/blob/preprod/code/modules/profile/manifests/influxdb.pp ? Through what config does this get applied to the server?
- Correct
```
[klart@klart ~]$ bob node list -v influx00.psi.ch
influx00.psi.ch pli local ipxe_installer=rhel73server network=static puppet_env=pmons puppet_role=role::influxdb
```
- The influx service is started by systemd; however, the systemd service file does not seem to come with a package. Was it placed there manually?
```
[root@influx00 ~]# rpm -qf /usr/lib/systemd/system/influxdb.service
file /usr/lib/systemd/system/influxdb.service is not owned by any package
[root@influx00 ~]# cat /usr/lib/systemd/system/influxdb.service
# If you modify this, please also make sure to edit init.sh
[Unit]
Description=InfluxDB is an open-source, distributed, time series database
Documentation=https://docs.influxdata.com/influxdb/
After=network-online.target
[Service]
User=influxdb
Group=influxdb
LimitNOFILE=65536
EnvironmentFile=-/etc/default/influxdb
ExecStart=/usr/bin/influxd -config /etc/influxdb/influxdb.conf $INFLUXD_OPTS
KillMode=control-group
Restart=on-failure
[Install]
WantedBy=multi-user.target
Alias=influxd.service
[root@influx00 ~]#
```
- Answer:
- It is installed by the RPM, but from the RPM's install script, not as a packaged file (hence `rpm -qf` finds no owner).
- What are the other open ports needed for? :111 :8086 :25 :8088
- 25 is postfix (SMTP), 111 belongs to NFS (rpcbind), and the 80xx ports both belong to influx
- NFS is certainly not being used: nothing is mounted or exported, and it isn't even enabled in puppet for this host. However, puppet mostly works in a way where it installs things but doesn't remove anything when settings change, so if NFS was enabled at any time in the past, it was simply left behind. The NFS service itself is not running; only rpcbind is.
+3
@@ -0,0 +1,3 @@
This is a cluster made up of 3 hosts, all of which are VMs running on the AIT VMware cluster. These machines are pretty standard RHEL7 hosts managed by Puppet: https://git.psi.ch/linux-infra/data-pli/blob/master/login.yaml.
Info for users is published at https://intranet.psi.ch/de/computing/linux-login-clusters
+2
@@ -0,0 +1,2 @@
This is a RHEL7 machine and is puppet managed:
https://git.psi.ch/linux-infra/data-pli/blob/master/default/lxsup00.psi.ch.yaml
+59
@@ -0,0 +1,59 @@
This is a RHEL7 machine and is puppet managed. The httpd configuration seems to be managed there as well:
https://git.psi.ch/linux-infra/data-pli/blob/master/default/lxweb00.psi.ch.yaml
Exports various paths from AFS over http(s); see /etc/httpd/conf.d/ for details.
The AFS directories exported are:
```
Alias /dist "/afs/psi.ch/project/linux/www/dist"
Alias /kickstart "/afs/psi.ch/project/linux/www/kickstart"
Alias /mirror "/afs/psi.ch/project/linux/www/mirror"
Alias /pxe "/afs/psi.ch/service/linux/tftpboot"
Alias /ext/cpt "/afs/psi.ch/project/cpt/repo/"
Alias /ext/gfa "/afs/psi.ch/project/gfa-controls-sw-repo"
Alias /ext/gpfs "/afs/psi.ch/software/linux/dist/scientificlinux/7x/x86_64/GPFS"
Alias /ext/hpc-extra "/afs/psi.ch/software/linux/dist/scientificlinux/7x/x86_64/hpc-extra"
Alias /ext/lmu "/afs/psi.ch/project/lmu/lmu_rpm/"
Alias /ext/ofed "/afs/psi.ch/software/linux/dist/scientificlinux/7x/x86_64/OFED"
Alias /ext/slurm "/afs/psi.ch/software/linux/dist/scientificlinux/7x/x86_64/slurm"
Alias /ext/tier3 "/afs/psi.ch/software/linux/dist/scientific/6/tier3"
```
The httpd config files are located here:
```
[root@lxweb00 conf.d]# ls -la /etc/httpd/conf.d/25*
-rw-r--r-- 1 root root 3294 Dec 3 07:55 /etc/httpd/conf.d/25-linux.web.psi.ch_non_ssl.conf
-rw-r--r-- 1 root root 3559 Dec 3 07:55 /etc/httpd/conf.d/25-linux.web.psi.ch_ssl.conf
```
The content is served on ports 80 and 443:
```
[root@lxweb00 conf.d]# netstat -tulnp | grep http
tcp6 0 0 :::80 :::* LISTEN 19619/httpd
tcp6 0 0 :::443 :::* LISTEN 19619/httpd
```
The https certificate is located/installed in `/etc/pki/tls`
# Questions
- Who is taking care of this certificate? How is it installed? How is the expiration monitored?
- The owner/admin of this system must take care of the certificate. There is no monitoring or automation. The standard SWITCH procedure is to be used.
- Why is "/afs/psi.ch/service/linux/tftpboot" exported on this server as well?
- Don't know, could only guess.
- Who is the responsible or contact person for the different exported AFS directories?
- I don't know, it's not really formalized. I have some guesses for some parts:
| AFS | Responsible / Contact Person |
| ------ | ------ |
| /afs/psi.ch/project/linux/www/dist | |
| /afs/psi.ch/project/linux/www/kickstart | |
| /afs/psi.ch/project/linux/www/mirror | |
| /afs/psi.ch/service/linux/tftpboot | (why is this needed at all ?)|
| /afs/psi.ch/project/cpt/repo/ | Gilles Martin |
| /afs/psi.ch/project/gfa-controls-sw-repo | Rene Kapeller|
| /afs/psi.ch/software/linux/dist/scientificlinux/7x/x86_64/GPFS | Leo's group |
| /afs/psi.ch/software/linux/dist/scientificlinux/7x/x86_64/hpc-extra | Marc & Ivano |
| /afs/psi.ch/project/lmu/lmu_rpm/ | Andrea Raselli |
| /afs/psi.ch/software/linux/dist/scientificlinux/7x/x86_64/OFED | Marc & Ivano |
| /afs/psi.ch/software/linux/dist/scientificlinux/7x/x86_64/slurm | Marc & Ivano |
| /afs/psi.ch/software/linux/dist/scientific/6/tier3 | Derek? |
+10
@@ -0,0 +1,10 @@
This machine is a RHEL7 machine and is puppet managed:
https://git.psi.ch/linux-infra/data-pli/blob/master/default/metrics00.psi.ch.yaml
Runs the grafana frontend service at https://metrics.psi.ch, as part of the Telegraf, InfluxDB and Grafana stack.
There are two main processes on this server:
- /usr/sbin/grafana-server
- /usr/sbin/httpd
The installation is done by a puppet role: https://git.psi.ch/linux-infra/puppet/blob/preprod/code/modules/profile/manifests/grafana.pp
+14
@@ -0,0 +1,14 @@
# How to grant a person access to bob/sysdb
bob makes http calls to the sysdb app. Authorization (https://git.psi.ch/linux-infra/sysdb#authentication-and-authorization) is done via krb5 tokens. Operations outside of environments (creating environments, changing their owner, deleting them) need to be done by a sysdb admin, i.e. someone who is a member of the group sysdb-admins. Group membership of the authenticated user is evaluated at the OS level on boot00, so group memberships can be set either locally or in the AD. This makes it a bit confusing, but both are used.
The sysdb-admins group specifically is a local group, see boot00:/etc/group
For the envs (bob env list), only adding and listing are implemented in bob; any other operation, like deletion or modification, can only be performed in the sysdb sqlite database itself.
Each env can only have one user and one group assigned to it.
To grant access to the different environments' data-xxx repositories, normal Git access control is used.
Nothing overrides the access control of the git server.
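As a sketch, checking and extending the local sysdb-admins group on boot00 could look like this (the username is a placeholder):
```
# show current members of the local group
getent group sysdb-admins
# add a user to the local group (placeholder username)
usermod -aG sysdb-admins exampleuser
```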
+27
@@ -0,0 +1,27 @@
**Adding a new RHEL version to the RHEL7 install mechanism**
Download the iso image on repo00 from https://id-sat-prd.ethz.ch/pub/isos/
```
[root@repo00 ~]# cd /var/www/html/iso/
[root@repo00 iso]# wget https://id-sat-prd.ethz.ch/pub/isos/7Server/rhel-server-7.9-x86_64-dvd.iso
[root@repo00 iso]# systemctl restart pli-mount-iso-images.service
```
The service restart mounts the iso as a loop device into a directory of the same name.
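Conceptually, the service does something like the following for each image (a sketch; the actual service script may differ):
```
# mount an iso read-only as a loop device into a directory of the same name
iso=/var/www/html/iso/rhel-server-7.9-x86_64-dvd.iso
mkdir -p "${iso%.iso}"
mount -o loop,ro "$iso" "${iso%.iso}"
```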
Then the ipxe and grub templates of the sysdb have to be edited to add the new version:
https://git.psi.ch/linux-infra/sysdb/blob/prod/sysdb/ipxe_template.py
https://git.psi.ch/linux-infra/sysdb/blob/prod/sysdb/grub_template.py
Once the change is committed, the changes have to be pulled on boot00:
```
[root@boot00 ~]# cd /var/www/sysdb/app/
[root@boot00 app]# git pull
[root@boot00 app]# systemctl restart httpd
```
The changes only go live after a restart of httpd.
+46
@@ -0,0 +1,46 @@
Runs the puppet server for RHEL7.
The main code repositories are synced into:
* /srv/puppet/code/base/preprod/
* /srv/puppet/code/base/prod/
Other optional environments can be arbitrarily created and immediately used under this path:
* /srv/puppet/code/dev/envs/
At https://puppet00.psi.ch/, a small web app to delete server-side certificates is available. Authentication uses LDAP against the AD, but access rights are granted from /etc/httpd/conf.d/ssl.conf.
**Branches**
You can create a branch to develop new code from the master branch of the puppet repository. To test the code, a directory with the same name as the branch can be created at puppet00:/srv/puppet/code/dev/envs/. Upon creating the directory, preprod gets rsynced into it. If the branch already exists and needs to be pulled, that can be done via the command:
```
git pull origin xyz
```
This can then be tested on any controlled host by running:
```
puppet agent -t --environment=xyz
```
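Putting the branch workflow together, an end-to-end test of a hypothetical branch `xyz` might look like this (paths as described above):
```
# on puppet00: create the environment directory (preprod gets rsynced in automatically)
mkdir /srv/puppet/code/dev/envs/xyz
cd /srv/puppet/code/dev/envs/xyz
git pull origin xyz
# on any managed host: run the agent against the new environment
puppet agent -t --environment=xyz
```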
**Merge process**
Merge meetings are usually held weekly. To record the meeting a https://git.psi.ch/linux-infra/org/wikis/meeting_reports/YYYY-MM-DD page is to be created based on the https://git.psi.ch/linux-infra/org/wikis/merge-meeting-guidelines template.
**Modules**
The modules, which are not part of the base repo are to be pulled into /srv/puppet/code/dev/envs/(pre)prod/code/modules/
The correct way to pull the modules is with librarian. However, at this time the Puppetfile contains "prod" or "production" as the version for some of the modules, which librarian cannot understand. As a result, it falls back to the Puppetfile.lock, where the commit of the initial pull is saved. As long as the lock is present, librarian will always pull the commit saved there; it will not pull the latest commit, and will even revert if the latest commit was pulled manually.
The solution is to always run librarian with the lock file removed:
```
[root@puppet00 prod]# cd /srv/puppet/code/base/prod
[root@puppet00 prod]# rm -f Puppetfile.lock
[root@puppet00 prod]# /opt/puppetlabs/puppet/bin/librarian-puppet install --path=code/modules
```
This way the latest commit will be pulled for all incorrectly defined modules.
+117
@@ -0,0 +1,117 @@
The server is a RHEL8 machine, installed manually and registered directly with redhat.com, so that it is completely independent from anything else at PSI.
For historical reasons, the tftpboot directory is hosted on AFS. For an unknown reason, the RHEL8 tftpd cannot read the files from AFS (it's not SELinux), so the data is mirrored to the local drive and served from this copy.
There is a cron job defined in /etc/crontab that syncs the content of the AFS directory to the local `/tftpboot` directory.
This job runs **every minute** and is defined as follows:
```
* * * * * root rsync -aah --exclude '*rhel-8-poc*' --delete /afs/psi.ch/service/linux/tftpboot/ /tftpboot
```
This server hosts the tftp service (port 69) used for PXE booting.
Permissions of the /tftpboot directory:
```bash
[root@pxeserv01 ~]# du -sh /tftpboot/
5.2G /tftpboot/
[root@pxeserv01 ~]# ls -lad /tftpboot/
drwxr-xr-x 13 5122 840 4096 Mar 4 17:26 /tftpboot/
```
Permissions of the /afs tftpboot directory:
```
[root@pxeserv01 tftpboot]# fs listacl
Access list for . is
Normal rights:
psi:nodes rl
svc.linux:administrators rlidwka
svc.linux:pxe rl
svc.linux:readonly rl
svc.linux:tools rl
svc.linux:users l
web:hosts rl
[root@pxeserv01 tftpboot]# pwd
/afs/psi.ch/service/linux/tftpboot
[root@pxeserv01 tftpboot]#
```
Current members of svc.linux:administrators:
```
[klar_t@pc13255 ~]$ pts membership svc.linux:administrators
Members of svc.linux:administrators (id: -10574) are:
system:administrators
ebner
lutz_h
stadler_h
kapeller
feichtinger
barabas
sala
gsell
ozerov_d
talamo_i
dorigo_a
nazlikul_m
caubet_m
klar_t
taylor_j
spreitzer_s
```
# Important
For any change to the PXE config settings there may be a delay of up to one minute before clients see it (the rsync cron job runs every minute)!
# pxelinux.cfg directory
(/afs/psi.ch/service/linux/tftpboot/pxelinux.cfg, mirrored to /tftpboot/pxelinux.cfg)
There are several syntax variants that can be used for pxelinux file names: hostnames, IP addresses, hex-encoded expressions of either an IP or a subnet, partial or full MACs, etc. For details see: https://wiki.syslinux.org/wiki/index.php?title=PXELINUX
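For illustration, the search order PXELINUX walks through for a client is roughly the following (example values; gethostip is part of the syslinux tools):
```
# PXELINUX looks for config files in this order (example client):
#   pxelinux.cfg/01-<mac, dash-separated>   e.g. 01-00-50-56-aa-bb-cc
#   pxelinux.cfg/<hex-encoded IP>           e.g. 8181BE3B
#   ... progressively shorter prefixes:     8181BE3, 8181BE, ..., 8
#   pxelinux.cfg/default
gethostip -x 129.129.190.59   # prints 8181BE3B, the hex form used in the filename
```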
# Questions
- Is there a special tftpd configuration? If yes, where?
- It's not really a config, but the systemd unit file, which is changed from the default:
```
/usr/lib/systemd/system/tftp.service
```
- chronyd has a port open on 323; what is this needed for?
- Nothing special or custom; that's just how chronyd works by default.
- Are firewall rules set explicitly? Current firewall rules:
```
[root@pxeserv01 ~]# firewall-cmd --list-all
public (active)
target: default
icmp-block-inversion: no
interfaces: ens192
sources:
services: cockpit dhcpv6-client ssh tftp
ports:
protocols:
masquerade: no
forward-ports:
source-ports:
icmp-blocks:
rich rules:
[root@pxeserv01 ~]#
```
- answer:
- tftp is opened up additionally, nothing more
```
firewall-cmd --zone=public --add-service=tftp --permanent
firewall-cmd --zone=public --add-service=tftp
```
- Is this service also used for Windows systems? (I can see a `uefiwin` directory in the tftpboot directory.)
- Yes, this is the one and only tftp server for all PSI networks.
- The one person I ever talked to about PXE booting and Windows was Niklaus Baumann.
- Can you please explain the structure of the /tftpboot directory: which directories are populated by which services/users? Who is managing all this content? Is there additional documentation on this?
- Not really; it's a mess I inherited. In legacy mode, pxelinux.0 is loaded; on UEFI, it is grubx64.efi. These have their configs in pxelinux.cfg and grub.cfg respectively.
+125
@@ -0,0 +1,125 @@
This machine is a RHEL7 system **not** under Puppet control. This machine has no AFS dependencies.
The basic service provided by this system is: httpd
This machine acts as a mirror for the RHEL7 iso images.
The installer iso images are downloaded from https://id-sat-prd.ethz.ch/pub/isos/ and put into /var/www/html/iso manually
The iso images in `/var/www/html/iso` are then automatically mounted as loop devices by the `pli-mount-iso-images.service`:
```
[root@repo00 ~]# df -kh | grep /var/www
/dev/mapper/vg_repo-repofiles 1.4T 1003G 373G 73% /var/www/html
/dev/loop2 7.1G 7.1G 0 100% /var/www/html/iso/HP.SPP.2020.03
/dev/loop3 903M 903M 0 100% /var/www/html/iso/IP330.2019_0207.248
/dev/loop4 12M 12M 0 100% /var/www/html/iso/Memtest86-7.5
/dev/loop6 158M 158M 0 100% /var/www/html/iso/MLNX_OFED_LINUX-4.1-1.0.2.0-rhel7.4-x86_64
/dev/loop7 163M 163M 0 100% /var/www/html/iso/MLNX_OFED_LINUX-4.2-1.0.0.0-rhel7.4-x86_64
/dev/loop9 163M 163M 0 100% /var/www/html/iso/MLNX_OFED_LINUX-4.2-1.2.0.0-rhel7.4-x86_64
/dev/loop10 275M 275M 0 100% /var/www/html/iso/MLNX_OFED_LINUX-4.7-1.0.0.1-rhel7.6-x86_64
/dev/loop11 275M 275M 0 100% /var/www/html/iso/MLNX_OFED_LINUX-4.7-1.0.0.1-rhel7.7-x86_64
/dev/loop12 5.5G 5.5G 0 100% /var/www/html/iso/P03093_001_spp-Gen8.1-SPPGen81.4
/dev/loop13 5.7G 5.7G 0 100% /var/www/html/iso/P14481_001_spp-2019.03.0-SPP2019030.2019_0206.85
/dev/loop14 5.8G 5.8G 0 100% /var/www/html/iso/P19473_001_spp-2019.09.0-SPP2019090.2019_0905.39
/dev/loop15 7.0G 7.0G 0 100% /var/www/html/iso/P26228_001_spp-2019.12.0-SPP2019120.2019_1209.4
/dev/loop17 7.9G 7.9G 0 100% /var/www/html/iso/rhel-8.2-x86_64-dvd
/dev/loop18 8.9G 8.9G 0 100% /var/www/html/iso/rhel-8.3-x86_64-dvd
/dev/loop19 3.8G 3.8G 0 100% /var/www/html/iso/rhel-server-7.4-x86_64-dvd
/dev/loop20 4.4G 4.4G 0 100% /var/www/html/iso/rhel-server-7.5-x86_64-dvd
/dev/loop21 4.2G 4.2G 0 100% /var/www/html/iso/rhel-server-7.6-x86_64-dvd
/dev/loop22 4.2G 4.2G 0 100% /var/www/html/iso/rhel-server-7.7-x86_64-dvd
/dev/loop23 4.3G 4.3G 0 100% /var/www/html/iso/rhel-server-7.8-x86_64-dvd
/dev/loop24 4.3G 4.3G 0 100% /var/www/html/iso/rhel-server-7.9-x86_64-dvd
```
The `pli-repo-mirror.timer` runs a daily sync, which pulls the repos into `/var/www/html/el7/sources`. The name is misleading: these are actually all the latest repos.
From the above, a weekly snapshot is taken by the `pli-repo-snapshot.timer`.
The `/opt/pli/libexec/pli-repo-zoom.sh` script is run from `/etc/crontab`; it maintains the zoom repo at /var/www/html/zoom/:
```
23 23 * * * root /opt/pli/libexec/pli-repo-zoom.sh
```
The scripts and files in /opt/pli (except the crontab entry) can be found in this repository:
https://git.psi.ch/linux-infra/repo00_pli-scripts
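To see when these units last ran and will next run, the standard systemd tooling applies (a sketch, using the unit names mentioned above):
```
systemctl list-timers 'pli-*'
systemctl status pli-repo-mirror.timer pli-repo-snapshot.timer pli-mount-iso-images.service
```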
Provided http services:
```
[root@repo00 ~]# netstat -tulnp | grep http
tcp6 0 0 :::80 :::* LISTEN 11278/httpd
tcp6 0 0 :::443 :::* LISTEN 11278/httpd
[root@repo00 ~]#
```
The httpd configuration can be found in /etc/httpd/conf.d
```
[root@repo00 ~]# ls -l /etc/httpd/conf.d/
total 12
-rw-r--r--. 1 root root 694 Apr 9 2019 25-repo00.psi.ch_non_ssl.conf
-rw-r--r--. 1 root root 1131 Apr 9 2019 25-repo00.psi.ch_ssl.conf
-rw-r--r--. 1 root root 366 Oct 9 2020 README
[root@repo00 ~]#
```
The https certificate is located in `/etc/pki/tls/`.
# Questions / TODO
- I put the /opt/pli directory under git control; the repo is https://git.psi.ch/linux-infra/repo00_pli-scripts. Ideally the pli-* service files in /etc/systemd/system should be replaced with links to the /opt/pli/systemd/pli* files. Could you please do that and test whether things still work?
- SELinux is enforcing; this will not work.
- Could you please replace the `/etc/crontab` entry with a systemd service and timer, put these two files also in /opt/pli/systemd, and link them in /etc/systemd/system? This way this functionality is also version controlled.
- Timer added and cron removed
- Can you explain a little bit more the structure of the /var/www/html/ directory (what is where, who is responsible for certain directories, what is it needed for, ...)? The content of the web directory:
```
[root@repo00 ~]# ls -la /var/www/html/
total 56
drwxr-xr-x. 11 root root 4096 Mar 29 11:32 .
drwxr-xr-x. 4 root root 31 Oct 9 2020 ..
drwxr-xr-x. 7 root root 71 Apr 12 2019 el7
drwxr-xr-x. 3 root root 16 Sep 21 2020 fcos
drwxr-xr-x. 3 root root 4096 Apr 24 2020 HP.FW.RPMs
drwxr-xr-x. 23 root root 4096 Apr 12 14:27 iso
-rw-r--r--. 1 root root 8605 Jun 11 2019 lxdev00.ks
-rw-r--r--. 1 root root 8604 Jun 13 2019 lxdev01.ks
drwxr-xr-x. 5 root root 4096 Oct 30 2018 mt86
drwxr-xr-x. 2 root root 87 Aug 31 2020 ppc
drwxr-xr-x. 5 root root 69 Apr 24 2020 rhcos
-rw-r--r--. 1 root root 356 Feb 18 13:58 rhel7_hashes.txt
-rw-r--r--. 1 root root 211 Nov 27 2018 rhel8.ipxe
drwxr-xr-x. 25 root root 4096 Nov 21 2019 yum
drwxr-xr-x. 3 root root 4096 Apr 12 23:23 zoom
```
* el7 - where the automated mirroring and snapshotting is done
* iso - where the images are placed and mounted
* zoom - zoom repo
The rest was put there by hand. Much of it is probably not needed, but I wouldn't know who needed it.
- Is there any additional documentation on how this system was set up? Where can I find it? If not, could you add some more details here about which packages and configs are important (besides the /opt/pli scripts/services)?
- I know of no further documentation; it was set up by Kai years ago. It would take quite a bit of trial and error to reproduce.
- Is the pli-mount-iso-images.service run manually? I do not see any timer/watchdog that executes it periodically or when new .iso files appear.
- It is an enabled service, so it runs once at system boot automatically. Changes are not otherwise monitored; if one puts an iso there and wants it mounted, a manual restart of the service is required.
- Is the mentioned httpd config everything that is needed? Who is taking care of this certificate? How is it installed? How is the expiration monitored?
- I don't think anything further is needed. The cert is requested from SWITCH and placed here manually. It is not monitored; the owner/admin of this server must take care of this.
- Could you replace the files in /etc/httpd/conf.d/25* with links to /opt/pli/httpd/25* and see whether things still work (this way the httpd config would also be versioned)?
- No, SELinux.
- Who does this repo belong to? https://repo00.psi.ch/mt86/ (I guess mt86 is a person's short code; unfortunately I cannot find this code in the phonebook.)
- It's a memory test for x86 systems (Memtest86), not a person at PSI.
+25
@@ -0,0 +1,25 @@
This server exports all repositories for RHEL8 and uses btrfs to keep weekly snapshots of them. The btrfs file system is mounted at /var/www/html; here the 'hot' directory is a subvolume and the 'cold-*' directories are snapshots of it.
This system forms an integral part of the psi.packer ansible role.
Other than the initial setup of the btrfs, everything under /var/www/html/ is created and modified automatically. No file here should ever be touched manually.
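For illustration, the weekly snapshot conceptually amounts to something like this (a sketch; the real sync_snap.sh described below may differ):
```
# create a read-only snapshot of the 'hot' subvolume as a dated 'cold' copy
btrfs subvolume snapshot -r /var/www/html/hot "/var/www/html/cold-$(date +%F)"
# list the subvolumes/snapshots under the mount point
btrfs subvolume list /var/www/html
```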
The scripts and configuration files for the reposync system are found under /srv and are triggered by /etc/crontab.
The contents of /srv :
Directories:
* **gpg** - contains the GPG files used by the repos
* **repos** - must have one repo file for each repo to be mirrored; the repo file's name must match the repo defined inside it, and this name must only contain alphanumeric characters or underscores (no dashes or dots)
* **rhn** - contains the certificates that enable access to the Red Hat CDN. The TLS files are actively pulled from pxeserv01, as they expire and get renewed with some frequency.
* **zoom** - this is where the zoom repo is created
Files:
* **header.sh** - Creates the /var/www/html/hot/stat/ files and the header.html, which all become part of each snapshot
* **list.sh** - Creates the list.html from the stat files
* **sync_nosnap.sh** - Runs daily, runs the sync and list scripts
* **sync.sh** - run by the other two sync scripts to perform the actual reposync; creates the list.api file and runs the header script
* **sync_snap.sh** - runs once a week: runs the sync, creates a snapshot, updates the local host and reboots
* **zoom.sh** - script to pull the zoom.rpm and create the repo
+3
@@ -0,0 +1,3 @@
Manually installed following the https://docs.rocket.chat/installation/manual-installation/centos guide.
https is implemented using httpd as a reverse-proxy front end.
+7
@@ -0,0 +1,7 @@
Manually installed RHEL7, registered directly with redhat.com.
Standard RH Satellite installation through the puppet-based foreman installer. To install any extra packages or perform upgrades, the foreman-maintain command has to be used.
* **/root/httpd_psi_hack.sh** - enables unauthenticated access to the repositories stored on the host. Needs to be run after most foreman-maintain runs.
In case of issues with the manifest or anything else around the redhat.com account, contact **rhn@id.ethz.ch**
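For example, installing an extra package goes through foreman-maintain so the Satellite services stay consistent (a sketch; the package name is a placeholder):
```
# install an extra package via foreman-maintain (wraps yum; placeholder package)
foreman-maintain packages install vim-enhanced
# afterwards, re-enable unauthenticated repo access as noted above
/root/httpd_psi_hack.sh
```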
+120
@@ -0,0 +1,120 @@
This config covers automatic ssh gateway selection and recursive proxy jumping, as of April 2020, for all PSI networks I know about.
**Operating principles**
* The Match directives select the gateway to use. As the config is used when contacting the gateways themselves, recursion is built in.
* The Control directives make a second connection to a host reuse the active socket, so it does not require authentication again. Especially useful for wmgt with the RSA login.
* A default username is specified in case it differs from the AD user.
* The identity file specifies the CA-signed key.
```
## Network matches, exclusions at the beginning
Match exec "host %h | cut -d ' ' -f 4 | grep -vE '^(129\.129\.194\.98|129\.129\.190\.25|129\.129\.146\.12[1357]|129\.129\.146\.119|129\.129\.146\.15[45]|129\.129\.146\.20)'| grep -qE '^(10\.129\.1[69]0\.|10\.33\.120\.|172\.24\.5\.|192\.33\.12[07]\.|192\.168\.[18]\.|192\.168\.13\.|192\.168\.71\.|192\.33\.126\.[34]|129\.129\.146\.|129\.129\.15[078]\.|129\.129\.160\.|129\.129\.18[89]\.|129\.129\.19[045]\.|129\.129\.230\.|129\.129\.24[01]\.|192\.33\.126\.|172\.24\.6|129\.129\.95\.)'"
ProxyJump wmgt01
Match exec "host %h | cut -d ' ' -f 4 |grep -vE '(172\.24\.6\.34)'| grep -qE '^(129\.129\.8[789]\.|172\.24\.6\.|172\.24\.52\.|172\.24\.42\.)'"
ProxyJump cptgate01.psi.ch
Match exec "host %h | cut -d ' ' -f 4 |grep -qE '(172\.23\.9[89]\.)'"
ProxyJump esi-gw.psi.ch
Match exec "host %h | cut -d ' ' -f 4 | grep -qE '^(172\.20\.3\.)'"
ProxyJump sls-gw.psi.ch
## gw excluded from the wmgt01 batch
Match exec "host %h | cut -d ' ' -f 4 | grep -qE '^(172\.21\.1[012]\.)'"
ProxyJump fin-gw.psi.ch
## gw excluded from the wmgt01 batch
Match exec "host %h | cut -d ' ' -f 4 | grep -qE '^(172\.21\.70\.)'"
ProxyJump trfcb-gw.psi.ch
## gw excluded from the wmgt01 batch
Match exec "host %h | cut -d ' ' -f 4 | grep -qE '^(172\.25\.11\.|172\.25\.60\.)'"
ProxyJump proscan-gw.psi.ch
## gw excluded from the wmgt01 batch
Match exec "host %h | cut -d ' ' -f 4 | grep -qE '^(172\.19\.10\.|172\.22\.120\.)'"
ProxyJump hipa-gw.psi.ch
## gw excluded from the wmgt01 batch
Match exec "host %h | cut -d ' ' -f 4 | grep -qE '^(129\.129\.242\.)'"
ProxyJump saresa-gw.psi.ch
## gw excluded from the wmgt01 batch
Match exec "host %h | cut -d ' ' -f 4 | grep -qE '^(129\.129\.243\.)'"
ProxyJump saresb-gw.psi.ch
## gw excluded from the wmgt01 batch
# jump host doesn't exist ???
#Match exec "host %h | cut -d ' ' -f 4 | grep -qE '^(129\.129\.242\.)'"
#ProxyJump sls-proscan.psi.ch
Match exec "host %h | cut -d ' ' -f 4 | grep -qE '^(172\.26\.[08]\.|172\.26\.16\.|172\.26\.24\.|172\.26\.32\.|172\.26\.40\.|172\.26\.110\.|172\.26\.120\.)'"
ProxyJump sf-gw.psi.ch
## gw excluded from the wmgt01 batch
Match exec "host %h | cut -d ' ' -f 4 | grep -vE '(129\.129\.98\.12)' | grep -qE '^(129\.129\.98\.)'"
ProxyJump x01dc-gw.psi.ch
Match exec "host %h | cut -d ' ' -f 4 | grep -vE '(129\.129\.99\.12)' | grep -qE '^(129\.129\.99\.)'"
ProxyJump x02da-gw.psi.ch
Match exec "host %h | cut -d ' ' -f 4 | grep -vE '(129\.129\.101\.12)' | grep -qE '^(129\.129\.101\.)'"
ProxyJump x03ma-gw.psi.ch
Match exec "host %h | cut -d ' ' -f 4 | grep -vE '(129\.129\.102\.12)' | grep -qE '^(129\.129\.102\.)'"
ProxyJump x03da-gw.psi.ch
Match exec "host %h | cut -d ' ' -f 4 | grep -vE '(129\.129\.104\.12)' | grep -qE '^(129\.129\.104\.)'"
ProxyJump x04sa-gw.psi.ch
Match exec "host %h | cut -d ' ' -f 4 | grep -vE '(129\.129\.105\.12)' | grep -qE '^(129\.129\.105\.)'"
ProxyJump x04db-gw.psi.ch
Match exec "host %h | cut -d ' ' -f 4 | grep -vE '(129\.129\.106\.12)' | grep -qE '^(129\.129\.106\.)'"
ProxyJump x05la-gw.psi.ch
Match exec "host %h | cut -d ' ' -f 4 | grep -vE '(129\.129\.107\.12)' | grep -qE '^(129\.129\.107\.)'"
ProxyJump x05da-gw.psi.ch
Match exec "host %h | cut -d ' ' -f 4 | grep -vE '(129\.129\.109\.12)' | grep -qE '^(129\.129\.109\.)'"
ProxyJump x06sa-gw.psi.ch
Match exec "host %h | cut -d ' ' -f 4 | grep -vE '(129\.129\.110\.12)' | grep -qE '^(129\.129\.110\.)'"
ProxyJump x06da-gw.psi.ch
Match exec "host %h | cut -d ' ' -f 4 | grep -vE '(129\.129\.126\.12)' | grep -qE '^(129\.129\.126\.)'"
ProxyJump x06mx-gw.psi.ch
Match exec "host %h | cut -d ' ' -f 4 | grep -vE '(129\.129\.112\.12)' | grep -qE '^(129\.129\.112\.)'"
ProxyJump x07ma-gw.psi.ch
Match exec "host %h | cut -d ' ' -f 4 | grep -vE '(129\.129\.113\.12)' | grep -qE '^(129\.129\.113\.)'"
ProxyJump x07da-gw.psi.ch
Match exec "host %h | cut -d ' ' -f 4 | grep -vE '(129\.129\.116\.12)' | grep -qE '^(129\.129\.116\.)'"
ProxyJump x09lb-gw.psi.ch
Match exec "host %h | cut -d ' ' -f 4 | grep -vE '(129\.129\.117\.12)' | grep -qE '^(129\.129\.117\.)'"
ProxyJump x09la-gw.psi.ch
Match exec "host %h | cut -d ' ' -f 4 | grep -vE '(129\.129\.118\.12)' | grep -qE '^(129\.129\.118\.)'"
ProxyJump x10sa-gw.psi.ch
Match exec "host %h | cut -d ' ' -f 4 | grep -vE '(129\.129\.119\.12)' | grep -qE '^(129\.129\.119\.)'"
ProxyJump x10da-gw.psi.ch
Match exec "host %h | cut -d ' ' -f 4 | grep -vE '(129\.129\.121\.12)' | grep -qE '^(129\.129\.121\.)'"
ProxyJump x11ma-gw.psi.ch
Match exec "host %h | cut -d ' ' -f 4 | grep -vE '(129\.129\.122\.12)' | grep -qE '^(129\.129\.122\.)'"
ProxyJump x12sa-gw.psi.ch
Host *
User klar_t
IdentityFile ~/.ssh/id_rsa-cert.pub
PubkeyAcceptedKeyTypes ecdsa-sha2-nistp256,ecdsa-sha2-nistp256-cert-v01@openssh.com,sk-ecdsa-sha2-nistp256@openssh.com,sk-ecdsa-sha2-nistp256-cert-v01@openssh.com,ecdsa-sha2-nistp384,ecdsa-sha2-nistp384-cert-v01@openssh.com,ecdsa-sha2-nistp521,ecdsa-sha2-nistp521-cert-v01@openssh.com,ssh-ed25519,ssh-ed25519-cert-v01@openssh.com,sk-ssh-ed25519@openssh.com,sk-ssh-ed25519-cert-v01@openssh.com,rsa-sha2-256,rsa-sha2-256-cert-v01@openssh.com,rsa-sha2-512,rsa-sha2-512-cert-v01@openssh.com,ssh-rsa,ssh-rsa-cert-v01@openssh.com,ssh-dss,ssh-dss-cert-v01@openssh.com
ControlMaster auto
ControlPath ~/.ssh/cm_socket/%r@%h:%p
```
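Note that ssh does not create the ControlPath directory on its own; it has to exist before the multiplexing sockets can be used (one-time setup):
```
mkdir -p ~/.ssh/cm_socket
chmod 700 ~/.ssh/cm_socket
```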