From 4a2d81d012b233dfa38c5760c702245ad66900b7 Mon Sep 17 00:00:00 2001
From: ebner
Date: Wed, 7 Aug 2024 13:41:53 +0200
Subject: [PATCH] update documenation

---
 infrastructure-guide/influx.md                |  55 ++++++
 infrastructure-guide/influx00.md              | 157 ------------------
 .../infrastructure_systems.md                 |  22 +--
 infrastructure-guide/login.md                 |  13 +-
 .../{metrics00.md => metrics.md}              |  16 +-
 5 files changed, 72 insertions(+), 191 deletions(-)
 create mode 100644 infrastructure-guide/influx.md
 delete mode 100644 infrastructure-guide/influx00.md
 rename infrastructure-guide/{metrics00.md => metrics.md} (56%)

diff --git a/infrastructure-guide/influx.md b/infrastructure-guide/influx.md
new file mode 100644
index 00000000..06287e90
--- /dev/null
+++ b/infrastructure-guide/influx.md
@@ -0,0 +1,55 @@
+# Metrics Server - Influx DB - influx.psi.ch
+
+This is puppet managed:
+https://git.psi.ch/linux-infra/hiera/data-lx/blob/master/default/lx-influx-01.psi.ch.yaml
+
+Runs the influxdb backend for the metrics.psi.ch service, as part of the telegraph, influxdb and grafana stack.
+
+Data is stored at `/var/lib/influxdb` "locally" on the virtual machine.
+The influx configuration can be found `/etc/influxdb/influxdb.conf`
+
+
+# Administration Influx Database
+
+Connect to the influx server
+```bash
+ssh influx.psi.ch
+```
+
+Connect to the influx database:
+```bash
+influx
+```
+
+Now you should have the prompt of the influx cli client
+
+
+Show all databases
+```bash
+show databases
+```
+
+Switch to a database:
+```bash
+use "database_name"
+```
+
+Show all measurements
+```bash
+show measurements
+```
+
+Show all series (of all measurements)
+```bash
+show series
+```
+
+Show all series for a particular host:
+```bash
+show series where "host" = 'lx-puppet-01.psi.ch'
+```
+
+Delete all series for a particular host:
+```
+DROP SERIES FROM /.*/ WHERE "host" = 'lx-puppet-01.psi.ch'
+```
\ No newline at end of file
diff --git a/infrastructure-guide/influx00.md b/infrastructure-guide/influx00.md
deleted file mode 100644
index 470f0f36..00000000
--- a/infrastructure-guide/influx00.md
+++ /dev/null
@@ -1,157 +0,0 @@
-# influx00
-
-This is a RHEL7 machine and is puppet managed:
-https://git.psi.ch/linux-infra/hiera/data-lx/blob/master/default/influx00.psi.ch.yaml
-
-Runs the influxdb backend for the metrics.psi.ch service, as part of the telegraph, influxdb and grafana stack.
-
-Influx version installed:
-```
-[root@influx00 ~]# rpm -qf /usr/bin/influxd
-influxdb-1.8.3-1.x86_64
-```
-
-Open ports on this server are:
-```
-[root@influx00 influxdb]# ss -tln
-State Recv-Q Send-Q Local Address:Port Peer Address:Port
-LISTEN 0 128 *:22 *:*
-LISTEN 0 128 127.0.0.1:8088 *:*
-LISTEN 0 100 127.0.0.1:25 *:*
-LISTEN 0 5 *:5666 *:*
-LISTEN 0 128 *:111 *:*
-LISTEN 0 128 [::]:8086 [::]:*
-LISTEN 0 128 [::]:22 [::]:*
-LISTEN 0 100 [::1]:25 [::]:*
-LISTEN 0 5 [::]:5666 [::]:*
-LISTEN 0 128 [::]:111 [::]:*
-```
-
-There is no firewall running on this machine.
-
-Note: Do not update to influxdbd 2.x. The new version requires authentication by the clients, which is not implemented in puppet / telegraph.
-
-
-Data is stored at `/var/lib/influxdb` "locally" on the virtual machine.
-The influx configuration can be found `/etc/influxdb/influxdb.conf`
-
-
-# Administration
-
-Connect to the influx server
-```
-ssh influx.psi.ch
-```
-
-Connect to the influx database:
-```
-influx
-```
-
-Now you should have the prompt of the influx cli client
-
-
-Show all databases
-```
-show databases
-```
-
-Switch to a database:
-```
-use "database_name"
-```
-
-Show all measurements
-```
-show measurements
-```
-
-Show all series (of all measurements)
-```
-show series
-```
-
-Show all series for a particular host:
-```
-show series where "host" = 'puppet00.psi.ch'
-```
-
-Delete all series for a particular host:
-```
-DROP SERIES FROM /.*/ WHERE "host" = 'puppet00.psi.ch'
-```
-
-
-If there is a new influx database (telegraf_XXXX) this source has to be explicitly added to the metrics server
-
-For this, connect to:
-* https://metrics.psi.ch
-* Login
-* ![](add_new_metric_source.png)
-
-
-
-# Questions
-- Is there a more detailed documenation/script/playbook that descibes the setup of this server?
-
-  - Beyond the hiera config, the machine is set to use the influxdb role, which in turn applies the influxdb profile: https://git.psi.ch/linux-infra/puppet/blob/preprod/code/modules/profile/manifests/influxdb.pp
-
-- How was the influxdb package installed on that machine?
-
-  - From the profile and now it is locked via the versionlock yum plugin.
-
-- The storage for the data is "locally" to the virtual machine?
-
-  - Yes, all the data is stored on the VM disk image.
-
-- The configuration file `/etc/influxdb/influxdb.conf` does report that its managed via puppet. However I don't see anything in the puppet configuration https://git.psi.ch/linux-infra/hiera/data-lx/blob/master/default/influx00.psi.ch.yaml. Where does this file come from?
-```
-########################################################################
-#
-# THIS FILE IS MANAGED BY PUPPET - DO NOT MODIFY!
-#
-########################################################################
-```
-Is it this one: https://git.psi.ch/linux-infra/puppet/blob/preprod/code/modules/profile/manifests/influxdb.pp ? Through what config does this get applied to the server?
-
-  - Correct
-```
-[klart@klart ~]$ bob node list -v influx00.psi.ch
-influx00.psi.ch pli local ipxe_installer=rhel7install network=static puppet_env=pmons puppet_role=role::influxdb
-```
-
-- The influx service seems to be started by systemd, however it seems that the systemd service file does not come with a package - was this one placed manually there?
-```
-[root@influx00 ~]# rpm -qf /usr/lib/systemd/system/influxdb.service
-file /usr/lib/systemd/system/influxdb.service is not owned by any package
-[root@influx00 ~]# cat /usr/lib/systemd/system/influxdb.service
-# If you modify this, please also make sure to edit init.sh
-
-[Unit]
-Description=InfluxDB is an open-source, distributed, time series database
-Documentation=https://docs.influxdata.com/influxdb/
-After=network-online.target
-
-[Service]
-User=influxdb
-Group=influxdb
-LimitNOFILE=65536
-EnvironmentFile=-/etc/default/influxdb
-ExecStart=/usr/bin/influxd -config /etc/influxdb/influxdb.conf $INFLUXD_OPTS
-KillMode=control-group
-Restart=on-failure
-
-[Install]
-WantedBy=multi-user.target
-Alias=influxd.service
-[root@influx00 ~]#
-```
-
-- Answer:
-
-  - It is installed by the rpm, but from the install script of the rpm, not as a file.
-
-- What are the other open ports needed for? :111 :8086 :25 :8088
-
-  - 25 is postfix (SMTP), 111 belongs to NFS, the 80xx ports both belong to influx
-  - It's certainly not being used, nothing is mounted or exported via NFS. It isn't even enabled in puppet for this host. However, most of the puppet works in a way, where it installs things, but doesn't remove anything, if the settings change. So if it was enabled at any time in the past, it was just left behind. Though the NFS service is not running, only the rpcbind is.
diff --git a/infrastructure-guide/infrastructure_systems.md b/infrastructure-guide/infrastructure_systems.md
index 0c169b42..c0dd44b4 100644
--- a/infrastructure-guide/infrastructure_systems.md
+++ b/infrastructure-guide/infrastructure_systems.md
@@ -5,34 +5,30 @@ List of systems and their primary role:
 __Core Infrastructure:__
 * [boot.psi.ch](boot_server) - TFTP server for PXE booting
 * [sysdb.psi.ch](sysdb_server) - Runs sysdb, providing the dynamic iPXE, Grub and kickstart files
-* [puppet.psi.ch](puppet01) - puppet.psi.ch - 129.129.160.118 - Runs the puppet server for the RHEL7 infra
-* [repos.psi.ch](repo_server) - RPM/Yum repository server for RHEL7/8/...
+* [puppet.psi.ch](puppet01) - puppet.psi.ch - Puppet server
+* [repos.psi.ch](repo_server) - Repository server
 * [lx-sync-01.psi.ch](sync_server) - System to mirror external yum repositories / packages / ...
-* [lxweb00](lxweb00) - http://linux.web.psi.ch - 129.129.190.46 - Exports further repositories from AFS
+* [lxweb00](lxweb00) - http://linux.web.psi.ch - legacy - 129.129.190.46 - Exports further repositories from AFS
 __Additional Infrastructure__
 Sysdb Access:
-
-* [lxsup](lxsup) - Shell for linux support, primarily to run bob
+* [lxsup](lxsup) - Standard node for the linux support, primarily to run bob
 Monitoring:
 * [Icinga2](icinga2) - automatic integration into Icinga2
-
-* [influx00](influx00) - 129.129.190.225 - Influx database server
-
-* [metrics00](metrics00) - 129.129.190.226 - Grafana frontend for Influx
+* [lx-influx-01.psi.ch](influx) - Influx database server
+* [lx-metrics-01.psi.ch](metrics) - Grafana frontend for Influx
 __Enduser Systems__
-* [login](login) - 129.129.190.131 129.129.190.132 129.129.190.133 - Shell login service for users
-
-
+* [login.psi.ch](login) - Set of nodes for enduser use
+* cpw.psi.ch - Node to change passwords
@@ -42,7 +38,7 @@ __Enduser Systems__
 ## Metrics
-* [Overview Infrastructure](https://metrics.psi.ch/d/1SL13Nxmz/gfa-linux-tabular?orgId=1&from=now-6h&to=now&refresh=30s&var-env=telegraf_lx&var-host=influx00.psi.ch&var-host=lx-boot-01.psi.ch&var-host=lx-puppet-01.psi.ch&var-host=lx-repos-01.psi.ch&var-host=lx-sysdb-01.psi.ch&var-host=lxweb00.psi.ch&var-host=metrics00.psi.ch&var-host=puppet01.psi.ch)
+* [Overview Infrastructure](https://metrics.psi.ch/d/1SL13Nxmz/gfa-linux-tabular?orgId=1&from=now-6h&to=now&refresh=30s&var-env=telegraf_lx&var-host=lx-influx-01.psi.ch&var-host=lx-boot-01.psi.ch&var-host=lx-puppet-01.psi.ch&var-host=lx-repos-01.psi.ch&var-host=lx-sysdb-01.psi.ch&var-host=lxweb00.psi.ch&var-host=lx-metrics-01.psi.ch&var-host=puppet01.psi.ch)
 ## HTTPS Certificates
 * [HTTPS Certificates](https://linux.psi.ch/admin-guide/operations/certificates.html)
diff --git a/infrastructure-guide/login.md b/infrastructure-guide/login.md
index 00eacc75..3eef3dfc 100644
--- a/infrastructure-guide/login.md
+++ b/infrastructure-guide/login.md
@@ -1,12 +1,3 @@
-# login
+# Login Nodes - login.psi.ch

-This is a cluster made up of 3 hosts, which all are VMs running on the AIT vmware cluster. These machines are pretty standard RHEL7 hosts managed by Puppet: https://git.psi.ch/linux-infra/hiera/data-lx/blob/master/login.yaml.
-
-Info for users is published at https://intranet.psi.ch/de/computing/linux-login-clusters
-
-
-This is the definition who can access the loginXX nodes:
-
-* All PSI employees (unx-lx_users)
-* Everybody that can access RA, MEG, Merlin (svc-cluster_merlin6, svc-cluster_meg and svc-cluster_ra)
-* External users that are part of the group unx-lx_login_ext
+This is a set of 3 machines which have the same DNS alias (connection to a random machine) The machines are managed by Puppet: https://git.psi.ch/linux-infra/hiera/data-lx/blob/master/login.yaml.
diff --git a/infrastructure-guide/metrics00.md b/infrastructure-guide/metrics.md
similarity index 56%
rename from infrastructure-guide/metrics00.md
rename to infrastructure-guide/metrics.md
index b5716782..acd4d6a9 100644
--- a/infrastructure-guide/metrics00.md
+++ b/infrastructure-guide/metrics.md
@@ -1,16 +1,10 @@
-# metrics00
+# Metrics Server - Grafana - metrics.psi.ch

-This machine is a RHEL7 machine and is puppet managed:
-https://git.psi.ch/linux-infra/hiera/data-lx/blob/master/default/metrics00.psi.ch.yaml
+This machine is puppet managed:
+https://git.psi.ch/linux-infra/hiera/data-lx/blob/master/default/lx-metrics-01.psi.ch.yaml

 Runs the grafana frontend service at https://metrics.psi.ch, as part of the telegraph, influxdb and grafana stack.

-There are two main processes on this server:
-- /usr/sbin/grafana-server
-- /usr/sbin/httpd
-
-The installation is done by a puppet role: https://git.psi.ch/linux-infra/puppet/blob/preprod/code/modules/profile/manifests/grafana.pp
-
 The rights for Grafana are managed with these two AD groups:
 | Group | Notes |
 | ---- | ---- |
@@ -19,5 +13,7 @@ The rights for Grafana are managed with these two AD groups:
 ## Administration
-Add new metric:
+New telegraf/hiera metric sources are automatically added by a systemd timer once a day (~ 01:08).
+
+To add a new metric source manually go about like this:
 ![](add_new_metric_source.png)