diff --git a/infrastructure-guide/icinga2.md b/infrastructure-guide/icinga2.md
index cc3b163b..2d1c844a 100644
--- a/infrastructure-guide/icinga2.md
+++ b/infrastructure-guide/icinga2.md
@@ -5,11 +5,11 @@ We want to support monitoring of the Linux machines in Icinga2. The Icinga2 infr
 ## Icinga2 Servers
 
 - PROD [monitoring.psi.ch](https://monitoring.psi.ch/) (Loadbalancer)
-  - Primary [vemonma01a.psi.ch](https://vemonma01a.psi.ch/) (with Director)
+  - Primary [vemonma01a.psi.ch](https://vemonma01a.psi.ch/) (with Icinga Director)
   - Secondary [wmonma01b.psi.ch](https://wmonma01b.psi.ch/)
 
 - DEV
-  - Primary [vmonma02a.psi.ch](https://vmonma02a.psi.ch/) (with Director)
+  - Primary [vmonma02a.psi.ch](https://vmonma02a.psi.ch/) (with Icinga Director)
   - Secondary [vmonma02b.psi.ch](https://vmonma02b.psi.ch/)
 
 
@@ -17,9 +17,9 @@ We want to support monitoring of the Linux machines in Icinga2. The Icinga2 infr
 
 The Linux part of the Icinga2 Master configuration is manged using Ansible in the [`icinga_master` role in the `bootstrap` repo](https://git.psi.ch/linux-infra/bootstrap/-/tree/prod/ansible/roles/icinga_master).
 
-For Puppet managed nodes there is an automated import pipeline using the Director. For the central infrastructure itself there is a predefined Configuration Basket snapshot which is installed by manual Ansible run.
+For Puppet managed nodes there is an automated import pipeline using the Icinga Director. For the central infrastructure itself there is a predefined Configuration Basket snapshot which is installed by manual Ansible run.
 
-Configuration which is shared and used by both type of systems are found in the [`awi-lx-basic` Director Configuration Basket](https://git.psi.ch/linux-infra/bootstrap/-/blob/prod/ansible/roles/icinga_master/files/etc/icingaweb2/psi/lx-core/Director-Basket_awi-lx-basic.json)
+Configuration which is shared and used by both types of systems is found in the [`awi-lx-basic` Configuration Basket](https://git.psi.ch/linux-infra/bootstrap/-/blob/prod/ansible/roles/icinga_master/files/etc/icingaweb2/psi/lx-core/Director-Basket_awi-lx-basic.json)
 
 ### Puppet Managed Nodes
 
@@ -30,7 +30,7 @@ The individual host configuration is automatically generated using already known
 
 TODO: diagram, details how this is achieved
 
-The Icinga2 Director import pipeline is provides as [Configuration Basket template `awi-lx-sysdb`](https://git.psi.ch/linux-infra/bootstrap/-/blob/prod/ansible/roles/icinga_master/templates/Director-Basket_awi-lx-sysdb.json)
+The Icinga Director import pipeline is provided as [Configuration Basket template `awi-lx-sysdb`](https://git.psi.ch/linux-infra/bootstrap/-/blob/prod/ansible/roles/icinga_master/templates/Director-Basket_awi-lx-sysdb.json)
 
 #### Import of Hiera Data to Sysdb
 
@@ -40,9 +40,9 @@ The Icinga2 Director import pipeline is provides as [Configuration Basket templa
 
 ### Ansible Managed Central Infrastructure (e.g. Puppet Server)
 
-## Development of Director Import Pipeline
+## Development of Icinga Director Import Pipeline
 
-The base are always the Configuration Basket snapshots (JSON files) which we have in Git. For changes either change them directly or change them in the Director web UI and then create a new snapshot of the according Configuration Basket, download it, modify if necessary:
+The basis is always the Configuration Basket snapshots (JSON files) which we have in Git. To make changes, either edit them directly or change them in the Icinga Director web UI and then create a new snapshot of the corresponding Configuration Basket, download it, and modify it if necessary:
 - if it is templated as for the Sysdb import pipeline
 - fix the definition of the Configuration Basket itself which is stringified JSON and should be plain JSON ([bug](https://github.com/Icinga/icingaweb2-module-director/issues/2774))
 and then commit it to the git repo.
@@ -53,22 +53,46 @@ Note that it will only attempt to import the Configuration Basket snapshot as pr
 rm /etc/icingaweb2/psi/lx-core/*
 ```
 
-Further there is an issue with updated Sync Rules in the Configuration Basket snapshot. There is a [bug which makes their property list not updated on import](https://github.com/Icinga/icingaweb2-module-director/issues/2779). To work around you need to delete the Sync Rule manually in the Director UI. They cannot be deleted from shell with `icingacli director` ([feature request](https://github.com/Icinga/icingaweb2-module-director/issues/2706)).
+Further there is an issue with updated Sync Rules in the Configuration Basket snapshot. There is a [bug which makes their property list not updated on import](https://github.com/Icinga/icingaweb2-module-director/issues/2779). To work around it, you need to delete the Sync Rule manually in the Icinga Director UI. Sync Rules cannot be deleted from the shell with `icingacli director` ([feature request](https://github.com/Icinga/icingaweb2-module-director/issues/2706)).
 
 ## Bootstrap
 
 The Icinga2 infrastructure is maintained and prepared by AIT. Following items need to be prepared from their side:
 
 - basic setup of Icinga2 Master
-- add the Director module
+- add the Icinga Director module
 - add Fileshipper module with following configuration (`/etc/icingaweb2/modules/fileshipper/imports.ini`):
 
-```
-[Import AWI Linux Infrastructure Servers]
-basedir = "/etc/icingaweb2/psi/lx-core"
-```
+  ```
+  [Import AWI Linux Infrastructure Servers]
+  basedir = "/etc/icingaweb2/psi/lx-core"
+  ```
 
 - in `roles.ini` have a `Generic User Role` with read/monitoring-only permissions
 - the `/etc/icingaweb2/psi/merge-roles-ini.py` script to be able to merge in roles via Ansible/Sysdb API
-
-
+From our side we need the following manual setup:
+- prepare the Scheduled Downtime `Generic Linux Alert Suppression` (cannot be imported with Configuration Basket, see [feature request](https://github.com/Icinga/icingaweb2-module-director/issues/2795)) with
+  - Downtime name: `Generic Linux Alert Suppression`
+  - Author: `Core Linux Research Services`
+  - Comment:
+    ```
+    By default managed RHEL systems do not alert or send notifications, they just collect monitoring information in Icinga2.
+    To enable alerting, set in Hiera:
+    icinga2::alerting::enable: true
+    ```
+  - Fixed: `Yes`
+  - Disabled: `No`
+  - Apply to: `Hosts`
+  - With Services: `Yes`
+  - Assign where: `host.vars.lx_disabled_alerting` `is true (or set)`
+  - and finally on "Ranges" add a range with
+    Days: `january 1 - december 31`
+    Timeperiods: `00:00-24:00`
+- run the Ansible playbook:
+  ```
+  ansible-playbook -i inventory_test.yaml --vault-pass-file ./vault-pass prepare_icinga_master.yaml
+  ```
+  or for production
+  ```
+  ansible-playbook -i inventory.yaml -i inventory_dmz.yaml --vault-pass-file ./vault-pass prepare_icinga_master.yaml
+  ```