# Deployment

A deployment roughly has the following phases:

1. DHCP followed by PXE boot.
2. Kickstart installation followed by a reboot.
3. Initial Puppet run, followed by updates, followed by another Puppet run and a reboot.

## PXE boot/iPXE

When deployment fails during the PXE phase, it is usually due to one of the following:

1. No network connectivity - This is usually indicated by messages similar to ``No link on XXX``.
2. No DHCP in the connected network (e.g. DMZ, tier3) - The DHCP requests by the BIOS/UEFI firmware will time out.
3. A firewall blocking TFTP/HTTP access to the relevant servers.
4. Incompatibilities between iPXE and the network card (NIC).
5. An incorrect sysdb entry, which results in an incorrect iPXE entry.

If there is no DHCP, the manually provided static network information may be wrong, or may belong to a different network than the one the host is connected to.
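
One way to sanity-check manually entered static settings is to verify that the host address and the gateway fall into the same network. The following is a minimal sketch in plain Bash, not part of the deployment tooling; the function names `ip_to_int` and `same_subnet` are hypothetical, and the addresses in the usage note are documentation-only example addresses.

```bash
#!/usr/bin/env bash
# Convert a dotted-quad IPv4 address (no leading zeros) to a 32-bit integer.
ip_to_int() {
    local IFS=.
    local -a o
    read -r -a o <<< "$1"
    echo "$(( (o[0] << 24) | (o[1] << 16) | (o[2] << 8) | o[3] ))"
}

# Usage: same_subnet ADDRESS1 ADDRESS2 PREFIX_LENGTH
# Succeeds if both addresses belong to the same network for the given prefix.
same_subnet() {
    local prefix=$3
    local mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
    (( ( $(ip_to_int "$1") & mask ) == ( $(ip_to_int "$2") & mask ) ))
}
```

For example, `same_subnet 192.0.2.10 192.0.2.1 24` succeeds, while `same_subnet 192.0.2.10 198.51.100.1 24` fails, which would hint at a gateway taken from a different network. The check only tests the settings for internal consistency; it cannot tell which network the host is physically connected to.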

## Infiniband

Infiniband can cause installation problems, especially in the initial phase when iPXE tries to load the configuration file. As a general rule, disable PXE on all Infiniband cards.

However, this is not always enough: iPXE sometimes still recognizes the Infiniband card as the first device (with MAC address ``79:79:79:79:79:79``) and tries to fetch the configuration file for it.
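
To decide on which cards PXE should be disabled, it can help to list the Infiniband interfaces from a Linux system running on the same hardware, for example a rescue or live system (an assumption, not part of the procedure described here). The sketch below relies on the Linux sysfs attribute `/sys/class/net/<interface>/type`, which reports ARP hardware type `32` for Infiniband; the function name `list_infiniband_ifaces` is hypothetical.

```bash
#!/usr/bin/env bash
# Print the names of all network interfaces whose ARP hardware type is
# 32 (ARPHRD_INFINIBAND). The optional argument overrides the sysfs
# directory, which is mainly useful for testing.
list_infiniband_ifaces() {
    local root=${1:-/sys/class/net}
    local dev
    for dev in "$root"/*; do
        # Skip entries without a readable type attribute.
        [ -r "$dev/type" ] || continue
        if [ "$(cat "$dev/type")" = "32" ]; then
            basename "$dev"
        fi
    done
}
```

Calling `list_infiniband_ifaces` without arguments prints one interface name per line, or nothing if no Infiniband interface is present.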

## Kickstart

Typical problems during the Kickstart phase:

1. The Kickstart file cannot be retrieved from the sysdb server __sysdb.psi.ch__. This is typically caused by incorrect sysdb entries or firewalls.
2. Partitioning fails. This can happen because no disk is recognized, or the wrong disk is used.
3. Packages or other installation data cannot be downloaded. This can be caused by firewalls or incorrect sysdb entries.
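
Since several of these failures come down to firewalls, a quick TCP reachability test from a machine in the same network can narrow things down. This is a minimal sketch using Bash's `/dev/tcp` redirection and the coreutils `timeout` command; the function name `tcp_reachable` is hypothetical, and the port in the usage note (443) is an assumption, since this page does not state which port the sysdb server uses.

```bash
#!/usr/bin/env bash
# Usage: tcp_reachable HOST PORT [TIMEOUT_SECONDS]
# Succeeds if a TCP connection can be opened within the timeout (default 5).
# A failure only shows that no connection was established; it does not tell
# whether a firewall, routing, or the service itself is at fault.
tcp_reachable() {
    local host=$1 port=$2 wait=${3:-5}
    timeout "$wait" bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}
```

A possible call is `tcp_reachable sysdb.psi.ch 443 || echo "sysdb.psi.ch:443 not reachable"`.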

## Hiera

Hiera errors are a typical problem, e.g. the following:

```bash
Info: Using configured environment 'prod'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Evaluation Error: Error while evaluating a Resource Statement, Function lookup() did not find a value for the name 'console::mount_root' at /srv/puppet/code/dev/envs/prod/code/modules/role/manifests/console.pp:1 on node lxdev05.psi.ch
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run
```

The error message shows that the value for `console::mount_root` could not be found in Hiera.
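
When such errors show up in longer logs, the missing key and the affected node can be pulled out of the message mechanically. The sketch below does this with `sed`; the function name `missing_hiera_keys` is hypothetical, and the pattern only covers the message format shown in this section.

```bash
#!/usr/bin/env bash
# Read Puppet agent output on stdin and print "<key> <node>" for every
# failed lookup() reported in the message format shown in this section.
missing_hiera_keys() {
    sed -n "s/.*did not find a value for the name '\([^']*\)'.* on node \([^ ]*\).*/\1 \2/p"
}
```

The extracted key can then be inspected on the Puppet server, for instance with `puppet lookup console::mount_root --node lxdev05.psi.ch --environment prod --explain`, assuming the `puppet lookup` command is available there.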

## Active Directory

Sometimes the Active Directory join fails, usually for one of these three reasons:

- There is already an Active Directory computer object for the same system from a previous Windows installation. In this case, delete the computer object and restart the installation.
- Firewall restrictions.
- Old Puppet certificates from a previous SL6 installation are used on the system. In this case, delete the certificates on the client with `find /etc/puppetlabs -name '*.pem' -delete` and clean up any certificates on the Puppet server with ``puppet cert clean $HOSTNAME``. Then restart the installation.

### Rejoin Active Directory

If the AD join seems to be broken (failed logins, etc.), the node can be rejoined automatically:

- Remove `/etc/krb5.keytab`.
- Run Puppet, e.g. with `puppet agent --test`.
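
The two steps can be combined into a small helper. This is only a sketch: the function names `run` and `rejoin_ad` and the `DRY_RUN` switch are hypothetical additions, and it assumes it is executed as root on the affected node.

```bash
#!/usr/bin/env bash
# Run a command, or only print it when DRY_RUN=1 is set.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

# Remove the Kerberos keytab and trigger a Puppet run to rejoin the node.
rejoin_ad() {
    run rm -f /etc/krb5.keytab
    # puppet agent --test exits with code 2 when changes were applied,
    # so a non-zero exit code is not necessarily a failure.
    run puppet agent --test
}
```

Calling `DRY_RUN=1 rejoin_ad` prints the commands without executing them, which is a cheap way to review what would happen first.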

## YFS / AFS

The ``yfs-client`` may fail to start because its kernel module cannot be loaded, with a `Required key not available` error:

```bash
Sep 02 13:21:34 pc12661.psi.ch systemd[1]: Starting AuriStorFS Client Service...
Sep 02 13:21:34 pc12661.psi.ch modprobe[29282]: modprobe: ERROR: could not insert 'yfs': Required key not available
```

In this case, Secure Boot is most probably blocking the loading of the unsigned `yfs` kernel module.

Disable Secure Boot in the BIOS/UEFI settings.
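
Before changing firmware settings, the diagnosis can be double-checked by searching the log for the modprobe error quoted in this section. The sketch below covers only that log check; the function name `yfs_blocked_by_signature` is hypothetical.

```bash
#!/usr/bin/env bash
# Read log lines on stdin; succeed if modprobe refused to load the yfs
# module with "Required key not available".
yfs_blocked_by_signature() {
    grep -q "could not insert 'yfs': Required key not available"
}
```

On an affected node this could be used as `journalctl -b | yfs_blocked_by_signature && mokutil --sb-state`, assuming the `mokutil` tool is installed; it reports whether Secure Boot is currently enabled.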