============
Deployment
============

Deployment roughly has the following phases:

1. DHCP followed by PXE boot.
2. Kickstart installation followed by a reboot.
3. Initial Puppet run, followed by updates, followed by another Puppet run and a
   reboot.

PXE boot/iPXE
=============

When deployment fails during the PXE phase, it is usually due to one of the
following:

1. No network connectivity

   This is usually indicated by messages similar to ``No link on XXX``.

2. No DHCP in the connected network (e.g. DMZ, tier3)

   The DHCP requests by the BIOS/UEFI firmware will time out.

3. Firewall (no TFTP/HTTP to the relevant servers)
4. Incompatibilities between iPXE and network card (NIC)
5. Incorrect sysdb entry (and hence an incorrect iPXE entry)

If there is no DHCP, the static network information provided manually is
possibly wrong or for a different network than the one connected to the host.

Infiniband
----------

Infiniband can cause installation problems, especially in the
initial phase, when iPXE tries to load the configuration file.

As a general rule, disable PXE on all Infiniband cards.

However, this is not always enough: iPXE sometimes still recognizes
the Infiniband card as the first device (with MAC
address ``79:79:79:79:79:79``) and tries to fetch the configuration file for
that device.

Kickstart
=========

Typical problems during the Kickstart phase:

1. The Kickstart file cannot be retrieved from the boot server
   ``boot00.psi.ch``. Typically caused by incorrect sysdb entries or firewalls.
2. Partitioning fails because no disk is recognized, or the wrong disk is used.
3. Packages or other installation data cannot be downloaded. Can be caused by
   firewalls or incorrect sysdb entries.

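
When the installer does not find the expected disk, it can help to look at what
the installation environment itself sees. A minimal sketch, assuming the
installer offers a shell and that ``lsblk`` and ``ip`` are available there;
``show_install_env`` is a hypothetical helper name::

    # Hypothetical helper: show disks and network state as seen by the installer.
    show_install_env() {
        # Whole disks only (-d), with size and model to spot a wrong target disk.
        lsblk -d -o NAME,SIZE,MODEL,TYPE
        # Configured addresses; a missing address hints at DHCP or link problems.
        ip addr show
    }

Calling ``show_install_env`` prints the disk list first, then the network
configuration.
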
First Puppet Run
================

Hiera errors are a typical problem, e.g. the following::

    # puppet agent --test
    Info: Using configured environment 'prod'
    Info: Retrieving pluginfacts
    Info: Retrieving plugin
    Info: Loading facts
    Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Evaluation Error: Error while evaluating a Resource Statement, Function lookup() did not find a value for the name 'console::mount_root' at /srv/puppet/code/dev/envs/prod/code/modules/role/manifests/console.pp:1 on node lxdev05.psi.ch
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run

The error message shows that the value for ``console::mount_root`` could not be
found in Hiera.

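
To see where Hiera looked for a missing key, the lookup can be repeated by hand.
A minimal sketch, assuming it is run on the Puppet server and that the installed
Puppet version provides ``puppet lookup``; ``explain_lookup`` is a hypothetical
helper name::

    # Hypothetical helper: explain how Hiera resolves KEY for NODE in ENVIRONMENT.
    explain_lookup() {
        key="$1"
        node="$2"
        environment="$3"
        # --explain lists the hierarchy levels and data sources that were consulted.
        puppet lookup "$key" --node "$node" --environment "$environment" --explain
    }

For the error above, the call would be
``explain_lookup console::mount_root lxdev05.psi.ch prod``.
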
Sometimes the Active Directory join fails, usually for one of these three
reasons:

- There is already an Active Directory computer object for the same system from
  a previous Windows installation. In this case, delete the computer object and
  restart the installation.
- Firewall restrictions
- Old Puppet certificates from a previous SL6 installation are used on the
  system. In this case, delete the certificates on the client with
  ``find /etc/puppetlabs -name '*.pem' -delete`` and clean up any certificates
  on the Puppet server with ``puppet cert clean $HOSTNAME``. Then restart the
  installation.

Rejoin the Active Directory
===========================

If the AD join seems to be broken (failed logins, etc.), the node can be rejoined automatically:

- remove ``/etc/krb5.keytab``
- run Puppet, e.g. with ``puppet agent --test``

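
The two steps above can be sketched as a small shell function. This is only an
illustration, assuming it is run as root on the affected node; ``rejoin_ad`` and
its optional keytab argument are hypothetical::

    # Hypothetical helper: remove the keytab, then let Puppet redo the AD join.
    rejoin_ad() {
        keytab="${1:-/etc/krb5.keytab}"
        rm -f -- "$keytab" || return 1
        # --test implies --detailed-exitcodes: exit code 2 means that changes
        # were applied, so it is not treated as a failure here.
        rc=0
        puppet agent --test || rc=$?
        [ "$rc" -eq 0 ] || [ "$rc" -eq 2 ]
    }

Calling ``rejoin_ad`` without arguments acts on ``/etc/krb5.keytab``.
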
Cannot Load YFS Kernel Module
=============================

If ``yfs-client`` does not start due to "Required key not available"::

    Sep 02 13:21:34 pc12661.psi.ch systemd[1]: Starting AuriStorFS Client Service...
    Sep 02 13:21:34 pc12661.psi.ch modprobe[29282]: modprobe: ERROR: could not insert 'yfs': Required key not available

then Secure Boot is most probably blocking the loading of the unsigned ``yfs`` kernel module.

Please disable Secure Boot in the BIOS/firmware settings.

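
Before rebooting into the firmware settings, the diagnosis can be checked from
the running system. A minimal sketch, assuming ``mokutil`` is installed;
``check_yfs_signing`` is a hypothetical helper name::

    # Hypothetical helper: report the Secure Boot state and the signer of yfs.
    check_yfs_signing() {
        # Reports whether Secure Boot is currently enabled.
        mokutil --sb-state
        # Prints the signer of the module; no output suggests it is unsigned.
        modinfo -F signer yfs
    }

If Secure Boot is reported as enabled and no signer is printed, the situation
described above is the likely cause.
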