============
Deployment
============

Deployment roughly has the following phases:

1. DHCP followed by PXE boot.
2. Kickstart installation followed by a reboot.
3. Initial Puppet run, followed by updates, followed by another Puppet run and a
   reboot.

PXE boot/iPXE
=============

When deployment fails during the PXE phase, it is usually due to one of the
following:

1. No network connectivity

   This is usually indicated by messages similar to ``No link on XXX``.

2. No DHCP in the connected network (e.g. DMZ, tier3)

   The DHCP requests by the BIOS/UEFI firmware will time out.

3. Firewall (no TFTP/HTTP to the relevant servers)
4. Incompatibilities between iPXE and network card (NIC)
5. Incorrect sysdb entry (hence iPXE entry incorrect).

If there is no DHCP, the manually provided static network information may be
wrong, or may belong to a different network than the one connected to the host.

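
A static configuration can at least be checked for internal consistency
before it is typed into the installer. The following is a minimal sketch, not
part of the deployment tooling; the function name and the example addresses
are hypothetical. It only detects a gateway that does not fit the given
address and prefix, and cannot tell whether the values match the network the
host is actually plugged into.

.. code-block:: python

    import ipaddress


    def check_static_config(address, prefix_len, gateway):
        """Return a list of problems found in a static IPv4 configuration."""
        problems = []
        iface = ipaddress.ip_interface(f"{address}/{prefix_len}")
        network = iface.network
        gw = ipaddress.ip_address(gateway)

        # The gateway must be reachable on-link, i.e. inside the host's network.
        if gw not in network:
            problems.append(f"gateway {gw} is outside {network}")
        if gw == iface.ip:
            problems.append("gateway and host address are identical")
        # Network and broadcast addresses are not usable host addresses
        # (only meaningful for prefixes shorter than /31).
        if network.prefixlen < 31 and iface.ip in (
            network.network_address,
            network.broadcast_address,
        ):
            problems.append(f"{iface.ip} is not a usable host address in {network}")
        return problems

An empty list means no inconsistency was found, for example for
``check_static_config("192.0.2.10", 24, "192.0.2.1")``.
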
Infiniband
----------

Infiniband can cause installation problems, especially in the initial phase,
when iPXE tries to load the configuration file.

As a general rule, disable PXE on all Infiniband cards.

However, this is not always enough: iPXE sometimes still picks the Infiniband
card as the first device (with MAC address ``79:79:79:79:79:79``) and tries to
fetch the configuration file for that address.

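
When comparing the MAC address shown by iPXE with the one registered for the
host, a dummy address like the one above is easy to spot programmatically.
This is a small illustrative sketch; the helper name is hypothetical, and
treating "all six octets identical" as a placeholder is an assumption that
generalizes from the single address mentioned here.

.. code-block:: python

    def is_placeholder_mac(mac):
        """Return True if all six octets of the MAC address are identical."""
        octets = mac.lower().replace("-", ":").split(":")
        return len(octets) == 6 and len(set(octets)) == 1

With this, ``is_placeholder_mac("79:79:79:79:79:79")`` is true, while an
ordinary address with differing octets is not flagged.
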
Kickstart
=========

Typical problems during the Kickstart phase:

1. The Kickstart file cannot be retrieved from the boot server
   ``boot00.psi.ch``. Typically caused by incorrect sysdb entries or firewalls.
2. Partitioning fails. This can happen because

   a) No disk is recognized, or the wrong disk is used
   b) Packages or other installation data cannot be downloaded. Can be caused by
      firewalls or incorrect sysdb entries.

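
To separate firewall problems from sysdb problems, it can help to test plain
TCP reachability of the boot server from a machine in the same network segment
as the host being installed. The sketch below uses only the Python standard
library; the function name is hypothetical, and the port to test is an
assumption that depends on how the boot server is set up.

.. code-block:: python

    import socket


    def tcp_reachable(host, port, timeout=3.0):
        """Return True if a TCP connection to host:port succeeds in time."""
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            # Covers refused connections, timeouts and DNS failures alike.
            return False

For example, ``tcp_reachable("boot00.psi.ch", 80)`` would test HTTP, assuming
the server listens on port 80. A ``False`` result points to a firewall or
routing problem rather than to a wrong sysdb entry. Note that TFTP uses UDP
and cannot be tested this way.
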
First Puppet Run
================

Hiera errors are a typical problem, e.g. the following::

    # puppet agent --test
    Info: Using configured environment 'prod'
    Info: Retrieving pluginfacts
    Info: Retrieving plugin
    Info: Loading facts
    Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Evaluation Error: Error while evaluating a Resource Statement, Function lookup() did not find a value for the name 'console::mount_root' at /srv/puppet/code/dev/envs/prod/code/modules/role/manifests/console.pp:1 on node lxdev05.psi.ch
    Warning: Not using cache on failed catalog
    Error: Could not retrieve catalog; skipping run

The error message shows that the value for ``console::mount_root`` could not be
found in Hiera.

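
When many hosts fail in the same way, the missing key can be pulled out of the
agent output automatically. The following is a rough sketch; the regular
expression is derived only from the example message above, so other Puppet
versions may word the error differently, and the function name is
hypothetical.

.. code-block:: python

    import re

    # Pattern modelled on the single error message shown above.
    LOOKUP_ERROR = re.compile(
        r"Function lookup\(\) did not find a value for the name '(?P<key>[^']+)'"
        r" at (?P<file>\S+):(?P<line>\d+) on node (?P<node>\S+)"
    )


    def parse_lookup_error(message):
        """Return key, file, line and node of a failed lookup, or None."""
        match = LOOKUP_ERROR.search(message)
        if match is None:
            return None
        info = match.groupdict()
        info["line"] = int(info["line"])
        return info

Applied to the ``Error:`` line above, the result contains the key
``console::mount_root``, the manifest path, line ``1`` and the node
``lxdev05.psi.ch``.
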
Sometimes the Active Directory join fails, usually for one of these three
reasons:

- There is already an Active Directory computer object for the same system from
  a previous Windows installation. In this case, delete the computer object and
  restart the installation.
- Firewall restrictions.
- Old Puppet certificates from a previous SL6 installation are used on the
  system. In this case, delete the certificates on the client with
  ``find /etc/puppetlabs -name '*.pem' -delete`` and clean up any certificates
  on the Puppet server with ``puppet cert clean $HOSTNAME``. Then restart the
  installation.
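
Because ``find ... -delete`` removes the files without asking, it can be
useful to review what would be affected first. This is a minimal sketch of
such a dry run, assuming Python is available on the client; the function name
is hypothetical and the script only lists paths, it deletes nothing.

.. code-block:: python

    from pathlib import Path


    def find_pem_files(root="/etc/puppetlabs"):
        """Return all paths below root whose name ends in .pem, sorted."""
        return sorted(Path(root).rglob("*.pem"))


    if __name__ == "__main__":
        for path in find_pem_files():
            print(path)

If the list contains only the stale certificates and keys, the ``find``
command above can be run as is.
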