how to handle PCIe bus errors

2023-10-10 11:39:26 +02:00
parent d58ea199c7
commit 8dd42a478a
2 changed files with 19 additions and 0 deletions


@@ -103,6 +103,7 @@ chapters:
- file: admin-guide/troubleshooting/boot
- file: admin-guide/troubleshooting/kerberos
- file: admin-guide/troubleshooting/sssd
- file: admin-guide/troubleshooting/pcie_bus_error
- file: admin-guide/order-vm
- file: infrastructure-guide/index


@@ -0,0 +1,18 @@
# PCIe Bus Error
When the kernel log shows PCI Express bus errors like
```
Oct 05 11:26:19 pc16209.psi.ch kernel: pcieport 10000:e0:06.0: AER: TLP Header: 34000000 e1000010 89148914 00000000
Oct 05 11:26:19 pc16209.psi.ch kernel: pcieport 10000:e0:06.0: PCIe Bus Error: severity=Uncorrected (Non-Fatal), type=Transaction Layer, (Requester ID)
Oct 05 11:26:19 pc16209.psi.ch kernel: pcieport 10000:e0:06.0: device [8086:464d] error status/mask=00100000/00010000
Oct 05 11:26:19 pc16209.psi.ch kernel: pcieport 10000:e0:06.0: [20] UnsupReq (First)
```
you can try disabling **Active State Power Management** (ASPM) in the kernel.
To do so, set the following in Hiera:
```
base::enable_pcie_aspm: false
```
Then apply it with `puppet agent -t` and reboot.
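
After the reboot it can help to check that the setting took effect and that the errors stopped. The following is a minimal sketch, not part of the Puppet setup. It assumes that `base::enable_pcie_aspm: false` ends up as the kernel parameter `pcie_aspm=off`, which this page does not confirm. The helper names `count_pcie_errors` and `aspm_disabled` are made up for illustration.

```shell
#!/bin/sh
# Count kernel log lines that report a PCIe bus error.
# Reads log text on stdin and prints the number of matching lines.
count_pcie_errors() {
    # grep -c exits non-zero when the count is 0; "|| true" hides that.
    grep -c 'PCIe Bus Error' || true
}

# Succeed if the given kernel command line contains pcie_aspm=off.
# Assumption: the Hiera key results in exactly this kernel parameter.
aspm_disabled() {
    case " $1 " in
        *" pcie_aspm=off "*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage on a live system (commented out so the snippet has no side effects):
# journalctl -k -b | count_pcie_errors
# aspm_disabled "$(cat /proc/cmdline)" && echo "ASPM is disabled on the kernel command line"
```

If the count for the current boot stays at `0` and the second check succeeds, disabling ASPM was probably the fix. If the errors continue, the cause is likely elsewhere, for example in hardware or firmware.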