Created attachment 305388 [details]
The kernel log shows the AER error flood

We have an ASUS X555UQ laptop equipped with an Intel i7-6500U CPU and a Realtek RTL8723BE PCIe wireless adapter. We tested it with kernel 6.6. The system keeps printing a flood of AER error messages, and even hangs, until ASPM is disabled for the rtl8723be.

kernel: pcieport 0000:00:1c.5: PCIe Bus Error: severity=Corrected, type=Physical Layer, (Receiver ID)
kernel: pcieport 0000:00:1c.5:   device [8086:9d15] error status/mask=00000001/00002000
kernel: pcieport 0000:00:1c.5:    [ 0] RxErr (First)
kernel: pcieport 0000:00:1c.5: AER: Corrected error received: 0000:00:1c.5
kernel: pcieport 0000:00:1c.5: AER: can't find device of ID00e5
kernel: pcieport 0000:00:1c.5: AER: Corrected error received: 0000:00:1c.5
kernel: pcieport 0000:00:1c.5: AER: can't find device of ID00e5
kernel: pcieport 0000:00:1c.5: AER: Multiple Corrected error received: 0000:00:1c.5
kernel: pcieport 0000:00:1c.5: AER: can't find device of ID00e5

Here is the PCI tree:

$ lspci -tv
-[0000:00]-+-00.0 Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Host Bridge/DRAM Registers
           +-02.0 Intel Corporation Skylake GT2 [HD Graphics 520]
           +-04.0 Intel Corporation Xeon E3-1200 v5/E3-1500 v5/6th Gen Core Processor Thermal Subsystem
           +-14.0 Intel Corporation Sunrise Point-LP USB 3.0 xHCI Controller
           +-14.2 Intel Corporation Sunrise Point-LP Thermal subsystem
           +-15.0 Intel Corporation Sunrise Point-LP Serial IO I2C Controller #0
           +-15.1 Intel Corporation Sunrise Point-LP Serial IO I2C Controller #1
           +-16.0 Intel Corporation Sunrise Point-LP CSME HECI #1
           +-17.0 Intel Corporation Sunrise Point-LP SATA Controller [AHCI mode]
           +-1c.0-[01]----00.0 NVIDIA Corporation GM108M [GeForce 940MX]
           +-1c.4-[02]----00.0 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
           +-1c.5-[03]----00.0 Realtek Semiconductor Co., Ltd. RTL8723BE PCIe Wireless Network Adapter
           +-1f.0 Intel Corporation Sunrise Point-LP LPC Controller
           +-1f.2 Intel Corporation Sunrise Point-LP PMC
           +-1f.3 Intel Corporation Sunrise Point-LP HD Audio
           \-1f.4 Intel Corporation Sunrise Point-LP SMBus
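For readers following the log, here is a minimal sketch of how the two numeric fields in it can be decoded by hand. It assumes the ID in "can't find device of ID00e5" is a standard 16-bit PCI requester ID (bus in the high byte, then 5 bits of device and 3 bits of function), and it uses the Correctable Error Status bit values from the kernel's include/uapi/linux/pci_regs.h. The struct and helper names (bdf, decode_source_id, cor_bit_name) are made up for illustration and are not kernel APIs.

```c
#include <stdint.h>
#include <stddef.h>

/* Correctable Error Status/Mask bits, values as in pci_regs.h */
#define PCI_ERR_COR_RCVR      0x00000001u /* Receiver Error */
#define PCI_ERR_COR_BAD_TLP   0x00000040u /* Bad TLP */
#define PCI_ERR_COR_BAD_DLLP  0x00000080u /* Bad DLLP */
#define PCI_ERR_COR_REP_ROLL  0x00000100u /* REPLAY_NUM Rollover */
#define PCI_ERR_COR_REP_TIMER 0x00001000u /* Replay Timer Timeout */
#define PCI_ERR_COR_ADV_NFAT  0x00002000u /* Advisory Non-Fatal Error */

/* Hypothetical helper type: a decoded bus/device/function triple. */
struct bdf {
    unsigned bus;
    unsigned dev;
    unsigned fn;
};

/* Split a 16-bit requester ID into bus (8 bits), device (5), function (3). */
static struct bdf decode_source_id(uint16_t id)
{
    struct bdf b;

    b.bus = id >> 8;
    b.dev = (id >> 3) & 0x1f;
    b.fn  = id & 0x7;
    return b;
}

/* Return a descriptive name for a single status bit, or NULL if not listed. */
static const char *cor_bit_name(uint32_t bit)
{
    switch (bit) {
    case PCI_ERR_COR_RCVR:      return "Receiver Error";
    case PCI_ERR_COR_BAD_TLP:   return "Bad TLP";
    case PCI_ERR_COR_BAD_DLLP:  return "Bad DLLP";
    case PCI_ERR_COR_REP_ROLL:  return "REPLAY_NUM Rollover";
    case PCI_ERR_COR_REP_TIMER: return "Replay Timer Timeout";
    case PCI_ERR_COR_ADV_NFAT:  return "Advisory Non-Fatal Error";
    default:                    return NULL;
    }
}
```

Under these assumptions, decode_source_id(0x00e5) gives bus 0x00, device 0x1c, function 5, which is the root port 0000:00:1c.5 itself rather than the RTL8723BE at 03:00.0. The status value 00000001 is the Receiver Error bit, and the mask value 00002000 is the Advisory Non-Fatal Error bit.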
Created attachment 305389 [details]
Detailed information for the PCI bridge
Created attachment 305390 [details]
Detailed information for the RTL8723BE PCI device
I noticed an old mailing-list discussion: Dmesg filled with "AER: Corrected error received" [1]

So I forced a write of 1 to clear the Receiver Error Status bit of the Correctable Error Status Register, like this:

diff --git a/drivers/pci/pcie/aer.c b/drivers/pci/pcie/aer.c
index 9c8fd69ae5ad..39faedd2ec8e 100644
--- a/drivers/pci/pcie/aer.c
+++ b/drivers/pci/pcie/aer.c
@@ -1141,8 +1160,9 @@ static void aer_isr_one_error(struct aer_rpc *rpc,
 			e_info.multi_error_valid = 0;
 		aer_print_port_info(pdev, &e_info);
 
-		if (find_source_device(pdev, &e_info))
-			aer_process_err_devices(&e_info);
+		//if (find_source_device(pdev, &e_info))
+		//	aer_process_err_devices(&e_info);
+		pci_write_config_dword(pdev, pdev->aer_cap + PCI_ERR_COR_STATUS, 0x1);
 	}
 
 	if (e_src->status & PCI_ERR_ROOT_UNCOR_RCV) {

With this change, the system should clear the error right away. However, the system still gets the AER flood, so it seems that we still have to disable ASPM for the rtl8723be.

[1]: https://lore.kernel.org/all/20151229155822.GA17321@localhost/T/#r7ca71d16bb63a651b456fd14bbbd889aa97b8ba4
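The write in the diff relies on the AER status bits being write-1-to-clear. The following is only a user-space model of that behaviour, not kernel code: struct w1c_reg and both helpers are hypothetical names introduced for illustration. It also shows why clearing alone may not stop the flood: if the link keeps producing receiver errors, the hardware simply latches the bit again.

```c
#include <stdint.h>

#define PCI_ERR_COR_RCVR 0x00000001u /* Receiver Error Status */

/* Hypothetical model of a write-1-to-clear (W1C) status register. */
struct w1c_reg {
    uint32_t value;
};

/* Hardware side: latch newly detected error bits into the register. */
static void w1c_hw_latch(struct w1c_reg *reg, uint32_t bits)
{
    reg->value |= bits;
}

/*
 * Software side: every 1 bit in val clears the matching status bit,
 * every 0 bit leaves the matching status bit untouched.
 */
static void w1c_write(struct w1c_reg *reg, uint32_t val)
{
    reg->value &= ~val;
}
```

In this model, writing 0x1 clears only the Receiver Error bit and leaves any other pending bits alone, and a later w1c_hw_latch(&reg, PCI_ERR_COR_RCVR) sets it again. That matches the observation above that the errors return until ASPM is disabled.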
As a workaround, I sent a patch that disables ASPM for the rtl8723be when the PCI bridge is one of certain Intel devices:
https://lore.kernel.org/lkml/05390e0b-27fd-4190-971e-e70a498c8221@lwfinger.net/T/