Created attachment 107458 [details] dmesg output with Rafael's patches + s/dbg/pr_info/ in hotplug_event() Recent changes to acpiphp result in significantly more console output. On a Lenovo S20 workstation: CONFIG_HOTPLUG_PCI_ACPI=y $ dmesg | wc 5697 49935 384368 $ dmesg | tail --lines=1 [ 53.137123] Ebtables v2.0 registered # CONFIG_HOTPLUG_PCI_ACPI is not set $ dmesg | wc 1053 9176 71652 $dmesg | tail --lines=1 [ 28.917220] Ebtables v2.0 registered This is partially resolved by Rafael's patches: https://lkml.org/lkml/2013/9/6/197 This gets us down to: $ dmesg | wc 2169 19047 152710 $ dmesg | tail --lines=1 [ 39.980918] Ebtables v2.0 registered But we're still getting an extra 1100 lines of output, slowing down the boot by about 10s with acpiphp enabled. The extra output is almost entirely from repeated bus scans caused by device checks on the root ports: $ dmesg | colrm 1 15 | sort | uniq -c | sort -nr | head --lines=25 73 pci 0000:00:1f.0: BAR 13: [io 0x1000-0x107f] has bogus alignment 73 pci 0000:00:1e.0: PCI bridge to [bus 06] 64 pci 0000:00:1e.0: bridge window [mem 0x81100000-0x812fffff 64bit pref] 64 pci 0000:00:1e.0: bridge window [mem 0x80f00000-0x810fffff] 64 pci 0000:00:1e.0: bridge window [io 0x7000-0x7fff] 38 pci 0000:00:1c.4: PCI bridge to [bus 05] 38 pci 0000:00:1c.4: bridge window [mem 0xf4f00000-0xf4ffffff] 38 pci 0000:00:1c.0: PCI bridge to [bus 04] 38 pci 0000:00:07.0: PCI bridge to [bus 03] 38 pci 0000:00:07.0: bridge window [mem 0xf2000000-0xf40fffff] 38 pci 0000:00:07.0: bridge window [mem 0xe0000000-0xf1ffffff 64bit pref] 38 pci 0000:00:07.0: bridge window [io 0x4000-0x4fff] 38 pci 0000:00:03.0: PCI bridge to [bus 02] 38 pci 0000:00:03.0: bridge window [mem 0xf4e00000-0xf4efffff] 38 pci 0000:00:03.0: bridge window [mem 0xd0000000-0xdfffffff 64bit pref] 38 pci 0000:00:03.0: bridge window [io 0x3000-0x3fff] 38 pci 0000:00:01.0: PCI bridge to [bus 01] 38 pci 0000:00:01.0: bridge window [mem 0xf4100000-0xf4bfffff] 38 pci 0000:00:01.0: bridge window [io 0x2000-0x2fff] 37 pci 0000:00:1c.4: bridge window [mem 0x80c00000-0x80dfffff 64bit pref] 37 pci 0000:00:1c.4: bridge window [io 0x6000-0x6fff] 37 pci 0000:00:1c.0: bridge window [mem 0x80a00000-0x80bfffff 64bit pref] 37 pci 0000:00:1c.0: bridge window [mem 0x80800000-0x809fffff] 37 pci 0000:00:1c.0: bridge window [io 0x5000-0x5fff] 36 pci 0000:00:01.0: bridge window [mem 0x80000000-0x807fffff 64bit pref] Converting 'dbg' to 'pr_info' in hotplug_event(), we see this: $ dmesg | grep "check notify" [ 1.633228] hotplug_event: Device check notify on \_SB_.PCI0.PEX2 [ 2.472004] hotplug_event: Device check notify on \_SB_.PCI0.PEX3 [ 2.477288] hotplug_event: Device check notify on \_SB_.PCI0.PEX4 [ 2.482571] hotplug_event: Device check notify on \_SB_.PCI0.PEX5 [ 2.482579] hotplug_event: Device check notify on \_SB_.PCI0.PEX6 [ 8.204953] hotplug_event: Device check notify on \_SB_.PCI0.PEX2 [ 8.209632] hotplug_event: Device check notify on \_SB_.PCI0.PEX3 [ 8.214272] hotplug_event: Device check notify on \_SB_.PCI0.PEX4 [ 8.218894] hotplug_event: Device check notify on \_SB_.PCI0.PEX5 [ 8.218901] hotplug_event: Device check notify on \_SB_.PCI0.PEX6
Created attachment 107459 [details] lspci -vv
Created attachment 107460 [details] dmesg output with Rafael's patches + s/dbg/pr_info/ in hotplug_event() (uncompressed) Uncompressed for easy viewing.
Created attachment 107461 [details] lspci -vv (uncompressed)
Alex, can you please attach the output from acpidump on the affected machine?
From dmesg in comment #2 it looks like we go to the host bridge and scan the entire PCI hierarchy every time, which is suspicious. If the PEXn objects really are ports, I'd expect that we'd scan only the branch below each of them on a device notify. For example, this: Device check notify on \_SB_.PCI0.PEX2 should trigger the scan below PEX2, *unless* it is not regarded as a bridge.
Created attachment 107471 [details] acpidump
OK, so now it's clear what happens. PEX2, PEX3, PEX4 and PEX6 are ACPI device objects that have no corresponding PCI devices in the system, so when a device check notify for one of them occurs, we go to the parent bus (which is a root bus) and rescan from there. PEX5 actually corresponds to a PCIe port, so the notify for that object doesn't generate the excess output. Clearly, we should not blindly go for rescanning the parent if the notify is for an object without an existing physical device corresponding to it.
Created attachment 107491 [details] ACPIPHP: Avoid bus scans for spurious device check notifications Alex, can you please try this patch? It's purpose is to skip the parent bus scan if we don't find anything new in the given slot after a device check notify. And by the way, I don't think there are too many systems like this out there and I do think that it's pretty damn broken (it is not OK to send device check notifications for devices that aren't and have never been physically present in my opinion).
The patch is on top of: https://patchwork.kernel.org/patch/2854069/ https://patchwork.kernel.org/patch/2854540/ https://patchwork.kernel.org/patch/2854543/
(In reply to Rafael J. Wysocki from comment #8) > Created attachment 107491 [details] > ACPIPHP: Avoid bus scans for spurious device check notifications > > Alex, can you please try this patch? Works well, new stats below. Handful more lines, mostly from still having the s/dbg/pr_info/ additional output. Timing is well within the noise factor for my system. Not nearly as many repeated lines in dmesg. Thanks! $ dmesg | wc 1066 9271 72456 $ dmesg | tail --lines=1 [ 29.185420] Ebtables v2.0 registered $ dmesg | colrm 1 15 | sort | uniq -c | sort -nr | head --lines=25 7 SMP alternatives: lockdep: fixing up alternatives 7 pci 0000:01:00.0: reg 0x184: [mem 0x00000000-0x00003fff 64bit] 5 pci 0000:01:00.0: reg 0x190: [mem 0x00000000-0x00003fff 64bit] 5 ACPI Warning: 5 (20130725/utaddress-251) 4 pci 0000:01:00.1: reg 0x184: [mem 0x00000000-0x00003fff 64bit] 3 pci 0000:01:00.1: reg 0x190: [mem 0x00000000-0x00003fff 64bit] 3 ACPI: If an ACPI driver is available for this device, you should use it instead of the native driver 2 reg 3, base: 12GB, range: 2GB, type WB 2 reg 2, base: 8GB, range: 4GB, type WB 2 pci 0000:00:1c.4: PCI bridge to [bus 05] 2 pci 0000:00:1c.4: bridge window [mem 0xf4f00000-0xf4ffffff] 2 pci 0000:00:1c.0: PCI bridge to [bus 04] 2 pci 0000:00:07.0: PCI bridge to [bus 03] 2 pci 0000:00:07.0: bridge window [mem 0xf2000000-0xf40fffff] 2 pci 0000:00:07.0: bridge window [mem 0xe0000000-0xf1ffffff 64bit pref] 2 pci 0000:00:07.0: bridge window [io 0x4000-0x4fff] 2 pci 0000:00:03.0: PCI bridge to [bus 02] 2 pci 0000:00:03.0: bridge window [mem 0xf4e00000-0xf4efffff] 2 pci 0000:00:03.0: bridge window [mem 0xd0000000-0xdfffffff 64bit pref] 2 pci 0000:00:03.0: bridge window [io 0x3000-0x3fff] 2 pci 0000:00:01.0: PCI bridge to [bus 01] 2 pci 0000:00:01.0: bridge window [mem 0xf4100000-0xf4bfffff] 2 pci 0000:00:01.0: bridge window [io 0x2000-0x2fff] 2 No AGP bridge found
OK, let's mark it as resolved, then. Thanks for testing!
Fixed by commit a47d8c8 ACPI / hotplug / PCI: Avoid parent bus rescans on spurious device checks.