When I connect my X1C8, running a 5.8.0 kernel with lockdep enabled, to a Lenovo 2nd gen thunderbolt dock I get the following warning from lockdep:

[ 139.754540] pcieport 0000:06:01.0: PCI bridge to [bus 08-2c]
[ 139.754545] pcieport 0000:06:01.0: bridge window [io 0x4000-0x4fff]
[ 139.754552] pcieport 0000:06:01.0: bridge window [mem 0xdc100000-0xe80fffff]
[ 139.754558] pcieport 0000:06:01.0: bridge window [mem 0xa0000000-0xbfffffff 64bit pref]
[ 139.754856] pcieport 0000:08:00.0: enabling device (0000 -> 0003)
[ 139.755955] pcieport 0000:09:02.0: enabling device (0000 -> 0003)
[ 139.757097] pcieport 0000:09:04.0: enabling device (0000 -> 0002)
[ 139.757622] pcieport 0000:09:04.0: pciehp: Slot #4 AttnBtn- PwrCtrl- MRL- AttnInd- PwrInd- HotPlug+ Surprise+ Interlock- NoCompl+ IbPresDis- LLActRep+
[ 139.759255] ============================================
[ 139.759257] WARNING: possible recursive locking detected
[ 139.759259] 5.8.0+ #16 Tainted: G E
[ 139.759260] --------------------------------------------
[ 139.759261] irq/125-pciehp/143 is trying to acquire lock:
[ 139.759263] ffff95ee9f3d1f38 (&ctrl->reset_lock){.+.+}-{3:3}, at: pciehp_check_presence+0x23/0x80
[ 139.759269] but task is already holding lock:
[ 139.759270] ffff95eee497e738 (&ctrl->reset_lock){.+.+}-{3:3}, at: pciehp_ist+0xdf/0x120
[ 139.759274] other info that might help us debug this:
[ 139.759275] Possible unsafe locking scenario:
[ 139.759276] CPU0
[ 139.759277] ----
[ 139.759278] lock(&ctrl->reset_lock);
[ 139.759279] lock(&ctrl->reset_lock);
[ 139.759281] *** DEADLOCK ***
[ 139.759282] May be due to missing lock nesting notation
[ 139.759283] 4 locks held by irq/125-pciehp/143:
[ 139.759284] #0: ffff95eee497e738 (&ctrl->reset_lock){.+.+}-{3:3}, at: pciehp_ist+0xdf/0x120
[ 139.759288] #1: ffffffffa2a25e70 (pci_rescan_remove_lock){+.+.}-{3:3}, at: pciehp_configure_device+0x22/0x110
[ 139.759291] #2: ffff95eec9a9a240 (&dev->mutex){....}-{3:3}, at: __device_attach+0x25/0x170
[ 139.759296] #3: ffff95ee9f0b19b0 (&dev->mutex){....}-{3:3}, at: __device_attach+0x25/0x170
[ 139.759299] stack backtrace:
[ 139.759301] CPU: 5 PID: 143 Comm: irq/125-pciehp Tainted: G E 5.8.0+ #16
[ 139.759303] Hardware name: LENOVO 20U90SIT19/20U90SIT19, BIOS N2WET16W (1.06 ) 05/10/2020
[ 139.759304] Call Trace:
[ 139.759310] dump_stack+0x92/0xc8
[ 139.759314] __lock_acquire.cold+0x121/0x296
[ 139.759319] lock_acquire+0xa4/0x3d0
[ 139.759321] ? pciehp_check_presence+0x23/0x80
[ 139.759327] down_read+0x45/0x130
[ 139.759329] ? pciehp_check_presence+0x23/0x80
[ 139.759331] pciehp_check_presence+0x23/0x80
[ 139.759333] pciehp_probe+0x156/0x1a0
[ 139.759337] pcie_port_probe_service+0x31/0x50
[ 139.759339] really_probe+0x2d4/0x410
[ 139.759342] driver_probe_device+0xe1/0x150
[ 139.759344] ? driver_allows_async_probing+0x50/0x50
[ 139.759346] bus_for_each_drv+0x6d/0xa0
[ 139.759349] __device_attach+0xe4/0x170
[ 139.759352] bus_probe_device+0x9f/0xb0
[ 139.759354] device_add+0x389/0x810
[ 139.759356] ? __init_waitqueue_head+0x45/0x60
[ 139.759359] pcie_port_device_register+0x296/0x520
[ 139.759363] ? disable_irq_nosync+0x10/0x10
[ 139.759365] pcie_portdrv_probe+0x2d/0xb0
[ 139.759368] local_pci_probe+0x42/0x80
[ 139.759371] pci_device_probe+0xd9/0x190
[ 139.759374] really_probe+0x167/0x410
[ 139.759377] ? disable_irq_nosync+0x10/0x10
[ 139.759379] driver_probe_device+0xe1/0x150
[ 139.759381] ? driver_allows_async_probing+0x50/0x50
[ 139.759383] bus_for_each_drv+0x6d/0xa0
[ 139.759386] __device_attach+0xe4/0x170
[ 139.759389] pci_bus_add_device+0x4b/0x70
[ 139.759391] pci_bus_add_devices+0x2c/0x70
[ 139.759394] pci_bus_add_devices+0x57/0x70
[ 139.759396] pciehp_configure_device+0x92/0x110
[ 139.759399] pciehp_handle_presence_or_link_change+0x17b/0x2a0
[ 139.759402] pciehp_ist+0x116/0x120
[ 139.759404] irq_thread_fn+0x20/0x60
[ 139.759407] ? irq_thread+0x8c/0x1b0
[ 139.759409] irq_thread+0xf0/0x1b0
[ 139.759412] ? irq_finalize_oneshot.part.0+0xd0/0xd0
[ 139.759414] ? irq_thread_check_affinity+0xb0/0xb0
[ 139.759417] kthread+0x138/0x160
[ 139.759419] ? kthread_create_worker_on_cpu+0x40/0x40
[ 139.759423] ret_from_fork+0x1f/0x30
On the linux-pci list Lukas Wunner wrote the following about this:

False positive, the reset_lock is per-controller and multiple instances of the lock are held concurrently because pciehp controllers are nested with Thunderbolt.

This was already reported by Theodore T'so:
https://lore.kernel.org/linux-pci/20190402021933.GA2966@mit.edu/

So the issue is on my radar and I have some ideas how to fix it. Let me get back to you with a solution later. In the meantime, thank you for the report.
This is fixed by this commit:
https://git.kernel.org/pub/scm/linux/kernel/git/helgaas/pci.git/commit/?id=42a46c70045915bcbdced3e694dc5825d124fb5c

It should get merged into 5.17-rc1 soon-ish, so I am closing this bug.
Note that the branch containing the commit linked in comment 2 was just rebased, so that commit hash will likely stop working eventually. For future reference, the commit fixing this has the following title/subject:

"PCI: pciehp: Use down_read/write_nested(reset_lock) to fix lockdep errors"
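
For anyone wondering why lockdep complains here even though two different lock instances are involved, the following is a toy user-space model, not kernel code. It assumes (as a simplification of lockdep) that held locks are compared by lock class plus subclass rather than by instance, and that the fix gives the nested controller a different subclass, for example one derived from its nesting depth. All names (toy_lock, toy_task, toy_acquire, toy_release) are made up for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: this is NOT how the kernel implements lockdep. */
struct toy_lock {
    int class_id; /* shared by every lock initialised at the same code site */
};

struct held_entry {
    const struct toy_lock *lock;
    int subclass; /* models the subclass argument of the *_nested() variants */
};

#define MAX_HELD 8

struct toy_task {
    struct held_entry held[MAX_HELD];
    size_t depth;
};

/*
 * Record an acquisition and return true if it would be reported as
 * "possible recursive locking": a lock of the same class and the same
 * subclass is already held, even if it is a different instance.
 */
bool toy_acquire(struct toy_task *t, const struct toy_lock *l, int subclass)
{
    bool warn = false;

    for (size_t i = 0; i < t->depth; i++) {
        if (t->held[i].lock->class_id == l->class_id &&
            t->held[i].subclass == subclass)
            warn = true;
    }
    if (t->depth < MAX_HELD) {
        t->held[t->depth].lock = l;
        t->held[t->depth].subclass = subclass;
        t->depth++;
    }
    return warn;
}

/* Drop the most recently acquired lock. */
void toy_release(struct toy_task *t)
{
    if (t->depth > 0)
        t->depth--;
}
```

In this model, taking the reset_lock of the parent controller and then that of the nested controller with the same subclass is flagged, which matches the splat above. Taking the nested one with a different subclass is not flagged, which is the idea behind the down_read_nested()/down_write_nested() change named in the commit subject.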