Created attachment 107100 [details]
lockdep dependency chain

Probably related to bug 60672.

Reported by Peter Wu <lekensteyn@gmail.com>:

When trying to rescan for PCI devices (after removing a child), I get a
lockdep warning in my logs. The commands were:

    # tee /sys/bus/pci/devices/0000\:03\:00.0/remove <<<1
    1
    # tee /sys/bus/pci/devices/0000\:02\:00.0/rescan <<<1
    1

I did not experience actual issues, just letting you know about this.

I can reproduce this on every reboot:

1. Boot
2. (pci-stub owns the device)
4. remove from parent bus
5. rescan
6. Lockdep warning found.
(7. pci-stub claims device again)

Interestingly, I can only reproduce this after freshly rebooting.
Repeating steps 4 and 5 does not trigger a new lockdep warning.

Regards,
Peter

======================================================
[ INFO: possible circular locking dependency detected ]
3.11.0-rc2-cold-00096-gae2ad35-dirty #1 Tainted: G O
-------------------------------------------------------
tee/29902 is trying to acquire lock:
 (pci_remove_rescan_mutex){+.+.+.}, at: [<ffffffff8134be96>] dev_rescan_store+0x56/0x80

but task is already holding lock:
 (s_active#296){++++.+}, at: [<ffffffff811fddad>] sysfs_write_file+0xcd/0x170

which lock already depends on the new lock.
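For anyone retesting, steps 4 and 5 of the report above can be wrapped in a small script. This is only a sketch, not something from the original report: the function names and the SYSFS_ROOT override are hypothetical, added so the helper can be dry-run against a scratch directory instead of the real /sys. The string searched for in the log is the lockdep header quoted in the trace above.

```shell
#!/usr/bin/env bash
# Sketch: remove a PCI device and rescan its parent bridge via sysfs.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.

pci_remove_rescan() {
    local child="$1" parent="$2"
    local base="${SYSFS_ROOT:-/sys}/bus/pci/devices"

    # Step 4: remove the child device from its parent bus.
    echo 1 > "$base/$child/remove" || return 1
    # Step 5: ask the parent bridge to rescan its secondary bus.
    echo 1 > "$base/$parent/rescan" || return 1
}

# Succeeds if the kernel log read from stdin contains the lockdep header.
has_lockdep_warning() {
    grep -q 'possible circular locking dependency detected'
}
```

Run as root after sourcing the file, for example `pci_remove_rescan 0000:03:00.0 0000:02:00.0` followed by `dmesg | has_lockdep_warning && echo "lockdep warning found"`. Since the report says the warning only appears on the first remove/rescan after boot, a negative result on a later attempt does not show the problem is gone.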
Created attachment 107106 [details]
dmesg boot + remove + rescan

As requested in the mail, a full dmesg.

At 32.959667, I ran:

    sudo tee /sys/bus/pci/devices/0000\:03\:00.0/remove <<<1

At 41.208461, I ran:

    sudo tee /sys/bus/pci/devices/0000\:02\:00.0/rescan <<<1

Note, the PCI ID of this 03:00.0 device is different than the one from my
previous mail in which the device was owned by the pci-stub driver. Due to
a hardware (?) bug, I have to kick the EEPROM to change the wrong PCI ID
10ec:8129 to 10ec:8169. For this lockdep issue, that should not matter
though. I am mentioning it for completeness and to take away any possible
confusion.

The kernel is 3.11.0-rc2-cold-00096-gae2ad35-dirty which is mainline plus
some r8169 patches that I was working on (for dumping/changing EEPROM).
Created attachment 107107 [details]
lspci -vv

`lspci -vv` attached, `lspci -tvnn` below:

-[0000:00]-+-00.0  Intel Corporation 2nd Generation Core Processor Family DRAM Controller [8086:0100]
           +-02.0  Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller [8086:0102]
           +-16.0  Intel Corporation 6 Series/C200 Series Chipset Family MEI Controller #1 [8086:1c3a]
           +-1a.0  Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #2 [8086:1c2d]
           +-1b.0  Intel Corporation 6 Series/C200 Series Chipset Family High Definition Audio Controller [8086:1c20]
           +-1c.0-[01]--
           +-1c.3-[02-03]----00.0-[03]----00.0  Realtek Semiconductor Co., Ltd. RTL8169 PCI Gigabit Ethernet Controller [10ec:8169]
           +-1c.4-[04]----00.0  Etron Technology, Inc. EJ168 USB 3.0 Host Controller [1b6f:7023]
           +-1c.5-[05]----00.0  Etron Technology, Inc. EJ168 USB 3.0 Host Controller [1b6f:7023]
           +-1c.6-[06]----00.0  Realtek Semiconductor Co., Ltd. RTL8111/8168 PCI Express Gigabit Ethernet controller [10ec:8168]
           +-1c.7-[07]----00.0  Marvell Technology Group Ltd. 88SE9172 SATA 6Gb/s Controller [1b4b:9172]
           +-1d.0  Intel Corporation 6 Series/C200 Series Chipset Family USB Enhanced Host Controller #1 [8086:1c26]
           +-1f.0  Intel Corporation Z68 Express Chipset Family LPC Controller [8086:1c44]
           +-1f.2  Intel Corporation 6 Series/C200 Series Chipset Family SATA AHCI Controller [8086:1c02]
           \-1f.3  Intel Corporation 6 Series/C200 Series Chipset Family SMBus Controller [8086:1c22]
Probably fixed by these commits:

  f41b32613138 ACPI / hotplug / PCI: Move PCI rescan-remove locking to hotplug_event()
  d42f5da23400 ACPI / hotplug / PCI: Scan root bus under the PCI rescan-remove lock

Peter, can you test v3.15 (which will probably be released on 6/8 or so)
and figure out whether this is fixed or still broken?
Created attachment 138451 [details]
dmesg (v3.15-rc6)

Bjorn, tested as follows on v3.15-rc6 (plus "r8169: add ethtool eeprom
change/dump feature" patch):

    sudo tee /sys/bus/pci/devices/0000\:04\:00.0/remove <<<1
    sudo tee /sys/bus/pci/devices/0000\:03\:00.0/rescan <<<1

The original problem is gone, but a new lockdep issue now emerges. See the
attached dmesg.