I have a Sony VAIO VPCZ23A4R laptop. It comes with a docking station that contains some PCI Express devices: a second video card, a second Ethernet controller, a Marvell ATA controller with a Blu-ray writer attached, and some USB controllers.

If I boot the laptop without the docking station and then connect it, the following bugs occur:

1. When docking, the kernel does not understand that it has to rescan the PCI bus.
2. When told to rescan (echo 1 > /sys/class/pci_bus/0000\:00/rescan), it emits a lot of warnings and sometimes panics due to the radeon driver.
3. When undocking, the kernel does not understand that it needs to forget about the PCI devices in the docking station. In fact, undocking fails.

I will attach logs to this bug.
Created attachment 95971 [details] Here is what the boot WITH the dock looks like
Created attachment 95981 [details] Devices in the dock
Created attachment 95991 [details] Log spam due to docking and rescanning the PCI bus
Created attachment 96001 [details] PCI bus scan results after hot-plugging the dock

PCI bus scan results when hot-plugging the dock differ from the results when booting with the dock already connected. Sometimes this even makes the Ethernet controller appear as enp17s0 instead of enp25s0 (could not capture the dmesg of this).
Created attachment 96011 [details] acpidump (from pmtools) output
Created attachment 96021 [details] radeon-related panic photo
With linux-3.9-rc5 and CONFIG_PCI_REALLOC_ENABLE_AUTO=y, the kernel does not see devices in the dock after hot-plugging it and attempting to rescan.
Looks like a pci problem? Is there a working kernel? Thanks.
There is no fully working kernel:

3.8.2 sees the devices after being told manually to rescan the PCI bus, but warns a lot and may panic during the rescan.
3.9-rc5 does not see the devices even after being told manually to rescan the PCI bus, and thus does not warn or panic.
Old kernels behave like 3.8.2.

Should I attempt to bisect what change caused the 3.9-rc5 kernel to ignore the manual PCI rescan request?

As for "Looks like a pci problem?": I think there are multiple problems here, namely a sony-laptop or ACPI problem, a PCI problem, and a radeon problem.

Sony-laptop or ACPI problem: why doesn't the kernel rescan the PCI bus automatically after docking/undocking?
PCI problem: "BAR 14: can't assign mem (size 0x10400000)" and similar messages after manual rescanning.
Radeon problem: "radeon 0000:16:00.0: Fatal error during GPU init" and panic.

What other info is needed?
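For anyone sorting a saved dmesg into the three problems listed above, a minimal sketch follows. It only counts lines containing the strings quoted in this report. The function name and the output labels are made up for illustration, and the patterns are assumptions about what the logs contain, not a complete classifier.

```shell
#!/bin/sh
# Count dmesg lines matching each symptom quoted in this report.
# Usage: triage_dmesg FILE   (FILE is a saved dmesg log)
# The function name and the output labels are hypothetical.
triage_dmesg() {
    log=$1
    # grep -c prints the number of matching lines. It exits non-zero
    # when there are none, hence the "|| true".
    dock=$(grep -c -F "DOCK: docking" "$log" || true)
    pci=$(grep -c -E "can't assign (mem|io)" "$log" || true)
    radeon=$(grep -c -F "Fatal error during GPU init" "$log" || true)
    echo "dock_events=$dock pci_alloc=$pci radeon_init=$radeon"
}
```

Typical use would be "dmesg > after-dock.txt" followed by "triage_dmesg after-dock.txt". A non-zero dock_events with no new devices afterwards points at the first problem; the other two counters point at the PCI and radeon problems.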
Hi Alex, I agree with you, and you have explained it very clearly, thanks. I think we can focus on the first problem here:

Sony-laptop or ACPI problem: why doesn't the kernel rescan the PCI bus automatically after docking/undocking?

This is what the dock module should handle; it may be a dock driver problem or a problem with the ASL code in the ACPI tables. For the other problems (PCI and radeon), you will need to file separate bugs in the appropriate categories.
OK, let's focus on the dock problem here. I will file separate bugs for other issues when I return from work.
Created attachment 98061 [details] Add some debug statement to dsdt table

Hi, I've prepared a customized DSDT in which I've placed some debug statements. Let's see what happens when you dock the computer after boot. Please boot with this kernel command-line parameter: acpi.aml_debug_output=1.
BTW, there are two ways to override the DSDT:

1. Documentation/acpi/initrd_table_override.txt describes how to override the DSDT through the initrd.
2. https://lesswatts.org/projects/acpi/overridingDSDT.php explains how to build the new DSDT directly into the kernel.

If you are going to use method 1, please keep a copy of the original initrd file, as I would not want a mistake in the attached DSDT to make your system unusable.
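Method 1 can be sketched roughly as below, following the layout described in Documentation/acpi/initrd_table_override.txt: the table goes into kernel/firmware/acpi/ inside an uncompressed cpio archive, and the original initrd is concatenated after it. This assumes a kernel built with CONFIG_ACPI_INITRD_TABLE_OVERRIDE and a compiled dsdt.aml. The function name and all file paths are placeholders, not anything taken from this report.

```shell
#!/bin/sh
# Build an initrd whose first archive is an uncompressed cpio holding
# the replacement table under kernel/firmware/acpi/.
# Usage: make_override_initrd DSDT_AML ORIGINAL_INITRD OUTPUT_INITRD
# The function name and the argument names are hypothetical.
make_override_initrd() (
    dsdt=$1 orig=$2 out=$3
    work=$(mktemp -d) || exit 1
    trap 'rm -rf "$work"' EXIT
    # The uncompressed archive with the table must come first. The
    # original initrd is then appended unchanged, so the source file
    # itself is never modified.
    mkdir -p "$work/kernel/firmware/acpi" &&
        cp "$dsdt" "$work/kernel/firmware/acpi/dsdt.aml" &&
        (cd "$work" && find kernel | cpio -o -H newc) > "$out" &&
        cat "$orig" >> "$out"
)
```

Writing the result to a new file and adding a separate boot entry for it keeps the original initrd available as a fallback, which is the precaution asked for above.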
Created attachment 98171 [details] dmesg output with debug (kernel 3.8.2)

Thank you for the detailed instructions. I hope that this dmesg output contains what you asked for.
Created attachment 98181 [details] dmesg output with debug (kernel 3.9-rc5)

Just in case, here is the same debug output from 3.9-rc5.
Forgot to say: both dmesgs contain one dock attempt (by connecting the dock cable) and one undock attempt (by pressing the "undock" button on the cable but not disconnecting the cable). Apparently, on undocking, the laptop wants to redock (maybe just to verify success) and fails.
Thanks for the test. So on dock, the PCI bridge 0000:00:1c.6 doesn't get notified by the BIOS. I assume this is a BIOS bug, and we can work around it in the DSDT. On undock, the dock device is notified again, so it tries to dock again, finds that the dock is no longer there, and prints "Unable to dock". This error message is not a big deal; it doesn't cause any problem.
Created attachment 98311 [details] Call _Q07 in _DCK on dock

Please test this DSDT; hopefully, it will rescan automatically.
On 3.9-rc5, your hacked DSDT does not help:

[ 80.911787] ACPI: \_SB_.DOCK: docking
[ 80.911807] [ACPI Debug] String [0x04] "_DCK"
[ 80.911836] [ACPI Debug] String [0x04] "DSTS"
[ 80.911853] [ACPI Debug] Integer 0x00000001
[ 80.912004] [ACPI Debug] String [0x0C] "_Q07 in _DCK"
[ 80.912110] [ACPI Debug] String [0x04] "_Q07"
[ 80.912122] [ACPI Debug] String [0x04] "DSTS"
[ 80.912143] [ACPI Debug] Integer 0x00000001
[ 80.912165] [ACPI Debug] String [0x31] "Notify LPMB 0x01, DOCK 0x00, RP07 0x00, PCI0 0x01"
[ 80.914950] _handle_hotplug_event_root: Device check notify on \_SB_.PCI0

This helps, but the first line may be redundant (will check later):

echo 1 > /sys/class/pci_bus/0000\:08/rescan
echo 1 > /sys/class/pci_bus/0000\:05/rescan
echo 1 > /sys/class/pci_bus/0000\:08/rescan

There are "can't assign mem" and "can't assign io" messages, but we agreed that they should be in a separate bug that I have not filed yet.

Good news: with 3.9-rc5, there is no radeon panic; the card is just non-functional when docking.
With 3.9-rc5 and your DSDT, on undocking after a manual PCI rescan, there are many messages like this:

[ 559.193225] ACPI: Device does not support D3cold
[ 559.193314] ACPI: Device does not support D3cold
[ 559.193455] ACPI: Device does not support D3cold
[ 559.193557] ACPI: Device does not support D3cold
[ 559.193688] ACPI: Device does not support D3cold
[ 559.193814] ACPI: Device does not support D3cold
[ 559.193942] ACPI: Device does not support D3cold
[ 559.194092] ACPI: Device does not support D3cold

The end result is that the PCI devices in the dock don't go away; they just return something like this in lspci -v:

10:02.0 PCI bridge: Intel Corporation Device 151b (rev ff) (prog-if ff)
	!!! Unknown header type 7f
	Kernel driver in use: pcieport
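The "rev ff" and "Unknown header type 7f" markers in the lspci output above can be used to list the devices that were left behind after undocking. The following is a sketch only: the function name is made up, and it assumes the default lspci address format without a domain prefix (bus:device.function at the start of each summary line).

```shell
#!/bin/sh
# Print the bus address of every device whose lspci entry looks like
# an all-ones config read: "(rev ff)" on the summary line or an
# "Unknown header type 7f" warning below it.
# Usage: lspci -v | stale_pci_devices
# The function name is hypothetical.
stale_pci_devices() {
    awk '
        # Summary lines start in column one with bus:device.function.
        /^[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]\.[0-7] / {
            addr = $1
            if ($0 ~ /\(rev ff\)/ && !(addr in seen)) {
                seen[addr] = 1
                print addr
            }
            next
        }
        # Detail lines belong to the most recent summary line.
        /Unknown header type 7f/ && addr != "" && !(addr in seen) {
            seen[addr] = 1
            print addr
        }
    '
}
```

Each address is printed once even when both markers are present for the same device.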
With 3.8.2, same result: no automatic rescan, one needs to:

echo 1 > /sys/class/pci_bus/0000\:05/rescan
echo 1 > /sys/class/pci_bus/0000\:08/rescan
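The manual workaround can be wrapped in a small helper so that the buses are rescanned in a fixed order. This is a sketch under assumptions: the bus numbers 0000:05 and 0000:08 are the ones seen on this laptop and will differ elsewhere, and SYSFS_ROOT is a made-up variable that exists only so the function can be tried against a scratch directory. It is not a kernel or sysfs feature.

```shell
#!/bin/sh
# Ask the kernel to rescan the given PCI buses, in the order given.
# Usage: rescan_buses 0000:05 0000:08     (needs root on a real system)
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
rescan_buses() {
    root=${SYSFS_ROOT:-/sys}
    for bus in "$@"; do
        node="$root/class/pci_bus/$bus/rescan"
        if [ -e "$node" ]; then
            # Same write as the manual "echo 1 > .../rescan" above.
            echo 1 > "$node" || return 1
        else
            echo "no such bus: $bus" >&2
            return 1
        fi
    done
}
```

The function stops at the first bus that does not exist, which is a quick way to notice that the bus numbering has changed between boots.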
(In reply to comment #19)
> On 3.9-rc5, your hacked DSDT does not help:
>
> [ 80.911787] ACPI: \_SB_.DOCK: docking
> [ 80.911807] [ACPI Debug] String [0x04] "_DCK"
> [ 80.911836] [ACPI Debug] String [0x04] "DSTS"
> [ 80.911853] [ACPI Debug] Integer 0x00000001
> [ 80.912004] [ACPI Debug] String [0x0C] "_Q07 in _DCK"
> [ 80.912110] [ACPI Debug] String [0x04] "_Q07"
> [ 80.912122] [ACPI Debug] String [0x04] "DSTS"
> [ 80.912143] [ACPI Debug] Integer 0x00000001
> [ 80.912165] [ACPI Debug] String [0x31] "Notify LPMB 0x01, DOCK 0x00, RP07 0x00, PCI0 0x01"
> [ 80.914950] _handle_hotplug_event_root: Device check notify on \_SB_.PCI0

These messages show that on dock, the PCI bridge 0000:00:1c.6 and the PCI root bridge are both notified with a BUS_CHECK, and my understanding is that the handler for such a notification should rescan the whole tree below the notified device. PCI buses 8-20 belong to PCI bridge 0000:00:1c.6. I'll need to check the handler code to see what it does on such a notification, but I think that belongs to PCI.
Please add acpiphp.debug=1 to the kernel command line, together with the hacked DSDT, and attach the dmesg after you dock/undock, thanks.
Sorry for being stupid. The acpiphp module was not loaded. However, even with it being loaded manually, docking does not work as expected. This may invalidate some earlier findings. Dmesg will be attached soon, I am going to rebuild the kernel, with this driver as a non-module.
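A quick way to rule this out before testing is to look for the acpiphp entry under /sys/module. The sketch below assumes that the entry exists both when acpiphp is loaded as a module and when it is built in (built-in code that exposes module parameters, such as the "debug" parameter used above, normally gets one). The function name and the SYSFS_ROOT override are made up for illustration.

```shell
#!/bin/sh
# Report whether acpiphp is available to the running kernel by
# checking for its directory under /sys/module.
# SYSFS_ROOT is a hypothetical override for dry runs; it defaults to /sys.
acpiphp_state() {
    root=${SYSFS_ROOT:-/sys}
    if [ -d "$root/module/acpiphp" ]; then
        echo present
    else
        echo missing
    fi
}
```

On a modular kernel, something like '[ "$(acpiphp_state)" = present ] || modprobe acpiphp' before docking would avoid the situation described here.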
OK, so the original bug report was invalid, because the acpiphp driver was not loaded. All dmesgs that you asked for are attached to the (hopefully valid) bug #56501.
Just for clarity, the "invalid" status is only about the "does not rescan PCI bus" bug on 3.8.2. The same "does not rescan" bug is valid on 3.9-rc5 (moved to bug #56501). The "can't assign mem / io" bug and the radeon bug are valid as of 3.8.2, but not reported yet. Will do that later today.