Bug 56531

Summary: Sony VAIO VPCZ23A4R: can't assign mem/io after docking
Product: Drivers Reporter: Alexander E. Patrakov (patrakov)
Component: PCIAssignee: drivers_pci (drivers_pci)
Status: RESOLVED PATCH_ALREADY_AVAILABLE    
Severity: normal CC: aaron.lu, liuj97
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 3.8.2, 3.9-rc5 Subsystem:
Regression: No Bisected commit-id:
Attachments: Dmesg from 3.9-rc5
Dmesg after the patch
Dmesg after the patch about pci_assign_unassigned_bus_resources
/proc/ioports after booting with the dock station
/proc/iomem after booting with the dock station
/proc/ioports when booting undocked, before docking
/proc/iomem when booting undocked, before docking
/proc/ioports when booting undocked, after docking
/proc/iomem when booting undocked, after docking
Dmesg after the patch
/proc/ioports after the patch
/proc/iomem after the patch

Description Alexander E. Patrakov 2013-04-12 15:34:11 UTC
Created attachment 98421 [details]
Dmesg from 3.9-rc5

This is a fork of bug #56501 and bug #55611. A separate bug is needed because a decision was made to limit these two bugs to non-rescanning of the PCI bus. Feel free, however, to refer to any relevant files (e.g. kernel configs and DSDT dump) or comments attached to these bugs.

As can be seen from the attached dmesg, after docking, there are a lot of warnings in the kernel log. Namely, it fails to assign PCI resources to devices in the dock station. Please fix this.
Comment 1 Alexander E. Patrakov 2013-05-02 05:47:32 UTC
Sorry, due to a physical mishap, the laptop is now out of order. I will not be able to test any patches, thus closing the bug.
Comment 2 Alexander E. Patrakov 2013-06-10 06:24:25 UTC
The laptop has been repaired, and I am ready to follow your debugging instructions.
Comment 3 Alexander E. Patrakov 2013-06-11 05:22:13 UTC
An updated dmesg (from 3.10-rc4) is available in bug 59501
Comment 4 Jiang Liu 2013-06-12 05:11:33 UTC
I have some idea about issue, the resource assignment logic is not so strong for ACPI based pci hotplug. Could you please help to test this patch as a prove of concept?

---
diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
index d0699ed..9675d74 100644
--- a/drivers/pci/hotplug/acpiphp_glue.c
+++ b/drivers/pci/hotplug/acpiphp_glue.c
@@ -695,7 +695,7 @@ static int __ref enable_device(struct acpiphp_slot *slot)
                            dev->hdr_type == PCI_HEADER_TYPE_CARDBUS) {
                                max = pci_scan_bridge(bus, dev, max, pass);
                                if (pass && dev->subordinate) {
-                                       check_hotplug_bridge(slot, dev);
+                                       /*check_hotplug_bridge(slot, dev); */
                                        pci_bus_size_bridges(dev->subordinate);
                                }
                        }
---
Comment 5 Alexander E. Patrakov 2013-06-12 09:52:20 UTC
Created attachment 104491 [details]
Dmesg after the patch

I have booted with a patched kernel and then docked the laptop. As you can see from the dmesg, the patch doesn't affect the problem.
Comment 6 Alexander E. Patrakov 2013-06-12 09:55:55 UTC
Note that the "can't assign mem/io" messages happen even in the initially-docked case, but then they don't prevent devices from working. See e.g. https://bugzilla.kernel.org/attachment.cgi?id=104331 around timestamp 0.727453.
Comment 7 Jiang Liu 2013-06-12 15:49:10 UTC
This bug is caused by the a difference in resource assignment between boot time and runtime hotplug. Basically acpiphp ignores resource assignment from firmware and tries to do that by itself. 

Could you please help to test this?

---
diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
index d0699ed..aba3efb 100644
--- a/drivers/pci/hotplug/acpiphp_glue.c
+++ b/drivers/pci/hotplug/acpiphp_glue.c
@@ -696,13 +696,12 @@ static int __ref enable_device(struct acpiphp_slot *slot)
                                max = pci_scan_bridge(bus, dev, max, pass);
                                if (pass && dev->subordinate) {
                                        check_hotplug_bridge(slot, dev);
-                                       pci_bus_size_bridges(dev->subordinate);
                                }
                        }
                }
        }
 
-       pci_bus_assign_resources(bus);
+       pci_assign_unassigned_bus_resources(bus);
        acpiphp_sanitize_bus(bus);
        acpiphp_set_hpp_values(bus);
        acpiphp_set_acpi_region(slot);
---
Comment 8 Alexander E. Patrakov 2013-06-12 17:47:16 UTC
Created attachment 104531 [details]
Dmesg after the patch about pci_assign_unassigned_bus_resources

Your patch didn't fully help. However, it cured the hotplug case of bug 59581.
Comment 9 Jiang Liu 2013-06-13 00:36:42 UTC
Do you mean all devices function properly after docking with this patch applied? If so, I think we have found the solution for this. On the other hand, we can't get rid of all resource allocation failures because BIOS only allocates limited resources. The target is to make hotplug works as boot time.
Comment 10 Alexander E. Patrakov 2013-06-13 01:40:32 UTC
The patch only helps with the pata_marvell controller and the BluRay drive. Please see bug 59581.
Comment 11 Jiang Liu 2013-06-13 02:57:18 UTC
Then what's the issue left? Any device on dock station functions abnormally?
Could you please help to send the content of /proc/iomem and /proc/ioport, for both boot docked and boot undocked cases? So we could easily compare the differences in resource assignment between boot time and runtime hotplug.
Comment 12 Alexander E. Patrakov 2013-06-13 03:25:29 UTC
Oops, my latest test was invalid. I have applied the patch from comment #7, but not from #4. Will retest and also compare /proc/iomem and /proc/ioport when I return to my laptop.

As for "what's left" - before your patch from comment #7, nothing in the dock station worked, and after your patch, r8169 and pata_marvell devices in the dock station work while radeon and snd_hda_intel don't.
Comment 13 Alexander E. Patrakov 2013-06-13 08:05:38 UTC
Hm. I need more coffee. From the context, it is obvious that the patches from comments #4 and #7 are not meant to be applied together. And it doesn't help anyway.
Comment 14 Alexander E. Patrakov 2013-06-13 08:19:58 UTC
Created attachment 104571 [details]
/proc/ioports after booting with the dock station

These ioports and iomem files are from the 3.10-rc4 kernel with the following patches applied:

[PATCH] ACPI: initialize dock subsystem before scanning PCI root buses
ACPIPHP: fix device destroying order issue in handling dock notification
The fixup moving acpi_scan_lock_acquire()/release(), fixed up to actually apply
The patch from comment #7
Comment 15 Alexander E. Patrakov 2013-06-13 08:20:19 UTC
Created attachment 104581 [details]
/proc/iomem after booting with the dock station
Comment 16 Alexander E. Patrakov 2013-06-13 08:20:59 UTC
Created attachment 104591 [details]
/proc/ioports when booting undocked, before docking
Comment 17 Alexander E. Patrakov 2013-06-13 08:21:27 UTC
Created attachment 104601 [details]
/proc/iomem when booting undocked, before docking
Comment 18 Alexander E. Patrakov 2013-06-13 08:22:20 UTC
Created attachment 104611 [details]
/proc/ioports when booting undocked, after docking
Comment 19 Alexander E. Patrakov 2013-06-13 08:23:23 UTC
Created attachment 104621 [details]
/proc/iomem when booting undocked, after docking

I hope that these /proc dumps help you to progress with this bug.
Comment 20 Jiang Liu 2013-06-14 02:31:04 UTC
According to my investigation, the issue is caused by difference in PCI resource
assignment between boot time and runtime hotplug. On x86 platforms, it respects PCI resource assignment from BIOS and only reassign resources for unassigned BARs. But with acpiphp, it ignores BIOS resource assignment and reassign all resources by OS.
   If we have enough resources, reassigning all PCI resources should work too, but may fail if we are under resource constraints. On the other handle, current PCI IOMM align algorithm may waste huge MMIO address space if we have some PCI devices with huge IOMM BAR.
   On this Sony laptop, BIOS allocates limited IOMM resources for the dock station and the dock station has a gfx which has a 256MB IOMM BAR. So current acpiphp driver fails to allocate resources for most devices on the dock station.
    Currently I'm trying to change acpiphp to respect BIOS resource assignment by calling pcibios_survey_resource_bus(), as in pci_root.c. The other way is to change the IOMM resource allocation algorithm, but obviously it's much more risky of regressions if changing the algorithm. 

So could please help to test this patch?
---
diff --git a/drivers/pci/hotplug/acpiphp_glue.c b/drivers/pci/hotplug/acpiphp_glue.c
index 270fdba..12e3f6e 100644
--- a/drivers/pci/hotplug/acpiphp_glue.c
+++ b/drivers/pci/hotplug/acpiphp_glue.c
@@ -837,13 +837,13 @@ static int __ref enable_device(struct acpiphp_slot *slot)
                                max = pci_scan_bridge(bus, dev, max, pass);
                                if (pass && dev->subordinate) {
                                        check_hotplug_bridge(slot, dev);
-                                       pci_bus_size_bridges(dev->subordinate);
+                                       pcibios_resource_survey_bus(dev->subordinate);
                                }
                        }
                }
        }

-       pci_bus_assign_resources(bus);
+       pci_assign_unassigned_bus_resources(bus);
        acpiphp_sanitize_bus(bus);
        acpiphp_set_hpp_values(bus);
        acpiphp_set_acpi_region(slot);
---
Comment 21 Alexander E. Patrakov 2013-06-14 04:08:53 UTC
The patch helped. Now even radeon doesn't error out.
Comment 22 Alexander E. Patrakov 2013-06-14 04:12:00 UTC
Created attachment 104681 [details]
Dmesg after the patch
Comment 23 Alexander E. Patrakov 2013-06-14 04:12:58 UTC
Created attachment 104691 [details]
/proc/ioports after the patch
Comment 24 Alexander E. Patrakov 2013-06-14 04:13:20 UTC
Created attachment 104701 [details]
/proc/iomem after the patch