Bug 99751
Created attachment 179461 [details]
3.14.44 success case -- analyze_suspend.py output
0000:00:1c.0 is not seen on the resume side of the graph
Created attachment 179471 [details]
3.15.0 failure case - analyze_suspend.py output
notice the 2250ms 0000:00:1c.0 in the yellow resume section
click on "device detail" and notice it has no children
Created attachment 179481 [details]
3.15.0 failure case - dmesg
in this dmesg:
$ grep 0000:00:1c.0 *dmesg*
[ 35.291619] calling 0000:00:1c.0+ @ 106, parent: pci0000:00
[ 35.291624] call 0000:00:1c.0+ returned 0 after 2 usecs
[ 35.529334] calling 0000:00:1c.0+ @ 6, parent: pci0000:00
[ 35.529577] call 0000:00:1c.0+ returned 0 after 235 usecs
[ 35.619421] calling 0000:00:1c.0+ @ 6, parent: pci0000:00
[ 35.619561] call 0000:00:1c.0+ returned 0 after 134 usecs
[ 35.635103] calling 0000:00:1c.0+ @ 1730, parent: pci0000:00
[ 37.885840] pciehp 0000:00:1c.0:pcie04: Link Training Error occurs
[ 37.885842] pciehp 0000:00:1c.0:pcie04: Failed to check link status
[ 37.885846] call 0000:00:1c.0+ returned 0 after 2197684 usecs
[ 37.957392] pcieport 0000:00:1c.0: no hotplug settings from platform
[ 37.958538] pcieport 0000:00:1c.0: no hotplug settings from platform
Created attachment 179491 [details]
4.1.0-rc5+ failure case - analyze_suspend.py
see in the yellow resume section, 0000:00:1c.0 {pcieport},
which takes 2249ms -- the slowest device in the system, by far.
Created attachment 179501 [details]
4.1.0-rc5+ failure case - dmesg
$ grep 0000:00:1c.0 *dmesg*
[ 34.624011] calling 0000:00:1c.0+ @ 107, parent: pci0000:00
[ 34.624032] call 0000:00:1c.0+ returned 0 after 18 usecs
[ 34.755665] calling 0000:00:1c.0+ @ 96, parent: pci0000:00
[ 34.755735] call 0000:00:1c.0+ returned 0 after 67 usecs
[ 34.795443] calling 0000:00:1c.0+ @ 97, parent: pci0000:00
[ 34.795675] call 0000:00:1c.0+ returned 0 after 222 usecs
[ 34.816632] calling 0000:00:1c.0+ @ 1668, parent: pci0000:00
[ 37.065898] pciehp 0000:00:1c.0:pcie04: Link Training Error occurs
[ 37.065899] pciehp 0000:00:1c.0:pcie04: Failed to check link status
[ 37.065907] call 0000:00:1c.0+ returned 0 after 2198217 usecs
Hi Len, would you please:
- attach a complete dmesg log from a recent kernel, including the initial boot, and
- attach "lspci -vvv" output collected as root?

"pcie04" is the hotplug service (PCIE_PORT_SERVICE_HP). The "service" idea is a way to have separate drivers bind to the different services (AER, PME, hotplug, etc.) associated with the single 00:1c.0 Root Port.

The "04" in the "lspci -tv" output is the secondary bus number of the 1c.0 Root Port, and it's only coincidence that it resembles "pcie04".

From other XPS 11 lspci output on the web, I think 1c.0 does not have a slot, so pciehp should not touch it (but I haven't yet convinced myself that the pciehp probe path handles this correctly).

Created attachment 179511 [details]
4.1.0-rc5+ failure case - ftrace
analyze_suspend.py also grabs some ftrace points in the kernel.
Curiously, this ftrace output shows:
pci_bus 0000:04, parent: 0000:00:1c.0, [suspend]
but lspci does not show device 0000:04
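
For what it's worth, the 0000:04 seen here and the "[04]" in "lspci -tv" are the secondary bus number behind the 00:1c.0 Root Port, as explained above. A minimal sketch of how that number could be read from the bridge's config space (illustration only, not code from any patch in this report):

#include <linux/pci.h>

/* Read a bridge's secondary bus number from its type-1 config header. */
static u8 read_secondary_bus(struct pci_dev *bridge)
{
        u8 secondary = 0;

        /* PCI_SECONDARY_BUS is the standard offset 0x19 in a bridge header */
        pci_read_config_byte(bridge, PCI_SECONDARY_BUS, &secondary);
        dev_info(&bridge->dev, "secondary bus: %02x\n", secondary);
        return secondary;
}

For 00:1c.0 on this machine that would print "secondary bus: 04", matching the empty [04] bus in the topology output.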
Created attachment 179521 [details]
4.1.0-rc5+ dmesg from boot
Created attachment 179531 [details]
4.1.0-rc5+ lspci -vvv as root
Created attachment 179541 [details]
4.1.0-rc5+ dmesg from boot, including suspend/resume
Oops, the previous dmesg was collected before the suspend;
this one is a superset, including boot and suspend/resume.
Created attachment 179551 [details]
debug patch

I'm stumped.

Re: comment #7, 0000:04 refers to a PCI bus, not to a device. This one is the secondary bus of the 00:1c.0 root port, and there is no device on that bus.

On 4.1.0-rc5, I think you should see a delay of 1.1 sec or so. Here are the capabilities of this slot:

  pciehp 0000:00:1c.0:pcie04: Slot #0 AttnBtn- AttnInd- PwrInd- PwrCtrl- MRL- Interlock- NoCompl+ LLActRep+

This is the path we should be in:

  pciehp_resume
    pcie_enable_notification
      pcie_write_cmd
        pcie_wait_cmd
          if (NO_CMD_CMPL)
            return                             # A
    pciehp_get_adapter_status(..., &status)
    if (status)                                # B
      pciehp_enable_slot
        pciehp_get_adapter_status
        board_added
          pciehp_check_link_status
            if (ctrl->link_active_reporting)   # C
              pcie_wait_link_active            # D  1 sec delay
            msleep(100)                        # E  0.1 sec delay
            "Link Training Error occurs"
            "Failed to check link status"

At (A), we don't wait at all because this Root Port has the No Command Completed Support bit set ("NoCompl+" above and in lspci), so we can write commands with no delay. This path has changed quite a bit since v3.14, so I wouldn't be surprised if we did wait for 1 sec here on some kernels between v3.15 and v4.0.

At (B), we apparently read a Slot Status with the Presence Detect State bit set. This is a puzzle because your lspci output shows this bit cleared, and I assume you didn't add a device while we were suspended.

At (C), this Root Port has the Data Link Layer Link Active Reporting Capable bit set ("LLActRep+" above and in lspci), so we can poll for Link Active instead of always waiting 1 sec.

At (D), we poll for up to 1 sec, waiting for Data Link Layer Link Active to be set in the Link Status register. But it never becomes set, probably because there's no device on the other end, so we time out after 1 sec.

At (E), we always wait 100ms, presumably because of some PCIe spec requirement.

Oops, I meant to ask if you could test with the debug patch in comment #11. Maybe there's some slot capability that changed over suspend/resume, and we aren't refreshing it when we should be.

Created attachment 179611 [details]
4.1.0-rc5+ dmesg from boot, including suspend/resume, with patch from comment #11 applied

4.1.0-rc5+ dmesg from boot, including suspend/resume, with the patch from comment #11 applied:

[ 0.431618] pcieport 0000:00:1c.0: in pciehp_probe:
[ 0.431621] pcieport 0000:00:1c.0: PCI_EXP_FLAGS 0x0142
[ 0.431637] pcieport 0000:00:1c.0: PCI_EXP_SLTSTA 0x0000
[ 34.361722] pcieport 0000:00:1c.0: in pciehp_resume:
[ 34.361727] pcieport 0000:00:1c.0: PCI_EXP_FLAGS 0x0042
[ 34.361765] pcieport 0000:00:1c.0: PCI_EXP_SLTSTA 0x0040

#define PCI_EXP_FLAGS_SLOT  0x0100
#define PCI_EXP_SLTSTA_PDS  0x0040

At boot, 00:1c.0 claims to have a slot (PCI_EXP_FLAGS_SLOT is set) but there's no device in the slot (PCI_EXP_SLTSTA_PDS is clear). At resume, 00:1c.0 now claims there's no slot (PCI_EXP_FLAGS_SLOT is clear) but there *is* a device (PCI_EXP_SLTSTA_PDS is set).

pciehp doesn't re-read PCI_EXP_FLAGS at resume time, so it still thinks there's a slot. It does read Slot Status, which now claims there's a device, so it tries to bring up the link. But there isn't really a device, so the link never comes up and we time out.

PCI_EXP_FLAGS_SLOT is HwInit, so it's read-only from our point of view. I'm suspicious of the fact that it changed between boot and resume. Is this a hardware defect, a BIOS problem, some kind of Linux re-initialization problem? I don't know how we should handle this or why it seemed to work in v3.14.
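
To make the "pciehp doesn't re-read PCI_EXP_FLAGS at resume time" observation concrete, here is a minimal sketch of the kind of resume-time re-check being discussed. It is illustrative only -- it is not the attached debug patch, and the caller that supplies the probe-time value is assumed:

#include <linux/pci.h>

/*
 * Compare the Slot Implemented bit seen at probe time with what the
 * port reports after resume; warn if it changed underneath us.
 */
static void recheck_slot_implemented(struct pci_dev *port, u16 flags_at_probe)
{
        u16 flags_now;

        pcie_capability_read_word(port, PCI_EXP_FLAGS, &flags_now);
        if ((flags_at_probe ^ flags_now) & PCI_EXP_FLAGS_SLOT)
                dev_warn(&port->dev,
                         "Slot Implemented changed across resume: 0x%04x -> 0x%04x\n",
                         flags_at_probe, flags_now);
}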
There were several pciehp changes between v3.14 and v3.15, but nothing obvious jumps out to me. This may have always been broken.
> changes between v3.14 and v3.15
Intending to bisect, I discovered that 3.14.0 fails too,
so I re-built and re-tested 3.14.44, and it still fails.
Although I do have analyze_suspend results for the 3.14.44 "success" case,
it is possible that I mixed that result up with a "pcie_ports=compat" test.
Unfortunately, analyze_suspend does not save the whole dmesg
and so the cmdline for that test wasn't saved.
I know that when I first got this machine, pcieport was not the slowest
resume device (i915 was), so I do think this is a regression. However,
it could be that whatever I was running a year ago had this driver disabled.
That said, what is the benefit of the pcieport driver on
a laptop with no exposed PCIE ports?
> That said, what is the benefit of the pcieport driver on
> a laptop with no exposed PCIE ports?
The question is how pciehp could tell whether it is needed. The code that does this is a bit of a mess, but I think pciehp claims a port if:
- _OSC has granted native PCIe hotplug control to the OS,
- PCIe Capabilities Register has "Slot Implemented" set, and
- PCIe Slot Capabilities has "Hot-Plug Capable" set.
Based on your dmesg and lspci output, these are true for 00:1c.0 and 00:1c.1, but not for 00:1c.3 (where Hot-Plug Capable is not set).
Is there some other indication pciehp should look at to determine whether a port is exposed? I don't see anything else in PCIe, but maybe there's something in ACPI?
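
For reference, the last two of those checks boil down to two register reads. A rough sketch, assuming the _OSC/native-hotplug decision has already been made elsewhere (this is not the actual pciehp probe code):

#include <linux/pci.h>

/* Does this port advertise a hot-pluggable slot? */
static bool port_advertises_hotplug_slot(struct pci_dev *port)
{
        u16 flags;
        u32 sltcap;

        pcie_capability_read_word(port, PCI_EXP_FLAGS, &flags);
        if (!(flags & PCI_EXP_FLAGS_SLOT))      /* Slot Implemented clear */
                return false;

        pcie_capability_read_dword(port, PCI_EXP_SLTCAP, &sltcap);
        return sltcap & PCI_EXP_SLTCAP_HPC;     /* Hot-Plug Capable */
}

On this machine, 00:1c.0 and 00:1c.1 pass both checks, while 00:1c.3 fails the second one, matching the dmesg and lspci output.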
FWIW, I updated the BIOS to A02 from A00 today, and it made no difference.

I've been unable to discover what scenario caused this issue not to appear in an earlier experiment, other than booting with the pcieport driver disabled, so I'm clearing the "regression" bit on this report -- it seems Linux has had this trouble with the XPS11 forever.

Hmm... Len mentions that he has seen the same kernel (3.14.44) work in the past, and now sees this issue. Since pciehp relies purely on the slot status and capability registers, and we see those bits changing in the logs here (after a resume, the controller says it doesn't have a slot, but now has a device), do you think this could be an intermittent issue where, because of some hardware problem (voltage, noise, spikes), some of the bits transiently get flipped? Is there another XPS11 that can be tried?

(In reply to Len Brown from comment #17)
> FWIW, I updated the BIOS to A02 from A00 today, and it made no difference.
>
> I've been unable to discover what scenario caused this issue to not appear
> in an earlier experiment -- other than booting with the pcieport driver
> disabled;
> so I'm clearing the "regression" bit on this report -- it seems Linux
> has had this trouble with the XPS11 forever.

I think that is a silicon bug. The 00:1c.0 is from the ICH, and that slot is a dummy hotplug slot ... We checked with Intel before, and they agreed that it is a silicon bug.

> I think that is a silicon bug. The 00:1c.0 is from the ICH, and that slot is
> a dummy hotplug ...
>
> We checked with Intel before, and they agreed that it is a silicon bug.
Can you give us any more hints? A URL to an erratum? If we knew, for example, that function 0 of certain Root Ports can never have a slot, but it sometimes reports one, maybe we could work around that in pciehp.
(In reply to Bjorn Helgaas from comment #20)
>
> Can you give us any more hints? A URL to an erratum? If we knew, for
> example, that function 0 of certain Root Ports can never have a slot, but it
> sometimes reports one, maybe we could work around that in pciehp.

Maybe Len could check on that internally.

Intel's silicon validation team responded:

"At least the Sunrise Point Spec clearly says that if XCAP.SI is 0 the PDS bit is set to '1'. The Linux driver should re-read the SI bit to make a decision and then look at the PDS."

Presence Detect State (PDS): If XCAP.SI is set (indicating that this root port spawns a slot), then this bit indicates whether a device is connected ('1') or empty ('0'). If XCAP.SI is cleared, this bit is a '1'.

Slot Implemented (SI): Indicates whether the root port is connected to a slot. Slot support is platform specific. BIOS programs this field, and it is maintained until a platform reset.

> Slot Implemented (SI): Indicates whether the root port is connected to a
> slot.
> Slot support is platform specific. BIOS programs this field,
> and it is maintained until a platform reset.

But from comment #14 above, the SI bit is getting cleared between the suspend and resume?

[ 0.431618] pcieport 0000:00:1c.0: in pciehp_probe:
[ 0.431621] pcieport 0000:00:1c.0: PCI_EXP_FLAGS 0x0142
[ 0.431637] pcieport 0000:00:1c.0: PCI_EXP_SLTSTA 0x0000
[ 34.361722] pcieport 0000:00:1c.0: in pciehp_resume:
[ 34.361727] pcieport 0000:00:1c.0: PCI_EXP_FLAGS 0x0042
[ 34.361765] pcieport 0000:00:1c.0: PCI_EXP_SLTSTA 0x0040

The SI bit value is apparently not being maintained.

Created attachment 180191 [details]
debug patch

(In reply to comment #22):

1) I assume XCAP.SI is another name for the "Slot Implemented" bit in the PCI Express Capabilities Register (PCIe spec r3.0, sec 7.8.2).

2) I agree that if SI is 0, PDS in the Slot Status register must be hardwired to 1 (PCIe spec, sec 7.8). The real question is why SI is changing.

3) The PCIe SI bit is HwInit, and I assume on this hardware that translates to "write-once by firmware, then read-only and can only be reset by Fundamental Reset" (PCIe spec sec 7.4).

4) I suppose the implication is that suspend/resume involves a PCIe Fundamental Reset, and Linux should re-read SI when it resumes. If that's true, there may be other places in PCI where we need to re-read things on resume.

5) No hardware changed during suspend/resume, so the fact that SI isn't set again during resume sounds like a BIOS bug.

I don't know much about suspend/resume ... does it involve a Fundamental Reset?

Dangit, I wasn't finished editing that. Bugzilla is over-eager to submit when adding comments to an attachment.

Anyway, if there is a Fundamental Reset, maybe we could do something like the patch in comment #24. I think there would be more to do, too. For example, the PCI Express Capabilities Register is cached in pcie_flags_reg and is now stale.

Bjorn, the patch in comment #24 doesn't build for me -- does it apply to a particular tree?

Created attachment 180261 [details]
debug patch
Sorry, I botched the patch. This applies on v4.1-rc2 (but should apply on any later kernel), and this time I actually did build it.
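
Following the guidance quoted from Intel's validation team (re-read SI, and only then look at PDS), a check along these lines might look like the sketch below. This is an illustration of that advice, not code from either debug patch:

#include <linux/pci.h>

/* Trust Presence Detect State only if the port still implements a slot. */
static bool device_present_in_slot(struct pci_dev *port)
{
        u16 flags, sltsta;

        pcie_capability_read_word(port, PCI_EXP_FLAGS, &flags);
        if (!(flags & PCI_EXP_FLAGS_SLOT))
                return false;   /* no slot: PDS reads as 1 but means nothing */

        pcie_capability_read_word(port, PCI_EXP_SLTSTA, &sltsta);
        return sltsta & PCI_EXP_SLTSTA_PDS;
}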
The patch in comment #27 eliminates the big pcieport delay from resume. dmesg after the resume shows:

pcieport 0000:00:1c.0: slot disappeared!

However, with the patch applied, subsequent suspend/resume cycles now fail -- the machine doesn't make it back far enough to resume the screen. This was tested on top of 4.1-rc8, which can otherwise suspend/resume repeatedly without failure.

Created attachment 241671 [details]
analyze suspend output
still broken in Linux 4.8.0, as shipped by Ubuntu 16.10
this analyze_suspend.py run used the new -dev option,
and so it shows many msleep(10)'s and an msleep(100),
and then many msleep(20)'s invoked by pciehp_check_link_status()
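
Those sleeps correspond to polling for the link to come up. A rough sketch of that kind of loop, assuming a port that advertises Data Link Layer Link Active Reporting (this is an illustration, not the pciehp implementation):

#include <linux/delay.h>
#include <linux/pci.h>

/* Poll Link Status for up to ~1 second, waiting for the link to come up. */
static bool wait_link_active(struct pci_dev *port)
{
        u16 lnksta;
        int i;

        for (i = 0; i < 100; i++) {
                pcie_capability_read_word(port, PCI_EXP_LNKSTA, &lnksta);
                if (lnksta & PCI_EXP_LNKSTA_DLLLA)      /* Data Link Layer Link Active */
                        return true;
                msleep(10);                             /* ~1 s total in 10 ms steps */
        }
        return false;   /* times out when nothing is connected behind the port */
}

With no device behind 00:1c.0, the loop always runs to the timeout, which is where the multi-second resume delay comes from.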
Created attachment 241731 [details]
testing patch

Here's my theory about the hang in comment #28: the first resume succeeds, and since 00:1c.0 no longer advertises a slot, we call pciehp_remove(), which deregisters the slot and frees some data structures. When we attempt a second suspend/resume, pciehp is still bound to 00:1c.0, so we call pciehp_resume() again, we again think the slot disappeared, and we again call pciehp_remove(). This time it fails because we try to use some of the data structures we freed during the first resume.

This patch clears out the service_data pointer the first time, so subsequent resumes should basically be no-ops.

Confirmed on Linux-4.9-rc1+ that the patch in comment #30 removes the pcieport delay from resume and allows multiple resumes to succeed without the hang mentioned in comment #28.

Created attachment 242151 [details]
dmesg with patch
Also verified the patch works as applied to vanilla Linux-4.8.0
Attached is the dmesg resulting from boot with 2 suspend/resume cycles.
Note that when the patch is present, there are a couple of additional lines:
[ 55.927208] pcieport 0000:00:1c.0: PCIe Capabilities was 0x0142, now 0x0042
[ 55.986801] pcieport 0000:00:1c.0: No longer supports hotplug
...
[ 97.024614] pcieport 0000:00:1c.0: No longer supports hotplug
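
The "clears out the service_data pointer" approach above amounts to a simple guard on later resumes. A sketch of the pattern with hypothetical names (my_port_priv and my_service_resume are placeholders, not the names used in the actual patch):

#include <linux/device.h>

struct my_slot_ctx;                     /* hypothetical per-slot state */

struct my_port_priv {                   /* hypothetical per-port service data */
        struct my_slot_ctx *slot;
};

/*
 * After the slot has been torn down once, later resumes become no-ops
 * instead of touching freed data structures.
 */
static int my_service_resume(struct device *dev, struct my_port_priv *priv)
{
        if (!priv || !priv->slot) {
                dev_info(dev, "slot already removed, nothing to resume\n");
                return 0;
        }
        /* ... normal slot re-enable path would go here ... */
        return 0;
}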
Bjorn, I don't see this patch in linux-next. Do you have plans to upstream it? Is there any more information I can provide?

If you have the machine handy, can you try this after a resume (without the patch above)?

setpci -s 00:1c.0 42.w=142
setpci -s 00:1c.0 42.w

The spec says the PCI_EXP_FLAGS_SLOT bit (0x100) is R/WO, and suggests that it becomes read-only after the first write. I think this problem happens because BIOS didn't write the bit after resume. So maybe we can do the write in Linux.

root@xps11-9p33:/# setpci -s 00:1c.0 42.w=142
root@xps11-9p33:/# setpci -s 00:1c.0 42.w
0142

Created attachment 260867 [details]
Still a 3,500 ms delay in Linux-4.14.2
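
The setpci experiment a few comments up could also be done from the kernel after resume. A sketch of what that write might look like -- an experiment only; whether the bit sticks depends on the hardware honoring the R/WO behavior, and this is not a proposed fix:

#include <linux/pci.h>

/*
 * Try to set Slot Implemented again after resume, mirroring
 * "setpci -s 00:1c.0 42.w=142" (0x42 = PCIe cap base 0x40 + PCI_EXP_FLAGS).
 */
static void restore_slot_implemented(struct pci_dev *port)
{
        u16 flags;

        pcie_capability_read_word(port, PCI_EXP_FLAGS, &flags);
        if (!(flags & PCI_EXP_FLAGS_SLOT)) {
                flags |= PCI_EXP_FLAGS_SLOT;    /* R/WO per spec; firmware normally sets it */
                pcie_capability_write_word(port, PCI_EXP_FLAGS, flags);
        }
}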
Upgraded the BIOS on this machine to A04 (1/2/2019) and installed Ubuntu 19.10 (Linux 5.3). This issue is no longer reproducible.
Created attachment 179451 [details]
lspci -vvv output

On a Dell XPS11 laptop, pcieport 0000:00:1c.0 takes several seconds to resume from suspend. Booting with "pcie_ports=compat", or building with CONFIG_HOTPLUG_PCI_PCIE=n, makes this delay go away.

I've tested Linux-3.14.44 and do not see this delay. I've tested Linux-3.15.0 and others up through 4.1-rc5 and I do see this delay. So I'll mark this as a regression.

While I use analyze_suspend.py to trace resume, initcall_debug is sufficient to show this delay in dmesg, and I notice some additional PCI warnings there:

[ 55.897986] pciehp 0000:00:1c.0:pcie04: Link Training Error occurs
[ 55.897988] pciehp 0000:00:1c.0:pcie04: Failed to check link status
[ 55.897992] call 0000:00:1c.0+ returned 0 after 2197660 usecs

(yes, that is 2.19 SECONDS -- where anything > 50ms for resume is considered slow)

Unclear (to me) what pcie04 is...

$ lspci
00:00.0 Host bridge: Intel Corporation Haswell-ULT DRAM Controller (rev 0b)
00:02.0 VGA compatible controller: Intel Corporation Device 0a1e (rev 0b)
00:03.0 Audio device: Intel Corporation Haswell-ULT HD Audio Controller (rev 0b)
00:04.0 Signal processing controller: Intel Corporation Device 0a03 (rev 0b)
00:14.0 USB controller: Intel Corporation 8 Series USB xHCI HC (rev 04)
00:16.0 Communication controller: Intel Corporation 8 Series HECI #0 (rev 04)
00:1b.0 Audio device: Intel Corporation 8 Series HD Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 8 Series PCI Express Root Port 1 (rev e4)
00:1c.1 PCI bridge: Intel Corporation 8 Series PCI Express Root Port 2 (rev e4)
00:1c.3 PCI bridge: Intel Corporation 8 Series PCI Express Root Port 4 (rev e4)
00:1d.0 USB controller: Intel Corporation 8 Series USB EHCI #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation 8 Series LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 8 Series SATA Controller 1 [AHCI mode] (rev 04)
00:1f.3 SMBus: Intel Corporation 8 Series SMBus Controller (rev 04)
00:1f.6 Signal processing controller: Intel Corporation 8 Series Thermal (rev 04)
05:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. RTS5249 PCI Express Card Reader (rev 01)
07:00.0 Network controller: Intel Corporation Wireless 7260 (rev 6b)

This 04 item appears under 1c.0 when I ask lspci about topology:

$ lspci -tv
-[0000:00]-+-00.0  Intel Corporation Haswell-ULT DRAM Controller
           +-02.0  Intel Corporation Device 0a1e
           +-03.0  Intel Corporation Haswell-ULT HD Audio Controller
           +-04.0  Intel Corporation Device 0a03
           +-14.0  Intel Corporation 8 Series USB xHCI HC
           +-16.0  Intel Corporation 8 Series HECI #0
           +-1b.0  Intel Corporation 8 Series HD Audio Controller
           +-1c.0-[04]--
           +-1c.1-[05]----00.0  Realtek Semiconductor Co., Ltd. RTS5249 PCI Express Card Reader
           +-1c.3-[07]----00.0  Intel Corporation Wireless 7260
           +-1d.0  Intel Corporation 8 Series USB EHCI #1
           +-1f.0  Intel Corporation 8 Series LPC Controller
           +-1f.2  Intel Corporation 8 Series SATA Controller 1 [AHCI mode]
           +-1f.3  Intel Corporation 8 Series SMBus Controller
           \-1f.6  Intel Corporation 8 Series Thermal