Bug 35452

Summary: Devices not available after resume from suspend to RAM
Product: Drivers Reporter: Rafael J. Wysocki (rjw)
Component: PCI Assignee: Rafael J. Wysocki (rjw)
Status: ASSIGNED ---    
Severity: normal CC: adrian.fita, alan, andiry.xu, arindam.nath, bjorn, cliff.cai, empx, jbarnes, mmvinni, ray.huang, rjw, shane.huang
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 3.6.2 Subsystem:
Regression: No Bisected commit-id:
Bug Depends on:    
Bug Blocks: 7216    
Attachments: acpidump from the problematic HP Pavilion dv5 laptop, model dv5-1250eo
PCI / PM: Do not disable driverless devices during suspend
PCI / PM: Prepare driverless devices for system suspend
dmesg from 592fe89 (3.4.0-rc3+00036)
dmesg when an SD card is inserted while laptop suspended
dmesg from v3.6

Description Rafael J. Wysocki 2011-05-19 21:27:18 UTC
On some systems certain devices appear to be in D3 after resume and cannot be
brought back to D0.

Reference: https://lkml.org/lkml/2011/5/15/84
Comment 1 Andiry Xu 2011-05-20 10:17:59 UTC
The following pm test passed on 2.6.39:

# echo core > /sys/power/pm_test
# echo mem > /sys/power/state

The test passed both with xhci-hcd removed and with it loaded.
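
For reference, with pm_test set to "core" the write to /sys/power/state goes through the whole suspend sequence except the actual entry into the sleep state, waits a few seconds, and then runs the resume path. A minimal sketch that wraps the two commands above and switches test mode off again afterwards. The function name pm_core_test and its optional directory argument are illustrative assumptions, not part of this report.

```sh
#!/bin/sh
# Hypothetical helper: run one "core"-level pm_test cycle, then restore
# normal suspend behaviour. The optional argument (default /sys/power)
# exists only so the function can be tried against a scratch directory.
pm_core_test() {
    power_dir="${1:-/sys/power}"
    # Select the "core" test level for the next suspend attempt.
    echo core > "$power_dir/pm_test" || return 1
    # Start the simulated suspend to RAM; the write returns after resume.
    echo mem > "$power_dir/state" || return 1
    # Switch test mode off again so that later suspends are real ones.
    echo none > "$power_dir/pm_test"
}
```

Run it as root. The pm_test file is only present on kernels built with PM debugging support.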
Comment 2 Mikko Vinni 2011-05-20 19:55:50 UTC
Created attachment 58832
acpidump from the problematic HP Pavilion dv5 laptop, model dv5-1250eo
Comment 3 Mikko Vinni 2011-05-20 20:07:14 UTC
I don't understand how xhci-hcd (comment #1) is relevant to this problem. This HP laptop is maybe 3 to 4 years old; there is no USB 3 here and xhci-hcd is not compiled.
Comment 4 Rafael J. Wysocki 2011-05-20 20:11:59 UTC
On the other machine having this problem the USB3 controllers are affected.

@Mikko: Do you have a kernel version where the problem didn't happen?

@Andiry: Is it possible to do full suspend/resume on that machine with 2.6.39?
Comment 5 Mikko Vinni 2011-05-20 21:47:33 UTC
No known good kernel here. I recently checked with the default 2.6.32 kernel from Ubuntu 10.10 - fails in the exact same way.
Comment 6 Andiry Xu 2011-05-21 07:16:24 UTC
Test on 2.6.39 with full suspend/resume:

If xhci-hcd is loaded during S3: pass
If xhci-hcd is removed before suspend and loaded after resume: fail, the host controller is in D3 and initialization fails.
Comment 7 Rafael J. Wysocki 2011-05-21 09:26:36 UTC
(In reply to comment #6)
> Test on 2.6.39 with full suspend/resume:
> 
> If xhci-hcd is loaded during S3: pass

Good.

> If xhci-hcd is removed before suspend and load after resume: fail, the host
> controller is in D3 and initialization fails.

Well, I'm not sure what the problem is here.  Perhaps our handling of
driverless devices during suspend/resume is too aggressive.
Comment 8 Rafael J. Wysocki 2011-05-21 09:33:46 UTC
Created attachment 58872
PCI / PM: Do not disable driverless devices during suspend
Comment 9 Rafael J. Wysocki 2011-05-21 09:36:01 UTC
Created attachment 58882
PCI / PM: Prepare driverless devices for system suspend
Comment 10 Rafael J. Wysocki 2011-05-21 09:39:26 UTC
@Andiry: Please test with xhci-hcd unloaded during suspend and with three
combinations of the two patches above:
(1) The patch from comment #8 applied alone.
(2) The patch from comment #9 applied alone.
(3) Both patches applied.

@Mikko: Are the drivers of the affected devices loaded before suspend?
Comment 11 Mikko Vinni 2011-05-21 20:23:28 UTC
At least firewire_ohci, sdhci-pci, and jmb38x_ms have been loaded:

firewire_ohci 0000:0a:00.0: Refused to change power state, currently in D3
sdhci-pci 0000:0a:00.1: Refused to change power state, currently in D3
pci 0000:0a:00.2: Refused to change power state, currently in D3
jmb38x_ms 0000:0a:00.3: Refused to change power state, currently in D3
pci 0000:0a:00.4: Refused to change power state, currently in D3
PM: early resume of devices complete after 111.888 msecs

(I have also tried with firewire_ohci blacklisted and sdhci-pci rmmod'ed before suspend, but that didn't change anything)

I don't know the logic of the printout, but 0a:00.2 should also be covered by the sdhci-pci module:

0a:00.2 SD Host controller: JMicron Technology Corp. Standard SD Host Controller (prog-if 01)
        Subsystem: Hewlett-Packard Company Device 3600
        Flags: fast devsel, IRQ 18
        Memory at d1100a00 (32-bit, non-prefetchable) [size=256]
        Capabilities: [a4] Power Management version 3
        Capabilities: [80] Express Endpoint, MSI 00
        Capabilities: [94] MSI: Enable- Count=1/1 Maskable- 64bit-
        Kernel modules: sdhci-pci

For 0a:00.4 there is no driver, neither compiled nor loaded:

0a:00.4 System peripheral: JMicron Technology Corp. xD Host Controller
        Subsystem: Hewlett-Packard Company Device 3600
        Flags: bus master, fast devsel, latency 0, IRQ 10
        Memory at d1100800 (32-bit, non-prefetchable) [size=256]
        Capabilities: [a4] Power Management version 3
        Capabilities: [80] Express Endpoint, MSI 00
        Capabilities: [94] MSI: Enable- Count=1/1 Maskable- 64bit-
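
On the logic of the printout: as far as I know, the prefix of these messages is the name of the driver currently bound to the device, with the bus name ("pci") used when no driver is bound, whereas the "Kernel modules:" line in lspci only lists modules that could claim the device. A small sketch for checking what is actually bound to each function of the slot, using the "driver" symlink in sysfs. The function name pci_slot_drivers and the overridable directory argument are illustrative assumptions.

```sh
#!/bin/sh
# Hypothetical helper: print "<address> <driver>" for every function of one
# PCI slot, using the "driver" symlink that sysfs creates for bound devices.
# $1 = slot prefix such as 0000:0a:00
# $2 = device directory (defaults to /sys/bus/pci/devices)
pci_slot_drivers() {
    slot="$1"
    dev_dir="${2:-/sys/bus/pci/devices}"
    for dev in "$dev_dir/$slot".*; do
        [ -e "$dev" ] || continue      # glob matched nothing
        if [ -L "$dev/driver" ]; then
            # The symlink target ends in the driver name.
            drv=$(basename "$(readlink "$dev/driver")")
        else
            drv="(none)"               # driverless device
        fi
        echo "$(basename "$dev") $drv"
    done
}
```

Usage: pci_slot_drivers 0000:0a:00, run once before suspend and once after resume to see whether the bindings differ.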
Comment 12 Rafael J. Wysocki 2011-05-21 21:18:32 UTC
Are the devices alive after the test from comment #1 (apart from the xHCI part,
of course)?
Comment 13 Mikko Vinni 2011-05-22 22:35:58 UTC
After

# echo core > /sys/power/pm_test
# echo mem > /sys/power/state

everything works ok.
Comment 14 Andiry Xu 2011-05-23 09:35:08 UTC
(In reply to comment #10)
> @Andiry: Please test with xhci-hcd unloaded during suspend and with three
> combinations of the two patches above:
> (1) The patch from comment #8 applied alone.
> (2) The patch from comment #9 applied alone.
> (3) Both patches applied.

None of the three solutions work...:(
Comment 15 Rafael J. Wysocki 2012-04-18 21:29:48 UTC
Mikko, please attach the dmesg output from a kernel built from Linus' current tree, covering one suspend-resume cycle.
Comment 16 Mikko Vinni 2012-04-19 07:27:33 UTC
Created attachment 72977
dmesg from 592fe89 (3.4.0-rc3+00036)
Comment 17 Mikko Vinni 2012-04-19 09:36:08 UTC
Created attachment 72979
dmesg when an SD card is inserted while laptop suspended

I tested suspending with a memory card in the card reader. It seems that if there is a card in the reader at resume time, everything will work. Having the card in at the moment of suspend doesn't matter.

laptop fresh from reboot
no card in the reader
suspend
put an SD card in the card reader (don't have other types to test at hand)
resume
nothing suspicious in logs, SD card can be accessed
Comment 18 Andiry Xu 2012-08-29 17:35:36 UTC
I have left AMD and this mail address is obsolete. Please contact andiry.xu@gmail.com
Comment 19 Bjorn Helgaas 2012-10-01 20:39:15 UTC
Is anything happening on this bug?  Does it still occur with v3.6?  Is there anything I can do to help resolve it?
Comment 20 Cliff Cai 2012-10-01 20:40:12 UTC
I'm on holiday during 10/1/2012~10/5/2012, please expect slow email response.
Comment 21 Mikko Vinni 2012-10-04 16:39:11 UTC
Created attachment 82121
dmesg from v3.6

This is still a problem on my HP laptop.

The attached dmesg log shows the results from
three suspends and resumes on vanilla 3.6.0:
1 (around 630s): a firewire cable (and camera) attached during the s2ram
2 (around 914s): an SD card inserted in the reader during the s2ram
3 (around 1018s): neither firewire nor SD card attached

After 1 and 2 firewire and the card reader seem to work fine.
After 3, this happens:

[ 1020.620279] firewire_ohci 0000:0a:00.0: Refused to change power state, currently in D3
[ 1020.620293] firewire_ohci 0000:0a:00.0: restoring config space at offset 0x3c (was 0xffffffff, writing 0x10a)
...
[ 1020.700274] sdhci-pci 0000:0a:00.1: Refused to change power state, currently in D3
[ 1020.700287] sdhci-pci 0000:0a:00.1: restoring config space at offset 0x3c (was 0xffffffff, writing 0x10a)
...

and so on.

Keeping a memory card inserted all the time works as a
satisfactory workaround, if I know I will be needing
the card reader or firewire. Unfortunately I don't know
how to debug the real problem, but I can try any
suggestions.
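
A note on reading these messages, offered as an interpretation rather than something established in this report: the D-state is taken from the low two bits of the device's PCI power management control/status register (PMCSR), and the "was 0xffffffff" in the config space lines suggests that reads from these devices returned all ones at that point. An all-ones value also decodes as D3, so the message alone may not say which state the device is really in. A sketch of the decoding, with pmcsr_to_dstate as a hypothetical helper name:

```sh
#!/bin/sh
# Hypothetical helper: decode the power state field of a PMCSR value.
# Bits 1:0 hold the state: 0 = D0, 1 = D1, 2 = D2, 3 = D3.
pmcsr_to_dstate() {
    case $(( $1 & 3 )) in
        0) echo D0 ;;
        1) echo D1 ;;
        2) echo D2 ;;
        3) echo D3 ;;
    esac
}
```

For example, pmcsr_to_dstate 0xffff prints D3, the same answer a device that is genuinely in D3 would give.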
Comment 22 Mike 2012-10-23 17:34:32 UTC
I have this D3 problem on my Asrock P67 Pro3's integrated NIC, a Realtek RTL8111E.
However, it only happens on about every 10th resume; on the other 9 resumes everything is fine. The problem persists in 3.6.2 and has occurred for over a year.
log:
kernel: r8169 0000:03:00.0: Refused to change power state, currently in D3
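
Because the failure is intermittent on some machines, it can help to pull the affected devices out of a saved log instead of reading it by eye. A minimal sketch that filters the "Refused to change power state" lines and prints the PCI address and the message prefix for each. The function name refused_devices is an illustrative assumption, and the matching relies only on the message format quoted in this report.

```sh
#!/bin/sh
# Hypothetical helper: list "<address> <prefix>" for every device that
# logged "Refused to change power state". Reads the named files, or
# standard input when no file is given.
refused_devices() {
    awk '/Refused to change power state/ {
        # Find the field that looks like a PCI address followed by ":";
        # the field just before it is the driver (or bus) name.
        for (i = 2; i <= NF; i++)
            if ($i ~ /^[0-9a-f]+:[0-9a-f]+:[0-9a-f]+\.[0-7]:$/) {
                addr = $i
                sub(/:$/, "", addr)
                print addr, $(i - 1)
                break
            }
    }' "$@"
}
```

Usage: dmesg | refused_devices, or refused_devices followed by the name of a saved log file.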