Bug 51281

Summary: Suspend to RAM randomly fails with "START_STOP FAILED" on sda
Product: Power Management Reporter: Alberto Mattea (support.intranet)
Component: Hibernation/SuspendAssignee: Aaron Lu (aaron.lu)
Status: CLOSED UNREPRODUCIBLE    
Severity: normal CC: aaron.lu, alan, rjw
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 3.6.2 Subsystem:
Regression: Yes Bisected commit-id:
Attachments: full lspci
dmesg from kernel 2.6.37.2
dmesg from kernel 3.6.2

Description Alberto Mattea 2012-12-03 21:33:39 UTC
Created attachment 88351 [details]
full lspci

When trying to suspend to ram S3 the machine, one of these things will happen, apparently randomly:

1) The screen goes black, everything freezes for about 15-20 seconds (but fans stay on), then the LED blinks and the systems returns to the desktop

2) The screen goes black, everything freezes for quite a long time (30-60 seconds), then the system finally goes to S3

3) The system goes immediately to S3

1) is by far the most common, followed by 2). I have seen 3) (the expected behaviour) only a couple of times. Usually when 1) happens, trying again will produce 2). However, a few times 1) happened 2-3 times in a row before 2).

When 1) happens, in dmesg there is:

[11615.245918] PM: Syncing filesystems ... done.
[11615.295922] Freezing user space processes ... (elapsed 0.01 seconds) done.
[11615.306991] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
[11615.317981] Suspending console(s) (use no_console_suspend to debug)
[11615.318331] sd 5:0:0:0: [sda] Synchronizing SCSI cache
[11615.319217] sd 5:0:0:0: [sda] Stopping disk
[11615.321015] i8042 kbd 00:08: wake-up capability enabled by ACPI
[11645.363404] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[11645.363407] ata6.00: cmd e0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[11645.363407]          res 40/00:00:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
[11645.363411] ata6: hard resetting link
[11645.363412] ata6: nv: skipping hardreset on occupied port
[11645.817242] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[11645.823417] ata6.00: configured for UDMA/100
[11645.823419] ata6.00: device reported invalid CHS sector 0
[11645.823425] ata6: EH complete
[11645.823427] sd 5:0:0:0: [sda] START_STOP FAILED
[11645.823428] sd 5:0:0:0: [sda]  
[11645.823429] Result: hostbyte=0x00 driverbyte=0x08
[11645.823430] sd 5:0:0:0: [sda]  
[11645.823431] Sense Key : 0xb [current] [descriptor]
[11645.823432] sd 5:0:0:0: [sda]  
[11645.823433] ASC=0x0 ASCQ=0x0
[11645.823435] dpm_run_callback(): 0xffffffff81305120 returns 134217730
[11645.823438] PM: Device 5:0:0:0 failed to suspend async: error 134217730
[11645.823468] PM: Some devices failed to suspend
[11645.823520] ohci_hcd 0000:00:0a.0: setting latency timer to 64
[11645.823529] ehci_hcd 0000:00:0a.1: setting latency timer to 64
[11645.823620] pci 0000:00:0e.0: setting latency timer to 64
[11645.823877] i8042 kbd 00:08: wake-up capability disabled by ACPI
[11645.834250] pata_amd 0000:00:0c.0: setting latency timer to 64
[11645.834261] sata_nv 0000:00:0d.0: setting latency timer to 64
[11645.835216] sata_nv 0000:00:0d.1: setting latency timer to 64
[11645.835495] snd_hda_intel 0000:00:0e.1: setting latency timer to 64
[11645.837251] ata8: port disabled--ignoring
[11645.990823] ata7.01: ACPI cmd ef/03:42:00:00:00:b0 (unknown) filtered out
[11645.990825] ata7.01: ACPI cmd ef/03:0c:00:00:00:b0 (unknown) filtered out
[11645.994898] ata7.00: ACPI cmd ef/03:42:00:00:00:a0 (unknown) filtered out
[11645.994899] ata7.00: ACPI cmd ef/03:0c:00:00:00:a0 (unknown) filtered out
[11645.994905] ata7: nv_mode_filter: 0x739f&0x739f->0x739f, BIOS=0x7000 (0xc0c00000) ACPI=0x701f (60:60:0x1f)
[11645.994907] ata7: nv_mode_filter: 0x739f&0x739f->0x739f, BIOS=0x7000 (0xc0c00000) ACPI=0x701f (60:60:0x1f)
[11646.000653] ata7.00: configured for UDMA/33
[11646.006734] ata7.01: configured for UDMA/33
[11646.545940] ata5: SATA link down (SStatus 0 SControl 300)
[11646.558933] ata2: SATA link down (SStatus 0 SControl 300)
[11646.558941] ata3: SATA link down (SStatus 0 SControl 300)
[11646.558966] ata1: SATA link down (SStatus 0 SControl 300)
[11646.558971] ata4: SATA link down (SStatus 0 SControl 300)
[11646.784547] PM: resume of devices complete after 961.445 msecs
[11646.784739] Restarting tasks ... done.
[11647.064163] forcedeth 0000:00:10.0: irq 42 for MSI/MSI-X
[11647.064203] forcedeth 0000:00:10.0: eth0: MSI enabled
[11647.064426] forcedeth 0000:00:10.0: eth0: no link during initialization
[11647.064864] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[11648.885199] forcedeth 0000:00:10.0: eth0: link up
[11648.885608] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready

When 2) happens, in dmesg there is:

[11801.240224] PM: Syncing filesystems ... done.
[11801.290312] Freezing user space processes ... (elapsed 0.01 seconds) done.
[11801.301706] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
[11801.312676] Suspending console(s) (use no_console_suspend to debug)
[11801.313046] sd 5:0:0:0: [sda] Synchronizing SCSI cache
[11801.315940] i8042 kbd 00:08: wake-up capability enabled by ACPI
[11861.477568] ata6.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[11861.477571] ata6.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0
[11861.477571]          res 40/00:00:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
[11861.477575] ata6: hard resetting link
[11861.477575] ata6: nv: skipping hardreset on occupied port
[11861.931333] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[11861.937504] ata6.00: configured for UDMA/100
[11861.937505] ata6.00: retrying FLUSH 0xea Emask 0x4
[11861.940305] ata6.00: device reported invalid CHS sector 0
[11861.940311] ata6: EH complete
[11861.942599] sd 5:0:0:0: [sda] Stopping disk
[11861.955310] PM: suspend of devices complete after 60665.827 msecs
[11861.955428] PM: late suspend of devices complete after 0.116 msecs
[11861.966334] ehci_hcd 0000:00:0a.1: wake-up capability enabled by ACPI
[11861.977301] ohci_hcd 0000:00:0a.0: wake-up capability enabled by ACPI
[11861.988420] PM: noirq suspend of devices complete after 33.004 msecs
[11861.989052] ACPI: Preparing to enter system sleep state S3
[11861.989449] PM: Saving platform NVS memory
[11861.989486] Disabling non-boot CPUs ...
[11861.991182] smpboot: CPU 1 is now offline
[11861.992744] smpboot: CPU 2 is now offline
[11861.994234] smpboot: CPU 3 is now offline
[11861.994496] Extended CMOS year: 2000


So it seems the disk (a Crucial C300 256GB) is failing to suspend. The motherboard is an M2N32-SLI deluxe, lspci attached.

With version 2.6.37.2 it worked without problems 100% of the times.
Comment 1 Alan 2012-12-04 10:16:37 UTC
Actually this looks like it is being suspended too early
Comment 2 Aaron Lu 2012-12-11 03:27:08 UTC
Similar bug as Bug 48951, from 2.6.38 some hard disks fail to process the STANDBY IMMEDIATE command on suspend.
Comment 3 Aaron Lu 2012-12-11 07:06:20 UTC
Full dmesgs for both v2.6.38(the bad one) and v2.6.37(the good one) are helpful, please attach them, thanks.
Comment 4 Alberto Mattea 2012-12-12 19:29:22 UTC
I'd love to do that but after a CMOS clear S3 mysteriously started working (it has now suspended correctly about ten times in a row). Note that I have put exactly the same values as before in the BIOS. If dmesg would be useful anyway, tell me and I'll gather them.
Comment 5 Aaron Lu 2012-12-13 01:41:50 UTC
Yes, please attach the dmesg. I want to know which host controller this failing port belongs to, thanks.
Comment 6 Alberto Mattea 2012-12-17 13:33:42 UTC
Created attachment 89371 [details]
dmesg from kernel 2.6.37.2
Comment 7 Alberto Mattea 2012-12-17 13:34:12 UTC
Created attachment 89381 [details]
dmesg from kernel 3.6.2
Comment 8 Alberto Mattea 2012-12-17 13:34:35 UTC
Here it is, sorry for the delay
Comment 9 Aaron Lu 2012-12-19 00:45:21 UTC
NVIDIA MCP55 SATA controller, using sata_nv driver.

It seems this controller has problems dealing with standby command, I've seen some reports on this:
http://marc.info/?l=linux-ide&m=133534061316338&w=2
Bug 48951


I'll close this bug and if the problem happened again, feel free to re-open it, thanks.