Bug 48951 - Hard drive fails to suspend (scsi_bus_suspend+0x0/0x10 returns 134217730)
Summary: Hard drive fails to suspend (scsi_bus_suspend+0x0/0x10 returns 134217730)
Status: CLOSED WILL_NOT_FIX
Alias: None
Product: Power Management
Classification: Unclassified
Component: Hibernation/Suspend
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: Aaron Lu
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2012-10-17 02:03 UTC by Joe Sapp
Modified: 2014-11-30 09:38 UTC

See Also:
Kernel Version: 3.6.2
Subsystem:
Regression: No
Bisected commit-id:


Attachments
A section from dmesg around the suspend test (2.46 KB, text/plain)
2012-10-17 02:03 UTC, Joe Sapp
dmesg output of suspend/resume (79.27 KB, text/plain)
2012-10-17 18:05 UTC, Dimitris Damigos
dmesg from kernel 2.6.37 (44.79 KB, text/plain)
2013-01-08 01:50 UTC, Joe Sapp
dmesg from kernel 2.6.38 (59.85 KB, text/plain)
2013-01-08 01:51 UTC, Joe Sapp
Successful dmesg with async suspend disabled, kernel 3.7.2 (6.74 KB, text/plain)
2013-01-14 18:01 UTC, bladud
Failed suspend dmesg from 3.7.2, with async suspend enabled (5.96 KB, text/plain)
2013-01-14 18:03 UTC, bladud
Add debug message when devices are suspended (926 bytes, patch)
2013-03-08 05:32 UTC, Aaron Lu
Successful dmesg with debug patch applied (65.88 KB, text/plain)
2013-03-10 02:34 UTC, bladud
Failed dmesg with debug patch applied (64.86 KB, text/plain)
2013-03-10 02:35 UTC, bladud
Full successful dmesg with debug patch (122.35 KB, text/plain)
2013-03-13 04:48 UTC, bladud
Full failed dmesg with debug patch attached (188.37 KB, text/plain)
2013-03-13 05:15 UTC, bladud
Results from test in comment #31 (291.23 KB, text/plain)
2013-03-14 02:15 UTC, Joe Sapp

Description Joe Sapp 2012-10-17 02:03:51 UTC
Created attachment 83711 [details]
A section from dmesg around the suspend test

Suspend to ram has not worked for me in versions 3.5.3, 3.5.4, and 3.6.2.  I tried to do some basic debugging and it seems to be the SATA hard drive isn't suspending properly.  I've attached a section of dmesg around when I performed the following test:

echo devices > /sys/power/pm_test
echo mem > /sys/power/state
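
A hedged note on the test above: on kernels that provide /sys/power/pm_test, the selected test level stays in effect until it is changed, and reading the file shows the active level in square brackets. The helper name current_pm_test below is hypothetical and only for illustration; it reports the active level so that it can be set back with "echo none > /sys/power/pm_test" once testing is finished.

```shell
#!/bin/sh
# Hypothetical helper: print the currently selected pm_test level.
# The file lists all levels and brackets the active one,
# e.g. "none core processors platform [devices] freezer".
current_pm_test() {
    f="${1:-/sys/power/pm_test}"
    # Print only the text between the square brackets.
    sed -n 's/.*\[\([^]]*\)\].*/\1/p' "$f"
}
```

Calling current_pm_test with no argument reads the real sysfs file; passing a path is only useful for trying the function against a copy.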
Comment 1 Alan 2012-10-17 10:39:23 UTC
Curious - it seems that when we went to tell the disk to go to sleep it had already fallen off the bus.
Comment 2 Dimitris Damigos 2012-10-17 18:05:40 UTC
Created attachment 83731 [details]
dmesg output of suspend/resume

I have a similar problem that might be connected to this bug. With kernel 3.6.2, when the laptop resumes from suspend it seems that the hard disk never resumes and I get input/output errors. With kernel 3.5.x I didn't have this problem. I have attached the output of dmesg.
Comment 3 Aaron Lu 2012-11-15 02:01:17 UTC
Hello Joe,

Is there a known working kernel version?
The disk does not seem to respond to the STANDBY IMMEDIATE command; the command timed out.
Comment 4 Joe Sapp 2012-12-02 20:33:33 UTC
(In reply to comment #3)

2.6.37 suspended properly and 2.6.38 fails to work -- however, both of these used the TuxOnIce patchset.  I couldn't find any differences in drivers/scsi/scsi_pm.c between the two versions and I'm not sure where else to look.  I tried the "devices" test on 2.6.38, but it froze after suspending.  I could try the PM_TRACE_RTC test and see if that shows anything interesting.
Comment 5 Aaron Lu 2012-12-04 06:01:17 UTC
I just took a look at TuxOnIce; it is another implementation of hibernation, so it should not affect suspend to RAM. Still, you can test whether the vanilla kernel has the same problem. The full dmesg might also be helpful.
Comment 6 Aaron Lu 2012-12-11 07:08:52 UTC
Full dmesgs for both v2.6.38 (the bad one) and v2.6.37 (the good one) would be
helpful; please attach them, thanks.
Comment 7 Joe Sapp 2013-01-08 01:50:06 UTC
Created attachment 90641 [details]
dmesg from kernel 2.6.37
Comment 8 Joe Sapp 2013-01-08 01:51:18 UTC
Created attachment 90651 [details]
dmesg from kernel 2.6.38
Comment 9 Aaron Lu 2013-01-08 05:26:14 UTC
PCI ID, 10de:0266
NVIDIA MCP51 SATA controller, using sata_nv driver.

Very similar problem as Bug 51281.
And also see here:
http://marc.info/?l=linux-ide&m=133534061316338&w=2

To me it looks like the NVIDIA MCP SATA controller has a problem dealing with the STANDBY command, but I don't know what exactly the problem is...

BTW, can you please check whether this error also occurs when you shut down the system? Search the kernel messages for "START_STOP FAILED". You can check the kernel messages in /var/log/messages after you boot the system again. Please note that you have to do a shutdown to see whether the error occurred, as a reboot will not stop the disk.

Thanks.
Comment 10 bladud 2013-01-14 17:50:15 UTC
I get the same problem, since kernel 3.3, as in the linked message. 

I found that reverting
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=966f1212e1ac5fe3ddf04479d21488ddb36a2608
on top of linux 3.5.5 fixes the problem.

Figuring that disabling async suspend would help, I discovered that all kernels can suspend successfully by doing:

echo disabled > /sys/devices/pci0000:00/0000:00:05.[01]/ata?/host?/target*/power/async

On this system, "/sys/devices/pci0000:00/0000:00:05.[01]" means "all occupied sata ports" - there is also a pata drive, which didn't seem to have a problem. 

There are also a bunch of other power/async files, such as
/sys/devices/pci0000:00/0000:00:05.0/ata1/host0/target0:0:0/0:0:0:0/power/async
which didn't seem to make a difference. I don't know what exactly any of them do, but hopefully this information is helpful. Let me know if you need any more. 

Also, if it's important, all my sata devices are on an md-raid.
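
The per-target workaround in this comment could be sketched as a small shell function. This is only a sketch under assumptions: the function name disable_target_async and its optional root argument are hypothetical, and the power/async files are simply skipped when the running kernel does not expose them.

```shell
#!/bin/sh
# Hypothetical helper: write "disabled" to power/async for every SCSI target
# directory found beneath the given sysfs root (default: /sys/devices).
disable_target_async() {
    root="${1:-/sys/devices}"
    # SCSI target directories are named targetH:C:I, as in the paths above.
    find "$root" -type d -name 'target*' 2>/dev/null | while IFS= read -r dir; do
        f="$dir/power/async"
        # Skip targets for which the async attribute is absent or read-only.
        if [ -w "$f" ]; then
            echo disabled > "$f" && echo "disabled async suspend: $dir"
        fi
    done
}
```

Run as root with no argument, this touches every SCSI target on the system rather than only those behind the sata_nv ports, which is broader than the glob used above.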
Comment 11 bladud 2013-01-14 18:01:56 UTC
Created attachment 91331 [details]
Successful dmesg with async suspend disabled, kernel 3.7.2
Comment 12 bladud 2013-01-14 18:03:57 UTC
Created attachment 91341 [details]
Failed suspend dmesg from 3.7.2, with async suspend enabled
Comment 13 Joe Sapp 2013-01-15 03:10:40 UTC
(In reply to comment #9)
> PCI ID, 10de:0266
> NVIDIA MCP51 SATA controller, using sata_nv driver.
> 
> Very similar problem as Bug 51281.
> And also see here:
> http://marc.info/?l=linux-ide&m=133534061316338&w=2

It looks like it might be related.

> It all looks like to me the Nvidia MCP sata controller has problem dealing
> with
> STANDBY command, but don't know what exactly is the problem...
> 
> BTW, can you please check when you shutdown the system, does such error
> occur?
> Search for kernel messages for "START_STOP FAILED". You can check the kernel
> messages after you boot the system again in /var/log/messages. Please note
> that
> you have to do a shutdown to see if such error occurred, as reboot will not
> stop the disk.

I do see it in my kernel messages, but not after a reboot -- maybe my syslog program stops too early?  I'm trying to see if this is an issue.
I'm not sure what causes this to happen, but it does come soon after a "failed command: STANDBY IMMEDIATE".
Comment 14 Aaron Lu 2013-01-15 05:29:44 UTC
(In reply to comment #10)
> I get the same problem, since kernel 3.3, as in the linked message. 
> I found that reverting
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=966f1212e1ac5fe3ddf04479d21488ddb36a2608
> on top of linux 3.5.5 fixes the problem.

Good finding, thanks a lot!

> Figuring that disabling async suspend would help, I discovered that all
> kernels can suspend successfully by doing:
> echo disabled >
> /sys/devices/pci0000:00/0000:00:05.[01]/ata?/host?/target*/power/async
> On this system, "/sys/devices/pci0000:00/0000:00:05.[01]" means "all occupied
> sata ports" - there is also a pata drive, which didn't seem to have a
> problem. 

So this means that if we disable async suspend for the scsi target, the suspend callback for the scsi device, which involves stopping the disk, will succeed.

> There are also a bunch of other power/async files, such as
>
> /sys/devices/pci0000:00/0000:00:05.0/ata1/host0/target0:0:0/0:0:0:0/power/async
> which didn't seem to make a difference.

By didn't make a difference, do you mean suspend still failed if you just disable async suspend for the scsi device?

> I don't know what exactly any of them do, but hopefully this information is
> helpful. Let me know if you need any more.

Absolutely helpful, thanks.
And the async file controls whether asynchronous suspend is allowed for the device; asynchronous suspend is meant to decrease the whole-system suspend time.
 
> Also, if it's important, all my sata devices are on an md-raid.
Comment 15 Aaron Lu 2013-01-15 05:46:29 UTC
(In reply to comment #13)
> I do see it in my kernel messages, but not after a reboot -- maybe my syslog
> program stops stopping too early?  I'm trying to see if this is an issue.
> I'm not sure what causes this to happen, but it does come soon after a
> "failed
> command: STANDBY IMMEDIATE".

Oh right, we can't expect the filesystem (where the log file resides) to still be there when we are stopping its backing device (the disk :-).
But never mind, since we know standby also failed in the shutdown case.

BTW, you can try to disable async suspend for the scsi target or ata port to see if you will be able to suspend the system as bladud@gmail.com has found:
To disable async suspend for the ata port:
echo disabled > /sys/devices/pci0000:00/0000:00:0e.0/ata1/power/async
To disable async suspend for the scsi target:
echo disabled > /sys/devices/pci0000:00/0000:00:0e.0/ata1/host0/target0:0:0/power/async
Comment 16 Joe Sapp 2013-01-15 14:46:43 UTC
(In reply to comment #15)
> BTW, you can try to disable async suspend for the scsi target or ata port to
> see if you will be able to suspend the system as bladud@gmail.com has found:
> To disable asyc suspend for ata port:
> echo disabled > /sys/devices/pci0000:00/0000:00:0e.0/ata1/power/async
> To disable asyn suspend for scsi target:
> echo disabled >
> /sys/devices/pci0000:00/0000:00:0e.0/ata1/host0/target0:0:0/power/async

Disabling async suspend for the scsi target (i.e., the second command above) allows me to suspend the system.  There appear to be no errors in the system log relating to the hard disk any more.
Comment 17 bladud 2013-01-15 15:22:57 UTC
> > There are also a bunch of other power/async files, such as
> >
> /sys/devices/pci0000:00/0000:00:05.0/ata1/host0/target0:0:0/0:0:0:0/power/async
> > which didn't seem to make a difference.
> 
> By didn't make a difference, do you mean suspend still failed if you just
> disable async suspend for the scsi device?

That's right - suspend still failed after 
echo disabled > /sys/devices/pci0000:00/0000:00:05.[01]/ata?/host?/target*/0:0:0:0/power/async
or any of the other matches for /sys/devices/pci0000:00/0000:00:05.[01]/**/power/async, including the ones with */block/*/power/async, and */scsi_device/*/power/async

> Absolutely helpful, thanks.
> And the asyc file controls if asynchronous suspend for the device is allowed,
> to decrease the whole system suspend time.

You're welcome. I still don't really understand the differences between 
/sys/devices/pci0000:00/0000:00:05.[01]/ata?/host?/target*/power/async
/sys/devices/pci0000:00/0000:00:05.[01]/ata?/host?/target*/*/power/async
or even
/sys/devices/pci0000:00/0000:00:05.[01]/ata?/host?/target*/*/block/power/async
but that's not important, I was just curious...
Comment 18 Aaron Lu 2013-01-29 09:03:29 UTC
(In reply to comment #17)
> You're welcome. I still don't really understand the differences between 
> /sys/devices/pci0000:00/0000:00:05.[01]/ata?/host?/target*/power/async
> /sys/devices/pci0000:00/0000:00:05.[01]/ata?/host?/target*/*/power/async
> or even
>
> /sys/devices/pci0000:00/0000:00:05.[01]/ata?/host?/target*/*/block/power/async
> but that's not important, I was just curious...

Sorry for replying late.

The async file controls whether asynchronous suspend is allowed for this device during system suspend (i.e. S3 or S4). Asynchronous suspend is intended to reduce the whole-system suspend time, but it looks like it causes problems for some NVIDIA MCP SATA controllers.

There are scsi_device, scsi_target, scsi_host and block devices, and each device has its own async file to control the behaviour.

So this basically means that if we disable asynchronous suspend for the scsi target device (which is the parent device of the scsi_device, aka the disk device), suspend will succeed. I have no idea why...
Comment 19 bladud 2013-02-03 18:33:02 UTC
Thanks for the explanation. Is it possible to write a udev rule or something to do this automatically?
Comment 20 Aaron Lu 2013-02-05 07:42:37 UTC
Yes, the udev rule below will disable async suspend for all scsi targets; save it in a file and place it under /etc/udev/rules.d/.

ACTION=="add", SUBSYSTEM=="scsi", ENV{DEVTYPE}=="scsi_target", ATTR{power/async}="disabled"
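
Installing the rule could look like the sketch below. The function name install_async_rule and the file name 99-scsi-target-async.rules are hypothetical choices made for illustration; the rule text is the one given in this comment, written on a single line as udev expects.

```shell
#!/bin/sh
# Hypothetical helper: write the rule from this comment into a rules directory.
# The file name 99-scsi-target-async.rules is an arbitrary choice.
install_async_rule() {
    rules_dir="${1:-/etc/udev/rules.d}"
    # Quoted heredoc delimiter so the shell does not expand anything in the rule.
    cat > "$rules_dir/99-scsi-target-async.rules" <<'EOF'
ACTION=="add", SUBSYSTEM=="scsi", ENV{DEVTYPE}=="scsi_target", ATTR{power/async}="disabled"
EOF
}
```

After running install_async_rule as root, a reboot is probably the simplest way to have the rule applied to targets that were already added at boot; "udevadm control --reload-rules" makes udev re-read its rule files without one.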
Comment 21 bladud 2013-03-01 04:08:13 UTC
Thanks - I confirm that worked.
Comment 22 Aaron Lu 2013-03-01 04:50:09 UTC
(In reply to comment #16)
> (In reply to comment #15)
> > BTW, you can try to disable async suspend for the scsi target or ata port
> to
> > see if you will be able to suspend the system as bladud@gmail.com has
> found:
> > To disable asyc suspend for ata port:
> > echo disabled > /sys/devices/pci0000:00/0000:00:0e.0/ata1/power/async
> > To disable asyn suspend for scsi target:
> > echo disabled >
> > /sys/devices/pci0000:00/0000:00:0e.0/ata1/host0/target0:0:0/power/async
> 
> Disabling async suspend for the scsi target (i.e., the second command above)
> allows me to suspend the system.  There appear to be no errors in the system
> log relating to the hard disk any more.

Does disabling async suspend for the ata port (i.e., the first command) also work for you or not?
Comment 23 Aaron Lu 2013-03-01 04:52:51 UTC
(In reply to comment #21)
> Thanks - I confirm that worked.

Thanks for confirming.

Another thing I wonder: if you do not revert any commit and just disable async suspend for the ata port, does the problem go away on 3.3+ kernels?
Comment 24 bladud 2013-03-01 04:57:53 UTC
Yes - sorry, I thought I said that. 

I am running an unmodified 3.7.9 kernel with async suspend disabled via your udev rule, and it works fine.

But disabling async suspend for the ata port made no difference last time I tried it.
Comment 25 Aaron Lu 2013-03-01 05:19:07 UTC
(In reply to comment #24)
> Yes - sorry, I thought I said that. 
> 
> I am running an unmodified 3.7.9 kernel with async suspend disabled via your
> udev rule, and it works fine.
> 
> But disabling async suspend for the ata port made no difference last time I
> tried it.

This is weird...
I thought disabling async suspend for the ata port would work for you, since in comment #10 you found the offending commit, which just enables async suspend for the ata port. So if reverting that commit works for you, disabling async suspend for the ata port through the sysfs async file should also work. Or do I misunderstand something?
Comment 26 bladud 2013-03-01 17:13:01 UTC
Nope, I double checked, and disabling async suspend for the ata port definitely doesn't help. It's the scsi target that makes the difference - perhaps the patch has a side effect of disabling async suspend for the scsi target as well?
Comment 27 Aaron Lu 2013-03-08 02:21:33 UTC
(In reply to comment #26)
> Nope, I double checked, and disabling async suspend for the ata port
> definitely
> doesn't help. It's the scsi target that makes the difference - perhaps the
> patch has a side effect of disabling async suspend for the scsi target as
> well?

Sorry for the late reply.
Well, that commit really just enabled async suspend for the ata port device; it's a one-line commit. Anyway, glad to know the facts, thanks.

I think I need to involve more people on this, perhaps by writing an email, since some guys don't have an account here.
Comment 28 bladud 2013-03-08 02:34:01 UTC
Right, but maybe it was enabled for the ata port and child devices, including all those attached to the port, such as the scsi target.
Comment 29 Aaron Lu 2013-03-08 02:37:15 UTC
(In reply to comment #28)
> Right, but maybe it was enabled for the ata port and child devices, including
> all those attached to the port, such as the scsi target.

Oh no, it didn't do that, it just enabled async suspend for itself :-)

Hang on for a moment, I want to cook up a debug patch for you to test. I want to see the device suspend sequence for the two different cases. Which kernel tree is convenient for you? Thanks.
Comment 30 bladud 2013-03-08 05:14:39 UTC
Hokay, you're the dude who knows what he's doing, so I shall believe you :)

I seem to be running 3.7.10 at the moment, so you could send me a patch based on that?

My kernel source is:
http://www.kernel.org/pub/linux/kernel/v3.x/linux-3.7.tar.xz
http://www.kernel.org/pub/linux/kernel/v3.x/patch-3.7.10.xz

thanks
Comment 31 Aaron Lu 2013-03-08 05:32:05 UTC
Created attachment 94861 [details]
Add debug message when devices are suspended

Please apply this patch, then disable async suspend for the scsi_target device and suspend; everything should work fine. After resume, enable async suspend for the scsi target device and suspend again; it should fail. Attach the full dmesg up to this point. Thanks.
Comment 32 bladud 2013-03-10 02:34:19 UTC
Created attachment 95091 [details]
Successful dmesg with debug patch applied

This is with async suspend of the scsi target disabled
Comment 33 bladud 2013-03-10 02:35:04 UTC
Created attachment 95101 [details]
Failed dmesg with debug patch applied

async scsi target suspend enabled.
Comment 34 bladud 2013-03-13 04:48:34 UTC
Created attachment 95281 [details]
Full successful dmesg with debug patch
Comment 35 bladud 2013-03-13 05:15:50 UTC
Created attachment 95291 [details]
Full failed dmesg with debug patch attached

Note this is the same boot as the successful dmesg, so the first part is identical, and it includes a successful suspend-resume before the failed suspend-resume
Comment 36 Joe Sapp 2013-03-14 02:15:29 UTC
Created attachment 95361 [details]
Results from test in comment #31

I performed the tests in comment #31 back-to-back.  First suspend (which was successful) at line 1425.  Second suspend (which failed) at line 2174.

(Apparently my kernel ring buffer wasn't big enough to hold all the messages, so I copied /var/log/messages and stripped out all the non-kernel messages.  I can post the full log if necessary.)
Comment 37 Aaron Lu 2013-03-14 02:36:46 UTC
(In reply to comment #36)
> Created an attachment (id=95361) [details]
> Results from test in comment #31
> 
> I performed the tests in comment #31 back-to-back.  First suspend (which was
> successful) at line 1425.  Second suspend (which failed) at line 2174.
> 
> (Apparently my kernel ring buffer wasn't big enough to hold all the messages,
> so I copied /var/log/messages and stripped out all the non-kernel messages. 
> I
> can post the full log if necessary.)

Thanks Joe. I think there are some messages missing, as I do not see ata2 getting suspended anywhere. This might have something to do with the syslogger, or you mistakenly removed more lines. But it really doesn't matter much now, since we have found the root cause. The nv_swncq_port_suspend looks very suspicious; let's just see if the NVIDIA guys will fix this. If not, I can try to add some quirk to work around this problem, but it's ugly code that I hope we can avoid.

You can also try not using swncq mode by adding the following to the kernel command line to see if it makes suspend OK (you do not need to disable async suspend for the scsi target with this command line, but it may affect performance; I'm not sure about this):
sata_nv.adma_enabled=1 sata_nv.swncq=0
Comment 38 Aaron Lu 2013-03-14 02:38:14 UTC
(In reply to comment #37)
> sata_nv.adma_enabled=1 sata_nv.swncq=0

Should be:
sata_nv.adma=1 sata_nv.swncq=0
Comment 39 Aaron Lu 2013-03-28 08:25:22 UTC
Looks like the NVIDIA guys are not responsive, so I'm afraid there is nothing I can do here. Please use the workaround udev rule in Comment #20 to avoid the problem.
Comment 40 gojul 2013-05-20 13:21:09 UTC
Hi,

I'm currently using Ubuntu Raring 13.04 with the kernel they provide, 3.8.xxx (labelled as 3.8.0-21-generic), but I also tried vanilla kernel 3.9 and the problem was the same as the one mentioned in this bug.

However, the udev rule provided does not work for me. Indeed, the async subnodes are not even created within the tree /sys/devices/pci0000:00/0000:00:0e.0/ata1/power/

Here's the output of my udevinfo for my HDD /dev/sda :
Udevadm info starts with the device specified by the devpath and then
walks up the chain of parent devices. It prints for every device
found, all possible attributes in the udev rules key format.
A rule to match, can be composed by the attributes of the device
and the attributes from one single parent device.

  looking at device '/devices/pci0000:00/0000:00:05.0/ata3/host2/target2:0:0/2:0:0:0/block/sda':
    KERNEL=="sda"
    SUBSYSTEM=="block"
    DRIVER==""
    ATTR{ro}=="0"
    ATTR{size}=="1953525168"
    ATTR{stat}=="  145696    42290 10355566   855148     5001    17612   276488   452908        0   203148  1307880"
    ATTR{range}=="16"
    ATTR{discard_alignment}=="0"
    ATTR{events}==""
    ATTR{ext_range}=="256"
    ATTR{events_poll_msecs}=="-1"
    ATTR{alignment_offset}=="0"
    ATTR{inflight}=="       0        0"
    ATTR{removable}=="0"
    ATTR{capability}=="50"
    ATTR{events_async}==""

  looking at parent device '/devices/pci0000:00/0000:00:05.0/ata3/host2/target2:0:0/2:0:0:0':
    KERNELS=="2:0:0:0"
    SUBSYSTEMS=="scsi"
    DRIVERS=="sd"
    ATTRS{rev}=="05.0"
    ATTRS{type}=="0"
    ATTRS{scsi_level}=="6"
    ATTRS{model}=="WDC WD1002FAEX-0"
    ATTRS{state}=="running"
    ATTRS{queue_type}=="simple"
    ATTRS{iodone_cnt}=="0x25184"
    ATTRS{iorequest_cnt}=="0x25dfe"
    ATTRS{queue_ramp_up_period}=="120000"
    ATTRS{timeout}=="30"
    ATTRS{evt_media_change}=="0"
    ATTRS{ioerr_cnt}=="0x12"
    ATTRS{queue_depth}=="31"
    ATTRS{vendor}=="ATA     "
    ATTRS{device_blocked}=="0"
    ATTRS{iocounterbits}=="32"

  looking at parent device '/devices/pci0000:00/0000:00:05.0/ata3/host2/target2:0:0':
    KERNELS=="target2:0:0"
    SUBSYSTEMS=="scsi"
    DRIVERS==""

  looking at parent device '/devices/pci0000:00/0000:00:05.0/ata3/host2':
    KERNELS=="host2"
    SUBSYSTEMS=="scsi"
    DRIVERS==""

  looking at parent device '/devices/pci0000:00/0000:00:05.0/ata3':
    KERNELS=="ata3"
    SUBSYSTEMS==""
    DRIVERS==""

  looking at parent device '/devices/pci0000:00/0000:00:05.0':
    KERNELS=="0000:00:05.0"
    SUBSYSTEMS=="pci"
    DRIVERS=="sata_nv"
    ATTRS{irq}=="21"
    ATTRS{subsystem_vendor}=="0x1043"
    ATTRS{broken_parity_status}=="0"
    ATTRS{class}=="0x010185"
    ATTRS{consistent_dma_mask_bits}=="32"
    ATTRS{dma_mask_bits}=="32"
    ATTRS{local_cpus}=="00000000,00000000,00000000,00000000,00000000,00000000,00000000,0000000f"
    ATTRS{device}=="0x037f"
    ATTRS{enable}=="1"
    ATTRS{msi_bus}==""
    ATTRS{local_cpulist}=="0-3"
    ATTRS{vendor}=="0x10de"
    ATTRS{subsystem_device}=="0x8239"
    ATTRS{numa_node}=="0"
    ATTRS{d3cold_allowed}=="1"

  looking at parent device '/devices/pci0000:00':
    KERNELS=="pci0000:00"
    SUBSYSTEMS==""
    DRIVERS==""

As you can see, there's nothing like the attribute "power/async". Am I missing something, or has the feature just disappeared from recent kernels?
Comment 41 Aaron Lu 2013-05-31 09:06:58 UTC
Show me the output of the following command:
$ ls /sys/devices/pci0000:00/0000:00:05.0/ata3/host2/target2:0:0
Comment 42 gojul 2013-05-31 17:29:59 UTC
Hi,

Here's the output :
julien@pcathlon64:~$ ls /sys/devices/pci0000:00/0000:00:05.0/ata3/host2/target2:0:0
2:0:0:0  power  subsystem  uevent

And under folder power :
julien@pcathlon64:~$ ls /sys/devices/pci0000:00/0000:00:05.0/ata3/host2/target2:0:0/power
autosuspend_delay_ms  control  runtime_active_time  runtime_status  runtime_suspended_time

(Tried with Ubuntu custom Kernel 3.8.xx and with upstream kernel 3.10-rcXX)
Comment 43 azdt 2013-06-25 22:42:49 UTC
hi,

here's another workaround...

I just spent the whole evening trying to solve the same hard drive suspend issue,
and luckily found this bug report. My kernel is 3.9.7-030907-generic,
and as noted before, the udev rule doesn't work since the async nodes do not seem to be created anymore.

Instead I used the /sys/power/pm_async switch, to disable asynchronous suspend and resume for all devices.
echo 0 > /sys/power/pm_async
(added to /etc/rc.local to be permanent) 

Suspend & resume is now ok.

A.
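
A slightly more defensive form of this workaround, as a sketch only: the function name disable_global_async is hypothetical, and the optional path argument exists purely so the function can be tried against a scratch file. It writes 0 and then reads the value back to confirm that the kernel accepted it.

```shell
#!/bin/sh
# Hypothetical helper: turn off asynchronous suspend/resume globally
# and verify that the setting took effect.
disable_global_async() {
    f="${1:-/sys/power/pm_async}"
    # Bail out if the file is missing or we lack permission (needs root).
    [ -w "$f" ] || { echo "cannot write $f" >&2; return 1; }
    echo 0 > "$f"
    # The function's exit status is that of this final read-back check.
    [ "$(cat "$f")" = "0" ]
}
```

The setting does not survive a reboot, which is why the comment above adds the command to /etc/rc.local.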
Comment 44 gojul 2013-06-26 16:45:14 UTC
Hi,

I confirm that with the proposed WA everything works.
Comment 45 Aaron Lu 2013-06-27 01:19:24 UTC
(In reply to comment #42)
> Hi,
> 
> Here's the output :
> julien@pcathlon64:~$ ls
> /sys/devices/pci0000:00/0000:00:05.0/ata3/host2/target2:0:0
> 2:0:0:0  power  subsystem  uevent
> 
> And under folder power :
> julien@pcathlon64:~$ ls
> /sys/devices/pci0000:00/0000:00:05.0/ata3/host2/target2:0:0/power
> autosuspend_delay_ms  control  runtime_active_time  runtime_status 
> runtime_suspended_time
> 
> (Tried with Ubuntu custom Kernel 3.8.xx and with upstream kernel 3.10-rcXX)

The async file under the power directory requires CONFIG_PM_ADVANCED_DEBUG to be set; it is a per-device control of asynchronous suspend/resume.

The /sys/power/pm_async file is a global control of asynchronous suspend/resume for all devices.
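
Whether the per-device async files can exist on a given kernel could be checked roughly as follows. This is a sketch under assumptions: the function name has_pm_advanced_debug is hypothetical, and the default path /boot/config-$(uname -r) is a convention followed by many distributions, not something every system provides.

```shell
#!/bin/sh
# Hypothetical helper: report whether a kernel config file enables
# CONFIG_PM_ADVANCED_DEBUG.
# Exit status: 0 = enabled, 1 = not enabled, 2 = config file not readable.
has_pm_advanced_debug() {
    cfg="${1:-/boot/config-$(uname -r)}"
    [ -r "$cfg" ] || { echo "config file $cfg not readable" >&2; return 2; }
    grep -q '^CONFIG_PM_ADVANCED_DEBUG=y' "$cfg"
}
```

If the option is not enabled, the global /sys/power/pm_async switch is the remaining control.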
Comment 46 gojul 2013-06-27 08:54:01 UTC
Hi Aaron,

Thanks for your help, but it turns out that many distributions do not set the CONFIG_PM_ADVANCED_DEBUG flag, and for maintenance reasons I don't want to recompile my kernel myself just for that.

The workaround proposed by azdt works very well, that's enough. I've published it to Launchpad bugtracker for the bug I've reported, as a possible workaround.

Now I know this is dirty, but in the kernel source we should implement some conditional check before enabling async support. In an object-oriented language, this would be something like:

if (driver.isAsyncSupported()) {
   driver_async_suspend();
}

The default implementation of isAsyncSupported would return true, and for MCP55 it would return false.
Comment 47 Aaron Lu 2013-06-27 09:04:04 UTC
We can do that if the MCP55 controller is indeed buggy, but the real problem is that the driver's suspend routine is buggy, not the hardware. And its developer doesn't seem to care about this (I raised this question with the original developer but didn't get any response). That's the reason I would not like to add a quirk for MCP55; instead, the buggy code should be fixed.
Comment 48 David Tombs 2013-12-17 01:45:52 UTC
Another Ubuntu user here affected by this bug. Completely disabling async suspend also worked for me. Thank you, Aaron Lu, for all the assistance!

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1133835
Comment 49 Mark VIz 2014-11-29 14:32:00 UTC
Hi everybody,

It's an old thread but I've found it useful.

I've tried the echo 0 > /sys/power/pm_async workaround; something has changed, but the PC wakes up after about one minute with strange behaviour (the CPU fan gets faster and the HD makes a strange noise before waking up). This is the log output:

[  194.677414] PM: Syncing filesystems ... done.
[  194.687760] PM: Preparing system for mem sleep
[  194.687877] Freezing user space processes ... (elapsed 0.001 seconds) done.
[  194.689792] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[  194.691089] PM: Entering mem sleep
[  194.691115] Suspending console(s) (use no_console_suspend to debug)
[  194.708200] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[  194.708362] sd 0:0:0:0: [sda] Stopping disk
[  195.887900] serial 00:08: disabled
[  195.887907] serial 00:08: System wakeup disabled by ACPI
[  195.888375] parport_pc 00:04: disabled
[  196.253848] PM: suspend of devices complete after 1562.480 msecs
[  196.254043] PM: late suspend of devices complete after 0.210 msecs
[  196.269436] ehci-pci 0000:00:0b.1: System wakeup enabled by ACPI
[  196.285492] ohci-pci 0000:00:0b.0: System wakeup enabled by ACPI
[  196.301632] PM: noirq suspend of devices complete after 47.558 msecs
[  196.301709] ACPI: Preparing to enter system sleep state S3
[  196.735888] PM: Saving platform NVS memory
[  196.736205] Disabling non-boot CPUs ...
[  196.847980] smpboot: CPU 1 is now offline
[  196.848199] ACPI: Low-level resume complete
[  196.848199] PM: Restoring platform NVS memory
[  196.848199] Enabling non-boot CPUs ...
[  196.848199] smpboot: Booting Node 0 Processor 1 APIC 0x1
[  196.860944] CPU1 is up
[  196.861345] ACPI: Waking up from system sleep state S3
[  196.861781] pci 0000:00:00.0: Found disabled HT MSI Mapping
[  196.861784] pci 0000:00:00.0: Enabling HT MSI Mapping
[  196.927295] pci 0000:00:00.0: Found enabled HT MSI Mapping
[  196.927328] pci 0000:00:00.0: Found enabled HT MSI Mapping
[  196.940807] ohci-pci 0000:00:0b.0: System wakeup disabled by ACPI
[  196.957557] ehci-pci 0000:00:0b.1: System wakeup disabled by ACPI
[  196.973514] pci 0000:00:09.0: Found enabled HT MSI Mapping
[  196.973567] pci 0000:00:09.0: Found enabled HT MSI Mapping
[  196.989626] PM: noirq resume of devices complete after 127.838 msecs
[  196.989776] PM: early resume of devices complete after 0.120 msecs
[  196.989930] ohci-pci 0000:00:0b.0: setting latency timer to 64
[  197.014304] ehci-pci 0000:00:0b.1: setting latency timer to 64
[  197.014386] sata_nv 0000:00:0e.0: setting latency timer to 64
[  197.014392] pci 0000:00:10.0: setting latency timer to 64
[  197.561927] forcedeth 0000:00:14.0 eth0: no link during initialization
[  197.765975] parport_pc 00:04: activated
[  197.766538] serial 00:08: activated
[  199.144761] forcedeth 0000:00:14.0 eth0: link up
[  203.685452] ata1: link is slow to respond, please be patient (ready=0)
[  203.918976] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[  203.941735] ata1.00: ACPI cmd ef/03:46:00:00:00:a0 (SET FEATURES) filtered out
[  204.120675] ata1.00: configured for UDMA/133
[  204.594881] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[  204.600692] ata2.00: ACPI cmd ef/03:46:00:00:00:a0 (SET FEATURES) filtered out
[  204.616899] ata2.00: configured for UDMA/100
[  204.618179] sd 0:0:0:0: [sda] Starting disk
[  204.748092] usb 1-6: reset high-speed USB device number 3 using ehci-pci
[  205.061276] usb 2-2: reset full-speed USB device number 2 using ohci-pci
[  205.303982] PM: resume of devices complete after 8314.206 msecs
[  205.304290] PM: Finishing wakeup.
[  205.304292] Restarting tasks ... done.

Thanks in advance!!
Marco
Comment 50 Mark VIz 2014-11-30 09:38:44 UTC
My problem was related to some spurious USB wakeup events.
Disabling the USB wakeup sources with

sudo -s
echo USB0 > /proc/acpi/wakeup
echo USB2 > /proc/acpi/wakeup

resolved everything.
Thanks
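
One caveat, with a hedged sketch: writing a device name to /proc/acpi/wakeup toggles its state, so repeating the echo commands above would enable wakeup again. The function below (the name disable_wakeup is hypothetical) only writes when the device is currently listed as enabled; it assumes the status is the third column of the file, as in the usual "Device S-state Status Sysfs node" layout.

```shell
#!/bin/sh
# Hypothetical helper: disable ACPI wakeup for one device, but only if it is
# currently enabled, because a write to /proc/acpi/wakeup toggles the state.
disable_wakeup() {
    dev="$1"
    f="${2:-/proc/acpi/wakeup}"
    # awk exits 0 only if a line for this device has "enabled" in column 3.
    if awk -v d="$dev" '$1 == d && $3 ~ /enabled/ { found = 1 } END { exit !found }' "$f"; then
        echo "$dev" > "$f"
    fi
}
```

For the case in this comment that would be "disable_wakeup USB0" and "disable_wakeup USB2", run as root.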
