Bug 29912

Summary: slow resume from suspend
Product: Power Management
Reporter: Mehmet Giritli (mehmet)
Component: Hibernation/Suspend
Assignee: Jeff Garzik (jgarzik)
Status: CLOSED DUPLICATE
Severity: normal
CC: tj
Priority: P1
Hardware: All
OS: Linux
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:
Attachments: dmesg
lspci -nn
dmesg with delayed resume after disabling marvel chip
lspci output after disabling marvel chip
dmesg with delayed resume after disabling jmicron
dmesg with 2.6.36

Description Mehmet Giritli 2011-02-26 10:25:57 UTC
I am attaching my dmesg, which contains two suspend and resume actions.

After I press the button to resume, the screen stays black for about 10-15 seconds before the computer comes back. The computer runs, but the display appears only after the delay. Suspend behaves the same way, except that the first suspend, and only the first, is problem free. All following suspends also take 10-15 seconds (the display goes dark and I hear the disks stop immediately, but power-down happens only after the delay).

I had these problems with 2.6.36 as well and now I am having them with 2.6.38-rc6. I just managed to report this now...
Comment 1 Mehmet Giritli 2011-02-26 10:27:27 UTC
Created attachment 49202 [details]
dmesg
Comment 2 Tejun Heo 2011-02-26 11:09:04 UTC
Please attach the output of "lspci -nn". The AHCI controller at 06:00.0 repeatedly reports PHY events, but when probed it reports that the port is empty. Is anything connected to that controller?
Comment 3 Mehmet Giritli 2011-02-26 11:15:49 UTC
Created attachment 49212 [details]
lspci -nn
Comment 4 Mehmet Giritli 2011-02-26 11:17:44 UTC
(In reply to comment #2)
> please attach the output of "lspci -nn".  The ahci controller @06:00.0 is
> repeatedly reporting PHY events but when probed reports that the port is
> empty.  Is anything connected to the controller?

Well, I have 3 disks and 2 optical drives in my machine. How do I know if any of them is attached to that particular controller? If you can tell me how to check this, it would be great.
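
One way to check this is to resolve each disk's sysfs entry and read the PCI address of its controller out of the resolved path. The following is a minimal Python sketch, not something taken from this report: the helper names and the sample path in the comments are made up for illustration, and the exact sysfs layout varies between kernel versions.

```python
import os
import re

# A PCI address as it appears in sysfs, e.g. "0000:06:00.0"
# (domain:bus:device.function).
PCI_ADDR = re.compile(r"[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]")


def controller_of(sysfs_path):
    """Return the PCI address of the controller nearest to the disk.

    The path is checked component by component; the last component that
    looks like a PCI address is taken as the controller (earlier ones are
    bridges on the way to it). Hypothetical example path:
    /sys/devices/pci0000:00/0000:00:1c.4/0000:06:00.0/host7/.../block/sdc
    """
    matches = [p for p in sysfs_path.split("/") if PCI_ADDR.fullmatch(p)]
    return matches[-1] if matches else None


def disks_by_controller(sys_block="/sys/block"):
    """Map each sdX/srX device under sys_block to its controller address."""
    result = {}
    if not os.path.isdir(sys_block):
        return result
    for name in sorted(os.listdir(sys_block)):
        if name.startswith(("sd", "sr")):
            # /sys/block/<name> is normally a symlink into /sys/devices/...
            resolved = os.path.realpath(os.path.join(sys_block, name))
            result[name] = controller_of(resolved)
    return result


if __name__ == "__main__":
    for disk, addr in disks_by_controller().items():
        print(disk, addr)
```

The printed address can be compared with the first column of the "lspci -nn" output, which usually shows the same address without the leading "0000:" domain.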
Comment 5 Tejun Heo 2011-02-26 11:22:56 UTC
Well, all your devices are detected, so the port is apparently empty. The offender is the Marvell 88SE9128 6Gbps AHCI controller. Maybe plugging something in there or disabling it in the BIOS can solve the issue, but it's very likely to be a faulty controller. It keeps saying "oh oh! I see a new device. Probe me! Probe me!" and then, when the driver comes around, "never mind, it wasn't anything".
Comment 6 Mehmet Giritli 2011-02-26 11:27:37 UTC
I will try disabling the chip in the BIOS and report back here on whether the delays are gone.
Comment 7 Mehmet Giritli 2011-02-26 18:05:30 UTC
Created attachment 49262 [details]
dmesg with delayed resume after disabling marvel chip

Nope, it didn't make any difference. I still get some errors, though they look different.
Comment 8 Mehmet Giritli 2011-02-26 18:10:17 UTC
Created attachment 49272 [details]
lspci output after disabling marvel chip
Comment 9 Tejun Heo 2011-02-27 08:32:41 UTC
Weird, now ata8 is acting up, and that is the JMicron AHCI controller, which is known to behave. It's weird to see two different controllers showing the same problem. Maybe I made a mistake matching the port to the controller the first time. Looking again... Oh yeah, I did. It was the JMicron one from the beginning. Sorry about that.

Hmmm... jmb363's are known to behave well with the ahci driver. It's really unlikely to be a driver problem. Either the controller is fried somehow or something wonky is attached there (some mobo manufacturers put hardware backup/RAID thingies on those extra SATA connectors, and they often don't conform very well to the standard).

Please disable the JMicron one and see whether the problem goes away.

Thanks.
Comment 10 Mehmet Giritli 2011-02-27 09:01:20 UTC
(In reply to comment #9)
> Weird, now ata8 is acting up, which is jmicron ahci which is known to behave.
> It's weird to see two different controllers showing the same problem.  Maybe
> I made a mistake matching the port to controller the first time.  Looking
> again... Oh yeah, I did.  It was the jmicron one from the beginning.  Sorry
> about that.
> 
> Hmmm... jmb363's are known to behave well with the ahci driver.  It's really
> unlikely to be a driver problem.  Either the controller is fried somehow or
> something wonky is attached there (some mobo manufacturers put some hardware
> backup/raid thingies to those extra SATA connectors and they often don't
> conform very well to the standard.)
> 
> Please disable to jmicron one and see whether the problem goes away.
> 
> Thanks.

I will try it now, but just to be clear, the jmb362 chip drives the 2 eSATA ports on my board. My mainboard is here: http://www.gigabyte.com/products/product-page.aspx?pid=3258&dl=1#sp
Comment 11 Mehmet Giritli 2011-02-27 17:25:40 UTC
It seems like there are no more errors, but I still have the same delay. So it would seem that the errors and the delay are not related after all.

I would like to point out that I have:

ata6: link is slow to respond, please be patient (ready=0)

in my logs. Do you think it might be related?
Comment 12 Mehmet Giritli 2011-02-27 17:28:26 UTC
Created attachment 49542 [details]
dmesg with delayed resume after disabling jmicron
Comment 13 Tejun Heo 2011-02-28 09:01:29 UTC
I see. Hmmm, so the errors are gone now. The "link is slow to respond" message during resume is normal. The controller is waiting for the drive to spin up. It takes some seconds for hard drives to spin up, especially ones with a lot of platters (the drive is an early 1TB one, right?). Unfortunately, libata currently doesn't support asynchronous resume and waits for drives to spin up during resume, so the delay is visible. There are plans to make it asynchronous to the rest of resume, but it'll take some time. So the delay itself is expected behavior if you have hard drives which spin up slowly.
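
To see which ports are waiting on a spin-up, and when, a saved dmesg can be scanned for that message. A minimal Python sketch follows. It assumes the dmesg lines carry the usual "[seconds]" timestamp prefix, the function name is made up, and the sample lines (timestamps included) are invented for illustration rather than taken from the attachments.

```python
import re

# Matches lines such as:
# "[  105.250000] ata6: link is slow to respond, please be patient (ready=0)"
SLOW_LINK = re.compile(
    r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<port>ata\d+): link is slow to respond"
)


def slow_links(dmesg_text):
    """Return {port: [timestamps]} for every 'link is slow to respond' line."""
    found = {}
    for line in dmesg_text.splitlines():
        m = SLOW_LINK.match(line)
        if m:
            found.setdefault(m.group("port"), []).append(float(m.group("ts")))
    return found


# Invented sample, for illustration only.
sample = """\
[  100.000000] (unrelated line)
[  105.250000] ata6: link is slow to respond, please be patient (ready=0)
[  230.500000] ata6: link is slow to respond, please be patient (ready=0)
"""
print(slow_links(sample))
```

A port that shows up once per resume cycle with a drive behind it fits the spin-up explanation; the message alone does not say how long the whole resume took.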

Thanks.
Comment 14 Mehmet Giritli 2011-03-03 08:55:24 UTC
Hi,

I had some time to experiment with 2.6.36 kernels again and I cannot reproduce this bug at all. I haven't tried a 2.6.37 kernel yet. I am attaching dmesg again, containing 2 suspend/resume cycles.

I still get "link is slow to respond, please be patient (ready=0)" messages but no delay, so it seems that this bug has nothing to do with "storage".

Tejun, I think I will change the product to PM in bugzilla, if you have nothing else to comment on at this point...
Comment 15 Mehmet Giritli 2011-03-03 08:56:43 UTC
Created attachment 49982 [details]
dmesg with 2.6.36
Comment 16 Mehmet Giritli 2011-03-05 10:08:02 UTC
I am sorry, but I couldn't change to the default assignee; bugzilla wouldn't let me... So if someone can help...
Comment 17 Mehmet Giritli 2011-03-06 14:15:55 UTC
I am now pretty sure that this is my problem:

https://bugzilla.kernel.org/show_bug.cgi?id=30032

Couldn't verify yet...
Comment 18 Mehmet Giritli 2011-03-06 16:21:44 UTC
(In reply to comment #17)
> I am now pretty sure that this is my problem:
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=30032
> 
> Couldnt verify yet...

Confirmed now, so closing this bug report.
Comment 19 Mehmet Giritli 2011-03-06 16:22:14 UTC

*** This bug has been marked as a duplicate of bug 30032 ***