Bug 43278

Summary: After upgrade from 3.3.0 to 3.4.0 I can't wake-up computer with keyboard
Product: Power Management Reporter: Dâniel Fraga (fragabr)
Component: Hibernation/SuspendAssignee: Rafael J. Wysocki (rjw)
Status: CLOSED CODE_FIX    
Severity: normal CC: bug-track, jmarcet, rat.o.drat, rjw, rui.zhang, stern
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 3.4.0 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 7216    
Attachments: 3.3.0 /sys/devices output
3.4.0 /sys/devices output
3.3.0 S3 suspend
3.4.0 S3 suspend
ACPI dump output
ACPI / PCI / PM: Check _SxD/_SxW for devices whose power states aren't managed by ACPI
ACPI / PCI / PM: Check _SxD/_SxW for devices whose power states aren't managed by ACPI
map acpi_d2 to pci_d0
acpi-fit-SxD-SxW-check-to-ACPI-plain
ACPI / PM: Make acpi_pm_device_sleep_state() follow the specification
Log choice of ACPI states for suspend
PCI / PM: Make platform choose target low-power states of more devices
dmesg with patch from comment #57 and log patch from Alan: works
dmesg with patch from comment #57 and #60 and also Alan log patch (doesn't work)
sudo lspci -vv -s 1d.0 requested by Alan
acpidump with new (latest) bios (3304)
dmesg with NEW BIOS and patch from comment #57 and #60 and also Alan log patch (doesn't work)
dsdt-parts-P8V68-V
dsdt-parts-ux31e
dmidecode as requested by Rafael

Description Dâniel Fraga 2012-05-22 19:07:10 UTC
I use the Asus P8V68-V PRO/GEN3 motherboard. With kernel 3.3.0 I could press any key and the computer would wake-up from S3. Now with kernel 3.4.0 it doesn't wake up anymore.

I didn't change anything in the BIOS neither in the system.

How can I help you to debug that? Any hints? Thanks.
Comment 1 Rafael J. Wysocki 2012-05-22 19:12:21 UTC
What kind of keyboard is that?  Is it a USB keyboard or PS/2?
Comment 2 Dâniel Fraga 2012-05-22 19:24:27 UTC
Hi Rafael. It's a USB keyboard (Logitech Access Keyboard 600).

If you need more info, just ask. Thanks.
Comment 3 Rafael J. Wysocki 2012-05-22 20:34:18 UTC
Please run

$ find /sys/devices -name wakeup -print -exec cat {} \;

on both the working and failing systems and see if there are any differences.
Comment 4 Alan Stern 2012-05-22 21:30:29 UTC
Also, it would help if you could build a kernel with CONFIG_USB_DEBUG enabled and attach the dmesg log showing what happens during system suspend and resume.
Comment 5 Dâniel Fraga 2012-05-23 20:35:00 UTC
(In reply to comment #3)
> Please run
> 
> $ find /sys/devices -name wakeup -print -exec cat {} \;
> 
> on both the working and failing systems and see if there are any differences.

Rafael, both systems show the same output (the only difference is the order -- I will attach both files sys-3.3.0.txt and sys-3.4.0.txt).

I activated CONFIG_USB_DEBUG and the dmesg is also attached:
suspend-3.3.0.txt and suspend-3.4.0.txt

Thanks.
Comment 6 Dâniel Fraga 2012-05-23 20:36:02 UTC
Created attachment 73365 [details]
3.3.0 /sys/devices output

kernel 3.3.0
find /sys/devices -name wakeup -print -exec cat {} \;
Comment 7 Dâniel Fraga 2012-05-23 20:37:34 UTC
Created attachment 73366 [details]
3.4.0 /sys/devices output

kernel 3.4.0
find /sys/devices -name wakeup -print -exec cat {} \;
Comment 8 Dâniel Fraga 2012-05-23 20:38:02 UTC
Created attachment 73367 [details]
3.3.0 S3 suspend
Comment 9 Dâniel Fraga 2012-05-23 20:38:26 UTC
Created attachment 73368 [details]
3.4.0 S3 suspend
Comment 10 Alan Stern 2012-05-24 15:42:00 UTC
Here's the important difference.  From the 3.3 log:

ehci_hcd 0000:00:1d.0: wakeup: 1
ehci_hcd 0000:00:1d.0: wake-up capability enabled by ACPI
ehci_hcd 0000:00:1d.0: --> PCI D3hot

From the 3.4 log:

ehci_hcd 0000:00:1d.0: Staying in PCI D0

This message was added by commit 151b61284776, which was the original workaround for the ASUS computer problem.  If that commit is reverted, does the problem go away?  It's scheduled to be replaced by commit 46730c8667e72 from the linus-next tree.
Comment 11 Dâniel Fraga 2012-05-24 20:09:12 UTC
(In reply to comment #10)

> This message was added by commit 151b61284776, which was the original
> workaround for the ASUS computer problem.  If that commit is reverted, does
> the
> problem go away?  It's scheduled to be replaced by commit 46730c8667e72 from
> the linus-next tree.

Hi Stern. Since I'm using the stock 3.4.0 kernel, would you mind to generate this patch for me so I can apply it here? It seems it's not possible to generate this commit patch through git alone without having the repository here.

Thank you.
Comment 12 Alan Stern 2012-05-24 20:43:42 UTC
You could get the patch for the new commit through the gitweb interface at kernel.org.  Or you can get it directly from here:

   http://marc.info/?l=linux-pm&m=133582191808988&w=2
Comment 13 Dâniel Fraga 2012-05-24 21:11:15 UTC
Alan, (In reply to comment #12)
> You could get the patch for the new commit through the gitweb interface at
> kernel.org.  Or you can get it directly from here:
> 
>    http://marc.info/?l=linux-pm&m=133582191808988&w=2

Alan, I applied the patches from the link provided, but it didn't help. 

Any other suggestion? Thanks.
Comment 14 Alan Stern 2012-05-24 21:21:35 UTC
Did you remove commit 151b61284776 first?  That's the one which is causing your problem.
Comment 15 Dâniel Fraga 2012-05-24 21:49:30 UTC
(In reply to comment #14)
> Did you remove commit 151b61284776 first?  That's the one which is causing
> your
> problem.

Hi Stern, sorry, no (I thought it was just enough to apply those patches), but that's the problem. I couldn't reach the commit 151b61284776. Could you help me?

I tried to search for 151b61284776 through the:

http://git.kernel.org/

so I could download de patch to revert it, but I couldn't find it. Could you point me the exact url, if it would not be much effort? If you point me how can I get this commit (patch) I can apply patch -R and revert it. Thanks!
Comment 16 Alan Stern 2012-05-25 14:42:34 UTC
Here's the URL:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=151b61284776

For downloading you'll want to click on the "patch" link.
Comment 17 Dâniel Fraga 2012-05-25 19:01:52 UTC
(In reply to comment #16)
> Here's the URL:
> 
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=151b61284776
> 
> For downloading you'll want to click on the "patch" link.

Bingo! Removing commit 151b61284776 fixes the problem.

BUT with the patches provided on (with commit 151b61284776 removed at the same time):

http://marc.info/?l=linux-pm&m=133582191808988&w=2

it doesn't work anymore.

So commit 151b61284776 causes the problem, but the new patches on the link above (marc.info) also cause trouble.

If you need more testing, just provide other patches and I'll test as long as you want. Thanks.
Comment 18 Alan Stern 2012-05-25 19:10:25 UTC
Rafael, the ball is in your court now.  Evidently the "ACPI / PCI: Make _SxD/_SxW check follow ACPI 4.0a spec" commit (46730c8667e72) breaks wakeup on this machine by preventing the USB host controller from going into D3.
Comment 19 Rafael J. Wysocki 2012-05-25 19:29:46 UTC
Well.

It looks we just have to drop that commit, then?
Comment 20 Rafael J. Wysocki 2012-05-25 19:37:07 UTC
Dâniel, please attach the output of acpidump from the affected system (as plain text).
Comment 21 Rafael J. Wysocki 2012-05-25 20:04:08 UTC
OK, so I didn't notice that before, but the first hunk of the patch at http://marc.info/?l=linux-pm&m=133582191808988&w=2 is pointless, because it duplicates code that's already there in acpi_pm_device_sleep_state().
Comment 22 Rafael J. Wysocki 2012-05-25 20:08:31 UTC
Moreover, it is wrong, because it fails to check adev->wakeup.sleep_state, which it should do.
Comment 23 Dâniel Fraga 2012-05-25 20:18:33 UTC
Created attachment 73395 [details]
ACPI dump output
Comment 24 Rafael J. Wysocki 2012-05-25 20:18:55 UTC
Created attachment 73396 [details]
ACPI / PCI / PM: Check _SxD/_SxW for devices whose power states aren't managed by ACPI

Can you please check if this patch also breaks things for you?
Comment 25 Dâniel Fraga 2012-05-25 20:20:29 UTC
(In reply to comment #20)
> Dâniel, please attach the output of acpidump from the affected system (as
> plain
> text).

Done ;)
Comment 26 Rafael J. Wysocki 2012-05-25 20:38:17 UTC
Yes, thanks.

So, apparently, there is an _S3D for your USB controller in the DSDT, but _S3W is missing.  At the same time, _PRW tells use that S4 is the lowest sleep state it can wake up from.

According to ACPI 4 and later we shouldn't put that device into low-power states deeper that D2 for wakeup, but these are _ACPI_ low-power states, which may or may not be the same as PCI low-power states.

Anyway, I suppose the patch from comment #24 will break things for you, won't it?
Comment 27 Dâniel Fraga 2012-05-25 20:43:20 UTC
(In reply to comment #26)
> Yes, thanks.
> 
> So, apparently, there is an _S3D for your USB controller in the DSDT, but
> _S3W
> is missing.  At the same time, _PRW tells use that S4 is the lowest sleep
> state
> it can wake up from.
> 
> According to ACPI 4 and later we shouldn't put that device into low-power
> states deeper that D2 for wakeup, but these are _ACPI_ low-power states,
> which
> may or may not be the same as PCI low-power states.
> 
> Anyway, I suppose the patch from comment #24 will break things for you, won't
> it?

You're right. I tested in 2 different ways:

1) removing commit 151b61284776 *and* applying the patch from comment #24 (it works *because* I removed commit 151b61284776)

2) applying your patch directly without removing commit 151b61284776, it doesn't work, as you predicted

So, what else we can test? Thanks.
Comment 28 Rafael J. Wysocki 2012-05-25 21:02:26 UTC
Well, the situation is this:

If we are supposed to do the

-	if (platform_pci_power_manageable(dev)) {
+	/*
+	 * According to ACPI 4.0a,7.2 Device Power Management Objects, device
+	 * with wake capability should have _PRW or _PSW object and can have
+	 * _SxD or _SxW object.
+	 * It looks like some OEMs use this fields to avoid buggy Dx states
+	 * of devices, so we need to check for _PRW or _PSW and see if _SxD or
+	 * _SxW indicate to overwrite Dx.
+	 */
+	if (platform_pci_power_manageable(dev)
+	    || (device_may_wakeup(dev) && platform_pci_can_wakeup(dev))) {

change in pci_target_state(), then your DSDT is buggy, because the _S3D method for your USB controller should return 3.

However, if we are supposed to ignore _SxD for devices whose power states cannot be set through ACPI, which we do now, then we'll just ignore the bug in your DSDT, although we're going to step on hardware/firmware bugs on other Asus machines.

In the face of this I'm not going to push the patch from comment #24 upstream and the only thing we can possibly do is to refine the check introduced by http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commit;h=151b61284776 .

Alan, the floor is yours. :-)
Comment 29 Alan Stern 2012-05-25 21:11:07 UTC
I'm a little confused by your description, because I'm not familiar with the detailed meanings all the various ACPI functions and values.  It's also not clear how we can compare ACPI power levels to PCI power levels.

However...  Do you have any reason to expect that the patch in comment #24 will fail to fix the original problem?  According to what Dâniel wrote, it does seem to fix _his_ problem.
Comment 30 Rafael J. Wysocki 2012-05-25 21:20:24 UTC
Dâniel, does:

> 1) removing commit 151b61284776 *and* applying the patch from comment #24 (it
> works *because* I removed commit 151b61284776)

mean that wakeup worked for you in this configuration?
Comment 31 Dâniel Fraga 2012-05-25 21:41:23 UTC
(In reply to comment #29)
> I'm a little confused by your description, because I'm not familiar with the
> detailed meanings all the various ACPI functions and values.  It's also not
> clear how we can compare ACPI power levels to PCI power levels.
> 
> However...  Do you have any reason to expect that the patch in comment #24
> will
> fail to fix the original problem?  According to what Dâniel wrote, it does
> seem
> to fix _his_ problem.

No Alan. I tested the patch with and witout the commit 151b61284776 and it would only work when I remove commit 151b61284776. So the problem is in the commit 151b61284776...

The patch provided by Rafael doesn't affect anything, neither fix the problem.

The only way I can fix the problem is removing commit 151b61284776 (of course you both know far more than me to determine what exactly in the commit 151b61284776 causes the problem, since on kernel 3.3.0 everything worked fine).
Comment 32 Dâniel Fraga 2012-05-25 21:42:52 UTC
(In reply to comment #30)
> Dâniel, does:
> 
> > 1) removing commit 151b61284776 *and* applying the patch from comment #24
> (it
> > works *because* I removed commit 151b61284776)
> 
> mean that wakeup worked for you in this configuration?

Yes! But notice that in this case I removed commit 151b61284776 from kernel 3.4.0.

If I don't remove this commit, it will not work. So the only way to make it work here is removing commit 151b61284776.
Comment 33 Rafael J. Wysocki 2012-05-25 21:54:16 UTC
Created attachment 73398 [details]
ACPI / PCI / PM: Check _SxD/_SxW for devices whose power states aren't managed by ACPI

(In reply to comment #32)
> (In reply to comment #30)
> > Dâniel, does:
> > 
> > > 1) removing commit 151b61284776 *and* applying the patch from comment #24
> (it
> > > works *because* I removed commit 151b61284776)
> > 
> > mean that wakeup worked for you in this configuration?
> 
> Yes! But notice that in this case I removed commit 151b61284776 from kernel
> 3.4.0.

Well, that's because the comparison between the system sleep to enter and the maximum sleep state the device can wake up from is reversed.

Please (1) revert commit 151b61284776 and (2) apply the patch attached in this comment and check if the wakeup still works.
Comment 34 Dâniel Fraga 2012-05-25 21:59:44 UTC
> Well, that's because the comparison between the system sleep to enter and the
> maximum sleep state the device can wake up from is reversed.
> 
> Please (1) revert commit 151b61284776 and (2) apply the patch attached in
> this
> comment and check if the wakeup still works.

Yes, I already tested this case (2nd case from comment #27 and id did not work):

"2) applying your patch directly without removing commit 151b61284776, it
doesn't work, as you predicted"

So the answer is no, your patch doesn't solve the issue.

Feel free to send more patches, I can test all until we can figure it out a solution. Thank you Rafael!
Comment 35 Rafael J. Wysocki 2012-05-25 22:10:52 UTC
(In reply to comment #34)
> > Well, that's because the comparison between the system sleep to enter and
> the
> > maximum sleep state the device can wake up from is reversed.
> > 
> > Please (1) revert commit 151b61284776 and (2) apply the patch attached in
> this
> > comment and check if the wakeup still works.
> 
> Yes, I already tested this case (2nd case from comment #27 and id did not
> work):
> 
> "2) applying your patch directly without removing commit 151b61284776, it
> doesn't work, as you predicted"
> 
> So the answer is no, your patch doesn't solve the issue.
> 
> Feel free to send more patches, I can test all until we can figure it out a
> solution. Thank you Rafael!

Can you _please_ do as instructed?  That is:

(1) revert commit 151b61284776 and (2) apply the patch from comment #33 and see if the wakeup still works?

This is not about fixing your issue, but about trying to _understand_ what's going on here.  Is that clear enough?
Comment 36 Dâniel Fraga 2012-05-25 22:31:22 UTC
(In reply to comment #35)

> Can you _please_ do as instructed?  That is:
> 
> (1) revert commit 151b61284776 and (2) apply the patch from comment #33 and
> see
> if the wakeup still works?
> 
> This is not about fixing your issue, but about trying to _understand_ what's
> going on here.  Is that clear enough?

Sorry, I missed the patch from comment #33 (this bugzilla interface is awful).

Ok, did exactly as you asked: I applied patch from comment #33 on the 3.4.0 stock kernel and it did *not* work.

Yes, it was clear enough ;) Waiting for new instructions ;)
Comment 37 Rafael J. Wysocki 2012-05-25 22:32:23 UTC
BTW, "revert" means "make sure it's not there" (same as "remove", if you please).
Comment 38 Dâniel Fraga 2012-05-25 22:33:23 UTC
(In reply to comment #37)
> BTW, "revert" means "make sure it's not there" (same as "remove", if you
> please).

Ok, I'll do it now. Wait a minute. Sorry about my misinterpretation.
Comment 39 Dâniel Fraga 2012-05-25 22:40:52 UTC
(In reply to comment #37)
> BTW, "revert" means "make sure it's not there" (same as "remove", if you
> please).

Ok. Kernel 3.4.0 without commit 151b61284776 and with patch from comment #33.

Result: is works.
Comment 40 Rafael J. Wysocki 2012-05-25 22:44:47 UTC
Now, that's a surprise. :-)

OK, we'll need to wait for the reporters in bug #42728 to do their testing of the same patch.
Comment 41 Dâniel Fraga 2012-05-25 23:03:10 UTC
(In reply to comment #40)
> Now, that's a surprise. :-)
> 
> OK, we'll need to wait for the reporters in bug #42728 to do their testing of
> the same patch.

Thanks Rafael. I'm available and I'm glad I can help you solve this. So I'll wait. ;)

Ps: on comment #39 I mistyped "is" instead of "it" but you got it ;)
Comment 42 Oleksij Rempel (fishor) 2012-05-26 06:13:47 UTC
Ok, this dsdt method looks like in asus ux31e but triggers some new bug. hm..
According to this dsdt if we enter S3 sleeps state, we should put usb
controller in acpi_d2 state. Normally we assume acpi_d2 == pci_d2, at least
current code do this. But according to controller documentation (for asus
ux31e, not for hardware in this case) usb controller do not support D2 state.
On my hardware i can pass pci_D2 and it will be just ignored, it will continue
to stay in pci_D0. What if this controller do not support pci_D2 and will cause
this problem?
Comment 43 Oleksij Rempel (fishor) 2012-05-26 06:31:37 UTC
Created attachment 73400 [details]
map acpi_d2 to pci_d0

Can ou please test this patch?
Comment 44 Dâniel Fraga 2012-05-26 07:38:57 UTC
(In reply to comment #43)
> Created an attachment (id=73400) [details]
> map acpi_d2 to pci_d0
> 
> Can ou please test this patch?

Ok. Before I test, do you want me to test this patch with or without 
commit 151b61284776 ?

I need to know that because Rafael asked me to revert commit 151b61284776 from the stock 3.4.0 kernel before testing his patch. Do you want the same or should I apply your patch to the stock 3.4.0 kernel directly without removing anything before?
Comment 45 Oleksij Rempel (fishor) 2012-05-26 07:40:50 UTC
With commit 151b61284776. For example current master git.
Comment 46 Dâniel Fraga 2012-05-26 07:53:06 UTC
(In reply to comment #45)
> With commit 151b61284776. For example current master git.

I tested and it didn't work.
Comment 47 Oleksij Rempel (fishor) 2012-05-26 08:24:16 UTC
Created attachment 73401 [details]
acpi-fit-SxD-SxW-check-to-ACPI-plain

Oh wait. I was assuming commit 151b61284776 is my patch... . So the patch i send you make no difference. With commit 151b61284776 you controller stay in pci_D0 independent on what say dsdt. Lets try my patch to see if pci_D2 works for you.

I attached the plain version of my patch, to make sure we really do D2.
- remove commit 151b61284776 and my previous "remap" patch.
Comment 48 Dâniel Fraga 2012-05-26 08:40:05 UTC
(In reply to comment #47)

> I attached the plain version of my patch, to make sure we really do D2.
> - remove commit 151b61284776 and my previous "remap" patch.

Ok, I removed commit 151b61284776 from stock 3.4.0 kernel and applied only your latest patch (from comment #47) and it works.
Comment 49 Oleksij Rempel (fishor) 2012-05-26 08:55:34 UTC
Ok, thank you for testing!

pci_d2 or pci_d3 with sleesp state S3 work. d0 fails.

So we fails some where here:
	if (acpi_target_sleep_state == ACPI_STATE_S0 ||
	    (device_may_wakeup(dev) &&
	     adev->wakeup.sleep_state <= acpi_target_sleep_state)) {
Comment 50 Rafael J. Wysocki 2012-05-26 11:45:59 UTC
Oleksji, can't you see that the first hunk in your patch is wrong?

It turns out that the patch from comment #33 works both here and in bug #42728, so I think we can use this one eventually.
Comment 51 Rafael J. Wysocki 2012-05-26 11:48:29 UTC
Sorry, not the patch from comment #33, but the one here:

https://bugzilla.kernel.org/attachment.cgi?id=73399

Dâniel, can you please test this one on top of kernel 3.4.0 without commit 151b61284776?
Comment 52 Oleksij Rempel (fishor) 2012-05-26 13:01:39 UTC
(In reply to comment #50)
> Oleksji, can't you see that the first hunk in your patch is wrong?

Rafael, i can see it and i accepted it on mailing list. But you need to accept, even for you it took one month to see it.
This check is not readable. It need at least good comment or other variable names.
Comment 53 Alan Stern 2012-05-26 13:30:23 UTC
In all versions of the patch that I have seen, two useless comments are added to drivers/pci/pci.c with only a one-line change to the actual code -- and that one-line change has no obvious relation to the comments.

Comments should go with the code they apply to, not in the calling routines!
Comment 54 Oleksij Rempel (fishor) 2012-05-26 14:38:47 UTC
Ok. I have no more excuses, promise to digg better next time.
Comment 55 Dâniel Fraga 2012-05-26 20:41:08 UTC
(In reply to comment #48)
> (In reply to comment #47)
> 
> > I attached the plain version of my patch, to make sure we really do D2.
> > - remove commit 151b61284776 and my previous "remap" patch.
> 
> Ok, I removed commit 151b61284776 from stock 3.4.0 kernel and applied only
> your
> latest patch (from comment #47) and it works.

Sorry. I forgot to remove the "remap" patch. I redid the test and it didn't work.
Comment 56 Dâniel Fraga 2012-05-26 20:41:32 UTC
(In reply to comment #51)
> Sorry, not the patch from comment #33, but the one here:
> 
> https://bugzilla.kernel.org/attachment.cgi?id=73399
> 
> Dâniel, can you please test this one on top of kernel 3.4.0 without commit
> 151b61284776?

I tested this new patch and it didn't work.
Comment 57 Rafael J. Wysocki 2012-05-26 20:49:41 UTC
Created attachment 73413 [details]
ACPI / PM: Make acpi_pm_device_sleep_state() follow the specification

Well, I was afraid it wouldn't work.

The attached patch is a part of the patch you have just tested that seems to work for you.  To verify this, please apply it on top of kernel 3.4.0 without commit
151b61284776 and report back.
Comment 58 Alan Stern 2012-05-26 20:58:58 UTC
Created attachment 73414 [details]
Log choice of ACPI states for suspend

While you're testing Rafael's patches, please apply this patch also.

Then, before starting the suspend, do "sudo dmesg -C".  After the resume, save the output from "dmesg" and attach it to this bug report.

In addition, please attach to the bug report the output from "sudo lspci -vv -s 1d.0"
Comment 59 Dâniel Fraga 2012-05-26 21:12:57 UTC
(In reply to comment #57)
> Created an attachment (id=73413) [details]
> ACPI / PM: Make acpi_pm_device_sleep_state() follow the specification
> 
> Well, I was afraid it wouldn't work.
> 
> The attached patch is a part of the patch you have just tested that seems to
> work for you.  To verify this, please apply it on top of kernel 3.4.0 without
> commit
> 151b61284776 and report back.

Rafael, I tested with this patch, it works!

Ps: I'll do now what Alan requested and send the output.
Comment 60 Rafael J. Wysocki 2012-05-26 21:21:44 UTC
Created attachment 73415 [details]
PCI / PM: Make platform choose target low-power states of more devices

Good, thanks for verifying!

Now, please apply this patch (and the Alan's one) on top of the patch from comment #57 and on top of 3.4.0 without commit 151b61284776 and check if wakeup works for you (please attach the output of dmesg containing one suspend/resume cycle regardless of the result).
Comment 61 Dâniel Fraga 2012-05-26 21:42:27 UTC
(In reply to comment #60)
> Created an attachment (id=73415) [details]
> PCI / PM: Make platform choose target low-power states of more devices
> 
> Good, thanks for verifying!
> 
> Now, please apply this patch (and the Alan's one) on top of the patch from
> comment #57 and on top of 3.4.0 without commit 151b61284776 and check if
> wakeup
> works for you (please attach the output of dmesg containing one
> suspend/resume
> cycle regardless of the result).

This one didn't work Rafael.

Ok Rafael and Alan. I'll reply both here (wait for the attachments):

1) dmesg-73413.txt: dmesg output without commit 151b61284776, patch 73413 (comment #57) and Alan's log patch. Works.

2) dmesg-73415.txt: dmesg output without commit 151b61284776, patch 73413 (comment #57), patch 73415 (comment #60) and Alan's log patch. Doesn't work.

3) lspci-output.txt: lspci requested by Alan

Wait for the attachments please.
Comment 62 Dâniel Fraga 2012-05-26 21:43:24 UTC
Created attachment 73416 [details]
dmesg with patch from comment #57 and log patch from Alan: works
Comment 63 Dâniel Fraga 2012-05-26 21:44:02 UTC
Created attachment 73417 [details]
dmesg with patch from comment #57 and #60 and also Alan log patch (doesn't work)
Comment 64 Dâniel Fraga 2012-05-26 21:44:44 UTC
Created attachment 73418 [details]
sudo lspci -vv -s 1d.0 requested by Alan
Comment 65 Rafael J. Wysocki 2012-05-26 21:53:10 UTC
(In reply to comment #63)
> Created an attachment (id=73417) [details]
> dmesg with patch from comment #57 and #60 and also Alan log patch (doesn't
> work)

OK, so clearly, your EHCI controller doesn't handle wakeup correctly if it has been put into power state D2 before suspending the system.

This is a BIOS bug, because _S3D for the controller shouldn't return 2 in that case.

Realistically, the only way to work around this issue I can see is to quirk the controller so that it's not put into D2, but there may be a problem if it identifies in the same way as controllers in other machines that _have_ _to_ be put into D2 to work at all.
Comment 66 Rafael J. Wysocki 2012-05-26 22:11:29 UTC
Or perhaps we should just refine the check added by commit 151b61284776 so that it doesn't match your machine, if possible.

I suppose that would be the simplest thing we could do to fix the problem for you now - so long as it _is_ possible to refine that check this way.

Alan, what do you think?
Comment 67 Alan Stern 2012-05-26 22:37:39 UTC
I'd like to know why the logs in comment #62 and #63 don't show any output from the patch in comment #58.  The only difference that I can see from comment #60's patch would be if platform_pci_can_wakeup() returns true but yet ACPI chooses a state in which the device _can't_ wakeup the system.  But my patch doesn't show ACPI choosing any state at all.

Without knowing what's really going on, it's hard to say just where the problem is.
Comment 68 Alan Stern 2012-05-26 22:44:39 UTC
Oh wait -- I missed it.  The output _is_ there in the second log.

ACPI is telling us to use D2, even though the PME capability shows that the controller can't issue wakeups from D2 (and apparently there's no other platform-specific wakeup mechanism).  This does look like a bug in the BIOS.

Dâniel, can you check to see if any BIOS updates are available for your machine?
Comment 69 Dâniel Fraga 2012-05-26 22:50:32 UTC
(In reply to comment #68)
> Oh wait -- I missed it.  The output _is_ there in the second log.
> 
> ACPI is telling us to use D2, even though the PME capability shows that the
> controller can't issue wakeups from D2 (and apparently there's no other
> platform-specific wakeup mechanism).  This does look like a bug in the BIOS.
> 
> Dâniel, can you check to see if any BIOS updates are available for your
> machine?

Alan and Rafael, my BIOS is 3202 and there is a new one: 3304.

http://www.asus.com/Motherboards/Intel_Socket_1155/P8Z68V_PROGEN3/#download

I'll update to see if it helps, although it doens't mention any fix regarding power management. Then I'll repeat the tests.
Comment 70 Dâniel Fraga 2012-05-26 23:18:50 UTC
Created attachment 73421 [details]
acpidump with new (latest) bios (3304)

I don't know if it matters, but I found this differences between acpidump with old bios (3202) and latest bios (3304). I'll redo all the tests with new BIOS:

fraga@tux ~/src$ diff acpidump.txt acpidump-newbios.txt 
3258c3258
<   0000: 53 53 44 54 92 0a 00 00 01 c8 50 6d 52 65 66 00  SSDT......PmRef.
---
>   0000: 53 53 44 54 92 0a 00 00 01 d9 50 6d 52 65 66 00  SSDT......PmRef.
3261c3261
<   0030: 05 0c 0d 43 50 55 30 49 53 54 20 00 0c 18 00 e8  ...CPU0IST .....
---
>   0030: 05 0c 0d 43 50 55 30 49 53 54 20 00 0c 18 f0 e7  ...CPU0IST .....
Comment 71 Dâniel Fraga 2012-05-26 23:28:21 UTC
Created attachment 73422 [details]
dmesg with NEW BIOS and patch from comment #57 and #60 and also Alan log patch (doesn't work)

This is dmesg output with new bios, patch from comment #57 and #60, Alan log patch... It doesn't work. Alan log patch shows:

ehci_hcd 0000:00:1a.0: suspend root hub
e1000e 0000:00:19.0: ACPI recommends 0 0
e1000e 0000:00:19.0: wake-up capability enabled by ACPI
snd_hda_intel 0000:00:1b.0: ACPI recommends 4 0
PM: suspend of devices complete after 215.727 msecs
PM: late suspend of devices complete after 0.057 msecs
ehci_hcd 0000:00:1d.0: wakeup: 1
ehci_hcd 0000:00:1d.0: ACPI recommends 2 2
ehci_hcd 0000:00:1d.0: wake-up capability enabled by ACPI
ehci_hcd 0000:00:1d.0: wake-up capability disabled by ACPI
ehci_hcd 0000:00:1d.0: --> PCI D0 legacy
ehci_hcd 0000:00:1a.0: wakeup: 1
ehci_hcd 0000:00:1a.0: ACPI recommends 2 2
ehci_hcd 0000:00:1a.0: wake-up capability enabled by ACPI
ehci_hcd 0000:00:1a.0: wake-up capability disabled by ACPI
ehci_hcd 0000:00:1a.0: --> PCI D0 legacy
PM: noirq suspend of devices complete after 0.504 msecs
ACPI: Preparing to enter system sleep state S3

*****************************************************************

Rafael, do you want me to remove patch from comment #60 and test again?????
Comment 72 Oleksij Rempel (fishor) 2012-05-27 08:05:52 UTC
Just side notice:
- old and new dsdt are identical.
- if we test _SxD fix vs Alans quirk, we get not clean results.
The reason is. This board has 3! Controllers: 2 EHCI and 1 XHCI.

The system was brocken by quirk and syspend was probably like this:
EHC1 = D0
EHC2 = D0
XHCI = D3

after _SxD patch we get:
EHC1 = D2
EHC2 = D2
XHCI = D2  <- this can introduce extra error.

The comment 55 confuses me. If _SxD + remap works and _SxD stand alone do not work then the problem is XHCI.

because after _SxD + remap we should get:
EHC1 = D0
EHC2 = D0
XHCI = D0

comments?
Comment 73 Dâniel Fraga 2012-05-27 08:12:13 UTC
(In reply to comment #72)
> Just side notice:
> - old and new dsdt are identical.
> - if we test _SxD fix vs Alans quirk, we get not clean results.
> The reason is. This board has 3! Controllers: 2 EHCI and 1 XHCI.
> 
> The system was brocken by quirk and syspend was probably like this:
> EHC1 = D0
> EHC2 = D0
> XHCI = D3
> 
> after _SxD patch we get:
> EHC1 = D2
> EHC2 = D2
> XHCI = D2  <- this can introduce extra error.
> 
> The comment 55 confuses me. If _SxD + remap works and _SxD stand alone do not
> work then the problem is XHCI.
> 
> because after _SxD + remap we should get:
> EHC1 = D0
> EHC2 = D0
> XHCI = D0
> 
> comments?

Oleksij XHCI is used only by USB 3.0 right? So, in the BIOS I _disable_ USB 3.0 since I don't use it. Could it be this the problem?

If you want, I can enable on the BIOS USB 3.0 support and see what happens. Do you want me to enable USB 3.0 support in the bios even if I don't use those ports? Or this BIOS setting doesn't matter?

Ps: I always disable everything in the BIOS I don't use to save energy.
Comment 74 Oleksij Rempel (fishor) 2012-05-27 08:26:40 UTC
Hm... then it should not make any difference. But who knows... if nobody try, nobody knows ;)

Or may be you need to make more careful tests. Because "Alans quirk" and "_SxD + remap" should have same results for EHCI (USB2).
Comment 75 Rafael J. Wysocki 2012-05-27 09:38:29 UTC
In the dmesg output from comment #62 (working) we have:

ehci_hcd 0000:00:1d.0: wakeup: 1
ehci_hcd 0000:00:1d.0: wake-up capability enabled by ACPI
ehci_hcd 0000:00:1d.0: --> PCI D3hot
ehci_hcd 0000:00:1a.0: wakeup: 1
ehci_hcd 0000:00:1a.0: wake-up capability enabled by ACPI
ehci_hcd 0000:00:1a.0: --> PCI D3hot

So clearly, both EHCI controllers go into D3 and wakeup works.

In the dmesg from comment #63 (not working) we have:

ehci_hcd 0000:00:1d.0: wakeup: 1
ehci_hcd 0000:00:1d.0: ACPI recommends 2 2
ehci_hcd 0000:00:1d.0: wake-up capability enabled by ACPI
ehci_hcd 0000:00:1d.0: wake-up capability disabled by ACPI
ehci_hcd 0000:00:1d.0: --> PCI D0 legacy
ehci_hcd 0000:00:1a.0: wakeup: 1
ehci_hcd 0000:00:1a.0: ACPI recommends 2 2
ehci_hcd 0000:00:1a.0: wake-up capability enabled by ACPI
ehci_hcd 0000:00:1a.0: wake-up capability disabled by ACPI
ehci_hcd 0000:00:1a.0: --> PCI D0 legacy

which is suspicious, because it looks like the controllers are left in D0.  Also, the "wake-up capability disabled by ACPI" lines look quite wrong.

I need to investigate it a bit more, but I don't have the time right now.
Comment 76 Rafael J. Wysocki 2012-05-27 11:36:23 UTC
OK, I know what's up.

Namely, ACPI says that we should use D2, but it follows from the lspci data at https://bugzilla.kernel.org/attachment.cgi?id=73418 that D2 is not supported by the device.  For this reason, pci_set_power_state() returns error code when it is called to put the device into D2 and wakeup is disabled accordingly.

Concluding, it looks like we should check the results returned by acpi_pm_device_sleep_state() against the actual capabilities of the hardware and choose something that actually is supported.

I think the algorithm should be:
- if the range of states returned by acpi_pm_device_sleep_state() is not supported by the device, go into the next deeper (lower-power) supported state.
- if the device is configured for wakeup and wakeup is not supported by it from the range of states returned by acpi_pm_device_sleep_state(), go into the next deeper (lower-power) state that wakeup is supported from.

I'll try to prepare a patch along these lines later today or tomorrow.
Comment 77 Oleksij Rempel (fishor) 2012-05-27 13:50:07 UTC
With other words, what PCI_D-state will it choice?
Do not forged the other bug where this problem is actually started. It has same controller, same pciid and same dsdt-_SxD entry. If we enter on other devoice D3, then we have problem with other hardware.
Comment 78 Oleksij Rempel (fishor) 2012-05-27 13:52:53 UTC
This bug:
D0 - bad, D2 - bad, D3 - ok.

the asus ux31e bug:
D0 - ok, D2 - ok, D3 - bad.

how can we fit in it?
Do actually suspend works with windows on this board?
Comment 79 Alan Stern 2012-05-27 16:36:26 UTC
There are two separate questions here:

   1. What device power states are allowed during a particular system sleep state?

   2. What device power state/system sleep state combinations support wakeup?

For example, on the ux31e machines we are now ruling out D3 because the _SxW method says D3 doesn't support wakeup.  (_SxW returns the highest-numbered state in which the device can wake up the system, right?)  But what if the controller's power/wakeup attribute was set to disabled?  Then we wouldn't check _SxW, so the controller would be put into D3 anyway and the machine would crash.

Oleksij, have you carefully compared the ACPI tables on Dâniel's machine with the tables on any of the ux31e machines?  Maybe there's a difference that will help us to distinguish these two cases.

On the other hand, when the ACPI tables are simply wrong it's hard to know what to do.
Comment 80 Oleksij Rempel (fishor) 2012-05-27 19:28:17 UTC
Created attachment 73426 [details]
dsdt-parts-P8V68-V

Here are part of dsdt for P8V68-V (current topic). Here are only parts relevant for power management.
Comment 81 Oleksij Rempel (fishor) 2012-05-27 19:30:31 UTC
Created attachment 73427 [details]
dsdt-parts-ux31e

and this is dsdt for ux31e.
The EHC1 part is absolutely identical.
Comment 82 Oleksij Rempel (fishor) 2012-05-27 19:59:19 UTC
There are some differences:
- ux31e rewrite device suspend state only for EHC1. There is XHCI too, there is no _SxD/W methods for it.
- P8V68-V defines EUSB with address of EHC1, with same device states.
- P8V68-V defines device states for EHC2 and XHC too.


One question. Depends on how colled _PSW method it writes some thing to PCI_config (with arg, then it write "Ones", if no arg, then "Zero"). Should we actually call it before we set acpi_S3 state? See attached dsdt parts.
If we wont set brocken device to D3 then we should call _PSW and disable wake capabilities. If we wont wake then we should call _PSW and set device in D2 state. Even if we know wake will not work. Is it correct?
Comment 83 Rafael J. Wysocki 2012-05-27 20:43:09 UTC
My opinion is that we should use the states returned by _SxD/_SxW, so long as they aren't disabled in the PCI config space (or we want the device to wake up, but that is not possible from those states).  The question is how to handle corner cases when something doesn't match etc.

First, I think we should fall back to deeper (lower-power) states if the state returned by _SxD is not supported in hardware.  The reason is that the _SxD return value means "do not put the device into a shallower state that this" and it doesn't prevent us from using a deeper state.  In that case, if _SxW is not present we really don't know what the lower limit is, so we can use whatever we
find suitable.  If SxW is there and returns something that is supported, we should use that one.

Another problem is if the states returned by _SxD/_SxW are supported (i.e. the device can be put into those states), but it cannot generate PME from there.  This is more of a gray area, but I'm inclined to fall back to the current behavior, i.e. find the lowest-power state that PMEs can be generated from and use this one.

Now, for ux31e systems, it is sufficient to use D0 if both _SxD/_SxW are not present.  This actually is not against the spec, so we can do it.
Comment 84 Rafael J. Wysocki 2012-05-28 19:19:10 UTC
I have one more idea.

Dâniel, please attach the output of dmidecode from your machine (as plain text).
Comment 85 Dâniel Fraga 2012-05-28 21:56:58 UTC
Created attachment 73449 [details]
dmidecode as requested by Rafael
Comment 86 Alan Stern 2012-06-18 20:13:08 UTC
This has now been fixed by commit c2fb8a3fa25513de8fedb38509b1f15a5bbee47b (USB: add NO_D3_DURING_SLEEP flag and revert 151b61284776be2) in 3.5-rc3.  The bug report can be closed.
Comment 87 Javier Marcet 2012-06-27 09:18:22 UTC
Please don´t. Commit c2fb8a3fa25513de8fedb38509b1f15a5bbee47b breaks many other boards which are not affected by this bug.

I have an Asus P8H67-M board which I use as HTPC/NAS. I turn it on with a bluetooth remote and both c2fb8a3fa25513de8fedb38509b1f15a5bbee47b and 151b61284776be2d6f02d48c23c3625678960b97 break it unnecessarily.

IMHO this should be done with a blacklist only since not even all boards with the same chipset (Sandy Bridge in my case) and from the same manufacturer (Asus) are affected.

If you want any additional info from this system (dmesg, lspci, dmidecode, acpidump, ...), please tell.

I remember having been affected by http://www.acpica.org/bugzilla/show_bug.cgi?id=937 It was fixed upstream on 3.3, though.
Comment 88 Alan Stern 2012-07-23 19:26:37 UTC
Now that commit dbf0e4c7257f8d684ec1a3c919853464293de66e (PCI: EHCI: fix crash during suspend on ASUS computers) is in Linus's kernel, and since it appears to be a final valid fix for the problem, this bug report can be closed out once and for all.
Comment 89 Zhang Rui 2012-11-28 13:33:25 UTC
(In reply to comment #88)
> Now that commit dbf0e4c7257f8d684ec1a3c919853464293de66e (PCI: EHCI: fix
> crash
> during suspend on ASUS computers) is in Linus's kernel, and since it appears
> to
> be a final valid fix for the problem, this bug report can be closed out once
> and for all.

Bug closed.
Comment 90 Len Brown 2013-02-08 18:07:30 UTC
*** Bug 43250 has been marked as a duplicate of this bug. ***