Bug 26432
Summary: | 2.6.36 regression: reboot after poweroff - HP6930p | ||
---|---|---|---|
Product: | ACPI | Reporter: | drago01 |
Component: | Power-Off | Assignee: | Zhang Rui (rui.zhang) |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | aaron.lu, alan, lenb, rjw, rui.zhang |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 3.1.9 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Bug Depends on: | |||
Bug Blocks: | 16055 | ||
Attachments: |
dmidecode
lspci -vvvv
dmesg on 2.6.37
/proc/acpi/wakeup running 2.6.35
/proc/acpi/wakeup running 2.6.37
acpidump output
/proc/acpi/wakeup running 3.8.5
debug patch to check who installs gpe handler |
Description
drago01
2011-01-09 20:45:26 UTC
Created attachment 43012 [details]
dmidecode
Created attachment 43022 [details]
lspci -vvvv
*** Bug 26422 has been marked as a duplicate of this bug. ***

(In reply to comment #3)
> *** Bug 26422 has been marked as a duplicate of this bug. ***

Ugh ... sorry for the double filing. After submitting the first one I got a white page (and no mail), so I was not sure whether the bug was actually filed.

Is this still an issue with 2.6.37? Can you bisect which commit broke poweroff between 2.6.35 and 2.6.36?

Created attachment 43942 [details]
dmesg on 2.6.37
This also happens on a Sony VAIO SR29VN. The last working release was 2.6.36_rc7 according to the complaining user. The first broken was 2.6.36. 2.6.37 still has the issue. dmesg is attached.
(In reply to comment #6)
> Created an attachment (id=43942) [details]
> dmesg on 2.6.37
>
> This also happens on a Sony VAIO SR29VN. The last working release was
> 2.6.36_rc7 according to the complaining user. The first broken was 2.6.36.
> 2.6.37 still has the issue. dmesg is attached.

Yeah I didn't test any rc releases yet, as I stated in comment #1 it is broken in .37 too.

Please post the contents of /proc/acpi/wakeup.

Created attachment 44082 [details]
/proc/acpi/wakeup running 2.6.35
Here is the output for 2.6.25, i.e. the working case.
Created attachment 44092 [details]
/proc/acpi/wakeup running 2.6.37
Here is the broken case (2.6.37).
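Editorial aside: the two dumps can also be compared mechanically rather than by eye. The sketch below is an illustration and not part of the original report. It assumes both dumps were saved as plain text files (for example with `cat /proc/acpi/wakeup > wakeup-2.6.35.txt`), and the helper name `wakeup_diff` is hypothetical. Column spacing is normalized first, since the whitespace may differ between kernel versions.

```shell
#!/bin/sh
# wakeup_diff OLD NEW (hypothetical helper, for illustration only):
# print the lines that differ between two saved /proc/acpi/wakeup dumps.
# Lines starting with "<" come from OLD, lines starting with ">" from NEW.
wakeup_diff() {
    a=$(mktemp) || return 1
    b=$(mktemp) || return 1
    # Collapse runs of spaces/tabs so only real content changes show up.
    awk '{ $1 = $1; print }' "$1" > "$a"
    awk '{ $1 = $1; print }' "$2" > "$b"
    # diff exits non-zero when the files differ; that is expected here.
    diff "$a" "$b" | grep '^[<>]' || true
    rm -f "$a" "$b"
}
```

Called as `wakeup_diff wakeup-2.6.35.txt wakeup-2.6.37.txt`, it prints only the entries whose line changed between the two kernels.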
(In reply to comment #9)
> Created an attachment (id=44082) [details]
> /proc/acpi/wakeup running 2.6.25
>
> Here he output for 2.6.25 i.e the working case.

This should read 2.6.*35*.

For the record, 2.6.38-rc2 is also broken.

It's great that kernel bugzilla is back.

Can you please verify if the problem still exists in the latest upstream kernel?

(In reply to comment #13)
> It's great that kernel bugzilla is back.
>
> can you please verify if the problem still exists in the latest upstream
> kernel?

It is still present in 3.1.9 .. have not tested 3.2.x yet.

(In reply to comment #14)
> (In reply to comment #13)
> > It's great that kernel bugzilla is back.
> >
> > can you please verify if the problem still exists in the latest upstream
> > kernel?
>
> It is still present in 3.1.9 .. have not tested 3.2.x yet.

Hi,

If it's still a problem on the latest kernel, can you please do the git bisect as suggested in comment #5 between 2.6.35 and 2.6.36?

Here is a link on how to do git bisect:
http://git-scm.com/book/en/Git-Tools-Debugging-with-Git#Binary-Search

You can start with:
$ git bisect start
$ git bisect bad v2.6.36
$ git bisect good v2.6.35

Thanks.

(In reply to comment #15)
> (In reply to comment #14)
> > (In reply to comment #13)
> > > It's great that kernel bugzilla is back.
> > >
> > > can you please verify if the problem still exists in the latest upstream
> > > kernel?
> >
> > It is still present in 3.1.9 .. have not tested 3.2.x yet.
> Hi,
>
> If it's still a problem on latest kernel,

Yes, it still happens in 3.8.2.

> can you please do the git bisect as
> suggested in comment #5 between 2.6.35 and 2.6.36?

I am afraid I can't. The kernel is way too old for my current userspace; it won't even boot.

Please attach the acpidump output.

Please attach the /proc/acpi/wakeup output in the latest kernel that you're running.

Created attachment 98521 [details]
acpidump output
Here is a dump generated while running 3.8.5
First, I think it is commit f517709d65beed95f52f021b43e3035b52ef791a that allows wake GPEs other than the power/sleep/lid buttons to be run_wake GPEs. That commit was introduced in 2.6.34. This explains why there are so many "*" in /proc/acpi/wakeup.

Second, the only difference in /proc/acpi/wakeup between 2.6.35 and 2.6.37 is:
-USB6 S0 disabled pci:0000:00:1a.2
+USB6 S0 *disabled pci:0000:00:1a.2

I checked the BIOS, and _PRW for USB6 should use GPE 0x20 as its wake GPE, but there is no AML handler for this GPE. So I suspect that some driver, e.g. the driver for pci:0000:00:1a.2, installs a GPE handler for the wakeup GPE as of some change between 2.6.35 and 2.6.36, and this explains why the wake GPE for USB6 becomes run_wake. But I cannot find any USB code that invokes acpi_install_gpe_handler().

drago01, can you please attach the output of "grep . /sys/firmware/acpi/interrupts/*" in both 2.6.35 and 2.6.37? I think GPE20 should be invalid in 2.6.35 and enabled in 2.6.37.

BTW, please attach the output of /proc/acpi/wakeup in 3.8.5.

Created attachment 98591 [details]
/proc/acpi/wakeup running 3.8.5
Created attachment 98601 [details]
debug patch to check who installs gpe handler
can you please try to apply this debug patch on top of 2.6.37 and attach the dmesg output after boot?
USB1 S0 *enabled pci:0000:00:1d.0
USB2 S0 *enabled pci:0000:00:1d.1
USB3 S0 *enabled pci:0000:00:1d.2
USB4 S0 *enabled pci:0000:00:1a.0
USB5 S0 *enabled pci:0000:00:1a.1
USB6 S0 *enabled pci:0000:00:1a.2
EHC1 S0 *enabled pci:0000:00:1d.7
EHC2 S0 *enabled pci:0000:00:1a.7

Can you reproduce the problem in 3.8.5 after disabling all of these?
(For example, you can disable USB1 by echo USB1 > /proc/acpi/wakeup)

(In reply to comment #23)
> USB1 S0 *enabled pci:0000:00:1d.0
> USB2 S0 *enabled pci:0000:00:1d.1
> USB3 S0 *enabled pci:0000:00:1d.2
> USB4 S0 *enabled pci:0000:00:1a.0
> USB5 S0 *enabled pci:0000:00:1a.1
> USB6 S0 *enabled pci:0000:00:1a.2
> EHC1 S0 *enabled pci:0000:00:1d.7
> EHC2 S0 *enabled pci:0000:00:1a.7
>
> can you reproduce the problem in 3.8.5 after disabling all of this?
> (for example, you can disable USB1 by echo USB1 > /proc/acpi/wakeup)

I can only disable USB5 (all others remain enabled in /proc/acpi/wakeup even if I try to disable them).

As for whether it helps: it worked in 4 out of 4 attempts, which does not mean much as it does not always happen. If it helps I can run with it for a few days and report back.

(In reply to comment #24)
> As for whether it helps it worked in 4 out of 4 attempts. Which does not mean
> much as it does not always happen.

Oh, the machine powers off properly most of the time and reboots occasionally? I was not aware of this before. Can you tell me in what percentage of cases the bug occurs?

(In reply to comment #25)
> (In reply to comment #24)
>
> > As for whether it helps it worked in 4 out of 4 attempts. Which does not
> > mean much as it does not always happen.
>
> Oh, the machine powers off properly most of time and reboot occasionally?

Yeah.

> I'm not aware of this before.

Sorry, seems like I indeed didn't mention that anywhere.

> can you tell me in what percentage the bug occurs?

Not sure, something like 20% or 30%.
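Editorial aside: the numbers above show why "4 out of 4" is weak evidence. If the reboot happens on roughly 20% to 30% of power-offs, and if each attempt is independent (an assumption, not something established in this thread), four clean power-offs in a row are quite likely even with the bug still present. The sketch below computes that probability, and also how many consecutive clean power-offs would be needed before a fix looks convincing at a chosen confidence level. Both function names are hypothetical.

```shell
#!/bin/sh
# p_all_clean RATE N: probability that N power-offs in a row succeed
# when each one fails independently with probability RATE.
p_all_clean() {
    LC_ALL=C awk -v r="$1" -v n="$2" 'BEGIN { printf "%.2f\n", (1 - r) ^ n }'
}

# runs_needed RATE CONF: smallest N such that N clean runs in a row would
# happen with probability at most (1 - CONF) if the bug were still present.
runs_needed() {
    LC_ALL=C awk -v r="$1" -v c="$2" 'BEGIN {
        x = log(1 - c) / log(1 - r)   # solve (1 - r)^N <= 1 - c for N
        n = int(x)
        if (n < x) n++                # round up to a whole number of runs
        print n
    }'
}
```

With a 20% failure rate, `p_all_clean 0.2 4` prints 0.41; with 30%, `p_all_clean 0.3 4` prints 0.24. For 95% confidence, `runs_needed 0.2 0.95` prints 14 and `runs_needed 0.3 0.95` prints 9, so on the order of ten or more clean power-offs are needed before concluding anything.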
(In reply to comment #24)
> (In reply to comment #23)
> > USB1 S0 *enabled pci:0000:00:1d.0
> > USB2 S0 *enabled pci:0000:00:1d.1
> > USB3 S0 *enabled pci:0000:00:1d.2
> > USB4 S0 *enabled pci:0000:00:1a.0
> > USB5 S0 *enabled pci:0000:00:1a.1
> > USB6 S0 *enabled pci:0000:00:1a.2
> > EHC1 S0 *enabled pci:0000:00:1d.7
> > EHC2 S0 *enabled pci:0000:00:1a.7
> >
> > can you reproduce the problem in 3.8.5 after disabling all of this?
> > (for example, you can disable USB1 by echo USB1 > /proc/acpi/wakeup)
>
> I can only disable USB5 (all others remain enabled in /proc/acpi/wake even if
> I try to disable them).

For the other ones, please try the following command, taking USB1 as an example:
"echo disabled > /sys/bus/pci/devices/0000\:00\:1d.0/power/wakeup"
and see if it helps.

Can you please do the test in comment #27?

(In reply to comment #28)
> can you please do the test in comment #27?

Sorry for the delay ... yes, this works; I can disable the others this way.

(In reply to comment #29)
> (In reply to comment #28)
> > can you please do the test in comment #27?
>
> Sorry for the delay ... yes this works I can disable the others this way.

I am now running 3.8.8 and have disabled all of the entries from comment #23 and did multiple poweroff cycles ... seems to work fine so far, i.e. no reboot instead of power off.

Okay, then we need to check them one by one to see which device wakes the system up. And IMO, USB6 is the first suspect worth trying.

(In reply to comment #30)
> (In reply to comment #29)
> > (In reply to comment #28)
> > > can you please do the test in comment #27?
> >
> > Sorry for the delay ... yes this works I can disable the others this way.
>
> I am now running 3.8.8 and have disabled all of the entries from comment #23
> and did multiple poweroff cycles ... seems to work fine so far i.e no reboot
> instead of power off.

OK, I take that back: it happened again today with all of them disabled. Is there anything useful that I can extract after such a reboot?
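Editorial aside: the sysfs method from comment #27 can be applied to every enabled entry without typing each path by hand. The sketch below is an illustration only. It assumes the four-column layout quoted in comment #23 (device, S-state, status, sysfs node) and handles only `pci:` nodes. The function name `enabled_wakeup_paths` is hypothetical. It only prints paths and changes nothing itself.

```shell
#!/bin/sh
# enabled_wakeup_paths FILE (hypothetical helper): for every PCI device
# marked "enabled" or "*enabled" in a /proc/acpi/wakeup-style FILE, print
# the sysfs power/wakeup attribute that controls it.
enabled_wakeup_paths() {
    awk '$3 ~ /enabled$/ && $4 ~ /^pci:/ {
        sub(/^pci:/, "", $4)          # "pci:0000:00:1d.0" -> "0000:00:1d.0"
        print "/sys/bus/pci/devices/" $4 "/power/wakeup"
    }' "$1"
}
```

As root, something like `for p in $(enabled_wakeup_paths /proc/acpi/wakeup); do echo disabled > "$p"; done` would then disable all of them at once. Re-enabling them one at a time between power-off tests is one way to narrow down which device wakes the system.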
> OK I take that back it happened again today with all of them disabled.
Bad news.
Well, the next thing worth trying is to disable runtime PM completely.
Could you please rebuild your kernel with CONFIG_PM_RUNTIME=n and see if it helps?
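Editorial aside: before retesting, it is worth confirming that the rebuilt kernel really has runtime PM turned off. A minimal sketch, assuming the kernel configuration is available as a plain text file (commonly `/boot/config-$(uname -r)`, or `/proc/config.gz` decompressed with `zcat`, depending on the distribution). The function name `pm_runtime_state` is hypothetical.

```shell
#!/bin/sh
# pm_runtime_state CONFIG_FILE (hypothetical helper): report whether
# CONFIG_PM_RUNTIME is enabled in the given kernel config file.
pm_runtime_state() {
    if grep -q '^CONFIG_PM_RUNTIME=y$' "$1"; then
        echo enabled
    elif grep -q '^# CONFIG_PM_RUNTIME is not set$' "$1"; then
        # Kconfig writes disabled boolean options as a comment line.
        echo disabled
    else
        echo unknown
    fi
}
```

For the test requested here, `pm_runtime_state "/boot/config-$(uname -r)"` should print `disabled`. An `unknown` result means the option does not appear in that file at all.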
ping...

ping...

(In reply to comment #35)
> ping...

Hi, sorry, I have been busy lately. I will build a kernel and test with CONFIG_PM_RUNTIME=n at the end of this week.

Any news?

ping...

(In reply to comment #37)
> Any news?

(In reply to comment #38)
> ping...

I have been running 3.9.7 with CONFIG_PM_RUNTIME=n since yesterday. I have not been able to reproduce the bug yet. I have tried multiple boot->poweroff and boot->suspend->resume->poweroff tests and all of them worked as expected.

OK, thanks!

Let's hope it's been fixed and close it. Please reopen if you can reproduce it.

(In reply to comment #40)
> OK, thanks!
>
> Let's hope it's been fixed and close it. Please reopen if you can reproduce
> it.

So you are saying that there is no way to fix it other than disabling runtime PM?

I'd rather consider this a driver/firmware issue than an ACPI problem. If you'd like to continue debugging, the best way is to disable runtime PM support bus by bus, check which bus causes this problem, and then file a new bug again that component.

s/again/against