Bug 6881
Summary: | S3 resume: Wake On LAN hangs, button works | ||
---|---|---|---|
Product: | ACPI | Reporter: | fengping hu (hufengping) |
Component: | Power-Sleep-Wake | Assignee: | Robert Moore (Robert.Moore) |
Status: | REJECTED UNREPRODUCIBLE | ||
Severity: | blocking | CC: | acpi-bugzilla |
Priority: | P2 | ||
Hardware: | i386 | ||
OS: | Linux | ||
Kernel Version: | 2.6.17 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: |
screen0 on wake on lan event
screen1 on wake on lan event
screen2 on wake on lan event
screen on power button event
screen0 with DEBUG info on wake on lan event
screen1 with DEBUG info on wake on lan event
patch vs 2.6.21-rc1 fixing resume |
Description
fengping hu
2006-07-21 12:04:39 UTC
>On receiving the packet, the messages that show up on screen are:
>Intel Machine check architecture supported
>BUG: soft lockup detected on CPU#0
>ACPI Exception(evgpe-0678)
>AE_NO_MEMORY unable to queue handler for GPE[B]...
Sounds interesting. Can you give the full log? A photo of the screen is good
enough.
Created attachment 8612 [details]
screen0 on wake on lan event
On receiving the wake on LAN magic packet:
screen0 shows up;
after a while (about 5 seconds)
screen2 shows up;
after about 1 minute
screen3 shows up (and keeps printing the messages).
On power button:
screen-powerbutton shows up,
then after about 5 seconds the system is back to normal.
Created attachment 8613 [details]
screen1 on wake on lan event
Created attachment 8614 [details]
screen2 on wake on lan event
Created attachment 8615 [details]
screen on power button event
2.6.18-rc2 fixed the 'wrong open interrupt' bug, which should fix the soft lockup issue. For the GPE 0xB issue, I guess 2.6.18-rc2 could also fix it. Can you try 2.6.18-rc2?

I just tried 2.6.18-rc2. This time, on receiving the magic packet, the only message that shows up on screen is:
inu
and the machine is dead. Wake on power button and keyboard works fine, as in the previous version.

>inu
This means the CPU has already entered protected mode.
Can you change the printk level and retry? There might be other info hidden.
echo "7 4 1 6" > /proc/sys/kernel/printk
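For reference, the four numbers written to /proc/sys/kernel/printk are, in order, the console log level, the default message level, the minimum console level, and the default console level. A message reaches the console only if its level number is lower than the console log level. A minimal sketch of that rule in C is below; the struct and function names are hypothetical, chosen for illustration, and are not kernel APIs.

```c
#include <stdio.h>

/* Hypothetical holder for the four fields of /proc/sys/kernel/printk. */
struct printk_levels {
    int console;          /* messages with a lower number reach the console */
    int default_message;  /* level given to messages without an explicit one */
    int minimum_console;  /* lowest value the console level may be set to */
    int default_console;  /* default value of the console level */
};

/* Parse a string such as "7 4 1 6". Returns 1 on success, 0 otherwise. */
static int parse_printk(const char *s, struct printk_levels *p)
{
    return sscanf(s, "%d %d %d %d", &p->console, &p->default_message,
                  &p->minimum_console, &p->default_console) == 4;
}

/* A message is shown on the console only if its level is strictly lower. */
static int reaches_console(const struct printk_levels *p, int msg_level)
{
    return msg_level < p->console;
}
```

Under this rule, the setting "7 4 1 6" lets messages up to level 6 (KERN_INFO) through, while level 7 (KERN_DEBUG) messages would need a console level of 8.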
Changed the printk level with echo "7 4 1 6" > /proc/sys/kernel/printk and retried; still no more messages showed up on screen.
One thing to mention: when I execute the command
cat /proc/acpi/wakeup
the output is:
Device Sleep state Status
SBTN 4 * enabled
PCI1 4 enabled
SBRG 4 enabled
UAR1 4 disabled
USB 4 disabled
AC9 4 disabled
SMB 4 disabled
The sleep state does not look right. Sleep state 3 is not in the list. However, the machine responds to a Wake on LAN event from suspend to RAM only after device PCI1 is enabled.
Another thing to mention is that Wake on LAN works if the machine is put in the "Soft Off" state with the command: shutdown -h now

>The sleep state looks not so right. Sleep state 3 is not in the list.
Yes, if it can wake the system from S4, it can also wake it from S3.
Can you enable the NMI watchdog in the system? Hopefully the watchdog will print some
more info. There is a similar bug report against 2.6.18-rc2 (resume hang), but
I couldn't reproduce the bug here.
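The reading of the /proc/acpi/wakeup listing given above (a device listed with sleep state 4 can also wake the system from S3) can be sketched in C as follows. This is only an illustration under stated assumptions: the row layout is taken from the output quoted in this report, the "Sleep state" column is assumed to be the deepest state the device can wake from, and the struct and function names are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* One row of the /proc/acpi/wakeup listing quoted in this report. */
struct wake_entry {
    char name[8];   /* device name, e.g. "PCI1" */
    int  deepest;   /* value from the "Sleep state" column */
    int  enabled;   /* 1 if the Status column reads "enabled" */
};

/* Parse a row such as "SBTN 4 * enabled" or "PCI1 4 enabled".
 * Returns 1 on success, 0 if the line is not a device row. */
static int parse_wake_line(const char *line, struct wake_entry *e)
{
    char a[16], b[16];
    int n = sscanf(line, "%7s %d %15s %15s", e->name, &e->deepest, a, b);

    if (n < 3)
        return 0;   /* e.g. the header line: no integer in column two */

    /* When the optional "*" marker is present, status is the fourth token. */
    const char *status = (n == 4) ? b : a;
    e->enabled = (strcmp(status, "enabled") == 0);
    return 1;
}

/* Assumption: a device can wake from any state up to its listed one. */
static int can_wake_from(const struct wake_entry *e, int sleep_state)
{
    return e->enabled && sleep_state <= e->deepest;
}
```

With the rows from this report, PCI1 ("4 enabled") would be treated as able to wake from S3, while UAR1 ("4 disabled") would not.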
I tried nmi_watchdog=1 and nmi_watchdog=2 as kernel parameters. In both cases, the NMI value in /proc/interrupts is 0. Does that mean I can't enable the NMI watchdog on this system?
I also tried kernel 2.6.17 with the line "#define DEBUG" added to drivers/base/sys.c, following the instructions in bug 6450. This time more information is printed on screen (see the new attachments). I wonder if this helps.

Created attachment 8632 [details]
screen0 with DEBUG info on wake on lan event
Created attachment 8633 [details]
screen1 with DEBUG info on wake on lan event
>Does that mean I cann't enable NMI watchdog in the system.
You don't have the local APIC enabled, so there is no NMI watchdog. The new log looks very much like Bug 6450. I'm wondering whether disabling softlockup makes any difference.

I have tested 2.6.18-rc2 with the NMI watchdog enabled. No further messages show up on screen on a wake on LAN event; that is, the only message shown on screen is:
inu.
> I'm wondering if disabling softlockup has any difference.
Does that mean running "make menuconfig" and excluding "Detect Soft Lockups" in "Kernel
debugging"? I just tested this and it didn't make any difference.
Please try this debug patch to test wake on LAN. From the bug description and the screen messages, this problem is likely related to a PCI1 GPE storm.
diff --git a/drivers/acpi/hardware/hwsleep.c b/drivers/acpi/hardware/hwsleep.c
index 8bb43ca..092a878 100644
--- a/drivers/acpi/hardware/hwsleep.c
+++ b/drivers/acpi/hardware/hwsleep.c
@@ -559,11 +559,12 @@ acpi_status acpi_leave_sleep_state(u8 sl
 	}
 	acpi_gbl_system_awake_and_running = TRUE;
+#if 0 /*YU debug*/
 	status = acpi_hw_enable_all_runtime_gpes();
 	if (ACPI_FAILURE(status)) {
 		return_ACPI_STATUS(status);
 	}
-
+#endif
 	/* Enable power button */
 	(void)

Just tested the debug patch; wake on LAN still hangs :(

Did you see the flood of "Unable to queue handler for GPE .." messages on screen?

>Did you see the flood of "Unable to queue handler for GPE .." message on
screen?
For kernel 2.6.18-rc2, the only message shown on screen is "inu".
For kernel 2.6.17, I did see the flood of "Unable to queue handler for GPE .."
messages on screen before I applied this patch.
Since this flood of messages shows up more than one minute after the wake on LAN
event, I didn't wait that long today (I turned the machine off when it was at
screen1 and went home :) ). I will test it tomorrow to see whether the flood of
messages still shows up after the patch.
The flood of "Unable to queue handler for GPE .." messages can still be seen in 2.6.17 after the debug patch.

Hmm, let's debug further. Please try this:
diff --git a/drivers/acpi/sleep/main.c b/drivers/acpi/sleep/main.c
index 62ce87d..2838e6f 100644
--- a/drivers/acpi/sleep/main.c
+++ b/drivers/acpi/sleep/main.c
@@ -113,6 +113,8 @@ static int acpi_pm_enter(suspend_state_t
 	if (ACPI_SUCCESS(status) && (acpi_state == ACPI_STATE_S3))
 		acpi_clear_event(ACPI_EVENT_POWER_BUTTON);
+	if (ACPI_SUCCESS(status) && (acpi_state == ACPI_STATE_S3))
+		acpi_ev_walk_gpe_list(acpi_hw_clear_gpe_block);
 	local_irq_restore(flags);
 	printk(KERN_DEBUG "Back to C!\n");

Just tried the second patch. It still hangs.
I don't know if this provides more information: sometimes when I use the command "echo -n mem > /sys/power/state", the machine suspends but resumes instantly. But if I use the command "echo SBTN > /proc/acpi/wakeup" to disable device SBTN, suspend works fine. (This occurs both before and after the patch.)

S3 resume: Wake On LAN now works. This is indeed caused by a GPE storm.
What I did is modify drivers/acpi/events/evgpe.c:
acpi_status acpi_ev_disable_gpe(struct acpi_gpe_event_info *gpe_event_info)
....
	case ACPI_GPE_TYPE_WAKE:
		ACPI_CLEAR_BIT(gpe_event_info->flags, ACPI_GPE_WAKE_ENABLED);
+		status = acpi_hw_write_gpe_enable_reg(gpe_event_info);
		break;
....
acpi_ev_disable_gpe is called in acpi_ev_gpe_dispatch to disable the GPE so that it doesn't keep firing before the method has a chance to run. However, originally only ACPI_CLEAR_BIT is called, which doesn't prevent the GPE from firing.
I also noticed a comment in acpi_ev_disable_gpe which says:
/* Mark wake-disabled or HW disable, or both */
I am wondering what the concern is that makes us only do the mark and not the HW disable as well.

Reassigning the issue to Bob, as this is ACPICA code.
Bob, is it possible that we disable the GPE too late in resume? acpi_leave_sleep_state will disable the wake GPE, but it is called very late.
Before it's called, we have already done many things, such as restoring the devices' state.

case ACPI_GPE_TYPE_WAKE:
The GPE masks are set up so that only "runtime" GPEs are enabled at runtime. A "wake" GPE should always be disabled at runtime, and only enabled as the system is about to sleep. If the wake GPEs are not getting disabled when the system awakes, this could be a bug.
I'm thinking that perhaps it would make sense to disable *all* GPEs at the moment we get a wake GPE (all wake and runtime GPEs). Please try the patch below in acpi_ev_gpe_dispatch:
	/* Save current system state */
	if (acpi_gbl_system_awake_and_running) {
		ACPI_SET_BIT (gpe_event_info->flags, ACPI_GPE_SYSTEM_RUNNING);
	} else {
		ACPI_CLEAR_BIT (gpe_event_info->flags, ACPI_GPE_SYSTEM_RUNNING);
+
+		/*
+		 * We just woke up because of a wake GPE. Disable any further GPEs
+		 * (wake and runtime) until we are fully up and running.
+		 */
+		(void) acpi_hw_disable_all_gpes ();
	}

The code in #26 was released in ACPICA version 20060831.

The good news is that ACPICA 20060831 shipped in Linux-2.6.21-rc1;
the bad news is that we had to revert the code in comment #26 to make suspend/resume work again.

Created attachment 10651 [details]
patch vs 2.6.21-rc1 fixing resume
Please re-open if this is still a problem in linux-2.6.22.stable or later.
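The GPE storm diagnosed in this thread can be illustrated with a toy model in C. It is a sketch only, not ACPICA code: it assumes a level-triggered GPE whose status bit stays asserted until the deferred method runs, and every name in it (sim_gpe, sim_dispatch, QUEUE_DEPTH) is hypothetical. The model shows why clearing only the software wake-enabled flag lets the event keep firing until the handler queue overflows, whereas also clearing the hardware enable bit stops it after the first interrupt.

```c
#include <stdbool.h>

#define QUEUE_DEPTH 8   /* hypothetical limit on pending deferred handlers */

/* Toy model of a single GPE; not the ACPICA data structure. */
struct sim_gpe {
    bool hw_enable;     /* bit in the simulated GPE enable register */
    bool hw_status;     /* level-triggered status, set until the method runs */
    bool wake_enabled;  /* software flag, stands in for ACPI_GPE_WAKE_ENABLED */
};

/*
 * Deliver interrupts while the GPE is enabled and asserted. The deferred
 * method never runs inside this loop, so hw_status is never cleared here.
 * Returns the number of interrupts taken; *overflow is set when a handler
 * could not be queued (the "unable to queue handler" situation).
 */
static int sim_dispatch(struct sim_gpe *g, bool write_hw_disable, bool *overflow)
{
    int fired = 0;
    int queued = 0;

    *overflow = false;
    while (g->hw_enable && g->hw_status) {
        fired++;
        g->wake_enabled = false;     /* the software mark alone... */
        if (write_hw_disable)
            g->hw_enable = false;    /* ...versus also masking in hardware */
        if (queued == QUEUE_DEPTH) {
            *overflow = true;        /* no room left for another handler */
            break;
        }
        queued++;
    }
    return fired;
}
```

In this model, calling sim_dispatch with write_hw_disable set to false takes QUEUE_DEPTH + 1 interrupts and reports an overflow, while setting it to true takes exactly one interrupt and leaves the enable bit cleared.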