Bug 3691
Summary: | PM1_STATUS bits not preserved - S3 resume becomes cold boot - ECS G320 laptop | ||
---|---|---|---|
Product: | ACPI | Reporter: | Matthew Garrett (mjg59-kernel) |
Component: | Power-Sleep-Wake | Assignee: | Shaohua (shaohua.li) |
Status: | CLOSED PATCH_ALREADY_AVAILABLE | ||
Severity: | normal | CC: | albeclemit, dagit, jim, jon.nettleton, jrigling, n_shtinkov, pixelmonkey, sogerc1 |
Priority: | P2 | ||
Hardware: | i386 | ||
OS: | Linux | ||
Kernel Version: | 2.6.9 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: |
acpidmp output from EiSystems 3001
acpidmp output
lspci output
Test patch
acpidmp from AV3250HX-01 kernel 2.6.11.3
dmesg from AV3250HX-01 kernel 2.6.11.3
dmidecode from AV3250HX-01 kernel 2.6.11.3
DSDT from AV3250HX-01 kernel 2.6.11.3
/proc/interrupts from AV3250HX-01 kernel 2.6.11.3
lspci -vv from AV3250HX-01 kernel 2.6.11.3
debug patch
Attempt to fix up abnormal power-off flag at boot |
Description
Matthew Garrett
2004-11-03 15:16:04 UTC
>Adding an infinite loop at the start of the wakeup code doesn't appear
>to change behaviour.
So this means the system never gets a correct wakeup address, right? I'd like
to check the FADT table; please attach your acpidmp output.
Does this system have the latest BIOS, and does it work with WinXP?

Maybe a blind guess, but how about applying the patch below:

--- a/drivers/acpi/tables/tbconvrt.c.orig 2004-11-12 17:34:52.386045104 +0800
+++ b/drivers/acpi/tables/tbconvrt.c 2004-11-12 17:35:21.091681184 +0800
@@ -522,8 +522,7 @@ acpi_tb_build_common_facs (
 	acpi_gbl_common_fACS.global_lock = &(acpi_gbl_FACS->global_lock);
 	if ((acpi_gbl_RSDP->revision < 2) ||
-		(acpi_gbl_FACS->length < 32) ||
-		(!(acpi_gbl_FACS->xfirmware_waking_vector))) {
+		(acpi_gbl_FACS->length < 32)) {
 		/* ACPI 1.0 FACS or short table or optional X_ field is zero */
 		acpi_gbl_common_fACS.firmware_waking_vector = ACPI_CAST_PTR (u64, &(acpi_gbl_FACS->firmware_waking_vector));

And please print some info in the routine 'acpi_set_firmware_waking_vector'; I'd like to know which branch is taken on your system.

Created attachment 4020 [details]
acpidmp output from EiSystems 3001
I've attached the ACPI tables (EiSystems is the brand name; the hardware is made by ECS). The patch in comment 2 doesn't appear to alter anything. Checking with acpi_set_firmware_waking_vector, the code goes through the if statement and not the else statement (i.e., acpi_gbl_common_fACS.vector_width is equal to 32).

Could you please tell me the chipset type (MCH, ICH) of the system? I noticed the BIOS does some PCI config writes to the MCH. And please try
acpi_os_name=Microsoft Windows NT
or
acpi_os_name=Microsoft Windows
or
acpi_os_name=Microsoft WindowsME: Millennium Edition
The write to the MCH differs depending on the OS name.

Changing the OS name appears to make no difference. It's a VIA-based machine - lspci gives:
0000:00:00.0 Host bridge: VIA Technologies, Inc. VT8623 [Apollo CLE266]
0000:00:01.0 PCI bridge: VIA Technologies, Inc. VT8633 [Apollo Pro266 AGP]

Sorry, I forgot to mention - Windows XP appears to work correctly on this hardware, but FreeBSD fails in the same way as Linux.

Created attachment 4079 [details]
acpidmp output
Created attachment 4080 [details]
lspci output
I wanted to confirm this bug for an Averatec 3225 laptop, also on VIA hardware. I can resume S3 (even with modules loaded) and the screen blanks and the power indicator starts to blink (which, when doing a Windows sleep, indicates the machine has really been put into sleep mode). However, while under Windows hitting any key or the power button results in a resume, under Linux keys have no effect and the power button simply reboots the machine. I have also applied the patch that ensures the wakeup vector contains a physical address, but it doesn't help. Considering that both the original poster and I have VIA motherboards in our machines, it's very possible that a fix to our problems will fix a whole class of S3-resume-related problems. I'd be perfectly willing to do whatever debug work kernel developers think is necessary to squash this bug. Attached are lspci and acpidmp output.

Hi, Andrew. The problem with this issue is that we actually can't debug it. On Matthew's system, the resume code is never invoked. Possibly a VIA motherboard is required to test it. I'll see if I can get one.

Created attachment 4259 [details]
Test patch
I wonder if the BIOS reports the wrong sleep type. On most systems, the sleep
type values returned in _S3 are the same. Just a guess, please try. Thanks.

I made the _S3 types the same on my friend's laptop by changing the DSDT; no change. He also has an Averatec 3225 like comment #11. (In #11 presumably he meant "I can suspend S3", not "I can resume S3"?)

Could you please try the workaround in bug 3909? My HP nx5000 S3 failure looks like yours. Thanks.

I'm afraid not - the system still immediately reboots on pressing the power button.

Same with 2.6.10?

Yup, same with 2.6.10 (plus the immediate post-2.6.10 ACPI patch). Upgrading to the latest BIOS hasn't altered this behaviour.

Did you try the patch in http://sourceforge.net/mailarchive/forum.php?thread_id=6423768&forum_id=6102

The patch didn't change anything on my Averatec 3225 laptop, kernel 2.6.11-rc2.

No, the patch mentioned in comment 20 has no effect on the system.

I have an Averatec laptop (AV3250HX-01) that also has the symptoms described. This is with 2.6.10 and 2.6.11.3. I've found that the powering-off behavior isn't really related to the power button. My system can also wake on USB. When I plug in a USB device the system reboots.
echo EHCI > /proc/acpi/wakeup
echo mem > /sys/power/state
Now plug in a USB mouse and the system reboots. This functionality works fine in the Windows XP Home Edition that came with the laptop. I have tried setting acpi_os_name="Microsoft Windows NT" but the behavior did not change.

Created attachment 4715 [details]
acpidmp from AV3250HX-01 kernel 2.6.11.3
Created attachment 4716 [details]
dmesg from AV3250HX-01 kernel 2.6.11.3
Created attachment 4717 [details]
dmidecode from AV3250HX-01 kernel 2.6.11.3
Created attachment 4718 [details]
DSDT from AV3250HX-01 kernel 2.6.11.3
Created attachment 4719 [details]
/proc/interrupts from AV3250HX-01 kernel 2.6.11.3
Created attachment 4720 [details]
lspci -vv from AV3250HX-01 kernel 2.6.11.3
I found this bug today: http://bugme.osdl.org/show_bug.cgi?id=3586 Could others experiencing this bug try commenting out the lines described in comment #3 on that bug? It worked for me: the laptop does a suspend/resume. The only problem is that I haven't worked out how to make the video resume (just a blank screen). But I know the laptop is working because I can type "find /" and I get hard drive activity, and the capslock key works.

Do you mean deleting the 'pushl 0; popfl'? How about changing them to 'pushw 0; popfw'?

Changing them to pushw and popw is not enough. I have to actually delete the push/pop.

Something strange is definitely happening in wakeup.S. The version without pushl/popl worked for about 20 reboots while I tested options and tried to get the video to resume. But then it suddenly stopped working and the machine now reboots instead of resuming. I tried booting into WinXP and then booting into my modified kernel. When I do that the machine crashes right after it resumes. I don't have video so I don't know if there is an oops or other useful information. bug 3568 mentions using dead-loops and beeps to pinpoint the failure. Does anyone know the code for doing that? I'd like to apply that same strategy on my machine if I can. Why would the machine reboot on pushl $0? I've never done any serious asm programming so I don't know what is going on. If there is anything I can do that would be helpful in figuring this out, please let me know. I am eager to resolve this bug. Thanks!

I added this at the beginning of wakeup.S (right after the includes)
#define BEEP \
	inb $97, %al; \
	outb %al, $0x80; \
	movb $3, %al; \
	outb %al, $97; \
	outb %al, $0x80; \
	movb $-74, %al; \
	outb %al, $67; \
	outb %al, $0x80; \
	movb $-119, %al; \
	outb %al, $66; \
	outb %al, $0x80; \
	movb $15, %al; \
	outb %al, $66;
then moved a BEEP statement around to narrow down to the line causing the problem.
After deleting the pushl $0 line, my laptop goes through the entire wakeup.S, then crashes somewhere outside it (after the "Back to C" debug output)... It looks like there are two kinds of bugs out there:
1. the wakeup code is never executed
2. the kernel gains control but a reboot is triggered by the pushl $0
Is that right?

On my system the code to do the beeping does not work. (My system could very well lack a PC speaker.) But I wiped out my kernel and went back to the stock 2.6.12-rc1 source with my current config, and I was just able to suspend/resume after commenting out the pushl/popl. I'll be posting more information in bug 3586 as it seems more relevant there.

Please check if the change below fixes the issue:
- mov $(wakeup_stack - wakeup_code), %sp
+ movl $(wakeup_stack - wakeup_code), %esp
I suspect the high bits of esp aren't cleared.

Matthew, does S3 work without any driver loaded in the system? For convenience, you can get an initramfs from bug 5037.

Created attachment 6064 [details]
debug patch
I wonder if it helps if we set the BIOS reset entry to wakeup code address.
Could anybody try this patch? Thanks!

No difference on the ECS machine, I'm afraid - on hitting the power button, the system appears to return to the BIOS.

*** Bug 2347 has been marked as a duplicate of this bug. ***

This is just a "bump" post to ask: has any work on this been done in recent kernel versions? Does anyone need me to test S3 functionality in the bleeding-edge kernels?

No, IIRC. This bug is basically impossible to debug without the system at hand.

Hey, I bought a bunch of these Averatecs for a business. One of my employees recently broke the screen on one by accident (not a big break, but there's a 1 in. blind spot). I'm thinking of replacing it. If I did, and sent you the one with the broken screen, you could probably debug it using either the existing screen or a VGA cable to a monitor. Would you actually be _willing_ to debug this if I sent you the hardware? :-)

Hi Andrew, I just got confirmation that I can accept it (yes, Intel takes such things seriously). If you're still willing to send it to me, please drop me an email. I'm eager to solve this bug.

I am very anxious to get this bug fixed as well. I am an owner of an Averatec 3220-H1 that does exhibit this problem, and will gladly donate my time to help do any debugging or testing required. Feel free to contact me directly if that is easier.

Hey David, it's going to be a couple of weeks until I can secure the laptop and get ready to ship it to you. When this comes closer to reality, I'll contact you for shipping information. Sorry for the hold-up!

Here is some additional debug information I just stumbled upon, and I would like to see if anyone can confirm it. I am running 2.6.16-1.2111_FC5 from Fedora and the following doesn't cause a reboot.
1) Unload the cpufreq daemon
2) Load the cpufreq_ondemand module
3) echo ondemand > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
Going into suspend mode works as expected. Pressing the power button to wake up the machine no longer causes an immediate reboot.
The blue LED stops blinking and that is about it. Any ideas on what we could do next?

Got it. Bit 11 of PM1_STATUS in VIA southbridges is flagged as "Abnormal power-off". If this is set, the BIOS won't jump back to the OS. Writing the bit back clears it and the machine then suspends/resumes correctly.

Wonderful! So the BIOS sets the bit and, as it's an ignored bit, per the ACPI spec the OS should preserve the bit and write 1, right? I'll report this to Bob.

Created attachment 8335 [details]
Attempt to fix up abnormal power-off flag at boot
I believe this ought to do the job

I'm wondering if ACPICA should do this, as this is an ignored bit, and the ACPI spec says that OS writes should preserve the bit. There might be other similar bits.

Are you saying that the AcpiHwRegisterWrite interface should, in the case of a write to this register, do a read/update/write cycle and preserve bit 11? (Note, this would mean that any users of this interface would not be able to change this bit, but this would be correct according to the ACPI spec.)

For suspend to work, the bit needs to be cleared by writing the same bit back again - is that what you mean by preserving it during the read/write operation? I guess that would make sense - we'd then end up clearing it in acpi_enter_sleep_state when acpi_hw_clear_acpi_status gets called, which should work happily.

Yes. We would preserve it by always writing the same value back to it. This change would preserve the following bits:
PM1_CONTROL[0] (SCI_EN)
PM1_CONTROL[9]
PM1_STATUS[11]

Ok, that sounds good to me.

I used the fixed patch from 2.6.17-mm1 today on my KM400 chipset Averatec. I am still having the reboot problem on wakeup. One thing that is definitely different is that pressing a key on the keyboard now brings the machine out of sleep. Any other data I can gather to help debug this?

The quirk_via_abnormal_poweroff patch in comment #50 was applied to the kernel at some point. Then it was deemed unnecessary in this thread: http://www.ussg.iu.edu/hypermail/linux/kernel/0610.2/0428.html and removed for 2.6.18.2 and 2.6.19.

closed. |