Most recent kernel where this bug did *NOT* occur: n/a
Distribution: initramfs (bins from FC6)
Hardware Environment: Toshiba Satellite A100 laptop, Celeron M 1.4GHz, 512M RAM, SIL SATA HD
Software Environment: virtually no device drivers: no net, no blk, no usb, nothing except VGA console, keyboard and RTC

Hello,

Thanks for the work you guys do with ACPI. I have a Toshiba A100-105 laptop which has trouble resuming after suspend-to-RAM. The BIOS says the laptop is an "ATI RC410MB+SB450 (Goldfish)".

After Matthew Garrett's great talk at LCA2007, I grabbed 2.6.20-rc5 (now 2.6.20) from kernel.org, turned on CONFIG_PM_TRACE, and turned off as much other stuff as I could. There are no block devices, networking, USB, SATA, or anything else. I also made an initramfs with bash and a few utilities from FC6 in it.

After compiling with CONFIG_PM_TRACE and booting into the initramfs, I'm trying to suspend with echo -n mem > /sys/power/state. The message "stopping tasks" flashes briefly on the screen, and the power LED (normally solid green) turns orange and starts doing a rolling fade in/fade out. Power to everything seems off.

When I press the power button, the LED goes back to solid green, the backlight and CD go on, the hard disk LED goes on momentarily, but there is no BIOS screen and pressing Caps Lock doesn't toggle the LED. Holding down the power button causes the machine to power off again (power LED is off) but otherwise there's no response from the machine. This continues as long as there's a battery or AC: turning the laptop on just turns on the LED and CD. Removing battery and AC allows the laptop to boot again.

When I reboot, the date and time haven't changed, which seems to indicate that a trace value was never written to CMOS. So I replaced the if() test in TRACE_RESUME() with if(1). When I tried again, I found that the date/time was still not changed, which tells me that the kernel isn't getting as far as resuming any devices.
I modified my kernel to also do a TRACE_RESUME just as it's about to suspend. (For the call to TRACE_DEVICE(), I wrote a function to look up the "zero" device.)

@@ -192,13 +194,22 @@
 		goto Unlock;
 	}
+	zero = find_zero_device();
+	if (zero == 0)
+		goto Unlock;
+	TRACE_DEVICE(zero);
+	TRACE_RESUME(0);
 	pr_debug("PM: Preparing system for %s sleep\n", pm_states[state]);
 	if ((error = suspend_prepare(state)))
 		goto Unlock;
+	TRACE_DEVICE(zero);
+	TRACE_RESUME(1);
 	pr_debug("PM: Entering %s sleep\n", pm_states[state]);
 	error = suspend_enter(state);
+	TRACE_DEVICE(zero);
+	TRACE_RESUME(2);
 	pr_debug("PM: Finishing wakeup.\n");
 	suspend_finish(state);
  Unlock:

This time the date and time are modified, and the TRACE report on reboot points to the TRACE_RESUME(1) call. So I think my laptop is suspending, but never leaving suspend_enter().

I am running FC6, but Matthew Garrett suggested I try Ubuntu. I did, and Ubuntu 6.10 has the same problem. I also tried the patch for VIA southbridges mentioned here, but it doesn't help: http://lca2007.linux.org.au/profile/53

I've also updated the BIOS to Toshiba's latest version, but it hasn't helped.

Here are some dumps from my laptop. Most were taken under FC6, not my initramfs.
http://www.afork.com/a100/acpidump.txt
http://www.afork.com/a100/dmidecode.txt
http://www.afork.com/a100/intr.txt
http://www.afork.com/a100/lspci.txt
http://www.afork.com/a100/mjd-trace.diff

I haven't included a dmesg or serial log, as I don't have a serial port and I have no way of saving the output. Would that be useful?

Can anyone suggest what's happening here? Is more information needed? What can I do to further diagnose the problem? Any help you can give me would be most appreciated.

Thanks, Mitch.
Here is the .config for my kernel: http://www.afork.com/a100/.config Here is the same file, with the inactive options removed (shorter): http://www.afork.com/a100/.config-filtered
Can you try the latest -mm kernel (but please do not compile it with CONFIG_PREEMPT set)?
I have now tried 2.6.20-mm2, and the result is exactly the same: The laptop suspends but never gets to first base on resume. Please, any ideas?
I am afraid I might be hitting the same problem here with Satellite A100-847. More info on Red Hat bug #229464.
Created attachment 11613 Linux Firmware Kit test results
Created attachment 11614 DSDT disassembly
Mitch, Julian, can you both please test 2.6.22-rc3 or the latest -git ?
Not until it gets into the Fedora development repo - I don't think I'll be able to compile a kernel from scratch. I have tested using 2.6.20.3 to be precise. Also, I suspect that the 5.00 and later BIOS updates have made things worse: I tried rolling back to a kernel that used to work, and it no longer does.
I mean 2.6.21.3
OK, found 2.6.22-rc3 in fedora cvs. I'll give it a go.
2.6.22-rc3 (or, to be precise, the oddly versioned 2.6.21-1.3193.fc8) does not work. Tried with init=/bin/bash.
Created attachment 11619 mitch-DSDT.dsl Dump of DSDT from running Linuxtoolkit r2 on my Toshiba
Created attachment 11620 mitch-results.txt Results of Linuxtoolkit r2 run on my Toshiba.
Created attachment 11621 julian-mitch-results-txt.diff Julian and I both ran Linuxtoolkit r2. However, the ordering of sections in our results.txt files was different. I reordered the sections in his file to match mine, then diffed the two. Here is the result.
> Mitch, Julian, can you both please test 2.6.22-rc3 or the latest -git ?
Hi Rafael, thanks for grabbing this bug. I just tried 2.6.22-rc3 again, with the same minimal config I had when I reported the bug. The problem and symptoms are the same :-(
Like Julian, I ran Linuxtoolkit r2, and I've attached my results. It certainly seems like his machine and mine are related, although his CPU is faster and the DSDT seems to have been compiled with the Intel compiler not the Microsoft one. Doing Linuxtoolkit's suspend/resume fails in the same way - power light goes green but no response.
Any ideas now? I have been talking with Matthew Garrett. I'll point this bug out to him. Meanwhile, is there anything else I can do for further diagnosis?
Thanks, Mitch.
Julian, I'm compiling my kernel with a minimal config (no devices except console and RTC), and providing a ramdisk with some system commands from FC6 in it. If you'd like a copy (3MB), you can grab it here: http://www.afork.com/a100/bug-7988-bzImage
To boot it, I added two lines to my /boot/grub/grub.conf:
title My kernel
kernel (hd0,5)/tosh/src/linux/compile/arch/i386/boot/bzImage
Then it shows up in my boot menu. (You'd have to use a different path.)
Referring to Comment #15: I'll have to look at your DSDT and Linuxtoolkit r2 results more thoroughly, but that'll take some time.
Might this help? http://article.gmane.org/gmane.linux.acpi.devel/23326 [PATCH]: ACPI: preserve the ebx value in acpi_copy_wakeup_routine I'll try it...
> [PATCH]: ACPI: preserve the ebx value in acpi_copy_wakeup_routine
>
> I'll try it...
It didn't help. Same failure to resume, code never leaves suspend_enter().
This one may have a similar cause (at least it is also a Toshiba): http://bugzilla.kernel.org/show_bug.cgi?id=7499. It would be nice if someone could check whether building a kernel from before the commit mentioned there (http://bugzilla.kernel.org/show_bug.cgi?id=7499#c13) fixes it for you too.
Hi Andrey,
> This one may have similar cause (at least it is also Toshiba):
Toshiba sources its laptops from a number of manufacturers. My laptop's mainboard is made by (and has chips from) ATI. Another reporter with similar symptoms has a Toshiba that was made by Quanta and has Intel chips. And yours seems to have ALi chips.
> http://bugzilla.kernel.org/show_bug.cgi?id=7499.
> If someone could try if compiling the kernel before mentioned commit
I will try it, but it appears that in your case the BIOS at least passes control back to Linux! So it seems that your problem and mine are quite different.
Thanks, Mitch.
Just to clarify: A100-847 has an Intel Core 2 Duo T7200 CPU, GeForce Go 7600 GPU and Intel 945PM chipset.
> Just to clarify: A100-847 has an Intel Core 2 Duo T7200 CPU,
> GeForce Go 7600 GPU and Intel 945PM chipset.
Yes. Yours has a number of significant hardware differences to mine, but I think the BIOS is closely related, as are the symptoms. I think Andrey's and Olaf Dietsche's laptops (see below) are significantly different.
http://article.gmane.org/gmane.linux.acpi.devel/23340
Hi Rafael,
> I'll have to look at your DSDT and Linuxtoolkit r2 results more thoroughly,
> but that'll take some time.
Have you had a chance to have a look? Alternatively, is there something else Julian or I could do that would help debug the problem? Thank you!
No, unfortunately I haven't. I think the problems are related to the graphics adapters. Please have a look at this thread on LKML: http://lkml.org/lkml/2007/6/8/226
That thread prompted me to check the nvidia driver. Indeed, I upgraded it from 9361 to 9746 around the time resume stopped working. It did not help, though. Moreover, I checked that nvidia.ko is not loaded in minimal mode, so it does not seem to be related.
So, the suspend doesn't work with and without the NVidia driver loaded?
Yes. And in the past (my wild guess: with an earlier BIOS) it worked with nvidia.ko loaded.
Julian, can you post the results of lspci please? (I have an ATI graphics chip, not NV)
Created attachment 11723 lspci of A100-847
Hi Julian, The symptoms we're seeing sound similar, and I thought our LinuxToolkit results looked similar (esp BIOS), but it seems there's a huge difference in hardware between the two machines. The majority of chips in your laptop are from Intel, whereas in mine, lspci doesn't have any. The majority of mine appear to be ATI. So, it may be that we have two separate problems. Rafael, what do you suggest we do next?
To check if the kernel real-mode resume code is executed, you can use Pavel's beeping patch available at: http://lkml.org/lkml/2007/6/9/80 (uncomment the BEEP before 'movw $0xb800, %ax' to enable the beeping). Then, if this code turns out to be executed, you can use PM_TRACE to check where exactly it fails. That will be tedious, I'm afraid, but I don't see what else can be done at this point.
Unlucky, I have x64 here :(. But I doubt it will beep, since PM_TRACE does not change the clock at all. Still may be worth trying.
I tried the beep patch (thank you) but I didn't get a beep after trying to arouse my laptop from its slumber. I will also try to put the beep patch into the early kernel startup (also in real mode) to verify that the beep code works ok on my machine.
(In reply to comment #33)
> Unlucky, I have x64 here :(. But I doubt it will beep, since PM_TRACE does not
> change the clock at all. Still may be worth trying.
x86_64 version is available now (thanks to Nigel), so you can try it: http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.22-rc5/patches/29-Optional-Beeping-During-Resume-From-RAM.patch
Compiling the patch into Fedora's 2.6.22 right now. My limited coding skills show that it should beep provided I enable pm_trace, right? P.S. Sorry it took so long. Dreaded real life syndrome.
OK, here is what I did: booted into the patched kernel in minimal mode; echo 1 > /sys/power/s2ram_beep; pm-suspend; resume. I got no beep, but I have yet to confirm whether this box has a PC speaker it could beep with at all.
Unlucky, this laptop does not beep.
And I checked 2.6.22.1, still no luck.
Looks like the kernel dies _really_ early. I am slowly starting to lose hope it will ever work. One idea came to my mind, though. Our lovely vendor has started to enable Intel VT in its BIOSes; it was disabled in the past. Unfortunately, there is no option to switch it off. Can this be related?
What is maybe even more strange, I find the clock going totally bonkers when I try to resume *without* PM_TRACE, while enabling it does not change it at all. Weird. For the record, I tried:
pm-suspend --quirk-s3-bios
pm-suspend --quirk-s3-mode
pm-suspend --quirk-s3-bios --quirk-s3-mode
earlier today under the init=/bin/bash boot, which ended up shifting the clock 4 hours forward. And I can't use NTP here.
I looked into this some more and found that the failed resume sequence™ (pm-suspend, resume that fails, holding down the power button to shut down and reboot) shifts the clock 1 hour forward if invoked in minimal boot (init=/bin/bash), but leaves it intact when used with a fully booted system.
Can you try if this patch changes anything: http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.23-rc3/patches/26-s2ram-kill-old-debugging-junk.patch
Hi Rafael, Thanks for looking at this bug. I tried 2.6.23-rc3 with this patch and the same thing happened as before: On suspend, the steady green power light goes to orange fading in and out. After pressing the power button to reawaken it, the power light goes to green, and the batt/disk/power leds all go on, but no beep, no caps lock led, no anything. Obviously S-T-R works under Windows. But why? Is there anything I can do under Windows that would help? Is it possible to use VMware or QEMU or something to trap what Windows is doing with the hardware? Maybe it's using ACPI in a way Linux isn't. Is there some sort of spy program I can run to find out?
Well, I'm still thinking that this problem is related to video. Have you ever tried s2ram (http://en.opensuse.org/s2ram)?
The patch has no influence here either, *but* kernel-2.6.23-0.124.rc3.git2.fc8 resumes ok. It does not make the things worse, then. Well, now at least there is caps lock/find response, I need to figure out the correct quirk for the display. Looks like we were having separate problems indeed, Mitch. Sorry for the noise.
OK, with nvidia blob installed, system wakes up perfectly with no quirks at all.
Hm, I'm not sure what to do with this bug. The problem is evidently graphics-related and I don't think we can fix it in the kernel.
Hi, I have an ATI graphics chip, not nVidia. I'm trying the suspend from a text console of a cut-down kernel. Not X and not framebuffer. We still may be seeing two problems create similar symptoms, and my problem is different to the other reporter's problem. Apart from s2ram, is there a test you'd like me to do in order to show it's graphics related? Nevertheless, I will try s2ram tonight, and I will pay special attention to the ATI-specific s2ram notes. Thanks, Mitch.
Hi Rafael,
I just tried s2ram from suspend-0.7. With -n, it gives this info:
Machine matched entry 258:
sys_vendor = 'TOSHIBA'
sys_product = 'Satellite A100'
sys_version = ''
bios_version = ''
Fixes: 0x3 S3_BIOS S3_MODE
This machine can be identified by:
sys_vendor = 'TOSHIBA'
sys_product = 'Satellite A100'
sys_version = 'PSAA2A-03501N'
bios_version = '1.90'
Without any args, s2ram puts the laptop to sleep in the same way as previously indicated, and the laptop fails to wake as previously indicated. No caps lock, no beep. That's in both FC6 (kernel-2.6.22.2) and from a bare kernel+initrd (kernel-2.6.23.rc3)
I tried various combos of the recommended flags, and I got the message:
Switching from vt1 to vt1
/proc/sys/kernel/acpi_video_flags does not exist; you need a kernel >= 2.6.16.
switching back to vt1
(and suspend does not happen)
I am compiling the kernel with CONFIG_ACPI turned on. Do you know what else I need to turn on for s2ram with arguments to work? Alternatively, any other test I can do? (I don't think it's the same problem as the laptop with the nVidia chip. Wish it was)
(In reply to comment #50)
> Hi Rafael,
>
> I just tried s2ram from suspend-0.7. With -n, it gives this info:
>
> Machine matched entry 258:
> sys_vendor = 'TOSHIBA'
> sys_product = 'Satellite A100'
> sys_version = ''
> bios_version = ''
> Fixes: 0x3 S3_BIOS S3_MODE
> This machine can be identified by:
> sys_vendor = 'TOSHIBA'
> sys_product = 'Satellite A100'
> sys_version = 'PSAA2A-03501N'
> bios_version = '1.90'
>
> Without any args, s2ram puts the laptop to sleep in the same way as previously
> indicated, and the laptop fails to wake as previously indicated. No caps lock,
> no beep. That's in both FC6 (kernel-2.6.22.2) and from a bare kernel+initrd
> (kernel-2.6.23.rc3)
>
> I tried various combos of the recommended flags, and I got the message:
> Switching from vt1 to vt1
> /proc/sys/kernel/acpi_video_flags does not exist; you need a kernel >= 2.6.16.
> switching back to vt1
>
> (and suspend does not happen)
That's why the quirks don't work for you.
> I am compiling the kernel with CONFIG_ACPI turned on. Do you know what else I
> need to turn on for s2ram with arguments to work?
CONFIG_SUSPEND in the power management menu.
CONFIG_SUSPEND is already on. http://www.afork.com/a100/dot-config-s2ram.txt Is there a more specific define? Also, is it part of the video module, and how would I turn that module on? Many thanks, Mitch.
(In reply to comment #52)
> CONFIG_SUSPEND is already on.
>
> http://www.afork.com/a100/dot-config-s2ram.txt
First, you can try to unset CONFIG_DISABLE_CONSOLE_SUSPEND.
> Is there a more specific define?
No.
> Also, is it part of the video module, and how would I turn that module on?
No, it's not. Can you boot the 2.6.23-rc3 and see if the file /proc/sys/kernel/acpi_video_flags is present?
Greetings. I found why I didn't have that file: I needed CONFIG_PROC_SYSCTL. I recompiled and tried s2ram (it no longer complains), and the behaviour was identical to before: it suspends, but does not resume properly. No caps lock, no beep, no nothing. The computer will resuspend, but needs a disconnection of power and battery to get out of the stuck state. I'm glad s2ram helped Julian. I think my A100 must be different enough inside that the fix listed in the whitelist doesn't apply. (This is similar to Linksys WRT-54s or some DLink wireless cards, which have the same model number but can have different hardware.) Is there any other diagnostic I can do? Out of Windows and Linux, can I run one OS under another in a virtual machine, and capture how they access portspace when suspending? Might there be a secret port write (Matthew Garrett found such a port for ATI southbridges)? Anything else wacky we can try? Any help you could give would be much appreciated.
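For anyone retracing this step, a minimal sketch of checking a kernel .config for the options named in this thread before rebuilding. The function name missing_config is hypothetical, and the sketch only treats an option as present when it is built in (=y).

```sh
#!/bin/sh
# Hypothetical helper: list the CONFIG_ symbols that are NOT set to =y
# in a kernel .config file. Prints one missing symbol per line and
# returns non-zero if anything is missing.
missing_config() {
    cfg=$1
    shift
    rc=0
    for sym in "$@"; do
        # Match only an exact "SYMBOL=y" line; "=m" and "is not set" count as missing.
        if ! grep -q "^${sym}=y\$" "$cfg"; then
            echo "$sym"
            rc=1
        fi
    done
    return $rc
}
```

Example use: missing_config .config CONFIG_ACPI CONFIG_SUSPEND CONFIG_PROC_SYSCTL would print CONFIG_PROC_SYSCTL for a config like the one described above, which is the option that makes /proc/sys/kernel/acpi_video_flags appear.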
Well, there are some important suspend-related fixes in the current Linus' tree. If you can, please test it.
I tried 2.6.23-rc7-git2 and the result was practically the same. The laptop still won't resume (orange power LED turns to green but otherwise stuck). There are two differences as far as I can see, both relevant to that stuck state. One is that the backlight doesn't come back on. The second is that it's no longer possible to hold down the power button and turn it off. Not sure if that's relevant (doesn't seem useful) but it is a change. Do you have any ideas of what I can try next? No idea too silly.
Well, I'm not sure if it applies on top of 2.6.23-rc7-git2, but you can try this patch: http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.23-rc7/patches/23-s2ram-kill-old-debugging-junk.patch (if it doesn't apply, please let me know).
Also, what kind of hard disk drive is there in your box? IDE or SATA?
> 23-s2ram-kill-old-debugging-junk.patch
I'll try it soon.
> what kind of hard disk drive is there in your box?
It's SATA. Controller is Silicon Image 3112, and the driver is sata_sil. It has given people a fair bit of trouble, although Tejun Heo's work has helped a lot and it is now more or less reliable. Is it worth making a bootable USB so that the SATA drive is never touched?
(In reply to comment #59)
> > 23-s2ram-kill-old-debugging-junk.patch
>
> I'll try it soon.
OK
> > what kind of hard disk drive is there in your box?
>
> It's SATA. Controller is Silicon Image 3112, and the driver is sata_sil. It
> has given people a fair bit of trouble, although Tejun Heo's work has helped a
> lot and is now more or less reliable.
Please try if booting the kernel with "libata.noacpi=0" in the command line helps.
> Is it worth making a bootable USB so that the SATA drive is never touched?
Well, that may cause other sorts of problems to appear, but of course you can try. ;-)
> Please try if booting the kernel with "libata.noacpi=0"
> in the command line helps.
It beeps! With libata.noacpi=0, it beeps! Still not there yet, but I'll do some more digging. Thank you!
Now, I think, you can use PM_TRACE to identify the place where it fails.
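The PM_TRACE round trip can be sketched roughly as below. This assumes a kernel built with CONFIG_PM_TRACE that exposes /sys/power/pm_trace; the SYSFS_POWER override and both function names are hypothetical, and the log strings matched in pm_trace_report are recalled from the kernel's trace code, so adjust the pattern if your kernel words them differently.

```sh
#!/bin/sh
# SYSFS_POWER is a hypothetical override so the functions can be
# dry-run against a scratch directory instead of the real sysfs.
SYSFS_POWER="${SYSFS_POWER:-/sys/power}"

# Arm tracing, then request suspend-to-RAM. With tracing armed the
# kernel stores progress markers in the RTC, so the clock will be
# wrong after the next boot.
pm_trace_suspend() {
    if [ ! -w "$SYSFS_POWER/pm_trace" ]; then
        echo "no $SYSFS_POWER/pm_trace (CONFIG_PM_TRACE missing?)" >&2
        return 1
    fi
    echo 1 > "$SYSFS_POWER/pm_trace" || return 1
    echo mem > "$SYSFS_POWER/state"
}

# After the forced reboot, filter the trace report out of a kernel log.
# Reads stdin, so it works with dmesg output or a saved log file.
pm_trace_report() {
    grep -E 'Magic number|hash matches'
}
```

Run pm_trace_suspend, and if the machine hangs on resume, reboot and run dmesg | pm_trace_report. An unchanged clock and an empty report would suggest, as earlier in this thread, that no trace point was reached.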
Oh that's a good idea, thanks. I'll do that.
Can you test the current mainline kernel, please?
Hello Rafael, In comment #61, I mentioned that I got a beep with libata.noacpi=0. I only got it once, and I've never been able to reproduce it since. :-( I just tried 2.6.24-rc5-git3, and the behaviour is still the same - the power light turns back to green, but otherwise no response. I added acpi_sleep=s3_beep, and I tried with and without libata.noacpi=0, but I'm not getting a post-resume response or beep. Thank you for your continued help. Is there anything else I can try?
FYI, there are some fixes related to libata suspend in the current mainline. Also, I have some patches that could help us debug the issue a bit further, but they only apply on top of 2.6.24-rc6 (or a later kernel).
Hi Rafael, Merry Christmas.
> there are some fixes related to libata suspend in the current mainline.
No support for block devices, network or USB is compiled in. I have tried to make the kernel as simple as possible.
> I have some patches that could help us debug the issue a bit further,
> but they only apply on top of 2.6.24-rc6 (or a later kernel).
Thanks, I would like to try those patches.
Mitch.
Please apply the patch series from: http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.24-rc6/patches/ on top of vanilla 2.6.24-rc6 (or the current -git), compile the kernel with CONFIG_PM_DEBUG set, try to do:
# echo core > /sys/power/pm_test
# echo mem > /sys/power/state
and see if that works (it won't actually suspend your system, but it will busy wait for 5 seconds after executing the suspend sequence and it will execute the resume sequence after that).
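The two commands above can be wrapped in a small helper, sketched here under some assumptions: the SYSFS_POWER override and the function name pm_test_cycle are hypothetical, and resetting the level to "none" afterwards assumes the pm_test interface accepts that value, as later mainline kernels do.

```sh
#!/bin/sh
# SYSFS_POWER is a hypothetical override for dry runs against a
# scratch directory instead of the real sysfs.
SYSFS_POWER="${SYSFS_POWER:-/sys/power}"

# Run one test suspend cycle at the given pm_test level (e.g. "core"),
# then reset the level so that later suspends are real again.
pm_test_cycle() {
    level=$1
    if [ ! -w "$SYSFS_POWER/pm_test" ]; then
        echo "no $SYSFS_POWER/pm_test (CONFIG_PM_DEBUG missing?)" >&2
        return 1
    fi
    echo "$level" > "$SYSFS_POWER/pm_test" || return 1
    echo mem > "$SYSFS_POWER/state"
    rc=$?
    # Assumption: "none" disables test mode.
    echo none > "$SYSFS_POWER/pm_test"
    return $rc
}
```

Calling pm_test_cycle core exercises the kernel's suspend and resume paths without handing control to the BIOS, so a clean pass there points away from the kernel-side sequence and toward the firmware hand-off.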
Hello, reporting to you live from LCA :-) I have tried 2.6.24, with the patches in http://www.sisk.pl/kernel/hibernation_and_suspend/2.6.24/snapshot-080126.tgz, and following your instructions. On the second echo, I get suspend-type messages, then a delay of several seconds, then everything comes back again. I really think the BIOS isn't handing control back to Linux, on resume.
Well, I assume you've tried the plain suspend with these patches too ...
Yes. I have tried with and without libata.noacpi=0.
echo 1 > /sys/power/pm_trace
echo mem > /sys/power/state
The results are the same as before: it suspends, but when power is pressed to resume, the light turns green with no other response. The date and time are not altered.
Hmm. You might be right that the control doesn't reach the kernel on resume. I don't know how to verify that, though.
Hi Rafael,
Somehow the BIOS is able to jump back into Windows ok. So I think it's either that some piece of hardware isn't being set properly by Linux, or it could be that some data the BIOS cares about is not set properly. (For an example of the first, see here)
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=709cf5ea7a8bea1b956d361ee7cef1945423200c
Here are some ideas.
- Is there something to verify the basic suspend/resume mechanism in the BIOS, without all the other Linux stuff around it?
- Is there some way I can dump ACPI-related state in Windows, and dump state under Linux pre-suspend, to see if there's any difference?
- Is there a way to run the Linux->BIOS->Linux cycle under an emulator, such as QEMU?
Any help gratefully received.
Hi, I've been working with Matthew Garrett. We've found that if we resume immediately after suspending (like, 1/3 second), then the resume comes back. No video, but the caps lock works. It's possibly related to DRAM self-refresh: If self-refresh is not being activated, then a longer delay will lead to corruption of the contents of memory, and suspend won't work. Any ideas?
Hello, I'm proposing we close this bug. The problem is certainly real, and we seem to have narrowed it down to the self-refresh not being set on the RAM prior to suspend (thanks Matthew G), but I seem to be the only one having the problem (or at least, the only reporter), and I don't know how to get the information from ATI/AMD on how to enable it manually via blacklist. Rafael, thank you for your continuing help.
Okay, I've closed it with "Insufficient data" as the resolution. Please reopen if necessary.