Most recent kernel where this bug did not occur: 2.6.12.1
Distribution: Debian Sarge
Hardware Environment: Toshiba Portege 4010
Software Environment:
Problem Description: "apm -s" still works, but on resume the screen is blank (backlight on) and the hard disk is spinning.
Steps to reproduce: run "apm -s"; then, once the machine has suspended, press the power switch.
Do you mean the released 2.6.15 kernel, or a -git tree? There is one obvious bug introduced into APM code between 2.6.12 and 2.6.15; SMP systems should never call the APM BIOS on CPUs other than zero. This code was removed; perhaps it is worked around elsewhere, but I did not see it. Nevertheless, it is not your bug. I am suspicious of the changes to apm_console_blank in the latest -git tree. The code breaks out of the loop early if an error code of APM_NOT_ENGAGED is returned, whereas it used to try extra hard to blank or unblank despite any returned error code. So it is very important to know if this code is in your kernel, since it sounds like it might explain your symptoms (and it is quite possible the Toshiba BIOS does not implement the APM spec perfectly - most APM implementations are notoriously buggy). It also looks like a patch of mine was misapplied to the latest -git tree; my diffs assumed that only CPU-0 would be calling the APM BIOS, and some change in between appears to have violated that assumption. Again, probably not your bug.
The kernel in which the bug is manifest is the released 2.6.15 version. The laptop has (of course) only one CPU. FWIW, ACPI resume from suspend-to-memory in 2.6.15 also fails in the same way if the resume is made after some delay -- blank screen with backlight on and disk spinning. If resume is done within a few seconds (~10sec) then it often succeeds. I haven't tried ACPI in prior kernels, so don't know if this phenomenon is new or old. Combinations of s3_bios and s3_mode as given in Documentation/power/video.txt don't help.
What does your APM config look like? Specifically, do you have CONFIG_APM_ALLOW_INTS turned on? This option can be dangerous in either setting, but you can try toggling it. It sounds like you are hitting a more fundamental console blanking problem than the APM code. I am testing APM blanking now on 2.6.15, but I have only one APM BIOS implementation to test against.
Zach
CONFIG_APM_ALLOW_INTS was OFF. Changing it to ON made no difference. But in the process I discovered I had described the symptoms incorrectly. On resume from APM suspend-to-RAM in kernel 2.6.15 the console is not blank, it is live showing the same display as was there at suspend. The disk is spinning and there is no response from the OS to keystrokes, but the BIOS seems to be handling them correctly (eg Fn/F10 turn on the numeric keypad light). Sorry for the incorrect information previously. I was confusing APM symptoms with ACPI symptoms.
I seem to be able to reproduce an APM regression in 2.6.15 as well. Console blanking works with my APM BIOS in 2.6.14, but not in 2.6.15. 2.6.14 + my APM GDT patches appears to have no problem. I am testing suspend now, although I can't guarantee much luck with that, as I am not convinced our suspend to RAM does anything remotely similar to what your BIOS does. This seems to highlight the changes to apm_do_idle as a potential problem source.
Created attachment 7053
Revert APM idle code to 2.6.14

Could you try this patch and see if it fixes the problem?
No, reverting to 2.6.14 code for APM idle did not change the visible behaviour.
2.6.14 and 2.6.15 both with and without my patch seem to suspend to RAM fine for me. To rule out my APM segment changes in 2.6.15, please try the following patch.
Created attachment 7066
Revert my 2.6.15 GDT changes

Can you see if reverting my APM GDT changes fixes the bug?
Zach,
Sorry for the delay in replying -- I'm currently travelling. Reverting APM GDT changes to 2.6.14 did not change the behaviour. The Portege 4010 still hangs on resume from suspend-to-RAM.
Cheers -- Ross
Further information:-
My default setup is to load the kernel with the "quiet" command-line parameter. If I boot the 2.6.15 kernel without that parameter, then the hang on resume from apm suspend gives a continuous spool of error messages (see below).
If I boot the 2.6.15 kernel without loading modules, then sometimes apm resume works OK; sometimes it gives a burst of error messages and then resumes. The error message appears to come from ide, which is built in to my standard kernel, not loaded as a module.
The modules I routinely load are:
---8<------8<------8<------8<------8<------8<------8<------8<------8<---
#
ohci_hcd
#
snd_trident snd_ali5451 snd_pcm_oss
#
yenta_socket
#
hermes orinoco orinoco_cs
#
serial_core 8250 slhc ppp_generic zlib_inflate zlib_deflate ppp_deflate ppp_async
#
ntfs vfat loop
#
sd_mod usb_storage
#
---8<------8<------8<------8<------8<------8<------8<------8<------8<---
The error message on eventually-successful resume is:
---8<------8<------8<------8<------8<------8<------8<------8<------8<---
nomad:~# apm -s
PCI: Found IRQ 11 for device 0000:00:10.0
PCI: Sharing IRQ 11 with 0000:01:00.0
PCI: Found IRQ 11 for device 0000:00:11.0
PCI: Sharing IRQ 11 with 0000:00:12.0
PCI: Found IRQ 11 for device 0000:00:11.1
nomad:~# hda: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
hda: task_in_intr: error=0x04 { DriveStatusError }
ide: failed opcode was: unknown
hda: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
hda: task_in_intr: error=0x04 { DriveStatusError }
ide: failed opcode was: unknown
hda: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
hda: task_in_intr: error=0x04 { DriveStatusError }
ide: failed opcode was: unknown
hda: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
hda: task_in_intr: error=0x04 { DriveStatusError }
ide: failed opcode was: unknown
ide0: reset: success

nomad:~#
---8<------8<------8<------8<------8<------8<------8<------8<------8<---
The "PCI:" messages are given on all resumes. If the resume fails, the sequence
hda: task_in_intr: status=0x59 { DriveReady SeekComplete DataRequest Error }
hda: task_in_intr: error=0x04 { DriveStatusError }
ide: failed opcode was: unknown
repeats indefinitely until hard reset or power cycle.
Thanks for the update. It appears my changes are not at fault. Have you tried narrowing the interval between kernel versions? You can binary search the last working kernel version, then binary search off the commit list until you find the point of failure. It looks like maybe suspending with an IDE interrupt pending is causing some trouble?
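For anyone following along, the binary search suggested above can be sketched roughly as follows. This is only an illustration: `first_bad` and `is_bad` are hypothetical names, the predicate stands in for "build that kernel, boot it, run apm -s and see whether resume hangs", and the breakage point used in the example is invented purely to exercise the function. It also assumes the failure is reproducible and that once a version is bad, every later one is bad too.

```python
def first_bad(versions, is_bad):
    """Return the index of the first bad entry in an ordered list.

    Assumes versions[0] is known good, versions[-1] is known bad,
    and that every version after the first bad one is also bad.
    """
    lo, hi = 0, len(versions) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid  # failure is at mid or earlier
        else:
            lo = mid  # failure is after mid
    return hi


# Released versions between the last known good and the first known bad.
versions = ["2.6.12", "2.6.13", "2.6.14", "2.6.15"]

# Hypothetical stand-in for a manual suspend/resume test; the choice of
# "2.6.14" as the breaking point is made up for this example.
pretend_first_bad = "2.6.14"
tested = []

def is_bad(version):
    tested.append(version)
    return versions.index(version) >= versions.index(pretend_first_bad)

idx = first_bad(versions, is_bad)
print("first bad:", versions[idx], "after testing", tested)
```

The same loop applies to the commit list between two adjacent releases; only the list being searched changes. Each step halves the remaining range, so the number of kernels to build and boot grows with the logarithm of the range size.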
Bart, we think this is an IDE problem.
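As a side note on reading the log above, the braces after status=0x59 are just the set bits of the ATA status register spelled out. A minimal sketch of that decoding follows; the bit positions are the standard ATA status bits, the names are written the way the log prints them, and `decode_status` is a hypothetical helper, not a kernel function.

```python
# ATA status register bits, named as they appear in the IDE driver's messages.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_status(value):
    """Return the names of the bits set in an ATA status byte, high bit first."""
    return [name for mask, name in STATUS_BITS if value & mask]


print(decode_status(0x59))
```

For 0x59 this gives DriveReady, SeekComplete, DataRequest and Error, matching the log. The accompanying error=0x04 is the "command aborted" bit of the error register, which the driver reports as DriveStatusError; in other words the drive is rejecting the command it was given after resume.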
Ross, could you send the output of the 'dmesg' command?
Created attachment 7120
Output from dmesg

Thanks for your interest in this bug. Output from dmesg attached. The last 20 or so lines (following EXT3) result from suspend-resume.
Cheers -- Ross
> Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
> ide: Assuming 66MHz system bus speed for PIO modes
> Probing IDE interface ide0...
> hda: IC25N030ATCS04-0, ATA DISK drive
> Probing IDE interface ide1...
> hdc: HL-DT-STDVD-ROM GDR8081N, ATAPI CD/DVD-ROM drive
> Probing IDE interface ide2...
> Probing IDE interface ide3...
> Probing IDE interface ide4...
> Probing IDE interface ide5...
> ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
> ide1 at 0x170-0x177,0x376 on irq 15
> hda: max request size: 128KiB
> hda: 58605120 sectors (30005 MB) w/1768KiB Cache, CHS=58140/16/63
> hda: cache flushes not supported
> hda: hda1 hda2 hda3 hda4
> hdc: ATAPI 24X DVD-ROM drive, 512kB Cache

You are using the ide-generic driver instead of the proper driver for your chipset. The generic driver has very limited suspend/resume support: it doesn't know how to reprogram the IDE chipset and devices during resume...
Problem fixed by configuring kernel with ALi15x3 IDE driver instead of generic driver. Thanks Zach and Bartolomiej for your help and your patience. Cheers -- Ross
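For others who land on this report, a quick way to check for the same misconfiguration is to look at the kernel .config. The sketch below is a rough helper under stated assumptions: `CONFIG_BLK_DEV_ALI15X3` is assumed to be the Kconfig symbol for the ALi15x3 IDE driver in this kernel series (verify it against your own tree), the function names are hypothetical, and the sample config text is made up for illustration.

```python
def enabled_options(config_text):
    """Return the set of CONFIG_ names set to y or m in kernel .config text."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Skip comments such as "# CONFIG_FOO is not set" and blank lines.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value in ("y", "m"):
            enabled.add(name)
    return enabled


def ide_driver_warning(config_text, chipset_option="CONFIG_BLK_DEV_ALI15X3"):
    """Return a warning string if the chipset IDE driver is not enabled, else None."""
    if chipset_option in enabled_options(config_text):
        return None
    return f"{chipset_option} is not enabled; IDE may fall back to the generic driver"


# Made-up fragment resembling the failing configuration.
sample = "# CONFIG_BLK_DEV_ALI15X3 is not set\nCONFIG_IDE_GENERIC=y\n"
print(ide_driver_warning(sample))
```

On another chipset, pass the symbol for that chipset's driver as `chipset_option`. The dmesg output remains the more direct check, since it shows which driver actually claimed the interface.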