Most recent kernel where this bug did *NOT* occur: 2.6.20.3
Distribution: Debian
Hardware Environment: Acer Aspire 5652 (Core Duo, 1 GB RAM, Intel 82801G/ICH7 chipset)
Software Environment: Desktop
Problem Description:

Suspend to RAM worked fine on my computer up to 2.6.20.3. With every rc of 2.6.21 I have tried, suspend to RAM no longer works. Up to rc3, even suspending stopped at "suspending console", which apparently has been fixed in rc4.

I tried rc4-git4 with a minimal config (no dynticks, no HRT, no MSI, no sound, no bluetooth, no PCMCIA, no WLAN, no USB, no cpufreq), but I still can't resume properly. Caps Lock works and I can log in through SSH. With a more complete config (sound, MMC, WLAN, PCMCIA, still no dynticks or HRT, see attachment "config") I get exactly the same behaviour.

While logged in through SSH after resume, I saved the output of dmesg (which includes full power management debug messages), see attachment "dmesg-resume". The system basically seems to be back, but a lot of things do not work, such as loading/unloading my WLAN driver (ipw3945) or running "top" or "dstat". "uptime" always returns 0 min, even with power management debug disabled.

Kernel: Linux version 2.6.21-rc4 (gcc version 4.1.2 20061115 (prerelease) (Debian 4.1.1-21)) #23 SMP PREEMPT Mon Mar 19 12:27:56 CET 2007

A complete bisect between 2.6.20 and 2.6.21-rc4-git4 stops at a stage (a4bbb810dedaecf74d54b16b6dd3c33e95e1024c) where I can no longer compile the kernel because of ACPI-related compile errors in arch/i386/kernel/setup.c. After stepping back a few revisions until it compiled again, resume still didn't work. So I started all over again, bisecting only on arch/i386, and ended up at ceb6c46839021d5c7c338d48deac616944660124 as the bad commit. But this commit seems to be some kind of finalization of a series of patches ("ACPICA: Remove duplicate table manager")...
Steps to reproduce: run "echo mem > /sys/power/state", then power the machine on again by pressing a key or pushing the power button.
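The reproduction step above can be wrapped in a small helper for repeated testing. This is only a sketch: the function name `suspend_to_ram` and its optional file argument are hypothetical conveniences for dry runs, while `/sys/power/state` is the interface named in the report and needs root.

```shell
#!/bin/sh
# Sketch: request suspend-to-RAM through sysfs, as in the steps to reproduce.
# The optional argument (a hypothetical addition) lets the helper be tried
# against an ordinary file instead of the real /sys/power/state.
suspend_to_ram() {
    state_file="${1:-/sys/power/state}"
    # The file lists the sleep states the kernel offers, e.g. "standby mem disk".
    if ! grep -qw mem "$state_file"; then
        echo "mem is not listed in $state_file" >&2
        return 1
    fi
    # Writing "mem" asks the kernel to enter S3 (suspend to RAM).
    echo mem > "$state_file"
}

# Usage (as root): suspend_to_ram
```

Checking the supported states first makes it easier to tell "S3 is not offered at all" apart from "S3 is offered but resume fails", which is the case described here.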
Created attachment 10894
Output of dmesg after resume (made using SSH-login)

Created attachment 10896
My kernel-configuration (also tried with less drivers etc. - same result)

Created attachment 10897
Output of acpidump
> I can login through SSH.

So the console doesn't come back up? What if you use a text console instead of X -- does that work? (i.e. suspend/resume from S3, not from S5?)

Sounds like resume sort of worked, but something basic is broken, like system time. What do you see with

  date
  sleep 10
  date

after resume?
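The date/sleep/date check suggested above can also be scripted so that the result is a single line that is easy to paste into a report. A minimal sketch, assuming `date +%s` (seconds since the epoch) is available; the function name `check_clock` is made up for illustration.

```shell
#!/bin/sh
# Sketch: compare how long "sleep" was asked to wait with how much
# wall-clock time "date" reports as having passed.
check_clock() {
    wait_s="${1:-10}"
    start=$(date +%s)
    sleep "$wait_s"
    end=$(date +%s)
    echo "requested=${wait_s}s elapsed=$((end - start))s"
}

# Usage after resume: check_clock 10
```

If the elapsed value is far from the requested one, or the command never returns, that points at broken timekeeping rather than at a specific driver.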
Also, see if this patch helps on top of rc4:
http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/patches/release/2.6.21/acpi-release-20070126-2.6.21-rc4.diff.bz2
First of all, your patch does not work at all :( The computer just hangs on boot. I took a photo of it, see attachment. I also tried with "pci=nomsi" and "vga=normal". Anything else I could test? BTW, my BIOS is up to date.

The console itself never came back after resume (always black, with all kernel versions), but this was no problem because X always resumed just fine. However, even without X running, I could type commands (e.g. run a find in / and see the HD LED blinking) and press Ctrl+Alt+Del, for example, which isn't the case with 2.6.21 kernels.

I couldn't do any further testing right now, as I currently have no second computer (for SSH). This will change tomorrow, so I'll post more information then.
Created attachment 10919
Hang machine with patch 20070126

Sorry for the poor quality, I took it with my mobile phone ;-) If you need more or more detailed photos, tell me.
Please verify that this hang is not in 2.6.21-rc5. There was a bad C-state patch in acpi-release-20070126-2.6.21-rc4.diff.bz2. You should be able to avoid it by booting with "processor.max_cstate=1", or simply by running 2.6.21-rc5, which includes all of the above except this bad bit.
Sorry to disappoint you again, but rc5 doesn't work either. Now I can't even log in through SSH or ping the machine :-( Caps Lock works, but that's all. I don't know how to debug this now...
After taking a look at my dmesg attachment (#10894), I found the following line:

Calibrating delay using timer specific routine.. 189932.25 BogoMIPS (lpj=94966129)

This looks very strange to me. Is this already fixed by the timer patch you mentioned?
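For reference, the two numbers in that dmesg line are tied together by simple integer arithmetic: as far as I recall, the calibration code prints lpj/(500000/HZ) as the integer part and (lpj/(5000/HZ)) % 100 as the two decimals. The sketch below redoes that calculation. HZ=1000 is an assumption, chosen because it reproduces the reported figure, and the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch: recompute the printed BogoMIPS value from lpj (loops per jiffy).
# Assumption: HZ=1000 (not stated in the thread, but consistent with the line).
bogomips_from_lpj() {
    lpj="$1"
    hz="${2:-1000}"
    printf '%d.%02d\n' "$((lpj / (500000 / hz)))" "$(( (lpj / (5000 / hz)) % 100 ))"
}

bogomips_from_lpj 94966129 1000   # lpj taken from the dmesg line above
```

So the odd BogoMIPS figure is just the odd lpj value restated; the question is why the delay calibration measured such a large lpj after resume.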
"Please verify that this hang is not in 2.6.21-rc5"

It does not hang at startup anymore, that's right.
Okay, then we are back where we started: 2.6.21-rc5 doesn't resume from S3 on this machine, but 2.6.20.3 did resume.

> Calibrating delay using timer specific routine.. 189932.25 BogoMIPS (lpj=94966129)

Yeah, that is way off. I don't know what to make of it.

Is it any better if you boot with "maxcpus=1" and "noapic"?
Hmmm, this BIOS has two MADTs:

lenb@d975xbx2:~/Documents/8247> /usr/bin/acpixtract -a acpidump
Acpi table [DSDT] - 23955 bytes written to DSDT.dat
Acpi table [FACS] - 64 bytes written to FACS.dat
Acpi table [FACP] - 116 bytes written to FACP.dat
Acpi table [APIC] - 104 bytes written to APIC1.dat
Acpi table [HPET] - 56 bytes written to HPET.dat
Acpi table [MCFG] - 60 bytes written to MCFG.dat
Acpi table [SLIC] - 374 bytes written to SLIC.dat
Acpi table [DBGP] - 52 bytes written to DBGP.dat
Acpi table [APIC] - 104 bytes written to APIC2.dat
Acpi table [BOOT] - 40 bytes written to BOOT.dat
Acpi table [SSDT] - 1615 bytes written to SSDT1.dat
Acpi table [SSDT] - 1682 bytes written to SSDT2.dat
Acpi table [SSDT] - 607 bytes written to SSDT3.dat
Acpi table [SSDT] - 166 bytes written to SSDT4.dat
Acpi table [SSDT] - 1228 bytes written to SSDT5.dat
Acpi table [RSDT] - 88 bytes written to RSDT.dat
Acpi table [RSDP] - 20 bytes written to RSDP.dat

lenb@d975xbx2:~/Documents/8247> madt < APIC1.dat
ACPI: APIC (v001 Acer Grape 0x06040000 LOHR 0x0000005a) @ 0x(nil)
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
ACPI: IOAPIC (id[0x01] address[0xfec00000] global_irq_base[0x0])
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
Length 104 OK
Checksum OK

lenb@d975xbx2:~/Documents/8247> madt < APIC2.dat
ACPI: APIC (v001 PTLTD APIC 0x06040000 LTP 0x00000000) @ 0x(nil)
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
ACPI: IOAPIC (id[0x02] address[0xfec00000] global_irq_base[0x0])
ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
Length 104 OK
Checksum OK

Please boot with apic=debug, attach the output from dmesg -s64000, and paste /proc/interrupts from booting with and without "acpi_apic_instance=0" or the patch from bug 8283; and report whether it has any effect on the suspend/resume issue at hand.
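To see where the two decoded MADTs above differ, a sketch like the following may help. It copies the entry lines from the two madt outputs (the header and the Length/Checksum lines are left out), sorts them, and prints only the lines that are not common to both. The function name `madt_diff` is made up for illustration.

```shell
#!/bin/sh
# Sketch: list the decoded MADT entries that differ between APIC1 and APIC2.
# Entry lines are copied from the madt output quoted in this thread.
madt_diff() {
    a=$(mktemp) || return 1
    b=$(mktemp) || return 1
    # Entries from "madt < APIC1.dat"
    sort > "$a" <<'EOF'
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
ACPI: IOAPIC (id[0x01] address[0xfec00000] global_irq_base[0x0])
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
EOF
    # Entries from "madt < APIC2.dat"
    sort > "$b" <<'EOF'
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
ACPI: IOAPIC (id[0x02] address[0xfec00000] global_irq_base[0x0])
ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 high edge)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
EOF
    # comm -3 hides lines present in both files; lines only in APIC1 are
    # printed unindented, lines only in APIC2 are prefixed with a tab.
    comm -3 "$a" "$b"
    rm -f "$a" "$b"
}

madt_diff
```

With the entries as quoted, the only differences are the IOAPIC id (0x01 versus 0x02) and the polarity/trigger of the bus_irq 0 override (dfl dfl versus high edge), so which table the kernel picks can change how IRQ 0, the timer interrupt, is set up.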
After waiting a bit after resume, I found out that my machine resumes properly after about 60 seconds (i.e. 60 seconds after I pressed a key and the power LED turned on). This also "worked" without your patch or without "acpi_apic_instance=0". However, this is of course still not satisfying. Once the machine is back (after these 60 seconds), the load average is at 32. Furthermore, my screen doesn't come back unless I use the proprietary NVIDIA driver (which apparently has good power management).

I attached the dmesg output (with the NVIDIA driver) after suspend and resume, as well as the dmesg output when booting with apic=debug.
Created attachment 11014
dmesg-output when booting with apic=debug

Created attachment 11015
output of dmesg after resume
Everything seems to work well now with 2.6.21-rc5-git9 - thank you :)