Distribution: SUSE 9.2 x86-64 Hardware Environment: Asus L5D Problem Description: The box hangs solid while resuming from disk, after the system image has been restored. Apparently, the hang occurs as a result of the call to pci_write_config_word() with pmcsr = 0 in drivers/pci/pci.c:pci_set_power_state(). In fact, the function pci_write_config_word() completes successfully but the box hangs during msleep() afterwards. It cartainly is related to the fact that ACPI PCI links are not set before pci_set_power_state() is called. Steps to reproduce: Suspend the box to disk and try to resume it. The problem is 100% reproducible. Greets, Rafael
Created attachment 4812 [details] Output of acpidmp on Asus L5D Attached is the output of acpidmp (the 2.6.12-rc1-mm3 kernel) on the affected box.
Created attachment 4813 [details] The path that needs to be reverted to fix the problem Attached it the patch from the ACPI tree that needs to be reverted to fix the problem.
The problem seems to be related to the chipset (NVidia NForce 3) and/or BIOS, as I am not able to reproduce it on another AMD64-based box with the AMD-81x1 chipset.
Oh well, could you please also attach the dmesg? I suspect we should disable link device at suspend, but not very sure if it can help. But first I'd like to know which mode (pic/apic) the box is. I didn't see the issue on all boxes I tried, so it is quite possible a bios/chipset issue.
Created attachment 4817 [details] dmesg output The box is in the PIC mode (sorry, I should have said this earlier), because there are problems with IO-APIC on it. The attached dmesg output is from the kernel with the patch reverted.
ACPI: OEMB ... >>> ERROR: Invalid checksum ACPI: DSDT ... Please verify that this machine is running the latest available BIOS.
Could you please try below debug patch, it disable link device at suspend. If with the patch the system reports an oops caused by 'sleeping at atomic', please just ignored it. If it works, we will figure out a real fix. Thanks, Shaohua diff -puN drivers/acpi/pci_link.c~disable_link_device drivers/acpi/pci_link.c --- 2.5/drivers/acpi/pci_link.c~disable_link_device 2005-03-31 13:25:51.970339288 +0800 +++ 2.5-root/drivers/acpi/pci_link.c 2005-03-31 13:28:12.101036184 +0800 @@ -733,6 +733,7 @@ irqrouter_suspend( } if (link->irq.active && link->irq.initialized) link->irq.suspend_resume = 1; + acpi_ut_evaluate_object(link->handle, "_DIS", 0, NULL); } return_VALUE(0); }
Created attachment 4828 [details] Serial console log from 2.6.12-rc1-mm2 Referring to Comment #6: Yes, it does. Unfortunately, the BIOS is buggy (eg the processor description in the DSDT is wrong etc.). Referring to Comment #7: This patch seems to help, but it causes some funny side-effects to appear. Attached is the serial console log from 2.6.12-rc1-mm2 with the patch applied. Greets, Rafael
Referring to Comment #8: Please ignore the "acpi_pci_link_allocate(): irq.active = ..." lines in the log. They come from a printk that I've added to acpi_pci_link_allocate().
Referring to Comment #8: Er, one more mistake of the same kind. Please ignore the "pci_set_power_state(): pmcsr = ..." and "pci_write_config_word() successful" lines in the log. They come from printks that I've added to pci_set_power_state().
Great, looks like we should disable link device. Does any device work (except the devices with irq 11) with the patch? and how about unload the sound card driver?
Created attachment 4830 [details] dmesg output from 2.6.12-rc1-mm2 with the debug patch First, the log that I have posted is not a typical one, it appears. Typically, if I don't unload the sound card driver before suspend, IRQ 11 is disabled and the box goes into a neverending loop while suspending devices (ie during suspend), but magic SysRq works then. If I unload the sound card driver before suspend, both IRQ 10 and IRQ 11 are disabled and the box suspends and resumes sucessfully (IRQ 10 and 11 are shared). The devices that are not on either IRQ 10 or IRQ 11 seem to work afterwards. Attached is a log of a typical boot-suspend-resume cycle, from dmesg.
Created attachment 4831 [details] Contents of /proc/interrupts Attached is the contents of /proc/interrupts on the box.
Referring to Comment #12: A correction: The infinite loop that occurs if the sound card driver is not unloaded before suspend actually starts in device_resume() called from kernel/power/swsusp.c:swsusp_write(), which is while _resuming_ devices during suspend.
Ok, maybe we should mask all PIC/IOAPIC pin at suspend, and only enable them till a device really start using them.
Please also try below debug patch (with previous one). I'd like to see if it's a PIC issue. Thanks, Shaohua diff -puN arch/i386/kernel/i8259.c~disable_link_device arch/i386/kernel/i8259.c --- 2.5/arch/i386/kernel/i8259.c~disable_link_device 2005-04-01 13:11:22.865666792 +0800 +++ 2.5-root/arch/i386/kernel/i8259.c 2005-04-01 13:15:10.719027816 +0800 @@ -259,6 +259,8 @@ static void save_ELCR(char *trigger) static int i8259A_resume(struct sys_device *dev) { init_8259A(0); + /* mask interrupt 10,11 */ + trigger[1] &= 0xF3; restore_ELCR(irq_trigger); return 0; }
Referring to Comment #16: This patch does not seem to change anything.
Hmm, maybe the .suspend should call free_irq and .resume should call request_irq, so we mask some interrupts till devices request them. Pavel, the PIC suspend/resume seems quite wrong to me. We should enable PIC/IRQ router per device's quest. A blind guess will have a small window which causes unexpected interrupts occur.
I can confirm that the problem does not occur in the APIC mode (tested on 2.6.12-rc2-mm3).
The problem is present in 2.6.12-rc3-mm2 (PIC mode). Greets, Rafael
Created attachment 5034 [details] debug patch Hi Rafael, could you please try the debug patch. Please unload (never loading the drivers is preferred since some drivers' unload routine isn't good) all PCI drivers except yenta (my patch just changed yenta driver). I'd like to konw if yenta is ok after resume. The patche does: 1. if no driver uses PCI irq router, the router will be disabled so it can reduce race. 2. PCI drivers call pci_disable_device and free_irq at suspend, so the pic/ioapic is masked at suspend. The goal of the patch is to make ioapic/pic/irq router just like the boot time at suspend (boot time the devices are disabled).
Add Nigel in the list, since this also impacts suspend2.
Created attachment 5037 [details] Patch to get the previous patch work on x86-64 Referring to Comment #21: With the patch applied the box has suspended and resumed successfully. :-) However, I was not quite sure which kernel you wanted me to test, so I tested 2.6.12-rc3-mm2 (2.6.12-rc3-mm3 does not boot on my box). If you want me to test another one, please let me know. Also, the system here is an x86-64, and your patch is against i386, apparently, so I needed to apply some changes manually to arch/x86-64/kernel/mpparse.c (it seems to be very similar to the i386 version, so I just copied the changes from your patch - with some minor modifications). The attached patch contains the changes that I made. I booted the box to the runlevel 3 (ie no X). The network and USB drivers were not loaded. Before suspend, I had unloaded the sound card driver manually. The yenta driver was in memory all the time.
-mm2 should be ok. Any tree which includes the patch mentioned at comment 2 (the 'delete link resume' patch) is ok. Pavel, Len & Nigel, did you have any objection on this proposal? Maybe I should drop a mail to pm list.
Yes, please drop a mail to the pm list. Hopefully it'll get more attention. ;-)
David, do you want me to comment on http://bugme.osdl.org/attachment.cgi?id=4813&action=view ? havig u8 reserved:6 seems little silly to me. Otherwise it looks okay.
Thanks Pavel. I'd like consider the proposal in comment 21, but I'll send an email to pm list to detailly explain the proposal soon, so please review it then.
Hi Rafael, without the patch does the system work only with yenta loaded? I'd like show people a case that 'not call free_irq' isn't good for some systems. Thanks!
Unfortunately (?), without the patch and with only yenta loaded the system appears to work. As of 2.6.12-rc3-mm2, the drivers that hang the kernel during resume (sometimes also during suspend) are: ohci-hcd (shares IRQ with yenta) ehci-hcd snd_intel8x0
Can we have an update on this please? Is 2.6.12-rc5 fixed?
The 2.6.12-rc5 kernel is unaffected, as the problem is only present in -mm. The problem is present in 2.6.12-rc5-mm1. Apparently, the following drivers are affected: ohci-hcd ehci-hcd snd_intel8x0 The box hangs solid during resume or suspend while executing the drivers' _resume() routines.
The problem is not present in 2.6.12-rc5-mm2
As described in the bug logs, this issue is now fixed.
Unfortunately, not quite so. It's back in 2.6.13-rc2-mm2.
The problem has made it to mainline, as it is present in 2.6.13-rc3.
I've just had some time to test this a bit more on Linux-2.6.13-rc3-git5 with a patch that frees the IRQ in _resume() and requests one in _suspend() for yenta_socket. After a fresh reboot the configuration of IRQs is the following: CPU0 0: 31855 XT-PIC timer 1: 262 XT-PIC i8042 2: 0 XT-PIC cascade 5: 2 XT-PIC ehci_hcd:usb3 8: 0 XT-PIC rtc 9: 280 XT-PIC acpi 10: 194 XT-PIC SysKonnect SK-98xx, NVidia nForce3 11: 42 XT-PIC ohci_hcd:usb1, ohci_hcd:usb2, yenta, yenta 12: 1115 XT-PIC i8042 14: 719 XT-PIC ide0 15: 6425 XT-PIC ide1 NMI: 52 LOC: 31787 ERR: 2 MIS: 0 So, the involved PCI drivers are: ehci_hcd (IRQ5), sk98lin (IRQ10), snd_intel8x0 (IRQ10), ohci_hcd (IRQ11), yenta_socket (IRQ11). Evidently, if only one (arbitrary) of these drivers is loaded before resume, the box always suspends and resumes successfully, so the problem does not seem to be directly related to what these drivers do in their _suspend/_resume routines. However, there are the following combinations of drivers that cause the box to hang solid during resume: snd_intel8x0 + ohci_hcd snd_intel8x0 + ehci_hcd sk98lin + ohci_hcd sk98lin + ehci_hcd The remaining combinations of drivers are apparently OK wrt suspend/resume. Thus it looks like the box hangs during resume if IRQ10 is occupied (by anything) _and_ a USB controller is active at the same time. It is also apparent that the box hangs because of the call to pci_write_config_word() with pmcsr = 0 in drivers/pci/pci.c:pci_set_power_state() called from drivers/usb/core/usb/hcd-pci.c:usb_hcd_pci_resume() for either the ehci_hcd or the ohci_hcd driver which happens _before_ the driver has a chance to request an IRQ. It is also the first driver for which the _resume() routine is called so any other driver cannot request its IRQ earlier. Now it seems to me that, as a result of the call to pci_set_power_state(), an IRQ (IRQ10?) is generated which is then mishandled and this causes the box to hang. Greets, Rafael
Created attachment 5371 [details] Patch to fix the problem on Asus L5D The attached patch (the Yenta part is not mine) fixes the problem on my box, which has been tested on both 2.6.13-rc3-git5 and -git6. As I said in Comment #36, I think that calling pci_set_power_state() from usb_hcd_pci_resume() on my box causes an interrupt to occur which is mishandled if at least one driver has not freed IRQ10 during suspend. Apparently, all PCI drivers _must_ call free_irq() from their suspend routines or some unrelated interrupts may be mishandled during resume. As far as I am concerned, the problem is resolved. Greets, Rafael
Great analysis, Rafael. Please push it to upstream kernel. I guess there will be many similar patches recently. People seem agree we should free_irq/request_irq in suspend/resume time.
applied to-linus
We sent this one to 2.6.13-rc5 and caused quite a stir, resulting in Linus restoring the blind link restore code. So this failure should be masked for 2.6.13 -- please re-open if 2.6.13 is still an issue.