Bug 9787
Summary: | Deadlock on _any_ ACPI event with nmi_watchdog=1 | ||
---|---|---|---|
Product: | ACPI | Reporter: | Nick (gentuu) |
Component: | Config-Interrupts | Assignee: | Zhang Rui (rui.zhang) |
Status: | CLOSED DUPLICATE | ||
Severity: | low | CC: | acpi-bugzilla |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.24-rc8 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments:
Problem config
dmesg of the problem 2.6.24-rc8 (HEAD is a7da60f41551abb3c520b03d42ec05dd7decfc7f)
working config
Working dmesg output
Description
Nick
2008-01-21 08:52:24 UTC
Just finished trying all the kernels down to 2.6.24-rc1 - the same thing. So, marking the bug as a regression.

Created attachment 14512 [details]
Problem config
Reply-To: akpm@linux-foundation.org

On Mon, 21 Jan 2008 08:52:25 -0800 (PST) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=9787
>
> Summary: Deadlock on _any_ ACPI event
> Product: ACPI
> Version: 2.5
> KernelVersion: 2.6.24-rc8
> Platform: All
> OS/Version: Linux
> Tree: Mainline
> Status: NEW
> Severity: high
> Priority: P1
> Component: ACPICA-Core
> AssignedTo: acpi_acpica-core@kernel-bugs.osdl.org
> ReportedBy: gentuu@gmail.com
>
>
> Latest working kernel version: 2.6.23 for sure (but I had some 2.6.24 kernel
> working)

A box-killing post-2.6.23 regression.

> Earliest failing kernel version: current 2.6.24, up to
> a7da60f41551abb3c520b03d42ec05dd7decfc7f mainline commit
>
> Hardware Environment: x86_64
> Problem Description:
> It's impossible to use any recent kernel with ACPI compiled in: _any_ ACPI event
> hangs my laptop without any [net]console output or so.
> Any ACPI event means: lid closing, AC adapter removing/plugging in, button
> press.
>
> Guess, needless to say that it's impossible to use laptop without ACPI:
> no battery info, no automating AC adapter events, no hibernate on the button
> press.
>
> Steps to reproduce:
> It can be reproduced easily and almost 100%. I'm just booting ACPI kernel
> and removing AC adapter - laptop is locked.
>
> I'm trying 2.6.24-rc5 and lower now to try to find out the problem release.
> But I can boot 2.6.23 anytime - and the problem is not there.
>

Yes, please do try to pinpoint which change broke it.

Created attachment 14513 [details]
dmesg of the problem 2.6.24-rc8 (HEAD is a7da60f41551abb3c520b03d42ec05dd7decfc7f)
I found that any 2.6.24-rcX has the problem. Somewhere between 2.6.24-rc7 and -rc8 I tried to merge the kgdb git branch and to connect to the locked system when the problem appears - it didn't let me in. Even kgdb can't interrupt this. Total lock.

Can you attach the dmesg output of a working kernel? Say 2.6.23.

Surprise-surprise! Just found a backed up working 2.6.24-rc6 (!!) Attaching its config and dmesg output.

Created attachment 14521 [details]
working config
2.6.24-rc6 working config
Created attachment 14522 [details]
Working dmesg output
2.6.24-rc6 working dmesg output
The thing is, there is nothing ACPI-related in the diff between the configs (except CONFIG_ACPI_SYSFS_POWER, which is a NOP for me: the problem doesn't depend on the option state). So, we have some non-obvious reason. I'm trying different options (from the configs diff) to find it out.

>[ 0.000000] Linux version 2.6.24-rc8 (root@knote) (gcc version 4.2.2
>(Gentoo 4.2.2 p1.0)) #31 SMP PREEMPT Mon Jan 21 21:23:44 EET 2008
>[ 0.000000] Command line: ro root=/dev/sda2 nmi_watchdog=1

Why do you add the "nmi_watchdog=1" boot option?
Is there any difference if you remove it?
I started to add it to find out whether it would show a deadlock with disabled interrupts or so. But in fact it didn't change anything.

BTW, I have an update. I've built rc8 and rc6 with the working config... and I got the same problem(!). So, I guess the problem is in my own environment. Rechecking/cleaning everything and trying again.

Congrats Zhang :) You asked the right question in fact. I added nmi_watchdog to track down some problems during IPVS customization I was doing before I hit the %subj% problem. Now I can see why I hit it... Without nmi_watchdog everything is OK :) So, the problem is low priority in fact. But we have an interesting question finally: why does an ACPI event hang the system when nmi_watchdog is enabled?

In fact, the problem also appears in 2.6.23 - so this bug doesn't block meta-bug #9243 and is probably not a regression (not sure since what version it exists) - so removing both attributes.

Per Linus, nmi_watchdog was disabled to prevent issues like this: http://lkml.org/lkml/2007/3/5/303

Closing as a duplicate of bug 7839 -- use nmi_watchdog at your own risk. Please re-open if you find that nmi_watchdog worked on a previous kernel and then stopped working.

*** This bug has been marked as a duplicate of bug 7839 ***

Thanks Len! Sorry for wasting all your time.
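
For anyone landing here with a similar hang: since the trigger in this report turned out to be the nmi_watchdog=1 boot option, a first triage step is to check whether the running kernel was booted with it. The sketch below is only an illustration, not part of the original report. It reads /proc/cmdline and reports the nmi_watchdog value; the helper names `nmi_watchdog_setting` and `read_cmdline` are made up for this example, and reporting the last occurrence when the option is repeated is an assumption of the sketch.

```python
from typing import Optional


def nmi_watchdog_setting(cmdline: str) -> Optional[str]:
    """Return the value given to nmi_watchdog= on a kernel command line.

    Returns None when the option is absent. If it appears more than once,
    the last occurrence is reported (an assumption of this sketch).
    """
    value = None
    for token in cmdline.split():
        if token.startswith("nmi_watchdog="):
            value = token.split("=", 1)[1]
    return value


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the running kernel's command line; empty string if unreadable."""
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        return ""


if __name__ == "__main__":
    setting = nmi_watchdog_setting(read_cmdline())
    if setting is None:
        print("nmi_watchdog not set on the command line")
    else:
        print(f"nmi_watchdog={setting}")
```

The same function can be applied to the "Command line:" entry of an attached dmesg, which is where the option was spotted in this thread.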