Distribution: Debian sid
Hardware Environment: Thinkpad T30 P4-M 1.8GHz
Software Environment:
Problem Description: Panic on boot
Steps to reproduce: Press on button
Two logs: without #define ACPI_DEBUG_OUTPUT and with #define ACPI_DEBUG_OUTPUT
No ACPI_DEBUG_OUTPUT:
*** Bug 824 has been marked as a duplicate of this bug. ***
I didn't see this kind of error on a T23 or T40. Would you please try the latest kernel with the latest ACPI, such as 2.6.0-test4?
Yes, it still hangs with test4. Would it be helpful to have the output of /proc/acpi/dsdt (from 2.5.66, as it's the last one I have that will boot with ACPI), or any combination of output from pmtools, either the straight raw output or run through acpidisasm?
Please try the patch from https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=98849
OK, that patch fixed this particular problem, although two others appeared instead. Do *not* probe for hardware that doesn't exist (namely the floppy drive): the probe hangs. Also, it's now hanging in the ACPI second stage. So, no panic messages, but still no usable machine with ACPI enabled. I won't mark the bug fixed; we can do that when the mentioned patch is included in mainstream.
I'm interested in the "two others appeared instead". Would you please describe how to reproduce them, and post a dmesg that exhibits the problems? Thanks a lot.
Reproduction is pretty easy: just turn the machine on with ACPI support. It'll fail in different places depending upon what is and isn't compiled in, but I think that's just a red herring for the real problem. For example, if floppy support is compiled in, it'll hang when checking for the floppy (as this machine doesn't have one). Another way is to compile the eepro100 driver into the kernel; again it'll hang on probe of the card. Here's the latest serial console capture at boot time; the BUG in spinlock.h looks mighty suspicious. (The serial console gets upset when the main serial driver is loaded, but the output on tty0 will continue until the ACPI [S0 S2 S3 S4 S5] message is displayed and then stop.)

Linux version 2.6.0-test4 (phillim@debian-t30) (gcc version 3.3.2 20030908 (Debian prerelease)) #7 SMP Wed Sep 17 01:48:48 EDT 2003
Video mode to be used for restore is f00
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009f000 (usable)
BIOS-e820: 000000000009f000 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000d2000 - 00000000000d4000 (reserved)
BIOS-e820: 00000000000dc000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000001ff70000 (usable)
BIOS-e820: 000000001ff70000 - 000000001ff7e000 (ACPI data)
BIOS-e820: 000000001ff7e000 - 000000001ff80000 (ACPI NVS)
BIOS-e820: 000000001ff80000 - 0000000020000000 (reserved)
BIOS-e820: 00000000ff800000 - 0000000100000000 (reserved)
511MB LOWMEM available.
On node 0 totalpages: 130928
DMA zone: 4096 pages, LIFO batch:1
Normal zone: 126832 pages, LIFO batch:16
HighMem zone: 0 pages, LIFO batch:1
DMI present.
IBM machine detected. Enabling interrupts during APM calls.
IBM machine detected. Disabling SMBus accesses.
ACPI: RSDP (v002 IBM ) @ 0x000f7010
ACPI: XSDT (v001 IBM TP-1I 0x00002040 LTP 0x00000000) @ 0x1ff73216
ACPI: FADT (v001 IBM TP-1I 0x00002040 IBM 0x00000001) @ 0x1ff73300
ACPI: SSDT (v001 IBM TP-1I 0x00002040 MSFT 0x0100000d) @ 0x1ff733b4
ACPI: ECDT (v001 IBM TP-1I 0x00002040 IBM 0x00000001) @ 0x1ff7de73
ACPI: TCPA (v001 IBM TP-1I 0x00002040 PTL 0x00000001) @ 0x1ff7dec5
ACPI: BOOT (v001 IBM TP-1I 0x00002040 LTP 0x00000001) @ 0x1ff7dfd8
ACPI: DSDT (v001 IBM TP-1I 0x00002040 MSFT 0x0100000d) @ 0x00000000
ACPI: MADT not present
Building zonelist for node : 0
Kernel command line: root=/dev/hda2 ro console=ttyS0,38400n8 console=tty0
No local APIC present or hardware disabled
Initializing CPU#0
PID hash table entries: 2048 (order 11: 16384 bytes)
Detected 1798.806 MHz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 3547.13 BogoMIPS
Memory: 511832k/523712k available (3255k kernel code, 11136k reserved, 1182k data, 196k init, 0k highmem)
Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
-> /dev
-> /dev/console
-> /root
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: Hyper-Threading is disabled
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
CPU0: Intel(R) Pentium(R) 4 Mobile CPU 1.80GHz stepping 04
per-CPU timeslice cutoff: 1462.97 usecs.
task migration cache decay timeout: 2 msecs.
SMP motherboard not detected.
Local APIC not detected. Using dummy APIC emulation.
Starting migration thread for cpu 0
CPUS done 32
Initializing RT netlink socket
PCI: PCI BIOS revision 2.10 entry at 0xfd8fe, last bus=8
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
BIO: pool of 256 setup, 14Kb (56 bytes/bio)
biovec pool[0]: 1 bvecs: 256 entries (12 bytes)
biovec pool[1]: 4 bvecs: 256 entries (48 bytes)
biovec pool[2]: 16 bvecs: 256 entries (192 bytes)
biovec pool[3]: 64 bvecs: 256 entries (768 bytes)
biovec pool[4]: 128 bvecs: 256 entries (1536 bytes)
biovec pool[5]: 256 bvecs: 256 entries (3072 bytes)
ACPI: Subsystem revision 20030813
ACPI: Found ECDT
ACPI: Interpreter enabled
ACPI: Using PIC for interrupt routing
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 *11)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 10 *11)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 *11)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 10 *11)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 9 10 *11)
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 9 10 11, disabled)
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 9 10 11, disabled)
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 7 9 10 11, disabled)
ACPI: PCI Root Bridge [PCI0] (00:00)
PCI: Probing PCI hardware (bus 00)
Transparent bridge - 0000:00:1e.0
ACPI: Embedded Controller [EC] (gpe 28)
ACPI: Power Resource [PUBS] (on)
eip: c0269f6f
------------[ cut here ]------------
kernel BUG at include/asm/spinlock.h:120!
invalid operand: 0000 [#1]
CPU: 0
EIP: 0060:[<c0269f9b>] Not tainted
EFLAGS: 00010086
EIP is at acpi_ec_gpe_query+0x87/0x142
eax: c0269f6f ebx: c16fe000 ecx: c0445468 edx: 0000157c
esi: c17f3ce4 edi: c17f3d18 ebp: c16fff54 esp: c16fff28
ds: 007b es: 007b ss: 0068
Process events/0 (pid: 4, threadinfo=c16fe000 task=dff066b0)
Stack: 00000286 00000000 c150ac20 00010000 33323130 37363534 42413938 46454443
dff4ceb0 dff4cebc dff4ceb8 c16fff64 c0251e94 c17f3ce4 c16fe000 c16fffec
c013898c dff4ceb0 c16fffa0 00000000 dff5c018 dff5c028 dff4ceb0 c0251e84
Call Trace:
[<c0251e94>] acpi_os_execute_deferred+0x10/0x22
[<c013898c>] worker_thread+0x213/0x3b4
[<c0251e84>] acpi_os_execute_deferred+0x0/0x22
[<c012173a>] default_wake_function+0x0/0x2e
[<c010a5da>] ret_from_fork+0x6/0x14
[<c012173a>] default_wake_function+0x0/0x2e
[<c0138779>] worker_thread+0x0/0x3b4
[<c0108239>] kernel_thread_helper+0x5/0xb
Code: 0f 0b 78 00 1b 4b 44 c0 f0 fe 4e 34 0f 88 93 05 00 00 8d 46
<6>note: events/0[4] exited with preempt_count 1
SCSI subsystem initialized
Linux Kernel Card Services 3.1.22
options: [pci] [cardbus] [pm]
drivers/usb/core/usb.c: registered new driver usbfs
drivers/usb/core/usb.c: registered new driver hub
ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 10
ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 9
ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 5
ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 10
ACPI: PCI Interrupt Link [LNKE] enabled at IRQ 9
PCI: Using ACPI for IRQ routing
PCI: if you experience problems, try using option 'pci=noacpi' or even 'acpi=off'
pty: 256 Unix98 ptys configured
SBF: Simple Boot Flag extension found and enabled.
SBF: Setting boot flags 0x1
cpufreq: P4/Xeon(TM) CPU On-Demand Clock Modulation available
apm: BIOS version 1.2 Flags 0x03 (Driver version 1.16ac)
apm: overridden by ACPI.
Journalled Block Device driver loaded
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
udf: registering filesystem
Initializing Cryptographic API
ACPI: AC Adapter [AC] (on-line)
ACPI: Battery Slot [BAT0] (battery present)
ACPI: Battery Slot [BAT1] (battery absent)
ACPI: Power Button (FF) [PWRF]
ACPI: Lid Switch [LID]
ACPI: Sleep Button (CM) [SLPB]
ACPI: Processor [CPU] (supports C1 C2 C3, 8 throttling states)
ACPI: Thermal Zone [THM0] (59 C)
request_module: failed /sbin/modprobe -- parport_lowlevel. error = -16
lp: driver loaded but no devices found
Real Time Clock Driver v1.11a
Linux agpgart interface v0.100 (c) Dave Jones
agpgart: Detected an Intel i845 Chipset.
agpgart: Maximum main memory to use for agp memory: 439M
agpgart: AGP aperture is 64M @ 0xe0000000
[drm] Initialized radeon 1.9.0 20020828 on minor 0
Serial: 8250/16550 driver $Revision: 1.90 $ IRQ sharing disabled
[garbage followed - snipped]
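For anyone comparing several captures like this one, a small script can pull out the headline facts (BUG location, faulting function, process, call trace). This is only a sketch: the function name `summarize_oops` and the regular expressions are assumptions written against the message shapes visible in this capture, and other kernel versions may format an oops differently.

```python
import re

# A few lines copied from the capture in this report, used as sample input.
SAMPLE_OOPS = """\
kernel BUG at include/asm/spinlock.h:120!
invalid operand: 0000 [#1]
EIP: 0060:[<c0269f9b>] Not tainted
EIP is at acpi_ec_gpe_query+0x87/0x142
Process events/0 (pid: 4, threadinfo=c16fe000 task=dff066b0)
Call Trace:
[<c0251e94>] acpi_os_execute_deferred+0x10/0x22
[<c013898c>] worker_thread+0x213/0x3b4
"""


def summarize_oops(log):
    """Return a dict with the main facts found in an oops capture.

    Keys are only present when the matching message was found.
    """
    summary = {}

    # "kernel BUG at <file>:<line>!"
    m = re.search(r"kernel BUG at (\S+):(\d+)!", log)
    if m:
        summary["bug_file"] = m.group(1)
        summary["bug_line"] = int(m.group(2))

    # "EIP is at <function>+0x<offset>/0x<size>"
    m = re.search(r"EIP is at (\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)", log)
    if m:
        summary["function"] = m.group(1)
        summary["offset"] = int(m.group(2), 16)
        summary["size"] = int(m.group(3), 16)

    # "Process <name> (pid: <n>, ..."
    m = re.search(r"Process (\S+) \(pid: (\d+)", log)
    if m:
        summary["process"] = m.group(1)
        summary["pid"] = int(m.group(2))

    # Call trace entries look like "[<address>] symbol+0xoff/0xsize".
    summary["call_trace"] = re.findall(
        r"\[<[0-9a-f]{8}>\] (\w+)\+0x[0-9a-f]+/0x[0-9a-f]+", log
    )
    return summary


if __name__ == "__main__":
    for key, value in summarize_oops(SAMPLE_OOPS).items():
        print(key, "=", value)
```

Because the patterns do not depend on line breaks, the same function should also work on a capture that was pasted as one long line.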
*sigh*, update bug report with dmesg, have another thought 30 seconds later and spend an hour verifying. Rebuilding the kernel UP makes the problem go away; it's only when compiled SMP that the hang occurs. (ACPI even appears to work under UP.) Looks like an uninitialized spinlock somewhere in the code. While this gets around the problem, it doesn't fix the bug. This machine is UP, true, but I need to test SMP code paths for other drivers and need the kernel with SMP support compiled in.
Would you please give the SMP config at bug 1171 a try? I think we need to isolate your problem from ACPI. Thanks a lot.
Please try the patch in OSDL 1171. This will resolve the boot hang with SMP on.
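To illustrate the "uninitialized spinlock" guess, here is a toy model of how a debug-style lock check could trip on a lock that was never initialized. It is not kernel code: the class, the function names and the magic constant are all placeholders invented for this sketch, and it only assumes that a debug build compares a marker field written by the lock initializer.

```python
# Placeholder marker value; NOT taken from the kernel source.
LOCK_MAGIC = 0x5A5A5A5A


class SpinlockBug(Exception):
    """Stands in for the kernel hitting BUG() in this toy model."""


class DebugSpinlock:
    """Toy lock: an initializer writes the marker, raw memory does not."""

    def __init__(self, initialized=True):
        # An uninitialized lock is modeled as zeroed memory.
        self.magic = LOCK_MAGIC if initialized else 0
        self.locked = False


def spin_lock(lock, debug):
    """Take the lock; with debug checks on, verify the marker first."""
    if debug and lock.magic != LOCK_MAGIC:
        raise SpinlockBug("lock used before it was initialized")
    lock.locked = True


if __name__ == "__main__":
    stale = DebugSpinlock(initialized=False)
    spin_lock(stale, debug=False)      # no check: the problem goes unnoticed
    try:
        spin_lock(stale, debug=True)   # check enabled: the misuse is caught
    except SpinlockBug as exc:
        print("caught:", exc)
```

In this model the same faulty caller either trips the check or passes silently depending only on whether the check is compiled in, which is one way a bug can appear or vanish between build configurations.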
OK, here's the update from the last couple of suggestions. Firstly, the config in 1171 doesn't even build on a kernel.org clean 2.6.0-test5: the module build fails. But never fear, we got past that by just booting the kernel, which *did* work. Some investigation of config files later revealed why the 1171 config works: CONFIG_DEBUG_SPINLOCK is not set. So at this point in the game the bug is still there, but subtly ignored. Finally, I applied the patch from 1171 as suggested. It does apply to test5 with a little offset. This *does* fix the bug, even with CONFIG_DEBUG_SPINLOCK turned on. The upshot of all this is that the bug *is* actually fixed now. My configuration now has *both* patches applied. Would it be worth testing this with just the patch from 1171?
If the two patches include 'https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=98849', it's not worth trying. Otherwise, please try applying only the patch from 1171. Thanks a lot.
Yep, it's the 98849 and 1171 patches that are required. I did try just 1171, but the original panic came back (expected really, just tested for completeness).
Please test the new patch in Bug 1171 for the SMP error. The previous patch has an error. Thanks.
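Since the difference between the two configs came down to a single option, a quick check of a .config file can save some digging. A minimal sketch, assuming the usual .config syntax (`CONFIG_X=y` for set options, `# CONFIG_X is not set` for unset ones); the helper names are made up for this example, and "failing combination" here just means what this report observed: SMP together with CONFIG_DEBUG_SPINLOCK.

```python
def parse_kconfig(text):
    """Parse .config text into a dict of option name -> value.

    Options written as "# CONFIG_X is not set" are recorded as "n".
    """
    options = {}
    suffix = " is not set"
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            options[line[2:-len(suffix)]] = "n"
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options


def is_failing_combination(options):
    """True for the combination reported as failing in this bug:
    an SMP build with CONFIG_DEBUG_SPINLOCK enabled."""
    return (
        options.get("CONFIG_SMP") == "y"
        and options.get("CONFIG_DEBUG_SPINLOCK") == "y"
    )


if __name__ == "__main__":
    sample = "CONFIG_SMP=y\n# CONFIG_DEBUG_SPINLOCK is not set\n"
    print(is_failing_combination(parse_kconfig(sample)))
```

To run it against a real tree, read the file contents first, for example `parse_kconfig(open(".config").read())`.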
*** This bug has been marked as a duplicate of 1171 ***