Bug 823 - 2.5.67 > panics on boot with latest thinkpad bios 2.04
Summary: 2.5.67 > panics on boot with latest thinkpad bios 2.04
Status: REJECTED DUPLICATE of bug 1171
Alias: None
Product: ACPI
Classification: Unclassified
Component: Other
Hardware: i386 Linux
Importance: P2 blocking
Assignee: Len Brown
URL:
Keywords:
Duplicates: 824
Depends on:
Blocks:
 
Reported: 2003-06-17 19:10 UTC by Mike Phillips
Modified: 2004-03-05 11:39 UTC

See Also:
Kernel Version: 2.5.72
Subsystem:
Regression: ---
Bisected commit-id:


Attachments

Description Mike Phillips 2003-06-17 19:10:09 UTC
Distribution: Debian sid
Hardware Environment: Thinkpad T30 P4-M 1.8Ghz
Software Environment: 
Problem Description: Panic on boot

Steps to reproduce: Press the power button.

Two logs: one without #define ACPI_DEBUG_OUTPUT and one with #define ACPI_DEBUG_OUTPUT

No ACPI_DEBUG_OUTPUT: 

Comment 1 Mike Phillips 2003-06-17 20:04:33 UTC
*** Bug 824 has been marked as a duplicate of this bug. ***
Comment 2 Luming Yu 2003-08-24 22:24:05 UTC
I could not reproduce this error on a T23 or a T40. Would you please try the
latest kernel with the latest ACPI, such as 2.6.0-test4?
Comment 3 Mike Phillips 2003-08-25 16:24:32 UTC
Yes, it still hangs with test4. Would it be helpful to have the output of
/proc/acpi/dsdt (from 2.5.66, as it's the last one I have that will boot with
ACPI), or any combination of output from pmtools, either the straight raw output
or run through acpidisasm?
Comment 4 Shaohua 2003-08-25 18:53:21 UTC
Please try the patch for
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=98849
Comment 5 Mike Phillips 2003-08-25 20:08:02 UTC
OK, that patch fixed this particular problem, although two others appeared
instead. First, the kernel now hangs when it probes for hardware that doesn't
exist (namely the floppy drive). Second, it now hangs in the ACPI second stage.

So, no panic messages, but still no usable machine with ACPI enabled.

I won't mark the bug fixed; we can do that when the mentioned patch is included
in mainstream.

Comment 6 Luming Yu 2003-09-07 19:44:06 UTC
I'm interested in the "two others appeared instead". Would you please describe
how to reproduce them, and post a dmesg that exhibits the problems? Thanks a lot.
Comment 7 Mike Phillips 2003-09-16 23:18:24 UTC
Reproduction is pretty easy: just turn the machine on with ACPI support. It'll
fail in different places depending on what is and isn't compiled in, but I
think that's just a red herring for the real problem. For example, if floppy
support is compiled in, it'll hang when checking for the floppy (as this machine
doesn't have one). Another way is to compile the eepro100 driver into the
kernel; again, it'll hang on probe of the card. Here's the latest serial console
capture at boot time; the BUG in spinlock.h looks mighty suspicious. (The serial
console gets upset when the main serial driver is loaded, but the output on tty0
will continue until the ACPI [S0 S2 S3 S4 S5] message is displayed and then
stop.)

Linux version 2.6.0-test4 (phillim@debian-t30) (gcc version 3.3.2 20030908
(Debian prerelease)) #7 SMP Wed Sep 17 01:48:48 EDT 2003
Video mode to be used for restore is f00
BIOS-provided physical RAM map:
 BIOS-e820: 0000000000000000 - 000000000009f000 (usable)
 BIOS-e820: 000000000009f000 - 00000000000a0000 (reserved)
 BIOS-e820: 00000000000d2000 - 00000000000d4000 (reserved)
 BIOS-e820: 00000000000dc000 - 0000000000100000 (reserved)
 BIOS-e820: 0000000000100000 - 000000001ff70000 (usable)
 BIOS-e820: 000000001ff70000 - 000000001ff7e000 (ACPI data)
 BIOS-e820: 000000001ff7e000 - 000000001ff80000 (ACPI NVS)
 BIOS-e820: 000000001ff80000 - 0000000020000000 (reserved)
 BIOS-e820: 00000000ff800000 - 0000000100000000 (reserved)
511MB LOWMEM available.
On node 0 totalpages: 130928
  DMA zone: 4096 pages, LIFO batch:1
  Normal zone: 126832 pages, LIFO batch:16
  HighMem zone: 0 pages, LIFO batch:1
DMI present.
IBM machine detected. Enabling interrupts during APM calls.
IBM machine detected. Disabling SMBus accesses.
ACPI: RSDP (v002 IBM                                       ) @ 0x000f7010
ACPI: XSDT (v001 IBM    TP-1I    0x00002040  LTP 0x00000000) @ 0x1ff73216
ACPI: FADT (v001 IBM    TP-1I    0x00002040 IBM  0x00000001) @ 0x1ff73300
ACPI: SSDT (v001 IBM    TP-1I    0x00002040 MSFT 0x0100000d) @ 0x1ff733b4
ACPI: ECDT (v001 IBM    TP-1I    0x00002040 IBM  0x00000001) @ 0x1ff7de73
ACPI: TCPA (v001 IBM    TP-1I    0x00002040 PTL  0x00000001) @ 0x1ff7dec5
ACPI: BOOT (v001 IBM    TP-1I    0x00002040  LTP 0x00000001) @ 0x1ff7dfd8
ACPI: DSDT (v001 IBM    TP-1I    0x00002040 MSFT 0x0100000d) @ 0x00000000
ACPI: MADT not present
Building zonelist for node : 0
Kernel command line: root=/dev/hda2 ro console=ttyS0,38400n8 console=tty0
No local APIC present or hardware disabled
Initializing CPU#0
PID hash table entries: 2048 (order 11: 16384 bytes)
Detected 1798.806 MHz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 3547.13 BogoMIPS
Memory: 511832k/523712k available (3255k kernel code, 11136k reserved, 1182k
data, 196k init, 0k highmem)
Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
-> /dev
-> /dev/console
-> /root
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: Hyper-Threading is disabled
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
CPU0: Intel(R) Pentium(R) 4 Mobile CPU 1.80GHz stepping 04
per-CPU timeslice cutoff: 1462.97 usecs.
task migration cache decay timeout: 2 msecs.
SMP motherboard not detected.
Local APIC not detected. Using dummy APIC emulation.
Starting migration thread for cpu 0
CPUS done 32
Initializing RT netlink socket
PCI: PCI BIOS revision 2.10 entry at 0xfd8fe, last bus=8
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
BIO: pool of 256 setup, 14Kb (56 bytes/bio)
biovec pool[0]:   1 bvecs: 256 entries (12 bytes)
biovec pool[1]:   4 bvecs: 256 entries (48 bytes)
biovec pool[2]:  16 bvecs: 256 entries (192 bytes)
biovec pool[3]:  64 bvecs: 256 entries (768 bytes)
biovec pool[4]: 128 bvecs: 256 entries (1536 bytes)
biovec pool[5]: 256 bvecs: 256 entries (3072 bytes)
ACPI: Subsystem revision 20030813
ACPI: Found ECDT
ACPI: Interpreter enabled
ACPI: Using PIC for interrupt routing
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 *11)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 10 *11)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 10 *11)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 10 *11)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 9 10 *11)
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 9 10 11, disabled)
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 9 10 11, disabled)
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 7 9 10 11, disabled)
ACPI: PCI Root Bridge [PCI0] (00:00)
PCI: Probing PCI hardware (bus 00)
Transparent bridge - 0000:00:1e.0
ACPI: Embedded Controller [EC] (gpe 28)
ACPI: Power Resource [PUBS] (on)
eip: c0269f6f
------------[ cut here ]------------
kernel BUG at include/asm/spinlock.h:120!
invalid operand: 0000 [#1]
CPU:    0
EIP:    0060:[<c0269f9b>]    Not tainted
EFLAGS: 00010086
EIP is at acpi_ec_gpe_query+0x87/0x142
eax: c0269f6f   ebx: c16fe000   ecx: c0445468   edx: 0000157c
esi: c17f3ce4   edi: c17f3d18   ebp: c16fff54   esp: c16fff28
ds: 007b   es: 007b   ss: 0068
Process events/0 (pid: 4, threadinfo=c16fe000 task=dff066b0)
Stack: 00000286 00000000 c150ac20 00010000 33323130 37363534 42413938 46454443 
       dff4ceb0 dff4cebc dff4ceb8 c16fff64 c0251e94 c17f3ce4 c16fe000 c16fffec 
       c013898c dff4ceb0 c16fffa0 00000000 dff5c018 dff5c028 dff4ceb0 c0251e84 
Call Trace:
 [<c0251e94>] acpi_os_execute_deferred+0x10/0x22
 [<c013898c>] worker_thread+0x213/0x3b4
 [<c0251e84>] acpi_os_execute_deferred+0x0/0x22
 [<c012173a>] default_wake_function+0x0/0x2e
 [<c010a5da>] ret_from_fork+0x6/0x14
 [<c012173a>] default_wake_function+0x0/0x2e
 [<c0138779>] worker_thread+0x0/0x3b4
 [<c0108239>] kernel_thread_helper+0x5/0xb

Code: 0f 0b 78 00 1b 4b 44 c0 f0 fe 4e 34 0f 88 93 05 00 00 8d 46 
 <6>note: events/0[4] exited with preempt_count 1
SCSI subsystem initialized
Linux Kernel Card Services 3.1.22
  options:  [pci] [cardbus] [pm]
drivers/usb/core/usb.c: registered new driver usbfs
drivers/usb/core/usb.c: registered new driver hub
ACPI: PCI Interrupt Link [LNKA] enabled at IRQ 10
ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 9
ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 5
ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 10
ACPI: PCI Interrupt Link [LNKE] enabled at IRQ 9
PCI: Using ACPI for IRQ routing
PCI: if you experience problems, try using option 'pci=noacpi' or even 'acpi=off'
pty: 256 Unix98 ptys configured
SBF: Simple Boot Flag extension found and enabled.
SBF: Setting boot flags 0x1
cpufreq: P4/Xeon(TM) CPU On-Demand Clock Modulation available
apm: BIOS version 1.2 Flags 0x03 (Driver version 1.16ac)
apm: overridden by ACPI.
Journalled Block Device driver loaded
Installing knfsd (copyright (C) 1996 okir@monad.swb.de).
udf: registering filesystem
Initializing Cryptographic API
ACPI: AC Adapter [AC] (on-line)
ACPI: Battery Slot [BAT0] (battery present)
ACPI: Battery Slot [BAT1] (battery absent)
ACPI: Power Button (FF) [PWRF]
ACPI: Lid Switch [LID]
ACPI: Sleep Button (CM) [SLPB]
ACPI: Processor [CPU] (supports C1 C2 C3, 8 throttling states)
ACPI: Thermal Zone [THM0] (59 C)
request_module: failed /sbin/modprobe -- parport_lowlevel. error = -16
lp: driver loaded but no devices found
Real Time Clock Driver v1.11a
Linux agpgart interface v0.100 (c) Dave Jones
agpgart: Detected an Intel i845 Chipset.
agpgart: Maximum main memory to use for agp memory: 439M
agpgart: AGP aperture is 64M @ 0xe0000000
[drm] Initialized radeon 1.9.0 20020828 on minor 0
Serial: 8250/16550 driver $Revision: 1.90 $ IRQ sharing disabled
[garbage followed - snipped]
Comment 8 Mike Phillips 2003-09-17 00:21:10 UTC
*sigh*, update the bug report with dmesg, have another thought 30 seconds later
and spend an hour verifying. Rebuilding the kernel UP makes the problem go away;
it's only when compiled SMP that the hang occurs. (ACPI even appears to work
under UP as well.) Looks like an uninitialized spinlock somewhere in the code.

While this gets around the problem, it doesn't fix the bug. This machine is UP,
true, but I need to test SMP code paths for other drivers and need the kernel
with SMP support compiled in.
Comment 9 Luming Yu 2003-09-17 00:40:10 UTC
Would you please give the SMP config at bug 1171 a try? I think we need to
isolate your problem from ACPI. Thanks a lot.
Comment 10 Shaohua 2003-09-18 19:26:47 UTC
Please try the patch in OSDL bug 1171. This will resolve the boot hang with SMP on.
Comment 11 Mike Phillips 2003-09-22 19:21:20 UTC
OK, here's the update from the last couple of suggestions.

First, the config in 1171 doesn't even build on a clean kernel.org 2.6.0-test5:
the module build fails. Never fear, we got past that by just booting the
kernel, which *did* work.

Some investigation of the config files later revealed why the 1171 config works:
CONFIG_DEBUG_SPINLOCK is not set. So at this point in the game the bug is still
there, but subtly ignored.

Finally, I applied the patch from 1171 as suggested. It does apply to test5 with
a little offset. This *does* fix the bug, even with CONFIG_DEBUG_SPINLOCK turned
on.

The upshot of all this is that the bug *is* actually fixed now. My configuration
now has *both* patches applied. Would it be worth testing this with just the
patch from 1171?
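
For anyone reading along, here is a rough user-space sketch of why a config
without CONFIG_DEBUG_SPINLOCK can hide this kind of bug. It is only an
illustration, not the kernel's code: demo_lock_t, DEMO_MAGIC, the
DEBUG_SPINLOCK macro and the demo_* functions are made-up names. The idea it
assumes is that a debug build stores a marker value when a lock is initialized
and checks it on every lock attempt, while a non-debug build has no such check
and takes the lock whatever the memory happens to contain.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build-time switch standing in for CONFIG_DEBUG_SPINLOCK (hypothetical name). */
#ifndef DEBUG_SPINLOCK
#define DEBUG_SPINLOCK 1
#endif

#define DEMO_MAGIC 0xdead4eadu /* arbitrary marker written only by the init routine */

/* Hypothetical user-space stand-in for a lock; not the kernel's spinlock_t. */
typedef struct {
    unsigned int lock;  /* 1 = free, 0 = held */
#if DEBUG_SPINLOCK
    unsigned int magic; /* equals DEMO_MAGIC only after demo_lock_init() */
#endif
} demo_lock_t;

void demo_lock_init(demo_lock_t *l)
{
    l->lock = 1;
#if DEBUG_SPINLOCK
    l->magic = DEMO_MAGIC;
#endif
}

/* Returns 0 when the lock is taken, -1 when the debug check rejects it. */
int demo_lock(demo_lock_t *l)
{
#if DEBUG_SPINLOCK
    if (l->magic != DEMO_MAGIC) {
        fprintf(stderr, "lock %p was never initialized\n", (void *)l);
        return -1; /* the boot log in this report shows a BUG at this kind of point */
    }
#endif
    l->lock = 0;
    return 0;
}

/* Takes a lock whose memory holds leftover bytes and was never initialized. */
int demo_try_uninitialized(void)
{
    demo_lock_t l;
    memset(&l, 0x01, sizeof l); /* simulate stale memory contents */
    return demo_lock(&l);
}

/* Takes a properly initialized lock. */
int demo_try_initialized(void)
{
    demo_lock_t l;
    int rc;

    demo_lock_init(&l);
    rc = demo_lock(&l);
    assert(l.lock == 0); /* the lock is held after a successful demo_lock() */
    return rc;
}
```

Built as shown, demo_try_uninitialized() is rejected by the marker check.
Compiled with -DDEBUG_SPINLOCK=0, the same call succeeds silently, which is the
"still there, but ignored" situation described above.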
Comment 12 Shaohua 2003-09-23 22:06:08 UTC
If the two patches include
'https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=98849', it's not worth
trying. Otherwise, please try applying only the patch from 1171. Thanks a lot.
Comment 13 Mike Phillips 2003-09-25 18:52:38 UTC
Yep, it's the 98849 and 1171 patches that are required. I did try just 1171, but
the original panic came back (expected really, just tested for completeness).
Comment 14 Shaohua 2003-10-30 02:13:55 UTC
Please test the new patch in Bug 1171 for the SMP error. The previous patch
had an error. Thanks.
Comment 15 Len Brown 2003-11-12 18:14:23 UTC

*** This bug has been marked as a duplicate of 1171 ***
