Bug 1123

Summary: Hang at i8042.c when booting with no PS/2 mouse attached
Product: ACPI
Reporter: Felipe Alfaro Solana (felipe_alfaro)
Component: Config-Interrupts
Assignee: Len Brown (lenb)
Status: CLOSED PATCH_ALREADY_AVAILABLE    
Severity: high
CC: acpi-bugzilla, adam, ingok, teuf, topia, zach.gelnett
Priority: P2    
Hardware: i386   
OS: Linux   
Kernel Version: 2.6.0-test3-bk6
Subsystem:
Regression: ---
Bisected commit-id:
Attachments: config used to compile 2.6.0-test3-bk6
dmesg output after booting *with* a PS/2 mouse plugged in
Modified version of i8042.c with some additional printk()
a copy of /proc/interrupts
output of the "lspci -vvv" command
The System.map file for the 2.6.0-test3-bk6 kernel
output of the "acpidmp"
output of the "dmidecode" tool
output of "dmesg" for 2.6.0-test4-bk1 booted with "pci=noacpi"
output of "dmesg" for 2.6.0-test4-bk1 booted without "pci=noacpi"
ingok@gmx.net: output of dmidecode
ingok@gmx.net: output of acpidmp
dmesg output of 2.6.0-test4-bk1 booted up with ACPI and APIC
output of the "acpidmp" tool
output of the "dmidecode" tool
a copy of /proc/interrupts
output of the "lspci -vvv" command
output of dmidecode for freya.yggdrasil.com
"lspci -vvv" output for freya.yggdrasil.com
/proc/interrupts for freya.yggdrasil.com
serial console output for freya.yggdrasil.com
freya.yggdrasil.com console log from older kernel
dmesg output for 2.6.0-test6

Description Felipe Alfaro Solana 2003-08-18 15:50:26 UTC
Distribution: 
Red Hat Linux Taroon Beta 1 
 
Hardware Environment: 
Pentium IV 2.0 GHz 
i845DE-based motherboard 
ATI Radeon 7000 VE 
3Com Corporation 3c905C-TX-M 
hda: ST380021A, ATA DISK drive 
hdb: IBM-DTLA-307030, ATA DISK drive 
hdc: Pioneer DVD-ROM ATAPIModel DVD-500M B00, ATAPI CD/DVD-ROM drive 
hdd: PLEXTOR CD-R PX-W4824A, ATAPI CD/DVD-ROM drive 
Microsoft Natural PS/2 keyboard 
No mouse 
 
Software Environment: 
Red Hat Linux Taroon Beta 1 
 
Problem Description: 
If I try to boot my P4 box (i845DE motherboard) with no PS/2 mouse plugged 
into the PS/2 port, the kernel hangs while checking the AUX ports in function 
i8042_check_aux(). The i8042_check_aux() function is trying to request IRQ 
#12, but the call to request_irq() causes the hang. The kernel hangs exactly 
at: 
 
   if (request_irq(values->irq, i8042_interrupt, SA_SHIRQ, 
                   "i8042", i8042_request_irq_cookie)) 
 
in drivers/input/serio/i8042.c, with a value of 12 for values->irq. If I boot 
with my PS/2 mouse attached, the kernel is able to boot normally. Also, 
disabling ACPI support in the kernel allows me to boot 2.6.0-test3-bk6 with no 
PS/2 mouse plugged in. 
 
I have booted the kernel with initcall_debug=1 to see where it was hanging. 
The output is not very informative (the last initcall reported is i8042_init), 
but I've left it enabled, as you will see in the "dmesg" output. 
 
Attached to this report are System.map (useful since I've enabled initcall 
debug) and the i8042.c source that I modified with plenty of printk() 
messages to find where the hang occurs. Also attached are the output of 
"lspci -vvv" while running a 2.4.21 kernel, "dmesg" for 2.6.0-test3-bk6 and 
the list of interrupts in use (both taken when booted with a PS/2 mouse 
attached), plus the "config" file used to compile the kernel. 
 
Steps to reproduce: 
1. Compile 2.6.0-test3-bk6 using the supplied config file 
2. Try to boot it on an i845DE motherboard while no mouse is plugged in
Comment 1 Felipe Alfaro Solana 2003-08-18 15:51:31 UTC
Created attachment 666 [details]
config used to compile 2.6.0-test3-bk6
Comment 2 Felipe Alfaro Solana 2003-08-18 15:52:01 UTC
Created attachment 667 [details]
dmesg output after booting *with* a PS/2 mouse plugged in
Comment 3 Felipe Alfaro Solana 2003-08-18 15:53:57 UTC
Created attachment 668 [details]
Modified version of i8042.c with some additional printk()

This is the i8042.c version of 2.6.0-test3-bk6 with additional printk()'s to
find where exactly the kernel was hanging.

It turned out that the problem occurs while requesting IRQ #12 by calling
request_irq() inside i8042_check_aux().
Comment 4 Felipe Alfaro Solana 2003-08-18 15:54:17 UTC
Created attachment 669 [details]
a copy of /proc/interrupts
Comment 5 Felipe Alfaro Solana 2003-08-18 15:55:04 UTC
Created attachment 670 [details]
output of the "lspci -vvv" command

This is the output of "lspci -vvv" while running a 2.4.21 kernel.
Comment 6 Felipe Alfaro Solana 2003-08-18 15:56:26 UTC
Created attachment 671 [details]
The System.map file for the 2.6.0-test3-bk6 kernel

I think this System.map can be useful for deciphering all those "initcall"
debug messages.
Comment 7 ingo 2003-08-23 02:31:32 UTC
- same for 2.6.0-test4 (independent of whether I plug in a mouse or not, though)
- seems to hang in request_irq in i8042_check_mux
Comment 8 Len Brown 2003-08-23 19:49:06 UTC
What is the most recent kernel that you know worked? 
I'm wondering if this is due to recent ACPI 20030813 changes, or earlier. 
 
Also, when you "disabled ACPI" and the system worked, did you use acpi=off? 
Can you confirm that pci=noacpi also boots? 
Attach the output from dmidecode, available in /usr/sbin/, or here: 
http://www.nongnu.org/dmidecode/ 
 
Attach the output from acpidmp, available in /usr/sbin/, or here: 
http://www.intel.com/technology/iapc/acpi/downloads/pmtools-20010730.tar.gz 
 
Attach the dmesg output showing the failure, if possible. 
 
Comment 9 Felipe Alfaro Solana 2003-08-24 04:38:01 UTC
The most recent kernel that worked for me is any of the -test3 series, for 
example, -test3-mm3, if my memory serves me well. 
 
When I said I disabled ACPI, I meant that I took all ACPI code out of the 
kernel and compiled it using APM exclusively. I have also tried leaving 
ACPI compiled in (with no APM) and booting with "pci=noacpi", and that works 
perfectly. 
 
After this comment, I will attach a dmesg of 2.6.0-test4-bk1 booted with 
"pci=noacpi", the output of dmesg when booting normally, and also the output 
of "dmidecode" and "acpidmp". 
Comment 10 Felipe Alfaro Solana 2003-08-24 04:40:24 UTC
Created attachment 702 [details]
output of the "acpidmp"

This is the output of running "acpidmp" from the Intel pmtools package on
the P4 machine, while running a 2.6.0-test4-bk1 kernel booted up using
"pci=noacpi".
Comment 11 Felipe Alfaro Solana 2003-08-24 04:43:00 UTC
Created attachment 703 [details]
output of the "dmidecode" tool

This is the output of the "dmidecode" tool from Red Hat's kernel-utils-2.4-8.34
package.
Comment 12 Felipe Alfaro Solana 2003-08-24 04:43:57 UTC
Created attachment 704 [details]
output of "dmesg" for 2.6.0-test4-bk1 booted with "pci=noacpi"
Comment 13 Felipe Alfaro Solana 2003-08-24 04:44:34 UTC
Created attachment 705 [details]
output of "dmesg" for 2.6.0-test4-bk1 booted without "pci=noacpi"
Comment 14 ingo 2003-08-24 04:47:41 UTC
- linux-2.6.0-test3 was the last kernel that worked for me
- linux-2.6.0-test4 is the first kernel that has/exposes the bug
- linux-2.6.0-test4 hangs in i8042_check_mux() in request_irq()
- linux-2.6.0-test4 with acpi=off works
- linux-2.6.0-test4 with pci=noacpi works
Comment 15 ingo 2003-08-24 04:51:31 UTC
Created attachment 706 [details]
ingok@gmx.net: output of dmidecode
Comment 16 ingo 2003-08-24 04:52:39 UTC
Created attachment 707 [details]
ingok@gmx.net: output of acpidmp
Comment 17 Felipe Alfaro Solana 2003-08-24 07:02:52 UTC
OK, I've checked for a new BIOS and found one dated July 7th, 2003. This 
new BIOS supports IO-APIC and the MPS 1.4 specification. 
 
I have flashed this new BIOS and found that 2.6.0-test4-bk1 boots fine 
if ACPI and APIC+IOAPIC are both enabled at the same time. Booting this 
kernel with "noapic" causes the original hang from this report when probing 
for the AUX port. Booting with "noapic pci=noacpi" also works. Thus, the 
system behaves exactly the same as with the old BIOS, but since this new BIOS 
includes APIC support, enabling APIC and IO-APIC also solves all problems. 
 
I have attached a new "dmesg" for 2.6.0-test4-bk1 with ACPI and APIC enabled, 
and booting with no extra kernel parameters. Also, I have added the output of 
running "lspci -vvv" and a copy of "/proc/interrupts" when the system is 
running with APIC and ACPI. Additionally, I have superseded my old "dmidecode" 
and "acpidmp" attachments with new ones. 
Comment 18 Felipe Alfaro Solana 2003-08-24 07:05:43 UTC
Created attachment 710 [details]
dmesg output of 2.6.0-test4-bk1 booted up with ACPI and APIC

This is the dmesg output of 2.6.0-test4-bk1 running on a Platinix 2D/533-A
motherboard with the latest BIOS, and booted with no extra kernel parameters,
that is, ACPI is enabled and so is IO-APIC.

I can't attach the dmesg output when booting with "noapic pci=noacpi" since the
machine hangs while probing for the AUX port and, since I'm booting with no
APIC support, the dmesg output differs considerably with respect to the dmesg
of booting with APIC enabled.
Comment 19 Felipe Alfaro Solana 2003-08-24 07:06:16 UTC
Created attachment 711 [details]
output of the "acpidmp" tool
Comment 20 Felipe Alfaro Solana 2003-08-24 07:07:15 UTC
Created attachment 712 [details]
output of the "dmidecode" tool
Comment 21 Felipe Alfaro Solana 2003-08-24 07:07:51 UTC
Created attachment 713 [details]
a copy of /proc/interrupts
Comment 22 Felipe Alfaro Solana 2003-08-24 07:08:13 UTC
Created attachment 714 [details]
output of the "lspci -vvv" command
Comment 23 Adam J. Richter 2003-08-24 23:50:34 UTC
I am experiencing this problem also, on a p4 motherboard with a Via chipset.
I have determined that when the system hangs, it is actually repeatedly
calling i8042_interrupt() in an infinite loop.  The first call to
i8042_interrupt occurs when setup_irq in arch/i386/kernel/irq.c calls
spin_unlock_irqrestore (i.e., the moment when interrupt processing by
i8042_interrupt becomes available).
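
The ordering described above can be illustrated with a small user-space model.
This is only a sketch under stated assumptions: it is not kernel code, every
name in it (model_request_irq, deliver_pending, the two handlers) is
hypothetical, and it assumes the line stays asserted until a handler services
the device. It shows why the first handler call would land at the point where
interrupts are restored, and why a handler that never clears the source would
keep being called before the registration call can return.

```c
#include <stdbool.h>
#include <stddef.h>

/* User-space model only: every name here is hypothetical, not kernel API. */
typedef bool (*model_handler_t)(void);

static bool line_asserted;            /* state of the modelled IRQ line   */
static bool cpu_irqs_enabled;         /* modelled CPU interrupt flag      */
static model_handler_t installed;     /* handler registered for the line  */
static int handler_calls;             /* calls made since registration    */

/* Handler that services the device, which drops the line. */
static bool servicing_handler(void)
{
    handler_calls++;
    line_asserted = false;
    return true;
}

/* Handler that finds nothing to do, so the line stays asserted. */
static bool idle_handler(void)
{
    handler_calls++;
    return false;
}

/* Dispatch for as long as the line is asserted; 'cap' bounds the model. */
static void deliver_pending(int cap)
{
    while (cpu_irqs_enabled && installed != NULL && line_asserted &&
           handler_calls < cap)
        installed();
}

/*
 * Models the ordering in the comment above: the handler is installed with
 * interrupts off, and the first delivery happens when they are restored.
 * Returns how many handler calls ran before the registration could return.
 */
static int model_request_irq(model_handler_t fn, bool asserted, int cap)
{
    line_asserted = asserted;
    handler_calls = 0;
    cpu_irqs_enabled = false;   /* stands in for spin_lock_irqsave()      */
    installed = fn;             /* handler now attached to the line       */
    deliver_pending(cap);       /* no-op: interrupts are still disabled   */
    cpu_irqs_enabled = true;    /* stands in for spin_unlock_irqrestore() */
    deliver_pending(cap);       /* pending interrupt is delivered here    */
    return handler_calls;
}
```

In this model, model_request_irq(servicing_handler, true, 1000) returns after
a single handler call, while model_request_irq(idle_handler, true, 1000) runs
the handler until the cap is reached, which is the model's stand-in for the
hang.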
Comment 24 Adam J. Richter 2003-08-25 00:01:26 UTC
Created attachment 720 [details]
output of dmidecode for freya.yggdrasil.com
Comment 25 Adam J. Richter 2003-08-25 00:02:52 UTC
Created attachment 721 [details]
"lspci -vvv" output for freya.yggdrasil.com
Comment 26 Adam J. Richter 2003-08-25 00:04:13 UTC
Created attachment 722 [details]
/proc/interrupts for freya.yggdrasil.com
Comment 27 Shaohua 2003-08-26 00:39:16 UTC
Any chance of getting the boot messages from a failed boot (through a serial port)?
Comment 28 Adam J. Richter 2003-08-26 02:40:12 UTC
Created attachment 738 [details]
serial console output for freya.yggdrasil.com

The lines beginning with "AJR" are from printk calls that I added.  I have
left them in, because they are pretty self-explanatory and provide some
useful information.  In particular, they show that the interrupt handler
is being called in some kind of infinite loop.  Perhaps there is some kind
of interrupt acknowledgement problem or some kind of edge versus level
misconfiguration.
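
To make the edge versus level hypothesis concrete, here is a minimal sketch in
plain C. It is a hypothetical model, not taken from the kernel or from the
attached logs: count_deliveries and the trigger_mode enum are names invented
for illustration. It counts how many times a handler would be invoked for the
same sequence of line states under each trigger mode.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of how a controller samples one IRQ line. */
enum trigger_mode { TRIGGER_EDGE, TRIGGER_LEVEL };

/*
 * Count deliveries for a sequence of line samples (true = asserted).
 * Edge mode fires on each inactive-to-active transition; level mode fires
 * on every sample where the line is asserted, which models a handler
 * being re-entered because the source was never acknowledged.
 */
static int count_deliveries(enum trigger_mode mode,
                            const bool *samples, size_t n)
{
    int deliveries = 0;
    bool previous = false;   /* line assumed inactive before first sample */

    for (size_t i = 0; i < n; i++) {
        if (mode == TRIGGER_EDGE) {
            if (samples[i] && !previous)
                deliveries++;
        } else if (samples[i]) {
            deliveries++;
        }
        previous = samples[i];
    }
    return deliveries;
}
```

For a line that is asserted on every sample, the model delivers once in edge
mode and once per sample in level mode. If a line that should be edge
triggered were treated as level triggered, or its polarity were misread so
that the idle state looked asserted, the second pattern would match the
endless handler calls seen in the log.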
Comment 29 Adam J. Richter 2003-08-26 23:35:01 UTC
Created attachment 745 [details]
freya.yggdrasil.com console log from older kernel

This is a log of the console output under 2.6.0-test3, which does not
experience this problem.  I believe someone asked for it in order to compare
with the failing 2.6.0-test4 output.
Comment 30 Jun Nakajima 2003-08-27 08:03:18 UTC
We have acknowledged this problem and are working on it.

Comment 31 Greg Kroah-Hartman 2003-08-28 10:13:30 UTC
*** Bug 1144 has been marked as a duplicate of this bug. ***
Comment 32 Luming Yu 2003-09-03 07:42:14 UTC
Please see bug #10; it looks like a similar issue was root-caused there.
Comment 33 Adam J. Richter 2003-09-03 08:55:56 UTC
I tried the "Initial fix for GA-7VAX" given in bug #10, and my system still
hung at the same point as before.

Comment 34 Luming Yu 2003-09-08 20:45:19 UTC
Would you please retry without Plug and Play support built-in?
Thanks a lot!
Comment 35 Luming Yu 2003-09-08 23:20:35 UTC
Also, I need you to try a UP kernel instead of an SMP kernel (with ACPI fully 
enabled and the debug option turned on). And please remove your printk calls. 
Thanks a lot.
Comment 36 Zach Gelnett 2003-09-12 06:41:19 UTC
I just tried out test5 and am getting the same result.
Comment 37 mrmailer 2003-09-13 20:55:47 UTC
this continues in test5.
Comment 38 Adam J. Richter 2003-09-15 16:33:39 UTC
I will not be able to test new kernels on the machine that experiences this
problem until at least 2003.10.04.  So, I would appreciate it if someone else
who is experiencing this problem would try Luming's requests (comments 34 and
35), which were the following.

From comment 34, retry without Plug and Play support (not sure if this means ISA
PnP or some other configuration option).

From comment 35, retry with: CONFIG_SMP not set, CONFIG_ACPI set, and debug
option (?) enabled.
Comment 39 Luming Yu 2003-09-17 00:52:28 UTC
Would you please give the patch at bug 1186 a try? Thanks a lot.
Comment 40 Felipe Alfaro Solana 2003-09-17 05:20:57 UTC
The patch at bug 1186 
(http://bugme.osdl.org/attachment.cgi?id=903&action=view) fixes the problem 
for me. 
Comment 41 Petr Sebor 2003-09-22 16:34:07 UTC
Confirmed, this patch makes the problem go away...
2.6.0-test5-bk9, UP, ACPI, VIA KT400
Comment 42 ingo 2003-09-25 03:08:13 UTC
yep, the patch at bug 1186
(http://bugme.osdl.org/attachment.cgi?id=903&action=view) works for me
(comment 14) too.
Comment 43 ingo 2003-09-29 07:39:22 UTC
with plain 2.6.0-test6 the system boots and, AFAICS, everything works OK,
but there is an 'irq 12: nobody cared!' message exactly where the hang used to be.
dmesg output follows
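
The wording of that message suggests the kernel saw interrupts on IRQ 12 that
no registered handler claimed. How such a check might turn an endless
interrupt loop into a warning can be sketched as below. This is a hypothetical
user-space model: the struct, the function name and both thresholds are
assumptions made for illustration and are not taken from the 2.6.0-test6
source.

```c
#include <stdbool.h>

/* Hypothetical model of a per-IRQ "nobody cared" check; not kernel code. */
struct irq_model {
    unsigned int total;       /* interrupts seen in the current window */
    unsigned int unhandled;   /* of those, how many no handler claimed */
    bool disabled;            /* set once the line is shut off         */
};

/* Illustrative thresholds chosen for this sketch (assumptions). */
enum { WINDOW = 1000, UNHANDLED_LIMIT = 990 };

/*
 * Record one interrupt. 'handled' says whether any registered handler
 * claimed it. Returns true if this call disabled the line.
 */
static bool note_interrupt_model(struct irq_model *irq, bool handled)
{
    if (irq->disabled)
        return false;
    irq->total++;
    if (!handled)
        irq->unhandled++;
    if (irq->total < WINDOW)
        return false;
    if (irq->unhandled > UNHANDLED_LIMIT) {
        irq->disabled = true;   /* where a "nobody cared" report would go */
        return true;
    }
    irq->total = 0;             /* healthy window: start counting afresh  */
    irq->unhandled = 0;
    return false;
}
```

With these assumed thresholds, a line whose interrupts are never claimed is
shut off on the 1000th interrupt, so the boot can continue, whereas a line
whose interrupts are always claimed is never disabled.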
Comment 44 ingo 2003-09-29 07:40:30 UTC
Created attachment 958 [details]
dmesg output for 2.6.0-test6
Comment 45 Luming Yu 2003-09-29 18:33:26 UTC
Did you try the patch that fixed your problem on -test5? Did this kind of 
error message appear on -test5?
Comment 46 ingo 2003-09-30 02:17:48 UTC
just checked, the patch from bug 1186 makes the system boot without
error message for test6, as was the case for test5.
Comment 47 Adam J. Richter 2003-10-20 07:55:11 UTC
The machine that had this problem with 2.6.0-test4 does not have this
problem with 2.6.0-test8.  2.6.0-test8 does not have the
patch that Ingo pointed to from bug 1186 (
http://bugme.osdl.org/attachment.cgi?id=903&action=view ), but
drivers/acpi/pci_link.c has changed somewhat between 2.6.0-test4 and
2.6.0-test8.

Does anyone still have this problem as of 2.6.0-test8?
Comment 48 Len Brown 2003-11-12 20:35:25 UTC
please re-open if you still have ps/2 IRQ problems with the latest 2.4 or 2.6 kernel. 
thanks, 
-Len