Bug 4711
Summary: | ACPI - System hang, loss of keyboard, Controller using wrong IRQ | ||
---|---|---|---|
Product: | ACPI | Reporter: | Parag Warudkar (kernel-stuff) |
Component: | Config-Interrupts | Assignee: | Len Brown (lenb) |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | acpi-bugzilla, bunk |
Priority: | P2 | ||
Hardware: | i386 | ||
OS: | Linux | ||
Kernel Version: | 2.6.12-rc5 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: |
dmidecode output
biosdecode output DMESG output acpidmp output Picture showing IRQ assignments lspci -vv output lspci -vv output with acpi=off (no hang case) /proc/interrupts with acpi=off |
Description
Parag Warudkar
2005-06-04 13:52:16 UTC
Created attachment 5124 [details]
dmidecode output
Created attachment 5125 [details]
biosdecode output
All the listed problems go away if acpi=off is passed on kernel command line. >ACPI: PCI Interrupt 0000:00:02.0[P]: no GSI - using IRQ 11
How could a PCI device's pin be pin 'P'?
Could you please post detail information, such as dmesg, acpidmp?
Created attachment 5129 [details]
DMESG output
Created attachment 5130 [details]
acpidmp output
acpidmp output
is this one "via boards with IOAPIC" ? if yes, see bug http://bugme.osdl.org/show_bug.cgi?id=3319. Please attach "cat /proc/interrupts" S >PCI Interrupt 0000:00:02.0[P]: no GSI - using IRQ 11
Looks the OHCI device isn't initialized by BIOS or maybe isn't power on. So
sometimes it causes hang. And sometimes, its interrupt pin is 'P'.
What's the interrupt assignment in Win? Can you track down which step in
pci_enable_device cause the hang, pci_set_power_state or
pcibios_enable_device? And also please try boot option 'pci=routeirq'. Thanks!
See the attached IRQ-Usb.PNG for Win IRQ Assignment :) The device is powered on - only then the hang occurs. I will track it down to what exactly causes the hang inside pci_enable_device(). Thanks. Created attachment 5134 [details]
Picture showing IRQ assignments
Ok. pci=routeirq did not make a difference - pci_enable_device() still hangs while booting. Here is the relevant portion of dmseg with pci=routeirq PCI: Using ACPI for IRQ routing ** Routing PCI interrupts for all devices because "pci=routeirq" ** was specified. If this was required to make a driver work, ** please email the output of "lspci" to bjorn.helgaas@hp.com ** so I can fix the driver. ACPI: PCI Interrupt Link [LSMB] enabled at IRQ 22 ACPI: PCI interrupt 0000:00:01.1[A] -> GSI 22 (level, low) -> IRQ 22 ACPI: PCI Interrupt Link [LUS0] enabled at IRQ 21 ACPI: PCI interrupt 0000:00:02.0[A] -> GSI 21 (level, low) -> IRQ 21 ACPI: PCI Interrupt Link [LUS1] enabled at IRQ 20 ACPI: PCI interrupt 0000:00:02.1[B] -> GSI 20 (level, low) -> IRQ 20 ACPI: PCI Interrupt Link [LUS2] enabled at IRQ 22 ACPI: PCI interrupt 0000:00:02.2[C] -> GSI 22 (level, low) -> IRQ 22 ACPI: PCI Interrupt Link [LACI] enabled at IRQ 21 ACPI: PCI interrupt 0000:00:06.0[A] -> GSI 21 (level, low) -> IRQ 21 ACPI: PCI Interrupt Link [LMCI] enabled at IRQ 20 ACPI: PCI interrupt 0000:00:06.1[B] -> GSI 20 (level, low) -> IRQ 20 ACPI: PCI Interrupt Link [LNK1] enabled at IRQ 19 ACPI: PCI interrupt 0000:02:00.0[A] -> GSI 19 (level, low) -> IRQ 19 ACPI: PCI Interrupt Link [LNK2] enabled at IRQ 18 ACPI: PCI interrupt 0000:02:01.0[A] -> GSI 18 (level, low) -> IRQ 18 ACPI: PCI Interrupt Link [LNK3] enabled at IRQ 17 ACPI: PCI interrupt 0000:02:02.0[A] -> GSI 17 (level, low) -> IRQ 17 ACPI: PCI interrupt 0000:02:04.0[A] -> GSI 19 (level, low) -> IRQ 19 ACPI: PCI interrupt 0000:02:04.1[B] -> GSI 18 (level, low) -> IRQ 18 ACPI: PCI Interrupt Link [LNK5] enabled at IRQ 16 ACPI: PCI interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 16 cat /proc/interrupts -------------------- CPU0 0: 516363 IO-APIC-edge timer 1: 1071 IO-APIC-edge i8042 8: 0 IO-APIC-edge rtc 9: 16 IO-APIC-level acpi 12: 3261 IO-APIC-edge i8042 14: 10031 IO-APIC-edge ide0 15: 30822 IO-APIC-edge ide1 18: 379 IO-APIC-level eth0 19: 3 IO-APIC-level ohci1394 20: 1439 IO-APIC-level ohci_hcd, NVidia nForce3 Modem 21: 0 IO-APIC-level ohci_hcd, NVidia nForce3 22: 12884 IO-APIC-level ehci_hcd NMI: 60 LOC: 516210 ERR: 0 MIS: 0 I tried to isolate the hang - it happens inside pcibios_enable_device(). Hmm.. Interestingly I booted the machine today under the same conditions when it hangs on boot and it did not hang while booting. More interestingly, the /proc/interrupts output is different than the case where it hangs - Normally (when it hangs on boot) IRQ assignment looks like this - 18: 379 IO-APIC-level eth0 19: 3 IO-APIC-level ohci1394 20: 1439 IO-APIC-level ohci_hcd, NVidia nForce3 Modem 21: 0 IO-APIC-level ohci_hcd, NVidia nForce3 22: 12884 IO-APIC-level ehci_hcd Today when it did not hang and everything still worked fine here is what the relevant differences are - 16: 3 IO-APIC-level ohci1394 18: 49277 IO-APIC-level nvidia 19: 1753 IO-APIC-level eth0 20: 6253 IO-APIC-level ohci_hcd 21: 646 IO-APIC-level ohci_hcd, NVidia nForce3 22: 2225 IO-APIC-level ehci_hcd, NVidia nForce3 Modem NMI: 80 LOC: 840535 ERR: 0 MIS: 0 Questions - Why would the IRQ Assignment change between boots and why did it not hang with the later assignment? What can be done to keep this assignment and will it avoid the hang if kept same? Maybe you can find some rules, :). is there any difference in the dmesg in the non-hang case? This patch http://marc.theaimsgroup.com/?l=linux-kernel&m=112018510921263&w=2 might help you, could you please try? Thanks! I tried an updated version of the same patch you mentioned. It doesn't help. Hello! Is there a way to reliably force the IRQ assignment to be always - 20: 6253 IO-APIC-level ohci_hcd 21: 646 IO-APIC-level ohci_hcd, NVidia nForce3 22: 2225 IO-APIC-level ehci_hcd, NVidia nForce3 Modem I am thinking the controller doesn't like the normal IRQ assignment which ACPI chooses. If I had to modify the code to assign IRQs as above, where should I start looking to begin with? Windows IRQ Assignment - Strangely enough, IRQ 19 is missing! IRQ 0 System timer OK IRQ 1 Standard 101/102-Key or Microsoft Natural PS/2 Keyboard OK IRQ 6 Standard floppy disk controller OK IRQ 8 System CMOS/real time clock OK IRQ 9 Microsoft ACPI-Compliant System OK IRQ 10 NVIDIA nForce PCI System Management OK IRQ 12 Microsoft PS/2 Port Mouse (IntelliPoint) OK IRQ 13 Numeric data processor OK IRQ 14 NVIDIA nForce3 Parallel ATA Controller (v2.6) OK IRQ 15 NVIDIA nForce3 Parallel ATA Controller (v2.6) OK IRQ 16 Texas Instruments PCI-1620 CardBus Controller with UltraMedia. OK IRQ 16 NVIDIA GeForce4 440 Go 64M OK IRQ 17 Broadcom 802.11b/g WLAN OK IRQ 18 Texas Instruments OHCI Compliant IEEE 1394 Host Controller OK IRQ 18 Texas Instruments PCI-1620 CardBus Controller with UltraMedia. OK IRQ 20 Standard OpenHCD USB Host Controller OK IRQ 20 SoundMAX Integrated Digital Audio OK IRQ 21 Standard OpenHCD USB Host Controller OK IRQ 21 Agere Systems AC'97 Modem OK IRQ 22 Standard Enhanced PCI to USB Host Controller OK Can you attach the 'lspci -vv' output from the broken case? I'm still confused why the pin is 'P'. The "PCI Interrupt 0000:00:02.0[P]: no GSI" problem is very rare - I haven't recently encountered that problem. Routinely, the pin is almost always A. My real problem is the 2 minute hang while device is attached and the USB controller is assigned IRQ other than 20. I also tried (just for the heck of it) doing dev->irq = 20 before hcd-pci.c calls pci_enable_device - it doens't hang in that case. So this almost proves that the controller doesn't like any IRQ other than 20 to be assigned to it. Now only if we could look at why Linux's IRQ assignment is different than Windows and why ACPI chooses different IRQs at different times, I think we can crack this. I will post the lspci -vv output regardless though. Thanks. Created attachment 5324 [details]
lspci -vv output
Created attachment 5325 [details]
lspci -vv output with acpi=off (no hang case)
Created attachment 5326 [details]
/proc/interrupts with acpi=off
Some more info - The laptop has 3 USB ports - 1 legacy USB1.1 and 2 USB 2.0 ports. If any USB device is connected to any one or both of the two USB 2.0 ports, only then the load of ohci-hcd hangs the system during bootup. Attaching device to the legacy port doesnt hang the system by itself when ohci-hcd is loaded. This is the weirdest problem I ever had to debug! :( I managed to get a call trace while the kernel hangs - ohci_hcd: 2004 Nov 08 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI) PCI: Enabling device 0000:00:02.0 (0004 -> 0006) BUG: soft lockup detected on CPU#0! Modules linked in: ohci_hcd i2c_nforce2 i2c_core snd_intel8x0m shpchp snd_intel8 x0 snd_ac97_codec snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_d evice snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc 8 139too mii floppy dm_snapshot dm_zero dm_mirror ext3 jbd dm_mod sata_nv libata s d_mod usb_storage scsi_mod ehci_hcd Pid: 1115, comm: modprobe Not tainted 2.6.11-1.1369_FC4 RIP: 0010:[<ffffffff801703da>] <ffffffff801703da>{handle_IRQ_event+26} RSP: 0018:ffffffff8050a980 EFLAGS: 00000246 RAX: 0000000000000001 RBX: ffff81005faa22c0 RCX: 000000000000003d RDX: ffff81005faa22c0 RSI: ffffffff8050aa08 RDI: 0000000000000001 RBP: 0000000000000000 R08: 0000000000000000 R09: ffff81005fbb3d9c R10: 0000000000000046 R11: ffffffff8033f080 R12: 0000000000000001 R13: ffffffff8050aa08 R14: 0000000000000001 R15: 00000000005160f8 FS: 00002aaaaadfd6e0(0000) GS:ffffffff80576c00(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b CR2: 00002aaaaad698f0 CR3: 000000005ed3c000 CR4: 00000000000006e0 Call Trace: <IRQ> <ffffffff801705fd>{__do_IRQ+477} <ffffffff880049ff>{:ehci_hcd: ehci_work+47} <ffffffff801120b8>{do_IRQ+72} <ffffffff8010f6c3>{ret_from_intr+0} <ffffffff8033f080>{pci_conf1_read+0} <ffffffff801703da>{handle_IRQ_event+ 26} <ffffffff801705fd>{__do_IRQ+477} <ffffffff8013c54e>{profile_tick+78} <ffffffff801120b8>{do_IRQ+72} <ffffffff8010f6c3>{ret_from_intr+0} <EOI> <ffffffff8033f080>{pci_conf1_read+0} <ffffffff8024bb91>{pci_bus_re ad_config_byte+353} <ffffffff802851ce>{acpi_pci_irq_enable+79} <ffffffff8033ec0c>{pcibios_ena ble_resources+236} <ffffffff8024ebb6>{pci_enable_device_bars+38} <ffffffff8024ebeb>{pci_enab le_device+27} <ffffffff8031b3c3>{usb_hcd_pci_probe+83} <ffffffff80250a67>{pci_device_pr obe+103} <ffffffff802507aa>{pci_bus_match+42} <ffffffff802c503d>{driver_probe_devi ce+77} <ffffffff802c50b6>{driver_attach+70} <ffffffff802c51a9>{bus_add_driver+16 9} <ffffffff80250430>{pci_register_driver+160} <ffffffff80165765>{sys_init_m odule+277} <ffffffff8010f0fe>{system_call+126} Call Trace: <IRQ> <ffffffff8017012d>{softlockup_tick+285} <ffffffff80113fc1>{timer_interrupt+913} <ffffffff801703ec>{handle_IRQ_event+44} <ffffffff801705fd>{__do_IRQ+477} <ffffffff801120b8>{do_IRQ+72} <ffffffff8010f6c3>{ret_from_intr+0} <ffffffff8033f080>{pci_conf1_read+0} <ffffffff801703da>{handle_IRQ_event+26} <ffffffff801705fd>{__do_IRQ+477} <ffffffff880049ff>{:ehci_hcd:ehci_work+47} <ffffffff801120b8>{do_IRQ+72} <ffffffff8010f6c3>{ret_from_intr+0} <ffffffff8033f080>{pci_conf1_read+0} <ffffffff801703da>{handle_IRQ_event+26} <ffffffff801705fd>{__do_IRQ+477} <ffffffff8013c54e>{profile_tick+78} <ffffffff801120b8>{do_IRQ+72} <ffffffff8010f6c3>{ret_from_intr+0} <EOI> <ffffffff8033f080>{pci_conf1_read+0} <ffffffff8024bb91>{pci_bus_read_config_byte+353} <ffffffff802851ce>{acpi_pci_irq_enable+79} <ffffffff8033ec0c>{pcibios_enable_resources+236} <ffffffff8024ebb6>{pci_enable_device_bars+38} <ffffffff8024ebeb>{pci_enable_device+27} <ffffffff8031b3c3>{usb_hcd_pci_probe+83} <ffffffff80250a67>{pci_device_probe+103} <ffffffff802507aa>{pci_bus_match+42} <ffffffff802c503d>{driver_probe_device+77} <ffffffff802c50b6>{driver_attach+70} <ffffffff802c51a9>{bus_add_driver+169} <ffffffff80250430>{pci_register_driver+160} <ffffffff80165765>{sys_init_module+277} <ffffffff8010f0fe>{system_call+126} ACPI: PCI Interrupt Link [LUS0] enabled at IRQ 22 Whoa! These problems automatically stopped recently when I upgraded the laptop from 2.6.11 to 2.6.15- rc5. Don't know what exactly fixed these issues and in which kernel release. Closing since it's no longer reproducible. |