Bug 8714

Summary: boot hang at "setting up timer" on NForce C51, unless nolapic
Product: ACPI
Reporter: Per Toft (per)
Component: Config-Interrupts
Assignee: ykzhao (yakui.zhao)
Status: REJECTED INSUFFICIENT_DATA
Severity: blocking
CC: acpi-bugzilla, per
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.22-rc7
Subsystem:
Regression: ---
Bisected commit-id:
Attachments:
  ACPI dump (binary)
  ACPI dump (hex)
  dmesg with single pci=noacpi
  /proc/interrupts with single pci=noacpi kernel options
  dump tool to get bios PRT/MPS table

Description Per Toft 2007-07-05 07:07:32 UTC
Most recent kernel where this bug did not occur: 2.6.22-rc7
Distribution: Ubuntu gutsy (development release)
Hardware Environment: AMD Turion X2, NForce C51
Software Environment: 2.6.22-rc7 x86_64 AND x86
Problem Description:
System hangs on boot at "setting up timer" or "loading additional drivers" unless nolapic or pci=noacpi is added.
If nolapic or pci=noacpi is added the system can boot, but the IRQs for the second PCI Express bus are not assigned.

Output from dump_pirq.pl:
Interrupt routing table found at address 0xfde80:
  Version 1.0, size 0x0160
  Interrupt router is device 00:0a.0
  PCI exclusive interrupt mask: 0x0000 []

Device 00:0a.0 (slot 0): 
  INTA: link 0x09, irq mask 0xcca0 [5,7,10,11,14,15]
  INTB: link 0x0c, irq mask 0xcca0 [5,7,10,11,14,15]

Device 00:0b.0 (slot 0): 
  INTA: link 0x0f, irq mask 0xcca0 [5,7,10,11,14,15]
  INTB: link 0x0a, irq mask 0xcca0 [5,7,10,11,14,15]

Device 00:14.0 (slot 0): 
  INTA: link 0x11, irq mask 0xcca0 [5,7,10,11,14,15]

Device 00:10.0 (slot 0): 
  INTB: link 0x13, irq mask 0xcca0 [5,7,10,11,14,15]

Device 04:05.0 (slot 0): 
  INTA: link 0x01, irq mask 0xcca0 [5,7,10,11,14,15]
  INTB: link 0x02, irq mask 0xcca0 [5,7,10,11,14,15]

Device 00:0d.0 (slot 0): 

Device 00:0e.0 (slot 8): 
  INTA: link 0x0e, irq mask 0xcca0 [5,7,10,11,14,15]

Device 00:0f.0 (slot 9): 
  INTA: link 0x0d, irq mask 0xcca0 [5,7,10,11,14,15]

Device 00:10.0 (slot 0): 
  INTA: link 0x01, irq mask 0xcca0 [5,7,10,11,14,15]
  INTB: link 0x02, irq mask 0xcca0 [5,7,10,11,14,15]
  INTC: link 0x03, irq mask 0xcca0 [5,7,10,11,14,15]
  INTD: link 0x04, irq mask 0xcca0 [5,7,10,11,14,15]

Device 07:06.0 (slot 6): 
  INTA: link 0x03, irq mask 0xcca0 [5,7,10,11,14,15]
  INTB: link 0x04, irq mask 0xcca0 [5,7,10,11,14,15]
  INTC: link 0x01, irq mask 0xcca0 [5,7,10,11,14,15]
  INTD: link 0x02, irq mask 0xcca0 [5,7,10,11,14,15]

Device 07:07.0 (slot 5): 
  INTA: link 0x04, irq mask 0xcca0 [5,7,10,11,14,15]
  INTB: link 0x01, irq mask 0xcca0 [5,7,10,11,14,15]

Device 07:08.0 (slot 4): 
  INTA: link 0x01, irq mask 0xcca0 [5,7,10,11,14,15]
  INTB: link 0x02, irq mask 0xcca0 [5,7,10,11,14,15]
  INTC: link 0x03, irq mask 0xcca0 [5,7,10,11,14,15]
  INTD: link 0x04, irq mask 0xcca0 [5,7,10,11,14,15]

Device 07:09.0 (slot 3): 
  INTA: link 0x02, irq mask 0xcca0 [5,7,10,11,14,15]
  INTB: link 0x03, irq mask 0xcca0 [5,7,10,11,14,15]
  INTC: link 0x04, irq mask 0xcca0 [5,7,10,11,14,15]
  INTD: link 0x01, irq mask 0xcca0 [5,7,10,11,14,15]

Device 07:0a.0 (slot 2): 
  INTA: link 0x03, irq mask 0xcca0 [5,7,10,11,14,15]
  INTB: link 0x04, irq mask 0xcca0 [5,7,10,11,14,15]
  INTC: link 0x01, irq mask 0xcca0 [5,7,10,11,14,15]
  INTD: link 0x02, irq mask 0xcca0 [5,7,10,11,14,15]

Device 00:05.0 (slot 0): 
  INTA: link 0x07, irq mask 0xcca0 [5,7,10,11,14,15]
  INTB: link 0x08, irq mask 0xcca0 [5,7,10,11,14,15]
  INTC: link 0x05, irq mask 0xcca0 [5,7,10,11,14,15]
  INTD: link 0x06, irq mask 0xcca0 [5,7,10,11,14,15]

Device 00:04.0 (slot 0): 
  INTA: link 0x05, irq mask 0xcca0 [5,7,10,11,14,15]
  INTB: link 0x06, irq mask 0xcca0 [5,7,10,11,14,15]
  INTC: link 0x07, irq mask 0xcca0 [5,7,10,11,14,15]
  INTD: link 0x08, irq mask 0xcca0 [5,7,10,11,14,15]

Device 00:03.0 (slot 0): 
  INTA: link 0x05, irq mask 0xcca0 [5,7,10,11,14,15]
  INTB: link 0x06, irq mask 0xcca0 [5,7,10,11,14,15]
  INTC: link 0x07, irq mask 0xcca0 [5,7,10,11,14,15]
  INTD: link 0x08, irq mask 0xcca0 [5,7,10,11,14,15]

Device 03:00.0 (slot 7): 

Device 00:02.0 (slot 0): 
  INTA: link 0x06, irq mask 0xcca0 [5,7,10,11,14,15]
  INTB: link 0x07, irq mask 0xcca0 [5,7,10,11,14,15]
  INTC: link 0x08, irq mask 0xcca0 [5,7,10,11,14,15]
  INTD: link 0x05, irq mask 0xcca0 [5,7,10,11,14,15]

Device 01:00.0 (slot 7): 

Interrupt router at 00:0a.0: unknown vendor 0x10de device 0x0260
  PIRQ? (link 0x01): unrouted?
  PIRQ? (link 0x0a): irq 7
  PIRQ? (link 0x0c): unrouted?
  PIRQ? (link 0x0d): unrouted?
  PIRQ? (link 0x0e): irq 5
  PIRQ? (link 0x0f): irq 11
  PIRQ? (link 0x11): irq 11
  PIRQ? (link 0x13): irq 10
  PIRQ? (link 0x02): irq 10
  PIRQ? (link 0x03): unrouted?
  PIRQ? (link 0x04): unrouted?
  PIRQ? (link 0x05): unrouted?
  PIRQ? (link 0x06): unrouted?
  PIRQ? (link 0x07): irq 10
  PIRQ? (link 0x08): unrouted?
  PIRQ? (link 0x09): irq 10
  
Slot 7 is not assigned.
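
Note on reading the table above: each "irq mask" value printed by dump_pirq.pl is followed by the IRQ numbers it stands for, and the two agree if bit n of the mask is read as "IRQ n is allowed". The following is a minimal Python sketch of that decoding, assuming a 16-bit mask; the function name decode_irq_mask is made up for illustration and is not part of dump_pirq.pl.

```python
def decode_irq_mask(mask: int) -> list:
    """Return the IRQ numbers whose bits are set in a 16-bit PIRQ mask."""
    # Bit n set means IRQ n is a permitted routing target for the link.
    return [irq for irq in range(16) if mask & (1 << irq)]


# Mask reported for every routed link in the dump above.
print(decode_irq_mask(0xCCA0))  # [5, 7, 10, 11, 14, 15]
# "PCI exclusive interrupt mask: 0x0000 []" decodes to an empty list.
print(decode_irq_mask(0x0000))  # []
```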

per@ubuntu:~$ cat /proc/interrupts 
           CPU0       CPU1       
  0:     230367      29878    XT-PIC-XT        timer
  1:          0         10   IO-APIC-edge      i8042
  2:          0          0    XT-PIC-XT        cascade
  5:          1        176   IO-APIC-edge      sata_nv
  7:      29465     230044   IO-APIC-edge      ehci_hcd:usb1
  8:          0          0   IO-APIC-edge      rtc
  9:          0         27   IO-APIC-edge      acpi
 10:          0          0   IO-APIC-edge      HDA Intel
 11:          3        890   IO-APIC-edge      ohci_hcd:usb2, eth0
 12:          0        528   IO-APIC-edge      i8042
 14:          0        129   IO-APIC-edge      ide0
NMI:          0          0 
LOC:     260177     260154 
ERR:          0
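
To compare this listing with the one attached later, it can help to reduce it to an IRQ-to-controller mapping. The sketch below is a rough illustration, assuming the two-CPU layout shown above (numeric IRQ label, per-CPU counters, then the controller column); controllers_by_irq and SAMPLE are hypothetical names, and the sample rows are copied from the listing.

```python
def controllers_by_irq(text: str) -> dict:
    """Map each numeric IRQ label to the controller column of /proc/interrupts."""
    result = {}
    for line in text.splitlines():
        label, sep, rest = line.partition(":")
        label = label.strip()
        # Skip the CPU header line and non-numeric rows such as NMI, LOC, ERR.
        if not sep or not label.isdigit():
            continue
        fields = rest.split()
        # Skip the per-CPU counters; the next field is the controller name.
        i = 0
        while i < len(fields) and fields[i].isdigit():
            i += 1
        if i < len(fields):
            result[label] = fields[i]
    return result


SAMPLE = """\
           CPU0       CPU1
  0:     230367      29878    XT-PIC-XT        timer
  2:          0          0    XT-PIC-XT        cascade
  5:          1        176   IO-APIC-edge      sata_nv
NMI:          0          0
"""

print(controllers_by_irq(SAMPLE))
# {'0': 'XT-PIC-XT', '2': 'XT-PIC-XT', '5': 'IO-APIC-edge'}
```

Run over the full listing, this shows IRQ 0 and 2 under XT-PIC-XT and every other numbered IRQ under IO-APIC-edge.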

Any suggestions?
Comment 1 Len Brown 2007-07-11 19:12:42 UTC
The most important thing here is the boot hang, so let's focus on that...

The fact that "nolapic" makes the problem go away suggests
that the LAPIC timer is being used for clock ticks, yet
the LAPIC timer isn't working.

This may be an AMD issue in C-states, so please test
if it goes away if you boot with "idle=poll"

If that isn't the problem, then it may be related to
a different clock source, such as the PIT. Various
Nvidia chipsets have this screwed up in different ways, so
try both acpi_use_timer_override and then acpi_skip_timer_override
to see if that is related to the problem.

I'm not going to worry about how your device interrupts
are not working with pci=noacpi, as that is by definition
a non ACPI problem:-)

finally, this bug is filed against 2.6.22-rc7, but it also states
that 2.6.22-rc7 was the last known working version.  Please clarify
if 2.6.22 is working or failing, and if any recent version before that
is working w/o any cmdline workarounds.
Comment 2 Len Brown 2007-07-11 19:16:50 UTC
also, please attach the output from acpidump,
and the complete dmesg from a successful boot with nolapic,
or with whatever of the params above that work.
Comment 3 Per Toft 2007-07-12 06:47:35 UTC
Created attachment 12007 [details]
ACPI dump (binary)
Comment 4 Per Toft 2007-07-12 06:48:09 UTC
Created attachment 12008 [details]
ACPI dump (hex)
Comment 5 Per Toft 2007-07-12 06:48:35 UTC
Created attachment 12009 [details]
dmesg with single pci=noacpi
Comment 6 Per Toft 2007-07-12 06:49:03 UTC
Created attachment 12010 [details]
/proc/interrupts with single pci=noacpi kernel options
Comment 7 Per Toft 2007-07-12 06:57:29 UTC
Kernel: 2.6.22-7-generic. I will try the latest kernel later today...
I have tried to boot with the different configurations. I used single user mode to ensure no further problems in all cases:

single idle=poll - System freeze when: "Loading hardware drivers" \n "error receiving uevent message: No buffer space available"
I have iterated this step five times with the same result.
 
single acpi_use_timer_override - System freeze when: "Setting the system clock"
Iterated five times. All results the same...

single acpi_skip_timer_override - System freeze when: "Setting the system clock"
Iterated five times. All results the same...

single nolacpi - System boots in single user mode.
Iterated five times. One success and four other random freezes.
It might be coincident that it worked previously with this option.

single pci=noacpi - Boots successfully, but with edge triggered interrupts and rare random lockups.

I have attached a dmesg and the output from /proc/interrupts

Hope this will help.

With regards
M.Sc.EE. Per Toft

 
Comment 8 Michael Evans 2007-07-21 02:10:53 UTC
These other bug threads may help.

http://bugzilla.kernel.org/show_bug.cgi?id=8368
http://bugzilla.kernel.org/show_bug.cgi?id=8219

As of gentoo's 2.6.22-gentoo-r1 #1 SMP compiled Fri Jul 20 2007, I still require acpi_use_timer_override, however all the details are in the later thread.
Comment 9 Fu Michael 2007-10-10 01:27:38 UTC
(In reply to comment #7)
> single nolacpi - System boots in single user mode.
         ^^^^^^^
Just to make sure this is only a typo and not what was really used as the kernel parameter: the correct parameter is nolapic, not nolacpi...

> Iterated five times. One success and four other random freezes.
> It might be coincident that it worked previously with this option.
>
From your description this does not look like a local APIC issue. Could you please try the kernel parameter acpi=noirq (and do not use pci=noacpi)?

assign it to Len.  
Comment 10 Per Toft 2007-10-12 01:10:11 UTC
Correct!
It should be:
single nolapic 

I'll try booting with apic=noirq later and see the results.
Comment 11 Fu Michael 2007-11-12 18:37:05 UTC
Per Toft,

Any update on the test result with "acpi=noirq"?

note: NOT "apic=noirq"...
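
Since two mistyped parameters appear in this thread (nolacpi for nolapic, apic=noirq for acpi=noirq), a quick check of a command line against the parameters discussed here may save a round trip. A minimal Python sketch, assuming only the parameters named in this bug; THREAD_PARAMS is not a complete list of kernel options and likely_typos is a hypothetical helper. On a running system the string could be taken from /proc/cmdline.

```python
import difflib

# Assumption: only the boot parameters mentioned in this bug report.
THREAD_PARAMS = [
    "single",
    "nolapic",
    "pci=noacpi",
    "acpi=noirq",
    "idle=poll",
    "apic=debug",
    "acpi_use_timer_override",
    "acpi_skip_timer_override",
]


def likely_typos(cmdline: str) -> dict:
    """Return {token: suggestion} for tokens that closely resemble a known parameter."""
    suspects = {}
    for token in cmdline.split():
        if token in THREAD_PARAMS:
            continue
        # Only report tokens that are a near miss of a parameter from the list;
        # unrelated options (root=..., ro, quiet) are left alone.
        close = difflib.get_close_matches(token, THREAD_PARAMS, n=1, cutoff=0.8)
        if close:
            suspects[token] = close[0]
    return suspects


print(likely_typos("single nolacpi"))     # {'nolacpi': 'nolapic'}
print(likely_typos("single apic=noirq"))  # {'apic=noirq': 'acpi=noirq'}
print(likely_typos("single nolapic"))     # {}
```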
Comment 12 Fu Michael 2007-11-18 21:49:58 UTC
Is this a duplicate of bug #8219? Could you please try booting with the kernel parameter acpi_use_timer_override?
Comment 13 ykzhao 2007-11-18 23:58:15 UTC
It would be more useful if the boot option apic=debug were added alongside the options in comment #11 and #12.
Thanks.
Comment 14 ykzhao 2007-12-19 23:15:39 UTC
Created attachment 14134 [details]
dump tool to get bios PRT/MPS table

Will you please use the attached tool to dump the BIOS PRT/MPS table?
Comment 15 ykzhao 2007-12-19 23:20:14 UTC
Will you please attach the output of lspci -vxxx ?
Thanks.
Comment 16 Len Brown 2007-12-28 21:55:28 UTC
no response since 2007-10-12, please re-open if this is still an issue
with the latest upstream kernel.org kernel.