Bug 10409
Description
Anielkis Herrera
2008-04-06 19:44:33 UTC
Will you please boot the system with "acpi=off" and attach the following outputs?
a. acpidump
b. lspci -vxxx
Thanks.
Created attachment 15641 [details]
output of "lspci -vxxx"
Created attachment 15642 [details]
acpidump
Please try pci=nommconf, or the latest kernel.
I already did, and had the same result.
How about the boot option 'nohpet'?
It works!!!! What is the problem? Can you explain it to me? I like to know how and why things happen.
Can you get the log just before the hang? You can use a serial port to get it, or just take a picture. The info is helpful for further debugging.
Please also try the boot option 'acpi_skip_timer_override' (without nohpet). It appears the HPET is counting, but there is no interrupt.
Created attachment 15670 [details]
booting without acpi=off and with debug
Created attachment 15671 [details]
booting without acpi=off and with debug (pic2)
Created attachment 15672 [details]
booting without acpi=off and with debug (pic3)
Created attachment 15673 [details]
booting without acpi=off and debug (pic1)
Created attachment 15674 [details]
booting without acpi=off and without debug (pic2)
Created attachment 15675 [details]
booting without acpi=off and without debug (pic3)
Created attachment 15676 [details]
booting without acpi=off and without debug (pic4)
I can't debug with serial.
Those are from the original errors: the first 3 with the option debug,
and the other 4 without it, to see more recent messages.
Can you try the boot option I mentioned in comment #9? I suspect the HPET isn't linked to IOAPIC pin 2 in this system.
I tried twice, but it doesn't boot.
Created attachment 15683 [details]
debug
Looks like my guess is wrong; the HPET just doesn't fire the legacy interrupt. How about the debug patch?
It works with the debug patch. The problem is there, in the function hpet_legacy_clockevent_register.
Created attachment 15684 [details]
debug
This confirms the issue is related to the HPET interrupt rather than HPET counting, but I need more info to debug the issue. Can you apply the debug patch and attach the dmesg output with it? I'd like to check some registers of the HPET.
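For readers following the register check requested here, a minimal sketch of how two HPET register values separate "counting" from "interrupt enabled". It assumes the bit layout from the IA-PC HPET specification (ENABLE_CNF and LEG_RT_CNF in the general configuration register, Tn_INT_ENB_CNF in the timer 0 configuration register). The struct and function names are hypothetical and are not taken from the debug patch attached to this bug.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit positions as defined in the IA-PC HPET specification. */
#define HPET_CFG_ENABLE  (1u << 0) /* ENABLE_CNF: main counter is running */
#define HPET_CFG_LEGACY  (1u << 1) /* LEG_RT_CNF: timer 0 replaces the PIT on IRQ0 */
#define HPET_TN_LEVEL    (1u << 1) /* Tn_INT_TYPE_CNF: 1 = level, 0 = edge */
#define HPET_TN_ENABLE   (1u << 2) /* Tn_INT_ENB_CNF: timer may raise interrupts */
#define HPET_TN_PERIODIC (1u << 3) /* Tn_TYPE_CNF: 1 = periodic mode */

/* Hypothetical helper type: decoded view of the two registers. */
struct hpet_view {
    bool counting;
    bool legacy_route;
    bool t0_irq_enabled;
    bool t0_level_triggered;
    bool t0_periodic;
};

/* Decode the general configuration and timer 0 configuration values. */
static struct hpet_view hpet_decode(uint32_t gen_cfg, uint32_t t0_cfg)
{
    struct hpet_view v;

    v.counting           = (gen_cfg & HPET_CFG_ENABLE) != 0;
    v.legacy_route       = (gen_cfg & HPET_CFG_LEGACY) != 0;
    v.t0_irq_enabled     = (t0_cfg & HPET_TN_ENABLE) != 0;
    v.t0_level_triggered = (t0_cfg & HPET_TN_LEVEL) != 0;
    v.t0_periodic        = (t0_cfg & HPET_TN_PERIODIC) != 0;
    return v;
}

/* True when the registers alone say the HPET should be delivering IRQ0. */
static bool hpet_expects_legacy_irq(const struct hpet_view *v)
{
    assert(v != NULL);
    return v->counting && v->legacy_route && v->t0_irq_enabled;
}
```

If hpet_expects_legacy_irq() is true for the dumped values and IRQ0 still never arrives, the fault lies outside these registers (for example in chipset or interrupt-controller routing), which is the direction the rest of this thread takes.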
Created attachment 15700 [details]
output of dmesg booting with the second dbg.patch
I took this from the initramfs: I added the debug option, then mounted / there and saved the results.
I'm trying to fix the DSDT. I followed http://gentoo-wiki.com/HOWTO_Fix_Common_ACPI_Problems and got 2 errors:
dsdt.dsl 142: Return (\WBYT (Local1, Local0, Local2))
Error 4059 - Called method returns no value ^
dsdt.dsl 158: Return (\WWRD (Local1, Local0, Local2))
Error 4059 - Called method returns no value ^
I'm on it now. ;)
Re comment #22: unfortunately this is just part of the log, and the info I want isn't in it. Please check if you can get the full log.
Re comment #23: I am sure this isn't DSDT related.
Created attachment 15706 [details]
obtained with the second debug patch
OK, I waited for the system to boot and obtained this.
I have no idea why the HPET legacy interrupt is missing. Venki, any idea?
Summary:
PIT timer is working.
HPET is counting, but the legacy interrupt is missing (calibrate_delay() hangs).
Boot option 'acpi_skip_timer_override' doesn't work.
The legacy interrupt is edge triggered.
Shaohua, I see this message in dmesg:
ATI board detected. Disabling timer routing over 8254.
That causes enable_8259A_irq(0) not to be called on this platform. Maybe that stops the interrupt from the PIC from being forwarded to the IOAPIC. No idea how the 8254 continues to work though...
Anielkis, can you try what Venki suggested? Just comment out one line as below, in file arch/x86/kernel/early-quirks.c:
static void __init ati_bugs(int num, int slot, int func)
{
#ifdef CONFIG_X86_IO_APIC
    if (timer_over_8254 == 1) {
        // timer_over_8254 = 0;
        printk(KERN_INFO "ATI board detected. Disabling timer routing over 8254.\n");
    }
#endif
}
Created attachment 15718 [details]
here it is
Hmm, please do not apply the debug patch from comment 21. Just comment out one line as I suggested in comment 28. Otherwise, we can't see the effect.
Well, without the debug patch from comment 21 it can't boot and stops on "spurious 8259A interrupt: IRQ7".
Created attachment 15737 [details]
with only that line commented
I took this by the old method of the camera ;)
I think we could blacklist this system, or we could have a mechanism in the kernel to detect whether the HPET interrupt is working. I'll do some investigation and get back to you.
Can you send me the 'dmidecode' output? We can blacklist the system.
Created attachment 15840 [details]
dmidecode
Created attachment 15879 [details]
workaround
This should work around the issue, but I don't know if the community will accept it. Anyway, please try it first.
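As an illustration of the blacklisting idea only (this is not the attached workaround patch), here is a userspace sketch of matching DMI strings against a small table. The table entry is a placeholder: the real vendor and product strings would come from the dmidecode output, and the type and function names are hypothetical. Substring matching is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-in for a DMI match table entry. */
struct dmi_entry {
    const char *sys_vendor;   /* substring expected in the DMI system vendor */
    const char *product_name; /* substring expected in the DMI product name  */
};

/* Placeholder strings; real values would be read from dmidecode output. */
static const struct dmi_entry hpet_blacklist[] = {
    { "EXAMPLE VENDOR", "EXAMPLE MODEL" },
};

/* Return true if both strings match any blacklist entry. */
static bool hpet_blacklisted(const char *vendor, const char *product)
{
    size_t i;
    size_t n = sizeof(hpet_blacklist) / sizeof(hpet_blacklist[0]);

    assert(vendor != NULL && product != NULL);
    for (i = 0; i < n; i++) {
        if (strstr(vendor, hpet_blacklist[i].sys_vendor) != NULL &&
            strstr(product, hpet_blacklist[i].product_name) != NULL)
            return true;
    }
    return false;
}
```

A boot path using something like this would skip HPET setup when hpet_blacklisted() returns true, which has the same effect as passing "nohpet" by hand on the affected machine.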
Created attachment 15882 [details]
output of dmesg
It works, here is my dmesg output.
What is the problem at 0.192668? It says a BIOS problem.. can it be the source?
No, it just impacts PCI config space read/write.
Thomas, can you look at this issue?
I am curious what the IOAPIC configuration looks like. Anielkis, can you please boot your machine with "hpet=disable debug apic=debug" on your kernel command line and provide the dmesg of that boot? Thanks.
Created attachment 16186 [details]
output of dmesg only with 'hpet=disable debug apic=debug'
The output file contains tons of zeros. Not really informative :) Can you please upload it again?
Created attachment 16200 [details]
new output
fixed??
Bad luck -- dmesg_output is readable but incomplete. Just for your interest I am looking for output like:

ENABLING IO-APIC IRQs
init IO_APIC IRQs
 IOAPIC[0]: Set routing entry (2-0 -> 0x30 -> IRQ 0 Mode:0 Active:0)
 IOAPIC[0]: Set routing entry (2-1 -> 0x31 -> IRQ 1 Mode:0 Active:0)
 ...
 IOAPIC[0]: Set routing entry (2-14 -> 0x3e -> IRQ 14 Mode:0 Active:0)
 IOAPIC[0]: Set routing entry (2-15 -> 0x3f -> IRQ 15 Mode:0 Active:0)
 IO-APIC (apicid-pin) 2-16, 2-17, 2-18, 2-19, 2-20, 2-21, 2-22, 2-23 not connected.

and

PCI: Using ACPI for IRQ routing
number of MP IRQ sources: 16.
number of IO-APIC #2 registers: 24.
testing the IO APIC.......................

IO APIC #2......
.... register #00: 02000000
.......    : physical APIC id: 02
.... register #01: 00178021
.......     : max redirection entries: 0017
.......     : PRQ implemented: 1
.......     : IO APIC version: 0021
.... register #02: 02000000
.......     : arbitration: 02
.... IRQ redirection table:
 NR Dst Mask Trig IRR Pol Stat Dmod Deli Vect:
 00 003 0    0    0   0   0    1    1    30
 01 003 0    0    0   0   0    1    1    31
 02 003 0    0    0   0   0    1    1    32
 ...
 15 000 1    0    0   0   0    0    0    00
 16 000 1    0    0   0   0    0    0    00
 17 000 1    0    0   0   0    0    0    00
IRQ to pin mappings:
IRQ0 -> 0:0
IRQ1 -> 0:1
IRQ2 -> 0:2
...
IRQ12 -> 0:12
IRQ13 -> 0:13
IRQ14 -> 0:14

which should be printed when you use the recommended debug options. But your dmesg starts beyond the place where this output usually occurs.
Created attachment 16218 [details]
screenshot
I took this by disabling vesa, using boot_delay and recording a video.
With vesafb the messages you need don't appear.
Do you need more pics, or is this what you need?
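The "Set routing entry" lines requested earlier in this thread have a fixed shape, so whether a captured log contains them can be checked mechanically instead of by eye. This is a minimal sketch, assuming the exact line format quoted in the request. The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record for one "Set routing entry" log line. */
struct ioapic_route {
    int ioapic;      /* index in IOAPIC[n] */
    int apic_id;     /* first number of the apicid-pin pair */
    int pin;         /* second number of the apicid-pin pair */
    unsigned vector; /* interrupt vector, printed in hex */
    int irq;
    int mode;
    int active;
};

/* Parse one log line; returns true only if all seven fields were found. */
static bool parse_route(const char *line, struct ioapic_route *r)
{
    int n;

    assert(line != NULL && r != NULL);
    n = sscanf(line,
               "IOAPIC[%d]: Set routing entry (%d-%d -> 0x%x -> IRQ %d Mode:%d Active:%d)",
               &r->ioapic, &r->apic_id, &r->pin, &r->vector,
               &r->irq, &r->mode, &r->active);
    return n == 7;
}
```

Running parse_route() over each line of a saved dmesg and counting the matches shows quickly whether the log starts early enough to include the IO-APIC setup.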
Thanks, but ... well, my intention was to get reliable debug data from your tests. Unfortunately taking pictures doesn't help much. (BTW, your screenshot indeed contains some parts of the desired output ;-) But to compare the dmesg output of several boots, screenshots are not very helpful. I think w/o proper debugging means (e.g. serial console) it's not possible to fix the IO-APIC configuration on your machine to get hpet working. Many similar issues with timer/hpet and IOAPIC pin assignment were reported in the past. Often the problems went away when a BIOS update was applied (if a BIOS update with a suitable fix was available...). It seems that you just have to live with a "broken" hpet on this machine. The question is whether it is acceptable to use "hpet=disable" or whether the blacklisting from Shaohua (comment #36) should be applied.
Anielkis, can you try boot option 'irqpoll'?
(In reply to comment #47)
> Anielkis, can you try boot option 'irqpoll'?
>
Shaohua, I don't have access to the HAIER now; in a month I will, and I'll try it.
hah, Anielkis, any updates?
HPET is usually not enabled in BIOS for SB4xx chipsets. Thus almost all systems with an SB4xx chipset do not provide an HPET ACPI table. On this system the vendor obviously decided to enable it. But I guess he just provided an ACPI table without setting up the proper bits to really enable the HPET. (There are some bits in the chipset that must be set to enable interrupts from the HPET, and they are usually disabled.) As ATI did not want to support HPET on SB400 it is not a good idea to activate it. Final solution is to use "nohpet" kernel option for this notebook. (As long as the vendor does not provide a fixed BIOS w/o HPET ACPI table.)
> Final solution is to use "nohpet" kernel option for this notebook.
> (As long as the vendor does not provide a fixed BIOS w/o HPET ACPI table.)
Shouldn't we HPET-blacklist the SB4xx in that case?
Thanks,
tglx
I'm late, sorry.. so, what are the parameters I have to check? Now I have kernel 2.6.26 compiled.
FYI, it might be worth retesting with 2.6.28-rc6. It contains a patch to enable hpet interrupts on SB400. (x86: hpet: modify IXP400 quirk to enable interrupts)
Any update on this bug? Anielkis Herrera, were you able to re-test with a post 2.6.28 kernel?
this bug is fixed on 2.6.28 and earlier.. since ubuntu-8.10 it can boot without "nohpet"
Anielkis, thanks for the response. Closing.