In openSUSE 11.4, and also in Ubuntu 10.14, my video card

S3 UniChrome Pro P4M800 Pro PCI:*(0:1:0:0) 1106:3344:1734:109b rev 1, Mem @ 0xf0000000/67108864, 0xd1000000/16777216

has IRQ 0 assigned. In openSUSE 11.3 and earlier it has IRQ 16.

I can use the system, but I have to click or move the mouse from time to time because the system freezes. Adding the boot parameters pci=noacpi acpi=off freezes the system immediately after the boot splash appears. There is no option in the BIOS for ACPI/IRQ settings. The computer is a Fujitsu-Siemens Amilo Pro V2030D.

Reproducible: Always

"Yeah, this looks like a very broken irq assignment code bug:
-pci 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
+pci 0000:01:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 0"

More info:
/proc/interrupts
dmesg
lspci -v -s 01:00.0
acpidump
dmidecode
lspci -nn
It's great that the kernel bugzilla is back.

Can you please verify whether the problem still exists in the latest upstream kernel? If it does:
please attach the output of acpidump,
please attach the output of lspci -vvxxx -s 01:00.0,
please attach the dmesg output after boot in both the working and the broken kernel.
The problem still exists. The attachment is a zipped file with the requested outputs.
Latest kernel tested: 3.2.0-7.1
Working kernel version: 2.6.34-12

There is more info on the openSUSE bugzilla:
https://bugzilla.novell.com/show_bug.cgi?id=676068

####################################################
Thomas Renninger 2011-12-21 08:08:37 UTC

This has been briefly discussed with Eric Biederman (unfortunately no mailing list was in CC):

> On Sunday 17 April 2011 02:27:51 Eric W. Biederman wrote:
>> Thomas Renninger <trenn@suse.de> writes:
>>
>> But reading the dmesg I can see clearly that the timer is not working
>> on that irq pin pair.
> Not sure I understood this, we have:
> - irq0 override happens
> - irq0 override does not happen

In both cases it appears the irq0 override happens. In the old code it was just more confusing, and the override happened poorly enough that it was still possible to use gsi 16 as linux irq 16. I don't remember the bugs well enough to remember how that could have happened.

> But in both cases the timer does not work, but in one it at least fall
> back to working lapic timer?

We initially fall back to finding the timer through virtual wire mode, and in older kernel we switch from virtual wire mode to the lapic timer.

> But in both cases one sees a sane number of timer interrupts happening
> on irq0?

Ultimately.

> And it's about this line:
> ACPI: IRQ0 used by override.
> not the irq 16->0 override, right?

They should be different reports of the same mechanism at work.

>> In both kernels I see.
>> [ 0.028398] ..TIMER: vector=0x30 apic1=0 pin1=16 apic2=-1 pin2=-1
>> [ 0.032000] ..MP-BIOS bug: 8254 timer not connected to IO-APIC
>> [ 0.032000] ...trying to set up timer (IRQ0) through the 8259A ...
>> [ 0.032000] ..... (found apic 0 pin 16) ...
>> [ 0.032000] ....... failed.
>> [ 0.032000] ...trying to set up timer as Virtual Wire IRQ...
>> [ 0.075529] ..... works.
>>
>> The vga card does not claim the interrupt in the newer kernels. I
>> expect that is because the linux irq became 0, which we use to signal
>> no irq has been assigned.
> And 0 (the override) is wrong, it must be irq 16 for the vga card,
> right?

The VGA card could probably work at any other linux irq number. The problem is that, because we know that 0 is only used by the ISA timer irq, we have special-cased 0 to also mean no irq has been assigned. Except in very rare cases like this, where the BIOS was given the ability to screw up with irq overrides and then used it.

> So there are two unrelated irq problems:
> 1) vga/agp
> 2) timer

Roughly. The BIOS says the same irq is for both uses, but that irq input does not work as a timer input.
####################################################
Created attachment 72214 [details] output of vga irq (lspci, dmesg, acpidump, uname) on 3.2.0.7(bad) and 2.6.34-12(ok)
Feng, there is another Interrupt problem, would you please have a look at it?
Hi Szymon,

Could you please post the "/proc/interrupts" and full "lspci -vvvx" output, and your kernel config, for both the good and the bad kernel?

BTW, could you check which kernel is the last known good one besides this working 2.6.34: 2.6.35? 2.6.36? If you have time to do a bisect, that would be perfect :)

Thanks,
- Feng
Created attachment 73383 [details] lspci -vvvx
Created attachment 73384 [details] /proc/interrupts
About kernel versions: the last working kernel was the one from openSUSE 11.3 (2.6.34). The problem appears in openSUSE 11.4 (2.6.37.1).

About the config: in both cases it was the desktop kernel delivered with openSUSE.
openSUSE 11.3: http://kernel.opensuse.org/cgit/kernel-source/plain/config/i386/desktop?h=openSUSE-11.3
openSUSE 11.4: http://kernel.opensuse.org/cgit/kernel-source/plain/config/i386/desktop?h=openSUSE-11.4

Please tell me exactly which kernel I should test.
Created attachment 73385 [details]
acpi_debug.patch

Hi Szymon,

Could you test this debug patch and give me the dmesg log?

diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 4558f0d..8e06cef 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -106,8 +106,12 @@ static unsigned int gsi_to_irq(unsigned int gsi)
 	unsigned int irq = gsi + NR_IRQS_LEGACY;
 	unsigned int i;
 
+	printk("%s(): gsi = %d, gsi_top = %d, NR_IRQS_LEGACY = %d\n",
+		__func__, gsi, gsi_top, NR_IRQS_LEGACY);
+
 	for (i = 0; i < NR_IRQS_LEGACY; i++) {
 		if (isa_irq_to_gsi[i] == gsi) {
+			printk("%s(): we found a isa match, i = %d\n", __func__, i);
 			return i;
 		}
 	}
@@ -562,7 +566,9 @@ int acpi_register_gsi(struct device *dev, u32 gsi, int trigger, int polarity)
 	unsigned int plat_gsi = gsi;
 
 	plat_gsi = (*__acpi_register_gsi)(dev, gsi, trigger, polarity);
+	printk("%s(): plat_gsi = %d\n", __func__, plat_gsi);
 	irq = gsi_to_irq(plat_gsi);
+	printk("%s(): irq = %d\n", __func__, irq);
 
 	return irq;
 }
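For readers following along, here is a minimal userspace sketch of the lookup that the debug patch instruments. It is not the kernel code: `apply_override` and `model_gsi_to_irq` are hypothetical names, the table starts out as an identity mapping by assumption, and the fall-through for GSIs without an ISA match is simplified to "irq == gsi". It only illustrates why an override of ISA IRQ 0 to GSI 16 would make the VGA card's GSI 16 resolve to Linux IRQ 0.

```c
#define NR_IRQS_LEGACY 16

/* ISA IRQ -> GSI table. Identity by default (an assumption of this sketch). */
static unsigned int isa_irq_to_gsi[NR_IRQS_LEGACY] = {
	0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15
};

/* Hypothetical helper: record an interrupt source override,
 * i.e. "ISA IRQ source_irq is really wired to GSI global_irq". */
static void apply_override(unsigned int source_irq, unsigned int global_irq)
{
	if (source_irq < NR_IRQS_LEGACY)
		isa_irq_to_gsi[source_irq] = global_irq;
}

/* Simplified model of the lookup: an ISA match wins, otherwise the
 * GSI is used as the IRQ number unchanged (simplification). */
static unsigned int model_gsi_to_irq(unsigned int gsi)
{
	unsigned int i;

	for (i = 0; i < NR_IRQS_LEGACY; i++) {
		if (isa_irq_to_gsi[i] == gsi)
			return i;	/* GSI is claimed by a legacy ISA IRQ */
	}
	return gsi;
}
```

In this model, `model_gsi_to_irq(16)` yields 16 before any override is recorded, and 0 after `apply_override(0, 16)`, which matches the "GSI 16 -> IRQ 0" line from the original report.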
I think I may have found the root cause of this issue. Could you please try this patch with kernel 3.2 to see if it fixes the issue?

diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 8e06cef..0be07d9 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -420,6 +420,11 @@ acpi_parse_int_src_ovr(struct acpi_subtable_header * header,
 		return 0;
 	}
 
+	if (intsrc->source_irq == 0 && intsrc->global_irq == 16) {
+		printk(PREFIX "BIOS IRQ0 pin0 --> pin16 override ignored.\n");
+		return 0;
+	}
+
 	if (intsrc->source_irq == 0 && intsrc->global_irq == 2) {
 		if (acpi_skip_timer_override) {
 			printk(PREFIX "BIOS IRQ0 pin2 override ignored.\n");
Hi Szymon,

Regardless of whether the last patch fixes the issue, please add "apic=debug" to the kernel command line and post the dmesg.

Thanks,
- Feng
OK, it seems to be fine :)

The logs with apic=debug are below.

Should I send this patch to the openSUSE bugzilla as a fix?
Created attachment 73402 [details] dmesg with apic=debug after fix and debug patch
Created attachment 73403 [details] lspci -vvvx after fix patch
Created attachment 73404 [details] /proc/interrupts after pix patch
(In reply to comment #12)
> OK it seems to be fine :)
>
> below logs with apic=debug
>
> should I send this patch to openSUSE bugzilla as fix

Glad to know the patch works. You can send it to the openSUSE bugzilla, but it is surely a hack and not in good shape for upstream.

Could you post the "sudo dmidecode" output? Maybe we can add a quirk for your system.

- Feng
Created attachment 73435 [details] dmidecode after fix
I attached the dmidecode output.

This may be a stupid question, but I have been thinking about this "hack": is there any legitimate case where intsrc->source_irq is 0 while intsrc->global_irq is > 0? There is already a check later in the code for intsrc->global_irq == 2, so maybe there should be something like:

if (intsrc->source_irq == 0 && intsrc->global_irq > 0) {
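To make the difference between the two approaches concrete, here is a small standalone sketch. The struct and both function names are hypothetical and exist only for illustration; the field names mirror the `intsrc->source_irq` and `intsrc->global_irq` fields used in the patch. The first predicate models the per-pin checks (pin 2 plus the pin 16 hack), the second models the generalised condition proposed above.

```c
#include <stdbool.h>

/* Hypothetical stand-in for one interrupt source override entry. */
struct int_src_override {
	unsigned int source_irq;	/* legacy ISA IRQ being remapped */
	unsigned int global_irq;	/* GSI it is remapped to */
};

/* Per-pin checks: only the pin 2 and pin 16 cases are recognised. */
static bool narrow_is_timer_override(const struct int_src_override *o)
{
	return o->source_irq == 0 &&
	       (o->global_irq == 2 || o->global_irq == 16);
}

/* Generalised check: any remap of IRQ 0 to a nonzero GSI counts. */
static bool general_is_timer_override(const struct int_src_override *o)
{
	return o->source_irq == 0 && o->global_irq > 0;
}
```

With an entry such as `{0, 5}`, the narrow check returns false while the general one returns true, so a firmware that remaps the timer to some other pin would not need yet another special case. Whether ignoring every such override is safe on all hardware is a separate question that this sketch does not answer.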
Created attachment 73436 [details]
new acpi patch

Yes, that's similar to what I had in mind. Could you try the newly attached patch and post the results and dmesg?
Created attachment 73451 [details]
dmesg with acpi=defug, patched kernel

The patch is working :)
Comment on attachment 73451 [details]
dmesg with acpi=defug, patched kernel

I found this in dmesg:

[ 0.000000] Using APIC driver default
[ 0.000000] ------------[ cut here ]------------
[ 0.000000] WARNING: at /usr/src/linux-3.1.10-1.9/arch/x86/kernel/acpi/boot.c:1358 dmi_ignore_irq0_timer_override+0x2c/0x53()
[ 0.000000] Hardware name: AMILO PRO V2030
[ 0.000000] ati_ixp4x0 quirk not complete.
[ 0.000000] Modules linked in:
[ 0.000000] Pid: 0, comm: swapper Not tainted 3.1.10-1.9-desktop #1
[ 0.000000] Call Trace:
[ 0.000000] [<c0205433>] try_stack_unwind+0x163/0x180
[ 0.000000] [<c0204167>] dump_trace+0x47/0xf0
[ 0.000000] [<c020549b>] show_trace_log_lvl+0x4b/0x60
[ 0.000000] [<c02054c8>] show_trace+0x18/0x20
[ 0.000000] [<c06f4cff>] dump_stack+0x6d/0x72
[ 0.000000] [<c0248868>] warn_slowpath_common+0x78/0xb0
[ 0.000000] [<c0248933>] warn_slowpath_fmt+0x33/0x40
[ 0.000000] [<c0ae05e9>] dmi_ignore_irq0_timer_override+0x2c/0x53
[ 0.000000] [<c05d6c28>] dmi_check_system+0x28/0x40
[ 0.000000] [<c0ae0ead>] acpi_boot_init+0xa/0x60
[ 0.000000] [<c0ada20b>] setup_arch+0x636/0x6bf
[ 0.000000] [<c0ad7488>] start_kernel+0x7e/0x369
[ 0.000000] ---[ end trace 4eaa2a86a8e2da22 ]---
[ 0.000000] FUJITSU SIEMENS detected: Ignoring BIOS IRQ0 pin2 override
Yes, the warning message is expected. The root cause of this issue is the buggy firmware, which assigns GSI 16 to two IRQs; what the patch does is add a quirk that ignores one of the assignments. The OS shows this warning once it detects the quirk. If you really hate this warning, you can add "acpi_skip_timer_override" to your kernel command line.
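As a rough illustration of what "add a quirk" means here, the following userspace sketch matches machine identification strings against a small table and sets a flag when a known-bad system is found. Everything in it is an assumption made for illustration: `quirk_entry`, `timer_override_quirks`, `skip_timer_override` and `check_timer_quirks` are hypothetical names, and the vendor and product strings are simply taken from the dmesg output earlier in this thread. The kernel's own DMI matching (`dmi_check_system`) works on its own tables and is not reproduced here.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical quirk table entry: substrings expected in the
 * machine's vendor and product identification strings. */
struct quirk_entry {
	const char *sys_vendor;
	const char *product_name;
};

/* Strings taken from the dmesg output in this thread. */
static const struct quirk_entry timer_override_quirks[] = {
	{ "FUJITSU SIEMENS", "AMILO PRO V2030" },
};

/* Models the effect of the quirk: when set, the IRQ0 override
 * reported by the firmware would be ignored. */
static bool skip_timer_override;

/* Return true (and set the flag) if the machine matches a quirk entry. */
static bool check_timer_quirks(const char *vendor, const char *product)
{
	size_t i;
	size_t n = sizeof(timer_override_quirks) / sizeof(timer_override_quirks[0]);

	for (i = 0; i < n; i++) {
		const struct quirk_entry *q = &timer_override_quirks[i];

		if (strstr(vendor, q->sys_vendor) &&
		    strstr(product, q->product_name)) {
			skip_timer_override = true;
			return true;
		}
	}
	return false;
}
```

Substring matching is used so that a product string such as "AMILO PRO V2030D" still matches the "AMILO PRO V2030" entry; an unrelated machine leaves the flag untouched.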
A patch referencing this bug report has been merged in Linux v3.5-rc5:

commit f6b54f083cc66cf9b11d2120d8df3c2ad4e0836d
Author: Feng Tang <feng.tang@intel.com>
Date: Mon Jun 4 15:00:06 2012 +0800

ACPI: Add a quirk for "AMILO PRO V2030" to ignore the timer overriding
A patch referencing this bug report has been merged in Linux v3.5-rc5:

commit ae10ccdc3093486f8c2369d227583f9d79f628e5
Author: Feng Tang <feng.tang@intel.com>
Date: Mon Jun 4 15:00:04 2012 +0800

ACPI: Make acpi_skip_timer_override cover all source_irq==0 cases