Most recent kernel where this bug did not occur: unknown; it also occurred on 2.6.12.5 and 2.6.14.2
Distribution: Gentoo
Hardware Environment: PCChips M811 motherboard, Duron 1800, 256M
Software Environment: minimal install of Gentoo with raidtools and smartmontools
Problem Description: When I modprobe siimage, it takes about three minutes, producing "Disabling IRQ #10" messages about 6 times, and dmesg has 8 __report_bad_irq tracebacks.
Steps to reproduce: Attach a drive (I used a Maxtor 200G Diamond Max Plus, 8M/7200, ATA133, CS) to the master connector on the second port of the fourth card, reboot, modprobe siimage, and wait 3 minutes for modprobe to finish.
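To make the symptom counts above easy to re-check against a saved log, here is a minimal sketch that tallies the two strings quoted in the description. The function name `summarize_dmesg` is hypothetical, and the sample text is synthetic, built only from those quoted strings rather than copied from the attached dmesg. Counting every occurrence of `__report_bad_irq` as one traceback is an assumption about how the traces are printed.

```python
import re
from collections import Counter
from typing import Tuple

# Matches the message quoted in the report, e.g. "Disabling IRQ #10".
DISABLE_RE = re.compile(r"Disabling IRQ #(\d+)")


def summarize_dmesg(text: str) -> Tuple[Counter, int]:
    """Return (disable messages per IRQ number, count of __report_bad_irq)."""
    disabled = Counter(int(m.group(1)) for m in DISABLE_RE.finditer(text))
    tracebacks = text.count("__report_bad_irq")
    return disabled, tracebacks


if __name__ == "__main__":
    # Synthetic sample, NOT real kernel output.
    sample = "\n".join([
        "__report_bad_irq",
        "Disabling IRQ #10",
        "__report_bad_irq",
        "Disabling IRQ #10",
    ])
    per_irq, traces = summarize_dmesg(sample)
    print(dict(per_irq), traces)
```

In practice the text would come from a file holding the output of dmesg, read with `open(path).read()`.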
Created attachment 7049 [details] dmesg
Created attachment 7050 [details] /proc/interrupts
Created attachment 7051 [details] zcat /proc/config.gz
This is more likely an acpi or PCI problem than an IDE problem. Your dmesg output wasn't complete. Could you please increase CONFIG_LOG_BUF_SHIFT and try again, so we can see all the dmesg output? Also, are you able to identify an earlier 2.6 kernel which didn't have this bug? It helps us in identifying what change broke things, if we indeed broke it. Thanks.
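For reference, the kernel log buffer holds 2^CONFIG_LOG_BUF_SHIFT bytes, so each increment of the option doubles the available space. A minimal sketch for picking a value from the amount of output you expect follows; the helper names and the 100 KiB figure are illustrative assumptions, not values taken from this report.

```python
def log_buf_bytes(shift: int) -> int:
    """Size in bytes of the kernel log buffer for a given CONFIG_LOG_BUF_SHIFT."""
    return 1 << shift


def min_log_buf_shift(needed_bytes: int) -> int:
    """Smallest shift whose buffer holds at least needed_bytes (hypothetical helper)."""
    if needed_bytes < 1:
        raise ValueError("needed_bytes must be positive")
    # bit_length of (n - 1) is the exponent of the next power of two >= n.
    return (needed_bytes - 1).bit_length()


if __name__ == "__main__":
    # Assumed example: roughly 100 KiB of boot messages.
    shift = min_log_buf_shift(100 * 1024)
    print(shift, log_buf_bytes(shift))
```

The result still has to fall within the range that the Kconfig entry of your kernel version accepts.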
Created attachment 7064 [details] longer dmesg
Also report if booting with acpi=off fixes it.
This machine has had this problem since it was built. acpi=off does not fix it either. I am guessing that the BIOS is doing something stupid, like leaving the last drive enabled with a pending interrupt while turning off the board interrupt. The match code in ide-probe looks like it should prevent IRQ storms, but it seems to apply only to configured interfaces, and I guess the IRQ 10 storm comes from the second channel of the card, which (I think) is not initialized until after the IRQ is turned on. At that point it will not hit the match logic but will generate the 800000 interrupts.
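The ordering problem guessed at above can be illustrated with a toy model. This is only a sketch under stated assumptions: it mimics the kernel's spurious-interrupt check (once 100,000 interrupts have been counted on a line, the line is disabled if more than 99,900 of them went unhandled), and the `IrqLine` and `Channel` classes and the `run` helper are hypothetical stand-ins, not the siimage or ide-probe code.

```python
from typing import Callable, List, Tuple

WINDOW = 100_000          # interrupts counted before the check runs
UNHANDLED_LIMIT = 99_900  # more unhandled than this disables the line


class Channel:
    """Hypothetical IDE channel that may hold a pending interrupt."""

    def __init__(self, pending: bool = False) -> None:
        self.pending = pending

    def handler(self) -> bool:
        """Claim and clear the interrupt if this channel raised it."""
        if not self.pending:
            return False
        self.pending = False
        return True


class IrqLine:
    """Toy model of a shared, level-triggered IRQ line."""

    def __init__(self) -> None:
        self.handlers: List[Callable[[], bool]] = []
        self.count = 0
        self.unhandled = 0
        self.disabled = False

    def raise_irq(self) -> None:
        if self.disabled:
            return
        # Every registered handler on a shared line gets called.
        handled = any([h() for h in self.handlers])
        self.count += 1
        if not handled:
            self.unhandled += 1
        if self.count >= WINDOW:
            if self.unhandled > UNHANDLED_LIMIT:
                self.disabled = True  # the "Disabling IRQ #N" case
            self.count = 0
            self.unhandled = 0


def run(second_channel_registered: bool) -> Tuple[int, bool]:
    """Return (interrupts delivered, whether the line ended up disabled)."""
    line = IrqLine()
    first, second = Channel(), Channel(pending=True)
    line.handlers.append(first.handler)
    if second_channel_registered:
        line.handlers.append(second.handler)
    delivered = 0
    # The device keeps the line asserted until someone clears it.
    while second.pending and not line.disabled:
        line.raise_irq()
        delivered += 1
    return delivered, line.disabled


if __name__ == "__main__":
    print(run(second_channel_registered=False))
    print(run(second_channel_registered=True))
```

In the model, a handler registered for the second channel clears the pending interrupt on the first delivery, while leaving it unregistered lets the line storm until the check disables it. If each of the eight tracebacks in the report corresponds to one such window, that would be consistent with the roughly 800000 interrupts mentioned, though that is an inference rather than something the logs confirm.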
Is this still a problem with current kernels? Does booting with "irqpoll" help?
Your question arrives while I am about a 6-hour drive from the computer :( It still occurred as of a few weeks ago, when I last tried siimage. I did not check closely, since it appeared to sit for several minutes even with (I think) a 2.6.20.X or 2.6.21.X kernel; I don't remember which of these two I tried first, and I would have to check the logs. I replaced the motherboard with an EPOX EP-8RDA3+PRO and it still occurred. I flashed all the PCI cards with the generic non-RAID version of the expansion ROMs from SIS in case that was the problem. I was able to get it mostly working via the pata_sil module, so that is the one I am using currently. Since it was working, I kept expanding the system, and I have expanded it so much that I would need to apply that patch for extra IDE buses to fully test it. That said, the "sit in an IRQ storm for 3 minutes" test is easy to see even without the patch. I seem to recall trying irqpoll, noacpi, routeirq and several others in various combinations without success. I am willing to resume testing if that would help, or to close the bug and mark it overtaken by pata_sil, since I have a workaround.
Sorry about the "rejected invalid"; I did not intend to change the status.
Would it be possible to give 2.6.24-rc6-mm1 a try? It has heavily revamped IDE probing code (also, countless bugfixes have gone in since the 2.6.21 kernel version).
Created attachment 14437 [details] copy of minicom bootlog panic.txt
2.6.24-rc6-mm1 does not seem quite ready yet. I patched 2.6.23 -> 2.6.24-rc6 -> 2.6.24-rc6-mm1, copied .config from 2.6.23.9, and ran make oldconfig. It failed to compile with NFS on, so I turned NFS off. It won't boot: it panics in __find_get_block/__getblk. Is there something else I should try? I have been running with pata_sil on 2.6.23.9. Looking over my notes, I seem to remember that it appeared to be a problem with enabling the board interrupt before initializing the second channel: something (the BIOS?) may have left a pending interrupt on the second channel. Since the board is enabled but the second channel is not configured yet, the interrupt is not cleared and stays active. About 800000 interrupts later it dies due to the timeout. Then the second channel is configured and has more trouble, since the interrupt is disabled...
Care to try 2.6.25-rc2? (It has the IDE patches that were previously in -mm.)
Any feedback?
I'm closing it for now. [ Please re-open if the problem still happens with recent kernels. ]