Bug 5911 - siimage.c interrupts screaming
Summary: siimage.c interrupts screaming
Status: REJECTED INSUFFICIENT_DATA
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: IDE
Hardware: i386 Linux
Importance: P2 normal
Assignee: Bartlomiej Zolnierkiewicz
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-01-17 12:12 UTC by James McMechan
Modified: 2008-12-07 05:30 UTC

See Also:
Kernel Version: 2.6.12.5-2.6.15.1, 2.6.20.x-2.6.21.x
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
dmesg (14.88 KB, text/plain)
2006-01-17 12:13 UTC, James McMechan
/proc/interrupts (489 bytes, text/plain)
2006-01-17 12:14 UTC, James McMechan
zcat /proc/config.gz (34.10 KB, text/plain)
2006-01-17 12:15 UTC, James McMechan
longer dmesg (17.75 KB, text/plain)
2006-01-18 06:19 UTC, James McMechan
copy of minicom bootlog panic.txt (20.48 KB, text/plain)
2008-01-13 11:05 UTC, James McMechan

Description James McMechan 2006-01-17 12:12:25 UTC
Most recent kernel where this bug did not occur: unknown; it also occurred on
2.6.12.5 and 2.6.14.2
Distribution: gentoo
Hardware Environment: pcchips M811 motherboard Duron 1800 256M
Software Environment: minimal install of gentoo with raidtools, smartmontools
Problem Description: when I modprobe siimage, it takes about three minutes,
printing "Disabling IRQ #10" about 6 times, and dmesg contains 8
__report_bad_irq tracebacks

Steps to reproduce: attach a drive to the master connector on the second port
of the fourth card (I used a Maxtor 200G Diamond Max Plus 8M/7200 ATA133 CS),
reboot, modprobe siimage, and wait about 3 minutes for modprobe to finish
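
To put a number on the storm while modprobe is stuck, the counter for IRQ 10 in /proc/interrupts can be read before and after. The following is only a minimal sketch of such a parser; the function name `irq_count` and the sample text are hypothetical and made up for illustration, not taken from the attached /proc/interrupts.

```python
# Hypothetical sample in the usual /proc/interrupts layout; the numbers and
# device names are invented for illustration, not copied from the attachment.
SAMPLE = """\
           CPU0
  0:     123456          XT-PIC  timer
 10:     800000          XT-PIC  ide2, ide3
NMI:          0
"""


def irq_count(proc_interrupts_text, irq):
    """Return the interrupt count for one IRQ line summed over all CPUs,
    or None if that IRQ does not appear in the text."""
    for line in proc_interrupts_text.splitlines():
        label, sep, rest = line.partition(":")
        if not sep or label.strip() != str(irq):
            continue  # header line, or a different IRQ
        total = 0
        for field in rest.split():
            if not field.isdigit():
                break  # per-CPU counters end at the controller name
            total += int(field)
        return total
    return None
```

On a live system the text would come from reading /proc/interrupts twice, a few seconds apart; the difference between the two counts divided by the interval gives a rough interrupt rate for the line.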
Comment 1 James McMechan 2006-01-17 12:13:52 UTC
Created attachment 7049
dmesg
Comment 2 James McMechan 2006-01-17 12:14:56 UTC
Created attachment 7050
/proc/interrupts
Comment 3 James McMechan 2006-01-17 12:15:29 UTC
Created attachment 7051
zcat /proc/config.gz
Comment 4 Andrew Morton 2006-01-17 14:31:35 UTC
This is more likely an acpi or PCI problem than an IDE problem.

Your dmesg output wasn't complete.  Could you please increase
CONFIG_LOG_BUF_SHIFT and try again, so we can see all the dmesg output?

Also, are you able to identify an earlier 2.6 kernel which didn't have this bug?
It helps us in identifying what change broke things, if we indeed broke it.

Thanks.
Comment 5 James McMechan 2006-01-18 06:19:29 UTC
Created attachment 7064
longer dmesg
Comment 6 Alan 2006-01-18 06:48:39 UTC
Also report if booting with acpi=off fixes it.

Comment 7 James McMechan 2006-01-18 12:40:10 UTC
This machine has had this problem since it was built.
acpi=off does not fix it either...

I am guessing that the BIOS is doing something stupid, like leaving the last drive
enabled with a pending interrupt, and turning off the board interrupt.

The match code in ide-probe looks like it should prevent IRQ storms, but it
seems to apply only to configured interfaces, and I guess the IRQ 10 storm comes
from the second channel of the card, which (I think) is not initialized until
after the IRQ is turned on. At that point it will not hit the match logic but
will generate the 800000 interrupts.
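
The hypothesis above can be sketched as a toy model. This is not kernel code: the `Channel` class, the function names, and the way the handler is modeled are all hypothetical. The two constants are assumptions based on my reading of the kernel's spurious-interrupt check (an IRQ is disabled if more than 99900 of the last 100000 interrupts went unhandled), which would be consistent with 8 tracebacks and roughly 800000 interrupts, though nothing in this report confirms that.

```python
from dataclasses import dataclass

# Assumed values, modeled on the kernel's spurious IRQ detection.
SPURIOUS_WINDOW = 100_000     # interrupts counted before each check
SPURIOUS_THRESHOLD = 99_900   # unhandled count above which the IRQ is disabled


@dataclass
class Channel:
    name: str
    configured: bool  # has the driver set this channel up yet?
    pending: bool     # is the channel asserting the shared IRQ line?


def handler(channels):
    """Shared handler: only configured channels are serviced and cleared."""
    handled = False
    for ch in channels:
        if ch.configured and ch.pending:
            ch.pending = False
            handled = True
    return handled


def run_until_quiet_or_disabled(channels, max_interrupts=1_000_000):
    """Deliver interrupts while the level-triggered line stays asserted.
    Returns (interrupts delivered, whether the IRQ was disabled)."""
    delivered = count = unhandled = 0
    while any(ch.pending for ch in channels) and delivered < max_interrupts:
        delivered += 1
        count += 1
        if not handler(channels):
            unhandled += 1
        if count == SPURIOUS_WINDOW:
            if unhandled > SPURIOUS_THRESHOLD:
                return delivered, True  # "Disabling IRQ #n"
            count = unhandled = 0
    return delivered, False
```

In this model, a pending interrupt on an unconfigured second channel is never cleared, so the line stays asserted until the detector gives up and disables the IRQ. If the same channel is configured before the IRQ is enabled, the first interrupt clears it and the line goes quiet.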
Comment 8 Alan 2007-06-18 08:06:57 UTC
Is this still a problem with current kernels? Does booting with "irqpoll" help?

Comment 9 James McMechan 2007-06-19 00:49:39 UTC
Your question arrives while I am about a 6 hour drive from the computer :(

It still occurred as of a few weeks ago, when I last tried siimage. I did not check closely, since it appeared to sit for several minutes even with (I think) a 2.6.20.X or 2.6.21.X kernel.

I don't remember which of the following two I did first; I would have to check the logs.

I replaced the motherboard with an EPOX EP-8RDA3+PRO; the problem still occurred.

I flashed all the PCI cards with the generic non-raid version of the expansion roms from SIS in case that was the problem.

I was able to get it mostly working via the pata_sil module, so that is the one I am currently using. Since it was working, I kept expanding the system.

I have expanded the system so much that I would need to apply that patch for extra IDE buses to fully test it. That said, the "sit in an IRQ storm for 3 minutes" behavior is easy to see even without the patch.

I seem to recall trying irqpoll, noacpi, routeirq, and several others in various combinations without success.

I am willing to resume testing if that would help. Otherwise, the bug can be closed and marked as overtaken by pata_sil, since I have a workaround.
Comment 10 James McMechan 2007-06-19 00:55:02 UTC
Sorry about the "rejected invalid" status; I did not intend to change it.
Comment 11 Bartlomiej Zolnierkiewicz 2008-01-07 15:52:00 UTC
Would it be possible to give 2.6.24-rc6-mm1 a try?  It has heavily revamped IDE probing code (countless bugfixes have also gone in since the 2.6.21 kernel).
Comment 12 James McMechan 2008-01-13 11:05:35 UTC
Created attachment 14437
copy of minicom bootlog panic.txt

2.6.24-rc6-mm1 does not seem quite ready yet.
I patched 2.6.23->2.6.24-rc6->2.6.24-rc6-mm1,
copied .config from 2.6.23.9,
and ran make oldconfig.
It failed to compile with nfs on,
so I turned nfs off.
It won't boot: it panics in __find_get_block/__getblk.
Is there something else I should try?
I have been running with pata_sil on 2.6.23.9.
Looking over my notes, I seem to remember that it appeared to be a problem with enabling the board interrupt before initializing the second channel, and something (the BIOS?) may have left a pending interrupt on the second channel.
Since the board is enabled but the second channel is not configured yet, the interrupt is not cleared and stays active. About 800000 interrupts later it dies due to the timeout. Then the second channel is configured and has more trouble, since the interrupt is disabled...
Comment 13 Bartlomiej Zolnierkiewicz 2008-02-16 11:44:16 UTC
Care to try 2.6.25-rc2? (It has the IDE patches that were previously in -mm.)
Comment 14 Alan 2008-10-09 09:31:32 UTC
Any feedback?
Comment 15 Bartlomiej Zolnierkiewicz 2008-12-07 05:30:10 UTC
I'm closing it for now.

[ Please re-open if the problem still happens with recent kernels. ]
