Bug 5279
Summary: | SiI3112 problems under load with 2.6.14-rc1-mm1 | ||
---|---|---|---|
Product: | IO/Storage | Reporter: | Nikos Ntarmos (ntarmos) |
Component: | Serial ATA | Assignee: | Jeff Garzik (jgarzik) |
Status: | REJECTED UNREPRODUCIBLE | ||
Severity: | blocking | CC: | ntarmos |
Priority: | P2 | ||
Hardware: | i386 | ||
OS: | Linux | ||
Kernel Version: | 2.6.14-rc1-mm1 | Subsystem: | |
Regression: | --- | Bisected commit-id: |
Description
Nikos Ntarmos
2005-09-20 15:27:39 UTC
Some more stressing resulted in:

Sep 21 02:19:29 Atlas kernel: sd 5:0:0:0: SCSI error: return code = 0x8000002
Sep 21 02:19:29 Atlas kernel: sdd: Current: sense key: Aborted Command
Sep 21 02:19:29 Atlas kernel: Additional sense: Scsi parity error
Sep 21 02:19:29 Atlas kernel: Info fld=0x5314f
Sep 21 02:19:29 Atlas kernel: end_request: I/O error, dev sdd, sector 340303
Sep 21 02:19:29 Atlas kernel: R5: read error not correctable.
Sep 21 02:19:29 Atlas kernel: ATA: abnormal status 0xD0 on port 0xFFFFC200000060C7

and a lock-up.

Only to get worse, when I tried to halt after the previous error:

Unable to handle kernel paging request at ffff82bc81000030 RIP:
<ffffffff8015ccb6>{free_block+102}
PGD 0
Oops: 0000 [1]
CPU 0
Modules linked in: parport_pc lp parport capability commoncap ipt_ECN ipt_TOS ipt_limit ipt_REJECT ipt_ULOG ipt_state ipt_pkttype ipt_recent ipt_iprange ipt_multiport ipt_conntrack iptable_mangle ip_nat_irc ip_nat_tftp ip_nat_ftp iptable_nat ip_conntrack_irc ip_conntrack_tftp ip_conntrack_ftp ip_conntrack iptable_filter ip_tables md5 ipv6 sg uhci_hcd ohci_hcd ehci_hcd usbcore sata_via sata_sil sata_promise libata sk98lin dm_mod psmouse
Pid: 4, comm: events/0 Not tainted 2.6.14-Atlas #1
RIP: 0010:[<ffffffff8015ccb6>] <ffffffff8015ccb6>{free_block+102}
RSP: 0018:ffff810002f59df8 EFLAGS: 00010006
RAX: ffff82bc81000000 RBX: ffff81007f417940 RCX: 0000000000000001
RDX: 0000003790000000 RSI: ffff81007f666a10 RDI: 0000000000000000
RBP: 0000000000000000 R08: ffffffff8015d520 R09: 0000000000000000
R10: 000000000000000e R11: 0000000000000028 R12: 0000000000000001
R13: ffff81007f666a10 R14: ffff81007f7009e0 R15: ffffffff8043e6f0
FS: 00002aaaaae00640(0000) GS:ffffffff80472800(0000) knlGS:0000000000000000
CS: 0010 DS: 0018 ES: 0018 CR0: 000000008005003b
CR2: ffff82bc81000030 CR3: 0000000078b8f000 CR4: 00000000000006e0
Process events/0 (pid: 4, threadinfo ffff810002f58000, task ffff810002f40060)
Stack: ffff810002f59e78 ffff81007f666a10 ffff81007f666a00 0000000000000001
ffff81007f417940 ffffffff8015d4cf ffff81007f66bd40 0000000000000003
ffff81007f4179a8 ffffffff8015d598
Call Trace:<ffffffff8015d4cf>{drain_array_locked+111} <ffffffff8015d598>{cache_reap+120}
<ffffffff80140d40>{worker_thread+416} <ffffffff8012dc60>{default_wake_function+0}
<ffffffff8012dc60>{default_wake_function+0} <ffffffff80140ba0>{worker_thread+0}
<ffffffff80140ba0>{worker_thread+0} <ffffffff80144e38>{kthread+136}
<ffffffff8010f2d2>{child_rip+8} <ffffffff80140ba0>{worker_thread+0}
<ffffffff80144db0>{kthread+0} <ffffffff8010f2ca>{child_rip+0}
Code: 48 8b 70 30 48 8b 56 08 48 0f b7 46 30 48 39 32 48 8b 4c c3
RIP <ffffffff8015ccb6>{free_block+102} RSP <ffff810002f59df8>
CR2: ffff82bc81000030

(Sorry for posting these in separate chunks... I'm frantically trying to sort things out and just don't seem to be getting anywhere near solving this issue...)

Just another data point: 2.6.13-rc7-libata1 managed to get to almost 90% of reconstructing the 320G raid5 array with just a couple of 'status=0xc8' errors, only to lock up hard with the same error as in http://bugzilla.kernel.org/show_bug.cgi?id=5279#c1 after nearly 40 minutes of disk activity :(

In the meantime, I've also tried all possible combinations of 'acpi=off', 'pci=noacpi', 'noapic', 'pci=routeirq', 'pci=usepirqmask', and 'idebus=66', as well as hooking up other disks (identical to the current ones) on the SiI3112, moving it to different PCI slots, changing cables, reserving IRQs from the BIOS to force sata_sil into choosing different ones, and dancing Fandango on the keyboard. I intended to check with a 2.4 kernel, but that seems impossible with Debian 3.1r0a/x86_64 (I've filed a bug with the Debian BTS but haven't gotten the bug # yet).

Is there anything I can do to narrow this down?

I no longer have either the machine or the controller, so I can't reproduce this issue or provide any further insight. Since this bug report has lain dormant for all this time, I think it's safe and sane to close it.
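
For anyone reading the numeric codes in the logs above, here is a minimal decoding sketch. It is not part of the original report. It assumes the classic ATA status register bit names and the 2.6-era SCSI result word layout (status, message, host and driver bytes from low to high). The helper names `decode_ata_status` and `split_scsi_result` are made up for illustration.

```python
# Classic ATA status register bits, highest bit first.
ATA_STATUS_BITS = {
    0x80: "BSY",   # device busy
    0x40: "DRDY",  # device ready
    0x20: "DF",    # device fault
    0x10: "DSC",   # seek complete
    0x08: "DRQ",   # data request
    0x04: "CORR",  # corrected data
    0x02: "IDX",   # index
    0x01: "ERR",   # error
}


def decode_ata_status(status):
    """Return the names of the bits set in an ATA status byte."""
    return [name for bit, name in ATA_STATUS_BITS.items() if status & bit]


def split_scsi_result(result):
    """Split a SCSI result word into its four bytes (assumed layout)."""
    return {
        "status": result & 0xFF,          # SCSI status byte
        "msg": (result >> 8) & 0xFF,      # message byte
        "host": (result >> 16) & 0xFF,    # host adapter byte
        "driver": (result >> 24) & 0xFF,  # mid-layer driver byte
    }


if __name__ == "__main__":
    print(decode_ata_status(0xD0))       # "abnormal status 0xD0"
    print(decode_ata_status(0xC8))       # the 'status=0xc8' errors
    print(split_scsi_result(0x8000002))  # "return code = 0x8000002"
    print(int("5314f", 16))              # "Info fld=0x5314f"
```

Under these assumptions, 0xD0 reads as BSY, DRDY and DSC, and 0xc8 as BSY, DRDY and DRQ, so the device was reporting busy in both cases. The return code splits into a driver byte of 0x08 and a status byte of 0x02, which would correspond to sense data being present with CHECK CONDITION. The Info field 0x5314f is 340303 in decimal, matching the sector in the end_request line.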