Bug 13982
Summary: | [libata] (?) causing Hardlock in 2.6.30.4 during simultaneous read & write | | |
---|---|---|---|
Product: | IO/Storage | Reporter: | Wylda (wylda) |
Component: | SCSI | Assignee: | linux-scsi (linux-scsi) |
Status: | RESOLVED DUPLICATE | | |
Severity: | normal | CC: | devzero, hilld, rjw |
Priority: | P1 | | |
Hardware: | All | | |
OS: | Linux | | |
Kernel Version: | 2.6.30.4 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: | kernel config; dmesg; lspci -v; First trace during 5days of testing/restarting... | | |
Created attachment 22714: dmesg
Created attachment 22715: lspci -v
Created attachment 22716: First trace during 5days of testing/restarting...
This may be http://bugzilla.kernel.org/show_bug.cgi?id=13933 (if so, someone should mark it as a duplicate). Can you check whether 2.6.29 works without problems? Is it possible to temporarily disable or remove the Realtek NIC?
Created attachment 22713: kernel config

Hi.

HW: Server Board Intel STL2, 2x P3 @ 1GHz, 1GB ECC RAM
SW: self-compiled kernel 2.6.30.4 on Debian Lenny
Symptom: PC completely stops responding (ping, ALT+F2..., Numlock, CTRL-ALT-DEL, ALT-SysRq)
Traces: No Oops, nothing in syslog etc.

I think it's not a HW failure, because it never happened when running either of these on its own:
* 2x dd if=/dev/zero bs=1M count=200000 | md5sum -b
* 2x dd if=/dev/zero of=test-x bs=1M count=200000
Such tests take a long time on this HW (51min and 85min) and the checksums are always OK. Tested many times.

Anyway, I'm usually able to invoke the Hardlock within 2min. I use a script:

#!/bin/bash
dd if=/dev/zero bs=1M count=200000 | md5sum -b &
dd if=/dev/zero bs=1M count=200000 | md5sum -b &
cd /home/pik/a
md5sum -c office.md5 &
cd /home/pik/b
md5sum -c office.md5 &

So I run this stress script _and_ begin an FTP write to the same HDD. Usually it Hardlocks by itself, but if it does not Hardlock within 60sec I can help it along with another dd (dd if=/dev/zero of=test1 bs=1M count=200000).

Another reason why it should not be a HW failure: no complaints from EDAC, and it happens on different HW:
* PATA drive IC35L040AVVA07 on ServerWorks OSB4 (MOBO's chipset aka IB6566 South Bridge)
* SATA drives 2xWD5000AADS in md0 on Sil3114
* Network card: PCI-X, Intel 1Gbps 82543GC
* Network card: PCI Realtek RT8139

Today, when doing the last test for this bug report, there was a trace, but the Hardlock was not 100% the same (as always, ping stopped working, console switching did not work and there was no Numlock reaction, but this time Alt-SysRq worked). Hope it's not misleading, see attachment.

Another proof(?) that this is not a HW failure:
* never happens with Debian's 2.6.26-17lenny1 all_generic_ide=1 gcc4.1.3
* easy to trigger with 2.6.30.4 gcc4.3.2
...I know these differ in kernel version, kernel parameters and gcc, but a HW error would have occurred anyway.

Kernel config, dmesg and lspci attached.
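
For anyone who wants to sanity-check the read-side part of the test on their own box before running the full stress script, here is a scaled-down sketch. It is not the reporter's script: the function names zero_stream_md5 and parallel_checksum_check are made up for illustration, and the block counts are tiny so that it finishes in seconds. It assumes bash and GNU dd/md5sum. It starts several "dd | md5sum" streams in parallel and checks that they all produce the same checksum, which is what "checksums always OK" in the report amounts to.

```shell
#!/bin/bash
# Hypothetical helpers (not from the bug report): a small-scale version of the
# parallel "dd if=/dev/zero | md5sum -b" streams used in the stress test.

# Print the MD5 of $1 MiB of zeros read through dd.
zero_stream_md5() {
    dd if=/dev/zero bs=1M count="$1" 2>/dev/null | md5sum -b | cut -d' ' -f1
}

# Run $1 streams of $2 MiB each in parallel; succeed only if every stream
# reported the same checksum.
parallel_checksum_check() {
    local jobs=$1 blocks=$2 tmpdir i distinct
    tmpdir=$(mktemp -d) || return 2
    for i in $(seq 1 "$jobs"); do
        zero_stream_md5 "$blocks" > "$tmpdir/sum.$i" &
    done
    wait    # block until all background streams have finished
    # Identical input data, so exactly one distinct checksum is expected.
    distinct=$(cat "$tmpdir"/sum.* | sort -u | wc -l)
    rm -rf "$tmpdir"
    [ "$distinct" -eq 1 ]
}
```

Usage would be something like "parallel_checksum_check 2 200000 && echo OK" to mirror the 2x streams from the report. This only covers the read/checksum half; it does not do the concurrent writes or the FTP transfer that seem to be needed to trigger the Hardlock.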