Bug 3694
| Summary: | kernel bug when writing to usb-disk | | |
|---|---|---|---|
| Product: | Drivers | Reporter: | simon manlay (spointm) |
| Component: | USB | Assignee: | Matthew Dharm (mdharm-usb) |
| Status: | CLOSED PATCH_ALREADY_AVAILABLE | | |
| Severity: | normal | CC: | bunk, greg, RG.Schneider |
| Priority: | P2 | | |
| Hardware: | i386 | | |
| OS: | Linux | | |
| Kernel Version: | 2.6.8 (debian sarge kernel) | Subsystem: | |
| Regression: | --- | Bisected commit-id: | |
| Bug Depends on: | | | |
| Bug Blocks: | 5089 | | |
| Attachments: | More information | | |
Description
simon manlay
2004-11-04 01:15:05 UTC
After testing, it appears that the USB disk was defective. With a new disk, no problem is detected. But the last time, when the SCSI layer went down (because of the defective disk), the server was half crashed. I think it's bad to crash the whole system because an external device has problems.

A similar problem has occurred again, with a USB disk certified good. Here is the call trace:

SCSI error : <6 0 0 0> return code = 0x70000
end_request: I/O error, dev sdc, sector 42494602
SCSI error : <6 0 0 0> return code = 0x70000
end_request: I/O error, dev sdc, sector 42494603
usb 1-2: USB disconnect, address 10
SCSI error : <6 0 0 0> return code = 0x70000
end_request: I/O error, dev sdc, sector 42494604
scsi: Device offlined - not ready after error recovery: host 6 channel 0 id 0 lun 0
sd 6:0:0:0: Illegal state transition cancel->offline
Badness in scsi_device_set_state at drivers/scsi/scsi_lib.c:1643
 [<d084e5f0>] scsi_device_set_state+0xc4/0xcf [scsi_mod]
 [<d084c75d>] scsi_eh_offline_sdevs+0x47/0x60 [scsi_mod]
 [<d084cd3d>] scsi_unjam_host+0x18d/0x1a2 [scsi_mod]
 [<d084ce68>] scsi_error_handler+0x116/0x15a [scsi_mod]
 [<d084cd52>] scsi_error_handler+0x0/0x15a [scsi_mod]
 [<c01041e1>] kernel_thread_helper+0x5/0xb
SCSI error : <6 0 0 0> return code = 0x70000
end_request: I/O error, dev sdc, sector 42494605
printk: 267 messages suppressed.
Buffer I/O error on device sdc1, logical block 42494573
lost page write due to I/O error on sdc1
------------[ cut here ]------------
kernel BUG at drivers/block/as-iosched.c:1852!
invalid operand: 0000 [#1] PREEMPT
Modules linked in: nls_iso8859_1 nls_cp437 vfat fat ppp_deflate zlib_deflate bsd_comp ipt_state ipt_REJECT ipt_LOG capability commoncap ppp_async af_packet crc_ccitt ipv6 ppp_generic slhc ip_conntrack_irc ip_conntrack_ftp ipt_MASQUERADE iptable_nat ip_conntrack iptable_filter ip_tables usb_storage ide_core ehci_hcd usbcore 8139too crc32 e100 mii rtc sd_mod ext3 jbd cryptoloop loop blowfish aic7xxx scsi_mod unix
CPU: 0
EIP: 0060:[<c01dcee4>] Not tainted
EFLAGS: 00010212 (2.6.8-slt)
EIP is at as_exit+0x22/0x62
eax: cafb1774 ebx: cafb1760 ecx: 00000000 edx: c3a09eb0
esi: cec67eb8 edi: 00000282 ebp: cffd28b4 esp: c3a09ee4
ds: 007b es: 007b ss: 0068
Process scsi_eh_6 (pid: 16528, threadinfo=c3a08000 task=ca1140f0)
Stack: cec67e2c c01d5bdf cec67e2c cec67e38 c01d7462 cec67e2c ca118c24 ca118c00
       d084fd92 cec67e2c ca118da8 c02c9a88 c02c9aa0 cffd28d8 c01d2a51 ca118d84
       c018c494 ca118da8 ca118c08 00000286 cffd2800 ca118c00 d0849d10 ca118da8
Call Trace:
 [<c01d5bdf>] elevator_exit+0x12/0x15
 [<c01d7462>] blk_cleanup_queue+0x1f/0x62
 [<d084fd92>] scsi_device_dev_release+0xd8/0xf9 [scsi_mod]
 [<c01d2a51>] device_release+0x14/0x44
 [<c018c494>] kobject_cleanup+0x40/0x65
 [<d0849d10>] __scsi_iterate_devices+0x69/0x73 [scsi_mod]
 [<d084c28e>] scsi_eh_stu+0x11d/0x136 [scsi_mod]
 [<d084ca94>] scsi_eh_ready_devs+0x17/0x5d [scsi_mod]
 [<d084cd3d>] scsi_unjam_host+0x18d/0x1a2 [scsi_mod]
 [<d084ce68>] scsi_error_handler+0x116/0x15a [scsi_mod]
 [<d084cd52>] scsi_error_handler+0x0/0x15a [scsi_mod]
 [<c01041e1>] kernel_thread_helper+0x5/0xb
Code: 0f 0b 3c 07 85 69 26 c0 8d 43 0c 39 43 0c 74 08 0f 0b 3d 07

cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 5
model name : Pentium II (Deschutes)
stepping : 2
cpu MHz : 398.370
cache size : 512 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 2
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr
bogomips : 784.38

Created attachment 6651 [details]
More information
Hello,
the problem does not seem to be related to a defective hard disc.
The problem is reproducible on various motherboards, USB 2.0 PCI cards, 2.5" and
3.5" USB 2.0 hard disc boxes, various hard discs ( Hitachi Desk/Travel Star, Samsung
1SP161.. ) and all 2.6 kernels ( with processor optimisation or the 'out of the box'
Debian Sarge 3.10a distribution ).
The problem occurs:
1. quickly when copying very big files ( e.g.: DVD iso image ) from
a fast ide/scsi device to a USB 2.0 hard disc.
2. after the cache is almost full ( on my system: 700 MB with appr. 3-4 MB RAM
free ).
3. NOT ( or not seen by me yet ) when copying many small files.
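The conditions in the list above (one very large sequential write that outruns the page cache) could be approximated with a small script like the following. This is only a sketch under assumptions: the function name `write_large_file` and the mount point in the usage note are hypothetical, and it writes zeros instead of copying a real DVD image, so it may not stress the device in exactly the same way as the original copy.

```python
import os
import time


def write_large_file(path, total_bytes, chunk_bytes=1024 * 1024):
    """Write total_bytes of zeros to path in sequential chunks, then fsync.

    Returns (bytes_written, elapsed_seconds). Any OSError raised by the
    kernel (for example EIO after the device goes offline) is not caught
    here and propagates to the caller.
    """
    chunk = bytes(chunk_bytes)  # a reusable buffer of zero bytes
    written = 0
    start = time.monotonic()
    with open(path, "wb") as f:
        while written < total_bytes:
            n = min(chunk_bytes, total_bytes - written)
            f.write(chunk[:n])
            written += n
        f.flush()
        # Force the data out of the page cache and onto the device.
        os.fsync(f.fileno())
    return written, time.monotonic() - start
```

A run might look like `write_large_file("/mnt/usbdisk/test.bin", 4 * 1024**3)`, where `/mnt/usbdisk` stands for wherever the USB disk happens to be mounted, with the kernel log watched in parallel for the SCSI and USB messages.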
It looks very much like a time out or buffer overflow ( in the USB hard disc
box ) problem.
Either from the USB EHCI <> usb_storage <> USB hard disc box ( have seen USB
disconnect messages ) or the USB <> scsi emulation layer.
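One way to check the time-out guess is to decode the `return code = 0x70000` value from the trace in the description. The sketch below assumes the usual Linux layout of a SCSI result word (status, message, host and driver bytes, from least to most significant) and the host-byte names from the kernel's SCSI headers as I recall them; `decode_scsi_result` is a hypothetical helper name, not a kernel API.

```python
# Host-byte codes, assumed to match the Linux SCSI headers (DID_* values).
HOST_BYTE_NAMES = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
    0x09: "DID_BAD_INTR",
}


def decode_scsi_result(result):
    """Split a 32-bit SCSI result word into its four byte-wide fields."""
    host = (result >> 16) & 0xFF
    return {
        "status_byte": result & 0xFF,
        "msg_byte": (result >> 8) & 0xFF,
        "host_byte": host,
        "driver_byte": (result >> 24) & 0xFF,
        "host_name": HOST_BYTE_NAMES.get(host, "unknown"),
    }


if __name__ == "__main__":
    print(decode_scsi_result(0x70000))
```

Under these assumptions the host byte of 0x70000 is 0x07 (`DID_ERROR`), a generic error reported by the low-level driver, rather than 0x03 (`DID_TIME_OUT`). That does not rule out a time-out further down in the USB layer, but it suggests the failure was not reported to the SCSI layer as one.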
The system sometimes hangs completely; sometimes a disconnect and reconnect of the
USB storage device helps. The latter comes with a switch of the sdX device name,
e.g.: sda -> sdb the first time, then sdb -> sda again after the 2nd
disconnect/reconnect procedure.
Of course, there are a lot of hanging processes related to ehci, pdflush ...
I can send more kern.log ... on request.
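Before sending a full kern.log, it may help to pull out only the relevant events so that the order of USB disconnects and SCSI errors is visible. The following is a minimal sketch; the patterns are based solely on the message formats that appear in the trace in this report, and `PATTERNS` and `classify_lines` are hypothetical names chosen for illustration.

```python
import re

# Patterns derived from the log lines quoted in this bug report.
PATTERNS = {
    "scsi_error": re.compile(
        r"SCSI error : <(\d+) (\d+) (\d+) (\d+)> return code = (0x[0-9a-fA-F]+)"
    ),
    "io_error": re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)"),
    "usb_disconnect": re.compile(r"usb ([\d.-]+): USB disconnect, address (\d+)"),
    "kernel_bug": re.compile(r"kernel BUG at (\S+):(\d+)!"),
}


def classify_lines(lines):
    """Return (event_name, captured_groups) for every line that matches."""
    events = []
    for line in lines:
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                events.append((name, match.groups()))
                break  # at most one event per log line
    return events
```

Feeding it the lines of a log file, for example `classify_lines(open("kern.log"))`, yields the events in log order; lines that match none of the patterns are skipped.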
Is this issue still present in kernel 2.6.16.7?

Hello,
> ------- Additional Comments From bunk@stusta.de 2006-04-18 07:32 -------
> Is this issue still present in kernel 2.6.16.7?
Answer:
The latest kernel I tested with is the 2.6.16 ( after xx-rcy ). I could
still see these errors - but really seldom.
Some more tests ( especially after trying the same thing at my friend's place
and instantly running into this trouble )
revealed trouble with hardware ( cables, USB2IDE adapters ? external
HDs ?? ).
This means that changing just and ONLY the USB cable made things
work, or made them worse. Using the same external drive and USB cable on another
computer with another USB 2.0 adapter led to trouble - or not.
ANYHOW:: I think that if there are problems with the transmission of data
via USB, the error recovery between the SCSI emulation and the USB driver /
layer is not working that well.
E.G::: Meanwhile I got a quite old SCSI hard disc. During Linux
installation on that one I saw that the HD reported defective sector(s) and
started a very time-consuming sector re-arrangement procedure.
BUT:: Instead of disconnecting and breaking the installation procedure,
the ( native ?? ) SCSI driver waited for the disc to finish.
Question:: Can it be that there is no possibility ( spec limits ... )
or a missing implementation of such error recovery between the SCSI and USB
subsystems??
Anyhow : Yesterday I started again with firewire and hard discs and had
the same problems ( either the disc was not recognized at all or, if found and
mounted, it disconnected several times during reading of about 2 GByte
files ( 2 different discs ) ).
Remark:: I'm not that familiar with writing drivers and the specs of USB and
SCSI ::: So, please be patient - I'm trying to help AND NOT ANNOY !!
Many greetings
/RalfS
I would like to clarify the situation.
First, in the latest report, it appears that the problem goes away if you switch cables. Is that correct?
Second, this problem only happens with EHCI (not UHCI or OHCI)?
Third, what does the crash message look like with the latest kernel?

> ------- Additional Comments From mdharm-usb@one-eyed-alien.net 2006-05-08 13:39 -------
> I would like to clarify the situation.
>
> First, in the latest report, it appears that the problem goes away if you switch
> cables. Is that correct?
>
Answer: Correct. Also, using the same cable with other hardware ( computer, PCI to USB interface card ( VIA, ALI chipset ) and/or USB hard disc ( including their various USB to IDE chips ) ) makes the problem disappear.
> Second, this problem only happens with EHCI (not UHCI or OHCI)?
>
Answer: Well, recently I only use EHCI because of the speed. BUT: I never saw a problem in former times when I had only USB 1.1 interfaces. In addition, I probably never copied such an amount of data with 1.1.
> Third, what does the crash message look like with the latest kernel?
>
Sorry, I haven't downloaded the latest kernel yet. Will do so, but it may take a little while.
------
I hope that I could clarify things. If not, please respond.
I will restart the testing in a more professional way and write a list of hardware combinations and results.
Thanks for your patience
/RalfS

Error recovery has greatly improved in recent kernels. If this continues to be an issue, we can re-open this bug.