Bug 8840 - usb storage device dis/reconnects quickly and oopses
Summary: usb storage device dis/reconnects quickly and oopses
Status: REJECTED UNREPRODUCIBLE
Alias: None
Product: Drivers
Classification: Unclassified
Component: USB
Hardware: All Linux
Importance: P1 normal
Assignee: Greg Kroah-Hartman
URL:
Keywords:
Depends on:
Blocks: USB
Reported: 2007-08-03 01:34 UTC by Mourad De Clerck
Modified: 2008-09-24 06:43 UTC
CC List: 2 users

See Also:
Kernel Version: 2.6.22
Subsystem:
Regression: ---
Bisected commit-id:


Description Mourad De Clerck 2007-08-03 01:34:02 UTC
Distribution:
Debian unstable

Hardware Environment:
Asus M2V (VIA KT890), ValuePlus SPIO 352S USB2 to SATA enclosure

usb device:     Bus 005 Device 007: ID 152d:2336
usb controller: 00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86)
usb controller: 00:10.4 0c03: 1106:3104 (rev 86)

Software Environment:
- 32bit kernel and userland
- udev rule:
SUBSYSTEM=="scsi", SUBSYSTEMS=="usb", ATTRS{idVendor}=="152d", ATTR{max_sectors}="128"

... because the drive enclosure can't seem to cope with max_sectors=240 reliably. (with 240 it disconnects during heavy traffic)

Problem Description:
An external disk gets plugged/unplugged regularly. Under certain circumstances (even with no traffic at all) it will start going nuts, connecting/disconnecting quite quickly, and finally ending in an Oops. The enclosure isn't the most reliable AFAICT, so maybe this bug is hardware related after all - feel free to close it if you think so.

Steps to reproduce:
Plug in and out on a regular basis. Once in a while (7-10 days) it'll start "disconnecting" continuously, and then it's a matter of time before it oopses.



Aug  2 09:04:30 server kernel: usb 5-3: new high speed USB device using ehci_hcd and address 117
Aug  2 09:04:30 server kernel: usb 5-3: configuration #1 chosen from 1 choice
Aug  2 09:04:30 server kernel: scsi97 : SCSI emulation for USB Mass Storage devices
Aug  2 09:04:30 server kernel: usb-storage: device found at 117
Aug  2 09:04:30 server kernel: usb-storage: waiting for device to settle before scanning
Aug  2 09:04:33 server kernel: usb 5-3: USB disconnect, address 117
Aug  2 09:04:35 server kernel: usb 5-3: new high speed USB device using ehci_hcd and address 118
Aug  2 09:04:35 server kernel: usb 5-3: configuration #1 chosen from 1 choice
Aug  2 09:04:35 server kernel: scsi98 : SCSI emulation for USB Mass Storage devices
Aug  2 09:04:35 server kernel: usb-storage: device found at 118
Aug  2 09:04:35 server kernel: usb-storage: waiting for device to settle before scanning
Aug  2 09:04:37 server kernel: usb 5-3: USB disconnect, address 118
Aug  2 09:04:40 server kernel: usb 5-3: new high speed USB device using ehci_hcd and address 119
Aug  2 09:04:40 server kernel: usb 5-3: configuration #1 chosen from 1 choice
Aug  2 09:04:40 server kernel: scsi99 : SCSI emulation for USB Mass Storage devices
Aug  2 09:04:40 server kernel: usb-storage: device found at 119
Aug  2 09:04:40 server kernel: usb-storage: waiting for device to settle before scanning
Aug  2 09:04:45 server kernel: usb-storage: device scan complete
Aug  2 09:04:45 server kernel: scsi 99:0:0:0: Direct-Access     WDC WD40 00YR-01PLB0           PQ: 0 ANSI: 2 CCS
Aug  2 09:04:45 server kernel: sd 99:0:0:0: [sdc] 781422768 512-byte hardware sectors (400088 MB)
Aug  2 09:04:45 server kernel: sd 99:0:0:0: [sdc] Write Protect is off
Aug  2 09:04:45 server kernel: sd 99:0:0:0: [sdc] Mode Sense: 00 38 00 00
Aug  2 09:04:45 server kernel: sd 99:0:0:0: [sdc] Assuming drive cache: write through
Aug  2 09:04:45 server kernel: sd 99:0:0:0: [sdc] 781422768 512-byte hardware sectors (400088 MB)
Aug  2 09:04:48 server kernel: sd 99:0:0:0: [sdc] Write Protect is off
Aug  2 09:04:48 server kernel: usb 5-3: USB disconnect, address 119
Aug  2 09:04:48 server kernel: BUG: unable to handle kernel NULL pointer dereference at virtual address 00000000
Aug  2 09:04:48 server kernel:  printing eip:
Aug  2 09:04:48 server kernel: c022ffc9
Aug  2 09:04:48 server kernel: *pde = 00000000
Aug  2 09:04:48 server kernel: Oops: 0000 [#1]
Aug  2 09:04:49 server kernel: SMP
Aug  2 09:04:49 server kernel: Modules linked in: sha1 arc4 ecb blkcipher ppp_mppe button ac battery ppp_deflate zlib_deflate bsd_comp ipt_MASQUERADE xt_TCPMSS xt_state xt_NOTRACK iptable_raw ipt_REDIRECT ipt_REJECT ipt_LOG xt_limit xt_tcpudp iptable_filter iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nfnetlink iptable_mangle ip_tables x_tables ppp_async crc_ccitt ppp_generic slhc ipv6 dm_snapshot dm_mirror dm_mod it87 hwmon_vid i2c_isa evdev serio_raw shpchp psmouse k8temp i2c_viapro pcspkr atl1 rtc parport_pc parport i2c_core pci_hotplug mii ext3 jbd mbcache raid1 md_mod ide_generic ide_cd cdrom sd_mod via82cxxx usb_storage generic ide_core ehci_hcd uhci_hcd ata_generic e1000 sata_via floppy usbcore libata scsi_mod thermal processor fan
Aug  2 09:04:49 server kernel: CPU:    0
Aug  2 09:04:49 server kernel: EIP:    0060:[<c022ffc9>]    Not tainted VLI
Aug  2 09:04:49 server kernel: EFLAGS: 00010202   (2.6.22-1-k7 #1)
Aug  2 09:04:49 server kernel: EIP is at make_class_name+0x27/0x7a
Aug  2 09:04:49 server kernel: eax: 00000000   ebx: ffffffff   ecx: ffffffff   edx: 0000000b
Aug  2 09:04:49 server kernel: esi: f8877a72   edi: 00000000   ebp: 00000000   esp: df99fe68
Aug  2 09:04:49 server kernel: ds: 007b   es: 007b   fs: 00d8  gs: 0000  ss: 0068
Aug  2 09:04:49 server kernel: Process khubd (pid: 847, ti=df99e000 task=dfa29a90 task.ti=df99e000)
Aug  2 09:04:49 server kernel: Stack: f7d47208 f7d47200 f8885dec f8885d80 f7d47208 c02300fb 00000000 f7d47200
Aug  2 09:04:49 server kernel:        f7d47094 00000246 f7d47c00 c0230185 f7d47000 f886d6d2 f7d47000 dfc19c00
Aug  2 09:04:49 server kernel:        f886b302 dfc19c30 dfc19c00 f88664be dfc19ef8 dfe9f218 f8952660 f894289e
Aug  2 09:04:49 server kernel: Call Trace:
Aug  2 09:04:49 server kernel:  [<c02300fb>] class_device_del+0x83/0x105
Aug  2 09:04:49 server kernel:  [<c0230185>] class_device_unregister+0x8/0x10
Aug  2 09:04:49 server kernel:  [<f886d6d2>] __scsi_remove_device+0x23/0x60 [scsi_mod]
Aug  2 09:04:49 server kernel:  [<f886b302>] scsi_forget_host+0x2d/0x4a [scsi_mod]
Aug  2 09:04:49 server kernel:  [<f88664be>] scsi_remove_host+0x65/0xd7 [scsi_mod]
Aug  2 09:04:49 server kernel:  [<f894289e>] storage_disconnect+0xe/0x16 [usb_storage]
Aug  2 09:04:49 server kernel:  [<f88ad706>] usb_unbind_interface+0x44/0x85 [usbcore]
Aug  2 09:04:49 server kernel:  [<c022f893>] __device_release_driver+0x6e/0x8b
Aug  2 09:04:49 server kernel:  [<c022fc1e>] device_release_driver+0x1e/0x34
Aug  2 09:04:49 server kernel:  [<c022f28c>] bus_remove_device+0x6a/0x7a
Aug  2 09:04:49 server kernel:  [<c022da99>] device_del+0x1c7/0x238
Aug  2 09:04:49 server kernel:  [<f88ab1ff>] usb_disable_device+0x5c/0xbb [usbcore]
Aug  2 09:04:49 server kernel:  [<f88a7d43>] usb_disconnect+0x83/0x122 [usbcore]
Aug  2 09:04:49 server kernel:  [<f88a8460>] hub_thread+0x377/0xa73 [usbcore]
Aug  2 09:04:49 server kernel:  [<c0133375>] autoremove_wake_function+0x0/0x35
Aug  2 09:04:49 server kernel:  [<f88a80e9>] hub_thread+0x0/0xa73 [usbcore]
Aug  2 09:04:49 server kernel:  [<c01332af>] kthread+0x38/0x5d
Aug  2 09:04:49 server kernel:  [<c0133277>] kthread+0x0/0x5d
Aug  2 09:04:49 server kernel:  [<c01049b7>] kernel_thread_helper+0x7/0x10
Aug  2 09:04:49 server kernel:  =======================
Aug  2 09:04:49 server kernel: Code: 5b 04 5b c3 55 31 ed 57 89 c7 56 89 c6 89 e8 53 83 cb ff 89 d9 83 ec 04 89 14 24 f2 ae f7 d1 49 8b 04 24 89 ca 89 d9 8b 38 89 e8 <f2> ae f7 d1 49 8d 44 0a 02 ba d0 00 00 00 e8 38 7b f3 ff 31 d2
Aug  2 09:04:49 server kernel: EIP: [<c022ffc9>] make_class_name+0x27/0x7a SS:ESP 0068:df99fe68
Aug  2 09:04:49 server kernel: sd 99:0:0:0: [sdc] Mode Sense: 00 38 00 00
Aug  2 09:04:49 server kernel: sd 99:0:0:0: [sdc] Assuming drive cache: write through
Aug  2 09:04:49 server kernel:  sdc:<3>Buffer I/O error on device sdc, logical block 0
Aug  2 09:04:49 server kernel: Buffer I/O error on device sdc, logical block 0

... got more of these in logfiles if needed.

Thanks for your consideration,

-- Mourad DC
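
One possible reading of the register dump above, offered as a sketch rather than a confirmed diagnosis: the faulting instruction in the Code: line (<f2> ae) is a string scan, edi is 00000000, and edx holds 0000000b, which matches the length of "scsi_device". That pattern is consistent with make_class_name having measured the class name successfully and then calling strlen on a NULL object name. The userspace model below illustrates the failure mode and a defensive check. The helper class_name_checked and the sample name "99:0:0:0" are hypothetical and are not taken from the kernel source or from any actual fix.

```c
#include <assert.h>   /* only needed for the usage checks */
#include <stdlib.h>
#include <string.h>

/*
 * Userspace model of building a "<class>:<name>" string.
 * class_name_checked() is a hypothetical helper, not the kernel function.
 * Returns a malloc'ed string the caller must free, or NULL on failure.
 */
static char *class_name_checked(const char *class_name, const char *kobj_name)
{
    size_t size;
    char *out;

    /* Without this guard, strlen(NULL) below would fault at address 0,
     * which is the kind of crash the oops shows. */
    if (class_name == NULL || kobj_name == NULL)
        return NULL;

    /* +2 covers the ':' separator and the terminating '\0'. */
    size = strlen(class_name) + strlen(kobj_name) + 2;
    out = malloc(size);
    if (out == NULL)
        return NULL;

    strcpy(out, class_name);
    strcat(out, ":");
    strcat(out, kobj_name);
    return out;
}
```

Calling class_name_checked("scsi_device", "99:0:0:0") yields "scsi_device:99:0:0:0", while passing NULL as the second argument returns NULL instead of crashing. A guard like this only hides the symptom; it does not explain why the name was already gone when the device was being removed.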
Comment 1 Anonymous Emailer 2007-08-03 01:39:25 UTC
Reply-To: akpm@linux-foundation.org

On Fri,  3 Aug 2007 01:27:49 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=8840
> 
>            Summary: usb storage device dis/reconnects quickly and oopses
>            Product: Drivers
>            Version: 2.5
>      KernelVersion: 2.6.22
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: USB
>         AssignedTo: greg@kroah.com
>         ReportedBy: kernel-bugzilla@aquazul.com
> 
> 
Comment 2 Alan Stern 2007-08-07 09:06:51 UTC
The disconnect/connect activity you see probably _is_ caused by a hardware problem.  However the oops is a separate matter; it is caused by a bug in the SCSI core.  You can work around the bug by turning off CONFIG_SCSI_SCAN_ASYNC.
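
For anyone wanting to try the workaround above, a minimal sketch of the kernel configuration line follows. It assumes the kernel is being rebuilt from a .config file; only the option name comes from the comment.

```
# CONFIG_SCSI_SCAN_ASYNC is not set
```

Kernels of this era also document a scsi_mod.scan=sync parameter that selects synchronous scanning at boot or module load time. Whether that alone is enough to avoid this oops is not confirmed anywhere in this report.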
Comment 3 Minhyoung Kim 2007-08-13 19:57:02 UTC
Hi, everybody.

I have the same problem with kernel version 2.6.12.
It is not easy to reproduce the problem by removing a USB storage device manually. Fortunately, I have a USB storage device with a loose connector.
I have investigated this problem, and the following comment is the result.

The problem happens when a USB storage device is disconnected while the SCSI driver is scanning the device.
When the storage device is removed, the quiesce_and_remove_host function is called to remove the SCSI host. To my knowledge, this function doesn't check whether usb_stor_scan_thread has finished. So the __scsi_remove_target function can be called while usb_stor_scan_thread is still scanning the SCSI device via scsi_scan_host. In this situation, __scsi_remove_target increases the reap_ref counter without any locking against scsi_scan_channel. As a result, scsi_target_reap does not handle the SCSI_SCAN_NO_RESPONSE situation properly, and the oops happens.

The following change makes the problem disappear, but I don't know whether it is correct or the best patch to fix the problem.

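Before:
void scsi_remove_host(struct Scsi_Host *shost)
{
        scsi_forget_host(shost);
        ....
}

After:
void scsi_remove_host(struct Scsi_Host *shost)
{
        /* Take and immediately release scan_mutex so that a scan
         * already in progress finishes before the host is torn down. */
        down(&shost->scan_mutex);
        up(&shost->scan_mutex);

        scsi_forget_host(shost);
        ....
}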
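
The lock-then-unlock idea in the patch above can be modelled in userspace. The sketch below is a hypothetical pthread analogue, not kernel code: struct host_model, scan_thread, remove_host and run_model are invented names, a 50 ms sleep stands in for the device scan, and a pthread mutex stands in for the scan_mutex semaphore.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for struct Scsi_Host; only what the model needs. */
struct host_model {
    pthread_mutex_t scan_mutex;    /* plays the role of shost->scan_mutex */
    pthread_mutex_t state_lock;    /* protects the flags below */
    pthread_cond_t  state_changed;
    bool scan_started;
    bool scan_done;
    bool removed_mid_scan;         /* set if removal overlapped the scan */
};

static void *scan_thread(void *arg)
{
    struct host_model *h = arg;
    struct timespec pause = { 0, 50 * 1000 * 1000 };   /* 50 ms */

    pthread_mutex_lock(&h->scan_mutex);    /* the scan holds scan_mutex */

    pthread_mutex_lock(&h->state_lock);
    h->scan_started = true;
    pthread_cond_signal(&h->state_changed);
    pthread_mutex_unlock(&h->state_lock);

    nanosleep(&pause, NULL);               /* pretend to probe the device */

    pthread_mutex_lock(&h->state_lock);
    h->scan_done = true;
    pthread_mutex_unlock(&h->state_lock);

    pthread_mutex_unlock(&h->scan_mutex);
    return NULL;
}

static void remove_host(struct host_model *h, bool wait_for_scan)
{
    if (wait_for_scan) {
        /* Barrier: blocks until a scan already in progress lets go. */
        pthread_mutex_lock(&h->scan_mutex);
        pthread_mutex_unlock(&h->scan_mutex);
    }

    pthread_mutex_lock(&h->state_lock);
    if (h->scan_started && !h->scan_done)
        h->removed_mid_scan = true;        /* teardown raced with the scan */
    pthread_mutex_unlock(&h->state_lock);
}

/* Returns true if the teardown overlapped an in-progress scan. */
static bool run_model(bool wait_for_scan)
{
    struct host_model h = { .scan_started = false };
    pthread_t tid;
    int rc;

    pthread_mutex_init(&h.scan_mutex, NULL);
    pthread_mutex_init(&h.state_lock, NULL);
    pthread_cond_init(&h.state_changed, NULL);

    rc = pthread_create(&tid, NULL, scan_thread, &h);
    assert(rc == 0);
    (void)rc;

    /* Wait until the scan has really started and holds scan_mutex. */
    pthread_mutex_lock(&h.state_lock);
    while (!h.scan_started)
        pthread_cond_wait(&h.state_changed, &h.state_lock);
    pthread_mutex_unlock(&h.state_lock);

    remove_host(&h, wait_for_scan);
    pthread_join(tid, NULL);
    return h.removed_mid_scan;
}
```

Calling run_model(true) from a main function returns false, because the barrier makes the removal wait for the scan to finish. run_model(false) will typically return true, since the removal runs while the scan is still sleeping. The barrier only covers a scan that already holds the mutex; a scan that starts after the unlock is not excluded, which may be one reason the commenter was unsure whether the patch is the right fix.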
Comment 4 Alan Stern 2007-08-14 07:40:22 UTC
2.6.12 is pretty old now, and the problem you describe here (which is _not_ the same as the one in the original bug report) was fixed long ago.  You should upgrade to a more recent kernel.
