Bug 14118

Summary: USB storage: "No sense [current]" when connecting
Product: IO/Storage
Component: SCSI
Status: RESOLVED CODE_FIX
Severity: normal
Priority: P1
Hardware: All
OS: Linux
Reporter: Mantas Mikulėnas (grawity)
Assignee: linux-scsi (linux-scsi)
CC: akpm, stern
URL: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/400652
Kernel Version: 2.6.30
Subsystem:
Regression: No
Bisected commit-id:
Attachments: Patch to handle errors with no sense

Description Mantas Mikulėnas 2009-09-04 12:03:01 UTC
When I connect my phone (Sony-Ericsson W760i in Mass Storage mode) to my system, it fails to mount the device, and dmesg gets flooded with "No Sense [current]" messages:

[ 1603.224162] sd 3:0:0:0: [sdc] Sense Key : No Sense [current]
[ 1603.224177] sd 3:0:0:0: [sdc] Add. Sense: No additional sense information
[ 1603.232153] sd 3:0:0:1: [sdd] Sense Key : No Sense [current]
[ 1603.232167] sd 3:0:0:1: [sdd] Add. Sense: No additional sense information
[ 1603.322091] sd 3:0:0:0: [sdc] Sense Key : No Sense [current]
[ 1603.322106] sd 3:0:0:0: [sdc] Add. Sense: No additional sense information

I have submitted a bug report on Ubuntu website, at https://bugs.launchpad.net/ubuntu/+source/linux/+bug/400652 - but I have since switched to Arch Linux, and I still have the same problem (using 2.6.30-ARCH kernel). (All the dmesg/lsusb/etc output posted on that page, and on https://bugs.launchpad.net/ubuntu/+source/linux/+bug/264789 , should still be valid. Tell me if they aren't.)
Comment 1 Andrew Morton 2009-09-04 23:15:14 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Fri, 4 Sep 2009 12:03:02 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=14118
> 
>                URL: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/400652
>            Summary: USB storage: "No sense [current]" when connecting
>            Product: Drivers
>            Version: 2.5
>     Kernel Version: 2.6.30
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: USB
>         AssignedTo: greg@kroah.com
>         ReportedBy: grawity@gmail.com
>         Regression: No
> 
> 
> When I connect my phone (Sony-Ericsson W760i in Mass Storage mode) to my
> system, it fails to mount the device, and dmesg gets flooded with "No Sense
> [current]" messages:
> 
> [ 1603.224162] sd 3:0:0:0: [sdc] Sense Key : No Sense [current]
> [ 1603.224177] sd 3:0:0:0: [sdc] Add. Sense: No additional sense information
> [ 1603.232153] sd 3:0:0:1: [sdd] Sense Key : No Sense [current]
> [ 1603.232167] sd 3:0:0:1: [sdd] Add. Sense: No additional sense information
> [ 1603.322091] sd 3:0:0:0: [sdc] Sense Key : No Sense [current]
> [ 1603.322106] sd 3:0:0:0: [sdc] Add. Sense: No additional sense information
> 
> I have submitted a bug report on Ubuntu website, at
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/400652 - but I have
> since switched to Arch Linux, and I still have the same problem (using
> 2.6.30-ARCH kernel). (All the dmesg/lsusb/etc output posted on that page,
> and on https://bugs.launchpad.net/ubuntu/+source/linux/+bug/264789 , should
> still be valid. Tell me if they aren't.)
>
Comment 2 Alan Stern 2009-09-05 02:24:45 UTC
> > http://bugzilla.kernel.org/show_bug.cgi?id=14118
> > 
> >                URL: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/400652
> >            Summary: USB storage: "No sense [current]" when connecting

> > When I connect my phone (Sony-Ericsson W760i in Mass Storage mode) to my
> > system, it fails to mount the device, and dmesg gets flooded with "No Sense
> > [current]" messages:
> > 
> > [ 1603.224162] sd 3:0:0:0: [sdc] Sense Key : No Sense [current]
> > [ 1603.224177] sd 3:0:0:0: [sdc] Add. Sense: No additional sense information
> > [ 1603.232153] sd 3:0:0:1: [sdd] Sense Key : No Sense [current]
> > [ 1603.232167] sd 3:0:0:1: [sdd] Add. Sense: No additional sense information
> > [ 1603.322091] sd 3:0:0:0: [sdc] Sense Key : No Sense [current]
> > [ 1603.322106] sd 3:0:0:0: [sdc] Add. Sense: No additional sense information

Please collect a usbmon trace showing what happens when you plug in the 
phone, and post the result or attach it to the bug report.  
Instructions for usbmon can be found in the kernel source file 
Documentation/usb/usbmon.txt.

Alan Stern
Comment 3 Mantas Mikulėnas 2009-09-05 08:47:28 UTC
Alan Stern wrote:
>>> http://bugzilla.kernel.org/show_bug.cgi?id=14118
> Please collect a usbmon trace showing what happens when you plug in the 
> phone, and post the result or attach it to the bug report.  
> Instructions for usbmon can be found in the kernel source file 
> Documentation/usb/usbmon.txt.
> 
> Alan Stern

usbmon trace attached (and uploaded to
<http://sine.cluenet.org/~grawity/trash/usbmon-2u.log>)
Comment 4 Alan Stern 2009-09-05 16:01:13 UTC
On Sat, 5 Sep 2009, Mantas Mikulėnas wrote:

> usbmon trace attached (and uploaded to
> <http://sine.cluenet.org/~grawity/trash/usbmon-2u.log>)

The usbmon trace shows several related problems.

The phone reports that it has two logical units (LUNs).  It says that
LUN 0 has 120093 sectors (60 MB) and LUN 1 has 3995649 sectors (2 GB).

The first problem occurs when the kernel tries to read 8 sectors from
LUN 0 starting at sector 120072.  The phone provides only 4 sectors of
data together with a Check Condition error indication, but when asked
for more detailed error information (Request Sense) it sends no error
data (No sense).  With no error info, the kernel thinks that nothing
was really wrong so it tries issuing the READ command again, with the
same result, over and over...  This unending loop is the second
problem.

The third problem is like the first; it occurs when the kernel tries to
read sector 3995648 of LUN 1 (the last sector).  The same sort of thing
happens; the phone provides no data and an error indication but no
error data, so the command is retried over and over.

The first and third problems are caused by bugs in the phone.  There
may be no good way around the first, but you should be able to fix the
third by specifying the "quirks" module parameter for usb-storage.  
Add a line saying

options usb-storage quirks=fce:e0c6:c

to your /etc/modprobe.conf file.  This will tell the kernel that each 
LUN really has one fewer sector than the phone claims.
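
For illustration only, here is how that quirks entry can be read and what
it does to the reported sizes.  This is a minimal Python sketch, not kernel
code.  It assumes the entry has the form VID:PID:Flags with hexadecimal
IDs, that the flag letter "c" means "reduce the reported capacity by one
sector", and that sectors are 512 bytes.  The helper names parse_quirk and
usable_sectors are hypothetical.

```python
SECTOR_SIZE = 512  # assumption: 512-byte sectors, consistent with the 60 MB / 2 GB figures


def parse_quirk(entry):
    """Split a usb-storage quirks entry 'VID:PID:Flags' into its parts."""
    vid, pid, flags = entry.split(":")
    return int(vid, 16), int(pid, 16), set(flags)


def usable_sectors(reported, flags):
    """Apply the capacity fix: with flag 'c' the LUN has one sector fewer."""
    return reported - 1 if "c" in flags else reported


vid, pid, flags = parse_quirk("fce:e0c6:c")
print(f"vendor 0x{vid:04x}, product 0x{pid:04x}, flags {sorted(flags)}")

# Sector counts as reported by the phone for LUN 0 and LUN 1.
for lun, reported in ((0, 120093), (1, 3995649)):
    sectors = usable_sectors(reported, flags)
    print(f"LUN {lun}: {sectors} sectors, last valid sector {sectors - 1}, "
          f"{sectors * SECTOR_SIZE} bytes")
```

With the flag applied, LUN 1 ends at sector 3995647, so the kernel no
longer has a reason to read sector 3995648.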

The second problem has nothing to do with USB; it is a bug in the SCSI
layer.  If the module parameter doesn't fix everything, I suggest this
bug report be reassigned to James Bottomley rather than Greg
Kroah-Hartman and the component be changed from USB to SCSI.

Alan Stern
Comment 5 Mantas Mikulėnas 2009-09-05 18:04:24 UTC
On Sat, Sep 05, 2009 at 12:01:09PM -0400, Alan Stern wrote:
> [snip]
> The third problem is like the first; it occurs when the kernel tries to
> read sector 3995648 of LUN 1 (the last sector).  The same sort of thing
> happens; the phone provides no data and an error indication but no
> error data, so the command is retried over and over.
> 
> The first and third problems are caused by bugs in the phone.  There
> may be no good way around the first, but you should be able to fix the
> third by specifying the "quirks" module parameter for usb-storage.  
> Add a line saying
> 
> options usb-storage quirks=fce:e0c6:c
> 
> to your /etc/modprobe.conf file.  This will tell the kernel that each 
> LUN really has one fewer sector than the phone claims.

That did fix the sector count, but the other problems still are there.

After first reporting this on Ubuntu's Launchpad (a month ago), I had
been told to try changing /lib/udev/rules.d/60-persistent-storage.rules
to make vol_id/blkid not check for RAID, but it's only a temporary
workaround.

> The second problem has nothing to do with USB; it is a bug in the SCSI
> layer.  If the module parameter doesn't fix everything, I suggest this
> bug report be reassigned to James Bottomley rather than Greg
> Kroah-Hartman and the component be changed from USB to SCSI.

Should I edit the bug myself? (If yes, what address do I reassign it to?)
Comment 6 Alan Stern 2009-09-05 20:40:10 UTC
On Sat, 5 Sep 2009, Mantas Mikulėnas wrote:

> After first reporting this on Ubuntu's Launchpad (a month ago), I had
> been told to try changing /lib/udev/rules.d/60-persistent-storage.rules
> to make vol_id/blkid not check for RAID, but it's only a temporary
> workaround.

Right.  In addition there's a program in hal that also checks for RAID, 
so you'd have to make two changes.

> > The second problem has nothing to do with USB; it is a bug in the SCSI
> > layer.  If the module parameter doesn't fix everything, I suggest this
> > bug report be reassigned to James Bottomley rather than Greg
> > Kroah-Hartman and the component be changed from USB to SCSI.
> 
> Should I edit the bug myself? (If yes, what address do I reassign it to?)

Maybe we can prevail upon Andrew Morton to reassign it for you.

Alan Stern
Comment 7 Andrew Morton 2009-09-05 20:53:45 UTC
On Sat, 5 Sep 2009 16:40:06 -0400 (EDT) Alan Stern <stern@rowland.harvard.edu> wrote:

> > > The second problem has nothing to do with USB; it is a bug in the SCSI
> > > layer.  If the module parameter doesn't fix everything, I suggest this
> > > bug report be reassigned to James Bottomley rather than Greg
> > > Kroah-Hartman and the component be changed from USB to SCSI.
> > 
> > Should I edit the bug myself? (If yes, what address do I reassign it to?)
> 
> Maybe we can prevail upon Andrew Morton to reassign it for you.

OK, I reassigned it to scsi, but I expect it would help the scsi guys
if someone could add a new summary of what they think the actual bug is.
Comment 8 Alan Stern 2009-09-05 21:18:18 UTC
On Sat, 5 Sep 2009, Andrew Morton wrote:

> OK, I reassigned it to scsi, but I expect it would help the scsi guys
> if someone could add a new summary of what they think the actual bug is.

Okay, here's a quick summary.  This USB mass storage device has a bug: 
It doesn't like to access the last 16 or so sectors of LUN 0.  When 
asked to read from those sectors it returns no data and Check Condition 
set.  Then in response to REQUEST SENSE it returns no information 
(SK=0, ASC=ASCQ=0).
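
To make "no information" concrete, here is a minimal Python sketch of
decoding fixed-format sense data.  It assumes the usual layout: response
code in byte 0, sense key in the low four bits of byte 2, ASC in byte 12
and ASCQ in byte 13.  The byte strings are hypothetical examples, not taken
from the usbmon trace, and the function names are made up for this sketch.

```python
def decode_fixed_sense(buf):
    """Decode the fields of interest from fixed-format sense data."""
    response_code = buf[0] & 0x7F
    if response_code not in (0x70, 0x71):
        raise ValueError("not fixed-format sense data")
    return {
        "current": response_code == 0x70,  # 0x70 = current, 0x71 = deferred
        "sense_key": buf[2] & 0x0F,
        "asc": buf[12],
        "ascq": buf[13],
    }


def has_no_information(buf):
    """True when the device reports SK=0 and ASC=ASCQ=0."""
    sense = decode_fixed_sense(buf)
    return sense["sense_key"] == 0 and sense["asc"] == 0 and sense["ascq"] == 0


# Hypothetical 18-byte replies: one empty, one reporting a medium error.
empty_reply = bytes([0x70, 0, 0x00, 0, 0, 0, 0, 0x0A] + [0] * 10)
medium_error = bytes([0x70, 0, 0x03, 0, 0, 0, 0, 0x0A, 0, 0, 0, 0, 0x11, 0x00, 0, 0, 0, 0])

print(decode_fixed_sense(empty_reply), has_no_information(empty_reply))
print(decode_fixed_sense(medium_error), has_no_information(medium_error))
```

A reply like the first one is what produces the "No Sense [current]" and
"No additional sense information" lines in the log.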

This causes the SCSI midlayer to reissue the read request, with the 
same result.  We enter an unending loop because each READ is a new 
request with a new timeout.  That's the real problem -- the bug in the 
phone wouldn't matter much if SCSI would just give up and fail the READ 
after a couple of retries.

There are a few more details in comment #4.  Note that recently people
have been encountering more and more devices with this same kind of bug
(READ fails, REQUEST SENSE returns nothing).

My suggestion for a fix: Don't call scsi_requeue_command() if the 
current command made zero forward progress.  But this will have to be 
elaborated and made reliable.
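
As a rough illustration of that idea, here is a small Python simulation.
It is only a sketch of the "give up after zero forward progress" rule, not
the SCSI midlayer and not an actual patch.  The retry cap of 5 is an
arbitrary assumption, and read_with_retry_cap and broken_phone are
hypothetical names.

```python
MAX_RETRIES = 5  # assumption: an arbitrary small cap, not a value taken from the kernel


def read_with_retry_cap(issue_read, total_sectors, max_retries=MAX_RETRIES):
    """Read total_sectors, failing after repeated attempts with no progress.

    issue_read(offset, count) returns the number of sectors transferred.
    Returns (success, sectors_transferred).
    """
    done = 0
    stalled = 0  # consecutive attempts that transferred nothing
    while done < total_sectors:
        transferred = issue_read(done, total_sectors - done)
        done += transferred
        if transferred > 0:
            stalled = 0  # forward progress: requeueing the remainder is fine
            continue
        stalled += 1
        if stalled > max_retries:
            return False, done  # fail the request instead of retrying forever
    return True, done


def broken_phone(offset, count):
    """Hypothetical device: only the first 4 sectors of the request are readable."""
    return max(0, min(count, 4 - offset))


print(read_with_retry_cap(broken_phone, 8))                    # gives up after 4 sectors
print(read_with_retry_cap(lambda offset, count: count, 8))     # healthy device
```

Without the stalled counter, the loop above would never terminate for
broken_phone, which is the behaviour seen in this report.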

Alan Stern
Comment 9 Alan Stern 2009-09-28 18:01:16 UTC
Created attachment 23196 [details]
Patch to handle errors with no sense

Try this patch.  It should fix the problem of the unending retries.  There will still be errors, but each one will be retried only a small number of times.
Comment 10 Mantas Mikulėnas 2009-10-03 20:19:36 UTC
> Try this patch.  It should fix the problem of the unending retries.  There
> will still be errors, but each one will be retried only a small number of
> times.

The patch seems to fix the problems.

(Tried with 2.6.31.1 kernel, default Arch Linux configuration, if that matters.)
Comment 11 Alan Stern 2009-10-03 21:29:39 UTC
Thanks for testing.  I will submit the patch for inclusion in 2.6.32 and the 2.6.31 stable series.
Comment 12 Alan Stern 2009-10-13 02:40:33 UTC
The patch has been merged as commit f1a0743bc0e7a30c032b1eb78f6a2b0f805b4597.  You can close out this bug report.