Bug 12120

Summary: [Block layer or SCSI] requests aborted too early during check_partition()
Product: IO/Storage Reporter: Stefan Richter (stefanr)
Component: SCSIAssignee: linux-scsi (linux-scsi)
Status: CLOSED CODE_FIX    
Severity: normal    
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.28-rc Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 11808    

Description Stefan Richter 2008-11-29 05:19:55 UTC
Latest working kernel version: 2.6.27.4
Earliest failing kernel version: 2.6.28-rc6  (I didn't test earlier -rcs yet)

Hardware Environment: OxSemi 912 and 92x based FireWire harddisks


See http://marc.info/?l=linux-scsi&m=122774022130397 --
Disks may still spin their motor up when the kernel attempts to read partitions. In kernel 2.6.27, this was handled transparently.  But since 2.6.28-rc?, the kernel immediately aborts the associated requests and sets the new scsi_device offline.
Comment 1 Anonymous Emailer 2008-11-29 07:12:35 UTC
Reply-To: stefanr@s5r6.in-berlin.de

On 29 Nov, bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=12120
> 
>            Summary: [Block layer or SCSI] requests aborted too early during
>                     check_partition()
>            Product: IO/Storage
>            Version: 2.5
>      KernelVersion: 2.6.28-rc
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: SCSI
>         AssignedTo: linux-scsi@vger.kernel.org
>         ReportedBy: stefan-r-bz@s5r6.in-berlin.de
> OtherBugsDependingO 11808
>              nThis:
>         Regression: 1
> 
> 
> Latest working kernel version: 2.6.27.4
> Earliest failing kernel version: 2.6.28-rc6  (I didn't test earlier -rcs yet)
> 
> Hardware Environment: OxSemi 912 and 92x based FireWire harddisks
> 
> 
> See http://marc.info/?l=linux-scsi&m=122774022130397 --
> Disks may still spin their motor up when the kernel attempts to read
> partitions. In kernel 2.6.27, this was handled transparently.  But since
> 2.6.28-rc?, the kernel immediately aborts the associated requests and sets
> the
> new scsi_device offline.
> 
> 

I tried a workaround in firewire-sbp2:  Check whether the new sdev is
offline right after scsi_add_device.  If so, remove the sdev and try
everything again.  However,
  - a regression still remains because the addition of the HDD takes
    about 30...40 seconds instead of only a few seconds like in 2.6.27.
  - The workaround works with OXFW912 and OXUF922, but not with
    OXUF924DSB.  Login retries don't succeed with the latter at all.
  - The workaround would be very hard to implement in the older
    ieee1394/sbp2 driver.
  - Other hardware besides FireWire may be affected.
Comment 2 Anonymous Emailer 2008-11-29 10:36:11 UTC
Reply-To: James.Bottomley@HansenPartnership.com

On Sat, 2008-11-29 at 16:12 +0100, Stefan Richter wrote:
> On 29 Nov, bugme-daemon@bugzilla.kernel.org wrote:
> > http://bugzilla.kernel.org/show_bug.cgi?id=12120
> > 
> >            Summary: [Block layer or SCSI] requests aborted too early during
> >                     check_partition()
> >            Product: IO/Storage
> >            Version: 2.5
> >      KernelVersion: 2.6.28-rc
> >           Platform: All
> >         OS/Version: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: normal
> >           Priority: P1
> >          Component: SCSI
> >         AssignedTo: linux-scsi@vger.kernel.org
> >         ReportedBy: stefan-r-bz@s5r6.in-berlin.de
> > OtherBugsDependingO 11808
> >              nThis:
> >         Regression: 1
> > 
> > 
> > Latest working kernel version: 2.6.27.4
> > Earliest failing kernel version: 2.6.28-rc6  (I didn't test earlier -rcs
> yet)
> > 
> > Hardware Environment: OxSemi 912 and 92x based FireWire harddisks
> > 
> > 
> > See http://marc.info/?l=linux-scsi&m=122774022130397 --
> > Disks may still spin their motor up when the kernel attempts to read
> > partitions. In kernel 2.6.27, this was handled transparently.  But since
> > 2.6.28-rc?, the kernel immediately aborts the associated requests and sets
> the
> > new scsi_device offline.
> > 
> > 
> 
> I tried a workaround in firewire-sbp2:  Check whether the new sdev is
> offline right after scsi_add_device.  If so, remove the sdev and try
> everything again.  However,
>   - a regression still remains because the addition of the HDD takes
>     about 30...40 seconds instead of only a few seconds like in 2.6.27.
>   - The workaround works with OXFW912 and OXUF922, but not with
>     OXUF924DSB.  Login retries don't succeed with the latter at all.
>   - The workaround would be very hard to implement in the older
>     ieee1394/sbp2 driver.
>   - Other hardware besides FireWire may be affected.

It sounds like the device isn't being spun up, so commands which require
media read are being failed with not ready.  Could we get some traces to
show this; just enable all tracing in the boot up sequence:

echo 0xffffffff > /sys/module/scsi_mod/parameters/scsi_logging_level

James
Comment 3 Anonymous Emailer 2008-11-29 11:29:54 UTC
Reply-To: stefanr@s5r6.in-berlin.de

> It sounds like the device isn't being spun up, so commands which require
> media read are being failed with not ready.  Could we get some traces to
> show this; just enable all tracing in the boot up sequence:
> 
> echo 0xffffffff > /sys/module/scsi_mod/parameters/scsi_logging_level

OK, here is a log from a failing session in 2.6.28-rc6.
The firmware responds to everything without spinning the disk up,
until the first READ(10) is sent.

  # spin disk down artificially with
  # "modprobe -r firewire-sbp2" while manage_start_stop == 1
  #
  # then "modprobe firewire-sbp2"

Nov 29 19:57:05 stein Error handler scsi_eh_20 sleeping
Nov 29 19:57:05 stein scsi20 : SBP-2 IEEE-1394
Nov 29 19:57:06 stein firewire_sbp2: fw3.0: logged in to LUN 0000 (0 retries)
Nov 29 19:57:06 stein scsi 20:0:0:0: scsi scan: INQUIRY pass 1 length 36
Nov 29 19:57:06 stein scsi 20:0:0:0: Send: 0xf57986b0 
Nov 29 19:57:06 stein scsi 20:0:0:0: CDB: Inquiry: 12 00 00 00 24 00
Nov 29 19:57:06 stein buffer = 0xf725f9d0, bufflen = 36, queuecommand 0xf85fb980
Nov 29 19:57:06 stein leaving scsi_dispatch_cmnd()
Nov 29 19:57:06 stein scsi 20:0:0:0: Done: 0xf57986b0 SUCCESS
Nov 29 19:57:06 stein scsi 20:0:0:0: Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein scsi 20:0:0:0: CDB: Inquiry: 12 00 00 00 24 00
Nov 29 19:57:06 stein scsi 20:0:0:0: scsi host busy 1 failed 0
Nov 29 19:57:06 stein scsi 20:0:0:0: Notifying upper driver of completion (result 0)
Nov 29 19:57:06 stein 0 sectors total, 36 bytes done.
Nov 29 19:57:06 stein scsi scan: INQUIRY successful with code 0x0
Nov 29 19:57:06 stein scsi 20:0:0:0: Direct-Access-RBC ST340083 2A               3.03 PQ: 0 ANSI: 4
Nov 29 19:57:06 stein sd 20:0:0:0: sd_attach
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] sd_revalidate_disk
Nov 29 19:57:06 stein sd 20:0:0:0: Send: 0xf57986b0 
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 19:57:06 stein buffer = 0x00000000, bufflen = 0, queuecommand 0xf85fb980
Nov 29 19:57:06 stein leaving scsi_dispatch_cmnd()
Nov 29 19:57:06 stein sd 20:0:0:0: Done: 0xf57986b0 SUCCESS
Nov 29 19:57:06 stein sd 20:0:0:0: Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 19:57:06 stein sd 20:0:0:0: scsi host busy 1 failed 0
Nov 29 19:57:06 stein sd 20:0:0:0: Notifying upper driver of completion (result 0)
Nov 29 19:57:06 stein 0 sectors total, 0 bytes done.
Nov 29 19:57:06 stein sd 20:0:0:0: Send: 0xf57986b0 
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Read Capacity(10): 25 00 00 00 00 00 00 00 00 00
Nov 29 19:57:06 stein buffer = 0xf725f9d0, bufflen = 8, queuecommand 0xf85fb980
Nov 29 19:57:06 stein leaving scsi_dispatch_cmnd()
Nov 29 19:57:06 stein sd 20:0:0:0: Done: 0xf57986b0 SUCCESS
Nov 29 19:57:06 stein sd 20:0:0:0: Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Read Capacity(10): 25 00 00 00 00 00 00 00 00 00
Nov 29 19:57:06 stein sd 20:0:0:0: scsi host busy 1 failed 0
Nov 29 19:57:06 stein sd 20:0:0:0: Notifying upper driver of completion (result 0)
Nov 29 19:57:06 stein 0 sectors total, 8 bytes done.
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] 781422768 512-byte hardware sectors: (400 GB/372 GiB)
Nov 29 19:57:06 stein sd 20:0:0:0: Send: 0xf57986b0 
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Mode Sense(6): 1a 00 3f 00 04 00
Nov 29 19:57:06 stein buffer = 0xf725f9d0, bufflen = 4, queuecommand 0xf85fb980
Nov 29 19:57:06 stein leaving scsi_dispatch_cmnd()
Nov 29 19:57:06 stein sd 20:0:0:0: Done: 0xf57986b0 SUCCESS
Nov 29 19:57:06 stein sd 20:0:0:0: Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Mode Sense(6): 1a 00 3f 00 04 00
Nov 29 19:57:06 stein sd 20:0:0:0: scsi host busy 1 failed 0
Nov 29 19:57:06 stein sd 20:0:0:0: Notifying upper driver of completion (result 0)
Nov 29 19:57:06 stein 0 sectors total, 4 bytes done.
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Write Protect is off
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Mode Sense: 11 00 00 00
Nov 29 19:57:06 stein sd 20:0:0:0: Send: 0xf57986b0 
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 04 00
Nov 29 19:57:06 stein buffer = 0xf725f9d0, bufflen = 4, queuecommand 0xf85fb980
Nov 29 19:57:06 stein leaving scsi_dispatch_cmnd()
Nov 29 19:57:06 stein sd 20:0:0:0: Done: 0xf57986b0 SUCCESS
Nov 29 19:57:06 stein sd 20:0:0:0: Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 04 00
Nov 29 19:57:06 stein sd 20:0:0:0: scsi host busy 1 failed 0
Nov 29 19:57:06 stein sd 20:0:0:0: Notifying upper driver of completion (result 0)
Nov 29 19:57:06 stein 0 sectors total, 4 bytes done.
Nov 29 19:57:06 stein sd 20:0:0:0: Send: 0xf57986b0 
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 16 00
Nov 29 19:57:06 stein buffer = 0xf725f9d0, bufflen = 22, queuecommand 0xf85fb980
Nov 29 19:57:06 stein leaving scsi_dispatch_cmnd()
Nov 29 19:57:06 stein sd 20:0:0:0: Done: 0xf57986b0 SUCCESS
Nov 29 19:57:06 stein sd 20:0:0:0: Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 16 00
Nov 29 19:57:06 stein sd 20:0:0:0: scsi host busy 1 failed 0
Nov 29 19:57:06 stein sd 20:0:0:0: Notifying upper driver of completion (result 0)
Nov 29 19:57:06 stein 0 sectors total, 22 bytes done.
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] sd_open
Nov 29 19:57:06 stein scsi_block_when_processing_errors: rtn: 1
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] sd_revalidate_disk
Nov 29 19:57:06 stein sd 20:0:0:0: Send: 0xf57984a0 
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 19:57:06 stein buffer = 0x00000000, bufflen = 0, queuecommand 0xf85fb980
Nov 29 19:57:06 stein leaving scsi_dispatch_cmnd()
Nov 29 19:57:06 stein sd 20:0:0:0: Done: 0xf57984a0 SUCCESS
Nov 29 19:57:06 stein sd 20:0:0:0: Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 19:57:06 stein sd 20:0:0:0: scsi host busy 1 failed 0
Nov 29 19:57:06 stein sd 20:0:0:0: Notifying upper driver of completion (result 0)
Nov 29 19:57:06 stein 0 sectors total, 0 bytes done.
Nov 29 19:57:06 stein sd 20:0:0:0: Send: 0xf5798550 
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Read Capacity(10): 25 00 00 00 00 00 00 00 00 00
Nov 29 19:57:06 stein buffer = 0xf725fd68, bufflen = 8, queuecommand 0xf85fb980
Nov 29 19:57:06 stein leaving scsi_dispatch_cmnd()
Nov 29 19:57:06 stein sd 20:0:0:0: Done: 0xf5798550 SUCCESS
Nov 29 19:57:06 stein sd 20:0:0:0: Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Read Capacity(10): 25 00 00 00 00 00 00 00 00 00
Nov 29 19:57:06 stein sd 20:0:0:0: scsi host busy 1 failed 0
Nov 29 19:57:06 stein sd 20:0:0:0: Notifying upper driver of completion (result 0)
Nov 29 19:57:06 stein 0 sectors total, 8 bytes done.
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] 781422768 512-byte hardware sectors: (400 GB/372 GiB)
Nov 29 19:57:06 stein sd 20:0:0:0: Send: 0xf57981e0 
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Mode Sense(6): 1a 00 3f 00 04 00
Nov 29 19:57:06 stein buffer = 0xf725fed8, bufflen = 4, queuecommand 0xf85fb980
Nov 29 19:57:06 stein leaving scsi_dispatch_cmnd()
Nov 29 19:57:06 stein sd 20:0:0:0: Done: 0xf57981e0 SUCCESS
Nov 29 19:57:06 stein sd 20:0:0:0: Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Mode Sense(6): 1a 00 3f 00 04 00
Nov 29 19:57:06 stein sd 20:0:0:0: scsi host busy 1 failed 0
Nov 29 19:57:06 stein sd 20:0:0:0: Notifying upper driver of completion (result 0)
Nov 29 19:57:06 stein 0 sectors total, 4 bytes done.
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Write Protect is off
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Mode Sense: 11 00 00 00
Nov 29 19:57:06 stein sd 20:0:0:0: Send: 0xf5798600 
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 04 00
Nov 29 19:57:06 stein buffer = 0xf725f918, bufflen = 4, queuecommand 0xf85fb980
Nov 29 19:57:06 stein leaving scsi_dispatch_cmnd()
Nov 29 19:57:06 stein sd 20:0:0:0: Done: 0xf5798600 SUCCESS
Nov 29 19:57:06 stein sd 20:0:0:0: Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 04 00
Nov 29 19:57:06 stein sd 20:0:0:0: scsi host busy 1 failed 0
Nov 29 19:57:06 stein sd 20:0:0:0: Notifying upper driver of completion (result 0)
Nov 29 19:57:06 stein 0 sectors total, 4 bytes done.
Nov 29 19:57:06 stein sd 20:0:0:0: Send: 0xf5798ef0 
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 16 00
Nov 29 19:57:06 stein buffer = 0xe5961b40, bufflen = 22, queuecommand 0xf85fb980
Nov 29 19:57:06 stein leaving scsi_dispatch_cmnd()
Nov 29 19:57:06 stein sd 20:0:0:0: Done: 0xf5798ef0 SUCCESS
Nov 29 19:57:06 stein sd 20:0:0:0: Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein sd 20:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 16 00
Nov 29 19:57:06 stein sd 20:0:0:0: scsi host busy 1 failed 0
Nov 29 19:57:06 stein sd 20:0:0:0: Notifying upper driver of completion (result 0)
Nov 29 19:57:06 stein 0 sectors total, 22 bytes done.
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

  # partition scanning code gets into gears,
  # issues very first READ command,
  # now the failures start

Nov 29 19:57:06 stein sdd:<6>sd 20:0:0:0: [sdd] sd_init_command: block=0, count=8
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] block=0
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] reading 8/8 512 byte blocks.
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Send: 0xf5798ef0 
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Nov 29 19:57:06 stein buffer = 0xe5961b40, bufflen = 4096, queuecommand 0xf85fb980
Nov 29 19:57:06 stein leaving scsi_dispatch_cmnd()
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Done: 0xf5798ef0 FAILED
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Sense Key : Not Ready [current] 
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Add. Sense: Logical unit not ready, initializing command required
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] scsi host busy 1 failed 0
Nov 29 19:57:06 stein Waking error handler thread
Nov 29 19:57:06 stein Error handler scsi_eh_20 waking up
Nov 29 19:57:06 stein sd 20:0:0:0: scsi_eh_prt_fail_stats: cmds failed: 1, cancel: 0
Nov 29 19:57:06 stein Total of 1 commands on 1 devices require eh work
Nov 29 19:57:06 stein scsi_eh_20: Sending START_UNIT to sdev: 0xe58fc7f0
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Send: 0xf5798ef0 
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Start/Stop Unit: 1b 00 00 00 01 00
Nov 29 19:57:06 stein buffer = 0x00000000, bufflen = 0, queuecommand 0xf85fb980
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Done: 0xf5798ef0 SUCCESS
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Start/Stop Unit: 1b 00 00 00 01 00
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Unrecognized sense data (in hex):
Nov 29 19:57:06 stein 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
Nov 29 19:57:06 stein 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
Nov 29 19:57:06 stein Sense Key : No Sense [current] 
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Add. Sense: No additional sense information
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] scsi host busy 1 failed 1
Nov 29 19:57:06 stein scsi_send_eh_cmnd: scmd: f5798ef0, timeleft: 0
Nov 29 19:57:06 stein firewire_sbp2: fw3.0: sbp2_scsi_abort
Nov 29 19:57:06 stein scsi_eh_done scmd: f5798ef0 result: 20000
Nov 29 19:57:06 stein scsi_eh_20: START_UNIT failed to sdev: 0xe58fc7f0
Nov 29 19:57:06 stein scsi_eh_20: Sending BDR sdev: 0xe58fc7f0
Nov 29 19:57:06 stein scsi_eh_20: BDR failed sdev:0xe58fc7f0
Nov 29 19:57:06 stein scsi_eh_20: Sending target reset to target 0
Nov 29 19:57:06 stein scsi_eh_20: Target reset failed target: 0
Nov 29 19:57:06 stein scsi_eh_20: Sending BRST chan: 0
Nov 29 19:57:06 stein scsi_try_bus_reset: Snd Bus RST
Nov 29 19:57:06 stein scsi_eh_20: BRST failed chan: 0
Nov 29 19:57:06 stein scsi_eh_20: Sending HRST
Nov 29 19:57:06 stein scsi_try_host_reset: Snd Host RST
Nov 29 19:57:06 stein scsi_eh_20: HRST failed
Nov 29 19:57:06 stein sd 20:0:0:0: Device offlined - not ready after error recovery
Nov 29 19:57:06 stein scsi_eh_20: flush finish cmd: f5798ef0
Nov 29 19:57:06 stein sd 20:0:0:0: Notifying upper driver of completion (result 2)
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein 8 sectors total, 0 bytes done.
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 19:57:06 stein end_request: I/O error, dev sdd, sector 0
Nov 29 19:57:06 stein Buffer I/O error on device sdd, logical block 0
Nov 29 19:57:06 stein scsi_restart_operations: waking up host to restart
Nov 29 19:57:06 stein Error handler scsi_eh_20 sleeping
Nov 29 19:57:06 stein sd 20:0:0:0: rejecting I/O to offline device
Nov 29 19:57:06 stein Buffer I/O error on device sdd, logical block 0
Nov 29 19:57:06 stein ldm_validate_partition_table(): Disk read failed.
Nov 29 19:57:06 stein sd 20:0:0:0: rejecting I/O to offline device
Nov 29 19:57:06 stein Buffer I/O error on device sdd, logical block 0
Nov 29 19:57:06 stein sd 20:0:0:0: rejecting I/O to offline device
Nov 29 19:57:06 stein Buffer I/O error on device sdd, logical block 0
Nov 29 19:57:06 stein unable to read partition table
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] sd_release
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Attached SCSI disk
Nov 29 19:57:06 stein sg_alloc: dev=3 
Nov 29 19:57:06 stein sd 20:0:0:0: Attached scsi generic sg3 type 14
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] sd_open
Nov 29 19:57:06 stein scsi_block_when_processing_errors: rtn: 0
Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] sd_open
Nov 29 19:57:06 stein scsi_block_when_processing_errors: rtn: 0
Nov 29 19:57:08 stein firewire_sbp2: fw3.0: reconnected to LUN 0000 (0 retries)
Nov 29 19:57:14 stein firewire_sbp2: fw3.0: reconnected to LUN 0000 (0 retries)
Nov 29 19:58:09 stein xinetd[4299]: START: imap2 pid=18913 from=192.168.0.42
Nov 29 19:58:09 stein imapd[18913]: imap service init from 192.168.0.42
Nov 29 19:58:09 stein imapd[18913]: Login user=stefan host=stein [192.168.0.42]
Nov 29 19:58:15 stein imapd[18914]: imap service init from 192.168.0.42
Nov 29 19:58:15 stein xinetd[4299]: START: imap2 pid=18914 from=192.168.0.42
Nov 29 19:58:15 stein imapd[18914]: Login user=stefan host=stein [192.168.0.42]
Nov 29 19:58:19 stein imapd[18913]: Logout user=stefan host=stein [192.168.0.42]
Nov 29 19:58:19 stein xinetd[4299]: EXIT: imap2 status=0 pid=18913 duration=10(sec)
Nov 29 19:59:07 stein imapd[18914]: Logout user=stefan host=stein [192.168.0.42]
Nov 29 19:59:07 stein xinetd[4299]: EXIT: imap2 status=0 pid=18914 duration=52(sec)
Comment 4 Anonymous Emailer 2008-11-29 11:32:41 UTC
Reply-To: stefanr@s5r6.in-berlin.de

For comparison, here is a log from the very same disk, also spun down
when firewire-sbp2 logs in, from 2.6.27.4:

  # modprobe firewire-sbp2

Nov 29 20:17:01 mini scsi5 : SBP-2 IEEE-1394
Nov 29 20:17:01 mini Error handler scsi_eh_5 sleeping
Nov 29 20:17:01 mini firewire_sbp2: fw1.0: logged in to LUN 0000 (0 retries)
Nov 29 20:17:01 mini scsi 5:0:0:0: scsi scan: INQUIRY pass 1 length 36
Nov 29 20:17:01 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 5500, (ffffffff80375047)
Nov 29 20:17:01 mini scsi 5:0:0:0: Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini scsi 5:0:0:0: CDB: Inquiry: 12 00 00 00 24 00
Nov 29 20:17:01 mini buffer = 0xffff88007df61eb0, bufflen = 36, queuecommand 0xffffffffa00d2a8a
Nov 29 20:17:01 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:01 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:01 mini scsi 5:0:0:0: Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:01 mini scsi 5:0:0:0: Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:01 mini scsi 5:0:0:0: CDB: Inquiry: 12 00 00 00 24 00
Nov 29 20:17:01 mini scsi 5:0:0:0: scsi host busy 1 failed 0
Nov 29 20:17:01 mini scsi 5:0:0:0: Notifying upper driver of completion (result 0)
Nov 29 20:17:01 mini 0 sectors total, 36 bytes done.
Nov 29 20:17:01 mini scsi scan: INQUIRY successful with code 0x0
Nov 29 20:17:01 mini scsi 5:0:0:0: Direct-Access-RBC ST340083 2A               3.03 PQ: 0 ANSI: 4
Nov 29 20:17:01 mini sd 5:0:0:0: sd_attach
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] sd_revalidate_disk
Nov 29 20:17:01 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 30000, (ffffffff80375047)
Nov 29 20:17:01 mini sd 5:0:0:0: Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 20:17:01 mini buffer = 0x0000000000000000, bufflen = 0, queuecommand 0xffffffffa00d2a8a
Nov 29 20:17:01 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:01 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:01 mini sd 5:0:0:0: Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:01 mini sd 5:0:0:0: Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 20:17:01 mini sd 5:0:0:0: scsi host busy 1 failed 0
Nov 29 20:17:01 mini sd 5:0:0:0: Notifying upper driver of completion (result 0)
Nov 29 20:17:01 mini 0 sectors total, 0 bytes done.
Nov 29 20:17:01 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 30000, (ffffffff80375047)
Nov 29 20:17:01 mini sd 5:0:0:0: Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Read Capacity(10): 25 00 00 00 00 00 00 00 00 00
Nov 29 20:17:01 mini buffer = 0xffff88007df61eb0, bufflen = 8, queuecommand 0xffffffffa00d2a8a
Nov 29 20:17:01 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:01 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:01 mini sd 5:0:0:0: Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:01 mini sd 5:0:0:0: Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Read Capacity(10): 25 00 00 00 00 00 00 00 00 00
Nov 29 20:17:01 mini sd 5:0:0:0: scsi host busy 1 failed 0
Nov 29 20:17:01 mini sd 5:0:0:0: Notifying upper driver of completion (result 0)
Nov 29 20:17:01 mini 0 sectors total, 8 bytes done.
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] 781422768 512-byte hardware sectors (400088 MB)
Nov 29 20:17:01 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 30000, (ffffffff80375047)
Nov 29 20:17:01 mini sd 5:0:0:0: Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Mode Sense(6): 1a 00 3f 00 04 00
Nov 29 20:17:01 mini buffer = 0xffff88007df61eb0, bufflen = 4, queuecommand 0xffffffffa00d2a8a
Nov 29 20:17:01 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:01 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:01 mini sd 5:0:0:0: Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:01 mini sd 5:0:0:0: Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Mode Sense(6): 1a 00 3f 00 04 00
Nov 29 20:17:01 mini sd 5:0:0:0: scsi host busy 1 failed 0
Nov 29 20:17:01 mini sd 5:0:0:0: Notifying upper driver of completion (result 0)
Nov 29 20:17:01 mini 0 sectors total, 4 bytes done.
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] Write Protect is off
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] Mode Sense: 11 00 00 00
Nov 29 20:17:01 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 30000, (ffffffff80375047)
Nov 29 20:17:01 mini sd 5:0:0:0: Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 04 00
Nov 29 20:17:01 mini buffer = 0xffff88007df61eb0, bufflen = 4, queuecommand 0xffffffffa00d2a8a
Nov 29 20:17:01 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:01 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:01 mini sd 5:0:0:0: Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:01 mini sd 5:0:0:0: Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 04 00
Nov 29 20:17:01 mini sd 5:0:0:0: scsi host busy 1 failed 0
Nov 29 20:17:01 mini sd 5:0:0:0: Notifying upper driver of completion (result 0)
Nov 29 20:17:01 mini 0 sectors total, 4 bytes done.
Nov 29 20:17:01 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 30000, (ffffffff80375047)
Nov 29 20:17:01 mini sd 5:0:0:0: Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 16 00
Nov 29 20:17:01 mini buffer = 0xffff88007df61eb0, bufflen = 22, queuecommand 0xffffffffa00d2a8a
Nov 29 20:17:01 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:01 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:01 mini sd 5:0:0:0: Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:01 mini sd 5:0:0:0: Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 16 00
Nov 29 20:17:01 mini sd 5:0:0:0: scsi host busy 1 failed 0
Nov 29 20:17:01 mini sd 5:0:0:0: Notifying upper driver of completion (result 0)
Nov 29 20:17:01 mini 0 sectors total, 22 bytes done.
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] sd_open
Nov 29 20:17:01 mini scsi_block_when_processing_errors: rtn: 1
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] sd_revalidate_disk
Nov 29 20:17:01 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 30000, (ffffffff80375047)
Nov 29 20:17:01 mini sd 5:0:0:0: Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 20:17:01 mini buffer = 0x0000000000000000, bufflen = 0, queuecommand 0xffffffffa00d2a8a
Nov 29 20:17:01 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:01 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:01 mini sd 5:0:0:0: Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:01 mini sd 5:0:0:0: Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 20:17:01 mini sd 5:0:0:0: scsi host busy 1 failed 0
Nov 29 20:17:01 mini sd 5:0:0:0: Notifying upper driver of completion (result 0)
Nov 29 20:17:01 mini 0 sectors total, 0 bytes done.
Nov 29 20:17:01 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 30000, (ffffffff80375047)
Nov 29 20:17:01 mini sd 5:0:0:0: Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Read Capacity(10): 25 00 00 00 00 00 00 00 00 00
Nov 29 20:17:01 mini buffer = 0xffff88007df61eb0, bufflen = 8, queuecommand 0xffffffffa00d2a8a
Nov 29 20:17:01 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:01 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:01 mini sd 5:0:0:0: Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:01 mini sd 5:0:0:0: Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Read Capacity(10): 25 00 00 00 00 00 00 00 00 00
Nov 29 20:17:01 mini sd 5:0:0:0: scsi host busy 1 failed 0
Nov 29 20:17:01 mini sd 5:0:0:0: Notifying upper driver of completion (result 0)
Nov 29 20:17:01 mini 0 sectors total, 8 bytes done.
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] 781422768 512-byte hardware sectors (400088 MB)
Nov 29 20:17:01 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 30000, (ffffffff80375047)
Nov 29 20:17:01 mini sd 5:0:0:0: Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Mode Sense(6): 1a 00 3f 00 04 00
Nov 29 20:17:01 mini buffer = 0xffff88007df61eb0, bufflen = 4, queuecommand 0xffffffffa00d2a8a
Nov 29 20:17:01 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:01 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:01 mini sd 5:0:0:0: Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:01 mini sd 5:0:0:0: Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Mode Sense(6): 1a 00 3f 00 04 00
Nov 29 20:17:01 mini sd 5:0:0:0: scsi host busy 1 failed 0
Nov 29 20:17:01 mini sd 5:0:0:0: Notifying upper driver of completion (result 0)
Nov 29 20:17:01 mini 0 sectors total, 4 bytes done.
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] Write Protect is off
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] Mode Sense: 11 00 00 00
Nov 29 20:17:01 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 30000, (ffffffff80375047)
Nov 29 20:17:01 mini sd 5:0:0:0: Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 04 00
Nov 29 20:17:01 mini buffer = 0xffff88007df61eb0, bufflen = 4, queuecommand 0xffffffffa00d2a8a
Nov 29 20:17:01 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:01 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:01 mini sd 5:0:0:0: Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:01 mini sd 5:0:0:0: Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 04 00
Nov 29 20:17:01 mini sd 5:0:0:0: scsi host busy 1 failed 0
Nov 29 20:17:01 mini sd 5:0:0:0: Notifying upper driver of completion (result 0)
Nov 29 20:17:01 mini 0 sectors total, 4 bytes done.
Nov 29 20:17:01 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 30000, (ffffffff80375047)
Nov 29 20:17:01 mini sd 5:0:0:0: Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 16 00
Nov 29 20:17:01 mini buffer = 0xffff88007df61eb0, bufflen = 22, queuecommand 0xffffffffa00d2a8a
Nov 29 20:17:01 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:01 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:01 mini sd 5:0:0:0: Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:01 mini sd 5:0:0:0: Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:01 mini sd 5:0:0:0: CDB: Mode Sense(6): 1a 08 06 00 16 00
Nov 29 20:17:01 mini sd 5:0:0:0: scsi host busy 1 failed 0
Nov 29 20:17:01 mini sd 5:0:0:0: Notifying upper driver of completion (result 0)
Nov 29 20:17:01 mini 0 sectors total, 22 bytes done.
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA

  # coming up: the first READ(10)

Nov 29 20:17:01 mini sdb:<6>sd 5:0:0:0: [sdb] sd_init_command: block=0, count=8
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] block=0
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] reading 8/8 512 byte blocks.
Nov 29 20:17:01 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 60000, (ffffffff80375047)
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Nov 29 20:17:01 mini buffer = 0xffff88007df61eb0, bufflen = 4096, queuecommand 0xffffffffa00d2a8a
Nov 29 20:17:01 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:01 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] Done: 0xffff88007d6cbed0 FAILED
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] Sense Key : Not Ready [current] 
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] Add. Sense: Logical unit not ready, initializing command required
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] scsi host busy 1 failed 0
Nov 29 20:17:01 mini Waking error handler thread
Nov 29 20:17:01 mini Error handler scsi_eh_5 waking up
Nov 29 20:17:01 mini sd 5:0:0:0: scsi_eh_prt_fail_stats: cmds failed: 1, cancel: 0
Nov 29 20:17:01 mini Total of 1 commands on 1 devices require eh work
Nov 29 20:17:01 mini scsi_eh_5: Sending START_UNIT to sdev: 0xffff8800737fe000
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] Send: 0xffff88007d6cbed0 
Nov 29 20:17:01 mini sd 5:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
Nov 29 20:17:01 mini buffer = 0x0000000000000000, bufflen = 0, queuecommand 0xffffffffa00d2a8a
Nov 29 20:17:08 mini scsi_eh_done scmd: ffff88007d6cbed0 result: 0
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] scsi host busy 1 failed 1
Nov 29 20:17:08 mini scsi_send_eh_cmnd: scmd: ffff88007d6cbed0, timeleft: 52572
Nov 29 20:17:08 mini scsi_send_eh_cmnd: scsi_eh_completed_normally 2002
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Send: 0xffff88007d6cbed0 
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 20:17:08 mini buffer = 0x0000000000000000, bufflen = 0, queuecommand 0xffffffffa00d2a8a
Nov 29 20:17:08 mini scsi_eh_done scmd: ffff88007d6cbed0 result: 0
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] scsi host busy 1 failed 1
Nov 29 20:17:08 mini scsi_send_eh_cmnd: scmd: ffff88007d6cbed0, timeleft: 10000
Nov 29 20:17:08 mini scsi_send_eh_cmnd: scsi_eh_completed_normally 2002
Nov 29 20:17:08 mini scsi_eh_tur: scmd ffff88007d6cbed0 rtn 2002
Nov 29 20:17:08 mini scsi_eh_5: flush retry cmd: ffff88007d6cbed0
Nov 29 20:17:08 mini Inserting command ffff88007d6cbed0 into mlqueue
Nov 29 20:17:08 mini scsi_restart_operations: waking up host to restart
Nov 29 20:17:08 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 60000, (ffffffff80375047)
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Send: 0xffff88007d6cbed0 
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Nov 29 20:17:08 mini buffer = 0xffff88007df61eb0, bufflen = 4096, queuecommand 0xffffffffa00d2a8a

  # cutting out some traffic to [sda], the root filesystem disk

Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 08 00
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] scsi host busy 1 failed 0
Nov 29 20:17:08 mini sd 5:0:0:0: Notifying upper driver of completion (result 0)
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:08 mini 8 sectors total, 4096 bytes done.
Nov 29 20:17:08 mini sdb1
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] sd_release
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Attached SCSI disk
Nov 29 20:17:08 mini sg_alloc: dev=1 
Nov 29 20:17:08 mini sd 5:0:0:0: Attached scsi generic sg1 type 14
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] sd_open
Nov 29 20:17:08 mini scsi_block_when_processing_errors: rtn: 1
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] sd_init_command: block=0, count=32
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] block=0
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] reading 32/32 512 byte blocks.
Nov 29 20:17:08 mini scsi_add_timer: scmd: ffff88007d6cbed0, time: 60000, (ffffffff80375047)
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Send: 0xffff88007d6cbed0 
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 20 00
Nov 29 20:17:08 mini buffer = 0xffff88007df61eb0, bufflen = 16384, queuecommand 0xffffffffa00d2a8a
Nov 29 20:17:08 mini leaving scsi_dispatch_cmnd()
Nov 29 20:17:08 mini scsi_delete_timer: scmd: ffff88007d6cbed0, rtn: 1
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Done: 0xffff88007d6cbed0 SUCCESS
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 00 20 00
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] scsi host busy 1 failed 0
Nov 29 20:17:08 mini sd 5:0:0:0: Notifying upper driver of completion (result 0)
Nov 29 20:17:08 mini sd 5:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 20:17:08 mini 32 sectors total, 16384 bytes done.

  # et cetera
Comment 5 Anonymous Emailer 2008-11-29 11:55:17 UTC
Reply-To: James.Bottomley@HansenPartnership.com

OK, so there are a few problems.  First, by responding OK to the test
unit ready (which is illegal under spec) it avoids the spin up the sd
driver normally does, so we're relying on the eh allow_restart flag to
start the unit on the first failing command.  Then, in the failure case:

> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Read(10): 28 00 00 00 00 00 00
> 00
> 08 00
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Sense Key : Not Ready [current] 
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Add. Sense: Logical unit not ready,
> initializing command required
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] scsi host busy 1 failed 0
> Nov 29 19:57:06 stein Waking error handler thread
> Nov 29 19:57:06 stein Error handler scsi_eh_20 waking up
> Nov 29 19:57:06 stein sd 20:0:0:0: scsi_eh_prt_fail_stats: cmds failed: 1,
> cancel: 0
> Nov 29 19:57:06 stein Total of 1 commands on 1 devices require eh work
> Nov 29 19:57:06 stein scsi_eh_20: Sending START_UNIT to sdev: 0xe58fc7f0
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Send: 0xf5798ef0 
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Start/Stop Unit: 1b 00 00 00 01
> 00
> Nov 29 19:57:06 stein buffer = 0x00000000, bufflen = 0, queuecommand
> 0xf85fb980
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Done: 0xf5798ef0 SUCCESS
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Result: hostbyte=DID_OK
> driverbyte=DRIVER_OK,SUGGEST_OK
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Start/Stop Unit: 1b 00 00 00 01
> 00
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Unrecognized sense data (in hex):
> Nov 29 19:57:06 stein 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> Nov 29 19:57:06 stein 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> Nov 29 19:57:06 stein Sense Key : No Sense [current] 
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Add. Sense: No additional sense
> information
> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] scsi host busy 1 failed 1

The second start unit is a failure ... I suspect because of our change
to no sense return handling.  What the drive is probably trying to say
is (I'm spinning up) but this gets interpreted as an error because the
sense data for this isn't present (because we didn't ask for it).

Can you try this patch?  It should take the success return of the first
spin up and act on it instead of blindly sending another.

James

---

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 3863617..635d8b4 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -931,12 +931,15 @@ static int scsi_eh_try_stu(struct scsi_cmnd *scmd)
 	if (scmd->device->allow_restart) {
 		int i, rtn = NEEDS_RETRY;
 
-		for (i = 0; rtn == NEEDS_RETRY && i < 2; i++)
+		for (i = 0; rtn == NEEDS_RETRY && i < 2; i++) {
 			rtn = scsi_send_eh_cmnd(scmd, stu_command, 6,
 						scmd->device->timeout, 0);
 
-		if (rtn == SUCCESS)
-			return 0;
+			if (rtn == SUCCESS)
+				return 0;
+			/* if failure, wait before retrying */
+			ssleep(3);
+		}
 	}
 
 	return 1;
Comment 6 Anonymous Emailer 2008-11-29 13:30:29 UTC
Reply-To: stefanr@s5r6.in-berlin.de

On 29 Nov, James Bottomley wrote:
> OK, so there are a few problems.  First, by responding OK to the test
> unit ready (which is illegal under spec) it avoids the spin up the sd
> driver normally does, so we're relying on the eh allow_restart flag to
> start the unit on the first failing command.  Then, in the failure case:
> 
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Read(10): 28 00 00 00 00 00 00
>> 00
>> 08 00
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Sense Key : Not Ready [current] 
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Add. Sense: Logical unit not ready,
>> initializing command required
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] scsi host busy 1 failed 0
>> Nov 29 19:57:06 stein Waking error handler thread
>> Nov 29 19:57:06 stein Error handler scsi_eh_20 waking up
>> Nov 29 19:57:06 stein sd 20:0:0:0: scsi_eh_prt_fail_stats: cmds failed: 1,
>> cancel: 0
>> Nov 29 19:57:06 stein Total of 1 commands on 1 devices require eh work
>> Nov 29 19:57:06 stein scsi_eh_20: Sending START_UNIT to sdev: 0xe58fc7f0
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Send: 0xf5798ef0 
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Start/Stop Unit: 1b 00 00 00
>> 01
>> 00
>> Nov 29 19:57:06 stein buffer = 0x00000000, bufflen = 0, queuecommand
>> 0xf85fb980
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Done: 0xf5798ef0 SUCCESS
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Result: hostbyte=DID_OK
>> driverbyte=DRIVER_OK,SUGGEST_OK
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] CDB: Start/Stop Unit: 1b 00 00 00
>> 01
>> 00
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Unrecognized sense data (in hex):
>> Nov 29 19:57:06 stein 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
>> Nov 29 19:57:06 stein 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
>> Nov 29 19:57:06 stein Sense Key : No Sense [current] 
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] Add. Sense: No additional sense
>> information
>> Nov 29 19:57:06 stein sd 20:0:0:0: [sdd] scsi host busy 1 failed 1
> 
> The second start unit is a failure ... I suspect because of our change
> to no sense return handling.  What the drive is probably trying to say
> is (I'm spinning up) but this gets interpreted as an error because the
> sense data for this isn't present (because we didn't ask for it).
> 
> Can you try this patch?  It should take the success return of the first
> spin up and act on it instead of blindly sending another.
> 
> James
> 
> ---
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 3863617..635d8b4 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -931,12 +931,15 @@ static int scsi_eh_try_stu(struct scsi_cmnd *scmd)
>       if (scmd->device->allow_restart) {
>               int i, rtn = NEEDS_RETRY;
>  
> -             for (i = 0; rtn == NEEDS_RETRY && i < 2; i++)
> +             for (i = 0; rtn == NEEDS_RETRY && i < 2; i++) {
>                       rtn = scsi_send_eh_cmnd(scmd, stu_command, 6,
>                                               scmd->device->timeout, 0);
>  
> -             if (rtn == SUCCESS)
> -                     return 0;
> +                     if (rtn == SUCCESS)
> +                             return 0;
> +                     /* if failure, wait before retrying */
> +                     ssleep(3);
> +             }
>       }
>  
>       return 1;
> 
> 

Still no luck.  Here is a log from a simpler test:
# sg_start --stop /dev/sdb
# echo 0xfffffff > /sys/module/scsi_mod/parameters/scsi_logging_level
# hdparm -tT /dev/sdb
This works under 2.6.27.y (i.e. disk spins up, hdparm proceeds) but
fails under 2.6.28-rc with disk put offline.

The firmware apparently doesn't like the command abortion (SBP-2 fetch
agent reset) while it is spinning up.


Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] sd_init_command: block=0, count=512
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] block=0
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] reading 512/512 512 byte blocks.
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Send: 0xffff88007b76af08 
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 02 00 00
Nov 29 22:12:53 mini buffer = 0xffff88007b60c5f0, bufflen = 262144, queuecommand 0xffffffffa015d445
Nov 29 22:12:53 mini leaving scsi_dispatch_cmnd()
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Done: 0xffff88007b76af08 FAILED
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 02 00 00
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Sense Key : Not Ready [current] 
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Add. Sense: Logical unit not ready, initializing command required
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] scsi host busy 1 failed 0
Nov 29 22:12:53 mini Waking error handler thread
Nov 29 22:12:53 mini Error handler scsi_eh_2 waking up
Nov 29 22:12:53 mini sd 2:0:0:0: scsi_eh_prt_fail_stats: cmds failed: 1, cancel: 0
Nov 29 22:12:53 mini Total of 1 commands on 1 devices require eh work
Nov 29 22:12:53 mini scsi_eh_2: Sending START_UNIT to sdev: 0xffff8800712ef000
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Send: 0xffff88007b76af08 
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
Nov 29 22:12:53 mini buffer = 0x0000000000000000, bufflen = 0, queuecommand 0xffffffffa015d445
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Done: 0xffff88007b76af08 SUCCESS
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Unrecognized sense data (in hex):
Nov 29 22:12:53 mini 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
Nov 29 22:12:53 mini 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
Nov 29 22:12:53 mini Sense Key : No Sense [current] 
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Add. Sense: No additional sense information
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] scsi host busy 1 failed 1
Nov 29 22:12:53 mini scsi_send_eh_cmnd: scmd: ffff88007b76af08, timeleft: 0
Nov 29 22:12:53 mini ieee1394: sbp2: aborting sbp2 command
Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
Nov 29 22:12:53 mini scsi_eh_done scmd: ffff88007b76af08 result: 50000
Nov 29 22:12:55 mini ieee1394: sbp2: Reconnected to SBP-2 device
Nov 29 22:12:55 mini ieee1394: sbp2: Node 0-01:1023: Max speed [S400] - Max payload [2048]
Nov 29 22:12:56 mini scsi_eh_2: START_UNIT failed to sdev: 0xffff8800712ef000
Nov 29 22:12:56 mini scsi_eh_2: Sending BDR sdev: 0xffff8800712ef000
Nov 29 22:12:56 mini ieee1394: sbp2: reset requested
Nov 29 22:12:56 mini ieee1394: sbp2: generating sbp2 fetch agent reset
Nov 29 22:12:56 mini sd 2:0:0:0: [sdb] Send: 0xffff88007b76af08 
Nov 29 22:12:56 mini sd 2:0:0:0: [sdb] CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 22:12:56 mini buffer = 0x0000000000000000, bufflen = 0, queuecommand 0xffffffffa015d445
Nov 29 22:13:02 mini ieee1394: sbp2: Reconnected to SBP-2 device
Nov 29 22:13:02 mini ieee1394: sbp2: Node 0-01:1023: Max speed [S400] - Max payload [2048]
Nov 29 22:13:02 mini scsi_eh_done scmd: ffff88007b76af08 result: 20000
Nov 29 22:13:02 mini sd 2:0:0:0: [sdb] Done: 0xffff88007b76af08 SUCCESS
Nov 29 22:13:02 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_BUS_BUSY driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:13:02 mini sd 2:0:0:0: [sdb] CDB: Test Unit Ready: 00 00 00 00 00 00
Nov 29 22:13:02 mini sd 2:0:0:0: [sdb] scsi host busy 1 failed 1
Nov 29 22:13:02 mini scsi_send_eh_cmnd: scmd: ffff88007b76af08, timeleft: 4840
Nov 29 22:13:02 mini scsi_send_eh_cmnd: scsi_eh_completed_normally 2003
Nov 29 22:13:02 mini scsi_eh_tur: scmd ffff88007b76af08 rtn 2003
Nov 29 22:13:02 mini scsi_eh_2: Sending target reset to target 0
Nov 29 22:13:02 mini scsi_eh_2: Target reset failed target: 0
Nov 29 22:13:02 mini scsi_eh_2: Sending BRST chan: 0
Nov 29 22:13:02 mini scsi_try_bus_reset: Snd Bus RST
Nov 29 22:13:02 mini scsi_eh_2: BRST failed chan: 0
Nov 29 22:13:02 mini scsi_eh_2: Sending HRST
Nov 29 22:13:02 mini scsi_try_host_reset: Snd Host RST
Nov 29 22:13:02 mini scsi_eh_2: HRST failed
Nov 29 22:13:02 mini sd 2:0:0:0: Device offlined - not ready after error recovery
Nov 29 22:13:02 mini scsi_eh_2: flush finish cmd: ffff88007b76af08
Nov 29 22:13:02 mini sd 2:0:0:0: Notifying upper driver of completion (result 2)
Nov 29 22:13:02 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:13:02 mini 512 sectors total, 0 bytes done.
Nov 29 22:13:02 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:13:02 mini end_request: I/O error, dev sdb, sector 0
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 0
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 1
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 2
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 3
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 4
Nov 29 22:13:02 mini sd 2:0:0:0: rejecting I/O to offline device
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 5
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 6
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 7
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 8
Nov 29 22:13:02 mini Buffer I/O error on device sdb, logical block 0
Nov 29 22:13:02 mini scsi_restart_operations: waking up host to restart
Nov 29 22:13:02 mini Error handler scsi_eh_2 sleeping
Comment 7 Anonymous Emailer 2008-11-29 13:45:59 UTC
Reply-To: stefanr@s5r6.in-berlin.de

I wrote:
> On 29 Nov, James Bottomley wrote:
>> Can you try this patch?  It should take the success return of the first
>> spin up and act on it instead of blindly sending another.
...
> Still no luck.  Here is a log from a simpler test:
> # sg_start --stop /dev/sdb
> # echo 0xfffffff > /sys/module/scsi_mod/parameters/scsi_logging_level
> # hdparm -tT /dev/sdb
> This works under 2.6.27.y (i.e. disk spins up, hdparm proceeds) but
> fails under 2.6.28-rc with disk put offline.
> 
> The firmware apparently doesn't like the command abortion (SBP-2 fetch
> agent reset) while it is spinning up.

Same with firewire-sbp2 instead of ieee1394/sbp2:

Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] sd_init_command: block=0, count=512
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] block=0
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] reading 512/512 512 byte blocks.
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Send: 0xffff88007169cf08 
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 02 00 00
Nov 29 22:36:31 mini buffer = 0xffff88007c8e75f0, bufflen = 262144, queuecommand 0xffffffffa013c9df
Nov 29 22:36:31 mini leaving scsi_dispatch_cmnd()
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Done: 0xffff88007169cf08 FAILED
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] CDB: Read(10): 28 00 00 00 00 00 00 02 00 00
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Sense Key : Not Ready [current] 
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Add. Sense: Logical unit not ready, initializing command required
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] scsi host busy 1 failed 0
Nov 29 22:36:31 mini Waking error handler thread
Nov 29 22:36:31 mini Error handler scsi_eh_2 waking up
Nov 29 22:36:31 mini sd 2:0:0:0: scsi_eh_prt_fail_stats: cmds failed: 1, cancel: 0
Nov 29 22:36:31 mini Total of 1 commands on 1 devices require eh work
Nov 29 22:36:31 mini scsi_eh_2: Sending START_UNIT to sdev: 0xffff88006e373000
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Send: 0xffff88007169cf08 
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
Nov 29 22:36:31 mini buffer = 0x0000000000000000, bufflen = 0, queuecommand 0xffffffffa013c9df
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Done: 0xffff88007169cf08 SUCCESS
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01 00
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Unrecognized sense data (in hex):
Nov 29 22:36:31 mini 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
Nov 29 22:36:31 mini 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
Nov 29 22:36:31 mini Sense Key : No Sense [current] 
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] Add. Sense: No additional sense information
Nov 29 22:36:31 mini sd 2:0:0:0: [sdb] scsi host busy 1 failed 1
Nov 29 22:36:31 mini scsi_send_eh_cmnd: scmd: ffff88007169cf08, timeleft: 0
Nov 29 22:36:31 mini firewire_sbp2: fw1.0: sbp2_scsi_abort
Nov 29 22:36:31 mini scsi_eh_done scmd: ffff88007169cf08 result: 20000
Nov 29 22:36:33 mini firewire_sbp2: fw1.0: reconnected to LUN 0000 (0 retries)
Nov 29 22:36:34 mini scsi_eh_2: START_UNIT failed to sdev: 0xffff88006e373000
Nov 29 22:36:34 mini scsi_eh_2: Sending BDR sdev: 0xffff88006e373000
Nov 29 22:36:34 mini scsi_eh_2: BDR failed sdev:0xffff88006e373000
Nov 29 22:36:34 mini scsi_eh_2: Sending target reset to target 0
Nov 29 22:36:34 mini scsi_eh_2: Target reset failed target: 0
Nov 29 22:36:34 mini scsi_eh_2: Sending BRST chan: 0
Nov 29 22:36:34 mini scsi_try_bus_reset: Snd Bus RST
Nov 29 22:36:34 mini scsi_eh_2: BRST failed chan: 0
Nov 29 22:36:34 mini scsi_eh_2: Sending HRST
Nov 29 22:36:34 mini scsi_try_host_reset: Snd Host RST
Nov 29 22:36:34 mini scsi_eh_2: HRST failed
Nov 29 22:36:34 mini sd 2:0:0:0: Device offlined - not ready after error recovery
Nov 29 22:36:34 mini scsi_eh_2: flush finish cmd: ffff88007169cf08
Nov 29 22:36:34 mini sd 2:0:0:0: Notifying upper driver of completion (result 2)
Nov 29 22:36:34 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:36:34 mini 512 sectors total, 0 bytes done.
Nov 29 22:36:34 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK driverbyte=DRIVER_OK,SUGGEST_OK
Nov 29 22:36:34 mini end_request: I/O error, dev sdb, sector 0
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 0
Nov 29 22:36:34 mini sd 2:0:0:0: rejecting I/O to offline device
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 0
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 1
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 2
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 3
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 4
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 5
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 6
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 7
Nov 29 22:36:34 mini Buffer I/O error on device sdb, logical block 8
Nov 29 22:36:34 mini scsi_restart_operations: waking up host to restart
Nov 29 22:36:34 mini Error handler scsi_eh_2 sleeping
Nov 29 22:36:37 mini sd 2:0:0:0: [sdb] sd_release
Nov 29 22:36:39 mini firewire_sbp2: fw1.0: reconnected to LUN 0000 (0 retries)
Comment 8 Anonymous Emailer 2008-11-29 16:08:30 UTC
Reply-To: James.Bottomley@HansenPartnership.com

On Sat, 2008-11-29 at 22:30 +0100, Stefan Richter wrote:
> Nov 29 22:12:53 mini scsi_eh_2: Sending START_UNIT to sdev:
> 0xffff8800712ef000
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Send: 0xffff88007b76af08 
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01
> 00
> Nov 29 22:12:53 mini buffer = 0x0000000000000000, bufflen = 0, queuecommand
> 0xffffffffa015d445
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Done: 0xffff88007b76af08 SUCCESS
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Result: hostbyte=DID_OK
> driverbyte=DRIVER_OK,SUGGEST_OK
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01
> 00
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Unrecognized sense data (in hex):
> Nov 29 22:12:53 mini 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> Nov 29 22:12:53 mini 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 
> Nov 29 22:12:53 mini Sense Key : No Sense [current] 
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] Add. Sense: No additional sense
> information
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] scsi host busy 1 failed 1
> Nov 29 22:12:53 mini scsi_send_eh_cmnd: scmd: ffff88007b76af08, timeleft: 0

This is suggestive:  timeleft shouldn't be zero since according to the
logging ticks virtually no time had elapsed.

> Nov 29 22:12:53 mini ieee1394: sbp2: aborting sbp2 command
> Nov 29 22:12:53 mini sd 2:0:0:0: [sdb] CDB: Start/Stop Unit: 1b 00 00 00 01
> 00
> Nov 29 22:12:53 mini scsi_eh_done scmd: ffff88007b76af08 result: 50000

And this is the actual command completion occurring after timeout.

Theory number two is a block timeout cockup.  It looks like
sdev->timeout is vestigial and has zero value, so we shouldn't be using
it in the start command, so try this.

If this works, the correct fix will be to kill the sdev timeout
parameter so we get a compile failure where anything tries to use it.

James

---

diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
index 3863617..de3f6d0 100644
--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -933,7 +933,7 @@ static int scsi_eh_try_stu(struct scsi_cmnd *scmd)
 
 		for (i = 0; rtn == NEEDS_RETRY && i < 2; i++)
 			rtn = scsi_send_eh_cmnd(scmd, stu_command, 6,
-						scmd->device->timeout, 0);
+						3*SENSE_TIMEOUT, 0);
 
 		if (rtn == SUCCESS)
 			return 0;
Comment 9 Anonymous Emailer 2008-11-30 01:44:10 UTC
Reply-To: stefanr@s5r6.in-berlin.de

James Bottomley wrote:
> Theory number two is a block timeout cockup.  It looks like
> sdev->timeout is vestigial and has zero value, so we shouldn't be using
> it in the start command, so try this.
> 
> If this works, the correct fix will be to kill the sdev timeout
> parameter so we get a compile failure where anything tries to use it.
> 
> James
> 
> ---
> 
> diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> index 3863617..de3f6d0 100644
> --- a/drivers/scsi/scsi_error.c
> +++ b/drivers/scsi/scsi_error.c
> @@ -933,7 +933,7 @@ static int scsi_eh_try_stu(struct scsi_cmnd *scmd)
>  
>               for (i = 0; rtn == NEEDS_RETRY && i < 2; i++)
>                       rtn = scsi_send_eh_cmnd(scmd, stu_command, 6,
> -                                             scmd->device->timeout, 0);
> +                                             3*SENSE_TIMEOUT, 0);
>  
>               if (rtn == SUCCESS)
>                       return 0;
> 
> 

Yes, this works with all of the SBP-2 bridges here which exhibit this
false TEST UNIT READY "GOOD" status.

BTW, 3 * SENSE_TIMEOUT == 30 seconds may actually be a little bit narrow
as a START UNIT timeout.  The disks which I tested here --- a few 7200
RPM IDE or SATA disks behind the SBP-2 bridges --- usually take about 7
seconds to spin up in single-disk enclosures and 14 seconds in dual-disk
enclosures, since the bridge spins them up serially.  But a dual-disk
enclosure with Hitachi Deskstar 7K400 took 28 seconds.  Maybe this
should be used in scsi_eh_try_stu:
include/scsi/scsi.h:#define START_STOP_TIMEOUT          (60 * HZ)

Thanks a lot,
Comment 10 Anonymous Emailer 2008-11-30 08:16:44 UTC
Reply-To: James.Bottomley@HansenPartnership.com

On Sun, 2008-11-30 at 10:44 +0100, Stefan Richter wrote:
> James Bottomley wrote:
> > Theory number two is a block timeout cockup.  It looks like
> > sdev->timeout is vestigial and has zero value, so we shouldn't be using
> > it in the start command, so try this.
> > 
> > If this works, the correct fix will be to kill the sdev timeout
> > parameter so we get a compile failure where anything tries to use it.
> > 
> > James
> > 
> > ---
> > 
> > diff --git a/drivers/scsi/scsi_error.c b/drivers/scsi/scsi_error.c
> > index 3863617..de3f6d0 100644
> > --- a/drivers/scsi/scsi_error.c
> > +++ b/drivers/scsi/scsi_error.c
> > @@ -933,7 +933,7 @@ static int scsi_eh_try_stu(struct scsi_cmnd *scmd)
> >  
> >             for (i = 0; rtn == NEEDS_RETRY && i < 2; i++)
> >                     rtn = scsi_send_eh_cmnd(scmd, stu_command, 6,
> > -                                           scmd->device->timeout, 0);
> > +                                           3*SENSE_TIMEOUT, 0);
> >  
> >             if (rtn == SUCCESS)
> >                     return 0;
> > 
> > 
> 
> Yes, this works with all of the SBP-2 bridges here which exhibit this
> false TEST UNIT READY "GOOD" status.
> 
> BTW, 3 * SENSE_TIMEOUT == 30 seconds may actually be a little bit narrow
> as a START UNIT timeout.  The disks which I tested here --- a few 7200
> RPM IDE or SATA disks behind the SBP-2 bridges --- usually take about 7
> seconds to spin up in single-disk enclosures and 14 seconds in dual-disk
> enclosures, since the bridge spins them up serially.  But a dual-disk
> enclosure with Hitachi Deskstar 7K400 took 28 seconds.  Maybe this
> should be used in scsi_eh_try_stu:
> include/scsi/scsi.h:#define START_STOP_TIMEOUT          (60 * HZ)

Actually, unfortunately not.  This feature isn't designed to spin up non
standards compliant devices, it's designed to restart automatically
stopping media.  As such, the timeout is going to have to go back to
being the queue timeout ... it's just that this exposes a few more
problems than firewire.

James
Comment 11 Rafael J. Wysocki 2008-12-03 17:15:51 UTC
On Thursday, 4 of December 2008, James Bottomley wrote:
> On Wed, 2008-12-03 at 22:49 +0100, Rafael J. Wysocki wrote:
> > Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=12120
> > Subject         : [Block layer or SCSI] requests aborted too early
> > during check_partition()
> > Submitter       : Stefan Richter <stefan-r-bz@s5r6.in-berlin.de>
> > Date            : 2008-11-29 05:19 (5 days old)
> 
> Fixed by this patch:
> 
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=9728c0814ecb505546696a659858fdb761375544