Bug 11817 - SCSI: timing out command for dvd drive
Summary: SCSI: timing out command for dvd drive
Status: CLOSED PATCH_ALREADY_AVAILABLE
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: SCSI
Hardware: All Linux
Importance: P1 normal
Assignee: linux-scsi@vger.kernel.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2008-10-24 05:56 UTC by Jens Weibler
Modified: 2008-10-25 04:01 UTC

See Also:
Kernel Version: 2.6.28-rc1
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
2.6.28-rc1 config (47.60 KB, text/plain)
2008-10-24 05:58 UTC, Jens Weibler

Description Jens Weibler 2008-10-24 05:56:34 UTC
Latest working kernel version: 2.6.27
Earliest failing kernel version: 2.6.28-rc1
Distribution: Gentoo
Hardware Environment: Dell Latitude E6500
Problem Description:
I tried 2.6.28-rc1 today and the boot is delayed by SCSI timeouts. It boots normally if I remove my DVD drive before booting.


dmesg:
Oct 24 12:07:12 jtb [    0.743466] Driver 'sd' needs updating - please use bus_type methods
Oct 24 12:07:12 jtb [    0.743492] Driver 'sr' needs updating - please use bus_type methods
Oct 24 12:07:12 jtb [    0.743555] ahci 0000:00:1f.2: version 3.0
Oct 24 12:07:12 jtb [    0.743564] ahci 0000:00:1f.2: PCI INT D -> GSI 19 (level, low) -> IRQ 19
Oct 24 12:07:12 jtb [    0.743596] ahci 0000:00:1f.2: irq 42 for MSI/MSI-X
Oct 24 12:07:12 jtb [    0.743666] ahci 0000:00:1f.2: AHCI 0001.0200 32 slots 4 ports 3 Gbps 0x33 impl RAID mode
Oct 24 12:07:12 jtb [    0.743668] ahci 0000:00:1f.2: flags: 64bit ncq sntf stag pm led clo pmp pio slum part ems
Oct 24 12:07:12 jtb [    0.743673] ahci 0000:00:1f.2: setting latency timer to 64
Oct 24 12:07:12 jtb [    0.743820] scsi0 : ahci
Oct 24 12:07:12 jtb [    0.743899] scsi1 : ahci
Oct 24 12:07:12 jtb [    0.743958] scsi2 : ahci
Oct 24 12:07:12 jtb [    0.744012] scsi3 : ahci
Oct 24 12:07:12 jtb [    0.744066] scsi4 : ahci
Oct 24 12:07:12 jtb [    0.744120] scsi5 : ahci
Oct 24 12:07:12 jtb [    0.744564] ata1: SATA max UDMA/133 abar m2048@0xfed1c800 port 0xfed1c900 irq 42
Oct 24 12:07:12 jtb [    0.744568] ata2: SATA max UDMA/133 abar m2048@0xfed1c800 port 0xfed1c980 irq 42
Oct 24 12:07:12 jtb [    0.744569] ata3: DUMMY
Oct 24 12:07:12 jtb [    0.744570] ata4: DUMMY
Oct 24 12:07:12 jtb [    0.744572] ata5: SATA max UDMA/133 abar m2048@0xfed1c800 port 0xfed1cb00 irq 42
Oct 24 12:07:12 jtb [    0.744575] ata6: SATA max UDMA/133 abar m2048@0xfed1c800 port 0xfed1cb80 irq 42
Oct 24 12:07:12 jtb [    1.063351] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Oct 24 12:07:12 jtb [    1.068536] ata1.00: ATA-8: SAMSUNG HM250JI, HS100-08, max UDMA7
Oct 24 12:07:12 jtb [    1.068538] ata1.00: 488397168 sectors, multi 0: LBA48 NCQ (depth 31/32)
Oct 24 12:07:12 jtb [    1.075477] ata1.00: configured for UDMA/133
Oct 24 12:07:12 jtb [    1.406682] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Oct 24 12:07:12 jtb [    1.409289] ata2.00: ATAPI: PLDS DVD+/-RW DU-8A2S, 4D12, max UDMA/100
Oct 24 12:07:12 jtb [    1.412791] ata2.00: configured for UDMA/100
Oct 24 12:07:12 jtb [    1.746681] ata5: SATA link down (SStatus 0 SControl 300)
Oct 24 12:07:12 jtb [    2.080014] ata6: SATA link down (SStatus 0 SControl 300)
Oct 24 12:07:12 jtb [    2.093410] scsi 0:0:0:0: Direct-Access     ATA      SAMSUNG HM250JI  HS10 PQ: 0 ANSI: 5
Oct 24 12:07:12 jtb [    2.093516] sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors: (250 GB/232 GiB)
Oct 24 12:07:12 jtb [    2.093527] sd 0:0:0:0: [sda] Write Protect is off
Oct 24 12:07:12 jtb [    2.093529] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Oct 24 12:07:12 jtb [    2.093547] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 24 12:07:12 jtb [    2.093592] sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors: (250 GB/232 GiB)
Oct 24 12:07:12 jtb [    2.093603] sd 0:0:0:0: [sda] Write Protect is off
Oct 24 12:07:12 jtb [    2.093605] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Oct 24 12:07:12 jtb [    2.093622] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 24 12:07:12 jtb [    2.093625]  sda: sda1 sda2 sda3
Oct 24 12:07:12 jtb [    2.132082] sd 0:0:0:0: [sda] Attached SCSI disk
Oct 24 12:07:12 jtb [    2.132157] sd 0:0:0:0: Attached scsi generic sg0 type 0
Oct 24 12:07:12 jtb [   24.136670] scsi 1:0:0:0: timing out command, waited 22s
Oct 24 12:07:12 jtb [   46.246667] scsi 1:0:0:0: timing out command, waited 22s
Oct 24 12:07:12 jtb [   68.356665] scsi 1:0:0:0: timing out command, waited 22s
Oct 24 12:07:12 jtb [   90.466663] scsi 1:0:0:0: timing out command, waited 22s
Oct 24 12:07:12 jtb [  112.576661] scsi 1:0:0:0: timing out command, waited 22s
Oct 24 12:07:12 jtb [  134.686658] scsi 1:0:0:0: timing out command, waited 22s
Oct 24 12:07:12 jtb [  134.686714] ata2: WARNING: synchronous SCSI scan failed without making any progress,
Oct 24 12:07:12 jtb [  134.686715]                   switching to async
Oct 24 12:07:12 jtb [  134.686876] Initializing USB Mass Storage driver...


The same with dvd drive removed:
Oct 24 12:28:02 jtb [    0.743457] Driver 'sd' needs updating - please use bus_type methods
Oct 24 12:28:02 jtb [    0.743482] Driver 'sr' needs updating - please use bus_type methods
Oct 24 12:28:02 jtb [    0.743543] ahci 0000:00:1f.2: version 3.0
Oct 24 12:28:02 jtb [    0.743552] ahci 0000:00:1f.2: PCI INT D -> GSI 19 (level, low) -> IRQ 19
Oct 24 12:28:02 jtb [    0.743584] ahci 0000:00:1f.2: irq 42 for MSI/MSI-X
Oct 24 12:28:02 jtb [    0.743652] ahci 0000:00:1f.2: AHCI 0001.0200 32 slots 4 ports 3 Gbps 0x33 impl RAID mode
Oct 24 12:28:02 jtb [    0.743655] ahci 0000:00:1f.2: flags: 64bit ncq sntf stag pm led clo pmp pio slum part ems
Oct 24 12:28:02 jtb [    0.743659] ahci 0000:00:1f.2: setting latency timer to 64
Oct 24 12:28:02 jtb [    0.743804] scsi0 : ahci
Oct 24 12:28:02 jtb [    0.743881] scsi1 : ahci
Oct 24 12:28:02 jtb [    0.743933] scsi2 : ahci
Oct 24 12:28:02 jtb [    0.743985] scsi3 : ahci
Oct 24 12:28:02 jtb [    0.744040] scsi4 : ahci
Oct 24 12:28:02 jtb [    0.744093] scsi5 : ahci
Oct 24 12:28:02 jtb [    0.744533] ata1: SATA max UDMA/133 abar m2048@0xfed1c800 port 0xfed1c900 irq 42
Oct 24 12:28:02 jtb [    0.744536] ata2: SATA max UDMA/133 abar m2048@0xfed1c800 port 0xfed1c980 irq 42
Oct 24 12:28:02 jtb [    0.744538] ata3: DUMMY
Oct 24 12:28:02 jtb [    0.744539] ata4: DUMMY
Oct 24 12:28:02 jtb [    0.744541] ata5: SATA max UDMA/133 abar m2048@0xfed1c800 port 0xfed1cb00 irq 42
Oct 24 12:28:02 jtb [    0.744544] ata6: SATA max UDMA/133 abar m2048@0xfed1c800 port 0xfed1cb80 irq 42
Oct 24 12:28:02 jtb [    1.063351] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
Oct 24 12:28:02 jtb [    1.068536] ata1.00: ATA-8: SAMSUNG HM250JI, HS100-08, max UDMA7
Oct 24 12:28:02 jtb [    1.068538] ata1.00: 488397168 sectors, multi 0: LBA48 NCQ (depth 31/32)
Oct 24 12:28:02 jtb [    1.075469] ata1.00: configured for UDMA/133
Oct 24 12:28:02 jtb [    1.406681] ata2: SATA link down (SStatus 0 SControl 300)
Oct 24 12:28:02 jtb [    1.740014] ata5: SATA link down (SStatus 0 SControl 300)
Oct 24 12:28:02 jtb [    2.073347] ata6: SATA link down (SStatus 0 SControl 300)
Oct 24 12:28:02 jtb [    2.086746] scsi 0:0:0:0: Direct-Access     ATA      SAMSUNG HM250JI  HS10 PQ: 0 ANSI: 5
Oct 24 12:28:02 jtb [    2.086857] sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors: (250 GB/232 GiB)
Oct 24 12:28:02 jtb [    2.086869] sd 0:0:0:0: [sda] Write Protect is off
Oct 24 12:28:02 jtb [    2.086870] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Oct 24 12:28:02 jtb [    2.086888] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 24 12:28:02 jtb [    2.086932] sd 0:0:0:0: [sda] 488397168 512-byte hardware sectors: (250 GB/232 GiB)
Oct 24 12:28:02 jtb [    2.086943] sd 0:0:0:0: [sda] Write Protect is off
Oct 24 12:28:02 jtb [    2.086945] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Oct 24 12:28:02 jtb [    2.086962] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 24 12:28:02 jtb [    2.086965]  sda: sda1 sda2 sda3
Oct 24 12:28:02 jtb [    2.135048] sd 0:0:0:0: [sda] Attached SCSI disk
Oct 24 12:28:02 jtb [    2.135121] sd 0:0:0:0: Attached scsi generic sg0 type 0
Oct 24 12:28:02 jtb [    2.135180] Initializing USB Mass Storage driver...
Comment 1 Jens Weibler 2008-10-24 05:58:55 UTC
Created attachment 18425
2.6.28-rc1 config

It also happens without "Asynchronous SCSI scanning".
Comment 2 kishorekumar.mm 2008-10-24 06:08:25 UTC
I tried passing libata.dma=0 as a kernel parameter.

With that, the drive is detected, but hal keeps polling until I disable it.
Comment 3 Jean Delvare 2008-10-24 08:39:22 UTC
Same problem here.
Comment 4 Anonymous Emailer 2008-10-24 08:55:50 UTC
Reply-To: James.Bottomley@HansenPartnership.com

On Fri, 2008-10-24 at 08:39 -0700, bugme-daemon@bugzilla.kernel.org wrote:
> ------- Comment #3 from khali@linux-fr.org  2008-10-24 08:39 -------
> Same problem here.

This is Jens' tag cockup:

http://marc.info/?t=122483204900001

Isn't it?

CD drives are also non-NCQ devices.

James
Comment 5 Jean Delvare 2008-10-24 09:28:40 UTC
James, you are correct, this is the same problem. Applying the patch at
http://lkml.org/lkml/2008/10/24/102
made my system boot again. Thanks!
Comment 6 Anonymous Emailer 2008-10-24 09:38:55 UTC
Reply-To: michaelc@cs.wisc.edu

bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=11817
> 
> Oct 24 12:07:12 jtb [   24.136670] scsi 1:0:0:0: timing out command, waited 22s
> Oct 24 12:07:12 jtb [   46.246667] scsi 1:0:0:0: timing out command, waited 22s
> Oct 24 12:07:12 jtb [   68.356665] scsi 1:0:0:0: timing out command, waited 22s
> Oct 24 12:07:12 jtb [   90.466663] scsi 1:0:0:0: timing out command, waited 22s
> Oct 24 12:07:12 jtb [  112.576661] scsi 1:0:0:0: timing out command, waited 22s
> Oct 24 12:07:12 jtb [  134.686658] scsi 1:0:0:0: timing out command, waited 22s

I think this is caused by my patch that adds the scsi_noretry_cmd check 
in scsi_decide_disposition. We saw this problem exposed in RHEL testing.

Basically what used to happen without my patch is that the command 
continuously timed out or we got an error, and we would eventually hit 
the cmd->retries and cmd->allowed check and the command would be failed 
fairly quietly in some code paths like during setup.

Now scsi_noretry_cmd is returning 0 and that causes us to retry the 
command until we hit the check:

         disposition = scsi_decide_disposition(cmd);
         if (disposition != SUCCESS &&
             time_before(cmd->jiffies_at_alloc + wait_for, jiffies)) {
                 sdev_printk(KERN_ERR, cmd->device,
                             "timing out command, waited %lus\n",
                             wait_for/HZ);
                 disposition = SUCCESS;
         }

in scsi_softirq_done so we now see this error message get printed out 
for the failure.

Are Mike Anderson's patches that remove scsi_noretry_cmd going to be 
merged? If so, that will fix the problem. If not, I will make a 
different patch.
Comment 7 Anonymous Emailer 2008-10-24 09:45:02 UTC
Reply-To: michaelc@cs.wisc.edu

Mike Christie wrote:
> 
> I think this caused due to my patch that adds the scsi_noretry_cmd check 
> in scsi_decide_disposition. We saw this problem exposed in RHEL testing.
> 
> Basically what used to happen without my patch is that the command 
> continuously timed out or we got an error, and we would eventually hit 
> the cmd->retries and cmd->allowed check and the command would be failed 
> fairly quietly in some code paths like during setup.
> 
> Now scsi_noretry_cmd is returning 0 and that causes us to retry the 
> command until we hit the check:
> 
>         disposition = scsi_decide_disposition(cmd);
>         if (disposition != SUCCESS &&
>             time_before(cmd->jiffies_at_alloc + wait_for, jiffies)) {
>                 sdev_printk(KERN_ERR, cmd->device,
>                             "timing out command, waited %lus\n",
>                             wait_for/HZ);
>                 disposition = SUCCESS;
>         }
> 
> in scsi_softirq_done so we now see this error message get printed out 
> for the failure.
> 

Ignore this. In RHEL it was a slightly different problem.
Comment 8 Jens Weibler 2008-10-25 04:01:50 UTC
(In reply to comment #5)
> James, you are correct, this is the same problem. Applying the patch at
> http://lkml.org/lkml/2008/10/24/102
> made my system boot again. Thanks!

Same here. I'm closing this bug, since the fix is already included in the latest snapshot (commit e013e13bf605b9e6b702adffbe2853cfc60e7806).
