Bug 12351

Summary: sata_nv hotplug not work in 2.6.27.10
Product: IO/Storage
Reporter: giovanni pancotti (gpanco)
Component: Serial ATA
Assignee: Tejun Heo (tj)
Status: RESOLVED CODE_FIX
Severity: normal
CC: lament.email.si, nemesis, tj
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.27.10
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: swncq-hardreset-debug
swncq-hardreset-debug-2
swncq-hardreset-debug-3
nv-hardreset-only-on-probing.patch
nv-hardreset-only-on-probing.patch
nv-hardreset-only-on-probing.patch
boot messages
The test

Description giovanni pancotti 2009-01-03 13:12:08 UTC
Latest working kernel version: 2.6.27
Earliest failing kernel version: 2.6.27.10
Hardware Environment: x86 ASUS M2N-E
Problem Description: sata_nv hotplug doesn't work in 2.6.27.10

Steps to reproduce:

When I hotplug my SATA disk, dmesg says:

- kernel 2.6.27:

ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0xe frozen
ata2: SError: { PHYRdyChg CommWake }
ata2: hard resetting link
ata2: link is slow to respond, please be patient (ready=-19)
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ATA-7: ST3320620AS, 3.AAJ, max UDMA/133
ata2.00: 625142448 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata2.00: configured for UDMA/133
ata2: EH complete
scsi 1:0:0:0: Direct-Access     ATA      ST3320620AS      3.AA PQ: 0 ANSI: 5
sd 1:0:0:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 1:0:0:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sdb: sdb1 sdb2 < sdb5 sdb6 sdb7 sdb8 sdb9 sdb10 sdb11 sdb12 sdb13 sdb14 sdb15 >
sd 1:0:0:0: [sdb] Attached SCSI disk


- kernel 2.6.27.10:

ata2: exception Emask 0x10 SAct 0x0 SErr 0x150000 action 0xe frozen
ata2: SError: { PHYRdyChg CommWake Dispar }
ata2: link is slow to respond, please be patient (ready=0)
ata2: device not ready (errno=-16), forcing hardreset
ata2: soft resetting link
ata2: link is slow to respond, please be patient (ready=0)
ata2: SRST failed (errno=-16)
ata2: soft resetting link
ata2: link is slow to respond, please be patient (ready=0)
ata2: SRST failed (errno=-16)
ata2: soft resetting link
ata2: link is slow to respond, please be patient (ready=0)
ata2: SRST failed (errno=-16)
ata2: limiting SATA link speed to 1.5 Gbps
ata2: soft resetting link
ata2: SRST failed (errno=-16)
ata2: reset failed, giving up
ata2: EH complete

Thanks.
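
The two logs above differ in their SErr values (0x50000 on 2.6.27, 0x150000 on 2.6.27.10). As a rough aid for comparing them, here is a minimal sketch that decodes an SErr value into the flag names the kernel prints. The bit positions are an assumption inferred from the values and names in these logs (bit 16 = PHYRdyChg, bit 18 = CommWake, bit 20 = Dispar); any other bit is reported only by number. The helper names decode_serr and serr_diff are made up for illustration.

```python
# Bit positions inferred from the SErr values and flag names printed in the
# logs above. Bits that do not appear in this report are shown as "bit<N>".
SERR_FLAGS = {
    16: "PHYRdyChg",
    18: "CommWake",
    20: "Dispar",
}


def decode_serr(value):
    """Return a name for every bit set in an SErr value, lowest bit first."""
    names = []
    for bit in range(32):
        if value & (1 << bit):
            names.append(SERR_FLAGS.get(bit, "bit%d" % bit))
    return names


def serr_diff(old, new):
    """Return the flags that are set in `new` but not in `old`."""
    return decode_serr(new & ~old)


if __name__ == "__main__":
    print(decode_serr(0x50000))
    print(decode_serr(0x150000))
    print(serr_diff(0x50000, 0x150000))
```

With the values from this report, the only flag that differs between the working and the failing kernel is Dispar.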
Comment 1 Anonymous Emailer 2009-01-05 15:37:22 UTC
Reply-To: akpm@linux-foundation.org


(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Sat,  3 Jan 2009 13:12:09 -0800 (PST)
bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=12351
> 
>            Summary: sata_nv hotplug not work in 2.6.27.10
>            Product: IO/Storage
>            Version: 2.5
>      KernelVersion: 2.6.27.10
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Serial ATA
>         AssignedTo: jgarzik@pobox.com
>         ReportedBy: gpanco@tiscali.it
> 
> 
> Latest working kernel version:2.6.27
> Earliest failing kernel version:2.6.27.10

A regression in -stable.

> Hardware Environment:x86 ASUS M2N-E
> Problem Description: sata_nv hotplug dont work in 2.6.27.10
> 
> Steps to reproduce:
> 
> when I hotplug my sata disk dmesg say:
> 
> - kernel 2.6.27:
> 
> ta2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0xe frozen
> ata2: SError: { PHYRdyChg CommWake }
> ata2: hard resetting link
> ata2: link is slow to respond, please be patient (ready=-19)
> ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
> ata2.00: ATA-7: ST3320620AS, 3.AAJ, max UDMA/133
> ata2.00: 625142448 sectors, multi 0: LBA48 NCQ (depth 31/32)
> ata2.00: configured for UDMA/133
> ata2: EH complete
> scsi 1:0:0:0: Direct-Access     ATA      ST3320620AS      3.AA PQ: 0 ANSI: 5
> sd 1:0:0:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
> sd 1:0:0:0: [sdb] Write Protect is off
> sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support
> DPO or FUA
> sd 1:0:0:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
> sd 1:0:0:0: [sdb] Write Protect is off
> sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
> sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support
> DPO or FUA
>  sdb: sdb1 sdb2 < sdb5 sdb6 sdb7 sdb8 sdb9 sdb10 sdb11 sdb12 sdb13 sdb14 sdb15 >
> sd 1:0:0:0: [sdb] Attached SCSI disk
> 
> 
> - kernel 2.6.27.10:
> 
> ata2: exception Emask 0x10 SAct 0x0 SErr 0x150000 action 0xe frozen
> ata2: SError: { PHYRdyChg CommWake Dispar }
> ata2: link is slow to respond, please be patient (ready=0)
> ata2: device not ready (errno=-16), forcing hardreset
> ata2: soft resetting link
> ata2: link is slow to respond, please be patient (ready=0)
> ata2: SRST failed (errno=-16)
> ata2: soft resetting link
> ata2: link is slow to respond, please be patient (ready=0)
> ata2: SRST failed (errno=-16)
> ata2: soft resetting link
> ata2: link is slow to respond, please be patient (ready=0)
> ata2: SRST failed (errno=-16)
> ata2: limiting SATA link speed to 1.5 Gbps
> ata2: soft resetting link
> ata2: SRST failed (errno=-16)
> ata2: reset failed, giving up
> ata2: EH complete
> 
Comment 2 Robert Hancock 2009-01-05 16:57:39 UTC
(CCing Tejun)

Andrew Morton wrote:
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> On Sat,  3 Jan 2009 13:12:09 -0800 (PST)
> bugme-daemon@bugzilla.kernel.org wrote:
> 
>> http://bugzilla.kernel.org/show_bug.cgi?id=12351
>>
>>            Summary: sata_nv hotplug not work in 2.6.27.10
>>            Product: IO/Storage
>>            Version: 2.5
>>      KernelVersion: 2.6.27.10
>>           Platform: All
>>         OS/Version: Linux
>>               Tree: Mainline
>>             Status: NEW
>>           Severity: normal
>>           Priority: P1
>>          Component: Serial ATA
>>         AssignedTo: jgarzik@pobox.com
>>         ReportedBy: gpanco@tiscali.it
>>
>>
>> Latest working kernel version:2.6.27
>> Earliest failing kernel version:2.6.27.10
> 
> A regression in -stable.

Does reverting this patch help?

http://git.kernel.org/?p=linux/kernel/git/hpa/linux-2.6-allstable.git;a=commit;h=814eb57e1799337d9fbb68f5d838afa507dc014e

> 
>> Hardware Environment:x86 ASUS M2N-E

OK, this is an MCP61 board. We're now using softreset instead of 
hardreset on hotplug and apparently that doesn't work. Thing is that:

http://bugzilla.kernel.org/show_bug.cgi?id=11195

reported that hardreset was borked on that controller. Seems kind of 
contradictory..

/if only NVidia could be consistent in its hardware bugs..

>> Problem Description: sata_nv hotplug dont work in 2.6.27.10
>>
>> Steps to reproduce:
>>
>> when I hotplug my sata disk dmesg say:
>>
>> - kernel 2.6.27:
>>
>> ta2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0xe frozen
>> ata2: SError: { PHYRdyChg CommWake }
>> ata2: hard resetting link
>> ata2: link is slow to respond, please be patient (ready=-19)
>> ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
>> ata2.00: ATA-7: ST3320620AS, 3.AAJ, max UDMA/133
>> ata2.00: 625142448 sectors, multi 0: LBA48 NCQ (depth 31/32)
>> ata2.00: configured for UDMA/133
>> ata2: EH complete
>> scsi 1:0:0:0: Direct-Access     ATA      ST3320620AS      3.AA PQ: 0 ANSI: 5
>> sd 1:0:0:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
>> sd 1:0:0:0: [sdb] Write Protect is off
>> sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
>> sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support
>> DPO or FUA
>> sd 1:0:0:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
>> sd 1:0:0:0: [sdb] Write Protect is off
>> sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
>> sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support
>> DPO or FUA
>>  sdb: sdb1 sdb2 < sdb5 sdb6 sdb7 sdb8 sdb9 sdb10 sdb11 sdb12 sdb13 sdb14 sdb15 >
>> sd 1:0:0:0: [sdb] Attached SCSI disk
>>
>>
>> - kernel 2.6.27.10:
>>
>> ata2: exception Emask 0x10 SAct 0x0 SErr 0x150000 action 0xe frozen
>> ata2: SError: { PHYRdyChg CommWake Dispar }
>> ata2: link is slow to respond, please be patient (ready=0)
>> ata2: device not ready (errno=-16), forcing hardreset
>> ata2: soft resetting link
>> ata2: link is slow to respond, please be patient (ready=0)
>> ata2: SRST failed (errno=-16)
>> ata2: soft resetting link
>> ata2: link is slow to respond, please be patient (ready=0)
>> ata2: SRST failed (errno=-16)
>> ata2: soft resetting link
>> ata2: link is slow to respond, please be patient (ready=0)
>> ata2: SRST failed (errno=-16)
>> ata2: limiting SATA link speed to 1.5 Gbps
>> ata2: soft resetting link
>> ata2: SRST failed (errno=-16)
>> ata2: reset failed, giving up
>> ata2: EH complete
>>
> 
Comment 3 giovanni pancotti 2009-01-06 02:28:48 UTC
On Monday 05 January 2009, at 18:57, Robert Hancock wrote:

>>> Hardware Environment:x86 ASUS M2N-E
>
> OK, this is an MCP61 board. We're now using softreset instead of hardreset 
> on hotplug and apparently that doesn't work. Thing is that:

no, it is not MCP61, but MCP55:

dual ~ # lspci
00:00.0 RAM memory: nVidia Corporation MCP55 Memory Controller (rev a1)
00:01.0 ISA bridge: nVidia Corporation MCP55 LPC Bridge (rev a2)
00:01.1 SMBus: nVidia Corporation MCP55 SMBus (rev a2)
00:02.0 USB Controller: nVidia Corporation MCP55 USB Controller (rev a1)
00:02.1 USB Controller: nVidia Corporation MCP55 USB Controller (rev a2)
00:04.0 IDE interface: nVidia Corporation MCP55 IDE (rev a1)
00:05.0 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a2)

dual ~ # dmidecode
# dmidecode 2.9
SMBIOS 2.4 present.
72 structures occupying 2069 bytes.
Table at 0x000F0000.

Handle 0x0000, DMI type 0, 24 bytes
BIOS Information
        Vendor: Phoenix Technologies, LTD
                Version: ASUS M2N-E ACPI BIOS Revision 1601
Comment 4 Robert Hancock 2009-01-06 16:19:55 UTC
Giovanni Pancotti wrote:
> On Monday 05 January 2009, alle 18:57, Robert Hancock wrote:
> 
>>>> Hardware Environment:x86 ASUS M2N-E
>> OK, this is an MCP61 board. We're now using softreset instead of hardreset 
>> on hotplug and apparently that doesn't work. Thing is that:
> 
> no, it is not MCP61, but MCP55:

Ahh, ok, that is less contradictory then :-) Presumably we should still 
be using hardreset on that chipset.

> 
> dual ~ # lspci
> 00:00.0 RAM memory: nVidia Corporation MCP55 Memory Controller (rev a1)
> 00:01.0 ISA bridge: nVidia Corporation MCP55 LPC Bridge (rev a2)
> 00:01.1 SMBus: nVidia Corporation MCP55 SMBus (rev a2)
> 00:02.0 USB Controller: nVidia Corporation MCP55 USB Controller (rev a1)
> 00:02.1 USB Controller: nVidia Corporation MCP55 USB Controller (rev a2)
> 00:04.0 IDE interface: nVidia Corporation MCP55 IDE (rev a1)
> 00:05.0 IDE interface: nVidia Corporation MCP55 SATA Controller (rev a2)
> 
> dual ~ # dmidecode
> # dmidecode 2.9
> SMBIOS 2.4 present.
> 72 structures occupying 2069 bytes.
> Table at 0x000F0000.
> 
> Handle 0x0000, DMI type 0, 24 bytes
> BIOS Information
>         Vendor: Phoenix Technologies, LTD
>                 Version: ASUS M2N-E ACPI BIOS Revision 1601
> 
> 
Comment 5 Tejun Heo 2009-01-06 17:21:29 UTC
Ah... we had a report of broken hardreset on GENERIC, and MCP55 shares code paths with GENERIC other than the command issue path, so I assumed it would behave the same (in fact, with swncq disabled it shares all the code paths).  Making MCP55 use hardreset isn't difficult, but I'm afraid it might break boot probing on some machines; the GENERIC probing failure didn't occur on all machines.  Argggh.... how many reset-related bugs can this series of chips have?

Does anyone know where MCP55 is located in the chipset family tree?  I want to make sure there's a meaningful distinction between GENERICs and SWNCQs before making yet another switch.  Also, I think we should wait until 2.6.29 rather than risk breaking boot probing on 2.6.28 yet again.  :-(
Comment 6 Tejun Heo 2009-01-06 17:36:35 UTC
Created attachment 19684 [details]
swncq-hardreset-debug

Can you please verify whether this patch fixes the problem?
Comment 7 giovanni pancotti 2009-01-07 11:38:15 UTC
Verified, hotplug doesn't work :-(

dmesg at boot time:

sata_nv 0000:00:05.0: version 3.5
ACPI: PCI Interrupt Link [APSI] enabled at IRQ 23
sata_nv 0000:00:05.0: PCI INT A -> Link[APSI] -> GSI 23 (level, low) -> IRQ 23
sata_nv 0000:00:05.0: Using SWNCQ mode
sata_nv 0000:00:05.0: setting latency timer to 64
scsi0 : sata_nv
scsi1 : sata_nv
ata1: SATA max UDMA/133 cmd 0x9f0 ctl 0xbf0 bmdma 0xdc00 irq 23
ata2: SATA max UDMA/133 cmd 0x970 ctl 0xb70 bmdma 0xdc08 irq 23
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ATA-7: Maxtor 6V250F0, VA111900, max UDMA/133
ata1.00: 490234752 sectors, multi 1: LBA48 NCQ (depth 31/32)
ata1.00: configured for UDMA/133
scsi 0:0:0:0: Direct-Access     ATA      Maxtor 6V250F0   VA11 PQ: 0 ANSI: 5
ata1.00: Disabling SWNCQ mode (depth 1)
sd 0:0:0:0: [sda] 490234752 512-byte hardware sectors (251000 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 0:0:0:0: [sda] 490234752 512-byte hardware sectors (251000 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sda: sda1 sda2 sda3 < sda5 sda6 sda7 sda8 sda9 sda10 sda11 sda12 >
sd 0:0:0:0: [sda] Attached SCSI disk
ACPI: PCI Interrupt Link [APSJ] enabled at IRQ 22
sata_nv 0000:00:05.1: PCI INT B -> Link[APSJ] -> GSI 22 (level, low) -> IRQ 22
sata_nv 0000:00:05.1: Using SWNCQ mode
sata_nv 0000:00:05.1: setting latency timer to 64
scsi2 : sata_nv
scsi3 : sata_nv
ata3: SATA max UDMA/133 cmd 0x9e0 ctl 0xbe0 bmdma 0xc800 irq 22
ata4: SATA max UDMA/133 cmd 0x960 ctl 0xb60 bmdma 0xc808 irq 22
ACPI: PCI Interrupt Link [ASA2] enabled at IRQ 21
sata_nv 0000:00:05.2: PCI INT C -> Link[ASA2] -> GSI 21 (level, low) -> IRQ 21
sata_nv 0000:00:05.2: Using SWNCQ mode
sata_nv 0000:00:05.2: setting latency timer to 64
scsi4 : sata_nv
scsi5 : sata_nv
ata5: SATA max UDMA/133 cmd 0xc400 ctl 0xc000 bmdma 0xb400 irq 21
ata6: SATA max UDMA/133 cmd 0xbc00 ctl 0xb800 bmdma 0xb408 irq 21

after hotplug:

ata2: exception Emask 0x10 SAct 0x0 SErr 0x150000 action 0xe frozen
ata2: SError: { PHYRdyChg CommWake Dispar }
ata2: link is slow to respond, please be patient (ready=0)
ata2: device not ready (errno=-16), forcing hardreset
ata2: soft resetting link
ata2: link is slow to respond, please be patient (ready=0)
ata2: SRST failed (errno=-16)
ata2: soft resetting link
ata2: link is slow to respond, please be patient (ready=0)
ata2: SRST failed (errno=-16)
ata2: soft resetting link
ata2: link is slow to respond, please be patient (ready=0)
ata2: SRST failed (errno=-16)
ata2: limiting SATA link speed to 1.5 Gbps
ata2: soft resetting link
ata2: SRST failed (errno=-16)
ata2: reset failed, giving up
ata2: EH complete
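
The boot log above reports "SStatus 123" for ata1, and the original report shows "SStatus 113" for the hotplugged disk. For readers comparing link states, the following is a minimal sketch of how such a value splits into fields, assuming the standard SATA SStatus layout (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8) and that the kernel prints the value in hexadecimal. The function name decode_sstatus is hypothetical.

```python
# Speed codes from the SPD field; only the two generations seen in this
# thread are listed, anything else decodes to None.
SPEEDS = {1: "1.5 Gbps", 2: "3.0 Gbps"}


def decode_sstatus(text):
    """Split an SStatus value, given as printed in dmesg (hex), into fields."""
    value = int(text, 16)
    det = value & 0xF          # device detection
    spd = (value >> 4) & 0xF   # negotiated speed
    ipm = (value >> 8) & 0xF   # interface power management state
    return {
        "det": det,
        "spd": spd,
        "ipm": ipm,
        # DET == 3 means a device is present and PHY communication is up.
        "link_up": det == 3,
        "speed": SPEEDS.get(spd),
    }


if __name__ == "__main__":
    for raw in ("123", "113", "0"):
        print(raw, decode_sstatus(raw))
```

This matches the kernel's own summary lines: 123 is reported as "link up 3.0 Gbps", 113 as "link up 1.5 Gbps", and 0 as "link down".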
Comment 8 Tejun Heo 2009-01-07 17:54:13 UTC
Created attachment 19712 [details]
swncq-hardreset-debug-2

Sorry, I missed one line.  Can you please test with both sata_nv.swncq=0 and sata_nv.swncq=1?
Comment 9 giovanni pancotti 2009-01-08 10:49:18 UTC
with swncq-hardreset-debug-2 and sata_nv.swncq=0, dmesg at boot:

sata_nv 0000:00:05.0: version 3.5
ACPI: PCI Interrupt Link [APSI] enabled at IRQ 23
sata_nv 0000:00:05.0: PCI INT A -> Link[APSI] -> GSI 23 (level, low) -> IRQ 23
sata_nv 0000:00:05.0: setting latency timer to 64
scsi0 : sata_nv
scsi1 : sata_nv
ata1: SATA max UDMA/133 cmd 0x9f0 ctl 0xbf0 bmdma 0xdc00 irq 23
ata2: SATA max UDMA/133 cmd 0x970 ctl 0xb70 bmdma 0xdc08 irq 23
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ATA-7: Maxtor 6V250F0, VA111900, max UDMA/133
ata1.00: 490234752 sectors, multi 1: LBA48 NCQ (depth 0/32)
ata1.00: configured for UDMA/133
ata2: SATA link down (SStatus 0 SControl 300)
scsi 0:0:0:0: Direct-Access     ATA      Maxtor 6V250F0   VA11 PQ: 0 ANSI: 5
sd 0:0:0:0: [sda] 490234752 512-byte hardware sectors (251000 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 0:0:0:0: [sda] 490234752 512-byte hardware sectors (251000 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sda: sda1 sda2 sda3 < sda5 sda6 sda7 sda8 sda9 sda10 sda11 sda12 >
sd 0:0:0:0: [sda] Attached SCSI disk
ACPI: PCI Interrupt Link [APSJ] enabled at IRQ 22
sata_nv 0000:00:05.1: PCI INT B -> Link[APSJ] -> GSI 22 (level, low) -> IRQ 22
sata_nv 0000:00:05.1: setting latency timer to 64
scsi2 : sata_nv
scsi3 : sata_nv
ata3: SATA max UDMA/133 cmd 0x9e0 ctl 0xbe0 bmdma 0xc800 irq 22
ata4: SATA max UDMA/133 cmd 0x960 ctl 0xb60 bmdma 0xc808 irq 22
ata3: SATA link down (SStatus 0 SControl 300)
ata4: SATA link down (SStatus 0 SControl 300)
ACPI: PCI Interrupt Link [ASA2] enabled at IRQ 21
sata_nv 0000:00:05.2: PCI INT C -> Link[ASA2] -> GSI 21 (level, low) -> IRQ 21
sata_nv 0000:00:05.2: setting latency timer to 64
scsi4 : sata_nv
scsi5 : sata_nv
ata5: SATA max UDMA/133 cmd 0xc400 ctl 0xc000 bmdma 0xb400 irq 21
ata6: SATA max UDMA/133 cmd 0xbc00 ctl 0xb800 bmdma 0xb408 irq 21
ata5: SATA link down (SStatus 0 SControl 300)
ata6: SATA link down (SStatus 0 SControl 300)

after hotplugging the hard disk: no result :-(

dual ~ # echo "- - -" > /sys/class/scsi_host/host1/scan

ata2: exception Emask 0x10 SAct 0x0 SErr 0x150000 action 0xf
ata2: SError: { PHYRdyChg CommWake Dispar }
ata2: hard resetting link
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: qc timeout (cmd 0xec)
ata2.00: failed to IDENTIFY (I/O error, err_mask=0x5)
ata2: hard resetting link
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: NODEV after polling detection
ata2: EH complete

--------------------------------------------------------------

with swncq-hardreset-debug-2 and sata_nv.swncq=1, dmesg at boot:

sata_nv 0000:00:05.0: version 3.5
ACPI: PCI Interrupt Link [APSI] enabled at IRQ 23
sata_nv 0000:00:05.0: PCI INT A -> Link[APSI] -> GSI 23 (level, low) -> IRQ 23
sata_nv 0000:00:05.0: Using SWNCQ mode
sata_nv 0000:00:05.0: setting latency timer to 64
scsi0 : sata_nv
scsi1 : sata_nv
ata1: SATA max UDMA/133 cmd 0x9f0 ctl 0xbf0 bmdma 0xdc00 irq 23
ata2: SATA max UDMA/133 cmd 0x970 ctl 0xb70 bmdma 0xdc08 irq 23
ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
ata1.00: ATA-7: Maxtor 6V250F0, VA111900, max UDMA/133
ata1.00: 490234752 sectors, multi 1: LBA48 NCQ (depth 31/32)
ata1.00: configured for UDMA/133
ata2: SATA link down (SStatus 0 SControl 300)
scsi 0:0:0:0: Direct-Access     ATA      Maxtor 6V250F0   VA11 PQ: 0 ANSI: 5
ata1.00: Disabling SWNCQ mode (depth 1)
sd 0:0:0:0: [sda] 490234752 512-byte hardware sectors (251000 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 0:0:0:0: [sda] 490234752 512-byte hardware sectors (251000 MB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sda: sda1 sda2 sda3 < sda5 sda6 sda7 sda8 sda9 sda10 sda11 sda12 >
sd 0:0:0:0: [sda] Attached SCSI disk
ACPI: PCI Interrupt Link [APSJ] enabled at IRQ 22
sata_nv 0000:00:05.1: PCI INT B -> Link[APSJ] -> GSI 22 (level, low) -> IRQ 22
sata_nv 0000:00:05.1: Using SWNCQ mode
sata_nv 0000:00:05.1: setting latency timer to 64
scsi2 : sata_nv
scsi3 : sata_nv
ata3: SATA max UDMA/133 cmd 0x9e0 ctl 0xbe0 bmdma 0xc800 irq 22
ata4: SATA max UDMA/133 cmd 0x960 ctl 0xb60 bmdma 0xc808 irq 22
ata3: SATA link down (SStatus 0 SControl 300)
ata4: SATA link down (SStatus 0 SControl 300)
ACPI: PCI Interrupt Link [ASA2] enabled at IRQ 21
sata_nv 0000:00:05.2: PCI INT C -> Link[ASA2] -> GSI 21 (level, low) -> IRQ 21
sata_nv 0000:00:05.2: Using SWNCQ mode
sata_nv 0000:00:05.2: setting latency timer to 64
scsi4 : sata_nv
scsi5 : sata_nv
ata5: SATA max UDMA/133 cmd 0xc400 ctl 0xc000 bmdma 0xb400 irq 21
ata6: SATA max UDMA/133 cmd 0xbc00 ctl 0xb800 bmdma 0xb408 irq 21
ata5: SATA link down (SStatus 0 SControl 300)
ata6: SATA link down (SStatus 0 SControl 300)

after hotplugging the hard disk:

ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0xe frozen
ata2: SError: { PHYRdyChg CommWake }
ata2: hard resetting link
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2: EH complete

dual ~ # echo "- - -" > /sys/class/scsi_host/host1/scan

ata2: hard resetting link
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: ATA-7: ST3320620AS, 3.AAJ, max UDMA/133
ata2.00: 625142448 sectors, multi 0: LBA48 NCQ (depth 31/32)
ata2.00: configured for UDMA/133
ata2: EH complete
scsi 1:0:0:0: Direct-Access     ATA      ST3320620AS      3.AA PQ: 0 ANSI: 5
sd 1:0:0:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
sd 1:0:0:0: [sdb] 625142448 512-byte hardware sectors (320073 MB)
sd 1:0:0:0: [sdb] Write Protect is off
sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sdb: sdb1 sdb2 < sdb5 sdb6 sdb7 sdb8 sdb9 sdb10 sdb11 sdb12 sdb13 sdb14 sdb15 >
sd 1:0:0:0: [sdb] Attached SCSI disk
sd 1:0:0:0: Attached scsi generic sg1 type 0

better :-)
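
The manual rescan used above writes the string "- - -" (wildcards for channel, target and LUN) to the scan file of the SCSI host; in these logs ata2 corresponds to host1 (scsi1). A minimal sketch of the same step, for scripting repeated tests: the function name rescan_scsi_host and the sysfs_root parameter are made up for illustration, and on a real system the write needs root privileges. The demo at the bottom only writes into a throwaway directory.

```python
import os
import tempfile

WILDCARD_SCAN = "- - -\n"  # channel, target and LUN all wildcarded


def rescan_scsi_host(host, sysfs_root="/sys/class/scsi_host"):
    """Write the wildcard triple to hostN/scan and return the path written."""
    scan_path = os.path.join(sysfs_root, "host%d" % host, "scan")
    with open(scan_path, "w") as scan_file:
        scan_file.write(WILDCARD_SCAN)
    return scan_path


if __name__ == "__main__":
    # Dry run against a temporary directory instead of the real sysfs tree.
    with tempfile.TemporaryDirectory() as fake_sysfs:
        os.makedirs(os.path.join(fake_sysfs, "host1"))
        path = rescan_scsi_host(1, sysfs_root=fake_sysfs)
        with open(path) as scan_file:
            print(repr(scan_file.read()))
```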
Comment 10 Tejun Heo 2009-01-13 21:37:58 UTC
Hmm... I thought warmplug would work for swncq=0.  :-(

Can you please try hotplug several times without the explicit rescan request?  Does it always fail to proceed after that?
Comment 11 Greg Kroah-Hartman 2009-01-15 11:03:56 UTC
On Tue, Jan 06, 2009 at 06:19:44PM -0600, Robert Hancock wrote:
> Giovanni Pancotti wrote:
> > On Monday 05 January 2009, alle 18:57, Robert Hancock wrote:
> > 
> >>>> Hardware Environment:x86 ASUS M2N-E
> >> OK, this is an MCP61 board. We're now using softreset instead of hardreset 
> >> on hotplug and apparently that doesn't work. Thing is that:
> > 
> > no, it is not MCP61, but MCP55:
> 
> Ahh, ok, that is less contradictory then :-) Presumably we should still 
> be using hardreset on that chipset.

So, do we know how to solve this in the 2.6.27.y tree?

thanks,

greg k-h
Comment 12 Tejun Heo 2009-01-15 19:42:42 UTC
No, not yet, and I'd really like to delay this to the next release rather than risk breaking nv yet again, as it only affects hotplug.

Giovanni, can you please test hotplug w/o the rescan request?

Thanks.
Comment 13 Robert Hancock 2009-01-16 18:38:59 UTC
Greg KH wrote:
> On Tue, Jan 06, 2009 at 06:19:44PM -0600, Robert Hancock wrote:
>> Giovanni Pancotti wrote:
>>> On Monday 05 January 2009, alle 18:57, Robert Hancock wrote:
>>>
>>>>>> Hardware Environment:x86 ASUS M2N-E
>>>> OK, this is an MCP61 board. We're now using softreset instead of hardreset 
>>>> on hotplug and apparently that doesn't work. Thing is that:
>>> no, it is not MCP61, but MCP55:
>> Ahh, ok, that is less contradictory then :-) Presumably we should still 
>> be using hardreset on that chipset.
> 
> So, do we know how to solve this in the 2.6.27.y tree?

Can you try reverting this patch?

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff_plain;h=3c324283e6cdb79210cf7975c3e40d3ba3e672b2

This likely isn't a proper fix as it will probably re-break some other 
chipsets but it will confirm what the problem is in this case. It looks 
like this patch changed MCP55 to inherit from generic_ops instead of 
common_ops which caused it to use soft reset instead of hard reset. Not 
sure if that was intentional or not.. Tejun?

We really ought to fix up some of the naming in this driver to be less 
confusing. Especially the "generic" stuff should be renamed: it's not 
generic at all (it seems to only apply to MCP61 currently), yet it's used 
as the base operations for other chipset types.
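
To make the inheritance point concrete, here is a toy model, not the driver's real tables. It assumes an operations table can name a parent via "inherits", that a lookup walks up the chain until it finds a definition, and that a child can explicitly disable an inherited operation. Only the names generic_ops and common_ops come from the comment above; base_ops, the mcp55_* tables, the DISABLED marker and the handler strings are hypothetical.

```python
DISABLED = object()  # stands in for an operation a table explicitly turns off


def resolve(ops, name):
    """Walk the 'inherits' chain and return the first definition of `name`."""
    while ops is not None:
        if name in ops:
            value = ops[name]
            return None if value is DISABLED else value
        ops = ops.get("inherits")
    return None


def reset_methods(ops):
    """List which reset operations a table ends up with after inheritance."""
    return [n for n in ("softreset", "hardreset") if resolve(ops, n)]


# Hypothetical tables: the base provides both resets, "generic" turns
# hardreset off, and MCP55 changes its parent from common to generic.
base_ops = {"inherits": None,
            "softreset": "base_softreset",
            "hardreset": "base_hardreset"}
common_ops = {"inherits": base_ops}
generic_ops = {"inherits": common_ops, "hardreset": DISABLED}
mcp55_before = {"inherits": common_ops}
mcp55_after = {"inherits": generic_ops}

if __name__ == "__main__":
    print("before:", reset_methods(mcp55_before))
    print("after: ", reset_methods(mcp55_after))
```

Under these assumptions, changing only the parent of the MCP55 table is enough to lose hardreset, which would be consistent with the failing log showing nothing but soft resets.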
Comment 14 giovanni pancotti 2009-01-17 02:18:40 UTC
Tejun,
with swncq=0, hotplug definitely doesn't work.
I've tried several times.

With swncq=1 and without the rescan, the result is the same :-( .

first hotplug:

ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0xe frozen
ata2: SError: { PHYRdyChg CommWake }
ata2: hard resetting link
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2: EH complete
ata2: exception Emask 0x10 SAct 0x0 SErr 0x1810000 action 0xe frozen
ata2: SError: { PHYRdyChg LinkSeq TrStaTrns }
ata2: hard resetting link
ata2: SATA link down (SStatus 0 SControl 300)
ata2: EH complete

second:

ata2: exception Emask 0x10 SAct 0x0 SErr 0x150000 action 0xe frozen
ata2: SError: { PHYRdyChg CommWake Dispar }
ata2: hard resetting link
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
hda-intel: IRQ timing workaround is activated for card #0. Suggest a bigger bdl_pos_adj.
ata2.00: qc timeout (cmd 0xec)
ata2.00: failed to IDENTIFY (I/O error, err_mask=0x5)
ata2: hard resetting link
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: NODEV after polling detection
ata2: EH complete

third:

ata2: exception Emask 0x10 SAct 0x0 SErr 0x1810000 action 0xe frozen
ata2: SError: { PHYRdyChg LinkSeq TrStaTrns }
ata2: hard resetting link
ata2: SATA link down (SStatus 0 SControl 300)
ata2: EH complete

last:

ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0xe frozen
ata2: SError: { PHYRdyChg CommWake }
ata2: hard resetting link
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: qc timeout (cmd 0xec)
ata2.00: failed to IDENTIFY (I/O error, err_mask=0x5)
ata2: hard resetting link
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: NODEV after polling detection
ata2: EH complete

Robert,
I am trying the patch.
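
The attempts above end in several different ways (link up but nothing probed, link down, IDENTIFY timeout). A minimal sketch for summarizing such logs, assuming each error-handling episode starts at an "exception Emask" line and that the message fragments matched below are the ones printed in this thread. The function names and outcome labels are made up for illustration.

```python
def split_episodes(lines):
    """Group dmesg lines into episodes, each starting at an exception line."""
    episodes = []
    for line in lines:
        if "exception Emask" in line or not episodes:
            episodes.append([])
        episodes[-1].append(line)
    return episodes


def classify(episode):
    """Return a short label for how one error-handling episode ended."""
    text = "\n".join(episode)
    if "configured for" in text:
        return "device attached"
    if "reset failed, giving up" in text:
        return "reset failed"
    if "NODEV after polling detection" in text or "failed to IDENTIFY" in text:
        return "link up, no device identified"
    if "SATA link down" in text:
        return "link down"
    if "SATA link up" in text:
        return "link up, no device probed"
    return "unknown"


# The "first hotplug" log from the comment above.
FIRST_HOTPLUG = [
    "ata2: exception Emask 0x10 SAct 0x0 SErr 0x50000 action 0xe frozen",
    "ata2: SError: { PHYRdyChg CommWake }",
    "ata2: hard resetting link",
    "ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)",
    "ata2: EH complete",
    "ata2: exception Emask 0x10 SAct 0x0 SErr 0x1810000 action 0xe frozen",
    "ata2: SError: { PHYRdyChg LinkSeq TrStaTrns }",
    "ata2: hard resetting link",
    "ata2: SATA link down (SStatus 0 SControl 300)",
    "ata2: EH complete",
]

if __name__ == "__main__":
    for episode in split_episodes(FIRST_HOTPLUG):
        print(classify(episode))
```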
Comment 15 giovanni pancotti 2009-01-17 03:36:33 UTC
Robert,
with the patch reverted, hotplug works.
Comment 16 Tejun Heo 2009-01-22 20:38:40 UTC
Created attachment 19945 [details]
swncq-hardreset-debug-3

Can you please try this one?
Comment 17 giovanni pancotti 2009-01-24 03:51:34 UTC
It works! :-)
If you need more info or tests, let me know.

thanks a lot.
Comment 18 Tejun Heo 2009-01-24 17:49:23 UTC
So as far as the reset protocol is concerned, swncq controllers are much closer to nf2 than to generic.  Arghh... at this point, I can't say I have a lot of positive feelings for this series of controllers, with so many subtly different reset protocol breakages.  :-)  Will forward the patch upstream.  Just in case it breaks other cases, I'll submit it for 2.6.29 but not for 2.6.28-stable or 2.6.27-stable.  Thanks.
Comment 19 Greg Kroah-Hartman 2009-02-02 15:44:46 UTC
On Fri, Jan 16, 2009 at 08:38:18PM -0600, Robert Hancock wrote:
> Greg KH wrote:
> > On Tue, Jan 06, 2009 at 06:19:44PM -0600, Robert Hancock wrote:
> >> Giovanni Pancotti wrote:
> >>> On Monday 05 January 2009, alle 18:57, Robert Hancock wrote:
> >>>
> >>>>>> Hardware Environment:x86 ASUS M2N-E
> >>>> OK, this is an MCP61 board. We're now using softreset instead of hardreset 
> >>>> on hotplug and apparently that doesn't work. Thing is that:
> >>> no, it is not MCP61, but MCP55:
> >> Ahh, ok, that is less contradictory then :-) Presumably we should still 
> >> be using hardreset on that chipset.
> > 
> > So, do we know how to solve this in the 2.6.27.y tree?
> 
> Can you try reverting this patch?
> 
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff_plain;h=3c324283e6cdb79210cf7975c3e40d3ba3e672b2
> 
> This likely isn't a proper fix as it will probably re-break some other 
> chipsets but it will confirm what the problem is in this case. It looks 
> like this patch changed MCP55 to inherit from generic_ops instead of 
> common_ops which caused it to use soft reset instead of hard reset. Not 
> sure if that was intentional or not.. Tejun?

I don't want to revert that, as I don't want to break anything else :)

thanks,

greg k-h
Comment 20 Robert Hancock 2009-02-02 18:05:32 UTC
Greg KH wrote:
> On Fri, Jan 16, 2009 at 08:38:18PM -0600, Robert Hancock wrote:
>> Greg KH wrote:
>>> On Tue, Jan 06, 2009 at 06:19:44PM -0600, Robert Hancock wrote:
>>>> Giovanni Pancotti wrote:
>>>>> On Monday 05 January 2009, alle 18:57, Robert Hancock wrote:
>>>>>
>>>>>>>> Hardware Environment:x86 ASUS M2N-E
>>>>>> OK, this is an MCP61 board. We're now using softreset instead of
>>>>>> hardreset 
>>>>>> on hotplug and apparently that doesn't work. Thing is that:
>>>>> no, it is not MCP61, but MCP55:
>>>> Ahh, ok, that is less contradictory then :-) Presumably we should still 
>>>> be using hardreset on that chipset.
>>> So, do we know how to solve this in the 2.6.27.y tree?
>> Can you try reverting this patch?
>>
>>
>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff_plain;h=3c324283e6cdb79210cf7975c3e40d3ba3e672b2
>>
>> This likely isn't a proper fix as it will probably re-break some other 
>> chipsets but it will confirm what the problem is in this case. It looks 
>> like this patch changed MCP55 to inherit from generic_ops instead of 
>> common_ops which caused it to use soft reset instead of hard reset. Not 
>> sure if that was intentional or not.. Tejun?
> 
> I don't want to revert that, as I don't want to break anything else :)

That was directed at the reporter :-) However, hopefully this patch in 
current git will resolve the problem:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=2d775708bc6613f1be47f1e720781343341ecc94
Comment 21 Greg Kroah-Hartman 2009-02-03 14:42:45 UTC
On Mon, Feb 02, 2009 at 08:04:42PM -0600, Robert Hancock wrote:
> Greg KH wrote:
>> On Fri, Jan 16, 2009 at 08:38:18PM -0600, Robert Hancock wrote:
>>> Greg KH wrote:
>>>> On Tue, Jan 06, 2009 at 06:19:44PM -0600, Robert Hancock wrote:
>>>>> Giovanni Pancotti wrote:
>>>>>> On Monday 05 January 2009, alle 18:57, Robert Hancock wrote:
>>>>>>
>>>>>>>>> Hardware Environment:x86 ASUS M2N-E
>>>>>>> OK, this is an MCP61 board. We're now using softreset instead of 
>>>>>>> hardreset on hotplug and apparently that doesn't work. Thing is that:
>>>>>> no, it is not MCP61, but MCP55:
>>>>> Ahh, ok, that is less contradictory then :-) Presumably we should still 
>>>>> be using hardreset on that chipset.
>>>> So, do we know how to solve this in the 2.6.27.y tree?
>>> Can you try reverting this patch?
>>>
>>>
>>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff_plain;h=3c324283e6cdb79210cf7975c3e40d3ba3e672b2
>>>
>>> This likely isn't a proper fix as it will probably re-break some other 
>>> chipsets but it will confirm what the problem is in this case. It looks 
>>> like this patch changed MCP55 to inherit from generic_ops instead of 
>>> common_ops which caused it to use soft reset instead of hard reset. Not 
>>> sure if that was intentional or not.. Tejun?
>> I don't want to revert that, as I don't want to break anything else :)
>
> That was directed at the reporter :-) However, hopefully this patch in 
> current git will resolve the problem:
>
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=2d775708bc6613f1be47f1e720781343341ecc94

Ah nice, I'll queue that one up as well :)

thanks,

greg k-h
Comment 22 Tejun Heo 2009-02-03 17:26:01 UTC
I wasn't really sure whether to put this in -stable or not, as the pros and cons balanced each other out very well: it's a seemingly safe regression fix, but it only affects hotplug (boot is not broken) and any code change is dangerous.  Anyway, getting it into -stable is probably the better choice, and if this one is going into -stable, the following one should too.

  http://article.gmane.org/gmane.linux.ide/38011

Thanks.
Comment 23 Samo Vodopivec 2009-05-24 18:09:50 UTC
I don't know whether any patch has been submitted for 2.6.29, but hotplug still doesn't work in 2.6.29.3 (MCP61 chipset). It works OK with 2.6.24.5 and "echo scsi add-single-device 1 0 0 0 > /proc/scsi/scsi".

Output with 2.6.24.5:
[   73.875304] ata2: exception Emask 0x10 SAct 0x0 SErr 0x150000 action 0xb
[   73.875311] ata2: SError: { PHYRdyChg CommWake Dispar }
[   73.875320] ata2: hard resetting link
[   75.595026] ata2: SRST failed (errno=-19)
[   75.595032] ata2: reset failed (errno=-19), retrying in 9 secs
[   83.857952] ata2: hard resetting link
[   84.929820] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   84.949985] ata2.00: HPA detected: current 488395055, native 488397168
[   84.949994] ata2.00: ATA-7: ST3250410AS, 3.AAC, max UDMA/133
[   84.950016] ata2.00: 488395055 sectors, multi 0: LBA48 NCQ (depth 0/32)
[   84.989959] ata2.00: configured for UDMA/133
[   84.989975] ata2: EH complete
[   84.990031] scsi 1:0:0:0: Direct-Access     ATA      ST3250410AS      3.AA PQ: 0 ANSI: 5
[   84.990114] sd 1:0:0:0: [sdd] 488395055 512-byte hardware sectors (250058 MB)
[   84.990152] sd 1:0:0:0: [sdd] Write Protect is off
[   84.990155] sd 1:0:0:0: [sdd] Mode Sense: 00 3a 00 00
[   84.990169] sd 1:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   84.990224] sd 1:0:0:0: [sdd] 488395055 512-byte hardware sectors (250058 MB)
[   84.990231] sd 1:0:0:0: [sdd] Write Protect is off
[   84.990233] sd 1:0:0:0: [sdd] Mode Sense: 00 3a 00 00
[   84.990282] sd 1:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   84.990287]  sdd: sdd1 sdd2 sdd3
[   85.013247] sd 1:0:0:0: [sdd] Attached SCSI disk
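
The SErr value in the exception line is a bitmask, and the SError line that follows spells out which bits are set. A minimal Python sketch of that decoding is below. It maps only the three flags that appear in this report (bits 16, 18 and 20); the function name and the `bitN` fallback for other bits are illustrative assumptions, not something taken from libata.

```python
# Only the flags seen in this report are named; the bit positions are
# consistent with 0x150000 -> { PHYRdyChg CommWake Dispar } in the logs.
SERROR_NAMES = {
    16: "PHYRdyChg",
    18: "CommWake",
    20: "Dispar",
}


def decode_serror(value):
    """Return the names of the bits set in an SError value, lowest bit first."""
    names = []
    for bit in range(32):
        if value & (1 << bit):
            # Bits not listed above get a placeholder name (hypothetical).
            names.append(SERROR_NAMES.get(bit, "bit%d" % bit))
    return names


if __name__ == "__main__":
    print(decode_serror(0x150000))
```

Run against the 2.6.27 value from the original description, 0x50000, the same function reports only PHYRdyChg and CommWake.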


Output with 2.6.29.3:
[563189.384129] ata2: exception Emask 0x10 SAct 0x0 SErr 0x150000 action 0xf
[563189.384139] ata2: SError: { PHYRdyChg CommWake Dispar }
[563190.112529] ata2: soft resetting link
[563195.320030] ata2: link is slow to respond, please be patient (ready=0)
[563200.140026] ata2: SRST failed (errno=-16)
[563200.140037] ata2: soft resetting link
[563205.340025] ata2: link is slow to respond, please be patient (ready=0)
[563210.160021] ata2: SRST failed (errno=-16)
[563210.160032] ata2: soft resetting link
[563215.330026] ata2: link is slow to respond, please be patient (ready=0)
[563245.200038] ata2: SRST failed (errno=-16)
[563245.200049] ata2: limiting SATA link speed to 1.5 Gbps
[563245.200057] ata2: soft resetting link
[563250.270046] ata2: SRST failed (errno=-16)
[563250.270055] ata2: reset failed, giving up
[563250.270064] ata2: EH complete
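
The difference between the two outputs comes down to which reset type the error handler tried and how the episode ended. A small Python sketch for pulling that out of a dmesg excerpt is below. It relies only on message strings that appear in the logs in this report; the function name and the outcome labels are illustrative assumptions.

```python
import re

# Matches e.g. "ata2: hard resetting link" / "ata2: soft resetting link".
RESET_RE = re.compile(r"ata\d+: (hard|soft) resetting link")


def summarize_eh(log_text):
    """Return (list of reset types tried, outcome) for one EH episode."""
    kinds = RESET_RE.findall(log_text)
    if "reset failed, giving up" in log_text:
        outcome = "gave up"
    elif "SATA link up" in log_text:
        outcome = "link up"
    else:
        outcome = "unknown"
    return kinds, outcome


if __name__ == "__main__":
    sample = (
        "ata2: soft resetting link\n"
        "ata2: SRST failed (errno=-16)\n"
        "ata2: reset failed, giving up\n"
    )
    print(summarize_eh(sample))
```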
Comment 24 Tejun Heo 2009-05-31 01:32:03 UTC
Hello, Samo.

Hardreset was removed from sata_nv because it didn't work reliably.  Please read bko#11195 for details.  Because the link sometimes fails to come online, even if hardreset is enabled only when softreset fails, we run the risk of losing the device: hardreset can kill the link in cases where retrying softreset would have recovered it.  Maybe it can be modified so that only hotplug events use hardreset.  Ugh... this is getting uglier than I ever imagined.  :-(
Comment 25 Tejun Heo 2009-05-31 01:52:45 UTC
Created attachment 21645 [details]
nv-hardreset-only-on-probing.patch

Samo, can you please try this patch?  Thanks.
Comment 26 Tejun Heo 2009-05-31 01:55:41 UTC
Created attachment 21647 [details]
nv-hardreset-only-on-probing.patch

Slightly updated.  Please try this one.  Thanks.
Comment 27 Tejun Heo 2009-05-31 01:57:27 UTC
Created attachment 21649 [details]
nv-hardreset-only-on-probing.patch

Oops, inverted condition on the update.  Please test this one.  Sorry about the fuss.
Comment 28 Samo Vodopivec 2009-05-31 09:45:04 UTC
Vanilla 2.6.29.4:
[  100.104368] ata2: exception Emask 0x10 SAct 0x0 SErr 0x150000 action 0xf
[  100.104374] ata2: SError: { PHYRdyChg CommWake Dispar }
[  100.832524] ata2: soft resetting link
[  106.032575] ata2: link is slow to respond, please be patient (ready=0)
[  110.892534] ata2: SRST failed (errno=-16)
[  110.892541] ata2: soft resetting link
[  116.091302] ata2: link is slow to respond, please be patient (ready=0)
[  120.952514] ata2: SRST failed (errno=-16)
[  120.952521] ata2: soft resetting link
[  126.152518] ata2: link is slow to respond, please be patient (ready=0)
[  155.970025] ata2: SRST failed (errno=-16)
[  155.970035] ata2: limiting SATA link speed to 1.5 Gbps
[  155.970042] ata2: soft resetting link
[  160.992547] ata2: SRST failed (errno=-16)
[  160.992555] ata2: reset failed, giving up
[  160.992564] ata2: EH complete

Patching it:
patch -p1<../nv-sata.patch
patching file drivers/ata/libata-core.c
Hunk #1 succeeded at 5382 (offset -26 lines).
Hunk #3 succeeded at 5995 (offset -26 lines).
patching file drivers/ata/sata_nv.c
Hunk #2 succeeded at 415 (offset -1 lines).
Hunk #4 succeeded at 450 (offset -1 lines).
Hunk #6 succeeded at 1561 (offset -1 lines).

Patched 2.6.29.4:
[   88.168815] ata2: exception Emask 0x10 SAct 0x0 SErr 0x150000 action 0xf
[   88.168821] ata2: SError: { PHYRdyChg CommWake Dispar }
[   88.168831] ata2: hard resetting link
[   92.972538] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[   93.032766] ata2.00: HPA detected: current 488395055, native 488397168
[   93.032773] ata2.00: ATA-7: ST3250410AS, 3.AAC, max UDMA/133
[   93.032775] ata2.00: 488395055 sectors, multi 0: LBA48 NCQ (depth 0/32)
[   93.072780] ata2.00: configured for UDMA/133
[   93.072787] ata2: EH complete
[   93.072876] scsi 1:0:0:0: Direct-Access     ATA      ST3250410AS      3.AA PQ: 0 ANSI: 5
[   93.073157] sd 1:0:0:0: [sdd] 488395055 512-byte hardware sectors: (250 GB/232 GiB)
[   93.073172] sd 1:0:0:0: [sdd] Write Protect is off
[   93.073174] sd 1:0:0:0: [sdd] Mode Sense: 00 3a 00 00
[   93.073192] sd 1:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   93.073255] sd 1:0:0:0: [sdd] 488395055 512-byte hardware sectors: (250 GB/232 GiB)
[   93.073266] sd 1:0:0:0: [sdd] Write Protect is off
[   93.073268] sd 1:0:0:0: [sdd] Mode Sense: 00 3a 00 00
[   93.073284] sd 1:0:0:0: [sdd] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[   93.073289]  sdd: sdd1 sdd2 sdd3
[   93.095861] sd 1:0:0:0: [sdd] Attached SCSI disk
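
The "SATA link up" line also carries the raw SStatus value. A minimal Python sketch of splitting it into its three 4-bit fields is below, assuming the value is printed in hex as in the logs here. The speed table covers only the two link speeds that appear in this report; the function and key names are illustrative assumptions.

```python
# Speed codes seen in this report: SStatus 113 -> 1.5 Gbps, 123 -> 3.0 Gbps.
SPEEDS = {1: "1.5 Gbps", 2: "3.0 Gbps"}


def decode_sstatus(text):
    """Split a hex SStatus string into its DET, SPD and IPM fields."""
    value = int(text, 16)
    det = value & 0xF          # bits 0-3: device detection
    spd = (value >> 4) & 0xF   # bits 4-7: negotiated speed
    ipm = (value >> 8) & 0xF   # bits 8-11: interface power state
    return {
        "det": det,
        "spd": spd,
        "ipm": ipm,
        "speed": SPEEDS.get(spd, "unknown"),
    }


if __name__ == "__main__":
    print(decode_sstatus("123"))
```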

Seems it works, thanks :)

Samo
Comment 29 Tejun Heo 2009-05-31 13:29:13 UTC
Great.  Can you please attach the full kernel log, including boot messages and the hotplug messages?  Also, please test booting with the port occupied, then detaching the drive and reattaching it quickly.  Thanks.
Comment 30 Samo Vodopivec 2009-05-31 20:51:06 UTC
Created attachment 21670 [details]
boot messages
Comment 31 Samo Vodopivec 2009-05-31 21:05:38 UTC
I hope the kernel log helps. The hotplug detection process was started manually after inserting the disk with the "echo scsi add-single-device 1 0 0 0 > /proc/scsi/scsi" command.

I'm sorry, but I won't be able to help you with the boot detaching test - the box is a production server and I can't play with it that much.

Samo
Comment 32 Tejun Heo 2009-05-31 23:24:00 UTC
Seems like it's working as expected.  Hmmm... the boot detaching test isn't invasive at all, though.

1. Boot with ata2.00 occupied (the drive you used for warmplug testing)
2. Remove the drive and do "echo - - - > /sys/class/scsi_host/host1/scan".  ATA scan will kick in and detach the drive after retrying a few times.
3. Re-plug the drive and do "echo - - - > /sys/class/scsi_host/host1/scan".  The drive should appear again.
4. Attach the output of "dmesg".
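
The rescan in steps 2 and 3 is just a write of "- - -" to the host's scan file. For anyone who wants to script the test, a minimal Python sketch is below. The host name `host1` comes from the steps above; the `sysfs_root` parameter is a hypothetical addition so the helper can be pointed at a scratch directory, and on a real system the write needs root.

```python
from pathlib import Path


def request_scan(host, sysfs_root="/sys/class/scsi_host"):
    """Write the wildcard "- - -" to <sysfs_root>/<host>/scan.

    Equivalent to: echo - - - > /sys/class/scsi_host/<host>/scan
    """
    scan_file = Path(sysfs_root) / host / "scan"
    scan_file.write_text("- - -\n")
    return scan_file


if __name__ == "__main__":
    # Run once after removing the drive and once after re-plugging it,
    # then collect the output of dmesg.
    request_scan("host1")
```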

Thanks.
Comment 33 Samo Vodopivec 2009-06-02 06:15:07 UTC
Will do during the weekend.

Samo
Comment 34 Samo Vodopivec 2009-06-07 08:38:25 UTC
Created attachment 21783 [details]
The test

This is the output from the requested test. I guess it works as it should, but you will know better :)

Samo
Comment 35 Tejun Heo 2009-06-10 05:39:05 UTC
Thanks for testing.  Yes, it worked as expected.  I'll polish up the patch and submit upstream.
Comment 36 Tejun Heo 2010-02-17 04:54:09 UTC
Resolving as FIXED.