Bug 12609

Summary: v2.6.29-rc2 libata sff 32bit PIO regression
Product: IO/Storage Reporter: Rafael J. Wysocki (rjw)
Component: Other Assignee: Sergei Shtylyov (headless)
Status: CLOSED CODE_FIX    
Severity: normal    
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.29-rc2 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 12398    
Attachments: dmesg output

Description Rafael J. Wysocki 2009-02-01 15:32:34 UTC
Subject    : v2.6.29-rc2 gets HSM violations for CD drive
Submitter  : Larry Finger <Larry.Finger@lwfinger.net>
Date       : 2009-01-23 23:52
References : http://marc.info/?l=linux-kernel&m=123275478111406&w=4
Handled-By : Mikael Pettersson <mikpe@it.uu.se>
Patch      : http://marc.info/?l=linux-kernel&m=123254501314058&w=2
Notify-Also : Hugh Dickins <hugh@veritas.com>

This entry is being used for tracking a regression from 2.6.28.  Please don't
close it until the problem is fixed in the mainline.
Comment 1 Rafael J. Wysocki 2009-02-01 16:19:00 UTC
References : http://marc.info/?l=linux-kernel&m=123254501314058&w=4
Handled-By : Hugh Dickins <hugh@veritas.com>
Notify-Also : Sergei Shtylyov <sshtylyov@ru.mvista.com>
Notify-Also : Alan Cox <alan@lxorguk.ukuu.org.uk>
Comment 2 Rafael J. Wysocki 2009-02-06 15:58:39 UTC
On Saturday 07 February 2009, Larry Finger wrote:
> Rafael J. Wysocki wrote:
> > On Thursday 05 February 2009, Hugh Dickins wrote:
> >> On Wed, 4 Feb 2009, Rafael J. Wysocki wrote:
> >>> Bug-Entry : http://bugzilla.kernel.org/show_bug.cgi?id=12609
> >>> Subject           : v2.6.29-rc2 libata sff 32bit PIO regression
> >>> Submitter : Larry Finger <Larry.Finger@lwfinger.net>
> >>> Date              : 2009-01-23 23:52 (13 days old)
> >>> References        : http://marc.info/?l=linux-kernel&m=123275478111406&w=4
> >>>             http://marc.info/?l=linux-kernel&m=123254501314058&w=4
> >>> Handled-By        : Mikael Pettersson <mikpe@it.uu.se>
> >>>             Hugh Dickins <hugh@veritas.com>
> >>> Patch             : http://marc.info/?l=linux-kernel&m=123254501314058&w=2
> >> Yes, this does still need to be listed.  My initial patch for it was
> >> not enough, but I think the three patches necessary have all now been
> >> posted (though Sergei may rewrite mine).  I'm expecting Alan to gather
> >> them together and send them in due course, but I believe he's occupied
> >> with other stuff just at the moment.
> 
> I still have the problem as of 2.6.29-rc3-00634-g9be260a
Comment 3 Rafael J. Wysocki 2009-02-08 12:38:34 UTC
Ignore-Patch : http://marc.info/?l=linux-kernel&m=123254501314058&w=2

Handled-By : Sergei Shtylyov <sshtylyov@ru.mvista.com>
Patch : http://marc.info/?l=linux-kernel&m=123412278730735&w=4
Comment 4 Rafael J. Wysocki 2009-02-15 06:25:04 UTC
On Sunday 15 February 2009, Larry Finger wrote:
> Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> > 
> > The following bug entry is on the current list of known regressions
> > from 2.6.28.  Please verify if it still should be listed and let me know
> > (either way).
> > 
> > 
> > Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=12609
> > Subject             : v2.6.29-rc2 libata sff 32bit PIO regression
> > Submitter   : Larry Finger <Larry.Finger@lwfinger.net>
> > Date                : 2009-01-23 23:52 (23 days old)
> > References  : http://marc.info/?l=linux-kernel&m=123275478111406&w=4
> >               http://marc.info/?l=linux-kernel&m=123254501314058&w=4
> > Handled-By  : Mikael Pettersson <mikpe@it.uu.se>
> >               Hugh Dickins <hugh@veritas.com>
> >               Sergei Shtylyov <sshtylyov@ru.mvista.com>
> > Patch               : http://marc.info/?l=linux-kernel&m=123412278730735&w=4
> 
> This problem is not fixed as of 2.6.29-rc5.
Comment 6 Rafael J. Wysocki 2009-02-15 13:24:25 UTC
*** Bug 12263 has been marked as a duplicate of this bug. ***
Comment 7 Sergei Shtylyov 2009-02-16 06:20:20 UTC
Patch has been resubmitted.
Comment 9 Rafael J. Wysocki 2009-02-23 13:28:01 UTC
On Monday 23 February 2009, Larry Finger wrote:
> Rafael J. Wysocki wrote:
> > On Saturday 07 February 2009, Larry Finger wrote:
> >> Rafael J. Wysocki wrote:
> >>> On Thursday 05 February 2009, Hugh Dickins wrote:
> >>>> On Wed, 4 Feb 2009, Rafael J. Wysocki wrote:
> >>>>> Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=12609
> >>>>> Subject         : v2.6.29-rc2 libata sff 32bit PIO regression
> >>>>> Submitter       : Larry Finger <Larry.Finger@lwfinger.net>
> >>>>> Date            : 2009-01-23 23:52 (13 days old)
> >>>>> References      : http://marc.info/?l=linux-kernel&m=123275478111406&w=4
> >>>>>                   http://marc.info/?l=linux-kernel&m=123254501314058&w=4
> >>>>> Handled-By      : Mikael Pettersson <mikpe@it.uu.se>
> >>>>>                   Hugh Dickins <hugh@veritas.com>
> >>>>> Patch           : http://marc.info/?l=linux-kernel&m=123254501314058&w=2
> >>>> Yes, this does still need to be listed.  My initial patch for it was
> >>>> not enough, but I think the three patches necessary have all now been
> >>>> posted (though Sergei may rewrite mine).  I'm expecting Alan to gather
> >>>> them together and send them in due course, but I believe he's occupied
> >>>> with other stuff just at the moment.
> >> I still have the problem as of 2.6.29-rc3-00634-g9be260a
> > 
> 
> The problem was fixed as of 2.6.29-rc5-00168-gba95fd4 with commit d1b3525. At
> least for my system, the regression is cleared.
Comment 10 Jeff Kuskin 2009-03-05 09:00:10 UTC
Created attachment 20441 [details]
dmesg output
Comment 11 Jeff Kuskin 2009-03-05 09:00:45 UTC
Please reopen this bug; it is still present in 2.6.29-rc7.  My system is a Dell 630 laptop (Core2 duo).  A kernel arg of 'acpi=off' causes the bug to not occur, but that also limits the system to using just one CPU core.

dmesg output is attached.

Error reported in syslog is (this error is repeated many times):


Mar  5 11:45:56 jskusb kernel: [  319.568193] ata1.00: qc timeout (cmd 0xa0)
Mar  5 11:45:56 jskusb kernel: [  319.568216] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Mar  5 11:45:56 jskusb kernel: [  319.568231] ata1.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
Mar  5 11:45:56 jskusb kernel: [  319.568233]          cdb 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
Mar  5 11:45:56 jskusb kernel: [  319.568235]          res 51/20:03:00:00:00/00:00:00:00:00/a0 Emask 0x5 (timeout)
Mar  5 11:45:56 jskusb kernel: [  319.568242] ata1.00: status: { DRDY ERR }
Mar  5 11:46:01 jskusb kernel: [  324.612172] ata1: link is slow to respond, please be patient (ready=0)
Mar  5 11:46:06 jskusb kernel: [  329.596160] ata1: device not ready (errno=-16), forcing hardreset
Mar  5 11:46:06 jskusb kernel: [  329.596177] ata1: soft resetting link
Mar  5 11:46:06 jskusb kernel: [  329.776652] ata1.00: configured for UDMA/33
Mar  5 11:46:06 jskusb kernel: [  329.780908] ata1: EH complete
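
For anyone reading the log above, here is a minimal sketch of how the "res" line can be decoded. It assumes the usual libata layout where the first two bytes of "res" are the status and error registers, and that for an ATAPI device the upper four bits of the error register carry the sense key. The helper names (parse_res, decode_status, sense_key) are made up for this example and are not part of any kernel tool.

```python
import re

# ATA status register bits. Only the flags of interest here are listed;
# any other bits that happen to be set are ignored.
STATUS_BITS = [
    (0x80, "BSY"),
    (0x40, "DRDY"),
    (0x20, "DF"),
    (0x08, "DRQ"),
    (0x01, "ERR"),
]

# A few SCSI sense keys, as reported by ATAPI devices.
SENSE_KEYS = {
    0x0: "NO SENSE",
    0x2: "NOT READY",
    0x3: "MEDIUM ERROR",
    0x5: "ILLEGAL REQUEST",
    0x6: "UNIT ATTENTION",
}


def parse_res(line):
    """Return (status, error) from a libata 'res SS/EE:...' line, or None."""
    m = re.search(r"res ([0-9a-f]{2})/([0-9a-f]{2}):", line)
    if m is None:
        return None
    return int(m.group(1), 16), int(m.group(2), 16)


def decode_status(status):
    """List the names of the known flags set in the status byte."""
    return [name for mask, name in STATUS_BITS if status & mask]


def sense_key(error):
    """Assumption: ATAPI device, so bits 7:4 of the error byte are the sense key."""
    return error >> 4


if __name__ == "__main__":
    line = ("[  319.568235]          res 51/20:03:00:00:00/"
            "00:00:00:00:00/a0 Emask 0x5 (timeout)")
    status, error = parse_res(line)
    print(decode_status(status))
    print(SENSE_KEYS.get(sense_key(error), "unknown"))
```

Run against the "res" line above, this reports the same DRDY and ERR flags as the "status:" line in the log, plus a NOT READY sense key. The command byte 0xa0 is the ATAPI PACKET command, and the all-zero cdb is TEST UNIT READY.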
Comment 12 Sergei Shtylyov 2009-03-05 09:24:54 UTC
(In reply to comment #11)
> Please reopen this bug; it is still present in 2.6.29-rc7.  My system is a
> Dell 630 laptop (Core2 duo).  A kernel arg of 'acpi=off' causes the bug to
> not occur,

This sounds fishy...

> but that also limits the system to using just one CPU core.

I'm not at all sure it's the same bug...

> dmesg output is attached.

> Error reported in syslog is (this error is repeated many times):

> Mar  5 11:45:56 jskusb kernel: [  319.568193] ata1.00: qc timeout (cmd 0xa0)
> Mar  5 11:45:56 jskusb kernel: [  319.568216] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> Mar  5 11:45:56 jskusb kernel: [  319.568231] ata1.00: cmd a0/00:00:00:00:00/00:00:00:00:00/a0 tag 0
> Mar  5 11:45:56 jskusb kernel: [  319.568233]          cdb 00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
> Mar  5 11:45:56 jskusb kernel: [  319.568235]          res 51/20:03:00:00:00/00:00:00:00:00/a0 Emask 0x5 (timeout)
> Mar  5 11:45:56 jskusb kernel: [  319.568242] ata1.00: status: { DRDY ERR }
Comment 13 Sergei Shtylyov 2009-03-05 09:30:59 UTC
(In reply to comment #10)
> Created an attachment (id=20441) [details]

It looks like pata_acpi and ata_piix are both trying to drive the same controller:

[    1.970146] pata_acpi 0000:00:1f.1: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[    1.970179] pata_acpi 0000:00:1f.1: setting latency timer to 64
[    1.970196] pata_acpi 0000:00:1f.1: PCI INT A disabled
[    1.970224] pata_acpi 0000:00:1f.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18
[    1.970245] pata_acpi 0000:00:1f.2: setting latency timer to 64
[    1.970260] pata_acpi 0000:00:1f.2: PCI INT C disabled
[    1.991332] ata_piix 0000:00:1f.1: version 2.12
[    1.991343] ata_piix 0000:00:1f.1: PCI INT A -> GSI 16 (level, low) -> IRQ 16
[    1.991385] ata_piix 0000:00:1f.1: setting latency timer to 64
[    2.005022] scsi0 : ata_piix
[    2.017023] scsi1 : ata_piix
[    2.019430] ata1: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0x6fa0 irq 14
[    2.019432] ata2: PATA max UDMA/100 cmd 0x170 ctl 0x376 bmdma 0x6fa8 irq 15
[    2.157034] usb 7-1: new high speed USB device using ehci_hcd and address 2
[    2.181421] ata1.00: ATAPI: TEAC DVD-ROM DV28EV, D.AD, max UDMA/33
[    2.197327] ata1.00: configured for UDMA/33

I'm not surprised it ends badly and acpi=off helps then...
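
A rough way to spot this pattern in a saved dmesg is to group log lines by PCI address and list every driver that logged against each one. The sketch below assumes the "[timestamp] driver dddd:bb:dd.f: message" line shape seen in the excerpt above; the function names are made up for this example. A device showing up under two drivers only means both touched it during probing, which is a hint to look closer, not proof of a conflict.

```python
import re
from collections import defaultdict

# Matches lines such as:
# [    1.970146] pata_acpi 0000:00:1f.1: PCI INT A -> GSI 16 ...
LINE_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<driver>\S+)\s+"
    r"(?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]):"
)


def drivers_per_device(lines):
    """Map each PCI address to the drivers that logged against it, in order."""
    seen = defaultdict(list)
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("driver") not in seen[m.group("dev")]:
            seen[m.group("dev")].append(m.group("driver"))
    return dict(seen)


def shared_devices(lines):
    """Keep only the PCI addresses that more than one driver logged against."""
    return {
        dev: drivers
        for dev, drivers in drivers_per_device(lines).items()
        if len(drivers) > 1
    }


if __name__ == "__main__":
    sample = [
        "[    1.970146] pata_acpi 0000:00:1f.1: PCI INT A -> GSI 16 (level, low) -> IRQ 16",
        "[    1.970196] pata_acpi 0000:00:1f.1: PCI INT A disabled",
        "[    1.970224] pata_acpi 0000:00:1f.2: PCI INT C -> GSI 18 (level, low) -> IRQ 18",
        "[    1.991332] ata_piix 0000:00:1f.1: version 2.12",
        "[    2.019430] ata1: PATA max UDMA/100 cmd 0x1f0 ctl 0x3f6 bmdma 0x6fa0 irq 14",
    ]
    for dev, drivers in shared_devices(sample).items():
        print(dev, drivers)
```

On the excerpt above this flags 0000:00:1f.1, which both pata_acpi and ata_piix logged against, while 0000:00:1f.2 appears only under pata_acpi.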