Bug 14310 - [bisected] 2.6.31 regression sis5513 PIO Mode 0 hang
Summary: [bisected] 2.6.31 regression sis5513 PIO Mode 0 hang
Status: CLOSED CODE_FIX
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: IDE (show other bugs)
Hardware: All Linux
: P1 blocking
Assignee: io_ide@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks: 13615
  Show dependency tree
 
Reported: 2009-10-03 02:52 UTC by David Fries
Modified: 2009-11-06 13:05 UTC (History)
4 users (show)

See Also:
Kernel Version: 2.6.31
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
debug patch for sis5513 issue (960 bytes, patch)
2009-10-24 13:38 UTC, Bartlomiej Zolnierkiewicz
Details | Diff

Description David Fries 2009-10-03 02:52:50 UTC
I just did a git bisection from 2.6.30 to 2.6.31 because 2.6.31 will
not boot on this system.  d.stussy@yahoo.com's post September 12th
looks the same, different CPU but both have sis5513 IDE chips.  His
would normally init the ethernet chip next, mine would do PS/2 next,
both hang right after the Uniform CD-ROM message.

This was bisected down to chaging PIO mode 0 for probing.  What
problem was that patch trying to solve?  Reverting just that patch at
the top of 2.6.31 tree works.  I can test patches.

Uniform Multi-Platform E-IDE driver
sis5513 0000:00:02.5: SiS635 ATA 100 (2nd gen) controller
sis5513 0000:00:02.5: IDE controller (0x1039:0x5513 rev 0xd0)
sis5513 0000:00:02.5: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0xff00-0xff07
    ide1: BM-DMA at 0xff08-0xff0f
Probing IDE interface ide0...
Switched to high resolution mode on CPU 0
hda: ST3250823A, ATA DISK drive
hda: host max PIO4 wanted PIO255(auto-tune) selected PIO4
hda: UDMA/100 mode selected
Probing IDE interface ide1...
hdc: _NEC DVD_RW ND-2500A, ATAPI CD/DVD-ROM drive
hdc: host max PIO4 wanted PIO255(auto-tune) selected PIO4
hdc: UDMA/33 mode selected
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0x170-0x177,0x376 on irq 15
ide_generic: please use "probe_mask=0x3f" module parameter for probing all legacy ISA IDE ports
ide-gd driver 1.18
hda: max request size: 512KiB
hda: 488397168 sectors (250059 MB) w/8192KiB Cache, CHS=30401/255/63
hda: cache flushes supported
 hda: hda1 hda2 hda3
ide-cd driver 5.00
ide-cd: hdc: ATAPI 40X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache
Uniform CD-ROM driver Revision: 3.20
<hang>

commit 6029336426a2b43e4bc6f4a84be8789a047d139e
Author: Joao Ramos <joao.ramos@inov.pt>
Date:   Sun May 17 17:22:54 2009 +0200

    ide: try to use PIO Mode 0 during probe if possible
    
    Initially set PIO Mode 0 for all host drivers that have a 'set_pio_mode'
    method before the IDE core figures out the most suited PIO mode for the
    attached device.
    
    Signed-off-by: Joao Ramos <joao.ramos@inov.pt>
    Cc: Sergei Shtylyov <sshtylyov@ru.montavista.com>
    Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

diff --git a/drivers/ide/ide-probe.c b/drivers/ide/ide-probe.c
index 7f264ed..b609a58 100644
--- a/drivers/ide/ide-probe.c
+++ b/drivers/ide/ide-probe.c
@@ -1032,6 +1032,15 @@ static void ide_port_init_devices(ide_hwif_t *hwif)
 		if (port_ops && port_ops->init_dev)
 			port_ops->init_dev(drive);
 	}
+
+	ide_port_for_each_dev(i, drive, hwif) {
+		/*
+		 * default to PIO Mode 0 before we figure out
+		 * the most suited mode for the attached device
+		 */
+		if (port_ops && port_ops->set_pio_mode)
+			port_ops->set_pio_mode(drive, 0);
+	}
 }
 
 static void ide_init_port(ide_hwif_t *hwif, unsigned int port,



vendor_id       : CentaurHauls
cpu family      : 6
model           : 7
model name      : VIA Samuel 2

lspci
00:00.0 Host bridge: Silicon Integrated Systems [SiS] 635 Host (rev 11)
00:01.0 PCI bridge: Silicon Integrated Systems [SiS] Virtual PCI-to-PCI bridge (AGP)
00:02.0 ISA bridge: Silicon Integrated Systems [SiS] SiS85C503/5513 (LPC Bridge)
00:02.1 SMBus: Silicon Integrated Systems [SiS] SiS961/2 SMBus Controller
00:02.2 USB Controller: Silicon Integrated Systems [SiS] USB 1.1 Controller (rev 07)
00:02.3 USB Controller: Silicon Integrated Systems [SiS] USB 1.1 Controller (rev 07)
00:02.5 IDE interface: Silicon Integrated Systems [SiS] 5513 [IDE] (rev d0)
00:02.6 Modem: Silicon Integrated Systems [SiS] AC'97 Modem Controller (rev a0)
00:02.7 Multimedia audio controller: Silicon Integrated Systems [SiS] AC'97 Sound Controller (rev a0)
00:03.0 Ethernet controller: Silicon Integrated Systems [SiS] SiS900 PCI Fast Ethernet (rev 90)
00:09.0 Multimedia video controller: Conexant CX23880/1/2/3 PCI Video and Audio Decoder (rev 05)
00:09.2 Multimedia controller: Conexant CX23880/1/2/3 PCI Video and Audio Decoder [MPEG Port] (rev 05)
01:00.0 VGA compatible controller: Silicon Integrated Systems [SiS] 315PRO PCI/AGP VGA Display Adapter
Comment 1 Rafael J. Wysocki 2009-10-04 13:12:14 UTC
First-Bad-Commit : 6029336426a2b43e4bc6f4a84be8789a047d139e
Notify-Also : linux-ide@vger.kernel.org
Comment 2 David Fries 2009-10-10 01:07:54 UTC
The bad commit exposed a bug which was fixed by
e13ee546bb06453939014c7b854e77fb643fd6f1
sis5513: fix PIO setup for ATAPI devices
Comment 3 Jiri Bohac 2009-10-23 18:32:03 UTC
I can still reproduce this on 2.6.31.5, where the e13ee546bb06453939014c7b854e77fb643fd6f1 fix is included. Reverting
6029336426a2b43e4bc6f4a84be8789a047d139e helps.

Without reverting 6029336426a2b43e4bc6f4a84be8789a047d139e the system hangs on "ide-cd driver 5.00", but sometimes I also get {DriveReady SeekComplete Error} errors on hda before the hang -- never seen them without 
6029336426a2b43e4bc6f4a84be8789a047d139e.

A few interesting lines of output before the hang: (hand-copied, no serial console here)

hda: ST38421A <strange_nonASCII_triangle_character_here> , ATA DISK drive
hdc: LTN5269, ATAPI CD/DVD-ROM drive
hdc: tPIO > 2, assuming tPIO=2
hdc: tPIO > 2, assuming tPIO=2
..
..
<hda seek errors>
..
possibly failed opcode: 0xc6
hda: lost interrupt
hda: 
<hda seek error>
possibly failed opcode: 0x20
 unknown patrition table
ide-cd driver 5.00
<hang>

I don't have regular access to this machine, but if help is needed, I can arrange access to the machine.

Can we re-open this bug?
Comment 4 Bartlomiej Zolnierkiewicz 2009-10-24 13:38:41 UTC
Created attachment 23513 [details]
debug patch for sis5513 issue
Comment 5 Bartlomiej Zolnierkiewicz 2009-10-24 13:39:48 UTC
Jiri, please try the patch from

http://bugzilla.kernel.org/attachment.cgi?id=23513

Thanks.
Comment 6 Jiri Bohac 2009-10-27 08:35:25 UTC
(In reply to comment #5)
> Jiri, please try the patch from
> 
> http://bugzilla.kernel.org/attachment.cgi?id=23513

Great, this patch, applied on 2.6.31.5, fixes the problem for me.

Out of curiosity, I also tried to revert the original fix (e13ee546bb06453939014c7b854e77fb643fd6f1) and, with the debug patch still applied, it still worked.

So maybe 6029336426a2b43e4bc6f4a84be8789a047d139e causes two independent problems in sis5513? One fixed with e13ee546bb06453939014c7b854e77fb643fd6f1 and one fixed with this debug patch?

It would be great if this patch could be added to the stable tree. Thanks!
Comment 7 Bartlomiej Zolnierkiewicz 2009-11-05 15:56:23 UTC
On Tuesday 27 October 2009 09:35:27 bugzilla-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=14310
> 
> 
> 
> 
> 
> --- Comment #6 from Jiri Bohac <jbohac@jikos.cz>  2009-10-27 08:35:25 ---
> (In reply to comment #5)
> > Jiri, please try the patch from
> > 
> > http://bugzilla.kernel.org/attachment.cgi?id=23513
> 
> Great, this patch, applied on 2.6.31.5, fixes the problem for me.
> 
> Out of curiosity, I also tried to revert the original fix
> (e13ee546bb06453939014c7b854e77fb643fd6f1) and, with the debug patch still
> applied, it still worked.

Thanks for checking this.

> So maybe 6029336426a2b43e4bc6f4a84be8789a047d139e causes two independent
> problems in sis5513? One fixed with e13ee546bb06453939014c7b854e77fb643fd6f1
> and one fixed with this debug patch?

Yes, your findings are correct.

> It would be great if this patch could be added to the stable tree. Thanks!

The problem is that the change in the debug patch affects all host drivers
and at this moment is not safe enough neither for -stable nor for -rc.

Somebody needs to go and review all host drivers and their usage of ->media
field for tuning logic present in ->set_pio_mode implementations first.

Before this happens we probably should revert commit 6029336 but I'll leave
this up to the maintainer (I seriously doubt he will ever go and review any
IDE drivers or make any IDE fixes so please just ping him about the revert).

Thanks,
Comment 8 David S. Miller 2009-11-06 12:51:46 UTC
From: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Date: Thu, 5 Nov 2009 16:54:27 +0100

> The problem is that the change in the debug patch affects all host drivers
> and at this moment is not safe enough neither for -stable nor for -rc.
> 
> Somebody needs to go and review all host drivers and their usage of ->media
> field for tuning logic present in ->set_pio_mode implementations first.

The drivers which do this are:

drivers/ide/alim15x3.c
drivers/ide/it8172.c
drivers/ide/it8213.c
drivers/ide/pdc202xx_old.c
drivers/ide/piix.c
drivers/ide/qd65xx.c
drivers/ide/sis5513.c
drivers/ide/slc90e66.c

While the majority of them seem to use it to decide whether to enable
prefetching or not, alim15x3.c uses the ->media value to decide
whether to turn on the ATA or the ATAPI FIFO.

I don't think it's safe to call ->set_pio_mode() until we actually
know the media type.

I suppose we could guard the prefetch/FIFO changes in these drivers
with a test like:

	if (!(hwif->port_flags & IDE_PFLAG_PROBING))

or something like that.

But in this stage of the game, it's just too risky and reverting
the early PIO0 commit is the best way to go.

I'll do that now.
Comment 9 Alan 2009-11-06 13:05:14 UTC
You turn the FIFO off if you are probing. Always

You cannot know the media type until you probe, and you cannot properly probe until you've set the interface to a sane known state. The media type should be set to indicate no media at that point.

Agreed on reverting it - it generally only affects non PC platforms anyway - the PC ones tend to come up with the controller either in PIO 0 or configured to match the drive by the BIOS - both of which are generally acceptable.

Note You need to log in before you can comment on or make changes to this bug.