Bug 44111 - pata_via: crash with VT6415 controller
Status: ASSIGNED
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: Serial ATA
Hardware: All Linux
Importance: P1 normal
Assignee: Jeff Garzik
Reported: 2012-07-01 19:22 UTC by Giuliano Procida
Modified: 2016-09-16 17:51 UTC
CC List: 6 users

Kernel Version: 3.0 and above
Regression: No


Attachments
Patch to test (4.99 KB, patch)
2012-07-09 10:16 UTC, Alan
Patch to test (5.29 KB, patch)
2012-07-09 10:20 UTC, Alan

Description Giuliano Procida 2012-07-01 19:22:33 UTC
Original Debian bug report http://bugs.debian.org/679039 has some further information.

Hardware is an ASUS M4A88TD-M/USB3 EVO motherboard with an apparently rare on-board VIA VT6415 PATA controller. I have one device linked to this controller, a LITE-ON DVDRW SOHW-1693S. CPU is a quad-core Phenom II.

Shortly after the pata_via module loads, the system hangs completely (no SysRq, no keyboard LEDs) with probability ~ 0.9. The crash is prevented with libata.dma=1. Disabling the VIA option ROM makes no difference. The BIOS is the latest available version.
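
For reference, libata.dma is a bitmask kernel parameter: bit 0 allows DMA for ATA disks, bit 1 for ATAPI devices and bit 2 for CompactFlash. A value of 1 therefore leaves ATAPI drives such as the DVDRW in PIO mode. The short sketch below only decodes that mask; the helper name is made up for illustration and is not part of any kernel interface.

```python
# Illustrative helper (not a kernel API): decode the libata.dma bitmask.
# Bit meanings follow the kernel's documented libata.dma parameter.
LIBATA_DMA_BITS = {
    0x1: "ATA disks",
    0x2: "ATAPI (CD/DVD)",
    0x4: "CompactFlash",
}


def decode_libata_dma(mask):
    """Return {device class: True if DMA is allowed} for a libata.dma value."""
    return {name: bool(mask & bit) for bit, name in LIBATA_DMA_BITS.items()}


if __name__ == "__main__":
    for value in (0, 1, 3, 7):
        print(value, decode_libata_dma(value))
```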

In Debian 3.6.33 and earlier kernels, I believe a different driver was used that did not crash. I could boot to confirm this.

The PCI device is:

04:00.0 IDE interface [0101]: VIA Technologies, Inc. VT6415 PATA IDE Host Controller [1106:0415] (prog-if 85 [Master SecO PriO])
	Subsystem: ASUSTeK Computer Inc. M5A88-V EVO [1043:838f]
	Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 17
	Region 0: I/O ports at dc00 [size=8]
	Region 1: I/O ports at d880 [size=4]
	Region 2: I/O ports at d800 [size=8]
	Region 3: I/O ports at d480 [size=4]
	Region 4: I/O ports at d400 [size=16]
	Expansion ROM at feaf0000 [disabled] [size=64K]
	Capabilities: <access denied>

These logs were obtained using netconsole:

Crash with 3.2.0:

[   65.934145] pata_via 0000:04:00.0: version 0.3.4
[   65.934184] pata_via 0000:04:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[   65.934297] pata_via 0000:04:00.0: setting latency timer to 64
[   65.935274] scsi6 : pata_via
[   65.935560] scsi7 : pata_via
[   65.936047] ata7: PATA max UDMA/133 cmd 0xdc00 ctl 0xd880 bmdma 0xd400 irq 17
[   65.936059] ata8: PATA max UDMA/133 cmd 0xd800 ctl 0xd480 bmdma 0xd408 irq 17
[   66.224661] ata7.01: ATAPI: LITE-ON DVDRW SOHW-1693S, KS09, max UDMA/66
[   66.240680] ata7.01: configured for UDMA/66
[   66.243198] scsi 6:0:1:0: CD-ROM            LITE-ON  DVDRW SOHW-1693S KS09 PQ: 0 ANSI: 5

Crash with 3.4.4:

[  963.751260] pata_via 0000:04:00.0: version 0.3.4
[  963.756541] scsi6 : pata_via
[  963.760485] scsi7 : pata_via
[  963.764560] ata7: PATA max UDMA/133 cmd 0xdc00 ctl 0xd880 bmdma 0xd400 irq 17
[  963.768330] ata8: PATA max UDMA/133 cmd 0xd800 ctl 0xd480 bmdma 0xd408 irq 17
[  964.060046] ata7.01: ATAPI: LITE-ON DVDRW SOHW-1693S, KS09, max UDMA/66
[  964.079984] ata7.01: configured for UDMA/66
[  964.086377] scsi 6:0:1:0: CD-ROM            LITE-ON  DVDRW SOHW-1693S KS09 PQ: 0 ANSI: 5

A non-crash with 3.2.0:

[  105.576978] pata_via 0000:04:00.0: version 0.3.4
[  105.577016] pata_via 0000:04:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[  105.577130] pata_via 0000:04:00.0: setting latency timer to 64
[  105.578414] scsi6 : pata_via
[  105.580212] scsi7 : pata_via
[  105.580687] ata7: PATA max UDMA/133 cmd 0xdc00 ctl 0xd880 bmdma 0xd400 irq 17
[  105.580708] ata8: PATA max UDMA/133 cmd 0xd800 ctl 0xd480 bmdma 0xd408 irq 17
[  105.868656] ata7.01: ATAPI: LITE-ON DVDRW SOHW-1693S, KS09, max UDMA/66
[  105.884670] ata7.01: configured for UDMA/66
[  105.887152] scsi 6:0:1:0: CD-ROM            LITE-ON  DVDRW SOHW-1693S KS09 PQ: 0 ANSI: 5
[  106.048133] sr0: scsi3-mmc drive: 48x/48x writer cd/rw xa/form2 cdda tray
[  106.048150] cdrom: Uniform CD-ROM driver Revision: 3.20
[  106.048497] sr 6:0:1:0: Attached scsi CD-ROM sr0
(no crash)

(unload module)
[  227.560172] ata7.01: disabled
[  227.563668] pata_via 0000:04:00.0: PCI INT A disabled

(reload module)
[  260.320995] pata_via 0000:04:00.0: version 0.3.4
[  260.321035] pata_via 0000:04:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[  260.321142] pata_via 0000:04:00.0: setting latency timer to 64
(different device enumeration)
[  260.322064] scsi8 : pata_via
[  260.322663] scsi9 : pata_via
[  260.323093] ata9: PATA max UDMA/133 cmd 0xdc00 ctl 0xd880 bmdma 0xd400 irq 17
[  260.323105] ata10: PATA max UDMA/133 cmd 0xd800 ctl 0xd480 bmdma 0xd408 irq 17
[  260.608665] ata9.01: ATAPI: LITE-ON DVDRW SOHW-1693S, KS09, max UDMA/66
[  260.624671] ata9.01: configured for UDMA/66
[  260.626251] scsi 8:0:1:0: CD-ROM            LITE-ON  DVDRW SOHW-1693S KS09 PQ: 0 ANSI: 5
(hard crash)

Let me know if there is anything further I can do to test.

Is there a simple way of changing the ATA driver msg_enable bits without re-compiling?

Regards,
Giuliano Procida.
Comment 1 Giuliano Procida 2012-07-01 19:25:36 UTC
s/Debian 3.6.33 and earlier kernels/Debian 2.6.33 and earlier kernels/
Comment 2 Jonathan Nieder 2012-07-01 22:33:04 UTC
(cc-ing Alan Cox. Known problem?)
Comment 3 Alan 2012-07-01 23:39:02 UTC
Unless you can find an old release that did work (if one ever did) it'll be very hard to do much about. No docs.
Comment 4 Jonathan Nieder 2012-07-01 23:53:36 UTC
Are the difficult-to-handle cards or configurations easy to recognize?  Would it make sense for the driver to reject them by default?
Comment 5 Giuliano Procida 2012-07-02 20:33:18 UTC
> Unless you can find an old release that did work (if one ever did)

I was mistaken regarding 2.6.33 and earlier. My kernels from that era did not support the VT6415 and I had no use of the DVDRW after upgrading the motherboard.

> it'll be very hard to do much about. No docs.

All is not lost, VIA produced a driver package. :-)

http://www.viaarena.com/Driver/via_idepatch_linuxdriverpackage_1.7.0.zip

This is a mix of pre-compiled binary modules and specific replacement files for various distributions. I have not checked the source files exhaustively or tried to build a kernel with any of them. However, I did a diff of their vanilla pata_via.c against a baseline kernel 2.6.32:

--- Kernel/2.6.32/pata_via.c.orig	2012-07-02 20:56:46.000000000 +0100
+++ Kernel/2.6.32/pata_via.c	2010-01-11 10:15:48.000000000 +0000
@@ -559,8 +559,15 @@
 		}
 
 	pci_dev_put(isa);
+	
+	if (flags & VIA_NO_ENABLES) {
+		static struct via_isa_bridge tmp;
+		tmp = *config;
+		tmp.flags |= VIA_NO_ENABLES;
+		config = &tmp;
+	}
 
-	if (!(config->flags & VIA_NO_ENABLES)) {
+	if (!(config->flags & VIA_NO_ENABLES) && !(flags & VIA_NO_ENABLES)) {
 		/* 0x40 low bits indicate enabled channels */
 		pci_read_config_byte(pdev, 0x40 , &enable);
 		enable &= 3;
@@ -654,14 +661,15 @@
 #endif
 
 static const struct pci_device_id via[] = {
-	{ PCI_VDEVICE(VIA, 0x0415), },
+	{ PCI_VDEVICE(VIA, 0x0415), VIA_NO_ENABLES},
 	{ PCI_VDEVICE(VIA, 0x0571), },
 	{ PCI_VDEVICE(VIA, 0x0581), },
 	{ PCI_VDEVICE(VIA, 0x1571), },
 	{ PCI_VDEVICE(VIA, 0x3164), },
 	{ PCI_VDEVICE(VIA, 0x5324), },
 	{ PCI_VDEVICE(VIA, 0xC409), VIA_IDFLAG_SINGLE },
-
+	{ PCI_VDEVICE(VIA, 0x9001), },
+//	{ PCI_VDEVICE(VIA, 0x9041), },
 	{ },
 };
 
This is basically a clumsy attempt to add VIA_NO_ENABLES to one device, and have the "no enables" behaviour take effect if this is set on either the bridge or the device.

Just reading the code from 3.4 it's not clear if it also disables the enables for the VT6415 (0x0415).

Does that help at all?
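
The intent of the vendor diff can be modelled in user space. The sketch below is a simplified model, not driver code: the flag value is an assumed placeholder, the function name is hypothetical, and it assumes that a zero channel mask read from config register 0x40 makes the probe give up.

```python
# Simplified user-space model of the channel-enable check touched by the
# vendor diff. Flag value and function name are assumptions for illustration.
VIA_NO_ENABLES = 0x40  # assumed value; only the flag's meaning matters here


def probe_proceeds(bridge_flags, device_flags, reg_0x40):
    """Return True if the probe would continue past the enable check.

    If "no enables" is set on either the bridge entry or the PCI ID entry,
    register 0x40 is not consulted at all. Otherwise the low two bits of
    register 0x40 are the channel mask, and a zero mask is assumed to
    abort the probe.
    """
    if (bridge_flags | device_flags) & VIA_NO_ENABLES:
        return True
    return (reg_0x40 & 0x3) != 0


if __name__ == "__main__":
    # Flag set on the device entry only, register reads as zero.
    print(probe_proceeds(0, VIA_NO_ENABLES, 0x00))
    # No flag anywhere, register reads as zero.
    print(probe_proceeds(0, 0, 0x00))
```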
Comment 6 Giuliano Procida 2012-07-02 20:54:33 UTC
See also http://old.nabble.com/unbrick-VIA-VT6410-VT6415-td33723575.html for an OpenBSD take on this.
Comment 7 Alan 2012-07-09 10:16:33 UTC
Created attachment 75131
Patch to test
Comment 8 Alan 2012-07-09 10:20:10 UTC
Created attachment 75141
Patch to test
Comment 9 Alan 2012-07-09 10:32:06 UTC
Thanks for all the digging - this is based on your digging so might do the trick.
Comment 10 Giuliano Procida 2012-07-18 22:37:11 UTC
Finally tried this.

I verified that vanilla 3.4.4 crashes as usual.

Patched 3.4.4 module loads but no devices are detected.

[  495.790434] pata_via 0000:04:00.0: version 0.3.4
Comment 11 Janne Kulmala 2013-01-11 15:46:11 UTC
For the record, it seems that I'm hitting exactly the same bug.

I have a system with two CD/DVD drives attached to a VIA VT6415 controller. The system hangs during boot-up after drive detection, and it can be cured by disabling the controller in the BIOS or by using libata.dma=1.

The motherboard is an Asus P7H55-M with the latest BIOS.

Kernel is 3.6.10-2.fc17.x86_64, but I recall hitting the problem with earlier versions too.

PCI information:

03:00.0 IDE interface [0101]: VIA Technologies, Inc. VT6415 PATA IDE Host Controller [1106:0415] (prog-if 85 [Master SecO PriO])
        Subsystem: ASUSTeK Computer Inc. M5A88-V EVO [1043:838f]
        Flags: bus master, fast devsel, latency 0, IRQ 17
        I/O ports at ec00 [size=8]
        I/O ports at e880 [size=4]
        I/O ports at e800 [size=8]
        I/O ports at e480 [size=4]
        I/O ports at e400 [size=16]
        Expansion ROM at f7ff0000 [disabled] [size=64K]
        Capabilities: [50] Power Management version 3
        Capabilities: [70] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [90] Express Legacy Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [130] Device Serial Number 00-40-63-ff-ff-63-40-00
        Kernel driver in use: pata_via


Relevant boot messages:

[    6.888438] pata_via 0000:03:00.0: version 0.3.4
[    6.890360] scsi6 : pata_via
[    6.892897] scsi7 : pata_via
[    6.892953] ata7: PATA max UDMA/133 cmd 0xec00 ctl 0xe880 bmdma 0xe400 irq 17
[    6.892954] ata8: PATA max UDMA/133 cmd 0xe800 ctl 0xe480 bmdma 0xe408 irq 17
[    7.065642] ata7.00: ATAPI: HL-DT-ST GCE-8400B, 1.02, max MWDMA2
[    7.065683] ata7.01: ATAPI: HL-DT-STDVD-ROM GDR8161B, 0100, max UDMA/33
[    7.068616] ata7.00: configured for PIO4
[    7.106398] ata7.01: configured for PIO4
[    7.107447] scsi 6:0:0:0: CD-ROM            HL-DT-ST CD-RW GCE-8400B  1.02 PQ: 0 ANSI: 5
[    7.108871] sr0: scsi3-mmc drive: 40x/40x writer cd/rw xa/form2 cdda tray
[    7.108873] cdrom: Uniform CD-ROM driver Revision: 3.20
[    7.108966] sr 6:0:0:0: Attached scsi CD-ROM sr0
[    7.109026] sr 6:0:0:0: Attached scsi generic sg3 type 5
[    7.109104] ACPI: Invalid Power Resource to register!
[    7.115481] scsi 6:0:1:0: CD-ROM            HL-DT-ST DVD-ROM GDR8161B 0100 PQ: 0 ANSI: 5
[    7.122135] sr1: scsi3-mmc drive: 20x/48x cd/rw xa/form2 cdda tray
Comment 12 Janne Kulmala 2013-01-11 16:58:21 UTC
Tried Alan's patch from comment #8 and it results in the drives not being detected.

It might be due to inverted logic. I added some debug prints, and the "enable" variable is always zero. I assume that the purpose of the patch was to skip checking that register.

-	if (vh->flags & VIA_IDFLAG_NO_ENABLES) {
+	if (!(vh->flags & VIA_IDFLAG_NO_ENABLES)) {

With that modification, the drives are detected, but the hang is still present.

The hang happens with a probability of 80%. On boots that get past the hang, I can reproduce the problem by repeatedly unloading and loading the pata_via module.
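
As a rough guide to how many load/unload cycles a test of a candidate fix needs, the arithmetic can be sketched as follows. It assumes each module load hangs independently with the same probability, which the thread does not establish; the function names are hypothetical.

```python
# Back-of-the-envelope arithmetic, assuming every module load hangs
# independently with the same probability (an assumption, not a measurement).

def survival_probability(p_hang, loads):
    """Chance that `loads` consecutive module loads all avoid the hang."""
    return (1.0 - p_hang) ** loads


def loads_needed(p_hang, confidence):
    """Smallest number of loads after which an unfixed kernel would have
    hung with at least the given confidence."""
    if not 0.0 < p_hang <= 1.0 or not 0.0 <= confidence < 1.0:
        raise ValueError("need 0 < p_hang <= 1 and 0 <= confidence < 1")
    loads = 0
    survive = 1.0
    while 1.0 - survive < confidence:
        loads += 1
        survive *= 1.0 - p_hang
    return loads


if __name__ == "__main__":
    print(survival_probability(0.8, 3))
    print(loads_needed(0.8, 0.99))
```

Under that assumption, a handful of clean load cycles is already strong evidence that a change made a difference.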
Comment 13 Giuliano Procida 2013-09-12 22:43:55 UTC
I have just retested this and have not seen any crashes with module loads or unloads. I don't see much in the way of changes to pata_via.c in the meantime.

Changed:

kernel now some Debian-provided 3.10 version
drive swapped in is a Pioneer DVR-110D
drive jumpers are set to master, instead of slave

Not changed:

other hardware, including the lack of any other PATA devices

I'll do some more tests (jumpers, cables, drives) to see if anything makes a difference.

Janne, does your drive work with 3.10 or later?
Comment 14 Giuliano Procida 2013-09-13 10:07:23 UTC
The swapped-in drive worked. The old Lite-On drive did not, regardless of master/slave settings. I can only guess that the drive is dead (and the VIA chipset reacts badly) or that it does not react in a timely fashion to some ATAPI command (ditto).

I have no spare machine/motherboard to test this drive with. I'm happy to ship it within the UK to anyone who would want to play with it (and who presumably also has a Via chipset MB).
Comment 15 Alexander Kandaurov 2016-02-21 08:25:11 UTC
I believe I have the same issue (random freezes on boot when a CD drive is connected) and I performed some testing on this.

The kernel is 4.4.0-gentoo-r1, the motherboard is ASUS P8H67-V with VT6415 PATA controller and I experimented with 4 CD drives:
1) ASUS DRW-1608P3S 1.24, max UDMA/66;
2) _NEC DVD_RW ND-3520A, 1.04, max UDMA/33;
3) NEC CD-ROM DRIVE:282 (model CDR-3001B), 4.A2, max UDMA/33;
4) LG CD-ROM CRD-8522B, 2.01, max MWDMA2.

I tried connecting a logic analyzer to the IDE bus and what I can tell from the captures is that the freeze at boot time always occurs after the MODE SENSE (10) command requesting mode page 0x2a. In the successful case, after the response has been sent through DMA, the INTRQ pin is asserted, then after some delay three status checks are performed in a row: the first two read the Alternate Status register and the last one reads the Command/Status register, which also clears the interrupt request. In the failure case, only the first read is performed and INTRQ stays high indefinitely.

Therefore, this bug is reproducible by sending a MODE SENSE request e.g. using sg_modes from sg3_utils package, like this:
while true; do sg_modes --page=0x2a /dev/sr0; done
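
A slightly more controlled variant of the loop above can be sketched in Python. It uses only the sg_modes invocation shown above; the function names, repeat count and timeout are illustrative assumptions. Printing the page before each request means the last line seen (for example over netconsole or ssh) identifies the page that triggered the hang. Note that a hard freeze cannot be caught by the timeout.

```python
# Sketch of a per-page reproducer built around the sg_modes call shown above.
# Function names, repeat count and timeout are illustrative assumptions.
import subprocess


def sg_modes_cmd(page, device="/dev/sr0"):
    """Build the sg_modes command line for one mode page."""
    return ["sg_modes", "--page=0x%02x" % page, device]


def probe_pages(pages, device="/dev/sr0", repeats=10, timeout=30):
    """Request each page `repeats` times, announcing the page first.

    WARNING: on affected hardware this may freeze the machine.
    """
    for page in pages:
        for attempt in range(repeats):
            print("page 0x%02x attempt %d" % (page, attempt + 1), flush=True)
            subprocess.run(
                sg_modes_cmd(page, device),
                stdout=subprocess.DEVNULL,
                timeout=timeout,
                check=False,
            )


# Example (not run automatically): probe_pages([0x01, 0x2A])
```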

I also tried requesting different mode pages on different drives, and it turns out that some pages on some drives cause the freeze while others do not. For example, requesting the mode page 0x2a on NEC ND-3520A doesn't actually freeze the system (but other pages do).

Also, it happened only once that when I requested the mode page 0x01 on NEC ND-3520A the system didn't lock up as usual but instead threw the following in dmesg:
[  835.769352] ata7: lost interrupt (Status 0x50)
[  835.769381] ata7.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[  835.769390] ata7.00: cmd a0/01:00:00:00:10/00:00:00:00:00/a0 tag 0 dma 20480 in
                        Mode Sense(10) 5a 00 01 00 00 00 00 10 00 00res 40/00:02:00:24:00/00:00:00:00:00/a0 Emask 0x4 (timeout)
[  835.769393] ata7.00: status: { DRDY }
[  835.769429] ata7: soft resetting link
[  835.929844] ata7.00: configured for UDMA/33
[  835.930140] ata7: EH complete
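
The command bytes in that log can be decoded with a few lines of Python. This is a minimal sketch of the MODE SENSE(10) CDB layout as defined by the SCSI command set (opcode 0x5a, page code in the low six bits of byte 2, allocation length in bytes 7 and 8); the function name is hypothetical.

```python
# Minimal decoder for a MODE SENSE(10) CDB as printed in the dmesg excerpt.
# The function name is hypothetical; the field layout follows the SCSI spec.

def decode_mode_sense10(cdb_hex):
    """Decode a 10-byte MODE SENSE(10) CDB given as a hex string."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x5A:
        raise ValueError("not a 10-byte MODE SENSE(10) CDB")
    return {
        "dbd": bool(cdb[1] & 0x08),          # disable block descriptors
        "page_control": cdb[2] >> 6,         # 0 = current values
        "page_code": cdb[2] & 0x3F,
        "subpage_code": cdb[3],
        "allocation_length": (cdb[7] << 8) | cdb[8],
    }


if __name__ == "__main__":
    print(decode_mode_sense10("5a 00 01 00 00 00 00 10 00 00"))
```

For the CDB in the log this gives page code 0x01 with current values and an allocation length of 4096 bytes.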

Just in case, here are the outputs of sg_modes -a /dev/sr0 for all four drives and page numbers that cause the lockup.

     ASUS      DRW-1608P3S       1.24   peripheral_type: cd/dvd [0x5]
Mode parameter header from MODE SENSE(10):
  Mode data length=194, medium type=0x70, specific param=0x00, longlba=0
  Block descriptor length=0
>> Unit Attention condition [vendor specific format], page_control: current
 00     00 02 00 00
>> Read-Write error recovery, page_control: current
 00     01 0a 80 03 00 00 00 00  00 00 00 00
>> Write parameters, page_control: current
 00     05 32 60 c7 08 10 00 00  00 00 00 00 00 00 00 96
 10     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 20     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 30     00 00 00 00
>> Caching, page_control: current
 00     08 0a 04 00 00 00 00 00  00 00 00 00
>> CD audio, page_control: current
 00     0e 0e 04 00 00 00 00 4b  01 ff 02 ff 00 00 00 00
>> Power condition (mmc), page_control: current
 00     1a 0a 00 03 00 00 00 c8  00 00 12 c0
>> Timeout and protect, page_control: current
 00     1d 08 00 00 00 00 00 78  15 18
>> MM capabilities and mechanical status (obsolete), page_control: current
 00     2a 42 3f 37 f1 63 29 23  1b 90 01 00 07 d0 1b 90
 10     00 00 1b 90 16 0c 00 01  00 00 00 00 16 0c 00 08
 20     00 00 21 13 00 00 1b 90  00 00 16 0c 00 00 10 89
 30     00 00 0d c8 00 00 0b 06  00 00 06 e4 00 00 02 c1
 40     00 00 00 00
(freeze on pages 0x00, 0x01, 0x05, 0x08, 0x1a, 0x1d, 0x2a, 0x3f, no freeze on page 0x3e)

    _NEC      DVD_RW ND-3520A   1.04   peripheral_type: cd/dvd [0x5]
Mode parameter header from MODE SENSE(10):
  Mode data length=230, medium type=0x70, specific param=0x00, longlba=0
  Block descriptor length=0
>> Read-Write error recovery, page_control: current
 00     01 0a 80 0f 00 00 00 00  00 00 00 00
>> Write parameters, page_control: current
 00     05 32 40 c7 08 00 00 00  00 00 00 00 00 00 00 96
 10     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 20     00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00
 30     00 00 00 00
>> Caching, page_control: current
 00     08 0a 04 00 00 00 00 00  00 00 00 00
>> CD device parameters (obsolete), page_control: current
 00     0d 06 00 09 00 3c 00 4b
>> CD audio, page_control: current
 00     0e 0e 04 00 00 00 00 4b  01 ff 02 ff 00 00 00 00
>> LU control, page_control: current
 00     18 1a 00 01 00 01 00 00  00 01 00 01 00 01 00 01
 10     00 01 00 01 00 00 00 01  00 01 00 00
>> Power condition (mmc), page_control: current
 00     1a 0a 00 03 00 00 00 c8  00 00 12 c0
>> Timeout and protect, page_control: current
 00     1d 08 00 00 00 00 00 3c  00 3c
>> MM capabilities and mechanical status (obsolete), page_control: current
 00     2a 36 1f 17 f1 77 29 23  21 13 01 00 08 00 21 13
 10     00 00 21 13 21 13 00 01  00 00 00 00 21 13 00 06
 20     00 00 21 13 00 00 1b 90  00 00 16 0d 00 00 10 8a
 30     00 00 0b 06 00 00 05 83
>> page_code: 0x30, page_control: current
 00     30 0e 01 01 00 00 00 00  00 00 00 00 00 00 00 00
(freeze on pages 0x01, 0x05, 0x08, 0x18, 0x1a, 0x1d, no freeze on pages 0x0d, 0x0e, 0x2a, 0x30, 0x3f)

    NEC       CD-ROM DRIVE:282  4.A2   peripheral_type: cd/dvd [0x5]
Mode parameter header from MODE SENSE(10):
  Mode data length=88, medium type=0x70, specific param=0x00, longlba=0
  Block descriptor length=0
>> Read-Write error recovery, page_control: current
 00     01 06 00 0f 00 00 00 00
>> CD device parameters (obsolete), page_control: current
 00     0d 06 00 0d 00 3c 00 4b
>> CD audio, page_control: current
 00     0e 0e 04 00 00 00 00 4b  01 ff 02 ff 00 00 00 00
>> LU control, page_control: current
 00     18 1a 00 00 00 01 00 00  00 00 00 01 00 01 00 00
 10     00 00 00 00 00 00 00 00  00 00 00 00
>> MM capabilities and mechanical status (obsolete), page_control: current
 00     2a 12 03 00 71 67 29 23  1b 90 01 00 00 80 0b b7
 10     00 00 00 00
(freeze on page 0x18, 0x2a, no freeze on pages 0x01, 0x0d, 0x0e, 0x3f)

    LG        CD-ROM CRD-8522B  2.01   peripheral_type: cd/dvd [0x5]
Mode parameter header from MODE SENSE(10):
  Mode data length=60, medium type=0x70, specific param=0x00, longlba=0
  Block descriptor length=0
>> Read-Write error recovery, page_control: current
 00     01 06 00 1e 00 00 00 00
>> CD device parameters (obsolete), page_control: current
 00     0d 06 00 0b 00 3c 00 4b
>> CD audio, page_control: current
 00     0e 0e 04 00 00 00 00 4b  01 ff 02 ff 00 00 00 00
>> MM capabilities and mechanical status (obsolete), page_control: current
 00     2a 12 07 00 77 63 2d 03  23 c0 00 ff 00 80 23 c0
 10     00 06 00 00
(freeze on page 0x2a, 0x3f, no freeze on pages 0x01, 0x0d, 0x0e)
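
The per-drive results above can be cross-tabulated to see which pages behave consistently. The sketch below simply re-encodes the freeze/no-freeze lists from this comment (using 0x0e rather than 0x3e for the ASUS drive, per the correction in Comment 16); the data structure and function name are illustrative.

```python
# Cross-tabulation of the freeze / no-freeze lists reported above.
# Structure and names are illustrative; the data is copied from the comment.
RESULTS = {
    "ASUS DRW-1608P3S": {
        "freeze": {0x00, 0x01, 0x05, 0x08, 0x1A, 0x1D, 0x2A, 0x3F},
        "ok": {0x0E},
    },
    "NEC ND-3520A": {
        "freeze": {0x01, 0x05, 0x08, 0x18, 0x1A, 0x1D},
        "ok": {0x0D, 0x0E, 0x2A, 0x30, 0x3F},
    },
    "NEC CDR-3001B": {
        "freeze": {0x18, 0x2A},
        "ok": {0x01, 0x0D, 0x0E, 0x3F},
    },
    "LG CRD-8522B": {
        "freeze": {0x2A, 0x3F},
        "ok": {0x01, 0x0D, 0x0E},
    },
}


def classify_pages(results):
    """Split pages into always-froze, never-froze and drive-dependent."""
    froze, fine = set(), set()
    for outcome in results.values():
        froze |= outcome["freeze"]
        fine |= outcome["ok"]
    return {
        "always_froze": sorted(froze - fine),
        "never_froze": sorted(fine - froze),
        "mixed": sorted(froze & fine),
    }


if __name__ == "__main__":
    for kind, pages in classify_pages(RESULTS).items():
        print(kind, ["0x%02x" % p for p in pages])
```

Pages 0x01, 0x2a and 0x3f come out as drive-dependent, which suggests the trigger is tied to what a given drive returns rather than to the page number alone.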
Comment 16 Alexander Kandaurov 2016-02-21 08:37:17 UTC
> (freeze on pages 0x00, 0x01, 0x05, 0x08, 0x1a, 0x1d, 0x2a, 0x3f, no freeze
> on page 0x3e)
a typo, must be 0x0e instead of 0x3e.

Also, when any of the acpi=off, acpi=noirq, noapic or libata.dma=1 kernel options is added, no freeze occurs.
Comment 17 Giuliano Procida 2016-09-15 09:57:19 UTC
I still have this motherboard and I'm still using it with the Pioneer DVD drive (and I've not tried the machine-killing Lite-On drive in many kernel versions).

I recently decided to use the drive again and it was rather slow. It turned out that I had left it in PIO4, or that it had been defaulted to PIO4 (due to libata.dma). I forced it to UDMA/66.

Reading from the drive produced occasionally corrupted data. I tried some slower DMA transfer modes but all resulted in some corruption. Corruption was (especially for UDMA/66) 4k chunks of data reading all zeros or (any DMA mode) 4k chunks of data being incorrect.

The thought occurred that this problem might go away with MSI turned on, if this is an interrupt / data race.

04:00.0 IDE interface: VIA Technologies, Inc. VT6415 PATA IDE Host Controller (prog-if 85 [Master SecO PriO])
        Subsystem: ASUSTeK Computer Inc. Motherboard
        Flags: bus master, fast devsel, latency 0, IRQ 17
        I/O ports at dc00 [size=8]
        I/O ports at d880 [size=4]
        I/O ports at d800 [size=8]
        I/O ports at d480 [size=4]
        I/O ports at d400 [size=16]
        Expansion ROM at feaf0000 [disabled] [size=64K]
        Capabilities: [50] Power Management version 3
        Capabilities: [70] MSI: Enable- Count=1/1 Maskable+ 64bit+
        Capabilities: [90] Express Legacy Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [130] Device Serial Number 00-40-63-ff-ff-63-40-00
        Kernel driver in use: pata_via
        Kernel modules: pata_via, ata_generic

The changes to enable MSI could be as simple as those here: http://lxr.free-electrons.com/source/drivers/ata/sata_sil24.c#L1346

I'll try to give this a go, perhaps this weekend.
Comment 18 Giuliano Procida 2016-09-16 17:51:16 UTC
Enabling MSI made no difference to the data corruption, Debian 4.6.0-1. I don't have the will to see if it makes any difference to the problem with the Lite-On drive.

FTR...

patch

--- a/drivers/ata/pata_via.c.orig       2016-09-15 19:21:03.588820741 +0100
+++ b/drivers/ata/pata_via.c    2016-09-15 19:25:23.087335844 +0100
@@ -655,6 +655,11 @@
 
        via_fixup(pdev, config);
 
+        if (!pci_enable_msi(pdev)) {
+          dev_info(&pdev->dev, "Using MSI\n");
+          /* pci_intx(pdev, 0); */
+        }
+        
        /* We have established the device type, now fire it up */
        return ata_pci_bmdma_init_one(pdev, ppi, &via_sht, (void *)config, 0);
 }

lspci

04:00.0 IDE interface: VIA Technologies, Inc. VT6415 PATA IDE Host Controller (prog-if 85 [Master SecO PriO])
        Subsystem: ASUSTeK Computer Inc. Motherboard
        Flags: bus master, fast devsel, latency 0, IRQ 34
        I/O ports at dc00 [size=8]
        I/O ports at d880 [size=4]
        I/O ports at d800 [size=8]
        I/O ports at d480 [size=4]
        I/O ports at d400 [size=16]
        Expansion ROM at feaf0000 [disabled] [size=64K]
        Capabilities: [50] Power Management version 3
        Capabilities: [70] MSI: Enable+ Count=1/1 Maskable+ 64bit+
        Capabilities: [90] Express Legacy Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [130] Device Serial Number 00-40-63-ff-ff-63-40-00
        Kernel driver in use: pata_via
        Kernel modules: pata_via, ata_generic

 34:          0          1     175851        146   PCI-MSI 2097152-edge      pata_via

Corruption

524138 * 2k block file on a DVD, discrepancy pattern by 2k block between 5 copies (the first one done with PIO) piped into uniq -c. Blocks containing all zeros are signalled specially.

      6 A A A A B
      4 A A 0 A A
      6 A B A A A
     26 A A A 0 A
     42 A A A A B
     30 A B A A A
     58 A A A 0 A
     78 A A B A A
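
A discrepancy listing of this shape can be produced along the following lines. This is a sketch, not the tooling actually used: it assumes letters are assigned per block in order of first appearance across the copies, that all-zero blocks are marked 0, and that only blocks where the copies disagree are listed before run-length counting in the manner of uniq -c. All names are hypothetical.

```python
# Sketch of a per-block discrepancy summary across several copies of a file.
# Labelling scheme and names are assumptions; see the lead-in above.
from itertools import groupby

BLOCK_SIZE = 2048  # 2k blocks, as in the comparison above


def block_pattern(blocks):
    """Label one block position across all copies, e.g. 'A A 0 A B'."""
    labels = {}
    out = []
    for block in blocks:
        if not any(block):  # every byte is zero
            out.append("0")
            continue
        if block not in labels:
            labels[block] = chr(ord("A") + len(labels))
        out.append(labels[block])
    return " ".join(out)


def discrepancy_runs(copies, block_size=BLOCK_SIZE):
    """Return [(count, pattern)] for consecutive disagreeing blocks."""
    length = min(len(c) for c in copies)
    patterns = []
    for offset in range(0, length, block_size):
        pattern = block_pattern([c[offset:offset + block_size] for c in copies])
        if len(set(pattern.split())) > 1:  # keep only disagreements
            patterns.append(pattern)
    return [(len(list(group)), key) for key, group in groupby(patterns)]


if __name__ == "__main__":
    good = bytes(range(256)) * 8 * 4            # four 2k blocks
    zeroed = good[:2048] + bytes(2048) + good[4096:]
    wrong = good[:4096] + b"\xff" * 4096
    for count, pattern in discrepancy_runs([good, good, zeroed, good, wrong]):
        print("%7d %s" % (count, pattern))
```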
