Bug 10723

Summary: linux 2.6.24 fails to mount ext3 rootfs on FUJITSU MHK2060AT
Product: IO/Storage Reporter: Juergen Kosel (juergen.kosel)
Component: IDEAssignee: Bartlomiej Zolnierkiewicz (bzolnier)
Status: CLOSED CODE_FIX    
Severity: normal CC: devzero
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.24.1 Subsystem:
Regression: Yes Bisected commit-id:
Attachments: the requested output of "lspci -s 14.0 -vvv -xxx" for the good kernel version
opti621: disable DMA and use PCI clock info
[PATCH] opti621: remove DMA support
[PATCH] opti621: use PCI clock value provided by controller
[PATCH] opti621: program devices timings separately in ->set_pio_mode
[PATCH] opti621: use pre-calculated PIO timings
[PATCH] opti621: add PIO 4 support
[PATCH] opti621: disable read prefetch
dmesg of patched v2.6.25
dmesg of 2.6.25 without READ_PREFETCH

Description Juergen Kosel 2008-05-16 09:46:17 UTC
Latest working kernel version:2.6.22.3 (Debian)
Earliest failing kernel version:2.6.24.1 (from kernel.org)
Distribution: Debian
Hardware Environment:Laptop Compaq Armada 4120
Software Environment: Debian Testing
Problem Description: When booting linux 2.6.24 (and also 2.6.25) it fails to mount the rootfs (hda1, with the following messages:

hda: dma_timer_expiry: dma status == 0x21
hda: DMA timeout error
hda: DMA timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
hda: dma disabled
ide0: reset: success
Done.
Begin: Mounting root file system... ...
Begin: Running /scripts/local-top ...
Done.
Begin: Running /scripts/local/premount ...
kinit: trying to resume from /dev/hda5
Attempting manual resume
kinit: No resume image, doing normal boot...
resume: libcript version: 1.4.1
Done.
JBD: no valid journal superblock found
EXT3-fs: error loading journal.
mount: Mounting /dev/hda1 on /root failed: invalid argument


When booting linux 2.6.22 the following could be retrieved:
compac:~# lspci -vv
00:00.0 Host bridge: OPTi Inc. 82C557 [Viper-M] (rev 14)
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0

00:01.0 ISA bridge: OPTi Inc. 82C558 [Viper-M ISA+IDE] (rev 02)
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR+ INTx-
	Latency: 0
	Kernel modules: ata_generic, generic

00:02.0 VGA compatible controller: Cirrus Logic GD 7548 (prog-if 00 [VGA controller])
	Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap- 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Interrupt: pin A routed to IRQ 0
	Region 0: Memory at 40000000 (32-bit, non-prefetchable) [size=16M]
	Region 1: Memory at 41000000 (32-bit, non-prefetchable) [size=16M]
	Kernel modules: cirrusfb

00:14.0 IDE interface: OPTi Inc. 82C621 [Viper-M/N+] (rev 12) (prog-if 80 [Master])
	Control: I/O+ Mem- BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap- 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Region 0: [virtual] Memory at 000001f0 (32-bit, non-prefetchable) [disabled] [size=8]
	Region 1: [virtual] Memory at 000003f0 (type 3, non-prefetchable) [disabled] [size=1]
	Region 2: [virtual] Memory at 00000170 (32-bit, non-prefetchable) [disabled] [size=8]
	Region 3: [virtual] Memory at 00000370 (type 3, non-prefetchable) [disabled] [size=1]
	Region 4: I/O ports at 1000 [size=16]
	Kernel driver in use: Opti621_IDE
	Kernel modules: opti621

compac:~# dmesg 
Linux version 2.6.22-3-486 (Debian 2.6.22-6) (maks@debian.org) (gcc version 4.1.3 20071019 (prerelease) (Debian 4.1.2-17)) #1 Mon Nov 12 07:53:08 UTC 2007
BIOS-provided physical RAM map:
 BIOS-e801: 0000000000000000 - 000000000009f000 (usable)
 BIOS-e801: 0000000000100000 - 0000000003000000 (usable)
48MB LOWMEM available.
Entering add_active_range(0, 0, 12288) 0 entries of 256 used
Zone PFN ranges:
  DMA             0 ->     4096
  Normal       4096 ->    12288
early_node_map[1] active PFN ranges
    0:        0 ->    12288
On node 0 totalpages: 12288
  DMA zone: 32 pages used for memmap
  DMA zone: 0 pages reserved
  DMA zone: 4064 pages, LIFO batch:0
  Normal zone: 64 pages used for memmap
  Normal zone: 8128 pages, LIFO batch:0
DMI not present or invalid.
Allocating PCI resources starting at 10000000 (gap: 03000000:fd000000)
Built 1 zonelists.  Total pages: 12192
Kernel command line: root=/dev/hda1 ro panic=5 
No local APIC present or hardware disabled
mapped APIC to ffffd000 (01061000)
Initializing CPU#0
PID hash table entries: 256 (order: 8, 1024 bytes)
Detected 120.691 MHz processor.
Console: colour VGA+ 80x25
Dentry cache hash table entries: 8192 (order: 3, 32768 bytes)
Inode-cache hash table entries: 4096 (order: 2, 16384 bytes)
Memory: 39276k/49152k available (1656k kernel code, 9488k reserved, 650k data, 316k init, 0k highmem)
virtual kernel memory layout:
    fixmap  : 0xfffb5000 - 0xfffff000   ( 296 kB)
    vmalloc : 0xc3800000 - 0xfffb3000   ( 967 MB)
    lowmem  : 0xc0000000 - 0xc3000000   (  48 MB)
      .init : 0xc0344000 - 0xc0393000   ( 316 kB)
      .data : 0xc029e27b - 0xc0340cc4   ( 650 kB)
      .text : 0xc0100000 - 0xc029e27b   (1656 kB)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay using timer specific routine.. 242.28 BogoMIPS (lpj=484566)
Security Framework v1.0.0 initialized
SELinux:  Disabled at boot.
Capability LSM initialized
Mount-cache hash table entries: 512
CPU: After generic identify, caps: 000001bf 00000000 00000000 00000000 00000000 00000000 00000000
Intel Pentium with F0 0F bug - workaround enabled.

CPU: After all inits, caps: 000001bf 00000000 00000000 00000000 00000000 00000000 00000000
Compat vDSO mapped to ffffe000.
CPU: Intel Pentium 75 - 200 stepping 0c
Checking 'hlt' instruction... OK.
ACPI: Core revision 20070126
ACPI Exception (tbxface-0618): AE_NO_ACPI_TABLES, While loading namespace from ACPI tables [20070126]
ACPI: Unable to load the System Description Tables
Booting paravirtualized kernel on bare hardware
NET: Registered protocol family 16
EISA bus registered
PCI: PCI BIOS revision 2.10 entry at 0xf73d9, last bus=0
PCI: Using configuration type 1
Setting up standard PCI resources
ACPI: Interpreter disabled.
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI: disabled
PnPBIOS: Scanning system for PnP BIOS support...
PnPBIOS: Found PnP BIOS installation structure at 0xc00f5010
PnPBIOS: PnP BIOS version 1.0, entry 0xf0000:0x1a65, dseg 0xf0000
PnPBIOS: 18 nodes reported by PnP BIOS; 18 recorded by driver
PCI: Probing PCI hardware
PCI: Probing PCI hardware (bus 00)
NET: Registered protocol family 8
NET: Registered protocol family 20
ACPI Error (evxfevnt-0186): Could not enable RealTimeClock event [20070126]
ACPI Warning (evxface-0145): Could not enable fixed event 4 [20070126]
pnp: 00:08: iomem range 0x0-0x9ffff could not be reserved
pnp: 00:08: iomem range 0xf0000-0xfffff could not be reserved
pnp: 00:08: iomem range 0x100000-0x2ffffff could not be reserved
Time: tsc clocksource has been installed.
pnp: 00:13: ioport range 0x100-0x107 has been reserved
pnp: 00:13: ioport range 0x2a0-0x2a0 has been reserved
pnp: 00:13: ioport range 0x370-0x371 has been reserved
pnp: 00:13: iomem range 0xfffc0000-0xffffffff has been reserved
NET: Registered protocol family 2
IP route cache hash table entries: 1024 (order: 0, 4096 bytes)
TCP established hash table entries: 2048 (order: 2, 16384 bytes)
TCP bind hash table entries: 2048 (order: 1, 8192 bytes)
TCP: Hash tables configured (established 2048 bind 2048)
TCP reno registered
checking if image is initramfs... it is
Freeing initrd memory: 6144k freed
audit: initializing netlink socket (disabled)
audit(1210697492.240:1): initialized
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
Boot video device is 0000:00:02.0
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS2 at I/O 0x3e8 (irq = 4) is a 16550A
00:02: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:03: ttyS2 at I/O 0x3e8 (irq = 4) is a 16550A
RAMDISK driver initialized: 16 RAM disks of 8192K size 1024 blocksize
PNP: PS/2 Controller [PNP0303,PNP0f0e] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
EISA: Probing bus 0 at eisa.0
Cannot allocate resource for EISA slot 1
EISA: Detected 0 cards.
TCP bic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
Using IPI Shortcut mode
Freeing unused kernel memory: 316k freed
input: AT Translated Set 2 keyboard as /class/input/input0
thermal: Unknown symbol acpi_processor_set_thermal_limit
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
SCSI subsystem initialized
OPTI621: IDE controller at PCI slot 0000:00:14.0
OPTI621: chipset revision 18
OPTI621: not 100% native mode: will probe irqs later
PCI: Setting latency timer of device 0000:00:14.0 to 64
    ide0: BM-DMA at 0x1000-0x1007, BIOS settings: hda:pio, hdb:pio
Probing IDE interface ide0...
hda: FUJITSU MHK2060AT, ATA DISK drive
libata version 2.21 loaded.
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
hda: max request size: 128KiB
hda: 11733120 sectors (6007 MB) w/512KiB Cache, CHS=12416/15/63, (U)DMA
hda: cache flushes not supported
 hda:<4>hda: dma_timer_expiry: dma status == 0x21
hda: DMA timeout error
hda: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hda: DMA disabled
ide0: reset: success
 hda1 hda2 <<4>hda: dma_timer_expiry: dma status == 0x21
hda: DMA timeout error
hda: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hda: DMA disabled
ide0: reset: success
 hda5<4>hda: dma_timer_expiry: dma status == 0x21
hda: DMA timeout error
hda: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hda: DMA disabled
ide0: reset: success
 hda6 >
hda: dma_timer_expiry: dma status == 0x21
hda: DMA timeout error
hda: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hda: DMA disabled
ide0: reset: success
Attempting manual resume
swsusp: Basic memory bitmaps created
swsusp: Basic memory bitmaps freed
kjournald starting.  Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
input: PC Speaker as /class/input/input1
Real Time Clock Driver v1.12ac
parport_pc 00:01: reported by Plug and Play BIOS
parport0: PC-style at 0x378, irq 7 [PCSPP,TRISTATE,EPP]
parport1: PC-style at 0x3bc [PCSPP,TRISTATE]
parport2: PC-style at 0x278 [PCSPP,TRISTATE,EPP]
input: ImPS/2 Generic Wheel Mouse as /class/input/input2
Adding 75560k swap on /dev/hda5.  Priority:-1 extents:1 across:75560k
EXT3 FS on hda1, internal journal
apm: BIOS version 1.2 Flags 0x03 (Driver version 1.16ac)
device-mapper: ioctl: 4.11.0-ioctl (2006-10-12) initialised: dm-devel@redhat.com
Intel ISA PCIC probe: 
  Cirrus PD672x ISA-to-PCMCIA at port 0x3e0 ofs 0x00, 2 sockets
    host opts [0]: [ring] [65/6/3] [1/15/3]
    host opts [1]: [ring] [65/6/3] [1/15/3]
    ISA irqs (default) = 3,4,5,9,10,11 polling interval = 1000 ms
pccard: PCMCIA card inserted into slot 1
cs: IO port probe 0x100-0x4ff: excluding 0x200-0x207 0x220-0x22f 0x330-0x337 0x388-0x38f 0x400-0x40f 0x480-0x48f 0x4c0-0x4df
cs: IO port probe 0x800-0x8ff: clean.
cs: IO port probe 0xc00-0xcff: clean.
cs: IO port probe 0x100-0x4ff: excluding 0x200-0x207 0x220-0x22f 0x330-0x337 0x388-0x38f 0x400-0x40f 0x480-0x48f 0x4c0-0x4df
cs: IO port probe 0x800-0x8ff: clean.
cs: IO port probe 0xc00-0xcff: clean.
cs: IO port probe 0xa00-0xaff: clean.
cs: memory probe 0x0d0000-0x0dffff: clean.
pcmcia: registering new device pcmcia1.0
cs: IO port probe 0xa00-0xaff: clean.
eth0: NE2000 Compatible: io 0x300, irq 3, hw_addr 00:E0:98:98:6C:8E
NET: Registered protocol family 10
lo: Disabled Privacy Extensions
compac:~# 

The hardware is very old (> 10 years) so it might be dying hardware.
But what appears suspicious to me is the fact that older linux versions
succeed in mounting the rootfs.


Steps to reproduce:
Comment 1 Roland Kletzing 2008-05-16 16:25:08 UTC
does "ide=nodma" make a difference ?
Comment 2 Juergen Kosel 2008-05-17 06:23:31 UTC
ide=nodma doesn't change much. It even still complains about the dma timeouts.

Except that it still fails to mount the rootfs it looks similar to this output from 2.6.22.3:

Linux version 2.6.22-3-486 (Debian 2.6.22-6) (maks@debian.org) (gcc version 4.1.3 20071019 (prerelease) (Debian 4.1.2-17)) #1 Mon Nov 12 07:53:08 UTC 2007
BIOS-provided physical RAM map:
 BIOS-e801: 0000000000000000 - 000000000009f000 (usable)
 BIOS-e801: 0000000000100000 - 0000000003000000 (usable)
48MB LOWMEM available.
Entering add_active_range(0, 0, 12288) 0 entries of 256 used
Zone PFN ranges:
  DMA             0 ->     4096
  Normal       4096 ->    12288
early_node_map[1] active PFN ranges
    0:        0 ->    12288
On node 0 totalpages: 12288
  DMA zone: 32 pages used for memmap
  DMA zone: 0 pages reserved
  DMA zone: 4064 pages, LIFO batch:0
  Normal zone: 64 pages used for memmap
  Normal zone: 8128 pages, LIFO batch:0
DMI not present or invalid.
Allocating PCI resources starting at 10000000 (gap: 03000000:fd000000)
Built 1 zonelists.  Total pages: 12192
Kernel command line: root=/dev/hda1 ro ide=nodma
No local APIC present or hardware disabled
mapped APIC to ffffd000 (01061000)
Initializing CPU#0
PID hash table entries: 256 (order: 8, 1024 bytes)
Detected 120.691 MHz processor.
Console: colour VGA+ 80x25
Dentry cache hash table entries: 8192 (order: 3, 32768 bytes)
Inode-cache hash table entries: 4096 (order: 2, 16384 bytes)
Memory: 39276k/49152k available (1656k kernel code, 9488k reserved, 650k data, 316k init, 0k highmem)
virtual kernel memory layout:
    fixmap  : 0xfffb5000 - 0xfffff000   ( 296 kB)
    vmalloc : 0xc3800000 - 0xfffb3000   ( 967 MB)
    lowmem  : 0xc0000000 - 0xc3000000   (  48 MB)
      .init : 0xc0344000 - 0xc0393000   ( 316 kB)
      .data : 0xc029e27b - 0xc0340cc4   ( 650 kB)
      .text : 0xc0100000 - 0xc029e27b   (1656 kB)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay using timer specific routine.. 242.28 BogoMIPS (lpj=484574)
Security Framework v1.0.0 initialized
SELinux:  Disabled at boot.
Capability LSM initialized
Mount-cache hash table entries: 512
CPU: After generic identify, caps: 000001bf 00000000 00000000 00000000 00000000 00000000 00000000
Intel Pentium with F0 0F bug - workaround enabled.

CPU: After all inits, caps: 000001bf 00000000 00000000 00000000 00000000 00000000 00000000
Compat vDSO mapped to ffffe000.
CPU: Intel Pentium 75 - 200 stepping 0c
Checking 'hlt' instruction... OK.
ACPI: Core revision 20070126
ACPI Exception (tbxface-0618): AE_NO_ACPI_TABLES, While loading namespace from ACPI tables [20070126]
ACPI: Unable to load the System Description Tables
Booting paravirtualized kernel on bare hardware
NET: Registered protocol family 16
EISA bus registered
PCI: PCI BIOS revision 2.10 entry at 0xf73d9, last bus=0
PCI: Using configuration type 1
Setting up standard PCI resources
ACPI: Interpreter disabled.
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI: disabled
PnPBIOS: Scanning system for PnP BIOS support...
PnPBIOS: Found PnP BIOS installation structure at 0xc00f5010
PnPBIOS: PnP BIOS version 1.0, entry 0xf0000:0x1a65, dseg 0xf0000
PnPBIOS: 18 nodes reported by PnP BIOS; 18 recorded by driver
PCI: Probing PCI hardware
PCI: Probing PCI hardware (bus 00)
NET: Registered protocol family 8
NET: Registered protocol family 20
ACPI Error (evxfevnt-0186): Could not enable RealTimeClock event [20070126]
ACPI Warning (evxface-0145): Could not enable fixed event 4 [20070126]
pnp: 00:08: iomem range 0x0-0x9ffff could not be reserved
pnp: 00:08: iomem range 0xf0000-0xfffff could not be reserved
pnp: 00:08: iomem range 0x100000-0x2ffffff could not be reserved
Time: tsc clocksource has been installed.
pnp: 00:13: ioport range 0x100-0x107 has been reserved
pnp: 00:13: ioport range 0x2a0-0x2a0 has been reserved
pnp: 00:13: ioport range 0x370-0x371 has been reserved
pnp: 00:13: iomem range 0xfffc0000-0xffffffff has been reserved
NET: Registered protocol family 2
IP route cache hash table entries: 1024 (order: 0, 4096 bytes)
TCP established hash table entries: 2048 (order: 2, 16384 bytes)
TCP bind hash table entries: 2048 (order: 1, 8192 bytes)
TCP: Hash tables configured (established 2048 bind 2048)
TCP reno registered
checking if image is initramfs... it is
Freeing initrd memory: 6144k freed
audit: initializing netlink socket (disabled)
audit(1211029346.240:1): initialized
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
Boot video device is 0000:00:02.0
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS2 at I/O 0x3e8 (irq = 4) is a 16550A
00:02: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:03: ttyS2 at I/O 0x3e8 (irq = 4) is a 16550A
RAMDISK driver initialized: 16 RAM disks of 8192K size 1024 blocksize
PNP: PS/2 Controller [PNP0303,PNP0f0e] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
EISA: Probing bus 0 at eisa.0
Cannot allocate resource for EISA slot 1
EISA: Detected 0 cards.
TCP bic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
Using IPI Shortcut mode
Freeing unused kernel memory: 316k freed
input: AT Translated Set 2 keyboard as /class/input/input0
thermal: Unknown symbol acpi_processor_set_thermal_limit
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
SCSI subsystem initialized
OPTI621: IDE controller at PCI slot 0000:00:14.0
OPTI621: chipset revision 18
OPTI621: not 100% native mode: will probe irqs later
PCI: Setting latency timer of device 0000:00:14.0 to 64
    ide0: BM-DMA at 0x1000-0x1007, BIOS settings: hda:pio, hdb:pio
Probing IDE interface ide0...
hda: FUJITSU MHK2060AT, ATA DISK drive
libata version 2.21 loaded.
Floppy drive(s): fd0 is 1.44M
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
FDC 0 is a post-1991 82077
hda: max request size: 128KiB
hda: 11733120 sectors (6007 MB) w/512KiB Cache, CHS=12416/15/63, (U)DMA
hda: cache flushes not supported
 hda:<4>hda: dma_timer_expiry: dma status == 0x21
hda: DMA timeout error
hda: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hda: DMA disabled
ide0: reset: success
 hda1 hda2 <<4>hda: dma_timer_expiry: dma status == 0x21
hda: DMA timeout error
hda: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hda: DMA disabled
ide0: reset: success
 hda5<4>hda: dma_timer_expiry: dma status == 0x21
hda: DMA timeout error
hda: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hda: DMA disabled
ide0: reset: success
 hda6 >
hda: dma_timer_expiry: dma status == 0x21
hda: DMA timeout error
hda: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hda: DMA disabled
ide0: reset: success
Attempting manual resume
swsusp: Basic memory bitmaps created
swsusp: Basic memory bitmaps freed
kjournald starting.  Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
input: PC Speaker as /class/input/input1
parport_pc 00:01: reported by Plug and Play BIOS
parport0: PC-style at 0x378, irq 7 [PCSPP,TRISTATE,EPP]
parport1: PC-style at 0x3bc [PCSPP,TRISTATE]
Real Time Clock Driver v1.12ac
input: ImPS/2 Generic Wheel Mouse as /class/input/input2
Adding 75560k swap on /dev/hda5.  Priority:-1 extents:1 across:75560k
EXT3 FS on hda1, internal journal
apm: BIOS version 1.2 Flags 0x03 (Driver version 1.16ac)
device-mapper: ioctl: 4.11.0-ioctl (2006-10-12) initialised: dm-devel@redhat.com
Comment 3 Sergei Shtylyov 2008-05-17 06:33:06 UTC
(In reply to comment #1)
> does "ide=nodma" make a difference ?

This option has been obsoleted, use hda=nodma instead.
Comment 4 Juergen Kosel 2008-05-17 07:57:24 UTC
Hello,

also hda=nodma makes no difference.
But this time I found out that it is possible to mount the rootfs from the initramfs busybox comandline as ext2:

Initializing CPU#0
PID hash table entries: 256 (order: 8, 1024 bytes)
Detected 120.687 MHz processor.
Console: colour VGA+ 80x25
console [tty0] enabled
Dentry cache hash table entries: 8192 (order: 3, 32768 bytes)
Inode-cache hash table entries: 4096 (order: 2, 16384 bytes)
Memory: 38896k/49152k available (1747k kernel code, 9868k reserved, 711k data, 324k init, 0k highmem)
virtual kernel memory layout:
    fixmap  : 0xfffb3000 - 0xfffff000   ( 304 kB)
    vmalloc : 0xc3800000 - 0xfffb1000   ( 967 MB)
    lowmem  : 0xc0000000 - 0xc3000000   (  48 MB)
      .init : 0xc036a000 - 0xc03bb000   ( 324 kB)
      .data : 0xc02b4d71 - 0xc0366bec   ( 711 kB)
      .text : 0xc0100000 - 0xc02b4d71   (1747 kB)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay using timer specific routine.. 242.55 BogoMIPS (lpj=485115)
Security Framework initialized
SELinux:  Disabled at boot.
Capability LSM initialized
Mount-cache hash table entries: 512
Initializing cgroup subsys ns
Initializing cgroup subsys cpuacct
CPU: After generic identify, caps: 000001bf 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Intel Pentium with F0 0F bug - workaround enabled.

CPU: After all inits, caps: 000001bf 00000000 00000000 00000000 00000000 00000000 00000000 00000000
Compat vDSO mapped to ffffe000.
CPU: Intel Pentium 75 - 200 stepping 0c
Checking 'hlt' instruction... OK.
Freeing SMP alternatives: 0k freed
ACPI: Core revision 20070126
ACPI Exception (tbxface-0629): AE_NO_ACPI_TABLES, While loading namespace from ACPI tables [20070126]
ACPI: Unable to load the System Description Tables
net_namespace: 64 bytes
Booting paravirtualized kernel on bare hardware
NET: Registered protocol family 16
EISA bus registered
PCI: PCI BIOS revision 2.10 entry at 0xf73d9, last bus=0
PCI: Using configuration type 1
Setting up standard PCI resources
ACPI: Interpreter disabled.
Linux Plug and Play Support v0.97 (c) Adam Belay
pnp: PnP ACPI: disabled
PnPBIOS: Scanning system for PnP BIOS support...
PnPBIOS: Found PnP BIOS installation structure at 0xc00f5010
PnPBIOS: PnP BIOS version 1.0, entry 0xf0000:0x1a65, dseg 0xf0000
PnPBIOS: 18 nodes reported by PnP BIOS; 18 recorded by driver
PCI: Probing PCI hardware
PCI: Probing PCI hardware (bus 00)
NET: Registered protocol family 8
NET: Registered protocol family 20
ACPI Error (evxfevnt-0186): Could not enable RealTimeClock event [20070126]
ACPI Warning (evxface-0145): Could not enable fixed event 4 [20070126]
Time: tsc clocksource has been installed.
system 00:08: iomem range 0x0-0x9ffff could not be reserved
system 00:08: iomem range 0xf0000-0xfffff could not be reserved
system 00:08: iomem range 0x100000-0x2ffffff could not be reserved
system 00:13: ioport range 0x100-0x107 has been reserved
system 00:13: ioport range 0x2a0-0x2a0 has been reserved
system 00:13: ioport range 0x370-0x371 has been reserved
system 00:13: ioport range 0x374-0x374 has been reserved
system 00:13: ioport range 0x400-0x47f has been reserved
system 00:13: ioport range 0x480-0x4ff has been reserved
system 00:13: ioport range 0xcf7-0xcf7 has been reserved
system 00:13: iomem range 0xfffc0000-0xffffffff has been reserved
NET: Registered protocol family 2
IP route cache hash table entries: 1024 (order: 0, 4096 bytes)
TCP established hash table entries: 2048 (order: 2, 16384 bytes)
TCP bind hash table entries: 2048 (order: 1, 8192 bytes)
TCP: Hash tables configured (established 2048 bind 2048)
TCP reno registered
checking if image is initramfs... it is
Freeing initrd memory: 6372k freed
audit: initializing netlink socket (disabled)
audit(1211033382.013:1): initialized
Total HugeTLB memory allocated, 0
VFS: Disk quotas dquot_6.5.1
Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered (default)
Boot video device is 0000:00:02.0
isapnp: Scanning for PnP cards...
isapnp: No Plug & Play device found
Serial: 8250/16550 driver $Revision: 1.90 $ 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS2 at I/O 0x3e8 (irq = 4) is a 16550A
00:02: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:03: ttyS2 at I/O 0x3e8 (irq = 4) is a 16550A
RAMDISK driver initialized: 16 RAM disks of 8192K size 1024 blocksize
PNP: PS/2 Controller [PNP0303,PNP0f0e] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
EISA: Probing bus 0 at eisa.0
Cannot allocate resource for EISA slot 1
EISA: Detected 0 cards.
cpuidle: using governor ladder
cpuidle: using governor menu
TCP bic registered
NET: Registered protocol family 1
NET: Registered protocol family 17
Using IPI Shortcut mode
registered taskstats version 1
Freeing unused kernel memory: 324k freed
input: AT Translated Set 2 keyboard as /class/input/input0
thermal: Unknown symbol acpi_processor_set_thermal_limit
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
OPTI621: IDE controller (0x1045:0xc621 rev 0x12) at  PCI slot 0000:00:14.0
OPTI621: not 100% native mode: will probe irqs later
PCI: Setting latency timer of device 0000:00:14.0 to 64
    ide0: BM-DMA at 0x1000-0x1007, BIOS settings: hda:pio, hdb:pio
OPTI621: IDE port disabled
Probing IDE interface ide0...
SCSI subsystem initialized
hdb: IRQ probe failed (0xffffefb8)
hdb: IRQ probe failed (0xffffefb8)
libata version 3.00 loaded.
hda: FUJITSU MHK2060AT, ATA DISK drive
hda: host max PIO3 wanted PIO255(auto-tune) selected PIO3
hda: Selected PIO mode 3
ide0: master: address: 1, data: 3, recovery: 3, drdy: 2 [clk]
ide0: slave:  address: 1, data: 1, recovery: 2, drdy: 2 [clk]
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: max request size: 128KiB
hda: 11733120 sectors (6007 MB) w/512KiB Cache, CHS=12416/15/63
hda: cache flushes not supported
 hda:<4>hda: dma_timer_expiry: dma status == 0x21
hda: DMA timeout error
hda: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hda: DMA disabled
ide0: reset: success
 hda1 hda2 <<4>hda: dma_timer_expiry: dma status == 0x21
hda: DMA timeout error
hda: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hda: DMA disabled
ide0: reset: success
 hda5<4>hda: dma_timer_expiry: dma status == 0x21
hda: DMA timeout error
hda: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hda: DMA disabled
ide0: reset: success
 hda6 >
hda: dma_timer_expiry: dma status == 0x21
hda: DMA timeout error
hda: dma timeout error: status=0x58 { DriveReady SeekComplete DataRequest }
ide: failed opcode was: unknown
hda: DMA disabled
ide0: reset: success
Attempting manual resume
swsusp: Marking nosave pages: 000000000009f000 - 0000000000100000
swsusp: Basic memory bitmaps created
swsusp: Basic memory bitmaps freed
JBD: no valid journal superblock found
EXT3-fs: error loading journal.
EXT2-fs warning (device hda1): ext2_fill_super: mounting ext3 filesystem as ext2
init_special_inode: bogus i_mode (54130)
init_special_inode: bogus i_mode (54130)
init_special_inode: bogus i_mode (54130)
init_special_inode: bogus i_mode (54130)
init_special_inode: bogus i_mode (54130)
init_special_inode: bogus i_mode (54130)
init_special_inode: bogus i_mode (54130)
JBD: no valid journal superblock found
EXT3-fs: error loading journal.
EXT2-fs warning (device hda1): ext2_fill_super: mounting ext3 filesystem as ext2
Comment 5 Vlad Codrea 2008-05-17 08:00:53 UTC
You might give libata a try by compiling a kernel with CONFIG_IDE=n. Alternatively, try booting a distro like Fedora that defaults to libata to see if it can access the hard drive.
Comment 6 Bartlomiej Zolnierkiewicz 2008-05-17 09:20:21 UTC
If it is possible to mount it like ext2 but not ext3 it is quite likely to be fs related issue and not IDE one (DMA timeouts are an orthogonal thing as they are also present with 2.6.22 according to the log).

In any case you should be able to narrow down the source of the problem by using git-bisect.  Please install git package for your distribution, get kernel git tree from kernel.org and do:

git bisect start
git bisect good 2.6.22
git bisect bad 2.6.24

It will select the kernel to test, compile and boot it to see if the problem is  there.  If so do "git bisect bad" which will give you new kernel to test.  If the kernel works fine do "git bisect good" instead.  After few iterations you should find the exact commit which introduced the bug.  Thanks!
Comment 7 Roland Kletzing 2008-05-17 10:08:15 UTC
>> does "ide=nodma" make a difference ?

> This option has been obsoleted, use hda=nodma instead.

so if this is obsoleted, why does Documentation/kernel-parameters.txt still mention that ?


        ide=            [HW] (E)IDE subsystem
                        Format: ide=nodma or ide=doubler
                        See Documentation/ide/ide.txt.
Comment 8 Sergei Shtylyov 2008-05-17 11:37:41 UTC
(In reply to comment #7)
> >> does "ide=nodma" make a difference ?

> > This option has been obsoleted, use hda=nodma instead.

Obsoleted in 2.6.24-rc1, see this:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=c223701cf6c706f42840631c1ca919a18e6e2800

> so if this is obsoleted, why does Documentation/kernel-parameters.txt still
> mention that ?

>         ide=            [HW] (E)IDE subsystem
>                         Format: ide=nodma or ide=doubler
>                         See Documentation/ide/ide.txt.

Hm, looks like Bart forgot to remove this mention...
Comment 9 Juergen Kosel 2008-05-23 09:51:12 UTC
Hello,

git bisect delivered the following:

912fb29a36a7269ac1c4a4df45bc0ac1d2637972 is first bad commit
commit 912fb29a36a7269ac1c4a4df45bc0ac1d2637972
Author: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>
Date:   Fri Oct 19 00:30:11 2007 +0200

    opti621: always tune PIO

    Signed-off-by: Bartlomiej Zolnierkiewicz <bzolnier@gmail.com>

:040000 040000 ac8b9614dad447f22ebc105dcec5e94d52434c41 95facc970da3f5821b06b859340232abc2518813 M      drivers


Which are changes of just 2 lines of code to allways enable autotune.
Either my hardware doesn't like it by construction or by agged electronic components.


Greetings
   Juergen
Comment 10 Bartlomiej Zolnierkiewicz 2008-05-28 15:13:52 UTC
Could you please send output of "lspci -s 14.0 -vvv -xxx" command for good and bad kernel?

Also please verify that booting good kernel with "hda=autotune" reproduces a bug.

Thanks.
Comment 11 Juergen Kosel 2008-05-29 11:44:41 UTC
Created attachment 16323 [details]
the requested output of "lspci -s 14.0 -vvv -xxx" for the good kernel version
Comment 12 Juergen Kosel 2008-05-29 11:56:38 UTC
Hello,

the good kernel started with "hda=autotune" doesn't reproduce the bug.
Could this be caused by the fact that the related driver is compiled and loaded as a module?

I have also tried to get the lspci output for the bad kernel.
With the bad kernel it is possible to mount the root fs as ext2.
But it has trouble to read from the disk.
So the directory usr appears as a file. Which makes usr/bin/lspci unreachable.

A later restart with a good kernel showed that the bad kernel had also trouble with writting. So fsck had something to fix...

I hope that I'll find the time at the weekend to try this again with a working copy of lspci on a floppy disc.


Greetings
  Juergen
Comment 13 Bartlomiej Zolnierkiewicz 2008-05-29 13:58:27 UTC
Hi,

> the good kernel started with "hda=autotune" doesn't reproduce the bug.
> Could this be caused by the fact that the related driver is compiled and
> loaded
> as a module?

It could if the IDE subsystem is itself also compiled as module (than "hda=autotune" should be passed to ide_core module).

> A later restart with a good kernel showed that the bad kernel had also
> trouble
> with writting. So fsck had something to fix...
> 
> I hope that I'll find the time at the weekend to try this again with a
> working
> copy of lspci on a floppy disc.

Upon closer inspection of opti621 host driver I recalled that it is a bit weird hardware - it doesn't use PCI configuration space but a sequence of special I/O writes to access controller configuration.

Please don't bother with the output for bad kernel (I'm sorry about asking for lspci outputs).  I'll prepare a debug patch over the weekend instead.

Thanks.
Comment 14 Bartlomiej Zolnierkiewicz 2008-06-02 15:33:02 UTC
Created attachment 16367 [details]
opti621: disable DMA and use PCI clock info

Please try attached patch (against 2.6.26-rc4 - I'll prepare patch for an older kernel if necessary), it disables DMA support in the driver (this controller doesn't support DMA at all according to the datasheet) and makes the driver use the PCI clock info from the controller.  If it still doesn't help please try uncommenting "#define READ_PREFETCH 0" in drivers/ide/pci/opti621.c.
Comment 15 Bartlomiej Zolnierkiewicz 2008-06-06 11:17:50 UTC
Any news about this issue?
Comment 16 Juergen Kosel 2008-06-06 11:38:33 UTC
Hello,

I have troubles to compile 2.6.26-rc4. It ends with a couple of unresolved symbols within several modules.
So I have to change the configuration...


Greetings
   Juergen
Comment 17 Bartlomiej Zolnierkiewicz 2008-06-06 11:45:31 UTC
Please try -rc5 (my patch should still apply fine) and if the unresolved symbols are still present send them to linux-kernel@vger.kernel.org so they can be fixed before final 2.6.26 release.

Thanks.
Comment 18 Juergen Kosel 2008-06-07 06:30:26 UTC
Hello,

there unresolved symbols are still there.
I have already mailed them to linux-kernel@vger.kernel.org
Does it makes sense to test your patch with an older version?

Greetings
  Juergen
Comment 19 Bartlomiej Zolnierkiewicz 2008-06-08 10:38:53 UTC
Created attachment 16434 [details]
[PATCH] opti621: remove DMA support
Comment 20 Bartlomiej Zolnierkiewicz 2008-06-08 10:39:22 UTC
Created attachment 16435 [details]
[PATCH] opti621: use PCI clock value provided by controller
Comment 21 Bartlomiej Zolnierkiewicz 2008-06-08 10:39:44 UTC
Created attachment 16436 [details]
[PATCH] opti621: program devices timings separately in ->set_pio_mode
Comment 22 Bartlomiej Zolnierkiewicz 2008-06-08 10:40:01 UTC
Created attachment 16437 [details]
[PATCH] opti621: use pre-calculated PIO timings
Comment 23 Bartlomiej Zolnierkiewicz 2008-06-08 10:40:18 UTC
Created attachment 16438 [details]
[PATCH] opti621: add PIO 4 support
Comment 24 Bartlomiej Zolnierkiewicz 2008-06-08 10:40:38 UTC
Created attachment 16439 [details]
[PATCH] opti621: disable read prefetch
Comment 25 Bartlomiej Zolnierkiewicz 2008-06-08 10:42:28 UTC
Hi,

I made some more changes to opti621 (basically rewritting the driver), splitted them on separate incremental patches and then backported everything to 2.6.25.

Please try them.
Comment 26 Juergen Kosel 2008-06-09 14:15:07 UTC
Hello,

I have applied your patches to 2.6.25. And with them it is possible again to
mount the rootfs as ext3.
Also the boot process is now fast, because there are no more dma timeouts.

Do you need some more output, like dmesg, to close this bug?

Thanks
  Juergen
Comment 27 Bartlomiej Zolnierkiewicz 2008-06-09 16:05:08 UTC
Hi,

Great to hear that.

> Do you need some more output, like dmesg, to close this bug?

Ideally I would like to know which patch actually fixes the problem because:

- there are a lot of changes in these patches and it is preferred that we include just a minimal fix in 2.6.25-stable tree

- patch #6 is optional (it disables read prefetch - was included just in case that using prefetch turns out to be problematic)

[bisecting 6 patches should take 3 build+reboot cycles in the worst case]

dmesg would be also useful (to see if the actual driver behavior matches with patches).

Thanks.
Comment 28 Juergen Kosel 2008-06-10 13:06:23 UTC
Hello,

(In reply to comment #27)

> - patch #6 is optional (it disables read prefetch - was included just in case
> that using prefetch turns out to be problematic)

without patch 6 it does not work!
So patch series 1-6 work but 1-5 not.
Should I test with only patch number 6 but without 1-5?

Greetings
   Juergen
Comment 29 Juergen Kosel 2008-06-10 13:07:37 UTC
Created attachment 16452 [details]
dmesg of patched v2.6.25
Comment 30 Bartlomiej Zolnierkiewicz 2008-06-10 13:28:36 UTC
> without patch 6 it does not work!

Thanks, this would explain the regression nicely
(autotuning by default also enabled read prefetch).

> So patch series 1-6 work but 1-5 not.
> Should I test with only patch number 6 but without 1-5?

#6 depends on #5 so we cannot try it but if you reverse all
patches and change the following line:

/* #define READ_PREFETCH 0 */

in drivers/ide/pci/opti621.c to

#define READ_PREFETCH 0

it should have the same effect as patch #6 alone.
Comment 31 Juergen Kosel 2008-06-11 09:45:18 UTC
Hello,

(In reply to comment #30)
> #6 depends on #5 so we cannot try it but if you reverse all
> patches and change the following line:
> 
> /* #define READ_PREFETCH 0 */
> 
> in drivers/ide/pci/opti621.c to
> 
> #define READ_PREFETCH 0

with this patch it works as before (commit 912fb29a36a7269ac1c4a4df45bc0ac1d2637972 ).
But when the kernel doesn't attempt to use the not existing DMA, booting is even faster (as it was with 2.4. kernel).

Greetings
   Juergen
Comment 32 Juergen Kosel 2008-06-11 09:46:33 UTC
Created attachment 16458 [details]
dmesg of 2.6.25 without READ_PREFETCH
Comment 33 Bartlomiej Zolnierkiewicz 2008-06-11 12:51:18 UTC
Thanks!  I re-ordered patches accordingly (patch #1 disables prefetch now).

I'll merge all patches into 2.6.26 and once they are in I'll ping -stable team to include patch #1 (disable read prefetch) and #2 (disable DMA) in the next 2.6.25.x kernel.
Comment 34 Bartlomiej Zolnierkiewicz 2008-06-19 13:55:34 UTC
All patches are in 2.6.26 now and patches #1-2 were queued for the next 2.6.25-stable release.