Bug 7235 - kernel panic, sata issue?
Summary: kernel panic, sata issue?
Status: RESOLVED CODE_FIX
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: Serial ATA
Hardware: i386 Linux
Importance: P2 high
Assignee: Tejun Heo
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-09-29 13:18 UTC by Ryan Hope
Modified: 2007-02-27 07:04 UTC
CC List: 5 users

See Also:
Kernel Version: >=2.6.18-rc*
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
dmesg after good boot of 2.6.17-gentoo-r8 (16.26 KB, text/plain)
2006-10-21 05:15 UTC, j.taimr
dmesg from 'bad' boot of 2.6.18-gentoo (15.15 KB, text/plain)
2006-10-21 05:16 UTC, j.taimr
dmesg from 'bad' boot of vanilla-2.6.19-rc2 (17.61 KB, text/plain)
2006-10-21 05:18 UTC, j.taimr
lspci output (2.23 KB, text/plain)
2006-10-21 05:18 UTC, j.taimr
lspci -n output (774 bytes, text/plain)
2006-10-21 05:19 UTC, j.taimr
lspci -v output (6.96 KB, text/plain)
2006-10-21 05:20 UTC, j.taimr

Description Ryan Hope 2006-09-29 13:18:34 UTC
Most recent kernel where this bug did not occur: ~2.6.17
Distribution: Gentoo
Hardware Environment: 
00:00.0 Host bridge: Intel Corporation Mobile 945GM/PM/GMS/940GML and 945GT
Express Memory Controller Hub (rev 03)
00:01.0 PCI bridge: Intel Corporation Mobile 945GM/PM/GMS/940GML and 945GT
Express PCI Express Root Port (rev 03)
00:1b.0 Audio device: Intel Corporation 82801G (ICH7 Family) High Definition
Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1
(rev 02)
00:1c.1 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 2
(rev 02)
00:1c.2 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 3
(rev 02)
00:1c.3 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 4
(rev 02)
00:1c.4 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express
Port 5 (rev 02)
00:1c.5 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express
Port 6 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #3 (rev 02)
00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #4 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI
Controller (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev e2)
00:1f.0 ISA bridge: Intel Corporation 82801GHM (ICH7-M DH) LPC Interface Bridge
(rev 02)
00:1f.2 IDE interface: Intel Corporation 82801GBM/GHM (ICH7 Family) Serial ATA
Storage Controller IDE (rev 02)
00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 02)
01:00.0 VGA compatible controller: ATI Technologies Inc M56P [Radeon Mobility X1600]
03:00.0 Ethernet controller: Intel Corporation 82573E Gigabit Ethernet
Controller (Copper) (rev 03)
03:00.2 IDE interface: Intel Corporation Unknown device 108d (rev 03)
03:00.3 Serial controller: Intel Corporation Intel(R) Active Management
Technology - SOL (rev 03)
03:00.4 Class 0c07: Intel Corporation 82573E KCS (Active Management) (rev 03)
05:00.0 Network controller: Intel Corporation PRO/Wireless 3945ABG Network
Connection (rev 02)
0a:09.0 CardBus bridge: O2 Micro, Inc. OZ711MP1/MS1 MemoryCardBus Controller
(rev 21)
0a:09.1 CardBus bridge: O2 Micro, Inc. OZ711MP1/MS1 MemoryCardBus Controller
(rev 21)
0a:09.4 FireWire (IEEE 1394): O2 Micro, Inc. Firewire (IEEE 1394) (rev 02)

Problem Description: The kernel won't boot; it panics when it tries to mount root.
It seems like my SATA controller isn't working in 2.6.18.

Steps to reproduce: I boot my laptop with any 2.6.18 kernel.
Comment 1 Andrew Morton 2006-09-29 13:35:08 UTC
hm, sounds like a straightforward regression.

First, please double-check the .config.  If that looks OK then
please capture the boot-time messages.  That's pretty easy if
you have another computer on the LAN.  See
Documentation/networking/netconsole.txt
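
As a rough sketch of what that setup involves, the helper below assembles a
netconsole= boot parameter in the form described in netconsole.txt
(src-port@src-ip/dev,tgt-port@tgt-ip/tgt-mac). The function name and every
address, port, and interface name in the usage note are hypothetical
placeholders, not values from this report.

```shell
# Build a netconsole= kernel boot parameter from its six fields.
# Usage: netconsole_param <src-port> <src-ip> <dev> <tgt-port> <tgt-ip> <tgt-mac>
# All argument values are supplied by the caller; none are taken from this bug.
netconsole_param() {
    printf 'netconsole=%s@%s/%s,%s@%s/%s\n' "$1" "$2" "$3" "$4" "$5" "$6"
}
```

For example, `netconsole_param 6665 192.168.0.10 eth0 6666 192.168.0.20 00:11:22:33:44:55`
prints a string that can be appended to the kernel command line, with a UDP
listener such as netcat running on the target machine to capture the messages.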
Comment 2 Tejun Heo 2006-09-29 17:22:56 UTC
All the libata (SATA) drivers moved to drivers/ata and the configuration menu
has been separated from SCSI.  It's under "Device Drivers -> Serial ATA and
Parallel ATA drivers".  Kconfig cannot convert to new ones automagically, so
you'll have to select them manually.  So, please double check you have your
libata drivers selected.
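
One way to double-check is sketched below: a small helper that reports whether
each named option is enabled in a given .config. The function name check_config
is made up for illustration, and which options to pass depends on the hardware.
Matching on the full option name avoids the false positives that a bare
substring grep can produce.

```shell
# Report whether each named option is built in (=y) or modular (=m)
# in a kernel .config file.
# Usage: check_config <config-file> <option>...
check_config() {
    config_file=$1
    shift
    for opt in "$@"; do
        # Anchor the match so that e.g. CONFIG_ATA_PIIX does not also
        # match CONFIG_PATA_MPIIX, and "# ... is not set" lines are ignored.
        if grep -Eq "^${opt}=(y|m)$" "$config_file"; then
            echo "$opt: enabled"
        else
            echo "$opt: MISSING"
        fi
    done
}
```

Calling `check_config .config CONFIG_SATA_AHCI CONFIG_ATA_PIIX` would cover the
two options discussed later in this report.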
Comment 3 Ryan Hope 2006-09-29 21:52:22 UTC
cat .config | grep SATA
# CONFIG_BLK_DEV_IDE_SATA is not set
CONFIG_SATA_AHCI=y
# CONFIG_SATA_SVW is not set
# CONFIG_SATA_MV is not set
# CONFIG_SATA_NV is not set
# CONFIG_SATA_QSTOR is not set
# CONFIG_SATA_PROMISE is not set
# CONFIG_SATA_SX4 is not set
# CONFIG_SATA_SIL is not set
# CONFIG_SATA_SIL24 is not set
# CONFIG_SATA_SIS is not set
# CONFIG_SATA_ULI is not set
# CONFIG_SATA_VIA is not set
# CONFIG_SATA_VITESSE is not set
CONFIG_SATA_INTEL_COMBINED=y
Comment 4 Tejun Heo 2006-09-29 22:00:42 UTC
Is your controller in AHCI mode?  If not, you also need to turn on CONFIG_ATA_PIIX.
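
A quick way to guess the mode from lspci output is sketched below. It assumes
that lspci labels an AHCI-mode controller "SATA controller" and an IDE-mode one
"IDE interface", and that the controller is an Intel ICH part; the function
name suggest_driver is hypothetical, and the result is only a hint.

```shell
# Suggest a libata driver option from a single line of lspci output.
# Assumption: "SATA controller" means AHCI mode, "IDE interface" means
# IDE/legacy mode. Only meaningful for Intel ICH controllers.
suggest_driver() {
    case "$1" in
        *"SATA controller"*) echo "CONFIG_SATA_AHCI" ;;
        *"IDE interface"*)   echo "CONFIG_ATA_PIIX" ;;
        *)                   echo "unknown" ;;
    esac
}
```

Applied to the 00:1f.2 entry in the description ("IDE interface: ... Serial ATA
Storage Controller IDE"), this sketch would point at CONFIG_ATA_PIIX.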
Comment 5 Ryan Hope 2006-09-29 22:12:06 UTC
cat .config | grep PIIX
CONFIG_BLK_DEV_PIIX=y
CONFIG_ATA_PIIX=y
CONFIG_PATA_MPIIX=y
# CONFIG_PATA_OLDPIIX is not set
CONFIG_I2C_PIIX4=y
Comment 6 j.taimr 2006-10-21 05:14:16 UTC
Very likely I hit the same issue. I am also using Gentoo; the last working kernel
is 2.6.17-gentoo-r8. Kernels 2.6.18-gentoo and vanilla-2.6.19-rc2 do not
initialize the VIA SATA subsystem properly, and the boot ends with a kernel panic.
The typical messages are:

sata_via 0000:00:0f.0: routed to hard irq line 2
ata1: SATA max UDMA/133 cmd 0xE000 ctl 0xE102 bmdma 0xE400 irq 18
ata2: SATA max UDMA/133 cmd 0xE200 ctl 0xE302 bmdma 0xE408 irq 18
scsi0 : sata_via
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: qc timeout (cmd 0xec)
ata1.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ATA: abnormal status 0xD8 on port 0xE007
scsi1 : sata_via
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata2.00: qc timeout (cmd 0xec)
ata2.00: failed to IDENTIFY (I/O error, err_mask=0x4)
ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ATA: abnormal status 0xD8 on port 0xE207

The same happens with the kernel parameters acpi=off and/or noapic, so it looks
like a bug in the sata_via driver.
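
For comparing logs like the one above, a small filter can list the devices that
failed IDENTIFY. This is only a sketch: the function name failed_identify is
made up, and the pattern is based solely on the message format quoted in this
comment.

```shell
# Print each device (e.g. ata1.00) that logged "failed to IDENTIFY".
# Reads the named log files, or standard input if none are given.
failed_identify() {
    sed -n 's/.*\(ata[0-9]*\.[0-9]*\): failed to IDENTIFY.*/\1/p' "$@" | sort -u
}
```

It can be fed a saved log, as in `failed_identify dmesg-bad.txt`, where the
file name is a placeholder.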
Comment 7 j.taimr 2006-10-21 05:15:55 UTC
Created attachment 9315
dmesg after good boot of 2.6.17-gentoo-r8
Comment 8 j.taimr 2006-10-21 05:16:58 UTC
Created attachment 9316
dmesg from 'bad' boot of 2.6.18-gentoo
Comment 9 j.taimr 2006-10-21 05:18:02 UTC
Created attachment 9317
dmesg from 'bad' boot of vanilla-2.6.19-rc2
Comment 10 j.taimr 2006-10-21 05:18:53 UTC
Created attachment 9318
lspci output
Comment 11 j.taimr 2006-10-21 05:19:31 UTC
Created attachment 9319
lspci -n output
Comment 12 j.taimr 2006-10-21 05:20:11 UTC
Created attachment 9320
lspci -v output
Comment 13 Tejun Heo 2006-10-31 01:38:28 UTC
Please read the following two threads.

http://thread.gmane.org/gmane.linux.kernel/459475/focus=460168
http://thread.gmane.org/gmane.linux.ide/13627/focus=13628

Most (if not all) of these 'sata_via reports link is online but times out on
IDENTIFY' bug reports are due to VIA quirk changes related to IRQ.  I don't know
why it hasn't been fixed yet, though.
Comment 14 Ryan Hope 2006-12-24 11:17:11 UTC
This issue is still not fixed in 2.6.20-rc1-git7 or 2.6.20-rc1-mm1.

===========================================

00:00.0 Host bridge: Intel Corporation Mobile 945GM/PM/GMS/940GML and 945GT
Express Memory Controller Hub (rev 03)
00:01.0 PCI bridge: Intel Corporation Mobile 945GM/PM/GMS/940GML and 945GT
Express PCI Express Root Port (rev 03)
00:1b.0 Audio device: Intel Corporation 82801G (ICH7 Family) High Definition
Audio Controller (rev 02)
00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1
(rev 02)
00:1c.1 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 2
(rev 02)
00:1c.2 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 3
(rev 02)
00:1c.3 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 4
(rev 02)
00:1c.4 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express
Port 5 (rev 02)
00:1c.5 PCI bridge: Intel Corporation 82801GR/GH/GHM (ICH7 Family) PCI Express
Port 6 (rev 02)
00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #1 (rev 02)
00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #2 (rev 02)
00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #3 (rev 02)
00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #4 (rev 02)
00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI
Controller (rev 02)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev e2)
00:1f.0 ISA bridge: Intel Corporation 82801GHM (ICH7-M DH) LPC Interface Bridge
(rev 02)
00:1f.2 IDE interface: Intel Corporation 82801GBM/GHM (ICH7 Family) Serial ATA
Storage Controller IDE (rev 02)
00:1f.3 SMBus: Intel Corporation 82801G (ICH7 Family) SMBus Controller (rev 02)
01:00.0 VGA compatible controller: ATI Technologies Inc M56P [Radeon Mobility X1600]
03:00.0 Ethernet controller: Intel Corporation 82573E Gigabit Ethernet
Controller (Copper) (rev 03)
03:00.2 IDE interface: Intel Corporation Unknown device 108d (rev 03)
03:00.3 Serial controller: Intel Corporation Intel(R) Active Management
Technology - SOL (rev 03)
03:00.4 Class 0c07: Intel Corporation 82573E KCS (Active Management) (rev 03)
05:00.0 Network controller: Intel Corporation PRO/Wireless 3945ABG Network
Connection (rev 02)
0a:09.0 CardBus bridge: O2 Micro, Inc. OZ711MP1/MS1 MemoryCardBus Controller
(rev 21)
0a:09.1 CardBus bridge: O2 Micro, Inc. OZ711MP1/MS1 MemoryCardBus Controller
(rev 21)
0a:09.4 FireWire (IEEE 1394): O2 Micro, Inc. Unknown device 00f7 (rev 02)
Comment 15 j.taimr 2006-12-27 09:01:32 UTC
It has symptoms identical to my bug #7415. Could it be the same problem?
Comment 16 Tejun Heo 2007-01-12 16:51:57 UTC
#7415 is VIA; this one is Intel.  I don't think it's the same problem.  Ryan,
can you please post failing boot messages?  It's hard to tell what went wrong
without more info.  Netconsole is the easiest way to get it.  Please take a look
at Documentation/networking/netconsole.txt.  Thanks.
Comment 17 Tejun Heo 2007-02-27 07:04:34 UTC
Most ata_piix detection issues are ironed out as of 2.6.20.  Please test
2.6.20.1 and reopen if it's still broken for you.  Thanks.
