Bug 7641 - sata_via : VT6420 chipset support broken from kernel 2.6.18 onwards
Status: RESOLVED PATCH_ALREADY_AVAILABLE
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: Serial ATA
Hardware: i386 Linux
Importance: P2 normal
Assignee: Tejun Heo
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-12-06 10:27 UTC by Ben Hodgetts (Enverex)
Modified: 2007-02-27 07:08 UTC

See Also:
Kernel Version: 2.6.18/2.6.19
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
Kernel output (106.52 KB, image/jpeg)
2006-12-12 09:39 UTC, Ben Hodgetts (Enverex)

Description Ben Hodgetts (Enverex) 2006-12-06 10:27:34 UTC
Most recent kernel where this bug did *NOT* occur: 2.6.17.13

Distribution: Gentoo 2006.1

Hardware Environment: 
00:00.0 Host bridge: nVidia Corporation nForce3 Host Bridge (rev a4)
00:01.0 ISA bridge: nVidia Corporation nForce3 LPC Bridge (rev a6)
00:01.1 SMBus: nVidia Corporation nForce3 SMBus (rev a4)
00:02.0 USB Controller: nVidia Corporation nForce3 USB 1.1 (rev a5)
00:02.1 USB Controller: nVidia Corporation nForce3 USB 1.1 (rev a5)
00:02.2 USB Controller: nVidia Corporation nForce3 USB 2.0 (rev a2)
00:08.0 IDE interface: nVidia Corporation nForce3 IDE (rev a5)
00:0a.0 PCI bridge: nVidia Corporation nForce3 PCI Bridge (rev a2)
00:0b.0 PCI bridge: nVidia Corporation nForce3 AGP Bridge (rev a4)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:08.0 Multimedia audio controller: Cirrus Logic CS 4614/22/24 [CrystalClear SoundFusion Audio Accelerator] (rev 01)
01:0a.0 Network controller: RaLink RT2500 802.11g Cardbus/mini-PCI (rev 01)
01:0c.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID Controller (rev 50)
01:0d.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet (rev 10)
02:00.0 VGA compatible controller: nVidia Corporation NV17 [GeForce4 MX 420] (rev a3)

Software Environment:
gcc-4.1.1, glibc-2.5-r0

Problem Description:
On kernel 2.6.18 and onwards, the machine won't boot. It reaches sata_via,
starts probing, and takes a long time, stating that the device is being slow,
please wait. It then starts repeating that the device is acting abnormally
(echoes this about 6 times), then pauses for about 30 seconds. It repeats this
process about 3 times and takes about 5 minutes in all. It eventually continues
but hasn't actually detected any drives and thus can't complete the boot
process (as the machine can't see the drive). The controller works fine with
kernel 2.6.17 and below though.

Steps to reproduce:
Try and use a VT6420 chipset with Kernel 2.6.18 or 2.6.19.
Comment 1 Tejun Heo 2006-12-06 17:27:33 UTC
I can't tell without the full dmesg, but it seems to be a VIA IRQ quirk problem.
Ben, can you post the boot dmesg?  Netconsole comes in handy when you can't
mount the root fs; see Documentation/networking/netconsole.txt.
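
For anyone following the netconsole suggestion: netconsole sends kernel log lines as UDP datagrams to a remote host, so the receiving side only needs a UDP listener. The following is a minimal sketch of such a listener, not anything posted in this bug. The function names `open_listener` and `collect_log` are hypothetical, and port 6666 is assumed to be the target port because it is the default named in the netconsole documentation; adjust it to whatever the sending kernel is configured with.

```python
import socket

# Assumption: 6666 is the default netconsole target port; change if needed.
NETCONSOLE_DEFAULT_PORT = 6666


def open_listener(host="0.0.0.0", port=NETCONSOLE_DEFAULT_PORT):
    """Bind a UDP socket on which netconsole datagrams can be received."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def collect_log(sock, max_datagrams, timeout=5.0):
    """Return the text of up to max_datagrams received datagrams.

    Stops early if no datagram arrives within `timeout` seconds.
    """
    sock.settimeout(timeout)
    chunks = []
    for _ in range(max_datagrams):
        try:
            data, _addr = sock.recvfrom(65535)
        except socket.timeout:
            break  # sender went quiet; return what we have so far
        chunks.append(data.decode("utf-8", errors="replace"))
    return "".join(chunks)
```

On the logging machine one would call `open_listener()` before booting the failing kernel and then write the result of `collect_log(...)` to a file. Since UDP is unreliable, lines can be lost, so the capture is best treated as a close approximation of dmesg rather than a guaranteed complete copy.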
Comment 2 Ben Hodgetts (Enverex) 2006-12-07 02:35:01 UTC
Can't, I'm afraid; the machine can only access the network through the
Serialmonkey rt2500 driver and it's kinda in a safe place not to be broken again
anytime soon. I can boot the old kernel however and take a photo of the screen
while it boots. Not perfect, but it would be the same as what dmesg would have
output and would contain the entire error strings.
Comment 3 Tejun Heo 2006-12-07 02:51:46 UTC
That will be good enough.
Comment 4 Ben Hodgetts (Enverex) 2006-12-12 09:39:27 UTC
Created attachment 9797
Kernel output

Sorry this took so long, only just managed to get around to robbing a camera to
do it. Basically it's what you see just repeated 3 times.
Comment 5 Tejun Heo 2006-12-12 14:41:46 UTC
Does giving 'irqpoll' option to kernel change anything?
Comment 6 Ben Hodgetts (Enverex) 2006-12-13 02:41:41 UTC
Nope, I tried things like that first. irqpoll, pci=routeirq, irqbalance, etc etc
I tried anything I could think of and any combination of them but nothing
changed (well, the IRQ changed in the output but it still acted the same).
Comment 7 j.taimr 2006-12-21 01:26:55 UTC
Could it be the same situation as I hit (bug #7415)?
Comment 8 Ben Hodgetts (Enverex) 2006-12-27 09:39:16 UTC
Any word on this? I need to update the kernel to (hopefully) get my TV card
working, but at the moment I can't due to this issue (else I won't be able to
boot, heh).
Comment 9 Ben Hodgetts (Enverex) 2006-12-27 10:03:17 UTC
Actually I think the above may be right, I'm checking the patches now to see if
it's a duplicate of http://bugzilla.kernel.org/show_bug.cgi?id=7415.
Comment 10 j.taimr 2006-12-27 13:02:14 UTC
Could you find the first bad commit in the git tree (following the approach in
Daniel Drake's post in #7415)? Or could you try reverting Tejun's commit with
dsd's patch (also in #7415)? The solutions I mentioned work for me without
problems on gentoo-2.6.18, -r2, -r4, -r5, -r6, -2.6.19, 2.6.19-r2 and on git
2.6.20-rc1 as well. It is also possible that there is more than one source of
trouble, but my system suffers only from the ATA_NIEN problem.
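
For readers unfamiliar with finding a "first bad commit": the idea behind `git bisect` is a binary search between a known-good and a known-bad revision. The sketch below illustrates that idea on a plain, linear list of commits. It is a simplification (real git history is a graph, and git handles that itself), and `first_bad` and `is_bad` are hypothetical names; in practice `is_bad` corresponds to building and booting the kernel at that commit and checking whether the drives are detected.

```python
def first_bad(commits, is_bad):
    """Return the first commit for which is_bad(commit) is True.

    Assumes commits are ordered oldest to newest, commits[0] is known good,
    commits[-1] is known bad, and every commit after the first bad one is
    also bad.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the regression is at mid or earlier
        else:
            lo = mid  # the regression is after mid
    return commits[hi]
```

Each test halves the remaining range, so even a few thousand commits between two releases need only a dozen or so build-and-boot cycles.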
Comment 11 Tejun Heo 2006-12-27 19:29:52 UTC
There were a number of VIA detection failure reports that were caused by the
VIA IRQ quirk problem, and others that were probably caused by the ->freeze()
problem.  Polling IDENTIFY seemed to fix some cases of misdetection too.  Can
you give 2.6.20-rc2 a shot?
Comment 12 Ben Hodgetts (Enverex) 2006-12-28 15:35:32 UTC
No, it got worse, now it froze up for a while before it even started giving the
ATA error messages.
Comment 13 j.taimr 2006-12-28 23:36:34 UTC
Tejun, I am afraid things are even more complicated:
Ben wrote that he uses the Gentoo distro, and all gentoo-sources kernels have
been patched against the VIA quirks for a long, long time (dsd knows the details
better, but all the gentoo-2.6.17, 2.6.18 and 2.6.19 kernels that I tried
contain this patch). I think there must be another cause with identical symptoms.
Comment 14 Ben Hodgetts (Enverex) 2006-12-29 00:52:17 UTC
I wasn't using Gentoo sources, I was using the original ones from kernel.org.
Comment 15 j.taimr 2006-12-29 11:26:11 UTC
Aaah, sorry, I was misled by your report (Distribution: Gentoo 2006.1). But the
VIA quirk problem is quite probable then: I did not see a message like
'PCI: VIA IRQ fixup for 0000:00:0f.1, from 255 to 2'
in your picture (comment #4), immediately before VIA SATA start-up. So, did you
try any of the gentoo-sources >= 2.6.18, possibly with the .freeze patch?
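
When a boot log is available as text (rather than a photo), the check described above can be automated. This is a small sketch that scans log text for the fixup message; the pattern is derived only from the single example line quoted in this comment, so the exact wording on other kernel versions is an assumption, and `find_via_irq_fixups` is a hypothetical helper name.

```python
import re

# Pattern modelled on the one example line quoted above; other kernels may
# word the message differently.
VIA_FIXUP_RE = re.compile(
    r"PCI: VIA IRQ fixup for "
    r"(?P<device>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]), "
    r"from (?P<old>\d+) to (?P<new>\d+)"
)


def find_via_irq_fixups(log_text):
    """Return (device, old_irq, new_irq) for every fixup line in log_text."""
    return [
        (m.group("device"), int(m.group("old")), int(m.group("new")))
        for m in VIA_FIXUP_RE.finditer(log_text)
    ]
```

An empty result for a log from a machine with a VIA controller would be consistent with the quirk not having been applied, which is the situation suspected here.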
Comment 16 Ben Hodgetts (Enverex) 2006-12-29 16:24:56 UTC
Ok, using Gentoo Sources .19 with this
(http://bugzilla.kernel.org/attachment.cgi?id=9893&action=view) patch FIXED the
issue. Huzzah. :)
Comment 17 j.taimr 2006-12-30 04:29:01 UTC
If THIS works, then you really are in the same situation as me. But using
gentoo-sources (with the VIA quirks patch) AND my ata_irq_on(ap) patch should
work as well. Since you are on .19, you should try the second part of the
ata_irq_on patch (the part that patches libata-sff.c). Or you can try vanilla
sources and apply both the VIA quirks patch AND the ata_irq_on(ap) patch... that
should work too.
Comment 18 Tejun Heo 2007-02-27 07:07:51 UTC
This should have been fixed in v2.6.20 w/ svia_noop_freeze.  Closing.
