Bug 48631

Summary: [Regression] SATA reset failing in 3.6
Product: IO/Storage
Reporter: Eugene (ken20001)
Component: Serial ATA
Assignee: Aaron Lu (aaron.lu)
Status: CLOSED INVALID
Severity: high
CC: aaron.lu, alan, fis, russianneuromancer
Priority: P1
Hardware: All
OS: Linux
Kernel Version:
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 56331
Attachments:
  lspci
  dmesg with kernel 3.5.0-17
  dmesg with Linux 3.5
  dmesg from fail kernel 3.6 boot

Description Eugene 2012-10-09 10:27:31 UTC
Created attachment 82701 [details]
lspci

OS: Kubuntu 12.10
Kernel 3.6 was taken from: http://kernel.ubuntu.com/~kernel-ppa/mainline/
Motherboard: ASUS P5B-Deluxe (ICH8 and JMB36x)


When I installed kernel 3.6 (and later 3.6.1), I found it can't boot my system, but with kernel 3.5.0-17 (from the distro) everything is fine. The following messages appear during boot with kernel 3.6(.1):

ata7: ACPI get timing mode failed (AE 0x300b)
ata7: ACPI get timing mode failed (AE 0x300b)
ata7: ACPI get timing mode failed (AE 0x300b)
ata7: prereset failed (errno=-19)
ata7: reset failed, giving up

ata8: ACPI get timing mode failed (AE 0x300b)
ata8: ACPI get timing mode failed (AE 0x300b)
ata8: ACPI get timing mode failed (AE 0x300b)
ata8: prereset failed (errno=-19)
ata8: reset failed, giving up

Gave up waiting for root device. Common problems:
- Boot args (cat /proc/cmdline)
  - Check rootdelay= (did the system wait long enough?)
  - Check root= (did the system wait for the right device?)
- Missing modules (cat /proc/modules; ls /dev)

ALERT! /dev/disk/by-uuid/8e6ab4ba-b981-4821-8683-9fa908a748e5 does not exist. Dropping to a shell!

BusyBox v1.19.3 (Ubuntu 1:1.19.3-7ubuntu1) built-in shell (ash)
Enter 'help' for a list of built-in commands
(initramfs)
Comment 1 Eugene 2012-10-09 10:30:36 UTC
Created attachment 82711 [details]
dmesg with kernel 3.5.0-17
Comment 2 RussianNeuroMancer 2012-10-12 06:08:52 UTC
Same issue with a JMB368 IDE controller. With Linux 3.5 the system boots fine. Do I need to provide additional information?
Comment 3 Alan 2012-10-12 15:36:55 UTC
Can you attach a dmesg as well? It looks like 3.6 broke the JMicron controllers.
Comment 4 RussianNeuroMancer 2012-10-12 19:03:20 UTC
Created attachment 83051 [details]
dmesg with Linux 3.5

Sure.
Comment 5 Eugene 2012-11-06 15:22:34 UTC
Do you need more info, or is this enough to start fixing the bug?
Comment 6 Alan 2012-11-06 16:09:20 UTC
Bugzilla is just used to track bugs, not fix them. That's a question to direct to your distribution.

Discussing it on linux-ata@vger.kernel.org may also find folks interested in it.
Comment 7 Eugene 2012-11-07 12:15:37 UTC
I just meant: do you need any additional info to change the status? It's still NEEDINFO, that's all. And, BTW, should I also file the same report on Launchpad or somewhere else to get work started on this bug?
Comment 8 Alan 2012-11-07 14:48:24 UTC
I'm not sure what the Ubuntu procedure for bug fixing is, sorry.
Comment 9 Jeff Garzik 2012-11-16 04:42:03 UTC
Added Aaron Lu to CC, as he has been poking at libata ACPI lately.
Comment 10 Aaron Lu 2012-11-19 01:26:53 UTC
Hello Eugene,

Can you please check which kernel module is driving your JMicron controller in 3.5? Also, can you show more dmesg from the failing 3.6 kernel so that I can see which kernel module is initializing the controller?

And you can also try to blacklist pata_acpi to see if this makes a difference, thanks.
Comment 11 Aaron Lu 2012-11-19 01:32:47 UTC
Err... I just saw in dmesg that pata_jmicron is driving the JMicron controller in 3.5. So my guess is that, because a latent pata_acpi bug has been fixed, pata_acpi is now driving the controller, but its prereset fails.

Please blacklist pata_acpi to see if this is the case, thanks.
Comment 12 Eugene 2012-11-26 16:23:58 UTC
Recently tried

blacklist pata_acpi

in /etc/modprobe.d/blacklist-jmicron_module.conf

Nothing changed for Kernel 3.6.
Comment 13 Eugene 2012-11-26 17:40:45 UTC
Created attachment 87321 [details]
dmesg from fail kernel 3.6 boot
Comment 14 Eugene 2012-11-26 17:42:50 UTC
I captured dmesg after trying to boot with the pata_acpi module blacklisted. See the attachment.
Comment 15 Eugene 2012-11-26 18:12:51 UTC
>dmesg from fail kernel 3.5 boot
Sorry, from kernel 3.6.
Comment 16 Aaron Lu 2012-11-27 00:45:58 UTC
Hello Eugene,

The dmesg shows that the failing ATA channels (ata7 and ata8) are still driven by pata_acpi:

[    2.238918] pata_acpi 0000:03:00.1: setting latency timer to 64
[    2.239727] scsi6 : pata_acpi
[    2.239937] scsi7 : pata_acpi
[    2.240128] ata7: PATA max UDMA/133 cmd 0xdc00 ctl 0xd880 bmdma 0xd400 irq 17
[    2.240130] ata8: PATA max UDMA/133 cmd 0xd800 ctl 0xd480 bmdma 0xd408 irq 17

So the problem is still there.

You may need to try another way to blacklist it.
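
A rough sketch of how the check above could be automated: it scans dmesg text for "scsiN : driver" host registration lines like the ones quoted in this comment and reports which driver registered each host. The function name and regular expression are illustrative assumptions, not part of any kernel tool, and the pairing of scsi6/scsi7 with ata7/ata8 is only what this particular log shows.

```python
import re

# Matches host registration lines such as "[    2.239727] scsi6 : pata_acpi".
HOST_RE = re.compile(r"scsi(\d+)\s*:\s*(\S+)")


def drivers_by_scsi_host(dmesg_text):
    """Return {scsi_host_number: driver_name} parsed from dmesg output."""
    hosts = {}
    for line in dmesg_text.splitlines():
        match = HOST_RE.search(line)
        if match:
            hosts[int(match.group(1))] = match.group(2)
    return hosts


# Sample lines copied from the dmesg excerpt in this comment.
sample = """\
[    2.238918] pata_acpi 0000:03:00.1: setting latency timer to 64
[    2.239727] scsi6 : pata_acpi
[    2.239937] scsi7 : pata_acpi
[    2.240128] ata7: PATA max UDMA/133 cmd 0xdc00 ctl 0xd880 bmdma 0xd400 irq 17
"""

result = drivers_by_scsi_host(sample)
print(result)
```

If a blacklist had taken effect, the hosts for the JMicron PATA channels would presumably be reported with pata_jmicron instead of pata_acpi.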
Comment 17 Eugene 2012-11-27 14:37:18 UTC
>You may need to try another way to blacklist it.
Any suggestions how to do it ?
Comment 18 Eugene 2012-11-29 00:31:22 UTC
I don't know how to blacklist pata_acpi. I tried it as described in http://wiki.debian.org/KernelModuleBlacklisting but it is still loaded during boot. I also tried a LiveCD with kernel 3.6, and lsmod does not show pata_acpi at all.
Comment 19 Aaron Lu 2012-11-29 01:19:23 UTC
Hello Eugene,

I just installed an Ubuntu 12.10 VM and saw that PATA_ACPI is built into the kernel by default, so I'm afraid you have no way of blacklisting it.

If this is possible for you, you can try recompiling the kernel with the CONFIG_PATA_ACPI option unset to see the result.

I would also advise writing to the Ubuntu kernel team to suggest that they not build PATA_ACPI into the kernel by default.
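
A minimal sketch of how to tell whether a driver is built in ("=y") or modular ("=m") from a kernel config file. It assumes the distribution ships the config as /boot/config-<release>, which is a common Debian/Ubuntu convention but not guaranteed; the helper names and the sample config values are hypothetical and only illustrate the format.

```python
import platform
import re


def config_state(config_text, option):
    """Return 'y' (built in), 'm' (module), or None (unset) for a CONFIG_ option."""
    pattern = re.compile(r"^%s=(\S+)$" % re.escape(option), re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None


def running_kernel_config_path():
    """Assumed location of the running kernel's config on Debian/Ubuntu."""
    return "/boot/config-" + platform.release()


# Hypothetical sample, not copied from a real Ubuntu 12.10 config.
sample = """\
CONFIG_ATA_ACPI=y
CONFIG_PATA_ACPI=y
CONFIG_PATA_JMICRON=m
# CONFIG_PATA_LEGACY is not set
"""

for option in ("CONFIG_PATA_ACPI", "CONFIG_PATA_JMICRON", "CONFIG_PATA_LEGACY"):
    print(option, config_state(sample, option))
```

On a real system the text would come from reading the file at running_kernel_config_path(). A driver reported as "y" cannot be blacklisted through /etc/modprobe.d, because there is no module to skip loading.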
Comment 20 Alan 2012-11-29 10:13:15 UTC
They need to include it for some systems.

They need to ensure they either

- build in all the ATA drivers (in which case it is tried last)

or

- build the pata_acpi driver modular and load it last
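
To illustrate why the ordering matters, here is a toy model, not kernel code: each driver claims any still-unbound device it matches, in the order the drivers are registered. The match rules are simplified assumptions (the real drivers use PCI ID tables, and pata_acpi also depends on ACPI support for the device), and the device entry reuses the PCI address from the dmesg excerpt in comment 16.

```python
JMICRON_VENDOR = 0x197B  # PCI vendor ID for JMicron
IDE_CLASS = 0x0101       # PCI class code for an IDE controller

# Simplified match rules: the vendor driver is specific, pata_acpi is generic.
DRIVERS = {
    "pata_jmicron": lambda dev: dev["vendor"] == JMICRON_VENDOR
    and dev["class"] == IDE_CLASS,
    "pata_acpi": lambda dev: dev["class"] == IDE_CLASS,
}


def bind(devices, registration_order):
    """Bind each device to the first registered driver that matches it."""
    bound = {}
    for name in registration_order:
        matches = DRIVERS[name]
        for dev in devices:
            if dev["slot"] not in bound and matches(dev):
                bound[dev["slot"]] = name
    return bound


devices = [{"slot": "0000:03:00.1", "vendor": JMICRON_VENDOR, "class": IDE_CLASS}]

# pata_acpi registered last: the vendor driver gets the controller.
good = bind(devices, ["pata_jmicron", "pata_acpi"])
# pata_acpi registered first (e.g. built in while the rest are modules).
bad = bind(devices, ["pata_acpi", "pata_jmicron"])
print(good)
print(bad)
```

In this model the second ordering leaves the JMicron controller with pata_acpi, which matches the symptom reported here.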
Comment 21 Eugene 2012-11-29 19:59:08 UTC
> You can try re-compiling the kernel and make sure the CONFIG_PATA_ACPI option
> is unset to see the result, if this is possible for you.
Sorry, I'm not familiar with kernel compilation. We already know that since kernel 3.6 this controller is driven by a different kernel module, which makes the system unable to boot if the system disk is attached to this controller. Is it possible for you to fix the problem on the kernel side and make the kernel use pata_jmicron for JMicron controllers again?
Comment 22 Alan 2012-11-29 21:55:50 UTC
It doesn't look like a kernel change. It appears at this point to be a mistake by your distribution so it needs to go back to the distribution maintainers.
Comment 23 Aaron Lu 2012-11-30 00:40:17 UTC
I agree with Alan.

Eugene,
Please ask the distribution maintainers to make sure pata_acpi is the last driver tried for the controller.
Comment 24 RussianNeuroMancer 2012-11-30 00:41:00 UTC
https://bugs.launchpad.net/bugs/1084783
Comment 25 Aaron Lu 2012-11-30 00:55:37 UTC
Then please continue the discussion on the Launchpad bug page with the information you got from here.

And once it is confirmed that the problem can be solved, we can close the bug, thanks.
Comment 26 Aaron Lu 2013-01-30 01:48:20 UTC
Assigning it to myself to close it, since it is not a kernel bug.
Comment 27 Jan Sembera 2013-02-26 22:30:44 UTC
(In reply to comment #22)
> It doesn't look like a kernel change. It appears at this point to be a mistake
> by your distribution so it needs to go back to the distribution maintainers.

Actually, although this might be a distro error, it was triggered by a kernel change, specifically commit 30dcf76acc (I was tracing the same bug with a different driver and bisected to this).
Comment 28 RussianNeuroMancer 2013-02-26 22:32:14 UTC
There is some progress in the Ubuntu bug report: https://bugs.launchpad.net/bugs/1084783