Bug 43039 - Acer Aspire 5560G Laptop: link online but device misclassified: only with some kernel configs
Status: NEW
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: Serial ATA
Hardware: All Linux
Importance: P1 normal
Assignee: Jeff Garzik
Depends on:
Reported: 2012-04-04 01:48 UTC by Matthew Stapleton
Modified: 2016-03-19 17:04 UTC
CC List: 4 users

See Also:
Kernel Version: 3.0.0 - 3.3.1
Tree: Mainline
Regression: No

dmesg.txt (70.38 KB, text/plain)
2012-04-04 01:50 UTC, Matthew Stapleton
config-3.2.0-21-generic (136.96 KB, application/octet-stream)
2012-04-04 01:52 UTC, Matthew Stapleton
Server-kernel-3.0-gentoo_x86-64_smp.config (96.90 KB, application/octet-stream)
2012-04-04 01:53 UTC, Matthew Stapleton
HardenedServer-kernel-3.2-gentoo_x86-64_smp.config (102.77 KB, application/octet-stream)
2012-04-04 01:55 UTC, Matthew Stapleton
testkernel-3.0-rev3.config (131.05 KB, application/octet-stream)
2012-04-16 00:26 UTC, Matthew Stapleton
testkernel-3.2_basedonserver_x86_64_mainline250.config (100.49 KB, application/octet-stream)
2012-04-18 07:30 UTC, Matthew Stapleton
testkernel-3.2_ubuntu250.config (135.67 KB, application/octet-stream)
2012-04-18 07:33 UTC, Matthew Stapleton

Description Matthew Stapleton 2012-04-04 01:48:41 UTC
I am getting "ata1: link is slow to respond, please be patient (ready=0)" and then, after the softreset fails, "ata1.00: link online but device misclassified" on an Acer Aspire 5560G laptop with an SSD. This happens only with the standard Ubuntu kernel config, not with the custom kernel configs that I use on servers I have set up. Once this happens, the laptop has to be power cycled to get the SSD working again.

The SSD is an AR120GBE and the AHCI controller is 1022:7804, subsystem: 1025:059f

At first I thought the problem was with the Ubuntu kernel, but the timeout also occurs on mainline kernels 3.2.0 and 3.2.14 with no added patches when they are built with the Ubuntu kernel config.  I am using the following method to test the Ubuntu config:
make mrproper && make menuconfig
  then I load the config and save to .config
make && make modules
  then I copy arch/x86/boot/bzImage to a network boot option with a custom minimal initrd and boot the laptop from it.
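
The build steps above could be scripted roughly as follows. This is only a sketch, not the reporter's actual procedure: it substitutes `make oldconfig` for the interactive `make menuconfig` load-and-save step, and the function names `build_steps` and `execute` are hypothetical. Note that the config is copied after `make mrproper`, because `mrproper` removes any existing `.config`.

```python
import shutil
import subprocess
from pathlib import Path


def build_steps(config_path, tree="."):
    """Return the ordered steps for building a kernel tree with a given config.

    Assumption: `make oldconfig` stands in for loading the config through
    `make menuconfig`; it may still prompt for symbols the config lacks.
    """
    tree = Path(tree)
    return [
        ("run", ["make", "mrproper"]),                        # clean tree, removes .config
        ("copy", [str(config_path), str(tree / ".config")]),  # install the config under test
        ("run", ["make", "oldconfig"]),                       # bring the config up to date
        ("run", ["make"]),                                    # kernel image
        ("run", ["make", "modules"]),                         # loadable modules
    ]


def execute(steps, tree="."):
    """Carry out the steps in order, stopping at the first failing command."""
    for kind, args in steps:
        if kind == "copy":
            shutil.copy(args[0], args[1])
        else:
            subprocess.run(args, cwd=tree, check=True)
```

On x86 the resulting image is arch/x86/boot/bzImage inside the tree, which is the file that gets copied to the network boot option.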

The Ubuntu bug url is https://bugs.launchpad.net/ubuntu/+source/linux/+bug/965863
Comment 1 Matthew Stapleton 2012-04-04 01:50:27 UTC
Created attachment 72803

This is a dmesg log from one of the Ubuntu kernels with the timeout problem.
Comment 2 Matthew Stapleton 2012-04-04 01:52:11 UTC
Created attachment 72804

Standard Ubuntu desktop kernel config that causes timeouts.
Comment 3 Matthew Stapleton 2012-04-04 01:53:30 UTC
Created attachment 72805

My custom server kernel config for kernel series 3.0 that works without timeouts on the laptop.
Comment 4 Matthew Stapleton 2012-04-04 01:55:20 UTC
Created attachment 72806

My custom hardened server kernel config for kernel series 3.2 that works without timeouts on the laptop.  Even though this config is written for grsecurity, for this laptop timeout test I just loaded it on mainline 3.2.14 without any extra patches.
Comment 5 Tejun Heo 2012-04-04 16:58:48 UTC
Hmmm... this could have been caused by the recent engine start change. Can you please try 3.3.1?

Comment 6 Matthew Stapleton 2012-04-05 00:37:06 UTC
Same timeout with 3.3.1 and the Ubuntu config.  Works okay with the custom Hardened 3.2 config.  I have also just tried 3.0 mainline with the Ubuntu 3.2.0-21-generic config and got the timeout.
Comment 7 Matthew Stapleton 2012-04-16 00:26:17 UTC
Created attachment 72929

Tried another config which disables the EFI and EDD options, and I am still getting the timeout.
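
When a failing and a working config are both available, listing only the options that differ can shorten this kind of trial and error. The following is a minimal sketch, assuming the usual `.config` text format (`CONFIG_X=value` and `# CONFIG_X is not set`) and treating an option missing from one file as unset. The function names and the sample strings are hypothetical and are not taken from the attached configs.

```python
import re

_SET = re.compile(r"^(CONFIG_\w+)=(.*)$")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def parse_config(text):
    """Map each option name to its value; unset options map to 'n'."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        match = _SET.match(line)
        if match:
            options[match.group(1)] = match.group(2)
            continue
        match = _UNSET.match(line)
        if match:
            options[match.group(1)] = "n"
    return options


def diff_configs(bad_text, good_text):
    """Return {option: (value_in_bad, value_in_good)} for options that differ.

    Assumption: an option absent from a file is treated as 'n'.
    """
    bad, good = parse_config(bad_text), parse_config(good_text)
    diffs = {}
    for name in sorted(set(bad) | set(good)):
        b, g = bad.get(name, "n"), good.get(name, "n")
        if b != g:
            diffs[name] = (b, g)
    return diffs


# Made-up sample input, for illustration only.
failing = "CONFIG_EFI=y\nCONFIG_EDD=m\nCONFIG_SATA_AHCI=m\n"
working = "# CONFIG_EFI is not set\nCONFIG_SATA_AHCI=m\n"
for name, (b, g) in diff_configs(failing, working).items():
    print(f"{name}: failing={b} working={g}")
```

With real files, the two arguments would be the contents of the failing and the working config read from disk.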
Comment 8 Matthew Stapleton 2012-04-18 05:57:16 UTC
On kernel 3.0 x86_64, I think I have narrowed the problem down.  Changing CONFIG_HZ from 1000 to 250 causes the timeout if I load the kernel from a power-cycle boot.  Sometimes I don't get the timeout if the kernel was loaded from a soft reboot.  Is there anything immediately obvious that would explain why changing CONFIG_HZ causes AHCI ports to time out?
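
For background on what CONFIG_HZ changes: it sets the timer tick rate, so one tick (jiffy) lasts 1000/HZ milliseconds, and delays expressed in milliseconds are rounded up to whole ticks. The sketch below is a simplified model of that rounding with hypothetical helper names. It is not the kernel's own conversion code, and it does not claim to explain this bug, since the thread never establishes a cause.

```python
def ms_to_ticks(ms, hz):
    """Round a millisecond delay up to whole timer ticks at the given HZ."""
    return (ms * hz + 999) // 1000


def effective_ms(ms, hz):
    """Delay actually covered once the request is rounded up to whole ticks."""
    return ms_to_ticks(ms, hz) * 1000 / hz


# Tick length and the rounding of a 10 ms delay at three common HZ values.
for hz in (100, 250, 1000):
    print(f"HZ={hz}: tick={1000 / hz} ms, "
          f"10 ms -> {ms_to_ticks(10, hz)} ticks = {effective_ms(10, hz)} ms")
```

In this model a 10 ms request becomes 3 ticks (12 ms) at HZ=250 but exactly 10 ms at HZ=100 and HZ=1000, which shows how timing can shift with HZ without saying anything about whether that matters for this controller.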
Comment 9 Matthew Stapleton 2012-04-18 07:26:42 UTC
I just tried a few more x86_64 kernel builds with various configs.  Configs are attached below.
Comment 10 Matthew Stapleton 2012-04-18 07:30:59 UTC
Created attachment 72949

Tried this config with mainline 3.3.2 and got the timeout.  I also tested the same config with CONFIG_HZ changed to 1000 and to 100, and neither of those HZ settings timed out.
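
Producing variants of one config that differ only in HZ can be done by rewriting the relevant lines. The sketch below assumes the x86 Kconfig layout, where HZ is a choice among CONFIG_HZ_100, CONFIG_HZ_250, CONFIG_HZ_300 and CONFIG_HZ_1000 and CONFIG_HZ holds the resulting number. The function name `set_hz` is hypothetical, and this is not necessarily how the configs in this report were produced.

```python
HZ_CHOICES = (100, 250, 300, 1000)


def set_hz(config_text, hz):
    """Return config text with the HZ choice switched to `hz`.

    Assumption: x86-style option names (CONFIG_HZ_<n> plus CONFIG_HZ).
    """
    if hz not in HZ_CHOICES:
        raise ValueError(f"unsupported HZ value: {hz}")

    def is_hz_line(line):
        line = line.strip()
        if line.startswith("CONFIG_HZ="):
            return True
        return any(
            line.startswith(f"CONFIG_HZ_{c}=") or line == f"# CONFIG_HZ_{c} is not set"
            for c in HZ_CHOICES
        )

    # Drop every existing HZ line, then append a consistent set.
    kept = [line for line in config_text.splitlines() if not is_hz_line(line)]
    for c in HZ_CHOICES:
        kept.append(f"CONFIG_HZ_{c}=y" if c == hz else f"# CONFIG_HZ_{c} is not set")
    kept.append(f"CONFIG_HZ={hz}")
    return "\n".join(kept) + "\n"
```

After rewriting a config this way, running `make oldconfig` in the kernel tree lets Kconfig validate the result before the build.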
Comment 11 Matthew Stapleton 2012-04-18 07:33:53 UTC
Created attachment 72950

Tried this config with mainline 3.4-rc3 and got the timeout.  I also tested the same config with CONFIG_HZ changed to 1000, and that did not time out.
Comment 12 Matthew Stapleton 2012-04-19 02:40:24 UTC
I just tried kernel 2.6.27 with CONFIG_HZ=250 and got the timeout there as well.  Also, booting with nohz=off on the command line doesn't help.
Comment 13 Matthew Stapleton 2012-05-28 00:56:41 UTC
Is there any more info I can provide to help solve the problem?
Comment 14 Alan 2012-09-04 12:45:00 UTC
Not that I can think of - other than classifying your machine as having utterly outweirded us, there is not much obvious to progress this.
