Bug 43117 - ATA failures when operating on battery power with AMD SBx00 SATA controller due to hdparm
Status: CLOSED INVALID
Alias: None
Product: Power Management
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 normal
Assignee: power-management_other
URL: https://bugs.launchpad.net/ubuntu/+so...
Keywords:
Depends on:
Blocks:
 
Reported: 2012-04-17 20:22 UTC by David Barker
Modified: 2013-01-28 23:46 UTC
CC List: 3 users

See Also:
Kernel Version: 3.4 rc3
Subsystem:
Regression: No
Bisected commit-id:


Attachments
dmesg output showing errors (2.85 KB, text/plain)
2012-04-17 21:13 UTC, David Barker
lspci output of affected system (12.17 KB, text/plain)
2012-04-17 21:13 UTC, David Barker
output from cat /proc/scsi/scsi (496 bytes, text/plain)
2012-04-17 21:20 UTC, David Barker
output from cat /proc/modules (2.55 KB, text/plain)
2012-04-17 21:21 UTC, David Barker

Description David Barker 2012-04-17 20:22:42 UTC
When operating on battery power (not AC), multiple ATA failures occur.
The issue was not present with kernel 3.0 (stock Ubuntu 11.10 x86_64 build) but is observable with Ubuntu 12.04 beta 2 (kernel 3.2.0-23).

The issue also occurs in the Ubuntu 12.04 daily build (17/04/2012) with the latest Ubuntu mainline kernel (3.4 rc3).

The issue occurs both when booting under battery power and when booting under AC and then switching to battery.

Under AC power, no issues are observed.

The system functions correctly for up to a few minutes, then applications stop responding. The following appears in dmesg when the issue occurs, after which the system functions normally for another few minutes:

[ 238.848502] ata1.00: exception Emask 0x0 SAct 0xf SErr 0x0 action 0x6 frozen
[ 238.848526] ata1.00: failed command: WRITE FPDMA QUEUED
[ 238.848548] ata1.00: cmd 61/00:00:3f:e1:00/04:00:19:00:00/40 tag 0 ncq 524288 out
[ 238.848552] res 40/00:00:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
[ 238.848563] ata1.00: status: { DRDY }
[ 238.848571] ata1.00: failed command: WRITE FPDMA QUEUED
[ 238.848589] ata1.00: cmd 61/00:08:3f:e5:00/04:00:19:00:00/40 tag 1 ncq 524288 out
[ 238.848593] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 238.848602] ata1.00: status: { DRDY }
[ 238.848610] ata1.00: failed command: WRITE FPDMA QUEUED
[ 238.848628] ata1.00: cmd 61/00:10:3f:e9:00/04:00:19:00:00/40 tag 2 ncq 524288 out
[ 238.848632] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 238.848640] ata1.00: status: { DRDY }
[ 238.848648] ata1.00: failed command: WRITE FPDMA QUEUED
[ 238.848666] ata1.00: cmd 61/00:18:3f:ed:00/04:00:19:00:00/40 tag 3 ncq 524288 out
[ 238.848670] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 238.848678] ata1.00: status: { DRDY }
[ 238.848696] ata1: hard resetting link
[ 248.860425] ata1: softreset failed (device not ready)
[ 248.860450] ata1: hard resetting link
[ 258.872300] ata1: softreset failed (device not ready)
[ 258.872322] ata1: hard resetting link
[ 269.448629] ata1: link is slow to respond, please be patient (ready=0)
[ 270.232426] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 270.244974] ata1.00: configured for UDMA/133
[ 270.260214] ata1.00: device reported invalid CHS sector 0
[ 270.260239] ata1.00: device reported invalid CHS sector 0
[ 270.260256] ata1.00: device reported invalid CHS sector 0
[ 270.260272] ata1.00: device reported invalid CHS sector 0
[ 270.260297] ata1: EH complete
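
For anyone comparing logs across kernels, here is a minimal Python sketch that summarizes an excerpt like the one above. It counts failed commands, timeouts and hard link resets, and measures the time from the first failed command to "EH complete". The function name `summarize` and the `SAMPLE` input are illustrative only and not part of any tool. The matching relies on the message strings seen in this log and is not guaranteed to hold for other kernel versions.

```python
import re

# A few lines copied from the dmesg excerpt above, used as sample input.
SAMPLE = """\
[ 238.848526] ata1.00: failed command: WRITE FPDMA QUEUED
[ 238.848552] res 40/00:00:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
[ 238.848696] ata1: hard resetting link
[ 248.860450] ata1: hard resetting link
[ 258.872322] ata1: hard resetting link
[ 270.260297] ata1: EH complete
"""

# Matches "[ 238.848502] message"; the timestamp is seconds since boot.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def summarize(log_text):
    """Count ATA failures and resets, and measure how long recovery took."""
    summary = {"failed": 0, "timeouts": 0, "hard_resets": 0, "stall_seconds": None}
    first_failure = None
    eh_complete = None
    for raw in log_text.splitlines():
        match = LINE_RE.match(raw)
        if not match:
            continue  # skip lines without a dmesg timestamp
        timestamp, message = float(match.group(1)), match.group(2)
        if "failed command:" in message:
            summary["failed"] += 1
            if first_failure is None:
                first_failure = timestamp
        if "(timeout)" in message:
            summary["timeouts"] += 1
        if "hard resetting link" in message:
            summary["hard_resets"] += 1
        if "EH complete" in message:
            eh_complete = timestamp
    if first_failure is not None and eh_complete is not None:
        summary["stall_seconds"] = eh_complete - first_failure
    return summary


if __name__ == "__main__":
    print(summarize(SAMPLE))
```

On the sample lines this reports three hard resets and a stall of roughly 31 seconds, which matches the period during which applications stop responding.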

Suspecting a SATA ALPM issue (via pm-utils), I tried exiting early from /usr/lib/pm-utils/power.d/sata_alpm, but this did not help.

I have tried changing the HDD and the SATA cable to rule out a coincidental hardware failure, but the issue persists. Switching back to Ubuntu 11.10 resolves the issue.

My SATA controller is as follows:
00:11.0 SATA controller: Advanced Micro Devices [AMD] nee ATI SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (rev 40)

The machine in question is a Sony Vaio Y-Series VPCYB2M1E with AMD E-350.

(NOTE: This bug has also been filed on Ubuntu Launchpad: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/984308)
Comment 1 David Barker 2012-04-17 21:13:25 UTC
Created attachment 72945 [details]
dmesg output showing errors
Comment 2 David Barker 2012-04-17 21:13:47 UTC
Created attachment 72946 [details]
lspci output of affected system
Comment 3 David Barker 2012-04-17 21:20:28 UTC
Created attachment 72947 [details]
output from cat /proc/scsi/scsi

Given that the issue seems power/SATA/ALPM related, the output from cat /proc/scsi/scsi may be helpful.
Comment 4 David Barker 2012-04-17 21:21:55 UTC
Created attachment 72948 [details]
output from cat /proc/modules
Comment 5 David Barker 2012-05-06 18:51:04 UTC
I've narrowed this down: the failures occur when "hdparm -B 127" is called on switching to battery power (by /lib/hdparm/hdparm-functions).

Previously, Ubuntu set hdparm -B to 128 on battery.
Ubuntu recently changed this to 127 in hdparm-functions.

This may still be a kernel bug as -B 127 causes excessive ATA resets.
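
For context on why 127 and 128 can behave differently: per the hdparm(8) man page, -B values 1 through 127 permit spin-down, values 128 through 254 do not, and 255 disables Advanced Power Management altogether. So the change from 128 to 127 crosses that boundary. Whether this is what triggers the resets on this drive is not confirmed in this report. A minimal Python sketch of the documented ranges follows. The function name `describe_apm_level` is hypothetical, and how a particular drive's firmware acts on each level may differ.

```python
def describe_apm_level(level):
    """Describe an hdparm -B (APM) level using the ranges in hdparm(8)."""
    if not isinstance(level, int) or not 1 <= level <= 255:
        raise ValueError("APM level must be an integer from 1 to 255")
    if level == 255:
        return "APM disabled"
    if level <= 127:
        # Lower values mean more aggressive power saving.
        return "spin-down permitted"
    # 128 through 254: power management without spin-down.
    return "spin-down not permitted"


if __name__ == "__main__":
    for level in (127, 128, 255):
        print(level, "->", describe_apm_level(level))
```

The old Ubuntu setting (128) and the new one (127) fall on opposite sides of the spin-down boundary, which is consistent with the failures appearing only after the change.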
Comment 6 Alan 2012-06-18 13:55:01 UTC
Could be a drive firmware problem. We just pass the command to the drive so it's not a kernel thing really.

Check if there is a firmware update for your drive.
Comment 7 Len Brown 2012-10-30 01:47:01 UTC
Unclear that this is a kernel power management bug,
since the kernel isn't making this policy choice.

Probably should be filed with the maintainer of
/lib/hdparm/hdparm-functions
