Bug 14543

Summary: AHCI SATA: hard link reset on controller power management change
Product: IO/Storage Reporter: Andrey Vihrov (andrey.vihrov)
Component: Serial ATAAssignee: Jeff Garzik (jgarzik)
Status: CLOSED OBSOLETE    
Severity: normal CC: alan, brian, shopping, tj
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.31.5 Subsystem:
Regression: No Bisected commit-id:
Attachments: dmesg
Additional information
dmesg with ext4 failure
retry-flush.patch
dmesg on patched kernel
A simple shell script to prevent changes to the hard drive power management
lpm-reset.patch

Description Andrey Vihrov 2009-11-04 19:35:47 UTC
Created attachment 23651 [details]
dmesg

If one switches hard disk controller power management states on Intel ICH8M, more precisely, sets the "max_performance" state, the following appears in dmesg:

ata1.00: exception Emask 0x10 SAct 0x1 SErr 0x50000 action 0xe frozen
ata1.00: irq_stat 0x00400008, PHY RDY changed
ata1: SError: { PHYRdyChg CommWake }
ata1.00: cmd 60/08:00:5a:81:2c/00:00:00:00:00/40 tag 0 ncq 4096 in
         res 40/00:04:5a:81:2c/00:00:00:00:00/40 Emask 0x10 (ATA bus error)
ata1.00: status: { DRDY }
ata1: hard resetting link
ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata1.00: configured for UDMA/133
ata1: EH complete

Example command to reproduce it:
for FILE in /sys/class/scsi_host/*/link_power_management_policy; do echo "min_power" > ${FILE} && echo "max_performance" > ${FILE}; done

This was first encountered on kernel 2.6.30, x86-64 architecture, but 2.6.31.5 behaves the same.
Comment 1 Andrey Vihrov 2009-11-04 19:36:20 UTC
Created attachment 23652 [details]
Additional information
Comment 2 Tejun Heo 2009-11-05 04:56:24 UTC
And the drive works fine afterwards?
Comment 3 Andrey Vihrov 2009-11-05 08:41:02 UTC
Everything works fine afterwards, I can run the example command any number of times with no bad effects (from the drive's point of view). I am bothered by this because if ext4 is doing I/O at this point, it will abort journal and remount the filesystem read-only.

It seems someone else was also affected by the problem: http://lkml.org/lkml/2008/8/30/92
Comment 4 Tejun Heo 2009-11-05 08:46:29 UTC
Oh... that isn't supposed to happen.  Such transient failures can happen for any number of reasons and it shouldn't trigger filesystem remount or drives being kicked out without any retry (I think md error handling improved on this regard).  You said that the problem was first encountered on 2.6.30, so kernels before 2.6.30 didn't show such problem?

Thanks.
Comment 5 Andrey Vihrov 2009-11-05 09:03:51 UTC
As the lkml posting suggests, I may have been having these errors since a while earlier, but they may have gone unnoticed. Starting from 2.6.30, I migrated my ext3 partitions to ext4. Hence, it may be that ext3 was more ignorant to link resets, while ext4 demonstrates the current behaviour. Unfortunately, I can't mount the partitions as ext3 anymore to test it. Maybe I can create a new ext3 partition later to compare ext3 and ext4 behaviour.
Comment 6 Tejun Heo 2009-11-05 09:09:05 UTC
Hmmm.... Is the ext4 ro remount triggable on 2.6.31.5?  There was a bug where requests of different type got merged and log transactions accidentally got FAILFAST flag set.  The problem you're reporting sounds awfully like that.
Comment 7 Andrey Vihrov 2009-11-05 09:43:18 UTC
Created attachment 23658 [details]
dmesg with ext4 failure

Yes. I was able to trigger it by running two scripts in parallel:

while true; do
    echo > test
    && sync
    && rm test
    && sync
    || break ;
done

and

while true; do
    for FILE in /sys/class/scsi_host/*/link_power_management_policy; do echo
"min_power" > ${FILE} && echo "max_performance" > ${FILE}; done
    && sleep 1 ;
done

I estimate 1 second is enough for the SATA connection to be restored and pending data written in subsequent retries.
Comment 8 Tejun Heo 2009-11-05 09:48:08 UTC
That's scary.  I'll try to reproduce it tomorrow.  Thanks.
Comment 9 Tejun Heo 2009-11-09 16:08:07 UTC
Sorry, I can't reproduce it here.  My drives seem to be happy with changing power mode.  :-(

Hmmm... looking again at the log, it looks like what's offending is the FLUSH_EXT failure.  Can you please repeat the test and see whether the last failing command is always 0xEA when the filesystem craps out?

Thanks.
Comment 10 Andrey Vihrov 2009-11-13 15:54:15 UTC
I ran the test five more times and each time the cmd preceding ext4 error was "ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0", as you pointed out.

Thanks
Comment 11 Tejun Heo 2009-11-15 08:28:48 UTC
Thanks a lot for verifying.  Looks like libata EH needs to learn about different failure modes of FLUSH_EXT.  I'll look into it.
Comment 12 Tejun Heo 2009-11-18 16:45:20 UTC
Created attachment 23825 [details]
retry-flush.patch

Sorry about the delay.  Can you please try the attached patch and see whether it makes the filesystem survive such events?

Thanks.
Comment 13 Andrey Vihrov 2009-11-18 18:50:06 UTC
I applied the patch using 2.6.31.6 as the base, and let the test run for whole 5 minutes. No ext4 errors occured, and on each relevant SATA error dmesg just said "ata1.00: retrying FLUSH 0xea Emask 0x10". Thus it appears that these changes fix the bug.

Thanks for looking into the problem.
Comment 14 Tejun Heo 2009-11-18 18:57:34 UTC
Can you please attach full kernel log?
Comment 15 Andrey Vihrov 2009-11-18 19:25:26 UTC
Created attachment 23827 [details]
dmesg on patched kernel
Comment 16 Tarek Loubani 2010-03-01 08:03:02 UTC
I have the same problem, reproducible by executing the two scripts as above in comment #7.

I am using Ext3 on Ubuntu's Lucid Lynx (2.6.32). Previously, Jaunty (2.6.30) and Karmic (2.6.31) had the same problem. With all three versions, I have had maddening hard system lockups 2-3 times daily that appear to be tied into these emask errors done at some unfortunate time after plugging / unplugging the cable.

I will try the patch and update you. I gather that the patch didn't make it into .32?

tarek : )
Comment 17 Tarek Loubani 2010-03-01 11:19:07 UTC
By way of update, I compiled the patch into an ubuntu kernel using the instructions here:

https://help.ubuntu.com/community/Kernel/Compile

The error message remains the same, as above.

How can I help debug this further?

tarek : )
Comment 18 Tejun Heo 2010-03-02 03:06:49 UTC
Do you still experience ro remount with the patch applied?  Can you please post full dmesg w/ timestamp turned on?
Comment 19 Tarek Loubani 2010-03-04 17:15:09 UTC
The patch made no difference to the behaviour seen. I will try to get some proper logs posted, but in the meantime, this is the script I used to fix the situation. It has to be run as a super user.

tarek : )
Comment 20 Tarek Loubani 2010-03-04 17:39:37 UTC
Created attachment 25357 [details]
A simple shell script to prevent changes to the hard drive power management

This hack prevents the ill effects of this bug - in my case, frequent freezes.
Comment 21 Tejun Heo 2010-03-09 08:01:28 UTC
If you're not seeing flush failures propagating as a fs error leading to ro-remount, I'm not quite sure what the driver should do.  Reducing command timeout interval could make the situation a bit easier.  Do you what is meddling with the link_power_management_policy?

Thanks.
Comment 22 Tarek Loubani 2010-03-09 13:00:31 UTC
Sure I know! It's probably ACPI. Ultimately, the problem needs a 1-2 punch, which is why it was so incredibly hard to diagnose for me (Over a year). The power management has to be changing as a particular kind of HD operation is happening. This is why it's not a 100% guaranteed replicability. Instead, I observed that the system would hang 3-4 times per day, and that often that was happening when I abandoned the system and returned later (rather than continuously using it actively), or entering / exiting suspend. Or when I plug / unplug. As I finally honed in on the problem, it became obvious that it had to do with ACPI or power management in some way, and ultimately I was ready to completely disable the ACPI system to try to gain stability. Luckily, that's when I ran into this thread, and ultimately made the script above.

If there's anything I can do to help diagnose the problem, let me know.

tarek : )
Comment 23 Tejun Heo 2010-03-11 03:08:51 UTC
Created attachment 25470 [details]
lpm-reset.patch

Can you please test whether this patch makes any difference?  Also, are you seeing timeouts or just command aborts due to PHY events?  In any case, can you please post full dmesg w/ timestamp turned on w/ and w/o the patch applied?

Thanks.