Bug 74961 - libata : SError: { HostInt PHYRdyChg 10B8B DevExch } hard resetting link
Status: REOPENED
Alias: None
Product: IO/Storage
Classification: Unclassified
Component: Serial ATA
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: Tejun Heo
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-04-28 07:51 UTC by chenhao
Modified: 2017-12-29 14:36 UTC
CC List: 6 users

See Also:
Kernel Version: 3.13
Subsystem:
Regression: No
Bisected commit-id:


Attachments
the dmesg information of ubuntu 14.04 TLS (85.42 KB, application/octet-stream)
2014-04-28 07:51 UTC, chenhao
new error:soft resetting link (424.41 KB, application/octet-stream)
2014-05-25 01:21 UTC, chenhao

Description chenhao 2014-04-28 07:51:25 UTC
Created attachment 134001 [details]
the dmesg information of ubuntu 14.04 TLS

I get this error on my ASUS K450JF (OS: Ubuntu 14.04; disk: HGST HTS541010A9E680, 1 TB, GPT; CPU: i7-4700HQ).
It causes the OS to hang. An ASUS maintenance engineer said "your disk is OK and has no bad sectors", so this may be a kernel bug. What is wrong with my laptop?

More information is in the attachment.

Error information (from dmesg):

[ 190.709673] ata6: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen
[ 190.711097] ata6: irq_stat 0x00400040, connection status changed
[ 190.712264] ata6: SError: { HostInt PHYRdyChg 10B8B DevExch }
[ 190.713302] ata6: hard resetting link
[ 190.713320] ata3: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen
[ 190.713900] ata3: irq_stat 0x00400040, connection status changed
[ 190.714523] ata3: SError: { HostInt PHYRdyChg 10B8B DevExch }
[ 190.715181] ata3: hard resetting link
[ 191.437594] ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 191.438545] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 191.439004] ata6.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
[ 191.439008] ata6.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
[ 191.439010] ata6.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
[ 191.441546] ata6.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
[ 191.441549] ata6.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
[ 191.441550] ata6.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
[ 191.442668] ata6.00: configured for UDMA/133
[ 191.442672] ata6: EH complete
[ 191.445565] ata3.00: configured for UDMA/133
[ 191.448998] ata3: EH complete
Comment 1 chenhao 2014-04-28 07:57:16 UTC
I get the same error when using IDE (BIOS compatibility mode).
I also get the same error on Fedora 20 (amd64).
Comment 2 Alan 2014-05-19 12:28:54 UTC
[ 190.712264] ata6: SError: { HostInt PHYRdyChg 10B8B DevExch }

Your disk changed. That's usually either a cable problem or a power problem (the drive not getting enough power and so dropping off the bus). The drive then comes back fine, which makes me suspect a power issue.

Either way the Linux kernel is simply doing the right thing. It sees the hardware report a disconnect/reconnect and it probes the new device. If your box has all its file systems on the "old" device then it'll die.
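
The SErr value in the exception line (0x4090800) can be decoded by hand into the flag names shown in the SError line. The following is a minimal sketch, assuming the SATA SError bit positions and the short names libata prints in its log; the function name decode_serror is hypothetical and used only for illustration.

```python
# Bit positions of the SATA SError register, keyed to the short names
# that libata prints in its "SError: { ... }" log line.
SERROR_BITS = {
    0: "RecovData",
    1: "RecovComm",
    8: "UnrecovData",
    9: "Persist",
    10: "Proto",
    11: "HostInt",
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serror(value):
    """Return the names of all SError bits set in value, lowest bit first."""
    return [name for bit, name in sorted(SERROR_BITS.items())
            if value & (1 << bit)]


if __name__ == "__main__":
    # SErr value taken from the dmesg output in this report.
    print(decode_serror(0x4090800))
    # → ['HostInt', 'PHYRdyChg', '10B8B', 'DevExch']
```

PHYRdyChg and DevExch are the bits that correspond to the link dropping and a device reappearing, which is the disconnect/reconnect described above.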
Comment 3 chenhao 2014-05-21 02:11:52 UTC
Is there a way to fix this?
I am already using the official ASUS notebook power adapter.
Comment 4 Alan 2014-05-22 13:48:29 UTC
You would need to find out why the disk is disconnecting, or the cable losing link. 

You may want to ask for help on the Ubuntu forums about disabling
- laptop mode
- link power saving (ALPM)

and see if either of those help stop the drive misbehaving.

Bugzilla is not a support forum however, so you really need to work with your distro.
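
As a starting point for the link power saving check, the current ALPM policy can be read from sysfs. This is a minimal sketch, assuming the kernel exposes the attribute at /sys/class/scsi_host/host*/link_power_management_policy and accepts the value "max_performance"; the function names are hypothetical, and writing the attribute requires root.

```python
import glob
import os

ATTR = "link_power_management_policy"


def read_alpm_policies(sysfs_root="/sys/class/scsi_host"):
    """Return {host_name: policy} for every host exposing the ALPM attribute."""
    policies = {}
    for path in sorted(glob.glob(os.path.join(sysfs_root, "host*", ATTR))):
        host = os.path.basename(os.path.dirname(path))
        with open(path) as f:
            policies[host] = f.read().strip()
    return policies


def set_max_performance(sysfs_root="/sys/class/scsi_host"):
    """Disable link power saving by writing max_performance (needs root)."""
    for path in glob.glob(os.path.join(sysfs_root, "host*", ATTR)):
        with open(path, "w") as f:
            f.write("max_performance")


if __name__ == "__main__":
    # Prints an empty dict on systems without the attribute.
    print(read_alpm_policies())
```

A setting written this way does not survive a reboot, so it is only suitable for testing whether the resets stop.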
Comment 5 chenhao 2014-05-25 01:21:03 UTC
Created attachment 137321 [details]
new error:soft resetting link
Comment 6 chenhao 2014-05-25 01:35:38 UTC
Yesterday I ran into another error:

May 17 11:20:49 asus-X450JF kernel: [   50.264331] ata1: drained 65536 bytes to clear DRQ
May 17 11:20:49 asus-X450JF kernel: [   50.264348] ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
May 17 11:20:49 asus-X450JF kernel: [   50.264350] ata1.01: ST_FIRST: DRQ=1 with device error, dev_stat 0x7F
May 17 11:20:49 asus-X450JF kernel: [   50.264354] sr 0:0:1:0: CDB: 
May 17 11:20:49 asus-X450JF kernel: [   50.264355] Get event status notification: 4a 01 00 00 10 00 00 00 08 00
May 17 11:20:49 asus-X450JF kernel: [   50.264368] ata1.01: cmd a0/00:00:00:08:00/00:00:00:00:00/b0 tag 0 pio 16392 in
May 17 11:20:49 asus-X450JF kernel: [   50.264368]          res 7f/ff:ff:ff:ff:ff/00:00:00:00:00/ff Emask 0x2 (HSM violation)
May 17 11:20:49 asus-X450JF kernel: [   50.264370] ata1.01: status: { DRDY DF DRQ ERR }
May 17 11:20:49 asus-X450JF kernel: [   50.264378] ata1: soft resetting link
May 17 11:20:50 asus-X450JF kernel: [   50.550219] ata1.01: configured for UDMA/133
May 17 11:20:50 asus-X450JF kernel: [   50.552101] ata1: EH complete
May 17 11:20:54 asus-X450JF kernel: [   54.837445] ata4.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
May 17 11:20:54 asus-X450JF kernel: [   54.837449] ata4.00: failed command: READ DMA EXT
May 17 11:20:54 asus-X450JF kernel: [   54.837452] ata4.00: cmd 25/00:90:88:63:a0/00:00:33:00:00/e0 tag 0 dma 73728 in
May 17 11:20:54 asus-X450JF kernel: [   54.837452]          res 40/00:01:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout)
May 17 11:20:54 asus-X450JF kernel: [   54.837453] ata4.00: status: { DRDY }
May 17 11:20:54 asus-X450JF kernel: [   54.874228] ata1: drained 65536 bytes to clear DRQ
May 17 11:20:54 asus-X450JF kernel: [   54.874250] ata1.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
May 17 11:20:54 asus-X450JF kernel: [   54.874251] ata1.01: ST_FIRST: DRQ=1 with device error, dev_stat 0x7F
May 17 11:20:54 asus-X450JF kernel: [   54.874254] sr 0:0:1:0: CDB: 
May 17 11:20:54 asus-X450JF kernel: [   54.874257] Get event status notification: 4a 01 00 00 10 00 00 00 08 00
May 17 11:20:54 asus-X450JF kernel: [   54.874263] ata1.01: cmd a0/00:00:00:08:00/00:00:00:00:00/b0 tag 0 pio 16392 in
May 17 11:20:54 asus-X450JF kernel: [   54.874263]          res 7f/ff:ff:ff:ff:ff/00:00:00:00:00/ff Emask 0x2 (HSM violation)
May 17 11:20:54 asus-X450JF kernel: [   54.874264] ata1.01: status: { DRDY DF DRQ ERR }
May 17 11:20:54 asus-X450JF kernel: [   54.874271] ata1: soft resetting link
May 17 11:20:54 asus-X450JF kernel: [   55.161917] ata1.01: configured for UDMA/133
May 17 11:20:54 asus-X450JF kernel: [   55.163719] ata1: EH complete
May 17 11:20:59 asus-X450JF kernel: [   59.849536] ata4: link is slow to respond, please be patient (ready=0)
May 17 11:21:04 asus-X450JF kernel: [   64.893643] ata4: device not ready (errno=-16), forcing hardreset
May 17 11:21:04 asus-X450JF kernel: [   64.893651] ata4: soft resetting link
May 17 11:21:04 asus-X450JF kernel: [   65.198303] ata4.00: configured for UDMA/133
May 17 11:21:04 asus-X450JF kernel: [   65.198309] ata4.00: device reported invalid CHS sector 0
May 17 11:21:04 asus-X450JF kernel: [   65.198316] ata4: EH complete

More information is in the attachment ("new error:soft resetting link").

Is this the same problem?
Why did the message change from "hard resetting link" to "soft resetting link"?
And how can I reproduce this issue? It appears to occur at random.
Comment 7 Tejun Heo 2014-06-03 18:26:20 UTC
It looks like you turned off AHCI mode, so the controller is now operating in the legacy ata_piix mode, which can't perform hard resets. It is the same underlying problem: the connection between the disk controller and the disk is not reliable and drops off from time to time. The driver is doing what it can to recover from the issues reported by the hardware. The cause could be as simple as the drive not being seated well enough, in which case opening the laptop and reseating the drive could resolve it, or it could be something far more complex electronically. Booting with "libata.force=1.5Gbps" might work around the issue too.
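
Whether a speed limit such as "libata.force=1.5Gbps" took effect can be read from the SStatus and SControl values in the "SATA link up" lines. This is a minimal sketch, assuming libata prints both registers in hexadecimal and that they follow the SATA layout (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8); the function names are hypothetical.

```python
# SPD field encoding shared by SStatus (negotiated) and SControl (limit).
SPEEDS = {1: "1.5 Gbps", 2: "3.0 Gbps", 3: "6.0 Gbps"}


def decode_sstatus(value):
    """Split an SStatus value into its DET, SPD and IPM fields."""
    return {
        "det": value & 0xF,                                   # device detection
        "speed": SPEEDS.get((value >> 4) & 0xF, "unknown"),   # negotiated speed
        "ipm": (value >> 8) & 0xF,                            # power state
    }


def scontrol_speed_limit(value):
    """Return the highest speed SControl allows, or 'no limit'."""
    spd = (value >> 4) & 0xF
    return "no limit" if spd == 0 else SPEEDS.get(spd, "unknown")


if __name__ == "__main__":
    # Values as they appear in the logs of this report.
    for text in ("133", "113", "123"):
        print("SStatus", text, decode_sstatus(int(text, 16)))
    for text in ("300", "310", "320"):
        print("SControl", text, scontrol_speed_limit(int(text, 16)))
```

Under these assumptions, "SControl 300" means no speed limit is configured, while "SControl 310" means the link is limited to 1.5 Gbps.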
Comment 8 chenhao 2014-07-10 02:58:49 UTC
Today I tried it (booting with "libata.force=1.5Gbps" and with AHCI mode turned on), but got the same error.

Jul 10 10:48:27 chenhao-lp kernel: ata6: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen
Jul 10 10:48:27 chenhao-lp kernel: ata6: irq_stat 0x00400040, connection status changed
Jul 10 10:48:27 chenhao-lp kernel: ata6: SError: { HostInt PHYRdyChg 10B8B DevExch }
Jul 10 10:48:27 chenhao-lp kernel: ata6: hard resetting link
Jul 10 10:48:27 chenhao-lp kernel: ata3: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen
Jul 10 10:48:27 chenhao-lp kernel: ata3: irq_stat 0x00400040, connection status changed
Jul 10 10:48:27 chenhao-lp kernel: ata3: SError: { HostInt PHYRdyChg 10B8B DevExch }
Jul 10 10:48:27 chenhao-lp kernel: ata3: hard resetting link
Jul 10 10:48:27 chenhao-lp kernel: ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jul 10 10:48:27 chenhao-lp kernel: ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jul 10 10:48:27 chenhao-lp kernel: ata6.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
Jul 10 10:48:27 chenhao-lp kernel: ata6.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
Jul 10 10:48:27 chenhao-lp kernel: ata6.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
Jul 10 10:48:27 chenhao-lp kernel: ata6.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
Jul 10 10:48:27 chenhao-lp kernel: ata6.00: configured for UDMA/33
Jul 10 10:48:27 chenhao-lp kernel: ata6: EH complete
Jul 10 10:48:27 chenhao-lp kernel: ata3.00: configured for PIO0
Jul 10 10:48:27 chenhao-lp kernel: ata3: EH complete
Comment 9 Alan 2014-07-10 13:25:36 UTC
Same as your original report - your disk decided to go away and come back. Nothing there that looks like a Linux bug.

See Comment #4. Nothing here has changed, you need to work through your distribution.
Comment 10 chenhao 2014-07-11 03:56:02 UTC
Sorry, my description (Comment 8) was not detailed enough. Yesterday I hit it on CentOS 7.0.1406.

I have now seen the error on CentOS 7 (kernel 3.10), Ubuntu 14.04 (kernel 3.13), and Fedora 20. If the kernel is OK, my laptop may have a problem.

Thank you very much.
Comment 11 chenhao 2015-01-21 06:55:03 UTC
I now have more information from ASUS. They said "It is caused by the Intel power-saving mode" and that they have no idea how to solve the problem.

How do I turn the Intel power-saving mode off? Or are there any other ways to help solve this problem?
Comment 12 Ivan S. Zapreev 2015-03-30 23:11:08 UTC
I have the same issue; it cropped up a week or two ago, after one of the kernel updates (Ubuntu 14.10). The computer freezes, and sometimes the SSD just drops off completely and I need to reboot:

[  518.889172] ata1.00: supports DRM functions and may not be fully accessible
[  518.889202] ata1.00: failed to get NCQ Send/Recv Log Emask 0x1
[  518.889204] ata1.00: configured for UDMA/33
[  518.889221] ata1: EH complete
[  521.271992] ata1: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen
[  521.271995] ata1: irq_stat 0x00400040, connection status changed
[  521.271997] ata1: SError: { HostInt PHYRdyChg 10B8B DevExch }
[  521.272000] ata1: hard resetting link
[  521.272007] ata3: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen
[  521.272009] ata3: irq_stat 0x00400040, connection status changed
[  521.272010] ata3: SError: { HostInt PHYRdyChg 10B8B DevExch }
[  521.272012] ata3: hard resetting link
[  521.996402] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[  521.997005] ata3.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
[  521.997007] ata3.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
[  521.997008] ata3.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
[  521.998019] ata3.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded
[  521.998021] ata3.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out
[  521.998022] ata3.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out
[  521.998436] ata3.00: configured for UDMA/33
[  521.998439] ata3: EH complete
[  522.000385] ata1: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[  522.002339] ata1.00: supports DRM functions and may not be fully accessible
[  522.002437] ata1.00: failed to get NCQ Send/Recv Log Emask 0x1
[  522.002596] ata1.00: supports DRM functions and may not be fully accessible
[  522.002625] ata1.00: failed to get NCQ Send/Recv Log Emask 0x1
[  522.002627] ata1.00: configured for UDMA/33
[  522.002644] ata1: EH complete
[  535.632701] ata3: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen
[  535.632705] ata3: irq_stat 0x00400040, connection status changed
[  535.632707] ata3: SError: { HostInt PHYRdyChg 10B8B DevExch }
[  535.632710] ata3: hard resetting link
[  535.632717] ata1: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen
[  535.632719] ata1: irq_stat 0x00400040, connection status changed
[  535.632720] ata1: SError: { HostInt PHYRdyChg 10B8B DevExch }
[  535.632721] ata1: hard resetting link
Comment 13 Eugene 2015-05-20 10:20:21 UTC
Also found this in my log today:
May 20 00:43:12 p5q3 kernel: [ 1608.649084] ata3: SError: { HostInt PHYRdyChg 10B8B DevExch }

Linux 4.1RC4 x86_64
Kubuntu 15.04
Comment 14 Francisco Cribari 2017-12-29 14:36:05 UTC
Is this bug report still open? I run Arch Linux on a Samsung notebook and I am facing the same problem. I see in dmesg: 

[ 9334.847059] ata2: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen
[ 9334.847072] ata2: irq_stat 0x00400040, connection status changed
[ 9334.847080] ata2: SError: { HostInt PHYRdyChg 10B8B DevExch }
[ 9334.847093] ata2: hard resetting link
[ 9335.555165] ata2: SATA link down (SStatus 0 SControl 300)
[ 9340.748508] ata2: hard resetting link
[ 9341.062029] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 9341.065483] ata2.00: configured for UDMA/133
[ 9341.065493] ata2: EH complete
[ 9761.430721] ata2: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen
[ 9761.430734] ata2: irq_stat 0x00400040, connection status changed
[ 9761.430742] ata2: SError: { HostInt PHYRdyChg 10B8B DevExch }
[ 9761.430754] ata2: hard resetting link
[ 9762.140437] ata2: SATA link down (SStatus 0 SControl 300)
[ 9767.200461] ata2: hard resetting link
[ 9767.514650] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 9767.518299] ata2.00: configured for UDMA/133
[ 9767.518309] ata2: EH complete
[ 9815.316294] ata2: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen
[ 9815.316298] ata2: irq_stat 0x00400040, connection status changed
[ 9815.316300] ata2: SError: { HostInt PHYRdyChg 10B8B DevExch }
[ 9815.316303] ata2: hard resetting link
[ 9816.056922] ata2: SATA link down (SStatus 0 SControl 300)
[ 9821.173474] ata2: hard resetting link
[ 9821.487011] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 9821.490652] ata2.00: configured for UDMA/133
[ 9821.490673] ata2: EH complete
[ 9928.869644] ata2: limiting SATA link speed to 3.0 Gbps
[ 9928.869653] ata2: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen
[ 9928.869664] ata2: irq_stat 0x00400040, connection status changed
[ 9928.869672] ata2: SError: { HostInt PHYRdyChg 10B8B DevExch }
[ 9928.869684] ata2: hard resetting link
[ 9929.609750] ata2: SATA link down (SStatus 0 SControl 320)
[ 9934.666317] ata2: hard resetting link
[ 9934.979789] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[ 9934.983244] ata2.00: configured for UDMA/133
[ 9934.983253] ata2: EH complete
[10345.644744] ata2: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen
[10345.644757] ata2: irq_stat 0x00400040, connection status changed
[10345.644765] ata2: SError: { HostInt PHYRdyChg 10B8B DevExch }
[10345.644777] ata2: hard resetting link
[10346.384145] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[10346.387705] ata2.00: configured for UDMA/133
[10346.387718] ata2: EH complete
[10455.149729] ata2: exception Emask 0x50 SAct 0x0 SErr 0x4090800 action 0xe frozen
[10455.149743] ata2: irq_stat 0x00400040, connection status changed
[10455.149751] ata2: SError: { HostInt PHYRdyChg 10B8B DevExch }
[10455.149764] ata2: hard resetting link
[10455.887955] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 320)
[10455.891689] ata2.00: configured for UDMA/133
[10455.891699] ata2: EH complete
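
To see how often each port resets, the "resetting link" lines in a log like the one above can be tallied. This is a minimal sketch, assuming the dmesg or syslog text is available as a string; the function name count_resets and the short sample are for illustration only.

```python
import re
from collections import Counter

# Matches e.g. "ata2: hard resetting link" or "ata1: soft resetting link".
RESET_RE = re.compile(r"\b(ata\d+): (hard|soft) resetting link")


def count_resets(log_text):
    """Count link resets per (port, reset kind) in kernel log text."""
    counts = Counter()
    for match in RESET_RE.finditer(log_text):
        counts[(match.group(1), match.group(2))] += 1
    return counts


# A few lines copied from the log in this comment.
SAMPLE = """\
[ 9334.847093] ata2: hard resetting link
[ 9335.555165] ata2: SATA link down (SStatus 0 SControl 300)
[ 9340.748508] ata2: hard resetting link
[ 9341.062029] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
"""

if __name__ == "__main__":
    for (port, kind), n in sorted(count_resets(SAMPLE).items()):
        print(port, kind, n)
```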

My hardware: 

System:    Host: darwin5 Kernel: 4.14.8-1-zen x86_64 bits: 64 gcc: 7.2.1
           Desktop: KDE Plasma 5.11.4 (Qt 5.10.0) Distro: Arch Linux
Machine:   Device: laptop System: SAMSUNG product: 900X3L v: P05AFN serial: N/A
           Mobo: SAMSUNG model: NP900X3L-KW1BR v: SGL8776A06-C01-G001-S0001+10.0.10586 serial: N/A
           UEFI: American Megatrends v: P05AFN.035.160331.PS date: 03/31/2016
Battery    BAT1: charge: 21.0 Wh 71.0% condition: 29.6/30.0 Wh (99%)
           model: SAMSUNG SR Real status: Discharging
CPU:       Dual core Intel Core i7-6500U (-MT-MCP-) arch: Skylake rev.3 cache: 4096 KB
           flags: (lm nx sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx) bmips: 10372
           clock speeds: max: 3100 MHz 1: 2600 MHz 2: 2600 MHz 3: 2600 MHz 4: 2600 MHz
Graphics:  Card: Intel HD Graphics 520 bus-ID: 00:02.0
           Display Server: N/A driver: intel tty size: 98x27
Audio:     Card Intel Sunrise Point-LP HD Audio driver: snd_hda_intel bus-ID: 00:1f.3
           Sound: Advanced Linux Sound Architecture v: k4.14.8-1-zen
Network:   Card-1: Intel Wireless 8260 driver: iwlwifi bus-ID: 01:00.0
           IF: wlp1s0 state: up mac: <filter>
           Card-2: Realtek RTL8111/8168/8411 PCIE Gigabit Ethernet Controller
           driver: r8169 v: 2.3LK-NAPI port: e000 bus-ID: 02:00.0
           IF: enp2s0 state: down mac: <filter>
Drives:    HDD Total Size: 256.1GB (15.4% used)
           ID-1: /dev/sda model: LITEON_CV1 size: 256.1GB
Partition: ID-1: / size: 226G used: 29G (14%) fs: ext4 dev: /dev/sda2
           ID-2: /boot size: 510M used: 111M (22%) fs: vfat dev: /dev/sda1
           ID-3: swap-1 size: 8.59GB used: 0.00GB (0%) fs: swap dev: /dev/sda3
Sensors:   System Temperatures: cpu: 41.0C mobo: 40.0C
           Fan Speeds (in rpm): cpu: N/A
Info:      Processes: 179 Uptime: 10:24 Memory: 2846.7/7898.9MB Init: systemd Gcc sys: 7.2.1
           Client: Shell (bash 4.4.121) inxi: 2.3.53
