Created attachment 282579 [details] dmesg of the errors occurring

I have a Samsung SSD 860 EVO mSATA 500GB SSD connected via an ASMedia ASM1062 Serial ATA Controller. It locks up for 20-30 seconds on fstrim (which runs during bootup on my system), with messages such as:

[ 332.792044] ata14.00: exception Emask 0x0 SAct 0x3fffe SErr 0x0 action 0x6 frozen
[ 332.798271] ata14.00: failed command: SEND FPDMA QUEUED
[ 332.804499] ata14.00: cmd 64/01:08:00:00:00/00:00:00:00:00/a0 tag 1 ncq dma 512 out
         res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 332.817145] ata14.00: status: { DRDY }

After disabling queued TRIM via the included patch, the issue disappears.
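For readers decoding the exception line: the SAct value is a bitmask in which bit n marks NCQ tag n as outstanding when the error was raised. A minimal Python sketch of that decoding follows; the helper name `active_tags` is made up for illustration.

```python
def active_tags(sact: int) -> list:
    """Return the NCQ tag numbers whose bits are set in an SAct bitmask."""
    # NCQ allows at most 32 outstanding commands, so only bits 0..31 matter.
    return [tag for tag in range(32) if (sact >> tag) & 1]


# SAct value from the exception line in the report above.
tags = active_tags(0x3FFFE)
print(f"{len(tags)} commands outstanding, tags {tags[0]}..{tags[-1]}")
```

For 0x3fffe this gives tags 1 through 17, which includes the tag 1 reported for the failed SEND FPDMA QUEUED command.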
Created attachment 282581 [details] disable queued TRIM for Samsung 860 series SSDs
This patch is still relevant for master. Add my vote to merging this; I'd like to be able to re-enable NCQ on this SSD.
This patch looks good - any chance you can email one with a proper commit log and signed-off-by etc to linux-ide@vger.kernel.org? And you can CC me, axboe@kernel.dk, and I'll get it queued up for the current kernel.
Jens, thanks, sent to https://marc.info/?l=linux-ide&m=156312691006716&w=2, it is now being discussed there. Solomon: what model do you have that also has a problem with TRIM, 860 EVO mSATA too? And which firmware revision?
I have the 1TB SATA (not mSATA!) version.

smartctl -a dump:

Model Family:     Samsung based SSDs
Device Model:     Samsung SSD 860 EVO 1TB
Serial Number:    S3Z8NB0K717690X
LU WWN Device Id: 5 002538 e4054049c
Firmware Version: RVT01B6Q
User Capacity:    1,000,204,886,016 bytes [1.00 TB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Form Factor:      2.5 inches
Device is:        In smartctl database [for details use: -P show]
ATA Version is:   ACS-4 T13/BSR INCITS 529 revision 5
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Mon Jul 15 13:47:44 2019 EDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled

kernel log snippet: (Untainted Fedora 5.1.16-300.fc30.x86_64 kernel)

ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
ata1.00: supports DRM functions and may not be fully accessible
ata1.00: ATA-11: Samsung SSD 860 EVO 1TB, RVT01B6Q, max UDMA/133
ata1.00: 1953525168 sectors, multi 1: LBA48 NCQ (depth 32), AA
ata1.00: supports DRM functions and may not be fully accessible
ata1.00: configured for UDMA/133
scsi 0:0:0:0: Direct-Access ATA Samsung SSD 860 1B6Q PQ: 0 ANSI: 5
sd 0:0:0:0: Attached scsi generic sg0 type 0
ata1.00: Enabling discard_zeroes_data
sd 0:0:0:0: [sda] 1953525168 512-byte logical blocks: (1.00 TB/932 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
ata1.00: Enabling discard_zeroes_data
sda: sda1 sda2 sda3
ata1.00: Enabling discard_zeroes_data
sd 0:0:0:0: [sda] supports TCG Opal
sd 0:0:0:0: [sda] Attached SCSI disk
See also BZ #201693
> See also BZ #201693 Did you confirm that with my patch applied you have no problem with 860 EVO on the AMD SATA controller anymore? I thought that one is a hopeless matter and the issues extend to more than just TRIM, to regular (high-speed) reads/writes too. For that reason I moved mine to an ASMedia controller, and here it is clear-cut that only the queued TRIM fails, everything else works fine.
I'm building a Fedora kernel with the patch applied, and will get back to you later today. In the meantime, I can confirm that with the drive's queue depth set to 1, I have no timeout or corruption issues:

echo 1 > /sys/block/sda/device/queue_depth
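The same queue-depth workaround can be scripted. Below is a minimal Python sketch, assuming the sysfs layout used by the echo command in this comment; the function name and the `sysfs_root` parameter are hypothetical (the latter exists only so the helper can be tried against a scratch directory). Writing to the real node needs root, and the setting does not survive a reboot.

```python
from pathlib import Path


def set_queue_depth(disk: str, depth: int, sysfs_root: str = "/sys") -> int:
    """Write a new queue depth for `disk` and return the previous value."""
    if not 1 <= depth <= 32:
        raise ValueError("NCQ queue depth must be between 1 and 32")
    node = Path(sysfs_root) / "block" / disk / "device" / "queue_depth"
    previous = int(node.read_text().strip())
    # A depth of 1 means only one command is in flight, i.e. no NCQ queuing.
    node.write_text(f"{depth}\n")
    return previous
```

Calling `set_queue_depth("sda", 1)` as root mirrors the echo command; the returned value can be written back later to restore the default depth.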
Finally got it built and booted up.. and it went kaboom. Same kernel (Fedora 5.1.16-300) but with Roman's patch applied, yields much the same kernel log, with this addition:

ata1.00: disabling queued TRIM support

Unfortunately, about 30 seconds later, it went kaboom:

[ 35.527148] ata1.00: exception Emask 0x10 SAct 0xfc000 SErr 0x0 action 0x6 frozen
[ 35.527155] ata1.00: irq_stat 0x08000000, interface fatal error
[ 35.527161] ata1.00: failed command: WRITE FPDMA QUEUED
[ 35.527171] ata1.00: cmd 61/20:70:e0:a6:8b/00:00:25:00:00/40 tag 14 ncq dma 16384 out
         res 40/00:70:e0:a6:8b/00:00:25:00:00/40 Emask 0x10 (ATA bus error)
[ 35.527176] ata1.00: status: { DRDY }
[ 35.527179] ata1.00: failed command: WRITE FPDMA QUEUED
[ 35.527187] ata1.00: cmd 61/08:78:e0:ad:8b/00:00:25:00:00/40 tag 15 ncq dma 4096 out
         res 40/00:70:e0:a6:8b/00:00:25:00:00/40 Emask 0x10 (ATA bus error)
[ 35.527191] ata1.00: status: { DRDY }
[ 35.527194] ata1.00: failed command: WRITE FPDMA QUEUED
[ 35.527202] ata1.00: cmd 61/20:80:60:d0:91/00:00:25:00:00/40 tag 16 ncq dma 16384 out
         res 40/00:70:e0:a6:8b/00:00:25:00:00/40 Emask 0x10 (ATA bus error)
[ 35.527205] ata1.00: status: { DRDY }
[ 35.527208] ata1.00: failed command: WRITE FPDMA QUEUED
[ 35.527216] ata1.00: cmd 61/40:88:00:d1:91/00:00:25:00:00/40 tag 17 ncq dma 32768 out
         res 40/00:70:e0:a6:8b/00:00:25:00:00/40 Emask 0x10 (ATA bus error)
[ 35.527219] ata1.00: status: { DRDY }
[ 35.527222] ata1.00: failed command: WRITE FPDMA QUEUED
[ 35.527230] ata1.00: cmd 61/08:90:c0:51:92/00:00:25:00:00/40 tag 18 ncq dma 4096 out
         res 40/00:70:e0:a6:8b/00:00:25:00:00/40 Emask 0x10 (ATA bus error)
[ 35.527233] ata1.00: status: { DRDY }
[ 35.527236] ata1.00: failed command: WRITE FPDMA QUEUED
[ 35.527243] ata1.00: cmd 61/20:98:20:52:92/00:00:25:00:00/40 tag 19 ncq dma 16384 out
         res 40/00:70:e0:a6:8b/00:00:25:00:00/40 Emask 0x10 (ATA bus error)
[ 35.527246] ata1.00: status: { DRDY }
[ 35.527252] ata1: hard resetting link
[ 35.986132] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 35.986457] ata1.00: supports DRM functions and may not be fully accessible
[ 35.987384] ata1.00: disabling queued TRIM support
[ 35.989818] ata1.00: supports DRM functions and may not be fully accessible
[ 35.990591] ata1.00: disabling queued TRIM support
[ 35.992641] ata1.00: configured for UDMA/133
[ 35.992670] ata1: EH complete
[ 35.992941] ata1.00: Enabling discard_zeroes_data

So perhaps this SSD is simply incompatible with NCQ. Sigh.
> So perhaps this SSD is simply incompatible with NCQ. Not in general, only in combination with AMD SATA, as discussed in that other bugreport. And indeed there it's not only TRIM, but also regular writes. Any chance you could test on a different controller (ASMedia, Marvell, ...)?
It's frustrating that Samsung has demonstrated no interest in solving this problem properly. It's not like AMD-based systems are _that_ rare. Every system I have at home is AMD-based or has an incompatible form factor. I'll see what I can dig up around the office.
I just swapped in an ASMedia-based SATA controller, and re-enabled NCQ (by using the default queue_depth). The system is subjectively much, much faster and is (so far) error free.
I'm getting the same issue on 4.15..5.4.49 with an Intel ASRock Z170 Extreme4 SATA controller: [389520.385306] ata2.00: exception Emask 0x0 SAct 0xf SErr 0x0 action 0x6 frozen [389520.385315] ata2.00: failed command: WRITE FPDMA QUEUED [389520.385327] ata2.00: cmd 61/60:00:80:8e:20/00:00:98:00:00/40 tag 0 ncq dma 49152 out res 40/00:01:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout) [389520.385332] ata2.00: status: { DRDY } [389520.385336] ata2.00: failed command: WRITE FPDMA QUEUED [389520.385345] ata2.00: cmd 61/20:08:00:8f:20/00:00:98:00:00/40 tag 1 ncq dma 16384 out res 40/00:01:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout) [389520.385349] ata2.00: status: { DRDY } [389520.385353] ata2.00: failed command: SEND FPDMA QUEUED [389520.385364] ata2.00: cmd 64/01:10:00:00:00/00:00:00:00:00/a0 tag 2 ncq dma 512 out res 40/00:01:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout) [389520.385370] ata2.00: status: { DRDY } [389520.385374] ata2.00: failed command: WRITE FPDMA QUEUED [389520.385382] ata2.00: cmd 61/e0:18:b8:ea:77/05:00:97:00:00/40 tag 3 ncq dma 770048 out res 40/00:01:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout) [389520.385386] ata2.00: status: { DRDY } [389520.385393] ata2: hard resetting link [389520.699442] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [389520.701434] ata2.00: supports DRM functions and may not be fully accessible [389520.704682] ata2.00: supports DRM functions and may not be fully accessible [389520.707501] ata2.00: configured for UDMA/133 [389520.707511] ata2: EH complete [389520.707742] ata2.00: Enabling discard_zeroes_data [389551.093259] ata2.00: exception Emask 0x0 SAct 0x1fc0000 SErr 0x0 action 0x6 frozen [389551.093261] ata2.00: failed command: WRITE FPDMA QUEUED [389551.093264] ata2.00: cmd 61/d8:90:a8:bc:a0/09:00:97:00:00/40 tag 18 ncq dma 1290240 ou res 40/00:01:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout) [389551.093265] ata2.00: status: { DRDY } [389551.093266] ata2.00: failed command: WRITE FPDMA QUEUED 
[389551.093267] ata2.00: cmd 61/e0:98:b8:ea:77/05:00:97:00:00/40 tag 19 ncq dma 770048 out res 40/00:01:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout) [389551.093268] ata2.00: status: { DRDY } [389551.093269] ata2.00: failed command: SEND FPDMA QUEUED [389551.093271] ata2.00: cmd 64/01:a0:00:00:00/00:00:00:00:00/a0 tag 20 ncq dma 512 out res 40/00:01:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout) [389551.093271] ata2.00: status: { DRDY } [389551.093272] ata2.00: failed command: WRITE FPDMA QUEUED [389551.093274] ata2.00: cmd 61/20:a8:00:8f:20/00:00:98:00:00/40 tag 21 ncq dma 16384 out res 40/00:01:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout) [389551.093274] ata2.00: status: { DRDY } [389551.093275] ata2.00: failed command: WRITE FPDMA QUEUED [389551.093295] ata2.00: cmd 61/60:b0:80:8e:20/00:00:98:00:00/40 tag 22 ncq dma 49152 out res 40/00:01:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout) [389551.093296] ata2.00: status: { DRDY } [389551.093296] ata2.00: failed command: WRITE FPDMA QUEUED [389551.093298] ata2.00: cmd 61/b0:b8:80:c6:a0/09:00:97:00:00/40 tag 23 ncq dma 1269760 ou res 40/00:00:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) [389551.093299] ata2.00: status: { DRDY } [389551.093300] ata2.00: failed command: WRITE FPDMA QUEUED [389551.093301] ata2.00: cmd 61/10:c0:f0:21:22/00:00:96:00:00/40 tag 24 ncq dma 8192 out res 40/00:01:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout) [389551.093302] ata2.00: status: { DRDY } [389551.093303] ata2: hard resetting link [389551.407389] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [389551.409259] ata2.00: supports DRM functions and may not be fully accessible [389551.412712] ata2.00: supports DRM functions and may not be fully accessible [389551.415759] ata2.00: configured for UDMA/133 [389551.415773] ata2: EH complete [389581.797243] ata2.00: exception Emask 0x0 SAct 0x3f80 SErr 0x0 action 0x6 frozen [389581.797246] ata2.00: failed command: WRITE FPDMA QUEUED [389581.797248] ata2.00: cmd 
61/10:38:f0:21:22/00:00:96:00:00/40 tag 7 ncq dma 8192 out res 40/00:01:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout) [389581.797249] ata2.00: status: { DRDY } [389581.797250] ata2.00: failed command: WRITE FPDMA QUEUED [389581.797252] ata2.00: cmd 61/b0:40:80:c6:a0/09:00:97:00:00/40 tag 8 ncq dma 1269760 ou res 40/00:01:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) [389581.797253] ata2.00: status: { DRDY } [389581.797253] ata2.00: failed command: WRITE FPDMA QUEUED [389581.797255] ata2.00: cmd 61/60:48:80:8e:20/00:00:98:00:00/40 tag 9 ncq dma 49152 out res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) [389581.797256] ata2.00: status: { DRDY } [389581.797257] ata2.00: failed command: WRITE FPDMA QUEUED [389581.797258] ata2.00: cmd 61/20:50:00:8f:20/00:00:98:00:00/40 tag 10 ncq dma 16384 out res 40/00:01:00:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout) [389581.797259] ata2.00: status: { DRDY } [389581.797260] ata2.00: failed command: SEND FPDMA QUEUED [389581.797262] ata2.00: cmd 64/01:58:00:00:00/00:00:00:00:00/a0 tag 11 ncq dma 512 out res 40/00:01:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout) [389581.797262] ata2.00: status: { DRDY } [389581.797263] ata2.00: failed command: WRITE FPDMA QUEUED [389581.797265] ata2.00: cmd 61/e0:60:b8:ea:77/05:00:97:00:00/40 tag 12 ncq dma 770048 out res 40/00:ff:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout) [389581.797265] ata2.00: status: { DRDY } [389581.797266] ata2.00: failed command: WRITE FPDMA QUEUED [389581.797268] ata2.00: cmd 61/d8:68:a8:bc:a0/09:00:97:00:00/40 tag 13 ncq dma 1290240 ou res 40/00:01:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout) [389581.797268] ata2.00: status: { DRDY } [389581.797270] ata2: hard resetting link [389582.111393] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [389582.113289] ata2.00: supports DRM functions and may not be fully accessible [389582.116517] ata2.00: supports DRM functions and may not be fully accessible [389582.119421] ata2.00: configured for UDMA/133 
[389582.119438] ata2: EH complete [389582.119715] ata2.00: Enabling discard_zeroes_data [389582.120788] ata2.00: Enabling discard_zeroes_data [389612.533285] ata2.00: NCQ disabled due to excessive errors [389612.533292] ata2.00: exception Emask 0x0 SAct 0x7c00000f SErr 0x0 action 0x6 frozen [389612.533301] ata2.00: failed command: WRITE FPDMA QUEUED [389612.533313] ata2.00: cmd 61/b0:00:80:c6:a0/09:00:97:00:00/40 tag 0 ncq dma 1269760 ou res 40/00:01:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout) [389612.533317] ata2.00: status: { DRDY } [389612.533322] ata2.00: failed command: WRITE FPDMA QUEUED [389612.533331] ata2.00: cmd 61/10:08:f0:21:22/00:00:96:00:00/40 tag 1 ncq dma 8192 out res 40/00:01:00:00:00/00:00:00:00:00/40 Emask 0x4 (timeout) [389612.533335] ata2.00: status: { DRDY } [389612.533339] ata2.00: failed command: READ FPDMA QUEUED [389612.533347] ata2.00: cmd 60/18:10:c0:d3:00/00:00:00:00:00/40 tag 2 ncq dma 12288 in res 40/00:01:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout) [389612.533351] ata2.00: status: { DRDY } [389612.533354] ata2.00: failed command: READ FPDMA QUEUED [389612.533363] ata2.00: cmd 60/20:18:80:b9:e7/00:00:58:00:00/40 tag 3 ncq dma 16384 in res 40/00:01:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout) [389612.533366] ata2.00: status: { DRDY } [389612.533371] ata2.00: failed command: WRITE FPDMA QUEUED [389612.533380] ata2.00: cmd 61/d8:d0:a8:bc:a0/09:00:97:00:00/40 tag 26 ncq dma 1290240 ou res 40/00:01:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout) [389612.533383] ata2.00: status: { DRDY } [389612.533387] ata2.00: failed command: WRITE FPDMA QUEUED [389612.533396] ata2.00: cmd 61/e0:d8:b8:ea:77/05:00:97:00:00/40 tag 27 ncq dma 770048 out res 40/00:01:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout) [389612.533399] ata2.00: status: { DRDY } [389612.533402] ata2.00: failed command: SEND FPDMA QUEUED [389612.533410] ata2.00: cmd 64/01:e0:00:00:00/00:00:00:00:00/a0 tag 28 ncq dma 512 out res 40/00:01:00:00:00/00:00:00:00:00/40 Emask 0x4 
(timeout) [389612.533414] ata2.00: status: { DRDY } [389612.533417] ata2.00: failed command: WRITE FPDMA QUEUED [389612.533426] ata2.00: cmd 61/20:e8:00:8f:20/00:00:98:00:00/40 tag 29 ncq dma 16384 out res 40/00:01:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout) [389612.533429] ata2.00: status: { DRDY } [389612.533433] ata2.00: failed command: WRITE FPDMA QUEUED [389612.533441] ata2.00: cmd 61/60:f0:80:8e:20/00:00:98:00:00/40 tag 30 ncq dma 49152 out res 40/00:01:00:4f:c2/00:00:00:00:00/40 Emask 0x4 (timeout) [389612.533445] ata2.00: status: { DRDY } [389612.533451] ata2: hard resetting link [389612.851755] ata2: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [389612.853797] ata2.00: supports DRM functions and may not be fully accessible [389612.857594] ata2.00: supports DRM functions and may not be fully accessible [389612.860819] ata2.00: configured for UDMA/133 [389612.860879] ata2: EH complete [389612.865362] ata2.00: Enabling discard_zeroes_data This is during an fstrim, and it doesn't happen on the Samsung 850 EVO. Device Model: Samsung SSD 850 EVO 2TB Firmware Version: EMT02B6Q Device Model: Samsung SSD 860 EVO 2TB Firmware Version: RVT04B6Q 00:17.0 SATA controller: Intel Corporation Q170/Q150/B150/H170/H110/Z170/CM236 Chipset SATA Controller [AHCI Mode] (rev 31)
Same issue, different controller: System: FUJITSU PRIMERGY TX1310 M1/D3219-A1, BIOS V4.6.5.4 R1.11.0 for D3219-A1x 09/25/2018 Kernel: Linux server 5.4.72-gentoo-x86_64 #1 SMP Sat Oct 17 05:17:10 EET 2020 x86_64 Intel(R) Xeon(R) CPU E3-1226 v3 @ 3.30GHz GenuineIntel GNU/Linux Controller: 00:1f.2 SATA controller: Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode] (rev 04) Device Model: Samsung SSD 860 EVO 500GB Firmware Version: RVT04B6Q [395138.151251] ata6.00: exception Emask 0x10 SAct 0x40003fff SErr 0x400100 action 0x6 frozen [395138.152011] ata6.00: irq_stat 0x08000008, interface fatal error [395138.152755] ata6: SError: { UnrecovData Handshk } [395138.153470] ata6.00: failed command: WRITE FPDMA QUEUED [395138.154222] ata6.00: cmd 61/08:00:78:38:80/00:00:0e:00:00/40 tag 0 ncq dma 4096 out res 40/00:68:f8:12:c0/00:00:13:00:00/40 Emask 0x10 (ATA bus error) [395138.155801] ata6.00: status: { DRDY } [395138.156579] ata6.00: failed command: WRITE FPDMA QUEUED [395138.156581] ata6.00: cmd 61/08:08:18:26:81/00:00:0e:00:00/40 tag 1 ncq dma 4096 out res 40/00:68:f8:12:c0/00:00:13:00:00/40 Emask 0x10 (ATA bus error) [395138.156581] ata6.00: status: { DRDY } [395138.156582] ata6.00: failed command: WRITE FPDMA QUEUED [395138.156593] ata6.00: cmd 61/08:10:50:59:81/00:00:0e:00:00/40 tag 2 ncq dma 4096 out res 40/00:68:f8:12:c0/00:00:13:00:00/40 Emask 0x10 (ATA bus error) [395138.156594] ata6.00: status: { DRDY } [395138.156594] ata6.00: failed command: WRITE FPDMA QUEUED [395138.156596] ata6.00: cmd 61/08:18:90:6a:81/00:00:0e:00:00/40 tag 3 ncq dma 4096 out res 40/00:68:f8:12:c0/00:00:13:00:00/40 Emask 0x10 (ATA bus error) [395138.156596] ata6.00: status: { DRDY } [395138.156597] ata6.00: failed command: WRITE FPDMA QUEUED [395138.156598] ata6.00: cmd 61/08:20:58:b2:81/00:00:0e:00:00/40 tag 4 ncq dma 4096 out res 40/00:68:f8:12:c0/00:00:13:00:00/40 Emask 0x10 (ATA bus error) [395138.156599] ata6.00: status: { DRDY } [395138.156599] 
ata6.00: failed command: WRITE FPDMA QUEUED [395138.156601] ata6.00: cmd 61/08:28:b0:26:c0/00:00:0e:00:00/40 tag 5 ncq dma 4096 out res 40/00:68:f8:12:c0/00:00:13:00:00/40 Emask 0x10 (ATA bus error) [395138.156602] ata6.00: status: { DRDY } [395138.171913] ata6.00: failed command: WRITE FPDMA QUEUED [395138.171915] ata6.00: cmd 61/10:30:a0:27:c0/00:00:0e:00:00/40 tag 6 ncq dma 8192 out res 40/00:68:f8:12:c0/00:00:13:00:00/40 Emask 0x10 (ATA bus error) [395138.171916] ata6.00: status: { DRDY } [395138.171916] ata6.00: failed command: WRITE FPDMA QUEUED [395138.171919] ata6.00: cmd 61/08:38:50:2a:c0/00:00:0e:00:00/40 tag 7 ncq dma 4096 out res 40/00:68:f8:12:c0/00:00:13:00:00/40 Emask 0x10 (ATA bus error) [395138.176836] ata6.00: status: { DRDY } [395138.176837] ata6.00: failed command: WRITE FPDMA QUEUED [395138.176839] ata6.00: cmd 61/08:40:e8:49:c8/00:00:0e:00:00/40 tag 8 ncq dma 4096 out res 40/00:68:f8:12:c0/00:00:13:00:00/40 Emask 0x10 (ATA bus error) [395138.176839] ata6.00: status: { DRDY } [395138.176840] ata6.00: failed command: WRITE FPDMA QUEUED [395138.176841] ata6.00: cmd 61/08:48:58:08:80/00:00:0f:00:00/40 tag 9 ncq dma 4096 out res 40/00:68:f8:12:c0/00:00:13:00:00/40 Emask 0x10 (ATA bus error) [395138.176842] ata6.00: status: { DRDY } [395138.183063] ata6.00: failed command: WRITE FPDMA QUEUED [395138.183065] ata6.00: cmd 61/08:50:08:08:c0/00:00:13:00:00/40 tag 10 ncq dma 4096 out res 40/00:68:f8:12:c0/00:00:13:00:00/40 Emask 0x10 (ATA bus error) [395138.183075] ata6.00: status: { DRDY } [395138.183076] ata6.00: failed command: WRITE FPDMA QUEUED [395138.183077] ata6.00: cmd 61/08:58:80:08:c0/00:00:13:00:00/40 tag 11 ncq dma 4096 out res 40/00:68:f8:12:c0/00:00:13:00:00/40 Emask 0x10 (ATA bus error) [395138.183078] ata6.00: status: { DRDY } [395138.189053] ata6.00: failed command: WRITE FPDMA QUEUED [395138.189055] ata6.00: cmd 61/08:60:a8:12:c0/00:00:13:00:00/40 tag 12 ncq dma 4096 out res 40/00:68:f8:12:c0/00:00:13:00:00/40 Emask 0x10 (ATA bus 
error) [395138.189055] ata6.00: status: { DRDY } [395138.189065] ata6.00: failed command: WRITE FPDMA QUEUED [395138.189066] ata6.00: cmd 61/08:68:f8:12:c0/00:00:13:00:00/40 tag 13 ncq dma 4096 out res 40/00:68:f8:12:c0/00:00:13:00:00/40 Emask 0x10 (ATA bus error) [395138.189067] ata6.00: status: { DRDY } [395138.189068] ata6.00: failed command: WRITE FPDMA QUEUED [395138.189070] ata6.00: cmd 61/08:f0:90:2d:80/00:00:0e:00:00/40 tag 30 ncq dma 4096 out res 40/00:68:f8:12:c0/00:00:13:00:00/40 Emask 0x10 (ATA bus error) [395138.189071] ata6.00: status: { DRDY } [395138.199031] ata6: hard resetting link [395138.511140] ata6: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [395138.517064] ata6.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded [395138.519256] ata6.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out [395138.521402] ata6.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out [395138.523837] ata6.00: supports DRM functions and may not be fully accessible [395138.529475] ata6.00: ACPI cmd ef/10:06:00:00:00:00 (SET FEATURES) succeeded [395138.530403] ata6.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out [395138.531236] ata6.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out [395138.532417] ata6.00: supports DRM functions and may not be fully accessible [395138.536106] ata6.00: configured for UDMA/133 [395138.537034] ata6: EH complete [395138.537973] ata6.00: Enabling discard_zeroes_data What's the recommended way to go? Disable NCQ?
> What's the recommended way to go? Disable NCQ?

I believe if you see "WRITE FPDMA QUEUED" messages, the issue is with NCQ in general, and yes, you should try disabling it for the device. But if you see only "SEND FPDMA QUEUED", as in the initial post, then you might get away with disabling just the queued TRIM.

It is surprising to see that it fails even on Intel's controllers; all of this was mostly discussed with regard to AMD SATA.
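The rule of thumb in this comment can be applied to a saved dmesg dump with a few lines of Python. This is only a sketch of that heuristic, not a diagnostic tool; the function name and the wording of the returned hints are made up here.

```python
import re

# Matches e.g. "failed command: WRITE FPDMA QUEUED" in libata error output.
FAILED_CMD = re.compile(r"failed command: (\w+ FPDMA QUEUED)")


def suggest_workaround(dmesg_text: str) -> str:
    """Apply the heuristic above to the failed NCQ commands in a log."""
    commands = set(FAILED_CMD.findall(dmesg_text))
    if not commands:
        return "no failed NCQ commands found"
    if commands == {"SEND FPDMA QUEUED"}:
        # Only queued TRIM timed out, as in the initial report.
        return "only queued TRIM fails: disabling queued TRIM may be enough"
    # READ/WRITE FPDMA QUEUED failures point at NCQ in general.
    return "regular NCQ commands fail: try disabling NCQ for the device"
```

Note that a queued TRIM timeout freezes the port and fails every outstanding command with it, so a log that mixes SEND and WRITE failures (like the Z170 one earlier in this thread) lands in the second branch even if TRIM was the trigger.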
(In reply to Roman Mamedov from comment #15) > It is surprising to see that it even fails on Intel's controllers as well, > all of this was mostly discussed with regard to AMD SATA. It's not surprising when you realise that queued trim used to be disabled on the Samsung 8* until Samsung's marketing department made an unsubstantiated claim that "the improved queued trim enhances Linux compatibility": https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ca6bfcb2f6d9deab3924bf901e73622a94900473
(In reply to Simon Arlott from comment #16)
> It's not surprising when you realise that queued trim used to be disabled on
> the Samsung 8* until Samsung's marketing department made an unsubstantiated
> claim that "the improved queued trim enhances Linux compatibility":
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ca6bfcb2f6d9deab3924bf901e73622a94900473

So it sounds like we just need to revert that patch, or at least re-enable the ATA_HORKAGE_NO_NCQ_TRIM quirk for the 860 series?
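To illustrate what a quirk keyed on the model string would and would not catch, here is a toy Python model of a glob-based quirk table. It is not the kernel's code: the real table lives in libata's C sources, and the pattern "Samsung SSD 860*", the shortened flag name and the first-match lookup are all assumptions made for this sketch.

```python
from fnmatch import fnmatchcase

# Hypothetical quirk table: (model glob, set of quirk flags).
QUIRKS = [
    ("Samsung SSD 860*", {"NO_NCQ_TRIM"}),
]


def quirks_for(model: str) -> set:
    """Return the flags of the first table entry whose glob matches `model`."""
    for pattern, flags in QUIRKS:
        if fnmatchcase(model, pattern):
            return set(flags)
    return set()


for model in ("Samsung SSD 860 EVO 1TB", "Samsung SSD 850 EVO 2TB"):
    print(model, "->", sorted(quirks_for(model)) or "no quirks")
```

With this pattern the 860 EVO models reported in this thread pick up the flag, while the 850 EVO, which was reported to trim without errors, is left alone.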
Hans: also see https://bugzilla.kernel.org/show_bug.cgi?id=201693 . My personal experience is detailed over on https://marc.info/?t=154644279600003&r=1&w=2 and happens on plain reads. I've been booting with the kernel param libata.force=2.00:noncq to disable NCQ on the second ATA port where the Samsung 860 is plugged in which seems to stabilize things.
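For anyone adapting that boot parameter to a different port: libata.force takes a comma-separated list of [ID:]VAL entries, where ID is PORT[.DEVICE]. The Python sketch below builds and reads back such a parameter; both helper names are made up for illustration, and the port number has to be taken from your own dmesg (ataN.00).

```python
def libata_force(port: int, device: int, value: str) -> str:
    """Build a libata.force boot parameter for one port/device pair."""
    # The device number is written with two digits, as in "2.00:noncq".
    return f"libata.force={port}.{device:02d}:{value}"


def forced_settings(cmdline: str) -> list:
    """Extract all libata.force entries from a kernel command line."""
    entries = []
    for word in cmdline.split():
        if word.startswith("libata.force="):
            entries.extend(word.split("=", 1)[1].split(","))
    return entries


print(libata_force(2, 0, "noncq"))
```

After rebooting, `forced_settings(open("/proc/cmdline").read())` shows whether the parameter actually reached the kernel.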
(In reply to Sitsofe Wheeler from comment #18)
> Hans: also see https://bugzilla.kernel.org/show_bug.cgi?id=201693 . My
> personal experience is detailed over on
> https://marc.info/?t=154644279600003&r=1&w=2 and happens on plain reads.
> I've been booting with the kernel param libata.force=2.00:noncq to disable
> NCQ on the second ATA port where the Samsung 860 is plugged in which seems
> to stabilize things.

I disabled NCQ for the drive using the equivalent kernel parameter and have not seen these messages again (although they had appeared only once recently, after a few months of the SSD's operation).

For what it's worth, 4K random read performance dropped tenfold without NCQ (from 380MB/s down to 38MB/s), which I guess is to be expected. Performance on the other tests didn't seem to be affected much by disabling NCQ.
Intel controller, same issue.

Model: Samsung SSD 860 EVO 1TB
Firmware Revision: RVT04B6Q
Machine: Dell Precision M4700
BIOS: A19, 11/30/2018
SATA controller: Intel Corporation 7 Series Chipset Family 6-port SATA Controller [AHCI mode] (rev 04)
Kernel: Linux 5.10.0-1-amd64 #1 SMP Debian 5.10.5-1 (2021-01-09) x86_64 GNU/Linux
Linux version 5.10.0-1-amd64 (debian-kernel@lists.debian.org) (gcc-10 (Debian 10.2.1-5) 10.2.1 20210108, GNU ld (GNU Binutils for Debian) 2.35.1) #1 SMP Debian 5.10.5-1 (2021-01-09)

ata1.00: exception Emask 0x10 SAct 0x7f80 SErr 0x440100 action 0x6 frozen
ata1.00: irq_stat 0x08000000, interface fatal error
ata1: SError: { UnrecovData CommWake Handshk }
ata1.00: failed command: WRITE FPDMA QUEUED
ata1.00: cmd 61/00:38:20:16:02/0a:00:65:00:00/40 tag 7 ncq dma 1310720 ou
         res 40/00:40:20:20:02/00:00:65:00:00/40 Emask 0x10 (ATA bus error)
ata1.00: status: { DRDY }

Disabling NCQ and setting link_power_management_policy to max_performance reduces the frequency of errors:

echo 1 > /sys/block/sda/device/queue_depth
echo max_performance > /sys/class/scsi_host/host*/link_power_management_policy

I had some days without errors, but they still occur occasionally, mostly after updating or installing packages.
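A shell redirect to host* only works when the glob matches exactly one host, so on machines with several SATA ports it can be easier to loop over them. Here is a minimal Python sketch, assuming the sysfs path from the command above; the function name and the `sysfs_root` parameter are hypothetical, the latter added only so the helper can be exercised against a scratch directory. It needs root on a real system and the change is lost on reboot.

```python
from pathlib import Path


def set_link_power_policy(policy: str, sysfs_root: str = "/sys") -> list:
    """Write `policy` to every SCSI host that exposes the LPM attribute."""
    changed = []
    for host in sorted(Path(sysfs_root, "class", "scsi_host").glob("host*")):
        node = host / "link_power_management_policy"
        # Hosts that are not AHCI/libata ports may lack this attribute.
        if node.exists():
            node.write_text(policy + "\n")
            changed.append(host.name)
    return changed
```

`set_link_power_policy("max_performance")` returns the list of hosts it touched, which makes it easy to confirm that the port carrying the 860 EVO was among them.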
I'm encountering this bug as well, on a Thinkpad t450s with a Samsung SSD 860 EVO 1TB (firmware RVT04B6Q), running Slackware-14.2 with the kernel upgraded to 5.10.15. I'm adding my info particularly because of my non-AMD SATA controller:

00:1f.2 SATA controller: Intel Corporation Wildcat Point-LP SATA Controller [AHCI Mode] (rev 03) (prog-if 01 [AHCI 1.0])
	Subsystem: Lenovo Wildcat Point-LP SATA Controller [AHCI Mode]
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin B routed to IRQ 44
	Region 0: I/O ports at 30a8 [size=8]
	Region 1: I/O ports at 30b4 [size=4]
	Region 2: I/O ports at 30a0 [size=8]
	Region 3: I/O ports at 30b0 [size=4]
	Region 4: I/O ports at 3060 [size=32]
	Region 5: Memory at f123c000 (32-bit, non-prefetchable) [size=2K]
	Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: fee00298 Data: 0000
	Capabilities: [70] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot+,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [a8] SATA HBA v1.0 BAR4 Offset=00000004
	Kernel driver in use: ahci

I have two ext4 partitions mounted with the discard option, one of which is encrypted. I see ata errors just about every time I reboot my machine, and was able to easily provoke them manually by issuing fstrim on my root and home partitions.

I was (apparently) able to work around this bug both by issuing

echo 1 > /sys/block/sda/device/queue_depth

and by reverting https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ca6bfcb2f6d9deab3924bf901e73622a94900473

Please let me know if there's anything else I can do to help. I personally was quite put off by the sudden onset of all these ata errors after I thought I had prolonged my laptop's life with a nice and big SSD.
I'm happy to work around the issue, but it would be better to be able to use vanilla sources without failures.