Bug 218832
Summary: | ata1.00: ACPI cmd f5/00:00:00:00:00:a0(SECURITY FREEZE LOCK) filtered out | ||
---|---|---|---|
Product: | IO/Storage | Reporter: | doru iorgulescu (doru.iorgulescu1) |
Component: | Serial ATA | Assignee: | Tejun Heo (tj) |
Status: | RESOLVED PATCH_ALREADY_AVAILABLE | ||
Severity: | high | CC: | alex.tkd.alex, cassel, damien.lemoal, lp610mh |
Priority: | P3 | ||
Hardware: | Intel | ||
OS: | Linux | ||
Kernel Version: | 6.9.0 6.10.0-rc1 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Attachments: |
dmesg.txt
lspci-v.txt
successful boot with libata.force=nolpm
Boot with LPM enabled
HDPARM with the patch
dmesg after patch
dmesg-amd.txt
hdparam-i.txt
dmesg693.txt
hdparam-I.txt
dmesg-610-rc2.txt
dmesg694.txt
dmesg on a kernel built with latest patch submitted by Niklas |
Description
doru iorgulescu
2024-05-13 04:40:17 UTC
Please check existing Google results: https://www.google.com/search?q=%22(SECURITY+FREEZE+LOCK)+filtered+out%22

Working with Linux Kernel 6.8.9, 6.6.30, 6.1.90, 5.15.158, 5.10.216, 5.4.275, 4.19.313

If it's a regression please perform regression testing using: https://docs.kernel.org/admin-guide/bug-bisect.html

CCing Damien, he might be interested in it and have an idea. But if not I guess we won't get any further without a bisection.

Here is an alternative guide: https://docs.kernel.org/admin-guide/verify-bugs-and-bisect-regressions.html

The problem also appears on Linux Kernel 6.10.0-rc1
Please help me to find the commit responsible
Thank You!

(In reply to doru iorgulescu from comment #6)
> The problem also appears on Linux Kernel 6.10.0-rc1
> Please help me to find the commit responsible
> Thank You!

As Artem mentioned, please try to git bisect this. Also, please describe better how you get this: is it on boot during device scan? Is it after a resume from suspend or hibernate? If it is the latter, please post your dmesg output after booting.

The drive is BIOS locked and seems to fail unlocking for some reason. There have been no changes around this recently. So it could be ACPI or some regression from another unrelated change. What comes to mind is the recent fixes for low power mode...

What is the drive and what is the adapter?

The drive is SSD R3SL240G, O1015A
I have attached dmesg.txt
The machine is
DMI: ASUSTeK Computer INC. 1015PEM/1015PE, BIOS 1202 04/13/2011

Created attachment 306308 [details]
dmesg.txt
(In reply to doru iorgulescu from comment #8)
> The drive is SSD R3SL240G, O1015A
> I have attached dmesg.txt
> The machine is
> DMI: ASUSTeK Computer INC. 1015PEM/1015PE, BIOS 1202 04/13/2011

What chipset is this (lspci -n)? I need the AHCI adapter type or PCI vendor/model at least, not the entire machine. Also, for the drive, which vendor? Please give precise details to help narrow this down.

Created attachment 306309 [details]
lspci-v.txt
I have attached lspci-v.txt
The SSD is
Model Family: Silicon Motion based SSDs
Device Model: R3SL240G
Serial Number: E201604190080002
Firmware Version: O1015A
User Capacity: 240,057,409,536 bytes [240 GB]

Hello Doru,

So you say that v6.8 works but v6.9 does not. There were not a lot of patches that went into drivers/ata.

The main suspect would be that your SSD (AMD Radeon R3SL240G) has broken LPM, as I suspect that your AHCI controller:
00:17.0 SATA controller: Intel Corporation C620 Series Chipset Family SATA Controller [AHCI mode] (rev 09) (prog-if 01 [AHCI 1.0])
should be able to handle LPM properly.

Could you see if supplying:
libata.force=nolpm
on the kernel command line solves your problem?

Could you please also separately try with:
libata.force=noncq
(without specifying libata.force=nolpm)

Hey Niklas, I'm currently having a similar issue as Doru reported on this thread and using libata.force=nolpm as a kernel command line fixed a similar issue for my machine,

This is my hardware:

Motherboard = ASUS M5A78L-M/USB3

SSD = Crucial BX500 240GB running in SATA 2 due to motherboard limitations

More info about it:

Model Number: CT240BX500SSD1
Serial Number: 1903E16E4885
Firmware Revision: M6CR013
Transport: Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0

00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (prog-if 01 [AHCI 1.0])
  Subsystem: ASUSTeK Computer Inc. Device 8389
  Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
  Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
  Latency: 64, Cache Line Size: 64 bytes
  Interrupt: pin A routed to IRQ 22
  NUMA node: 0
  Region 0: I/O ports at c000 [size=8]
  Region 1: I/O ports at b000 [size=4]
  Region 2: I/O ports at a000 [size=8]
  Region 3: I/O ports at 9000 [size=4]
  Region 4: I/O ports at 8000 [size=16]
  Region 5: Memory at fe9ffc00 (32-bit, non-prefetchable) [size=1K]
  Capabilities: [60] Power Management version 2
    Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
    Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
  Capabilities: [70] SATA HBA v1.0 InCfgSpace
  Kernel driver in use: ahci

This is what journalctl says whenever I run 6.9 with LPM enabled:

ata1.00: status: { DRDY }
I/O error, dev sda, sector 11509488 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
ata1.00: exception Emask 0x10 SAct 0x0 SErr 0x40d0002 action 0xe frozen
ata1.00: irq_stat 0x00000040, connection status changed
ata1: SError: { RecovComm PHYRdyChg CommWake 10B8B DevExch }
ata1.00: failed command: READ DMA
ata1.00: cmd c8/00:10:f0:9e:af/00:00:00:00:00/e0 tag 13 dma 8192 in
         res 50/00:00:00:30:f2/00:00:1b:00:00/e0 Emask 0x10 (ATA bus error)

I thought that I had a bad SATA cable or port, but that was not the case since 6.8 is completely fine.

I hope this extra information is helpful in any way, shape or form to diagnose why it's happening on certain SSDs or AHCI controllers.

Thanks in advance.
Regards

(In reply to Aarrayy from comment #15)
> Hey Niklas, I'm currently having a similar issue as Doru reported on this
> thread and using libata.force=nolpm as a kernel command line fixed a similar
> issue for my machine,
>
> This is my hardware:
>
> Motherboard = ASUS M5A78L-M/USB3
>
> SSD = Crucial BX500 240GB running in SATA 2 due to motherboard limitations
>
> More info about it:
>
> Model Number: CT240BX500SSD1
> Serial Number: 1903E16E4885
> Firmware Revision: M6CR013
> Transport: Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
>
> 00:11.0 SATA controller: Advanced Micro Devices, Inc. [AMD/ATI] SB7x0/SB8x0/SB9x0 SATA Controller [AHCI mode] (prog-if 01 [AHCI 1.0])
>   Subsystem: ASUSTeK Computer Inc. Device 8389
>   Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
>   Status: Cap+ 66MHz+ UDF- FastB2B- ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>   Latency: 64, Cache Line Size: 64 bytes
>   Interrupt: pin A routed to IRQ 22
>   NUMA node: 0
>   Region 0: I/O ports at c000 [size=8]
>   Region 1: I/O ports at b000 [size=4]
>   Region 2: I/O ports at a000 [size=8]
>   Region 3: I/O ports at 9000 [size=4]
>   Region 4: I/O ports at 8000 [size=16]
>   Region 5: Memory at fe9ffc00 (32-bit, non-prefetchable) [size=1K]
>   Capabilities: [60] Power Management version 2
>     Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
>     Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
>   Capabilities: [70] SATA HBA v1.0 InCfgSpace
>   Kernel driver in use: ahci
>
> This is what journalctl says whenever I run 6.9 with LPM enabled:
>
> ata1.00: status: { DRDY }
> I/O error, dev sda, sector 11509488 op 0x0:(READ) flags 0x80700 phys_seg 1 prio class 0
> ata1.00: exception Emask 0x10 SAct 0x0 SErr 0x40d0002 action 0xe frozen
> ata1.00: irq_stat 0x00000040, connection status changed
> ata1: SError: { RecovComm PHYRdyChg CommWake 10B8B DevExch }
> ata1.00: failed command: READ DMA
> ata1.00: cmd c8/00:10:f0:9e:af/00:00:00:00:00/e0 tag 13 dma 8192 in
>          res 50/00:00:00:30:f2/00:00:1b:00:00/e0 Emask 0x10 (ATA bus error)
>
> I thought that I had a bad SATA cable or port, but that was not the case since
> 6.8 is completely fine.
>
> I hope this extra information is helpful in any way, shape or form to
> diagnose why it's happening on certain SSDs or AHCI controllers.
>
> Thanks in advance.
> Regards

Btw I already tried libata.force=noncq without also specifying libata.force=nolpm, but this regression keeps happening; only disabling LPM helped

disabling LPM in a similar way:
https://lore.kernel.org/all/lsq.1527677561.498932115@decadent.org.uk/
Thanks,
Regards

Could you please send a dmesg of a successful boot?

Also, do you have a password protected drive?

I have attached dmesg.txt
No password protected drive
Thanks,
Regards

The dmesg that is attached is from a non-working boot.

Could you please attach a dmesg from a working boot?
(i.e. with libata.force=nolpm specified on the kernel command line.)

I don't know how to specify
libata.force=nolpm on the kernel command line
I have grub2
Thank you
Regards

Could you please post the dmesg from a boot where LPM is disabled using:
"""
disabling LPM in a similar way:
https://lore.kernel.org/all/lsq.1527677561.498932115@decadent.org.uk/
Thanks,
Regards
"""
then?
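Editor's aside on the journalctl output quoted above: the hex value in `SErr 0x40d0002` and the names in `SError: { RecovComm PHYRdyChg CommWake 10B8B DevExch }` describe the same set of bits. The following is a minimal sketch of that correspondence, not the kernel's decoder. The bit positions are an assumption based on the usual SATA SError register layout, and the table deliberately lists only the five flags that appear in this log; any other set bit is reported by number.

```python
# Assumed bit positions for the five SError flags seen in the log above.
# This is NOT a complete table of the SError register.
SERROR_BITS = {
    1: "RecovComm",
    16: "PHYRdyChg",
    18: "CommWake",
    19: "10B8B",
    26: "DevExch",
}


def decode_serror(value: int) -> list[str]:
    """Return a name for every set bit, lowest bit first."""
    names = []
    for bit in range(32):
        if value & (1 << bit):
            # Bits outside the small table fall back to a generic label.
            names.append(SERROR_BITS.get(bit, f"bit{bit}"))
    return names


print(decode_serror(0x40D0002))
```

Under those assumptions the value from the log decodes to the same five names the kernel printed, all of which relate to the link dropping and re-establishing, which fits a drive that misbehaves when link power management is active.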
(In reply to doru iorgulescu from comment #22)
> I don't know how to specify
> libata.force=nolpm on the kernel command line
> I have grub2
> Thank you
> Regards

You'll have to edit the following file with sudo: /etc/default/grub
and then look for the line that says: GRUB_CMDLINE_LINUX_DEFAULT=
Between those single or double quotes, at the end, add libata.force=nolpm preceded by a space.

For example:

GRUB_CMDLINE_LINUX_DEFAULT="sysrq_always_enabled nmi_watchdog=0 usbcore.autosuspend=-1 usbhid.mousepoll=1 amdgpu.ppfeaturemask=0xfffd3fff libata.force=nolpm"

Afterwards you'll need to regenerate your grub entries; run the following command: update-grub

Side note: Don't add any of the extra parameters that I have shown above in CMDLINE since they're only for demonstration purposes; only include libata.force=nolpm or libata.force=noncq as Niklas suggested.

(In reply to Aarrayy from comment #24)
> (In reply to doru iorgulescu from comment #22)
> > I don't know how to specify
> > libata.force=nolpm on the kernel command line
> > I have grub2
> > Thank you
> > Regards
>
> You'll have to edit the following file with sudo: /etc/default/grub
> and then look for the line that says: GRUB_CMDLINE_LINUX_DEFAULT=
> Between those single or double quotes, at the end, add libata.force=nolpm
> preceded by a space.
>
> For example:
>
> GRUB_CMDLINE_LINUX_DEFAULT="sysrq_always_enabled nmi_watchdog=0
> usbcore.autosuspend=-1 usbhid.mousepoll=1 amdgpu.ppfeaturemask=0xfffd3fff
> libata.force=nolpm"
>
> Afterwards you'll need to regenerate your grub entries; run the following
> command: update-grub
>
> Side note: Don't add any of the extra parameters that I have shown above in
> CMDLINE since they're only for demonstration purposes; only include
> libata.force=nolpm or libata.force=noncq as Niklas suggested.

You don't need to edit /etc/default/grub, you can just press 'e' while in grub, add parameters to the kernel command line and then press ctrl+x to boot.

See:
https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html/managing_monitoring_and_updating_the_kernel/configuring-kernel-command-line-parameters_managing-monitoring-and-updating-the-kernel#changing-kernel-command-line-parameters-temporarily-at-boot-time_configuring-kernel-command-line-parameters

Aarrayy, perhaps you could upload your dmesg?

Created attachment 306320 [details]
successful boot with libata.force=nolpm
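Editor's aside on the /etc/default/grub edit described in the comments above: the change amounts to appending one space-separated token inside the quotes of the GRUB_CMDLINE_LINUX_DEFAULT line. Below is a small sketch of that text transformation only. The function name `add_kernel_param` is hypothetical, it works on a string rather than touching the real file, and regenerating the grub entries (update-grub, as mentioned in the thread) would still be a separate manual step.

```python
import re

# Matches the whole GRUB_CMDLINE_LINUX_DEFAULT line, single or double quoted.
_CMDLINE_RE = re.compile(
    r'^(GRUB_CMDLINE_LINUX_DEFAULT=)(["\'])(.*)\2[ \t]*$', re.MULTILINE
)


def add_kernel_param(grub_text: str, param: str) -> str:
    """Append `param` to GRUB_CMDLINE_LINUX_DEFAULT unless it is already there."""

    def repl(match: re.Match) -> str:
        key, quote, value = match.group(1), match.group(2), match.group(3)
        params = value.split()
        if param not in params:
            params.append(param)
        return f"{key}{quote}{' '.join(params)}{quote}"

    return _CMDLINE_RE.sub(repl, grub_text)


# Illustrative input, not taken from anyone's machine in this thread.
sample = 'GRUB_DEFAULT=0\nGRUB_CMDLINE_LINUX_DEFAULT="quiet splash"\n'
print(add_kernel_param(sample, "libata.force=nolpm"))
```

Running the function twice leaves the line unchanged the second time, so the parameter is never duplicated. For a one-off test, editing the entry from the grub menu as described above avoids changing the file at all.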
Thank you, I will try
Regards

Aarrayy, could you please also upload your dmesg for the non-working case?

You seem to have 3 drives. Is it all 3 drives that are failing with kernel v6.9?

I only have one drive with Linux (Crucial BX500) and another HDD for Windows 10. Three? That's weird.

Just to be clear, the drive works "fine" and it manages to boot Linux fine, but dmesg and journalctl get spammed by those ata errors which I showed above.

Performance also seems to be affected, since on a benchmark of a job Random 4 KiB 1 queue 1 thread, transfer speeds went from 37MB/s average to 4.34MB/s. Without LPM, performance is as usual and normal on this SSD.

I attached a new dmesg with LPM enabled on 6.9

Created attachment 306322 [details]
Boot with LPM enabled
I have one SSD drive
It is failing with Linux kernel 6.9.0 and 6.10.0-rc1
Regards

Aarrayy, does:

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 4f35aab81a0a..78f08d52f364 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4138,6 +4138,7 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {

        /* Crucial BX100 SSD 500GB has broken LPM support */
        { "CT500BX100SSD1", NULL, ATA_HORKAGE_NOLPM },
+       { "CT240BX500SSD1", NULL, ATA_HORKAGE_NOLPM },

        /* 512GB MX100 with MU01 firmware has both queued TRIM and LPM issues */
        { "Crucial_CT512MX100*", "MU01", ATA_HORKAGE_NO_NCQ_TRIM |

Solve the problem for you?

Also, could you please upload the output of:
$ sudo hdparm -I /dev/sdX
For all your SATA devices.

doru, does:

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 4f35aab81a0a..897af12c432f 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4155,6 +4155,9 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
                ATA_HORKAGE_ZERO_AFTER_TRIM |
                ATA_HORKAGE_NOLPM },

+       /* AMD Radeon devices with broken LPM support */
+       { "R3SL240G", NULL, ATA_HORKAGE_NOLPM },
+
        /* These specific Samsung models/firmware-revs do not handle LPM well */
        { "SAMSUNG MZMPC128HBFU-000MV", "CXM14M1Q", ATA_HORKAGE_NOLPM },
        { "SAMSUNG SSD PM830 mSATA *", "CXM13D1Q", ATA_HORKAGE_NOLPM },

Solve the problem for you?

Also, could you please upload the output of:
$ sudo hdparm -I /dev/sdX
For all your SATA devices.

Aarrayy:

Yes, I meant two drives, not three, sorry.

There should be read I/Os issued to the drive on boot up, to perform the partition scanning, regardless of whether you have Windows or Linux on the drive.

[ 1.162789] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 1.163569] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 1.166455] ata1.00: ATA-9: CT240BX500SSD1, M6CR013, max UDMA/133
[ 1.168006] ata1.00: 468862128 sectors, multi 1: LBA48 NCQ (depth 32), AA
[ 1.168810] ata3.00: ATA-8: ST320LM001 HN-M320MBB, 2AR10002, max UDMA/133
[ 1.170226] ata3.00: 625142448 sectors, multi 16: LBA48 NCQ (depth 32), AA

We can see that the ST320LM001 works fine (no command timeouts in dmesg), but the CT240BX500SSD1, which is connected to the same AHCI controller, gives a bunch of errors.

This makes me suspect a broken LPM implementation in the CT240BX500SSD1 drive.

Yep, maybe it was broken since a long time ago but somehow the ATA driver didn't catch it until the new changes in 6.9.

I'll compile a kernel with that patch, but I assume it's going to work since it's disabling LPM.

Also in a few minutes I'll upload hdparm's output

Created attachment 306371 [details]
HDPARM with the patch
Created attachment 306372 [details]
dmesg after patch
Thanks Niklas, it's now fixed and I no longer need to have libata.force=nolpm as a kernel command line. I highly appreciate your attention and willingness to check this bug!
Regards

Thank You,
The patch

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index 4f35aab81a0a..897af12c432f 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4155,6 +4155,9 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
                ATA_HORKAGE_ZERO_AFTER_TRIM |
                ATA_HORKAGE_NOLPM },

+       /* AMD Radeon devices with broken LPM support */
+       { "R3SL240G", NULL, ATA_HORKAGE_NOLPM },
+
        /* These specific Samsung models/firmware-revs do not handle LPM well */
        { "SAMSUNG MZMPC128HBFU-000MV", "CXM14M1Q", ATA_HORKAGE_NOLPM },
        { "SAMSUNG SSD PM830 mSATA *", "CXM13D1Q", ATA_HORKAGE_NOLPM },

Is OK !!
Must be ported to Linux Kernel 6.9 and 6.10
Linux version 6.10.0-rc2 (root@mirela4) (gcc (Debian 13.2.0-25) 13.2.0, GNU ld (GNU Binutils for Debian) 2.42) #1 SMP PREEMPT_DYNAMIC Thu May 30 09:09:47 EEST 2024
I have attached dmesg-amd.txt
Thank You
Regards

Created attachment 306375 [details]
dmesg-amd.txt
Thank you guys!

doru, could you please upload the output of:
$ sudo hdparm -I /dev/sda
?

Created attachment 306377 [details]
hdparam-i.txt
I have submitted hdparam-i.txt
Thank you
Regards

I have compiled Linux kernel 6.9.3 with the patch applied
Linux version 6.9.3 (root@mirela4) (gcc (Debian 13.2.0-25) 13.2.0, GNU ld (GNU Binutils for Debian) 2.42) #1 SMP PREEMPT_DYNAMIC Thu May 30 13:37:47 EEST 2024
Is OK !
When will you send the patch to Linux kernel 6.9 and 6.10 ?
I attach dmesg693.txt
Thank You
Regards

Created attachment 306378 [details]
dmesg693.txt
Doru,

I intend to send it out today or tomorrow.

Doru, could you please send the output of
hdparm -I /dev/sda
that is -I (capital i)

The output you uploaded was from:
hdparm -i
which unfortunately is not of much help.

Created attachment 306382 [details]
hdparam-I.txt
I have uploaded hdparam-I.txt
Linux Kernel 6.10.0-rc1
Thank You
Regards

Applied, thanks!

[1/1] ata: libata-core: Add ATA_HORKAGE_NOLPM for AMD Radeon S3 SSD
      commit: 473880369304cfd4445720cdd8bae4c6f1e16e60

Best regards,

Thank You
Regards

Created attachment 306390 [details]
dmesg-610-rc2.txt
I have uploaded dmesg-6.10-rc2.txt
For Linux kernel 6.10.0-rc2 with patch applied
Is OK!
Thank You
Regards

Today the patch is not applied to Linux kernel 6.10.0-rc1 and 6.9.3

Applied, thanks!
[1/1] ata: libata-core: Add ATA_HORKAGE_NOLPM for AMD Radeon S3 SSD
      commit: 473880369304cfd4445720cdd8bae4c6f1e16e60
Best regards,

Thank You
Regards
??????????????????????

Today the patch is not applied to Linux kernel 6.10.0-rc1 and 6.9.3

Applied, thanks!
[1/1] ata: libata-core: Add ATA_HORKAGE_NOLPM for AMD Radeon S3 SSD
      commit: 473880369304cfd4445720cdd8bae4c6f1e16e60
Best regards,

Thank You
Regards
??????????????????????
Please apply this patch
Thank You
Regards

Today the patch was applied on Linux kernel 6.10.0-rc2 !
Thank You
Regards

I have compiled Linux kernel 6.9.4
-rw-rw-r-- 1 mirela mirela 237597394 Jun 3 15:44 linux-stable-rc-linux-6.9.y.tar.gz
Greg Kroah-Hartman has not applied the patch
+ /* AMD Radeon devices with broken LPM support */
+ { "R3SL240G", NULL, ATA_HORKAGE_NOLPM },
+
I have applied it manually
I have uploaded dmesg694.txt
Thank You
Regards

Created attachment 306405 [details]
dmesg694.txt
Hello, I have a similar issue. Performance dropped with Kernel > 6.9.0. I have a CT1000BX500SSD1; downgrading to 6.8.9 fixes the stability

(In reply to Tkd-Alex from comment #59)
> Hello, I have a similar issue. Performance dropped with Kernel > 6.9.0. I have a
> CT1000BX500SSD1; downgrading to 6.8.9 fixes the stability

Hello Tkd-Alex,

If you have a CT1000BX500SSD1, just like Aarrayy, then that should be fixed in:
https://github.com/torvalds/linux/commit/86aaa7e9d641c1ad1035ed2df88b8d0b48c86b30

Which was first included in 6.10-rc2.

Could you please try 6.10-rc5 and see if your issue is still there?

This fix has also been backported to 6.9.xx and was first included in 6.9.6, so you should be able to try that as well.

Thank you. I will try 6.10-rc5.
Please accept my apologies, but does the commit https://github.com/torvalds/linux/commit/86aaa7e9d641c1ad1035ed2df88b8d0b48c86b30 involve only the 240 GB version?
Probably LPM should be disabled for all variants, CT*BX500SSD1?
My serial is different from the one listed in libata-core.c

Aha, the serial was not exactly the same.

If you know how to compile your own kernel, perhaps you could try the following patch:

diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c
index e1bf8a19b3c8..1040b684595e 100644
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -4137,6 +4137,7 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = {
        { "PIONEER BD-RW BDR-205", NULL, ATA_HORKAGE_NOLPM },

        /* Crucial devices with broken LPM support */
+       { "CT1000BX500SSD1", NULL, ATA_HORKAGE_NOLPM },
        { "CT500BX100SSD1", NULL, ATA_HORKAGE_NOLPM },
        { "CT240BX500SSD1", NULL, ATA_HORKAGE_NOLPM },

While I like your suggestion, it would potentially enable the quirk on some Crucial drive models that might actually have working LPM.
(Even though I agree that most likely all Crucial devices will have the problem.)

We would need both:
"CT*BX500SSD1" and "CT*BX100SSD1"
or
"CT*BX*SSD1"

Damien, thoughts?
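Editor's aside on the pattern discussion above: the trade-off between exact model strings and wildcards such as "CT*BX500SSD1" can be tried out in a few lines. This is only a sketch, assuming the blocklist compares model strings with shell-style globs, as the existing "Crucial_CT512MX100*" entry suggests. Python's `fnmatchcase` stands in for whatever matcher the kernel uses, and `quirks_for`, `EXACT`, `WILDCARD`, and the "FWREV" firmware string are hypothetical names made up for the illustration.

```python
from fnmatch import fnmatchcase

NOLPM = "NOLPM"  # stand-in for ATA_HORKAGE_NOLPM

# (model pattern, firmware pattern or None, quirk); illustrative tables only.
EXACT = [
    ("CT500BX100SSD1", None, NOLPM),
    ("CT240BX500SSD1", None, NOLPM),
]
WILDCARD = [
    ("CT*BX500SSD1", None, NOLPM),
    ("CT*BX100SSD1", None, NOLPM),
]


def quirks_for(model: str, firmware: str, table) -> set[str]:
    """Collect the quirks of every entry whose patterns match the device."""
    found = set()
    for model_pat, fw_pat, quirk in table:
        if not fnmatchcase(model, model_pat):
            continue
        # A firmware pattern of None means "any firmware revision".
        if fw_pat is not None and not fnmatchcase(firmware, fw_pat):
            continue
        found.add(quirk)
    return found


# "FWREV" is a placeholder; the 1 TB drive's firmware is not given in the thread.
print(quirks_for("CT1000BX500SSD1", "FWREV", EXACT))
print(quirks_for("CT1000BX500SSD1", "FWREV", WILDCARD))
```

With the exact table the 1 TB model gets no quirk, which matches what Tkd-Alex observed, while the wildcard table covers every capacity of the BX500 and BX100 families, including any whose LPM may work correctly. That is the concern raised in the comment above.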
I sent a patch:
https://lore.kernel.org/linux-ide/20240624132729.3001688-2-cassel@kernel.org/T/#u

Testing would be appreciated.

Created attachment 306505 [details]
dmesg on a kernel built with latest patch submitted by Niklas
I can confirm that your patch still works as intended, Niklas! In the log there is a message with: "ata1.00: LPM support broken, forcing max_power"
Aarrayy, thank you for the feedback. It is appreciated :) I didn't want to re-introduce problems for you. |