Bug 218198
Summary: | Suspend/Resume Regression with attached ATA devices | ||
---|---|---|---|
Product: | SCSI Drivers | Reporter: | Dieter Mummenschanz (dmummenschanz) |
Component: | Other | Assignee: | scsi_drivers-other |
Status: | NEW --- | ||
Severity: | high | CC: | Niklas.Cassel, regressions |
Priority: | P3 | ||
Hardware: | Intel | ||
OS: | Linux | ||
Kernel Version: | 6.7-rc1 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | d035e4eb38b3ea5ae9ead342f888fd3c394b0fe0 |
Attachments: |
dmesg partial log
dmesg partial log stuck at pc2 dmesg 6.7-rc3 with device info debug logs from kernel 6.6 hdparm -I output for sda hdparm -I output for sdb dmesg with no pc8 transition dmesg with pc8 transition kernel config 6.7-rc |
Description
Dieter Mummenschanz
2023-11-28 07:06:45 UTC
On 11/28/23 16:06, bugzilla-daemon@kernel.org wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=218198 > > Bug ID: 218198 > Summary: Suspend/Resume Regression with attached ATA devices > Product: SCSI Drivers > Version: 2.5 > Hardware: Intel > OS: Linux > Status: NEW > Severity: high > Priority: P3 > Component: Other > Assignee: scsi_drivers-other@kernel-bugs.osdl.org > Reporter: dmummenschanz@web.de > Regression: No > > Hello, > > the following commit from Kernel 6.7-rc1: > > > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/diff/drivers/ata/libata-core.c?id=d035e4eb38b3ea5ae9ead342f888fd3c394b0fe0 > > introduced a regression on my system where after successful resuming the CPU > won't enter lower Package Sates below pc2 even after letting it sit for 15+ > minutes. Reverting this commit fixed the issue on my system with two ata > drives. Anyone experiencing the same issue? > > I'm happy to try any troubleshooting steps or provide more details if needed. Could you try adding this: diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c index 09ed67772fae..8d4871fff099 100644 --- a/drivers/ata/libata-core.c +++ b/drivers/ata/libata-core.c @@ -6185,6 +6185,9 @@ void ata_pci_shutdown_one(struct pci_dev *pdev) for (i = 0; i < host->n_ports; i++) { struct ata_port *ap = host->ports[i]; + /* Wait for EH to complete before freezing the port */ + ata_port_wait_eh(ap); + ap->pflags |= ATA_PFLAG_FROZEN; /* Disable port interrupts */ and see if this changes anything ? Beside that, I am at a loss seeing what is going on. The commit you mention essentially reverted an earlier change that is not necessary, bringing back ata_pci_shutdown_one() to the how it was for ages... When you say "successful resuming", what exactly are you talking about ? System resume from suspend-to-ram ? from hibernation (suspend to disk) ? I can always revert this revert, but I would rather understand why that is needed. Do you see any suspicious libata EH activity in dmesg ? Also, are the lower Package Sates transitions automatic or driven by the kernel PM core ? I do not know that. If it is the latter, does the pc2 state also include adapters ? Isn't it limited to the CPU power state ? I ask because if it is and libata EH is in a bad loop constantly running after resume, that would explain why you cannot reach a lower CPU power state. Thanks for the swift reply. I've applied your patch. Booted up my machine and waited until it transitions into lower package states (pc8 at the lowest). After that I closed the laptop LID and let the machine suspend to RAM (S3). After that I reopened the LID and gave the machine 1-3 minutes time to transition to lower package states which it now does. I've attached the dmesg part including your patch when the machine enters suspend. One thing is odd though: [ 109.424369] ata5.00: qc timeout after 5000 msecs (cmd 0xe0) [ 109.424397] ata5.00: STANDBY IMMEDIATE failed (err_mask=0x4) this shouldn't be there, right? Regarding automatic transitioning I'm not sure how this works. However even though I've set CONFIG_SATA_MOBILE_LPM_POLICY=3 in the kernel config, I have to call an init script explicitly forcing the scsi host to use low power when idle: for foo in /sys/class/scsi_host/host*/link_power_management_policy; do echo med_power_with_dipm > $foo; done Otherwise the machine is stuck at PC2 state always. So we have a fix I guess? Anyway if there is anything alse I can try or provide you with more debug output please let me know. Created attachment 305490 [details]
dmesg partial log
Update: Please disregard my last report. After a second longer suspend I'm stuck at pc2 again :(. logs attached. Created attachment 305491 [details]
dmesg partial log stuck at pc2
On Tue, Nov 28, 2023 at 08:24:01AM +0000, bugzilla-daemon@kernel.org wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=218198 > > --- Comment #2 from Dieter Mummenschanz (dmummenschanz@web.de) --- > Thanks for the swift reply. > > I've applied your patch. Booted up my machine and waited until it transitions > into lower package states (pc8 at the lowest). After that I closed the laptop > LID and let the machine suspend to RAM (S3). After that I reopened the LID > and > gave the machine 1-3 minutes time to transition to lower package states which > it now does. > I've attached the dmesg part including your patch when the machine enters > suspend. One thing is odd though: > > [ 109.424369] ata5.00: qc timeout after 5000 msecs (cmd 0xe0) > [ 109.424397] ata5.00: STANDBY IMMEDIATE failed (err_mask=0x4) > > this shouldn't be there, right? > Hello Dieter, I took a look at your logs, but they are very stripped. It would be nice if you could test with latest v6.7-rcX. And then provide these messages at boot: [ 50.101909] ahci 0000:00:17.0: flags: 64bit ncq sntf pm led clo only pio slum part ems deso sadm sds apst [ 50.375109] ata10: SATA max UDMA/133 abar m524288@0xa5700000 port 0xa5700480 irq 270 lpm-pol 4 [ 50.783496] ata10.00: Features: Dev-Sleep NCQ-sndrcv NCQ-prio Because, for devsleep to be enabled, you need support in: 1) The SATA device 2) The SATA controller 3) The SATA port (even if the controller supports devsleep, not all SATA ports have to support it) For devsleep to get enabled, you need "sadm sds" in the SATA controller print, and "Dev-Sleep" in the SATA device print. Unfortunately, there does not seem to be a print for the port. Additionally, your lpm-policy (lpm-pol) has to be either ATA_LPM_MIN_POWER or ATA_LPM_MIN_POWER_WITH_PARTIAL (i.e. lpm-pol has to print either 4 or 5). > Regarding automatic transitioning I'm not sure how this works. However even > though I've set CONFIG_SATA_MOBILE_LPM_POLICY=3 in the kernel config, I have > to > call an init script explicitly forcing the scsi host to use low power when > idle: Note that even if you have LPM_POLICY=3 (ATA_LPM_MED_POWER_WITH_DIPM) in your Kconfig, ahci_update_initial_lpm_policy() will possibly override this by default to either ATA_LPM_MIN_POWER_WITH_PARTIAL or ATA_LPM_MIN_POWER, see: https://github.com/torvalds/linux/blob/master/drivers/ata/ahci.c#L1639 The best way is to show the: [ 50.375109] ata10: SATA max UDMA/133 abar m524288@0xa5700000 port 0xa5700480 irq 270 lpm-pol 4 print as this is printed after ahci_update_initial_lpm_policy(). The reason why many platforms do this override, is because: "One of the requirement for modern x86 system to enter lowest power mode (SLP_S0) is SATA IP block to be off. This is true even during when platform is suspended to idle and not only in opportunistic (runtime) suspend." "SATA IP block doesn't get turned off till SATA is in DEVSLP mode. Here user has to either use scsi-host sysfs or tools like powertop to set the sata-host link_power_management_policy to min_power." See: https://github.com/torvalds/linux/commit/b1a9585cc396cac5a9e5a09b2721f3b8568e62d0 > > for foo in /sys/class/scsi_host/host*/link_power_management_policy; > do echo med_power_with_dipm > $foo; > done I would really not recommend you doing this, because when you force set lpm policy via sysfs, ahci_update_initial_lpm_policy() is not called, so if your platform requires ATA_LPM_MIN_* to enter lower power states, you forcing lpm-policy to ATA_LPM_MED_POWER_WITH_DIPM will ensure that you never enter lower power states. Looking at your logs, we see: [ 2022.700556] ahci 0000:00:17.0: port does not support device sleep Which comes from: https://github.com/torvalds/linux/blob/master/drivers/ata/libahci.c#L2260 So it appears that your port does not support devsleep... PxDEVSLP.DSP (in AHCI specification) is the bit that determines if devsleep is supported for a specific port. This bit is initialized by BIOS. So this could be a BIOS bug... But you said that it works if you revert Damien's patch... So the question is, is PxDEVSLP.DSP always 0? (Even on the first boot, before you have done any suspend/resume). If so, it could be a BIOS bug... Perhaps you could test this patch: And show us all the prints, and tell us which prints are before/after the suspend/resume. --- a/drivers/ata/libahci.c +++ b/drivers/ata/libahci.c @@ -2255,6 +2255,9 @@ static void ahci_set_aggressive_devslp(struct ata_port *ap, bool sleep) int rc; unsigned int err_mask; + dev_info(ap->host->dev, "setting devsleep to: %d port support %d\n", + sleep, readl(port_mmio + PORT_DEVSLP)); + devslp = readl(port_mmio + PORT_DEVSLP); if (!(devslp & PORT_DEVSLP_DSP)) { dev_info(ap->host->dev, "port does not support device sleep\n"); You mentioned that it works when you revert Damien's patch. It could be interesting to see these prints, before/after the suspend/resume both with and without the revert. I would expect us to be able to read PxDEVSLP even when the SATA device is in suspend state... I could imagine that we could get a bogus value back from the read if the from the SATA controller itself is in a suspend state, but I don't see how Damien's patch that you bisected to: https://github.com/torvalds/linux/commit/fd3a6837d8e18cb7be80dcca1283276290336a7a changed any of that. It does touch ata_pci_shutdown_one(), which should only get called on shutdown... not suspend/resume AFAICT. So the fact that this patch changes things for you is weird in the first place. Damien, is it possible that ata_pci_shutdown_one() is incorrectly called during suspend/resume? Got any ideas? Kind regards, Niklas (In reply to Niklas.Cassel from comment #6) Hello Niklas, thanks for looking into this. > It would be nice if you could test with latest v6.7-rcX. Okay I took the latest 6.7-rc3 from Linux's git and included your patch. Booted up my Laptop, put it back to sleep and brought it up again. See attached log. I've disabed my link_power_management_policy override so all the time my system was stuck at pc2 package state according to powertop. > For devsleep to get enabled, you need "sadm sds" in the SATA controller Yep I can see that: [ 0.379531] ahci 0000:00:17.0: flags: 64bit ncq sntf stag pm clo only pio slum part ems sxs deso sadm sds apst > and "Dev-Sleep" in the SATA device print. I guess this is it? [ 1.031169] ata5.00: Features: Trust Dev-Sleep NCQ-sndrcv > Additionally, your lpm-policy (lpm-pol) has to be either ATA_LPM_MIN_POWER or > ATA_LPM_MIN_POWER_WITH_PARTIAL (i.e. lpm-pol has to print either 4 or 5). I'm afraid this is not the case here: [ 0.399005] ata1: SATA max UDMA/133 abar m2048@0x42233000 port 0x42233100 irq 125 lpm-pol 0 > Note that even if you have LPM_POLICY=3 (ATA_LPM_MED_POWER_WITH_DIPM) in your > Kconfig, ahci_update_initial_lpm_policy() will possibly override this by > default That would explain why I'm seeing "max_performance" in link_power_management_policy without overriding it. > The reason why many platforms do this override, is because: > "One of the requirement for modern x86 system to enter lowest power mode > (SLP_S0) is SATA IP block to be off. This is true even during when > platform is suspended to idle and not only in opportunistic (runtime) > suspend." > > "SATA IP block doesn't get turned off till SATA is in DEVSLP mode. Here > user has to either use scsi-host sysfs or tools like powertop to set > the sata-host link_power_management_policy to min_power." > > See: > https://github.com/torvalds/linux/commit/ > b1a9585cc396cac5a9e5a09b2721f3b8568e62d0 Thanks for the insight. > I would really not recommend you doing this, because when you force set > lpm policy via sysfs, ahci_update_initial_lpm_policy() is not called, > so if your platform requires ATA_LPM_MIN_* to enter lower power states, > you forcing lpm-policy to ATA_LPM_MED_POWER_WITH_DIPM will ensure that > you never enter lower power states. I understand however I do not know of any other way to enable lower package states on my machine. > So it appears that your port does not support devsleep... Anything we can do to preserve the old bwhaviour until 6.6? > PxDEVSLP.DSP (in AHCI specification) is the bit that determines if devsleep > is supported for a specific port. This bit is initialized by BIOS. > > So this could be a BIOS bug... > But you said that it works if you revert Damien's patch... Yes, I had no issues up to 6.7-rc1. > Perhaps you could test this patch: Yweah, see logs. > You mentioned that it works when you revert Damien's patch. > It could be interesting to see these prints, before/after the > suspend/resume both with and without the revert. I've attached the dmesg logs from kernel 6.6. Thanks again for your help Dieter Created attachment 305512 [details]
dmesg 6.7-rc3 with device info debug
Created attachment 305513 [details]
logs from kernel 6.6
Niklas Cassel <Niklas.Cassel@wdc.com> writes: > The reason why many platforms do this override, is because: > "One of the requirement for modern x86 system to enter lowest power mode > (SLP_S0) is SATA IP block to be off. This is true even during when > platform is suspended to idle and not only in opportunistic (runtime) > suspend." > > "SATA IP block doesn't get turned off till SATA is in DEVSLP mode. Here > user has to either use scsi-host sysfs or tools like powertop to set > the sata-host link_power_management_policy to min_power." What? I'm confused. Here are several things that come to mind that when taken together with this statement confuse me. Perhaps I am wrong about one or more of them? 1) DEVSLP is an odd thing they invented to have ACPI twiggle a GPIO bit to cut the power rails to CDROM drives that aren't in active use. 2) SATA ALPM has the SATA controller automatically transition the sata link to lower power states when it isn't being used, and back again on use. 3) The SATA controller can not be suspended until all of its ports are suspended. 4) Suspending a SATA port does not happen during ALPM, but rather only either during system suspend, or when you enable runtime pm on the disk and the ata port in sysfs ( both are disabled by default ). A quick google led me to this: https://smarthdd.com/device_sleep.htm Which indicates that this DEVSLP thing is now being used to send an OOB signal to a SATA SSD so that it can power down its PHY completely, but isn't that what ATA SLEEP is for? aka. hdparm -Y. That command tells the drive that it can power down its PHY and then the host can power down its PHY and it takes a sata link reset to wake the drive back up. The link reset is easy to detect with minimum circuitry that is far simpler than the full sata PHY. It sounds like somebody decided to hack an OOB signal into ALPM rather than use ATA SLEEP, but why? On Wed, Nov 29, 2023 at 01:47:16PM -0500, Phillip Susi wrote: > Niklas Cassel <Niklas.Cassel@wdc.com> writes: > > > The reason why many platforms do this override, is because: > > "One of the requirement for modern x86 system to enter lowest power mode > > (SLP_S0) is SATA IP block to be off. This is true even during when > > platform is suspended to idle and not only in opportunistic (runtime) > > suspend." > > > > "SATA IP block doesn't get turned off till SATA is in DEVSLP mode. Here > > user has to either use scsi-host sysfs or tools like powertop to set > > the sata-host link_power_management_policy to min_power." Note that I copied this text from the original patch authored by someone at Intel. I've seen similar claims from people at AMD. > > What? I'm confused. Here are several things that come to mind that > when taken together with this statement confuse me. Perhaps I am wrong > about one or more of them? > > 1) DEVSLP is an odd thing they invented to have ACPI twiggle a GPIO bit > to cut the power rails to CDROM drives that aren't in active use. > > 2) SATA ALPM has the SATA controller automatically transition the sata > link to lower power states when it isn't being used, and back again on > use. Yes, the HBA can initiate a transition to DevSleep automatically using ALPM, however, the specs says that: -DevSleep is disabled by the device on power up (and has to be enabled by a SET FEATURES command). -PxDEVSLP.ADSE (Aggressive Device Sleep Enable) for the port is 0 on reset (the kernel resets the HBA on boot, so ADSE will be disabled for the port). (DevSleep is never initiated by the device.) The kernel will only do: -SET FEATURES to enable the DevSleep feature on the device, and -Set PxDEVSLP.ADSE to 1 If _all_ of the following are true: 1) SADM and SDS bits are set in the HBA 2) the device reports that it supports DevSleep 3) PxDEVSLP.DSP is set for the port 4) the kernel LPM policy is either ATA_LPM_MIN_POWER or ATA_LPM_MIN_POWER_WITH_PARTIAL See: https://github.com/torvalds/linux/blob/v6.7-rc3/drivers/ata/libahci.c#L2322-L2324 This means that if any of 1-5 it not true, the HBA ALPM will never transition the device into DevSleep state. AFAICT, the HBA asserts the DEVSLP signal as long as being in DevSleep state. To exit from DevSleep state you can just issue an I/O like normal using PxCI, or write to PxCMD.ICC to force a state change. > 3) The SATA controller can not be suspended until all of its ports are > suspended. > > 4) Suspending a SATA port does not happen during ALPM, but rather only > either during system suspend, or when you enable runtime pm on the disk > and the ata port in sysfs ( both are disabled by default ). > > A quick google led me to this: https://smarthdd.com/device_sleep.htm > > Which indicates that this DEVSLP thing is now being used to send an OOB > signal to a SATA SSD so that it can power down its PHY completely, but > isn't that what ATA SLEEP is for? aka. hdparm -Y. That command tells > the drive that it can power down its PHY and then the host can power > down its PHY and it takes a sata link reset to wake the drive back up. > The link reset is easy to detect with minimum circuitry that is far > simpler than the full sata PHY. > > It sounds like somebody decided to hack an OOB signal into ALPM rather > than use ATA SLEEP, but why? I guess you need to ask Intel. I assume that their firmware simply requires the DEVSLP signal to be asserted in order to enter lower CPU power states. If you have a signal, it is easy for the HW logic to detect if the signal is set or not. If you just send a command to the device, if it not easy for HW logic to determine which state is in. It would need to read some registers or similar. Sounds way more complex than just having a logic gate. Kind regards, Niklas bugzilla-daemon@kernel.org writes: > I guess you need to ask Intel. > > I assume that their firmware simply requires the DEVSLP signal to be > asserted in order to enter lower CPU power states. I would say it's the hardware that requires the ata link to be powered down, not the firmware. They seem to have decided that the way to get the hardware to power down was to add a new state to ALPM instead of using runtime pm to actually disable the port. I would love to hear from Intel why they went this route. > If you just send a command to the device, if it not easy for HW logic > to determine which state is in. It would need to read some registers > or similar. Sounds way more complex than just having a logic gate. I'm not quite sure what you are getting at here. The point of the ATA SLEEP command is to inform the device that it should power down everything, including the SATA PHY, and only keep a simple logic gate active that can detect the big and obvious RESET signal from the host. It seems to accomplish the same goal as DEVSLP without requiring hacking in another logic line somewhere that wasn't meant for that purpose. Maybe it helps to mention that if I issue hdparm -Y /dev/sda && hdparm -Y /dev/sdb a couple of seconds after successful resume my system transitions down to pc8. hdparm manual doesn't explicitly tell what command is issued with the -Y switch. bugzilla-daemon@kernel.org writes: > a couple of seconds after successful resume my system transitions down to > pc8. > hdparm manual doesn't explicitly tell what command is issued with the -Y > switch. SLEEP. One problem with using it though is that several things tend to wake the drive up from it soon. I've had patches to address this for years. Maybe I'll finally get them merged soon. Those things are: 1) smartd or udisks2 polling the SMART status of the drive. They already issue CHECK POWER CONDITION first to see if the drive is in STANDBY and if so, don't bother with the SMART read. That avoids waking up a disk in STANDBY mode, but in SLEEP, you can't even ask the drive whether it's in standby without waking it up. 2) A mounted filesystem periodically issues a FLUSH CACHE command, even if nothing has been written. A drive in STANDBY just ignores it but in SLEEP, it wakes up. I would imagine at once you access the disk in some way and so it wakes up, you won't get back to pc8 again without another hdparm -Y? Is that correct? (In reply to Phillip Susi from comment #14) > bugzilla-daemon@kernel.org writes: > > > 2) A mounted filesystem periodically issues a FLUSH CACHE command, even > if nothing has been written. A drive in STANDBY just ignores it but in > SLEEP, it wakes up. > > I would imagine at once you access the disk in some way and so it wakes > up, you won't get back to pc8 again without another hdparm -Y? Is that > correct? No or at least I don't see this behaviour. However the drives occasional seem to have some trouble waking up after resume and hdparm -Y: [59984.831894] ata5.00: exception Emask 0x0 SAct 0x2000000 SErr 0x10000 action 0x6 [59984.831907] ata5.00: waking up from sleep [59984.831914] ata5: hard resetting link [59985.145261] ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300) [59985.148840] ata5.00: supports DRM functions and may not be fully accessible [59985.153322] ata5.00: supports DRM functions and may not be fully accessible [59985.156822] ata5.00: configured for UDMA/133 [59985.166903] ahci 0000:00:17.0: port does not support device sleep [59985.167045] ata5: EH complete [59985.167268] ata5.00: Enabling discard_zeroes_data When that happens the system is stuck in pc2/pc3 for a couple of minutes but then transitions back in partial to pc8. Regarding your patches: Anything I can test for you? :) bugzilla-daemon@kernel.org writes: > No or at least I don't see this behaviour. However the drives occasional seem > to have some trouble waking up after resume and hdparm -Y: That's strange. Once the drive is woken back up that should once again, prevent pc8, just like it did before being put to SLEEP. > [59984.831894] ata5.00: exception Emask 0x0 SAct 0x2000000 SErr 0x10000 > action > 0x6 > [59984.831907] ata5.00: waking up from sleep > [59984.831914] ata5: hard resetting link > [59985.145261] ata5: SATA link up 6.0 Gbps (SStatus 133 SControl 300) > [59985.148840] ata5.00: supports DRM functions and may not be fully > accessible > [59985.153322] ata5.00: supports DRM functions and may not be fully > accessible > [59985.156822] ata5.00: configured for UDMA/133 > [59985.166903] ahci 0000:00:17.0: port does not support device sleep > [59985.167045] ata5: EH complete > [59985.167268] ata5.00: Enabling discard_zeroes_data That's normal. The way to wake the drive up from SLEEP is to issue a link hard reset. That's just the kernel showing that it's doing that. At least the first part is normal, I'm not sure about the bit about DRM functions. The bit about the port not supporting device sleep seems to be your main issue with it not getting to pc8 using ALPM. > Regarding your patches: Anything I can test for you? :) Keep an eye on the libata list. Hopefully I'll clean them up and post them in the next few days. Initially it will just be to keep hdparm -Y from being woken up for silly reasons. Sorting out the bigger problems with runtime pm so that the kernel can automatically put things to sleep when idle is going to be a bigger issue than I first thought. What's the status of this? I have this in my list of tracked regressions and it seems nothing happened for about a week -- but I might have missed something? Or is this not considered a regression for some good reason? Hello Dieter, I'm really sorry for missing your reply. The updates from bugzilla were not set to the libata mailing list (only to the scsi mailing list). Please consider writing to the libata mailing list instead of using bugzilla, I'm quite sure that you will get more eyes on your problem that way. On Wed, Nov 29, 2023 at 06:10:20PM +0000, bugzilla-daemon@kernel.org wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=218198 > > --- Comment #7 from Dieter Mummenschanz (dmummenschanz@web.de) --- > (In reply to Niklas.Cassel from comment #6) > > Hello Niklas, > > thanks for looking into this. > > > It would be nice if you could test with latest v6.7-rcX. > > Okay I took the latest 6.7-rc3 from Linux's git and included your patch. > Booted > up my Laptop, put it back to sleep and brought it up again. See attached log. > > I've disabed my link_power_management_policy override so all the time my > system > was stuck at pc2 package state according to powertop. > > > For devsleep to get enabled, you need "sadm sds" in the SATA controller > > Yep I can see that: > [ 0.379531] ahci 0000:00:17.0: flags: 64bit ncq sntf stag pm clo only pio > slum part ems sxs deso sadm sds apst Ok, good 1/4. > > > and "Dev-Sleep" in the SATA device print. > > I guess this is it? > [ 1.031169] ata5.00: Features: Trust Dev-Sleep NCQ-sndrcv Yes, good, 2/4. > > > Additionally, your lpm-policy (lpm-pol) has to be either ATA_LPM_MIN_POWER > or > > ATA_LPM_MIN_POWER_WITH_PARTIAL (i.e. lpm-pol has to print either 4 or 5). > > I'm afraid this is not the case here: > [ 0.399005] ata1: SATA max UDMA/133 abar m2048@0x42233000 port 0x42233100 > irq 125 lpm-pol 0 It appears that you have built your kernel with: CONFIG_SATA_MOBILE_LPM_POLICY=0 All major distros use: CONFIG_SATA_MOBILE_LPM_POLICY=3 Could you please try with CONFIG_SATA_MOBILE_LPM_POLICY=3 ? If you build with CONFIG_SATA_MOBILE_LPM_POLICY=0, then set_lpm() in libata will never be called, which means that we will never enable (or disable devsleep), so you will use whatever your boot firmware has configured. (And boot firmware might have enabled devsleep in the device.) In your v6.7-rcX dmesg, we don't see any: "port does not support device sleep" or "setting devsleep to %d" (from my debug patch) prints in your dmesg from v6.7-rcX, so set_lpm() is never called at all, most likely because you have built with CONFIG_SATA_MOBILE_LPM_POLICY=0, so DevSleep could be enabled if platform firmware enabled it. In your v6.6 dmesg, we see: "port does not support device sleep" (but not "setting devsleep to %d" from my debug patch, so you probably forgot to apply it to your v6.6 kernel). However, the fact that you see "port does not support device sleep" suggests that you either built your kernel with CONFIG_SATA_MOBILE_LPM_POLICY set to something other than 0, or you have manually overriden the policy via sysfs. Note that since your platform firmware claims that the port does not support DevSleep (PxDEVSLP.DSP is set to 0), this means that ahci_set_aggressive_devslp() will return early: https://github.com/torvalds/linux/blob/v6.6/drivers/ata/libahci.c#L2258-L2262 So it will neither enable nor disable DevSleep in the device, so DevSleep could be enabled if platform firmware enabled it. In both of your cases, it looks like you should have DevSleep set to whatever platform firmware has set it to. But you say that, for these logs, v6.6 can enter low CPU power states, but v6.7-rcX can not? You can run: hdparm -I /dev/<your_device> to see if DEVSLP is enabled in your device (hdparm prints a * in front of the feature if the feature is enabled). It could also be worth checking your BIOS, I've seen some cases where you had to enable aggressive devsleep in BIOS for it to get enabled for the port. > > > Note that even if you have LPM_POLICY=3 (ATA_LPM_MED_POWER_WITH_DIPM) in > your > > Kconfig, ahci_update_initial_lpm_policy() will possibly override this by > > default > > That would explain why I'm seeing "max_performance" in > link_power_management_policy without overriding it. The sysfs reporting for LPM is really broken in libata... $ git grep max_performance drivers/ata/libata-sata.c drivers/ata/libata-sata.c: [ATA_LPM_UNKNOWN] = "max_performance", drivers/ata/libata-sata.c: [ATA_LPM_MAX_POWER] = "max_performance", So it reports "max_performance" both for LPM_POLICY == 0 (ATA_LPM_UNKNOWN) and LPM_POLICY == 1 (ATA_LPM_MAX_POWER). In your case, it is because of ATA_LPM_UNKNOWN, which means use whatever platform firmware has configured. ahci_update_initial_lpm_policy() will only override your policy if your LPM_POLICY >= 3, so it will not override it for your case. > > I would really not recommend you doing this, because when you force set > > lpm policy via sysfs, ahci_update_initial_lpm_policy() is not called, > > so if your platform requires ATA_LPM_MIN_* to enter lower power states, > > you forcing lpm-policy to ATA_LPM_MED_POWER_WITH_DIPM will ensure that > > you never enter lower power states. > > I understand however I do not know of any other way to enable lower package > states on my machine. I do not understand why that helps you. If you set: ATA_LPM_MED_POWER_WITH_DIPM (LPM_POLICY=3) on v6.6 in sysfs, you can enter low CPU power states? If you set CONFIG_SATA_MOBILE_LPM_POLICY=3 on v6.6, can you enter low CPU power states? Looking at AHCI 1.3.1 PM:Aggr: State PM:DevSleep is gated with: PxDEVSLP.ADSE = ‘1’ and CAP2.SDS = ‘1’ and CAP2.SADM = ‘1’ and PxDEVSLP.DSP = 1’ and PxSCTL.IPM != ‘4h’ and PxSCTL.IPM != ‘5h’ and PxSCTL.IPM != ‘6h’ and PxSCTL.IPM != ‘7h’ and CAP2.DESO = ‘0’ and pDitoTimeout = ‘1’ So AFAICT, ALPM by the HBA should never be able to transition to DevSleep if PxDSP.DSP == 0. Perhaps your platform actually does NOT support devsleep and simply requires the device to enter slumber or partial in order to enter the lower CPU power states? We could test this by you disabling devslp on the device via hdparm, and see if you can still enter the lower CPU power states. Kind regards, Niklas Hallo Niklas, thanks for getting back to me on that. > Please consider writing to the libata mailing list instead of using > bugzilla, I'm quite sure that you will get more eyes on your problem > that way. Could point me to that mailing list please. I can't seem to find it. Do you want me to repost the issue there or is it okay to just copy & past the messages from this bug? > It appears that you have built your kernel with: > CONFIG_SATA_MOBILE_LPM_POLICY=0 Definately not! I've attached my kernel config. LPM policy is definately set tp "3" CONFIG_SATA_MOBILE_LPM_POLICY=3. Thats puzzles me because after booting the link_power_management_policy is set to performance. Somehow the kernel overrides the config or uses parameters suggested by the BIOS I don't know. I've also tried to to switch the ASPM polity to Powersave instead of BIOS default. Didn't help. > In your v6.7-rcX dmesg, we don't see any: > "port does not support device sleep" or > "setting devsleep to %d" (from my debug patch) > prints in your dmesg from v6.7-rcX, so set_lpm() is never called at all, > most likely because you have built with CONFIG_SATA_MOBILE_LPM_POLICY=0, > so DevSleep could be enabled if platform firmware enabled it. The print is not called, right but not because of CONFIG_SATA_MOBILE_LPM_POLICY=0 since It is set to "3". However when I force link_power_management_policy to "echo med_power_with_dipm" the message appears in dmesg: [ 9.434663] ahci 0000:00:17.0: setting devsleep to: 0 port support 0 [ 9.434668] ahci 0000:00:17.0: port does not support device sleep [ 9.434674] ahci 0000:00:17.0: setting devsleep to: 0 port support 0 [ 9.434678] ahci 0000:00:17.0: port does not support device sleep so somehow the LPM Policy is either not set or reset by the kernel I assume. > However, the fact that you see "port does not support device sleep" > suggests that you either built your kernel with > CONFIG_SATA_MOBILE_LPM_POLICY set to something other than 0, > or you have manually overriden the policy via sysfs. > > Note that since your platform firmware claims that the port does > not support DevSleep (PxDEVSLP.DSP is set to 0), this means that > ahci_set_aggressive_devslp() will return early: > https://github.com/torvalds/linux/blob/v6.6/drivers/ata/libahci.c#L2258-L2262 > So it will neither enable nor disable DevSleep in the device, > so DevSleep could be enabled if platform firmware enabled it. > > > In both of your cases, it looks like you should have DevSleep set > to whatever platform firmware has set it to. > But you say that, for these logs, v6.6 can enter low CPU power states, > but v6.7-rcX can not? Not quite. In 6.6 the PC8 transition worked only when forcing link_power_management_policy to "echo med_power_with_dipm". The policy was still performance after bootup like it is in 6.7-rc. The only difference I see is that after resume my laptop won't enter PC8 again. For 6.7-rc I have to call hdparm -Y for sda and sdb after resume to enable the transition. > You can run: > hdparm -I /dev/<your_device> > to see if DEVSLP is enabled in your device (hdparm prints a * in front > of the feature if the feature is enabled). Attached to this bug for both (identical) drives. > It could also be worth checking your BIOS, I've seen some cases where you had > to enable aggressive devsleep in BIOS for it to get enabled for the port. My Laptop BIOS (INSYDE) does not offer any knobs to tune dleep for any ATA or PCI devices. Only thing I have set is the powersave profile. > The sysfs reporting for LPM is really broken in libata... > > $ git grep max_performance drivers/ata/libata-sata.c > drivers/ata/libata-sata.c: [ATA_LPM_UNKNOWN] = > "max_performance", > drivers/ata/libata-sata.c: [ATA_LPM_MAX_POWER] = > "max_performance", > > So it reports "max_performance" both for LPM_POLICY == 0 (ATA_LPM_UNKNOWN) > and LPM_POLICY == 1 (ATA_LPM_MAX_POWER). > > In your case, it is because of ATA_LPM_UNKNOWN, which means use whatever > platform firmware has configured. Thats confusing :( > ahci_update_initial_lpm_policy() will only override your policy if your > LPM_POLICY >= 3, so it will not override it for your case. > > I understand however I do not know of any other way to enable lower package > > states on my machine. > > I do not understand why that helps you. Me neither but it works. :( > If you set: > ATA_LPM_MED_POWER_WITH_DIPM (LPM_POLICY=3) on v6.6 in sysfs, > you can enter low CPU power states? > If you set > CONFIG_SATA_MOBILE_LPM_POLICY=3 on v6.6, > can you enter low CPU power states? As described above I'm doing this for years during bootup otherwise I'm stuck at PC2. > > Looking at AHCI 1.3.1 > PM:Aggr: > > State PM:DevSleep is gated with: > > PxDEVSLP.ADSE = ‘1’ and CAP2.SDS = ‘1’ and CAP2.SADM > = ‘1’ and PxDEVSLP.DSP = 1’ and PxSCTL.IPM != ‘4h’ and > PxSCTL.IPM != ‘5h’ and PxSCTL.IPM != ‘6h’ and PxSCTL.IPM > != ‘7h’ and CAP2.DESO = ‘0’ and pDitoTimeout = ‘1’ > > So AFAICT, ALPM by the HBA should never be able to transition > to DevSleep if PxDSP.DSP == 0. > > Perhaps your platform actually does NOT support devsleep and > simply requires the device to enter slumber or partial in order > to enter the lower CPU power states? How can I tell? > We could test this by you disabling devslp on the device via hdparm, > and see if you can still enter the lower CPU power states. Okay and how do I so that? Kind regards, Dieter Created attachment 305626 [details]
hdparm -I output for sda
hdparm -I output for sda
Created attachment 305627 [details]
hdparm -I output for sdb
hdparm -I output for sdb
Created attachment 305628 [details]
dmesg with no pc8 transition
dmesg with no pc8 transition
Created attachment 305629 [details]
dmesg with pc8 transition
dmesg with pc8 transition
Created attachment 305630 [details]
kernel config 6.7-rc
kernel config 6.7-rc
On 2023/12/18 19:18, bugzilla-daemon@kernel.org wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=218198 > > --- Comment #19 from Dieter Mummenschanz (dmummenschanz@web.de) --- > Hallo Niklas, > > thanks for getting back to me on that. > >> Please consider writing to the libata mailing list instead of using >> bugzilla, I'm quite sure that you will get more eyes on your problem >> that way. > > Could point me to that mailing list please. I can't seem to find it. Do you > want me to repost the issue there or is it okay to just copy & past the > messages from this bug? For any subsystem, you can see the emails to use by running: scripts/get_maintainer.pl So for ATA: scripts/get_maintainer.pl drivers/ata/ Damien Le Moal <dlemoal@kernel.org> (maintainer:LIBATA SUBSYSTEM (Serial and Parallel ATA drivers)) linux-ide@vger.kernel.org (open list:LIBATA SUBSYSTEM (Serial and Parallel ATA drivers)) linux-kernel@vger.kernel.org (open list) Given the high traffic, no need to add linux-kernel@vger.kernel.org. But please add linux-ide@vger.kernel.org to the bugzilla case. And also please add Niklas and myself. Niklas Cassel <Niklas.Cassel@wdc.com> Damien Le Moal <dlemoal@kernel.org> Please also include the regressions list: Linux kernel regressions list <regressions@lists.linux.dev> Dieter (or Damien/Niklas), what is the status of this? Was this fixed? |