Bug 214035

Summary: acpi_turn_off_unused_power_resources() may take down necessary hardware
Product: ACPI
Reporter: Sam Edwards (CFSworks)
Component: Power-Other
Assignee: Rafael J. Wysocki (rjw)
Status: CLOSED CODE_FIX
Severity: high
CC: antdev66, bczhc0, bjorn, c.sawczuk, ilari.nieminen, mastag, rjw, rui.zhang, t.widmo
Priority: P1
Hardware: Intel
OS: Linux
Kernel Version: 5.14.0
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: acpidump
dmesg output up to when NVMe driver is running (with acpi_turn_off_unused_power_resources disabled)

Description Sam Edwards 2021-08-10 23:01:27 UTC
Commit 7e4fdea changes the ACPI power system's initialization to turn off any unused power resources, even ones in an unknown state. The effect is that all unused power resources are guaranteed to be turned off, which is great in theory.

I have recently encountered an issue on 5.13.0+ where this behavior is taking down the NVMe SSD before the driver can initialize, resulting in the following in the kernel log:
acpi LNXPOWER:02: Turning OFF
...
pci 0000:02:00.0: CLS mismatch (64 != 1020), using 64 bytes
...
nvme 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)

(See bug 214025 for a discussion about making this error message more clear)

What is happening is that the LNXPOWER:02 resource controls the power to the PCIe port where the NVMe SSD is attached, but no other ACPI object claims it in power_resources_D*, so:
$ cat /sys/bus/acpi/devices/LNXPOWER:02/resource_in_use
0

This causes acpi_turn_off_unused_power_resources() to treat the resource as fair game and turn off the PCIe port between the time the PCIe device is discovered and the time the driver gets a chance to probe it.
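To illustrate the sweep being described, here is a rough toy sketch in C. This is not the actual drivers/acpi/power.c code; the struct layout and helper names are simplified assumptions purely for illustration of the "no registered user means unused, so turn it off" logic:

```
/* Toy illustration only -- not the real drivers/acpi/power.c.
 * Types and helpers here are assumptions chosen to show the idea. */
#include <stdio.h>

struct power_resource {
	const char *name;  /* e.g. "LNXPOWER:02" */
	int users;         /* devices listing it in power_resources_D*; 0 means "unused" */
};

static void power_off(struct power_resource *res)
{
	/* Stand-in for evaluating the resource's _OFF method; just log like the kernel does. */
	printf("acpi %s: Turning OFF\n", res->name);
}

static void turn_off_unused_power_resources(struct power_resource *list, int n)
{
	/* Anything with users == 0 (resource_in_use == 0 in sysfs) is treated as
	 * fair game, even if it actually feeds the NVMe's PCIe port. */
	for (int i = 0; i < n; i++)
		if (list[i].users == 0)
			power_off(&list[i]);
}

int main(void)
{
	struct power_resource resources[] = {
		{ "LNXPOWER:00", 1 },
		{ "LNXPOWER:02", 0 },  /* nothing claims it, yet it powers the PCIe port */
	};

	turn_off_unused_power_resources(resources, 2);
	return 0;
}
```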

I'm currently working around this by bypassing acpi_turn_off_unused_power_resources() entirely, but a proper fix will require flagging the power resource as "in use." I don't know whether this is a problem with the device's ACPI or if Linux should be claiming all LNXPOWER:* resources under each PCI bridge's firmware_node.
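In terms of the toy model above, "flagging the power resource as in use" would amount to something like the following before the sweep runs. Again purely illustrative and reusing the toy struct from the sketch above; the real fix would have to live in the kernel's power-resource reference counting:

```
/* Illustrative only: if the PCIe bridge's firmware_node claimed LNXPOWER:02,
 * its user count would be nonzero and the turn-off sweep would skip it. */
static void claim_power_resource(struct power_resource *res)
{
	res->users++;	/* resource_in_use would then read 1 in sysfs */
}
```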

Happy to do any additional debugging steps.
Comment 1 Bjorn Helgaas 2021-08-10 23:35:57 UTC
https://git.kernel.org/linus/7e4fdeafa61f ("ACPI: power: Turn off unused power resources unconditionally")

Is this a regression caused by 7e4fdeafa61f, i.e., did this device work correctly in v5.12 and break in v5.13? If so, this is a much higher priority problem.

Could you attach the complete dmesg log and the output of acpidump?
Comment 2 Sam Edwards 2021-08-10 23:43:51 UTC
Created attachment 298279 [details]
acpidump
Comment 3 Sam Edwards 2021-08-10 23:44:46 UTC
Created attachment 298281 [details]
dmesg output up to when NVMe driver is running (with acpi_turn_off_unused_power_resources disabled)
Comment 4 Sam Edwards 2021-08-10 23:49:28 UTC
I do know that this worked fine on a 5.12.x kernel and the issue appeared when I attempted to boot 5.13.0. I can see if 7e4fdeafa61f itself introduced the problem (by trying a boot with 7e4fdeafa61f and 7e4fdeafa61f^) if desired.

I'll spend some time configuring a more minimalist kernel that I can kexec to test patches and do any additional debug steps.
Comment 5 Sam Edwards 2021-08-11 01:26:18 UTC
The improper device poweroff DOES happen with 7e4fdeafa61f, but NOT with 4b9ee772eaa8 (7e4fdea's parent commit).

So it's a regression with 7e4fdeafa61f, although the resource_in_use=0 is not new.
Comment 6 Antonio 2021-08-21 20:48:12 UTC
I don't know if there is a correlation with this problem, but with kernel 5.13.x my notebook (Asus) sometimes does not power off and I have to do it manually. Also, while it is stuck after the last kernel log message, the notebook overheats a lot.
I will try to recompile the kernel without this patch to check whether that resolves it.
Comment 7 Sam Edwards 2021-08-31 04:41:37 UTC
This issue has now made it into the 5.14.0 release.
Comment 8 Gino Badouri 2021-09-15 07:30:43 UTC
Hi there,

I'm also affected by this bug.
With kernel 5.11 my NVMe drive was detected properly.
Now with 5.13 or 5.14 I'm getting:

pci 0000:02:00.0: CLS mismatch (64 != 1020), using 64 bytes
nvme 0000:02:00.0: can't change power state from D3hot to D0 (config space inaccessible)
Comment 9 Gino Badouri 2021-09-17 09:35:27 UTC
Hello,

A small update.
I've replaced the SSSTC NVMe SSD with another one from WD and the same thing happens.
So for the record, the first (internal) one is now from WD and the second one is the Samsung EVO 970 SSD.

So it seems the NVMe SSD type/brand is not to blame here.
It just doesn't initialize properly when it's in the first slot, regardless of the brand.
A rather large regression.

If you don't have a second ssd installed, you could try to move it to the optional slot which is right above it.
Or optionally enable the Intel VMD Rapid Storage chipset (you don't have to create a RAID set), but that will require you to reinstall Windows with the floppy driver from Intel.
Comment 10 Zhang Rui 2021-10-11 02:45:49 UTC
(In reply to Sam Edwards from comment #7)
> This issue has now made it into the 5.14.0 release

There are several improvements after commit 7e4fdeafa61f, and they all shipped before 5.14.

6381195ad7d0 ACPI: power: Rework turning off unused power resources
9b7ff25d129d ACPI: power: Refine turning off unused power resources
29038ae2ae56 Revert "Revert "ACPI: scan: Turn off unused power resources during initialization""
5db91e9cb5b3 Revert "ACPI: scan: Turn off unused power resources during initialization"
7e4fdeafa61f ACPI: power: Turn off unused power resources unconditionally

So do you mean that the problem still occurs in 5.14 final release?
Comment 11 Sam Edwards 2021-10-11 03:21:44 UTC
> So do you mean that the problem still occurs in 5.14 final release?

Yes, precisely that.
Comment 12 Christopher Sawczuk 2021-11-22 21:49:20 UTC
Trying to make sense of the conversation here: I have NVMe hardware that appears to be affected by this issue.

There appear to be no kernel command-line arguments that can be used to work around the issue, and it's still present in the latest release of the kernel?

Would it not be better to revert the problematic change? It's been almost a quarter of a year at this point.

I'd be happy to help if possible.
Comment 13 Rafael J. Wysocki 2021-11-23 19:06:55 UTC
So can you test 5.16-rc2, please?

This has been reworked in 5.15 and 5.16-rc.
Comment 14 Gino Badouri 2021-11-23 21:08:18 UTC
Sorry to reply so late.
But for me the problem resolved itself after upgrading to kernel 5.15.

Both my nvme drives are being detected at all times now.

No need to enable the Intel VMD Rapid Storage controller any longer.
Comment 15 Christopher Sawczuk 2021-11-26 02:35:10 UTC
I can confirm that with 5.15.5 I am now able to boot.
Comment 16 tomjan 2021-11-26 08:03:25 UTC
Same here - had this problem on 5.13.0; on 5.15.4 the SSD is detected correctly.
Comment 17 Zhang Rui 2021-12-16 15:08:54 UTC
Good to know.
Bug closed.
Comment 18 bczhc0 2023-02-15 15:40:28 UTC
I'm experiencing exactly the same problem again now, with kernel 6.1.11. My laptop has two NVMe SSD slots; when I plug my SSD into slot 1, it works fine, but if I use slot 2, the SSD is shut down:
```
nvme 0000:02:00.0: Unable to change power state from D3cold to D0, device inaccessible
```

Also, when I plug the SSD into slot 2, not only is the SSD shut down, the Nvidia GPU is shut down too. This makes the Nvidia GPU unusable until I remove the SSD from slot 2. journalctl log for the Nvidia GPU:
```
nvidia 0000:01:00.0: Unable to change power state from D3cold to D0, device inaccessible
```

I tried many kernel versions. 5.13 has this problem, but 5.15 does not. However, I encounter it again with kernel 5.18, and now with the near-latest 6.1.11 it's the same thing.