Bug 212053 - system resumes automatically by the XHCI host controller - Intel N6000
Status: CLOSED WILL_NOT_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: USB
Hardware: All Linux
Importance: P1 normal
Assignee: Default virtual assignee for Drivers/USB
 
Reported: 2021-03-04 08:11 UTC by Jian-Hong Pan
Modified: 2023-01-31 08:29 UTC

Kernel Version: 5.10+
Regression: No


Attachments
cat /proc/acpi/wakeup (1.41 KB, text/plain)
2021-03-04 08:11 UTC, Jian-Hong Pan
dmesg log (71.81 KB, text/plain)
2021-03-04 08:12 UTC, Jian-Hong Pan
The dmesg without nvme support (73.83 KB, text/plain)
2021-03-09 08:00 UTC, Jian-Hong Pan
dmesg log of trace device's power state (77.52 KB, text/plain)
2021-03-11 06:42 UTC, Jian-Hong Pan
cat /proc/interrupts before suspend with builtin NVME driver for comment #6 (3.25 KB, text/plain)
2021-03-25 06:16 UTC, Jian-Hong Pan
cat /proc/interrupts after resume with builtin NVME driver for comment #6 (3.25 KB, text/plain)
2021-03-25 06:20 UTC, Jian-Hong Pan

Description Jian-Hong Pan 2021-03-04 08:11:10 UTC
Created attachment 295637 [details]
cat /proc/acpi/wakeup

We have a laptop equipped with an Intel N6000 and a Samsung NVMe SSD Controller SM981/PM981/PM983.  I tested with the latest mainline kernel.  The system always resumes from suspend automatically.

Checked the enabled ACPI wakeup sources:

$ cat /proc/acpi/wakeup | grep enabled
RP01	  S4	*enabled   pci:0000:00:1c.0
RP02	  S4	*enabled   pci:0000:00:1c.1
RP05	  S4	*enabled   pci:0000:00:1c.4
XHCI	  S3	*enabled   pci:0000:00:14.0

The system keeps waking up automatically until "RP05", which corresponds to pci:0000:00:1c.4, is disabled.
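
For reference, the check above can be scripted. This is only a minimal sketch and not part of the original report: `list_enabled_wakeup` and `toggle_wakeup` are hypothetical helper names, and the optional file argument exists only so the parser can be tried on a saved copy of the table. Writing a device name to /proc/acpi/wakeup toggles that entry, which is presumably how "RP05" was disabled here.

```shell
#!/bin/sh
# Print "<name> <sysfs node>" for every enabled ACPI wakeup source.
# $1 is an optional path to a saved table (an assumption for offline use);
# it defaults to the live /proc/acpi/wakeup.
list_enabled_wakeup() {
    # Columns: Device, S-state, Status, Sysfs node
    awk '$3 == "*enabled" { print $1, $4 }' "${1:-/proc/acpi/wakeup}"
}

# Toggle one wakeup source by name, e.g. "toggle_wakeup RP05".
# Each write flips the entry between enabled and disabled.
# Must run as root; the setting does not survive a reboot.
toggle_wakeup() {
    echo "$1" > /proc/acpi/wakeup
}
```

Running `list_enabled_wakeup` before and after `toggle_wakeup RP05` should show the RP05 line disappear from the enabled set.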

$ lspci -tv
-[0000:00]-+-00.0  Intel Corporation Device 4e26
           +-02.0  Intel Corporation Device 4e71
           +-04.0  Intel Corporation Device 4e03
           +-05.0  Intel Corporation Device 4e19
           +-08.0  Intel Corporation Device 4e11
           +-14.0  Intel Corporation Device 4ded
           +-14.2  Intel Corporation Device 4def
           +-14.3  Intel Corporation Device 4df0
           +-15.0  Intel Corporation Device 4de8
           +-15.2  Intel Corporation Device 4dea
           +-15.3  Intel Corporation Device 4deb
           +-16.0  Intel Corporation Device 4de0
           +-19.0  Intel Corporation Device 4dc5
           +-19.1  Intel Corporation Device 4dc6
           +-1a.0  Intel Corporation Device 4dc4
           +-1c.0-[01]----00.0  Intel Corporation XMM7360 LTE Advanced Modem
           +-1c.1-[02]----00.0  Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
           +-1c.4-[03]----00.0  Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983
           +-1f.0  Intel Corporation Device 4d87
           +-1f.3  Intel Corporation Device 4dc8
           +-1f.4  Intel Corporation Device 4da3
           \-1f.5  Intel Corporation Device 4da4

Here is the PCI information of the bridge and NVMe SSD controller:

$ sudo lspci -nnvs 00:1c.4
00:1c.4 PCI bridge [0604]: Intel Corporation Device [8086:4dbc] (rev 01) (prog-if 00 [Normal decode])
	Flags: bus master, fast devsel, latency 0, IRQ 122
	Bus: primary=00, secondary=03, subordinate=03, sec-latency=0
	I/O behind bridge: [disabled]
	Memory behind bridge: 80000000-800fffff [size=1M]
	Prefetchable memory behind bridge: [disabled]
	Capabilities: [40] Express Root Port (Slot+), MSI 00
	Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
	Capabilities: [90] Subsystem: ASUSTeK Computer Inc. Device [1043:1792]
	Capabilities: [a0] Power Management version 3
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [220] Access Control Services
	Capabilities: [150] Precision Time Measurement
	Capabilities: [200] L1 PM Substates
	Capabilities: [a30] Secondary PCI Express
	Capabilities: [a00] Downstream Port Containment
	Kernel driver in use: pcieport

$ sudo lspci -nnvs 03:00.00
03:00.0 Non-Volatile memory controller [0108]: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 [144d:a808] (prog-if 02 [NVM Express])
	Subsystem: Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 [144d:a801]
	Flags: bus master, fast devsel, latency 0, IRQ 16, NUMA node 0
	Memory at 80000000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/32 Maskable- 64bit+
	Capabilities: [70] Express Endpoint, MSI 00
	Capabilities: [b0] MSI-X: Enable+ Count=33 Masked-
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [148] Device Serial Number 00-00-00-00-00-00-00-00
	Capabilities: [158] Power Budgeting <?>
	Capabilities: [168] Secondary PCI Express
	Capabilities: [188] Latency Tolerance Reporting
	Capabilities: [190] L1 PM Substates
	Kernel driver in use: nvme
	Kernel modules: nvme
Comment 1 Jian-Hong Pan 2021-03-04 08:12:11 UTC
Created attachment 295639 [details]
dmesg log
Comment 2 Jian-Hong Pan 2021-03-09 08:00:09 UTC
Created attachment 295763 [details]
The dmesg without nvme support

To narrow down the problem, I disabled the NVMe option in the kernel build config, so the NVMe device is not driven.

However, the issue can still be reproduced even though the kernel has no NVMe support.

The dmesg is attached as well.
Comment 3 kch 2021-03-09 08:39:35 UTC
Then close this report; it is clearly not related to NVMe.
Comment 4 Jian-Hong Pan 2021-03-11 06:42:16 UTC
Created attachment 295793 [details]
dmesg log of trace device's power state

To trace the devices' power state transitions, I added some debug messages to "acpi_pci_set_power_state()":

diff --git a/drivers/pci/pci-acpi.c b/drivers/pci/pci-acpi.c
index 53502a751914..f421ab2a58bb 100644
--- a/drivers/pci/pci-acpi.c
+++ b/drivers/pci/pci-acpi.c
@@ -1000,6 +1000,7 @@ static int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
        };
        int error = -EINVAL;
 
+       pci_warn(dev, "%s: going to set as %s\n", __func__, acpi_power_state_string(state_conv[state]));
        /* If the ACPI device has _EJ0, ignore the device */
        if (!adev || acpi_has_method(adev->handle, "_EJ0"))
                return -ENODEV;
@@ -1022,6 +1023,9 @@ static int acpi_pci_set_power_state(struct pci_dev *dev, pci_power_t state)
        if (!error)
                pci_dbg(dev, "power state changed by ACPI to %s\n",
                         acpi_power_state_string(state_conv[state]));
+       else
+               pci_warn(dev, "%s: power state changed by ACPI to %s failed with err 0x%x\n",
+                        __func__, acpi_power_state_string(state_conv[state]), error);
 
        return error;
 }



The log shows that the PCI bridge 00:1c.4 and the NVMe device 03:00.0 go to D3cold before the CPUs are taken offline, then return to D0 after the system resumes.

[  102.606564] PM: suspend entry (deep)
[  102.663727] Filesystems sync: 0.057 seconds
[  102.663941] Freezing user space processes ... (elapsed 0.001 seconds) done.
[  102.665725] OOM killer disabled.
[  102.665727] Freezing remaining freezable tasks ... (elapsed 0.000 seconds) done.
[  102.666698] printk: Suspending console(s) (use no_console_suspend to debug)
[  102.669124] intel-lpss 0000:00:15.0: acpi_pci_set_power_state: going to set as D0
[  104.976535] ACPI: EC: interrupt blocked
[  104.987679] snd_hda_intel 0000:00:1f.3: acpi_pci_set_power_state: going to set as D3cold
[  104.990402] nvme 0000:03:00.0: acpi_pci_set_power_state: going to set as D3cold
[  104.990678] intel-lpss 0000:00:15.0: acpi_pci_set_power_state: going to set as D3cold
[  104.991663] xhci_hcd 0000:00:14.0: acpi_pci_set_power_state: going to set as D3hot
[  104.992675] pcieport 0000:00:1c.0: acpi_pci_set_power_state: going to set as D3cold
[  105.001668] pcieport 0000:00:1c.1: acpi_pci_set_power_state: going to set as D3cold
[  105.002670] pcieport 0000:00:1c.4: acpi_pci_set_power_state: going to set as D3cold
[  105.064340] ACPI: Preparing to enter system sleep state S3
[  105.098985] ACPI: EC: event blocked
[  105.098986] ACPI: EC: EC stopped
[  105.098986] PM: Saving platform NVS memory
[  105.098990] Disabling non-boot CPUs ...
[  105.100296] smpboot: CPU 1 is now offline
...
[  105.108856] ACPI: Waking up from system sleep state S3
[  105.111802] ACPI: EC: interrupt unblocked
[  105.111892] xhci_hcd 0000:00:14.0: acpi_pci_set_power_state: going to set as D0
[  105.112239] pci 0000:00:14.3: acpi_pci_set_power_state: going to set as D0
[  105.112317] intel-lpss 0000:00:15.0: acpi_pci_set_power_state: going to set as D0
[  105.112364] intel-lpss 0000:00:15.2: acpi_pci_set_power_state: going to set as D0
[  105.112367] intel-lpss 0000:00:15.3: acpi_pci_set_power_state: going to set as D0
[  105.112519] intel-lpss 0000:00:19.1: acpi_pci_set_power_state: going to set as D0
[  105.112520] pci 0000:00:1a.0: acpi_pci_set_power_state: going to set as D0
[  105.112557] pcieport 0000:00:1c.0: acpi_pci_set_power_state: going to set as D0
[  105.112561] pcieport 0000:00:1c.1: acpi_pci_set_power_state: going to set as D0
[  105.112606] pcieport 0000:00:1c.4: acpi_pci_set_power_state: going to set as D0
[  105.123117] snd_hda_intel 0000:00:1f.3: acpi_pci_set_power_state: going to set as D0
[  105.246288] nvme 0000:03:00.0: acpi_pci_set_power_state: going to set as D0
[  105.254334] pci 0000:01:00.0: acpi_pci_set_power_state: going to set as D0

Full log is uploaded as the attachment.
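
To follow a single device through the full log, the debug lines can be filtered. The sketch below is an illustration, not something used in this report: `power_transitions` is a hypothetical helper name, and it assumes the exact message format produced by the debug patch above ("acpi_pci_set_power_state: going to set as <state>").

```shell
#!/bin/sh
# Read a dmesg dump on stdin and print "<pci address> <target state>" for
# every line emitted by the debug patch. An optional $1 restricts the
# output to one PCI address, e.g. 0000:00:1c.4.
power_transitions() {
    awk -v dev="${1:-}" '
        /acpi_pci_set_power_state: going to set as/ && NF >= 7 {
            # Count fields from the end, because the "[ timestamp]" prefix
            # splits into one or two fields depending on its width.
            state = $NF
            addr = $(NF - 6)
            sub(/:$/, "", addr)          # drop the trailing colon
            if (dev == "" || addr == dev) print addr, state
        }'
}
```

For example, `power_transitions 0000:00:1c.4 < dmesg.txt` lists only the transitions requested for the RP05 bridge, where dmesg.txt stands for a saved copy of the attached log.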
Comment 5 Jian-Hong Pan 2021-03-11 07:06:47 UTC
Thinking about the wakeup sources again:

$ cat /proc/acpi/wakeup | grep enabled
RP01	  S4	*enabled   pci:0000:00:1c.0
RP02	  S4	*enabled   pci:0000:00:1c.1
RP05	  S4	*enabled   pci:0000:00:1c.4
XHCI	  S3	*enabled   pci:0000:00:14.0

According to the PCI tree:

$ lspci -tv
-[0000:00]-+-00.0  Intel Corporation Device 4e26
           ...
           +-14.0  Intel Corporation Device 4ded
           ...
           +-1c.0-[01]----00.0  Intel Corporation XMM7360 LTE Advanced Modem
           +-1c.1-[02]----00.0  Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller
           +-1c.4-[03]----00.0  Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983
           ...

01:00.0 Intel Corporation XMM7360 LTE Advanced Modem is attached to 00:1c.0
02:00.0 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller is attached to 00:1c.1

Both ports connect network devices (an LTE modem and a Gigabit Ethernet controller), so they are probably enabled for something like Wake-on-LAN.

00:14.0 is USB controller [0c03]: Intel Corporation Ice Lake-LP USB 3.1 xHCI Host Controller [8086:34ed].  So USB HID devices can wake the system up, which makes sense.

03:00.0 Samsung Electronics Co Ltd NVMe SSD Controller SM981/PM981/PM983 is attached to 00:1c.4

Is the system woken up by the NVMe device?  Or by something else?  I have no idea.
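
The port-to-device mapping above can also be read from sysfs, together with each child's own wakeup setting. This is a rough sketch under stated assumptions: `children_of_port` is a hypothetical helper name, and the `SYSFS_PCI` override is introduced here only so the function can be tried against a fake directory tree.

```shell
#!/bin/sh
# List the PCI devices directly below a bridge and their power/wakeup value,
# e.g. "children_of_port 0000:00:1c.4".
# SYSFS_PCI is an assumption for testing; it defaults to the real sysfs path.
children_of_port() {
    base="${SYSFS_PCI:-/sys/bus/pci/devices}/$1"
    # Match only full PCI addresses (dddd:bb:dd.f), which skips the
    # port service entries such as "0000:00:1c.4:pcie001".
    for d in "$base"/[0-9a-f][0-9a-f][0-9a-f][0-9a-f]:[0-9a-f][0-9a-f]:[0-9a-f][0-9a-f].[0-7]; do
        [ -d "$d" ] || continue
        # The wakeup file may be absent; in that case only the address is shown.
        echo "$(basename "$d") $(cat "$d/power/wakeup" 2>/dev/null)"
    done
}
```

Calling it for each address reported as enabled in /proc/acpi/wakeup shows which endpoint sits behind a given root port and whether that endpoint itself is armed for wakeup.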
Comment 6 Zhang Rui 2021-03-21 16:22:02 UTC
First, please attach the output of "cat /proc/interrupts", WITH the NVMe driver built in.

Is it possible to remove the NVMe disk physically?
Is it possible to disable the NVMe controller via BIOS options?
Comment 7 Jian-Hong Pan 2021-03-25 06:16:24 UTC
Created attachment 296049 [details]
cat /proc/interrupts before suspend with builtin NVME driver for comment #6
Comment 8 Jian-Hong Pan 2021-03-25 06:20:13 UTC
Created attachment 296051 [details]
cat /proc/interrupts after resume with builtin NVME driver for comment #6

I tried to remove the NVMe disk earlier, but the case is too sturdy to open.

There is no on/off option for NVMe in the BIOS.
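
To compare the two /proc/interrupts snapshots, a small filter can sum the per-CPU counters and print only the IRQs whose totals changed across the suspend/resume cycle. This is a sketch only: `irq_delta` is a hypothetical helper name, and the two file arguments stand for saved copies of the attachments.

```shell
#!/bin/sh
# Usage: irq_delta <before-file> <after-file>
# Prints "<irq>: <increase> <description>" for each IRQ whose total count
# (summed over all CPUs) differs between the two snapshots.
irq_delta() {
    awk '
        FNR == 1 { ncpu = NF; next }     # header row: one column per CPU
        {
            total = 0
            # Sum the numeric per-CPU columns that follow the IRQ label.
            for (i = 2; i <= ncpu + 1 && $i ~ /^[0-9]+$/; i++) total += $i
            if (NR == FNR) { before[$1] = total; next }   # first file
            delta = total - before[$1]
            if (delta != 0) {
                desc = ""
                for (; i <= NF; i++) desc = desc " " $i
                printf "%s %d%s\n", $1, delta, desc
            }
        }' "$1" "$2"
}
```

An IRQ that belongs to a wakeup-capable port and increments only across the suspend cycle would be a natural candidate for closer inspection.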
Comment 9 Chen Yu 2021-05-27 13:56:30 UTC
(In reply to Jian-Hong Pan from comment #2)
> Created attachment 295763 [details]
> The dmesg without nvme support
> 
> To reduce the problem scope, I clear the NVME option in kernel's building
> config.  So, no NVMe device.
> 
> However, this issue still can be reproduced, even the kernel does not have
> nvme support.
> 
> Submitted the dmesg as well.

Without NVMe, how does the system boot? Is the rootfs not on the NVMe disk?

Besides, even without the nvme driver loaded, I was wondering whether the firmware of the NVMe controller might still be able to generate a wakeup interrupt and route it to the PCI bridge. What if you try:
echo disabled > /sys/bus/pci/devices/0000:03:00.0/power/wakeup
If that does not work, what about:
echo disabled > /sys/bus/pci/devices/0000:00:1c.4/power/wakeup
Comment 10 Jian-Hong Pan 2021-06-01 06:57:49 UTC
(In reply to Chen Yu from comment #9)
> (In reply to Jian-Hong Pan from comment #2)
> > Created attachment 295763 [details]
> > The dmesg without nvme support
> > 
> > To reduce the problem scope, I clear the NVME option in kernel's building
> > config.  So, no NVMe device.
> > 
> > However, this issue still can be reproduced, even the kernel does not have
> > nvme support.
> > 
> > Submitted the dmesg as well.
> 
> Without nvme, how to boot up the system? Is the rootfs not on nvme disk?
It can boot from a USB disk with an OS installed.
Comment 11 Jian-Hong Pan 2021-06-01 08:00:04 UTC
(In reply to Chen Yu from comment #9)

> Besides, even without nvme driver loaded, I was thinking if the firmware of
> nvme controller might also be able to generate a wakeup interrupt and route
> to the pci bridge. What if :
> echo disabled > /sys/bus/pci/devices/0000:03:00.0/power/wakeup

It seems to be already disabled:

# cat /sys/devices/pci0000\:00/0000\:00\:1c.4/0000\:03\:00.0/power/wakeup
disabled

> if not work, what if
> echo disabled > /sys/bus/pci/devices/0000:00:1c.4/power/wakeup

This seems to help!
# cat /sys/devices/pci0000\:00/0000\:00\:1c.4/power/wakeup
enabled
# echo disabled > /sys/devices/pci0000\:00/0000\:00\:1c.4/power/wakeup
Comment 12 Jian-Hong Pan 2021-06-01 08:04:35 UTC
(In reply to Jian-Hong Pan from comment #11)
> (In reply to Chen Yu from comment #9)
> > if not work, what if
> > echo disabled > /sys/bus/pci/devices/0000:00:1c.4/power/wakeup
> 
> This seems to help!
> # cat /sys/devices/pci0000\:00/0000\:00\:1c.4/power/wakeup
> enabled
> # echo disabled > /sys/devices/pci0000\:00/0000\:00\:1c.4/power/wakeup
The system now stays suspended until I wake it up with a key press.
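
The workaround can be wrapped in a small helper that validates the value and reads the setting back. This is a minimal sketch, not a fix proposed in this report: `set_pci_wakeup` is a hypothetical helper name and `SYSFS_PCI` is an override added only for testing. The power/wakeup attribute accepts the strings "enabled" and "disabled".

```shell
#!/bin/sh
# Usage: set_pci_wakeup <pci-address> enabled|disabled
# Writes the wakeup policy of one PCI device and prints the resulting value.
# SYSFS_PCI is an assumption for testing; it defaults to the real sysfs path.
set_pci_wakeup() {
    node="${SYSFS_PCI:-/sys/bus/pci/devices}/$1/power/wakeup"
    case "$2" in
        enabled|disabled) ;;
        *)
            echo "usage: set_pci_wakeup <pci-address> enabled|disabled" >&2
            return 2
            ;;
    esac
    # Needs root on a real system; the setting is lost on reboot.
    echo "$2" > "$node" && cat "$node"
}
```

On this machine the call would be `set_pci_wakeup 0000:00:1c.4 disabled`, run as root before suspending. Because the setting does not persist, it would have to be reapplied on every boot.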
Comment 13 Jian-Hong Pan 2023-01-31 08:29:14 UTC
The machine has been returned to its owner.
