Bug 112121 - Some PCIe options cause devices to be removed after suspend
Status: RESOLVED PATCH_ALREADY_AVAILABLE
Alias: None
Product: Drivers
Classification: Unclassified
Component: PCI
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_pci@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-02-07 17:04 UTC by Mike Lothian
Modified: 2018-02-13 15:39 UTC

See Also:
Kernel Version: 4.5-rc2
Tree: Mainline
Regression: No


Attachments
Dmesg showing PCIe device removals (87.99 KB, text/plain)
2016-02-07 17:04 UTC, Mike Lothian
Dmesg with ignore_loglevel (204.84 KB, text/plain)
2016-02-13 23:36 UTC, Mike Lothian
Dot config (90.95 KB, text/plain)
2016-02-13 23:36 UTC, Mike Lothian

Description Mike Lothian 2016-02-07 17:04:34 UTC
Created attachment 203091
Dmesg showing PCIe device removals

I was having issues with suspend: when the machine resumed, the IOMMU started removing devices, including the PCIe NVMe drive that contains my root partition.

The problem showed up with:

[*] PCI support
[*]   Support mmconfig PCI config space access
[*]   PCI Express Port Bus support
[*]     PCI Express Hotplug driver
[*]     Root Port Advanced Error Reporting support
[*]       PCI Express ECRC settings control
< >       PCIe AER error injector support
-*-     PCI Express ASPM control
[ ]       Debug PCI Express ASPM
          Default ASPM policy (BIOS default)  --->
[*]   Message Signaled Interrupts (MSI and MSI-X)
[ ]   PCI Debugging
[*]   Enable PCI resource re-allocation detection
< >   PCI Stub driver
[*]   Interrupts on hypertransport devices
[ ] PCI IOV support
[*] PCI PRI support
-*- PCI PASID support
    PCI host controller drivers  ----
< > PCCard (PCMCIA/CardBus) support  ----
[*] Support for PCI Hotplug  --->
< > RapidIO support


This is what I have now:

[*] PCI support
[*]   Support mmconfig PCI config space access
[*]   PCI Express Port Bus support
[ ]     Root Port Advanced Error Reporting support
-*-     PCI Express ASPM control
[ ]       Debug PCI Express ASPM
          Default ASPM policy (BIOS default)  --->
[*]   Message Signaled Interrupts (MSI and MSI-X)
[*]   PCI Debugging
[ ]   Enable PCI resource re-allocation detection
< >   PCI Stub driver
[*]   Interrupts on hypertransport devices
[ ] PCI IOV support
[ ] PCI PRI support
[ ] PCI PASID support
    PCI host controller drivers  ----
< > PCCard (PCMCIA/CardBus) support  ----
[ ] Support for PCI Hotplug  ----
< > RapidIO support

I tried disabling the IOMMU driver first, but it had no effect.

If people are interested, I could toggle the options above one at a time to see which one causes the issue.
Comment 1 Mike Lothian 2016-02-13 23:35:18 UTC
I did try to reply to the last email, but I can't figure out how to send plain-text emails using Google Inbox.

I've just tested this again. I enabled PCI Hotplug and PCIe Hotplug and nothing happened; then I noticed I hadn't enabled the ACPI Hotplug driver. Once I did, the issue reappeared.

I then had to use testdisk to restore my partition table :'(

I've got a new dmesg output, which I'll attach now along with my .config.
Comment 2 Mike Lothian 2016-02-13 23:36:00 UTC
Created attachment 203621
Dmesg with ignore_loglevel
Comment 3 Mike Lothian 2016-02-13 23:36:37 UTC
Created attachment 203631
Dot config
Comment 4 nickkrause 2016-03-23 23:26:19 UTC
Try changing this line:
              [PCI_D3hot] = ACPI_STATE_D3_HOT,
to:
              [PCI_D3hot] = ACPI_STATE_D3_COLD,
as you may be having an issue with ACPI states being screwed up during your suspend due to this change.
Comment 5 Keith Busch 2016-03-23 23:26:35 UTC
Work travel 3/23 - 3/24, responses will be delayed. Please contact Jon Derrick if needing a quicker response.

Thanks,
Keith
Comment 6 Mike Lothian 2016-04-15 18:18:44 UTC
Sorry for the delay. Whereabouts should I be changing that?
Comment 7 [account disabled by administrator] 2016-04-16 04:32:28 UTC
Greetings Mike,
It seems that's not the issue. I am wondering if it is due to CONFIG_ACPI_PCI_SLOT not being enabled. If not, reply and we can take it from there.
Comment 8 Mike Lothian 2018-01-24 08:36:23 UTC
I tried enabling CONFIG_ACPI_PCI_SLOT, but the laptop refuses to boot with it; it complains about being unable to find systemd.
Comment 9 Mike Lothian 2018-01-24 09:47:35 UTC
I've also tried changing

              [PCI_D3hot] = ACPI_STATE_D3_HOT,
to:
              [PCI_D3hot] = ACPI_STATE_D3_COLD,

in drivers/pci/pci-acpi.c, but it had no effect.
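
For readers wondering where that line lives: it is one entry in a lookup table that converts a PCI power state into the ACPI device state requested from firmware. The following is a minimal standalone sketch of that kind of table, not the kernel source itself. The enum values are stand-ins so it compiles outside the kernel tree, and pci_to_acpi_state() is a hypothetical helper added only for illustration.

```c
#include <stddef.h>

/*
 * Stand-in definitions so this sketch builds outside the kernel tree.
 * The numeric values here are illustrative assumptions; the real
 * constants come from the kernel's PCI and ACPI headers.
 */
enum pci_state {
    PCI_D0, PCI_D1, PCI_D2, PCI_D3hot, PCI_D3cold, PCI_STATE_COUNT
};

enum acpi_state {
    ACPI_STATE_D0, ACPI_STATE_D1, ACPI_STATE_D2,
    ACPI_STATE_D3_HOT, ACPI_STATE_D3_COLD
};

/* Lookup table indexed by PCI power state (C99 designated initializers). */
static const unsigned char state_conv[PCI_STATE_COUNT] = {
    [PCI_D0]     = ACPI_STATE_D0,
    [PCI_D1]     = ACPI_STATE_D1,
    [PCI_D2]     = ACPI_STATE_D2,
    [PCI_D3hot]  = ACPI_STATE_D3_HOT,  /* the entry the suggested change edits */
    [PCI_D3cold] = ACPI_STATE_D3_COLD,
};

/*
 * Hypothetical helper (not a kernel function): map a PCI power state
 * to the ACPI state from the table, or return -1 if it is out of range.
 */
int pci_to_acpi_state(int pci_state)
{
    if (pci_state < 0 || pci_state >= PCI_STATE_COUNT)
        return -1;
    return state_conv[pci_state];
}
```

With the table as written, pci_to_acpi_state(PCI_D3hot) yields ACPI_STATE_D3_HOT. The suggested edit would make the D3hot entry request ACPI_STATE_D3_COLD instead, so both D3 variants would map to the same ACPI state.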
Comment 10 Mike Lothian 2018-02-12 11:38:13 UTC
Patch https://patchwork.kernel.org/patch/10212201/ fixes this issue. It also fixes USB-C detection, and the CONFIG_ACPI_PCI_SLOT option now works too.