Since commit 1fb3a7a75e2efcc83ef21f2434069cddd6fae6f5 (igb: Fix an issue that PME is not enabled during runtime suspend), PCI PM code generates a WARN_ONCE upon boot. [ 11.946372] ------------[ cut here ]------------ [ 11.946384] PCI PM: State of device not saved by igb_runtime_suspend+0x0/0x80 [ 11.946406] WARNING: CPU: 27 PID: 265 at drivers/pci/pci-driver.c:1280 pci_pm_runtime_suspend+0x127/0x130 [ 11.946407] Modules linked in: 8021q zfs(PO) zunicode(PO) zlua(PO) zcommon(PO) znvpair(PO) zavl(PO) icp(PO) spl(O) zlib_inflate x86_pkg_temp_thermal kvm_intel kvm coretemp hwmon irqbypass crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper intel_cstate intel_uncore intel_rapl_perf i2c_i801 intel_pch_thermal ipmi_si ipmi_devintf ipmi_msghandler [ 11.946448] CPU: 27 PID: 265 Comm: kworker/27:1 Tainted: P O T 4.20.8+ #20 [ 11.946450] Hardware name: Supermicro Super Server/X10SDV-7TP8F, BIOS 2.0 06/13/2018 [ 11.946458] Workqueue: pm pm_runtime_work [ 11.946466] RIP: 0010:pci_pm_runtime_suspend+0x127/0x130 [ 11.946470] Code: f7 0f 00 8b 44 24 04 eb a1 44 39 e8 74 9a 80 3d cb 7b d0 00 00 75 91 48 c7 c7 d8 b3 d5 bd c6 05 bb 7b d0 00 01 e8 fe 37 cf ff <0f> 0b 31 c0 e9 77 ff ff ff 41 55 41 54 45 31 e4 55 53 48 89 fb 48 [ 11.946473] RSP: 0018:ffffb85142a0fd88 EFLAGS: 00010282 [ 11.946476] RAX: 0000000000000000 RBX: ffff92ceffe690b0 RCX: ffffffffbde29cb8 [ 11.946479] RDX: 0000000000000001 RSI: 0000000000000082 RDI: ffffffffbed55d88 [ 11.946482] RBP: ffffffffbdaa7980 R08: 0000000000000001 R09: 0000000000000446 [ 11.946484] R10: 0000000000000003 R11: 0000000000000000 R12: ffff92ceffe69000 [ 11.946487] R13: 0000000000000000 R14: ffffffffbd162550 R15: 0000000000000000 [ 11.946491] FS: 0000000000000000(0000) GS:ffff92cedfac0000(0000) knlGS:0000000000000000 [ 11.946494] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 11.946497] CR2: 00007fbdb68e6c38 CR3: 000000006e20a001 CR4: 00000000003606e0 [ 11.946499] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 11.946502] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 11.946503] Call Trace: [ 11.946515] ? __switch_to_asm+0x34/0x70 [ 11.946523] __rpm_callback+0x70/0x1b0 [ 11.946529] ? __switch_to_asm+0x34/0x70 [ 11.946534] ? pci_dev_put+0x20/0x20 [ 11.946540] rpm_callback+0x66/0x90 [ 11.946546] ? pci_dev_put+0x20/0x20 [ 11.946552] rpm_suspend+0x15a/0x570 [ 11.946559] pm_runtime_work+0x8c/0xa0 [ 11.946566] process_one_work+0x1d2/0x350 [ 11.946573] ? wq_calc_node_cpumask.constprop.50+0x20/0x20 [ 11.946578] worker_thread+0x28/0x3d0 [ 11.946585] ? wq_calc_node_cpumask.constprop.50+0x20/0x20 [ 11.946589] kthread+0x103/0x120 [ 11.946594] ? kthread_create_on_node+0x90/0x90 [ 11.946600] ret_from_fork+0x35/0x40 [ 11.946604] ---[ end trace 7d317c4a3f4ca539 ]---
How was this change supposed to work? Looking at the code, it seems like it cannot avoid hitting the warning path, in which case it won't call pci_save_state/pci_finish_runtime_suspend anyway, since igb_runtime_suspend still sets the power state before returning? In my case after returning from igb_runtime_suspend, it has gone from D0 (==prev) to D3Hot (==pci_dev->current_state), and hits this code in pci_pm_runtime_suspend which issues the warning and returns early. if (pm && pm->runtime_suspend && !pci_dev->state_saved && pci_dev->current_state != PCI_D0 && pci_dev->current_state != PCI_UNKNOWN) { WARN_ONCE(pci_dev->current_state != prev, "PCI PM: State of device not saved by %pF\n", pm->runtime_suspend); return 0; }
Created attachment 281203 [details] Kernel config
Created attachment 281205 [details] lspci -vvv output for igb devices
Created attachment 281209 [details] Test patch Please test this patch, thanks!
(In reply to Kai-Heng Feng from comment #4) > Created attachment 281209 [details] > Test patch > > Please test this patch, thanks! It doesn't help -- not sure I understand the patch: it just makes it unconditionally not call pci_save_state, but how would that change anything, it's not getting called now anyway? Also: should have mentioned in terms of hardware configuration -- none of the igb ports have a cable plugged into them in my setup at present. In your tests, do you not see the warning?
Sorry I don't have access to igb devices right now - I'll work on this next week.
Created attachment 281211 [details] Possible patch This quiets the warning for me.
(In reply to Kai-Heng Feng from comment #6) > Sorry I don't have access to igb devices right now - I'll work on this next > week. Thanks. I'll be traveling next week let me know if you want any additional information about the system.
I think your patch is correct. Do you plan to send it upstream? My patch in #4 can come as a separate patch. I don't find anywhere in the function that touches config space, so just let PCI core to handle that.
(In reply to Kai-Heng Feng from comment #9) > I think your patch is correct. Do you plan to send it upstream? > > My patch in #4 can come as a separate patch. I don't find anywhere in the > function that touches config space, so just let PCI core to handle that. I'd like to understand how the original change was supposed to work in case something specific with my config is causing issues. From the code at least I can't see how it would have done anything but trigger warning, and still not call the finish runtime stuff. Did it actually work in your testing?
(In reply to Arvind Sankar from comment #10) > I'd like to understand how the original change was supposed to work in case > something specific with my config is causing issues. From the code at least "Original change" means my patch which triggers the warning right? Your config is perfectly fine. > I can't see how it would have done anything but trigger warning, and still > not call the finish runtime stuff. Did it actually work in your testing? static int __maybe_unused igb_suspend(struct device *dev) { int retval; bool wake; struct pci_dev *pdev = to_pci_dev(dev); retval = __igb_shutdown(pdev, &wake, 0); if (retval) return retval; if (wake) { pci_prepare_to_sleep(pdev); } else { pci_wake_from_d3(pdev, false); pci_set_power_state(pdev, PCI_D3hot); } return 0; } It depends on the variable 'wake'. The card I tested seems to support whatever igb_enable_mng_pass_thru() enables, so the 'wake' evals as TRUE, hence the device is still in D0.
Sorry, I pasted the wrong function, should be igb_runtiem_suspend()instead of igb_suspend(), but you get the idea. Anyway, I think most of the pci_*() routines in igb_runtiem_suspend() and __igb_shutdown() can be skipped as PCI core already handles them.
Hi there, I can confirm this bug, too, on Linux 4.20.12: [ 48.481549] ------------[ cut here ]------------ [ 48.481555] PCI PM: State of device not saved by igb_runtime_suspend+0x0/0x80 [igb] [ 48.481562] WARNING: CPU: 7 PID: 165 at drivers/pci/pci-driver.c:1280 pci_pm_runtime_suspend+0x173/0x180 [ 48.481563] Modules linked in: nls_ascii nls_cp437 vfat fat raid1 md_mod arc4 wmi_bmof rtl8821ae intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp btcoexist kvm_intel rtl_pci rtlwifi efi_pstore kvm mac80211 irqbypass intel_cstate intel_uncore intel_rapl_perf cfg80211 iTCO_wdt efivars pcspkr iTCO_vendor_support rfkill intel_pch_thermal wmi button acpi_pad msr nct6683 efivarfs ip_tables x_tables autofs4 zfs(PO) zunicode(PO) zavl(PO) icp(PO) zlua(PO) zcommon(PO) znvpair(PO) spl(O) algif_skcipher af_alg dm_crypt dm_mod crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel aesni_intel aes_x86_64 crypto_simd cryptd glue_helper xhci_pci igb e1000e i2c_algo_bit dca xhci_hcd usbcore [ 48.481583] CPU: 7 PID: 165 Comm: kworker/7:1 Tainted: P O 4.20.12 #1 [ 48.481584] Hardware name: LENOVO 30BK0020GE/103D, BIOS S06KT39A 12/19/2018 [ 48.481586] Workqueue: pm pm_runtime_work [ 48.481588] RIP: 0010:pci_pm_runtime_suspend+0x173/0x180 [ 48.481588] Code: ff 44 39 e8 0f 84 64 ff ff ff 80 3d e9 61 d1 00 00 0f 85 57 ff ff ff 48 c7 c7 60 ee 82 8b c6 05 d5 61 d1 00 01 e8 47 2f ca ff <0f> 0b 31 c0 e9 3d ff ff ff 0f 1f 40 00 0f 1f 44 00 00 53 48 89 fb [ 48.481589] RSP: 0018:ffffae5ac3b7fd98 EFLAGS: 00010282 [ 48.481590] RAX: 0000000000000000 RBX: ffff9120bb5800b0 RCX: 0000000000000006 [ 48.481591] RDX: 0000000000000007 RSI: 0000000000000096 RDI: ffff9120bfbd54e0 [ 48.481591] RBP: ffffffffc02cc7a0 R08: 0000000000000058 R09: 0000000000000004 [ 48.481592] R10: 0000000000000000 R11: 0000000000000001 R12: ffff9120bb580000 [ 48.481592] R13: 0000000000000000 R14: 0000000000000000 R15: ffff9120bb5800b0 [ 48.481593] FS: 0000000000000000(0000) GS:ffff9120bfbc0000(0000) knlGS:0000000000000000 [ 48.481593] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 48.481594] CR2: 000055e43062b5d8 CR3: 000000042020a001 CR4: 00000000003606e0 [ 48.481595] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 48.481595] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 48.481595] Call Trace: [ 48.481598] ? __switch_to_asm+0x40/0x70 [ 48.481600] ? pci_has_legacy_pm_support+0x60/0x60 [ 48.481601] __rpm_callback+0x7d/0x1d0 [ 48.481602] rpm_callback+0x4f/0x70 [ 48.481604] ? pci_has_legacy_pm_support+0x60/0x60 [ 48.481604] rpm_suspend+0x138/0x660 [ 48.481605] ? __switch_to_asm+0x40/0x70 [ 48.481607] pm_runtime_work+0x7a/0x80 [ 48.481609] process_one_work+0x18c/0x380 [ 48.481611] worker_thread+0x30/0x370 [ 48.481612] kthread+0x10f/0x130 [ 48.481614] ? process_one_work+0x380/0x380 [ 48.481615] ? kthread_park+0x80/0x80 [ 48.481616] ret_from_fork+0x35/0x40 [ 48.481617] ---[ end trace 0799d14756480c9a ]--- [ 365.755070] igb 0000:03:00.3 ens3f3: PCIe link lost [ 365.760285] igb 0000:03:00.2 ens3f2: PCIe link lost [ 365.765474] igb 0000:03:00.1 ens3f1: PCIe link lost [ 365.770661] igb 0000:03:00.0 ens3f0: PCIe link lost [103711.060321] pcieport 0000:00:1d.0: Intel SPT PCH root port ACS workaround enabled
(In reply to Kai-Heng Feng from comment #11) > (In reply to Arvind Sankar from comment #10) > > I'd like to understand how the original change was supposed to work in case > > something specific with my config is causing issues. From the code at least > > "Original change" means my patch which triggers the warning right? > Your config is perfectly fine. > > > I can't see how it would have done anything but trigger warning, and still > > not call the finish runtime stuff. Did it actually work in your testing? > > static int __maybe_unused igb_suspend(struct device *dev) > { > int retval; > bool wake; > struct pci_dev *pdev = to_pci_dev(dev); > > retval = __igb_shutdown(pdev, &wake, 0); > if (retval) > return retval; > > if (wake) { > pci_prepare_to_sleep(pdev); > } else { > pci_wake_from_d3(pdev, false); > pci_set_power_state(pdev, PCI_D3hot); > } > > return 0; > } > > It depends on the variable 'wake'. The card I tested seems to support > whatever igb_enable_mng_pass_thru() enables, so the 'wake' evals as TRUE, > hence the device is still in D0. Was that card was not capable of generating PME from D3? According to lspci at least my igb is capable of PME from D3 so even with wake being true it would have been powered down into D3Hot. If you remove the pci_save_state unconditionally I think you also need to apply my patch to the igb_suspend function as well otherwise I think it will generate the same WARN when suspending.
Created attachment 281461 [details] igb: Fix WARN_ONCE on runtime suspend Here's a revised patch to make suspend and runtime_suspend consistent, removing pci_save_state and power state setting from both.
(In reply to Arvind Sankar from comment #15) > Created attachment 281461 [details] > igb: Fix WARN_ONCE on runtime suspend > > Here's a revised patch to make suspend and runtime_suspend consistent, > removing pci_save_state and power state setting from both. Maybe it's another issue, but when I applied this patch on 4.15 & 4.19, there are chances that the first few hotplugs still leave interface in down state after booting without cable connected. /proc/interrupts lists no instance of this card, and there is no ACPI_NOTIFY_DEVICE_WAKE event ever propagated either. But when it begins to respond, the problem vanishes until the next boot (1/3 chances or so).
(In reply to You-Sheng Yang from comment #16) > Maybe it's another issue, but when I applied this patch on 4.15 & 4.19, > there are chances that the first few hotplugs still leave interface in down > state after booting without cable connected. /proc/interrupts lists no > instance of this card, and there is no ACPI_NOTIFY_DEVICE_WAKE event ever > propagated either. But when it begins to respond, the problem vanishes until > the next boot (1/3 chances or so). Hmm, this card should use PME interrupts directly instead of ACPI GPE. Can you attach `lspci -vvv` for this device when issue happens?
Created attachment 281521 [details] normal lspci -vvv output Captured right after boot with cable disconnected. The link is up as expected when cable is plugged.
Created attachment 281523 [details] lspci -vvv output when link is still down Captured right after boot with cable disconnected. The link is NOT up as expected when cable is plugged. The difference between the two lspci output: $ diff -Nu Desktop/lspci.vvv.normal Desktop/lspci.vvv.link-not-up --- Desktop/lspci.vvv.normal 2019-03-06 01:43:06.000000000 +0800 +++ Desktop/lspci.vvv.link-not-up 2019-03-06 01:43:06.000000000 +0800 @@ -431,7 +431,7 @@ Region 4: Memory at 4000000000 (64-bit, non-prefetchable) [size=1M] Capabilities: [50] Power Management version 3 Flags: PMEClk- DSI- D1- D2- AuxCurrent=55mA PME(D0-,D1-,D2-,D3hot+,D3cold+) - Status: D3 NoSoftRst+ PME-Enable+ DSel=0 DScale=0 PME- + Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME- Capabilities: [80] Vendor Specific Information: Len=14 <?> Capabilities: [60] MSI: Enable+ Count=1/1 Maskable- 64bit+ Address: 00000000fee004f8 Data: 0000
(In reply to You-Sheng Yang from comment #16) > (In reply to Arvind Sankar from comment #15) > > Created attachment 281461 [details] > > igb: Fix WARN_ONCE on runtime suspend > > > > Here's a revised patch to make suspend and runtime_suspend consistent, > > removing pci_save_state and power state setting from both. > > Maybe it's another issue, but when I applied this patch on 4.15 & 4.19, > there are chances that the first few hotplugs still leave interface in down > state after booting without cable connected. /proc/interrupts lists no > instance of this card, and there is no ACPI_NOTIFY_DEVICE_WAKE event ever > propagated either. But when it begins to respond, the problem vanishes until > the next boot (1/3 chances or so). Does your card wakeup work without the patch?
Please try linux-next, particular for commit c528f7bd362b097eeeafa6fbbeccd9750b79c7ba ("Revert "PCI/PME: Implement runtime PM callbacks").
(In reply to Kai-Heng Feng from comment #21) > Please try linux-next, particular for commit > c528f7bd362b097eeeafa6fbbeccd9750b79c7ba ("Revert "PCI/PME: Implement > runtime PM callbacks"). Packages with specified commits built, but don't have the card at hand currently. Probably gonna take a couple of days.
(In reply to Kai-Heng Feng from comment #21) > Please try linux-next, particular for commit > c528f7bd362b097eeeafa6fbbeccd9750b79c7ba ("Revert "PCI/PME: Implement > runtime PM callbacks"). Hi, tested with kernel built from linux-next tree. No more previous problem observed.