Created attachment 197721 [details] dmesg.txt Hi, PC fails suspend to RAM with WoWLAN enabled. After that WiFi stops working. The logs indicate that firmware operation stopped. Scenario: 1. iw phy0 wowlan enable magic-packet 2. echo mem > /sys/power/state 3. Result: PC not suspended. Logs show the warning "Can't start_fw since the HW hasn't been started" HW: Intel(R) Centrino(R) Wireless-N 130 Module: iwlwifi FW version: 18.168.6.1
Hi, please record tracing: sudo trace-cmd record -e iwlwifi -e iwlwifi_msg -e mac80211 -e cfg80211 Compress the result and attach it to the bug. Thank you. Please note that this flow is not used by most of the users, so this bug will get a lower priority than others.
I haven't attached compressed trace because its too big (10MB) but I've debugged this problem and discovered next flow: iwlagn_suspend iwl_trans_stop_device _iwl_trans_stop_device _iwl_trans_pcie_stop_device trans_pcie->is_down = true; iwl_load_ucode_wait_alive iwl_trans_start_fw iwl_trans_pcie_start_fw if (trans_pcie->is_down) IWL_WARN(trans, "Can't start_fw since the HW hasn't been started\n"); When suspend is called _iwl_trans_pcie_stop_device sets flag trans_pcie->is_down. As result iwl_trans_pcie_start_fw fails with warning. After I reverted commit fa9f3281cbb1075545d4528c84059a3f4e117b44 PC can suspend to RAM.
Ok - that's really helpful. I'll take a look. Thanks.
Created attachment 197831 [details] fix Hi, here is a fix, please test.
Patch is needed internally. I am waiting until I get your approval to send the pull request.
Thank you for patch. I've tested it on my system. PC can suspend to RAM but received dump below. [ 190.766598] PM: Syncing filesystems ... done. [ 191.602038] PM: Preparing system for sleep (mem) [ 191.602917] Freezing user space processes ... (elapsed 0.002 seconds) done. [ 191.605113] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done. [ 191.606470] PM: Suspending system (mem) [ 191.606496] Suspending console(s) (use no_console_suspend to debug) [ 191.606820] sd 0:0:0:0: [sda] Synchronizing SCSI cache [ 191.607492] iwlwifi 0000:05:00.0: L1 Enabled - LTR Disabled [ 191.614206] iwlwifi 0000:05:00.0: L1 Enabled - LTR Disabled [ 191.614298] iwlwifi 0000:05:00.0: Radio type=0x2-0x2-0x1 [ 191.667033] ------------[ cut here ]------------ [ 191.667033] WARNING: CPU: 0 PID: 1025 at drivers/net/wireless/iwlwifi/pcie/trans.c:1543 iwl_trans_pcie_grab_nic_access+0xec/0x100 [iwlwifi]() [ 191.667033] Timeout waiting for hardware access (CSR_GP_CNTRL 0x080003d8) [ 191.667033] Modules linked in: iwldvm iwlwifi fuse [ 191.667033] CPU: 0 PID: 1025 Comm: kworker/u4:8 Not tainted 4.4.0-rc4+ #4 [ 191.667033] Hardware name: SAMSUNG ELECTRONICS CO., LTD. NC210/NC110 /NC210/NC110 , BIOS 01VG.M013.20110704.RHU 07/04/2011 [ 191.667033] Workqueue: events_unbound async_run_entry_fn [ 191.667033] 00000000 00000000 f52a5d30 c12c60ce f52a5d70 f52a5d60 c104a472 f88c549c [ 191.667033] f52a5d8c 00000401 f88c51a8 00000607 f88b688c f88b688c f2388000 f2389c8c [ 191.667033] f52a5dac f52a5d78 c104a4ce 00000009 f52a5d70 f88c549c f52a5d8c f52a5da0 [ 191.667033] Call Trace: [ 191.667033] [<c12c60ce>] dump_stack+0x41/0x53 [ 191.667033] [<c104a472>] warn_slowpath_common+0x82/0xb0 [ 191.667033] [<f88b688c>] ? iwl_trans_pcie_grab_nic_access+0xec/0x100 [iwlwifi] [ 191.667033] [<f88b688c>] ? iwl_trans_pcie_grab_nic_access+0xec/0x100 [iwlwifi] [ 191.667033] [<c104a4ce>] warn_slowpath_fmt+0x2e/0x30 [ 191.667033] [<f88b688c>] iwl_trans_pcie_grab_nic_access+0xec/0x100 [iwlwifi] [ 191.667033] [<f88a927a>] iwl_write_direct32+0x1a/0x40 [iwlwifi] [ 191.667033] [<f88b25af>] iwl_trans_pcie_tx_reset+0x5f/0xe0 [iwlwifi] [ 191.667033] [<f88b6798>] ? iwl_trans_pcie_set_bits_mask+0x48/0x50 [iwlwifi] [ 191.667033] [<f88b6f5f>] iwl_trans_pcie_d3_suspend+0x10f/0x120 [iwlwifi] [ 191.667033] [<f8a7516b>] iwlagn_mac_suspend+0xfb/0x150 [iwldvm] [ 191.667033] [<c182556e>] __ieee80211_suspend+0x1fe/0x410 [ 191.667033] [<c17f1da1>] ieee80211_suspend+0x11/0x20 [ 191.667033] [<c17964a2>] wiphy_suspend+0x72/0x1f0 [ 191.667033] [<c1474ab3>] dpm_run_callback+0x33/0xf0 [ 191.667033] [<c1796430>] ? addresses_show+0x90/0x90 [ 191.667033] [<c14755b9>] __device_suspend+0xe9/0x290 [ 191.667033] [<c147577a>] async_suspend+0x1a/0x90 [ 191.667033] [<c1064511>] async_run_entry_fn+0x31/0xd0 [ 191.667033] [<c105da8d>] process_one_work+0x10d/0x350 [ 191.667033] [<c105ddbd>] worker_thread+0xed/0x410 [ 191.667033] [<c105dcd0>] ? process_one_work+0x350/0x350 [ 191.667033] [<c106214c>] kthread+0x9c/0xb0 [ 191.667033] [<c1836209>] ret_from_kernel_thread+0x21/0x38 [ 191.667033] [<c10620b0>] ? kthread_park+0x40/0x40 [ 191.667033] ---[ end trace 09fd0fecba534dd5 ]--- [ 191.720430] sd 0:0:0:0: [sda] Stopping disk [ 192.731738] queueing ieee80211 work while going to suspend [ 192.935251] iwlwifi 0000:05:00.0: iwl_trans_txq_enable_cfg bad state = 0 [ 193.960585] PM: suspend of devices complete after 2353.891 msecs [ 193.972094] PM: late suspend of devices complete after 11.498 msecs [ 193.972619] pcieport 0000:00:1c.2: System wakeup enabled by ACPI [ 193.972932] pcieport 0000:00:1c.0: System wakeup enabled by ACPI [ 193.984090] PM: noirq suspend of devices complete after 11.987 msecs [ 193.984138] ACPI: Preparing to enter system sleep state S3 [ 193.989673] ACPI : EC: EC stopped [ 193.989675] PM: Saving platform NVS memory [ 193.989684] Disabling non-boot CPUs ... [ 193.989905] Broke affinity for irq 32 [ 193.991030] smpboot: CPU 1 is now offline
Does this help? diff --git a/drivers/net/wireless/intel/iwlwifi/dvm/mac80211.c b/drivers/net/wireless/intel/iwlwifi/dvm/mac80211.c index 7422ed0..38b415f 100644 --- a/drivers/net/wireless/intel/iwlwifi/dvm/mac80211.c +++ b/drivers/net/wireless/intel/iwlwifi/dvm/mac80211.c @@ -394,12 +394,12 @@ static int iwlagn_mac_suspend(struct ieee80211_hw *hw, if (ret) goto error; + iwl_trans_d3_suspend(priv->trans, false); + /* let the ucode operate on its own */ iwl_write32(priv->trans, CSR_UCODE_DRV_GP1_SET, CSR_UCODE_DRV_GP1_BIT_D3_CFG_COMPLETE); - iwl_trans_d3_suspend(priv->trans, false); - goto out; error: --- Sorry - I don't have a setup with this type of device right now. I can build one, but it might take a bit of time.
The same dump. I've just tried configure WoWLAN on older kernels starting from 3.2. There no suspend problem but the feature does not work though FW reports support of it (priv->fw->img[IWL_UCODE_WOWLAN].sec[0].len).
Well... This feature hasn't been tested for a long time...
Regarding WoWLAN problem it can be BIOS/HW related because Ethernet WoL also does not work for me. Lets put it aside and just fix problem with suspend.
so we are good with my patch, right? Or the WARNING is still biting you? You seemed to say that 3.2 was fine? Do you know what latest kernel version was fine?
The patch does not fix the bug completely. The warning (Comment 6) is still present. Before fa9f3281cbb1075545d4528c84059a3f4e117b44 suspend with WoWLAN enabled worked correctly, so latest version is v4.3-rc1. I my previous post I meant that on older then 4.3 rc1 kernels I can configure WoWLAN feature and go to suspend state but I cannot test resume upon receiving magic packet due to FW/HW/BIOS related problem.
I cannot reproduce the bug you are seeing. I built a setup with a slightly different device though.
Weird, it looks as general code. Is this top of branch kernel, CONFIG_PM_SLEEP set, iw phy0 info | grep WoWLAN -A 7?
I could reproduce the original bug. And I could see that I fixed it. I couldn't reproduce the problem for comment 6. I did enable WoWLAN and I saw that the WoWLAN actually runs.
I will debug problem in comment 6 share results a bit later.
Thanks. Seems a non trivial hardware configuration thing though. We try to wake up the device and it claims it goes to sleep.
Problem in comment 6 is not reproduced after I've commented iwl_trans_stop_device call on the "out". Do we need it? In case of error iwl_trans_stop_device is called anyway by iwlagn_mac_suspend -> iwlagn_prepare_restart -> iwl_down. diff --git a/drivers/net/wireless/iwlwifi/dvm/lib.c b/drivers/net/wireless/iwlwifi/dvm/lib.c index e18629a..4b09fb4 100644 --- a/drivers/net/wireless/iwlwifi/dvm/lib.c +++ b/drivers/net/wireless/iwlwifi/dvm/lib.c @@ -1154,6 +1154,9 @@ int iwlagn_suspend(struct iwl_priv *priv, struct cfg80211_wowlan *wowlan) priv->ucode_loaded = false; iwl_trans_stop_device(priv->trans); + ret = iwl_trans_start_hw(priv->trans); + if (ret) + goto out; priv->wowlan = true; @@ -1247,6 +1250,7 @@ int iwlagn_suspend(struct iwl_priv *priv, struct cfg80211_wowlan *wowlan) ret = iwlagn_send_patterns(priv, wowlan); out: + // iwl_trans_stop_device(priv->trans); kfree(key_data.rsc_tsc); return ret; }
You are perfectly right. I can remove this call. Thank you for your work. So you were going to the failure path. This is why I couldn't reproduce on my setup. Thanks again.
Wait - I am missing something. In case of an error, we shouldn't be calling iwl_trans_d3_suspend, but this function appears in your callstack? ret = iwlagn_suspend(priv, wowlan); if (ret) goto error; /* let the ucode operate on its own */ iwl_write32(priv->trans, CSR_UCODE_DRV_GP1_SET, CSR_UCODE_DRV_GP1_BIT_D3_CFG_COMPLETE); iwl_trans_d3_suspend(priv->trans, false);
You have got it wrong. Please consider additional info: 1. flow: iwlagn_mac_suspend iwlagn_suspend iwl_trans_stop_device iwl_trans_start_hw ... out: iwl_trans_stop_device iwl_trans_d3_suspend If iwl_trans_stop_device is called at the end of iwlagn_suspend my HW became inaccessible and dump is generated on iwl_trans_d3_suspend call. I think you cannot reproduce this because your HW/FW is different. Regarding error see below. 2. In case of iwlagn_suspend returns error (not my case, just analysis) we will have next flow: iwlagn_mac_suspend iwlagn_suspend iwl_trans_stop_device iwl_trans_start_hw ... out: iwl_trans_stop_device error: iwlagn_prepare_restart iwl_down iwl_trans_stop_device Two calls of iwl_trans_stop_device looks redundant.
Ok - so you don't go through the error path, that was my question. Good. I fixed my patch as you can see in https://git.kernel.org/cgit/linux/kernel/git/iwlwifi/backport-iwlwifi.git/ I'll push this fix out after the merge window. I am now closing the bug. Many thanks for your report and the help!
Great work, thanks!
ah - the repository above doesn't include dvm. Anyway, this will be fixed in 4.5 and backported to linux-stable as needed.
Could there be a regression in kernel 4.6: I found this problem in Fedora 24, kernel 4.6 Linux cba 4.6.4-301.fc24.x86_64 #1 SMP Tue Jul 12 11:50:00 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux This was working fine with Fedora 23, kernel 4.5 Linux abc 4.5.7-200.fc23.x86_64 #1 SMP Wed Jun 8 17:41:50 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux With kernel 4.6 and WOWLAN enabled, the system fails to suspend to RAM: (journalctl -r): Aug 01 18:29:46 cba kernel: PM: Some devices failed to suspend, or early wake event detected Aug 01 18:29:46 cba kernel: sd 1:0:0:0: [sda] Stopping disk Aug 01 18:29:46 cba kernel: PM: Device phy0 failed to suspend async: error -16 Aug 01 18:29:46 cba kernel: dpm_run_callback(): wiphy_suspend+0x0/0x240 [cfg80211] returns -16 After doing: sudo iw phy phy0 wowlan disable magic-packet then kernel 4.6 can suspend normally. Could it be Fedora's kernel missing this patch?
I doubt it. The best would be if you could install the backport tree and bisect. Can you do that?