Bug 109591 - iwlwifi: dvm: Cannot suspend PC to RAM with WoWLAN enabled - MWG100252556
Summary: iwlwifi: dvm: Cannot suspend PC to RAM with WoWLAN enabled - MWG100252556
Status: CLOSED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: network-wireless (show other bugs)
Hardware: Intel Linux
: P1 normal
Assignee: DO NOT USE - assign "network-wireless-intel" component instead
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-12-18 21:04 UTC by Bogdan Bogush
Modified: 2016-08-02 02:30 UTC (History)
3 users (show)

See Also:
Kernel Version: 4.4.0-rc4
Subsystem:
Regression: No
Bisected commit-id:


Attachments
dmesg.txt (159.47 KB, text/plain)
2015-12-18 21:04 UTC, Bogdan Bogush
Details
fix (1.93 KB, patch)
2015-12-20 06:52 UTC, Emmanuel Grumbach
Details | Diff

Description Bogdan Bogush 2015-12-18 21:04:34 UTC
Created attachment 197721 [details]
dmesg.txt

Hi,

PC fails suspend to RAM with WoWLAN enabled. After that WiFi stops working. The logs indicate that firmware operation stopped.

Scenario:
1. iw phy0 wowlan enable magic-packet
2. echo mem > /sys/power/state
3. Result: PC not suspended. Logs show the warning "Can't start_fw since the HW hasn't been started"

HW: Intel(R) Centrino(R) Wireless-N 130
Module: iwlwifi
FW version: 18.168.6.1
Comment 1 Emmanuel Grumbach 2015-12-19 19:58:54 UTC
Hi,

please record tracing:

sudo trace-cmd record -e iwlwifi -e iwlwifi_msg -e mac80211 -e cfg80211

Compress the result and attach it to the bug.
Thank you.

Please note that this flow is not used by most of the users, so this bug will get a lower priority than others.
Comment 2 Bogdan Bogush 2015-12-20 00:31:20 UTC
I haven't attached compressed trace because its too big (10MB) but I've debugged this problem and discovered next flow:

iwlagn_suspend
  iwl_trans_stop_device
    _iwl_trans_stop_device
      _iwl_trans_pcie_stop_device
      trans_pcie->is_down = true;
  iwl_load_ucode_wait_alive
    iwl_trans_start_fw
      iwl_trans_pcie_start_fw
      if (trans_pcie->is_down)
        IWL_WARN(trans, "Can't start_fw since the HW hasn't been started\n");

When suspend is called _iwl_trans_pcie_stop_device sets flag trans_pcie->is_down. As result iwl_trans_pcie_start_fw fails with warning. After I reverted commit  fa9f3281cbb1075545d4528c84059a3f4e117b44 PC can suspend to RAM.
Comment 3 Emmanuel Grumbach 2015-12-20 04:56:04 UTC
Ok - that's really helpful. I'll take a look.
Thanks.
Comment 4 Emmanuel Grumbach 2015-12-20 06:52:13 UTC
Created attachment 197831 [details]
fix

Hi,


here is a fix, please test.
Comment 5 Emmanuel Grumbach 2015-12-20 15:45:00 UTC
Patch is needed internally.
I am waiting until I get your approval to send the pull request.
Comment 6 Bogdan Bogush 2015-12-20 17:51:58 UTC
Thank you for patch. I've tested it on my system. PC can suspend to RAM but  received dump below. 

[  190.766598] PM: Syncing filesystems ... done.
[  191.602038] PM: Preparing system for sleep (mem)
[  191.602917] Freezing user space processes ... (elapsed 0.002 seconds) done.
[  191.605113] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[  191.606470] PM: Suspending system (mem)
[  191.606496] Suspending console(s) (use no_console_suspend to debug)
[  191.606820] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[  191.607492] iwlwifi 0000:05:00.0: L1 Enabled - LTR Disabled
[  191.614206] iwlwifi 0000:05:00.0: L1 Enabled - LTR Disabled
[  191.614298] iwlwifi 0000:05:00.0: Radio type=0x2-0x2-0x1
[  191.667033] ------------[ cut here ]------------
[  191.667033] WARNING: CPU: 0 PID: 1025 at drivers/net/wireless/iwlwifi/pcie/trans.c:1543 iwl_trans_pcie_grab_nic_access+0xec/0x100 [iwlwifi]()
[  191.667033] Timeout waiting for hardware access (CSR_GP_CNTRL 0x080003d8)
[  191.667033] Modules linked in: iwldvm iwlwifi fuse
[  191.667033] CPU: 0 PID: 1025 Comm: kworker/u4:8 Not tainted 4.4.0-rc4+ #4
[  191.667033] Hardware name: SAMSUNG ELECTRONICS CO., LTD. NC210/NC110                /NC210/NC110                , BIOS 01VG.M013.20110704.RHU 07/04/2011
[  191.667033] Workqueue: events_unbound async_run_entry_fn
[  191.667033]  00000000 00000000 f52a5d30 c12c60ce f52a5d70 f52a5d60 c104a472 f88c549c
[  191.667033]  f52a5d8c 00000401 f88c51a8 00000607 f88b688c f88b688c f2388000 f2389c8c
[  191.667033]  f52a5dac f52a5d78 c104a4ce 00000009 f52a5d70 f88c549c f52a5d8c f52a5da0
[  191.667033] Call Trace:
[  191.667033]  [<c12c60ce>] dump_stack+0x41/0x53
[  191.667033]  [<c104a472>] warn_slowpath_common+0x82/0xb0
[  191.667033]  [<f88b688c>] ? iwl_trans_pcie_grab_nic_access+0xec/0x100 [iwlwifi]
[  191.667033]  [<f88b688c>] ? iwl_trans_pcie_grab_nic_access+0xec/0x100 [iwlwifi]
[  191.667033]  [<c104a4ce>] warn_slowpath_fmt+0x2e/0x30
[  191.667033]  [<f88b688c>] iwl_trans_pcie_grab_nic_access+0xec/0x100 [iwlwifi]
[  191.667033]  [<f88a927a>] iwl_write_direct32+0x1a/0x40 [iwlwifi]
[  191.667033]  [<f88b25af>] iwl_trans_pcie_tx_reset+0x5f/0xe0 [iwlwifi]
[  191.667033]  [<f88b6798>] ? iwl_trans_pcie_set_bits_mask+0x48/0x50 [iwlwifi]
[  191.667033]  [<f88b6f5f>] iwl_trans_pcie_d3_suspend+0x10f/0x120 [iwlwifi]
[  191.667033]  [<f8a7516b>] iwlagn_mac_suspend+0xfb/0x150 [iwldvm]
[  191.667033]  [<c182556e>] __ieee80211_suspend+0x1fe/0x410
[  191.667033]  [<c17f1da1>] ieee80211_suspend+0x11/0x20
[  191.667033]  [<c17964a2>] wiphy_suspend+0x72/0x1f0
[  191.667033]  [<c1474ab3>] dpm_run_callback+0x33/0xf0
[  191.667033]  [<c1796430>] ? addresses_show+0x90/0x90
[  191.667033]  [<c14755b9>] __device_suspend+0xe9/0x290
[  191.667033]  [<c147577a>] async_suspend+0x1a/0x90
[  191.667033]  [<c1064511>] async_run_entry_fn+0x31/0xd0
[  191.667033]  [<c105da8d>] process_one_work+0x10d/0x350
[  191.667033]  [<c105ddbd>] worker_thread+0xed/0x410
[  191.667033]  [<c105dcd0>] ? process_one_work+0x350/0x350
[  191.667033]  [<c106214c>] kthread+0x9c/0xb0
[  191.667033]  [<c1836209>] ret_from_kernel_thread+0x21/0x38
[  191.667033]  [<c10620b0>] ? kthread_park+0x40/0x40
[  191.667033] ---[ end trace 09fd0fecba534dd5 ]---
[  191.720430] sd 0:0:0:0: [sda] Stopping disk
[  192.731738] queueing ieee80211 work while going to suspend
[  192.935251] iwlwifi 0000:05:00.0: iwl_trans_txq_enable_cfg bad state = 0
[  193.960585] PM: suspend of devices complete after 2353.891 msecs
[  193.972094] PM: late suspend of devices complete after 11.498 msecs
[  193.972619] pcieport 0000:00:1c.2: System wakeup enabled by ACPI
[  193.972932] pcieport 0000:00:1c.0: System wakeup enabled by ACPI
[  193.984090] PM: noirq suspend of devices complete after 11.987 msecs
[  193.984138] ACPI: Preparing to enter system sleep state S3
[  193.989673] ACPI : EC: EC stopped
[  193.989675] PM: Saving platform NVS memory
[  193.989684] Disabling non-boot CPUs ...
[  193.989905] Broke affinity for irq 32
[  193.991030] smpboot: CPU 1 is now offline
Comment 7 Emmanuel Grumbach 2015-12-20 18:02:37 UTC
Does this help?

diff --git a/drivers/net/wireless/intel/iwlwifi/dvm/mac80211.c b/drivers/net/wireless/intel/iwlwifi/dvm/mac80211.c
index 7422ed0..38b415f 100644
--- a/drivers/net/wireless/intel/iwlwifi/dvm/mac80211.c
+++ b/drivers/net/wireless/intel/iwlwifi/dvm/mac80211.c
@@ -394,12 +394,12 @@ static int iwlagn_mac_suspend(struct ieee80211_hw *hw,
        if (ret)
                goto error;
 
+       iwl_trans_d3_suspend(priv->trans, false);
+
        /* let the ucode operate on its own */
        iwl_write32(priv->trans, CSR_UCODE_DRV_GP1_SET,
                    CSR_UCODE_DRV_GP1_BIT_D3_CFG_COMPLETE);
 
-       iwl_trans_d3_suspend(priv->trans, false);
-
        goto out;
 
  error:



---

Sorry - I don't have a setup with this type of device right now. I can build one, but it might take a bit of time.
Comment 8 Bogdan Bogush 2015-12-20 19:08:47 UTC
The same dump. 

I've just tried configure WoWLAN on older kernels starting from 3.2. There no suspend problem but the feature does not work though FW reports support of it (priv->fw->img[IWL_UCODE_WOWLAN].sec[0].len).
Comment 9 Emmanuel Grumbach 2015-12-20 19:29:55 UTC
Well... This feature hasn't been tested for a long time...
Comment 10 Bogdan Bogush 2015-12-20 21:04:40 UTC
Regarding WoWLAN problem it can be BIOS/HW related because Ethernet WoL also does not work for me. Lets put it aside and just fix problem with suspend.
Comment 11 Emmanuel Grumbach 2015-12-20 21:10:09 UTC
so we are good with my patch, right?

Or the WARNING is still biting you?
You seemed to say that 3.2 was fine?
Do you know what latest kernel version was fine?
Comment 12 Bogdan Bogush 2015-12-20 21:55:17 UTC
The patch does not fix the bug completely. The warning (Comment 6) is still present. 
Before fa9f3281cbb1075545d4528c84059a3f4e117b44 suspend with WoWLAN enabled worked correctly, so latest version is v4.3-rc1. 

I my previous post I meant that on older then 4.3 rc1 kernels I can configure WoWLAN feature and go to suspend state but I cannot test resume upon receiving magic packet due to FW/HW/BIOS related problem.
Comment 13 Emmanuel Grumbach 2015-12-22 14:47:08 UTC
I cannot reproduce the bug you are seeing.
I built a setup with a slightly different device though.
Comment 14 Bogdan Bogush 2015-12-22 21:11:06 UTC
Weird, it looks as general code. Is this top of branch kernel, CONFIG_PM_SLEEP set, iw phy0 info | grep WoWLAN -A 7?
Comment 15 Emmanuel Grumbach 2015-12-22 21:14:51 UTC
I could reproduce the original bug. And I could see that I fixed it.
I couldn't reproduce the problem for comment 6.
I did enable WoWLAN and I saw that the WoWLAN actually runs.
Comment 16 Bogdan Bogush 2015-12-22 21:25:51 UTC
I will debug problem in comment 6 share results a bit later.
Comment 17 Emmanuel Grumbach 2015-12-22 21:31:28 UTC
Thanks.
Seems a non trivial hardware configuration thing though. We try to wake up the device and it claims it goes to sleep.
Comment 18 Bogdan Bogush 2015-12-27 16:17:36 UTC
Problem in comment 6 is not reproduced after I've commented iwl_trans_stop_device call on the "out". Do we need it? In case of error iwl_trans_stop_device is called anyway by iwlagn_mac_suspend -> iwlagn_prepare_restart -> iwl_down.

diff --git a/drivers/net/wireless/iwlwifi/dvm/lib.c b/drivers/net/wireless/iwlwifi/dvm/lib.c
index e18629a..4b09fb4 100644
--- a/drivers/net/wireless/iwlwifi/dvm/lib.c
+++ b/drivers/net/wireless/iwlwifi/dvm/lib.c
@@ -1154,6 +1154,9 @@ int iwlagn_suspend(struct iwl_priv *priv, struct cfg80211_wowlan *wowlan)
 
        priv->ucode_loaded = false;
        iwl_trans_stop_device(priv->trans);
+       ret = iwl_trans_start_hw(priv->trans);
+       if (ret)
+               goto out;
 
        priv->wowlan = true;
 
@@ -1247,6 +1250,7 @@ int iwlagn_suspend(struct iwl_priv *priv, struct cfg80211_wowlan *wowlan)
 
        ret = iwlagn_send_patterns(priv, wowlan);
  out:
+       // iwl_trans_stop_device(priv->trans);
        kfree(key_data.rsc_tsc);
        return ret;
 }
Comment 19 Emmanuel Grumbach 2015-12-27 21:01:07 UTC
You are perfectly right. I can remove this call.

Thank you for your work. So you were going to the failure path. This is why I couldn't reproduce on my setup.

Thanks again.
Comment 20 Emmanuel Grumbach 2015-12-27 21:06:10 UTC
Wait - I am missing something.
In case of an error, we shouldn't be calling iwl_trans_d3_suspend, but this function appears in your callstack?

        ret = iwlagn_suspend(priv, wowlan);
        if (ret)
                goto error;

        /* let the ucode operate on its own */
        iwl_write32(priv->trans, CSR_UCODE_DRV_GP1_SET,
                    CSR_UCODE_DRV_GP1_BIT_D3_CFG_COMPLETE);

        iwl_trans_d3_suspend(priv->trans, false);
Comment 21 Bogdan Bogush 2015-12-28 20:39:32 UTC
You have got it wrong. Please consider additional info:

1. flow:
iwlagn_mac_suspend
  iwlagn_suspend
    iwl_trans_stop_device
    iwl_trans_start_hw
    ...
    out:
    iwl_trans_stop_device
  iwl_trans_d3_suspend

If iwl_trans_stop_device is called at the end of iwlagn_suspend my HW became inaccessible and dump is generated on iwl_trans_d3_suspend call. I think you cannot reproduce this because your HW/FW is different. Regarding error see below.

2. In case of iwlagn_suspend returns error (not my case, just analysis) we will have next flow:
iwlagn_mac_suspend
  iwlagn_suspend
    iwl_trans_stop_device
    iwl_trans_start_hw
    ...
    out:
    iwl_trans_stop_device
  error:
  iwlagn_prepare_restart
    iwl_down
      iwl_trans_stop_device

Two calls of iwl_trans_stop_device looks redundant.
Comment 22 Emmanuel Grumbach 2015-12-28 20:55:43 UTC
Ok - so you don't go through the error path, that was my question.
Good.

I fixed my patch as you can see in https://git.kernel.org/cgit/linux/kernel/git/iwlwifi/backport-iwlwifi.git/

I'll push this fix out after the merge window.
I am now closing the bug.

Many thanks for your report and the help!
Comment 23 Bogdan Bogush 2015-12-28 21:00:59 UTC
Great work, thanks!
Comment 24 Emmanuel Grumbach 2015-12-28 21:05:07 UTC
ah - the repository above doesn't include dvm.

Anyway, this will be fixed in 4.5 and backported to linux-stable as needed.
Comment 25 Ariel 2016-08-01 23:35:33 UTC
Could there be a regression in kernel 4.6:

I found this problem in Fedora 24, kernel 4.6

Linux cba 4.6.4-301.fc24.x86_64 #1 SMP Tue Jul 12 11:50:00 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

This was working fine with Fedora 23, kernel 4.5

Linux abc 4.5.7-200.fc23.x86_64 #1 SMP Wed Jun 8 17:41:50 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux


With kernel 4.6 and WOWLAN enabled, the system fails to suspend to RAM:

(journalctl -r):

Aug 01 18:29:46 cba kernel: PM: Some devices failed to suspend, or early wake event detected
Aug 01 18:29:46 cba kernel: sd 1:0:0:0: [sda] Stopping disk
Aug 01 18:29:46 cba kernel: PM: Device phy0 failed to suspend async: error -16
Aug 01 18:29:46 cba kernel: dpm_run_callback(): wiphy_suspend+0x0/0x240 [cfg80211] returns -16


After doing:

sudo iw phy phy0 wowlan disable magic-packet


then kernel 4.6 can suspend normally. Could it be Fedora's kernel missing this patch?
Comment 26 Emmanuel Grumbach 2016-08-02 02:30:55 UTC
I doubt it.
The best would be if you could install the backport tree and bisect.
Can you do that?

Note You need to log in before you can comment on or make changes to this bug.