Bug 88961 - iwlwifi: dvm: NMI_WDG since "drop non VO frames when flushing" - MWG100223326
Status: CLOSED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: network-wireless
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_network-wireless@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-11-27 21:06 UTC by Vasyl Demin
Modified: 2014-12-01 19:17 UTC

See Also:
Kernel Version: 3.17.3
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
dmesg (120.76 KB, text/plain)
2014-11-27 21:06 UTC, Vasyl Demin
trace-cmd record -e iwlwifi (2.02 MB, application/x-ns-proxy-autoconfig)
2014-11-30 19:06 UTC, Vasyl Demin
dmesg (147.98 KB, text/plain)
2014-11-30 19:12 UTC, Vasyl Demin
final version of the fix (5.99 KB, patch)
2014-12-01 07:53 UTC, Emmanuel Grumbach

Description Vasyl Demin 2014-11-27 21:06:13 UTC
Created attachment 158991
dmesg

OS: Archlinux x86_64
Notebook: Lenovo Thinkpad T510
Wireless: 03:00.0 Network controller: Intel Corporation WiMAX/WiFi Link 5150
        Subsystem: Intel Corporation WiMAX/WiFi Link 5150 ABG
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0, Cache Line Size: 64 bytes
        Interrupt: pin A routed to IRQ 28
        Region 0: Memory at f8100000 (64-bit, non-prefetchable) [size=8K]
        Capabilities: <access denied>
        Kernel driver in use: iwlwifi
        Kernel modules: iwlwifi
Affected kernel: 3.17.3, 3.17.4

The wireless connection became unstable after upgrading the kernel to 3.17.3. Dmesg fills with "Microcode SW error detected.  Restarting 0x2000000.". I have used Arch Linux on this notebook without problems for the last three years, so it is definitely not a firmware problem. The culprit is the commit below.

commit 989d3425e2b1ecf7f24635faa09f13d8a6e90085
Author: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Date:   Sun Oct 5 09:11:14 2014 +0300

    iwlwifi: dvm: drop non VO frames when flushing

    commit a0855054e59b0c5b2b00237fdb5147f7bcc18efb upstream.

After reverting this, the wireless is working properly again. Seems I'm not alone in facing this problem:
https://bugzilla.kernel.org/show_bug.cgi?id=56581#c148
Comment 1 Emmanuel Grumbach 2014-11-29 20:01:02 UTC
The commit you point to is a workaround for the bug you also mentioned (it is mentioned in the commit).

This commit helped many of the reporters of that bug.
The specific comment you point to has nothing to do with your problem. You have a different firmware problem.

I will try to see what I can do tomorrow - but I am afraid that reverting 989d3425e2b1ecf7f24635faa09f13d8a6e90085 is the only thing I can do and that seems unlikely because it did help a lot of users.
Comment 2 Vasyl Demin 2014-11-29 20:16:49 UTC
Thanks for the reply!

Reverting is not necessary. Let's try to find another solution, I'm ready to provide any additional info.
Comment 3 Emmanuel Grumbach 2014-11-30 12:36:03 UTC
Please run tracing while this is happening:

sudo trace-cmd record -e iwlwifi

You'll get a trace.dat file - this is the file I need.
Keep it recording for a few seconds once you have seen at least one microcode error.

You don't need to recompile anything for that - I can see that your driver is already compiled with TRACING enabled.
Comment 4 Vasyl Demin 2014-11-30 19:06:18 UTC
Created attachment 159231
trace-cmd record -e iwlwifi

~10s before the error and ~10s after
Comment 5 Vasyl Demin 2014-11-30 19:12:41 UTC
Created attachment 159241
dmesg
Comment 6 Vasyl Demin 2014-11-30 19:13:21 UTC
Maybe it does not matter, but the error repeats exactly every 2 minutes; see the new dmesg.
Comment 7 Emmanuel Grumbach 2014-11-30 19:55:43 UTC
can you please try this?

diff --git a/drivers/net/wireless/iwlwifi/dvm/lib.c b/drivers/net/wireless/iwlwifi/dvm/lib.c
index 02e4ede..96a79b1 100644
--- a/drivers/net/wireless/iwlwifi/dvm/lib.c
+++ b/drivers/net/wireless/iwlwifi/dvm/lib.c
@@ -140,7 +140,7 @@ int iwlagn_txfifo_flush(struct iwl_priv *priv, u32 scd_q_msk)
        struct iwl_txfifo_flush_cmd flush_cmd;
        struct iwl_host_cmd cmd = {
                .id = REPLY_TXFIFO_FLUSH,
-               .len = { sizeof(struct iwl_txfifo_flush_cmd), },
+               .len = { sizeof(struct iwl_txfifo_flush_cmd) - 4, },
                .data = { &flush_cmd, },
        };
Comment 8 Emmanuel Grumbach 2014-11-30 20:09:55 UTC
Sorry - bad patch.

Please try this instead:


diff --git a/drivers/net/wireless/iwlwifi/dvm/commands.h b/drivers/net/wireless/iwlwifi/dvm/commands.h
index 751ae1d..ddc441f 100644
--- a/drivers/net/wireless/iwlwifi/dvm/commands.h
+++ b/drivers/net/wireless/iwlwifi/dvm/commands.h
@@ -1006,9 +1006,8 @@ struct iwl_rem_sta_cmd {
  *     2: Dump all FIFO
  */
 struct iwl_txfifo_flush_cmd {
-       __le32 queue_control;
+       __le16 queue_control;
        __le16 flush_control;
-       __le16 reserved;
 } __packed;

 /*
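
[Editorial note] The two layouts of struct iwl_txfifo_flush_cmd in the patch above can be compared with a small userspace sketch. It is only an illustration under stated assumptions: uint32_t and uint16_t stand in for the kernel's __le32 and __le16, __attribute__((packed)) stands in for __packed, and the names flush_cmd_old, flush_cmd_new and print_flush_cmd_layouts are hypothetical. It shows that the old layout is 8 bytes with flush_control at offset 4, while the new layout is 4 bytes with flush_control at offset 2. Truncating the old layout to 4 bytes, as the earlier ".len = sizeof(...) - 4" attempt did, would therefore send only queue_control and cut off flush_control, which may be why that attempt was withdrawn.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Userspace stand-ins for the two layouts in the patch. Assumption:
 * uint32_t/uint16_t replace the kernel's __le32/__le16, and the struct
 * names below are hypothetical, not kernel identifiers.
 */

/* Layout before the patch: 32-bit queue mask plus a reserved field. */
struct flush_cmd_old {
	uint32_t queue_control;
	uint16_t flush_control;
	uint16_t reserved;
} __attribute__((packed));

/* Layout after the patch: 16-bit queue mask, no reserved field. */
struct flush_cmd_new {
	uint16_t queue_control;
	uint16_t flush_control;
} __attribute__((packed));

/* Compile-time checks of the sizes and offsets discussed above. */
_Static_assert(sizeof(struct flush_cmd_old) == 8, "old layout is 8 bytes");
_Static_assert(sizeof(struct flush_cmd_new) == 4, "new layout is 4 bytes");
_Static_assert(offsetof(struct flush_cmd_old, flush_control) == 4,
	       "old flush_control lies beyond the first 4 bytes");
_Static_assert(offsetof(struct flush_cmd_new, flush_control) == 2,
	       "new flush_control lies within the first 4 bytes");

/* Print both layouts so the difference can be inspected at run time. */
static void print_flush_cmd_layouts(void)
{
	printf("old: size=%zu, flush_control offset=%zu\n",
	       sizeof(struct flush_cmd_old),
	       offsetof(struct flush_cmd_old, flush_control));
	printf("new: size=%zu, flush_control offset=%zu\n",
	       sizeof(struct flush_cmd_new),
	       offsetof(struct flush_cmd_new, flush_control));
}
```

Calling print_flush_cmd_layouts() from a main() is enough to see both layouts. Which layout a given firmware actually expects cannot be derived from this sketch; the thread only establishes that the 4-byte layout stopped the errors on the reporter's 5150.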
Comment 9 Vasyl Demin 2014-11-30 22:53:20 UTC
The last patch did the trick, no errors in dmesg and no wlan problems.
Comment 10 Emmanuel Grumbach 2014-12-01 07:53:48 UTC
Created attachment 159301
final version of the fix

Please test this final version of the fix

thank you.
Comment 11 Emmanuel Grumbach 2014-12-01 14:55:30 UTC
I'm sorry to push here, but 3.19 will be cut very soon, and I'd like this one to make it :)
Comment 12 Vasyl Demin 2014-12-01 19:08:13 UTC
Sorry, I was a little busy today. This patch works too.

Emmanuel, thanks for your help!
Comment 13 Emmanuel Grumbach 2014-12-01 19:17:05 UTC
thank you for your time!