Bug 89771 - iwlwifi 7260: firmware crashes repeatedly on AC network
Status: CLOSED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: network-wireless
Hardware: Intel Linux
Importance: P1 blocking
Assignee: drivers_network-wireless@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2014-12-14 18:44 UTC by Sebastian Jug
Modified: 2015-01-12 06:05 UTC

See Also:
Kernel Version: 3.17.6-1-ARCH
Subsystem:
Regression: No
Bisected commit-id:


Attachments
dmesg (151.82 KB, application/octet-stream)
2014-12-14 18:44 UTC, Sebastian Jug
Output from dmesg with new -10 driver as suggested. (55.01 KB, application/octet-stream)
2014-12-14 20:06 UTC, Sebastian Jug
Output from dmesg with new -10 driver as suggested. (67.83 KB, application/octet-stream)
2014-12-14 20:09 UTC, Sebastian Jug
Photo of dmesg output (2.70 MB, image/jpeg)
2014-12-14 20:11 UTC, Sebastian Jug
dmesg from the iwlwifi-9 ucode (54.93 KB, application/octet-stream)
2014-12-15 02:10 UTC, Sebastian Jug
DEBUGFS Kernel Config (152.21 KB, application/octet-stream)
2014-12-18 02:57 UTC, Sebastian Jug
Core7 FW with uSniffer (661.26 KB, application/octet-stream)
2014-12-18 06:09 UTC, Emmanuel Grumbach
dmesg with new kernel debug as well as new ucode-10 (61.55 KB, application/octet-stream)
2015-01-11 16:01 UTC, Sebastian Jug

Description Sebastian Jug 2014-12-14 18:44:31 UTC
Created attachment 160571 [details]
dmesg

Hello there,

Right now I am running a stock Arch Linux kernel, so I do not have tracing, monitoring or debugfs. That being said, I will do whatever I can to help resolve these issues.

I have a brand new Lenovo X1 Carbon Gen 2 and the laptop is totally unusable over the wireless network. I initially created a thread on the Arch Linux forums, but it seems to be a kernel driver issue.

lspci | grep Network:
    03:00.0 Network controller [0280]: Intel Corporation Wireless 7260 [8086:08b2] (rev 83)

lsmod | grep iwlwifi:
    iwlwifi               156837  1 iwlmvm
    cfg80211              445286  3 iwlwifi,mac80211,iwlmvm

modinfo iwlwifi | grep -e 7260 -e version:
    version:        in-tree:
    firmware:       iwlwifi-7260-9.ucode
    srcversion:     B92D41B0FC64FD1196EE1C3
    vermagic:       3.17.6-1-ARCH SMP preempt mod_unload modversions

ls -al /lib/firmware/ | grep 7260:
    -rw-r--r-- 1 root root  672480 Dec  6 09:23 iwlwifi-7260-10.ucode
    -rw-r--r-- 1 root root  683236 Dec  6 09:23 iwlwifi-7260-7.ucode
    -rw-r--r-- 1 root root  679780 Dec  6 09:23 iwlwifi-7260-8.ucode
    -rw-r--r-- 1 root root  680508 Dec  6 09:23 iwlwifi-7260-9.ucode

I've attached a dmesg that demonstrates several issues the driver/card is having:
- Deauthentication due to Reason 15=4WAY_HANDSHAKE_TIMEOUT. This cripples the card on b/g/n, though it is also present on AC.
- Ongoing microcode SW errors.
- The card hangs with "Q X is active and mapped to fifo".
Comment 1 Emmanuel Grumbach 2014-12-14 19:08:06 UTC
Please take the FW from here:
https://git.kernel.org/cgit/linux/kernel/git/egrumbach/linux-firmware.git/tree/iwlwifi-7260-10.ucode?id=bc3cd75fee783721346f2971d777fc39716ce5e2

copy the file to /lib/firmware and let me know if your wifi feels better.

Thanks.
Comment 2 Sebastian Jug 2014-12-14 19:25:39 UTC
(In reply to Emmanuel Grumbach from comment #1)

Thanks Emmanuel, but how can I force the new iwlwifi-7260-10.ucode to be loaded, as by default my kernel is picking up -9?
Comment 3 Emmanuel Grumbach 2014-12-14 19:28:46 UTC
Nope, you are loading -10.ucode. I can see that in the logs.
Comment 4 Sebastian Jug 2014-12-14 19:46:43 UTC
(In reply to Emmanuel Grumbach from comment #3)

Okay, I'll back up the existing -10 and drop in the new version.

Where did you see the -10 being loaded? Not trying to question your expertise, just to learn myself.


modinfo iwlwifi | grep -e 7260
firmware:       iwlwifi-7260-9.ucode

dmesg | grep iwlwifi
[    8.054870] iwlwifi 0000:03:00.0: loaded firmware version 23.10.10.0 op_mode iwlmvm
Comment 5 Emmanuel Grumbach 2014-12-14 20:00:53 UTC
23.10.10.0 is the firmware you actually load. The first line you pasted (from modinfo) shows the firmware we advertise as being OK.
I don't think we can advertise several firmwares. I can check though.
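
For readers who want to check this themselves, here is a minimal sketch that pulls the loaded firmware version out of kernel log text. The function name loaded_fw_version is hypothetical, and the pattern assumes the "loaded firmware version" message format quoted in comment 4.

```shell
# Print the firmware version(s) that iwlwifi reports having loaded.
# Reads kernel log text on stdin, so it works with `dmesg` or a saved log.
# Assumes the message format shown in comment 4; other kernels may differ.
loaded_fw_version() {
    sed -n 's/.*iwlwifi.*loaded firmware version \([0-9.]*\).*/\1/p'
}

# Usage on a live system:
#   dmesg | loaded_fw_version
```

Note that modinfo only lists the firmware name the module advertises, so the kernel log is the place to look for what was actually loaded.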
Comment 6 Emmanuel Grumbach 2014-12-14 20:02:49 UTC
BTW, you can also just rename -10.ucode. The driver will pick -9 up and you should be fine.
Comment 7 Sebastian Jug 2014-12-14 20:06:09 UTC
I have swapped out the firmware as per your instructions. At first glance dmesg is far less polluted, and it seems that iwlwifi isn't writing to dmesg anymore. I'm not sure how that's possible. The connection still hangs as it did before, but now with no apparent logging to dmesg. However, when I reboot I can see that the dmesg logging is still going. I will attach my new dmesg as well as a photo of the "not occurring" errors. Attachments to follow.
Comment 8 Sebastian Jug 2014-12-14 20:06:49 UTC
Created attachment 160591 [details]
Output from dmesg with new -10 driver as suggested.
Comment 9 Sebastian Jug 2014-12-14 20:09:50 UTC
Created attachment 160601 [details]
Output from dmesg with new -10 driver as suggested.

Waited long enough and the errors eventually flowed into dmesg normally.
Comment 10 Sebastian Jug 2014-12-14 20:11:17 UTC
Created attachment 160611 [details]
Photo of dmesg output

This output does not match the output in dmesg, but the connection still hangs.
Comment 11 Emmanuel Grumbach 2014-12-14 20:22:05 UTC
Odd.

Can you please try to remove 10.ucode? You can just rename it.

Thanks.
Comment 12 Sebastian Jug 2014-12-14 20:23:03 UTC
Remove 10.ucode and replace 9.ucode with the NEW-10.ucode? Or revert to original -9?
Comment 13 Emmanuel Grumbach 2014-12-14 20:31:39 UTC
Just remove -10.ucode.

(sorry for not explaining, I am typing on my phone)
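
A minimal sketch of the rename, for anyone following along. The function name disable_fw_10 and the .disabled suffix are arbitrary choices for illustration; the firmware directory defaults to /lib/firmware as used earlier in this bug.

```shell
# Move the -10 firmware aside so the driver falls back to -9 on next load.
# The first argument overrides the firmware directory (default /lib/firmware).
# The ".disabled" suffix is arbitrary; any name the driver does not look for works.
disable_fw_10() {
    fw_dir="${1:-/lib/firmware}"
    src="$fw_dir/iwlwifi-7260-10.ucode"
    if [ -f "$src" ]; then
        mv -- "$src" "$src.disabled"
    fi
}

# Then reload the driver (as root):
#   disable_fw_10 && modprobe -r iwlmvm iwlwifi && modprobe iwlwifi
```

Renaming rather than deleting keeps the file around, so it can be restored later for debugging.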
Comment 14 Sebastian Jug 2014-12-15 02:10:14 UTC
Created attachment 160621 [details]
dmesg from the iwlwifi-9 ucode

Good clean dmesg from -9 firmware.
Comment 15 Sebastian Jug 2014-12-15 02:11:29 UTC
(In reply to Emmanuel Grumbach from comment #13)

Thank you very much for your support @Emmanuel, after a few hours of continuous testing -9 looks very stable.
Comment 16 Emmanuel Grumbach 2014-12-15 07:50:46 UTC
I am glad you now have a stable connection, but I need your help to debug the -10.ucode.

Would it be possible for you to get a kernel with DEBUGFS compiled in?
This would allow us to collect logs from your setup and provide helpful information to the firmware team.
Comment 17 Sebastian Jug 2014-12-15 12:58:40 UTC
(In reply to Emmanuel Grumbach from comment #16)

Of course I'd love to help, not a problem. Should we re-open this ticket?
Comment 18 Emmanuel Grumbach 2014-12-15 13:55:55 UTC
Great, thanks.
The first step is to have a kernel with IWLWIFI_DEBUGFS enabled.
Then I'll give you a special -10.ucode firmware to install. You'll load iwlwifi with fw_monitor=1 and crash the firmware when the issue reproduces.
Comment 19 Sebastian Jug 2014-12-18 02:57:36 UTC
Created attachment 161121 [details]
DEBUGFS Kernel Config

Hey Emmanuel,

Is this kernel config sufficient for iwlwifi debugging?
Comment 20 Sebastian Jug 2014-12-18 04:09:02 UTC
(In reply to Emmanuel Grumbach from comment #18)


I've built the current mainline kernel 3.18 with all IWLWIFI debugging, debugfs, and tracing, so we should be good to go. Sorry I'm a bit slow, it's a busy few weeks for me until the new year.
Comment 21 Emmanuel Grumbach 2014-12-18 06:09:26 UTC
Created attachment 161131 [details]
Core7 FW with uSniffer

Please copy the file attached into /lib/firmware/

Then, reload iwlwifi with fw_monitor=1:

sudo modprobe -r iwlmvm iwlwifi
sudo modprobe iwlwifi fw_monitor=1

Then, when you have networking issues, quickly do (as root):
echo 1 > /sys/kernel/debug/iwlwifi/*/iwlmvm/fw_restart

Then follow the procedure in the "Firmware debugging" section of http://wireless.kernel.org/en/users/Drivers/iwlwifi#Debugging:

cat /sys/kernel/debug/iwlwifi/*/iwlmvm/fw_error_dump > iwl.bin

I'll need the iwl.bin file. It should be around 4 MB. You can compress it.

Please take the time to read the privacy note at the end of this page.

Thank you
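
The two debugfs steps above could be wrapped roughly as follows. This is only a sketch: the function name collect_fw_dump and its arguments are hypothetical, it assumes debugfs is mounted at /sys/kernel/debug, and it must run as root on a real system.

```shell
# Trigger a firmware restart and save the resulting error dump.
# $1: iwlwifi debugfs root (default /sys/kernel/debug/iwlwifi)
# $2: output file (default iwl.bin)
# Uses the first device directory that exposes iwlmvm/fw_restart.
collect_fw_dump() {
    debugfs_root="${1:-/sys/kernel/debug/iwlwifi}"
    out="${2:-iwl.bin}"
    for dev in "$debugfs_root"/*/iwlmvm; do
        [ -e "$dev/fw_restart" ] || continue
        echo 1 > "$dev/fw_restart"          # crash the firmware on purpose
        cat "$dev/fw_error_dump" > "$out"   # save the dump for the bug report
        return 0
    done
    echo "no iwlmvm debugfs directory found under $debugfs_root" >&2
    return 1
}

# Usage (as root, once the networking issue shows up):
#   collect_fw_dump && xz iwl.bin
```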
Comment 22 Emmanuel Grumbach 2014-12-29 11:26:07 UTC
I understand that it is the holiday period.

Do you plan to provide the required input or should I close the issue?

Thank you.
Comment 23 Sebastian Jug 2014-12-31 03:34:21 UTC
(In reply to Emmanuel Grumbach from comment #22)
> I understand that it is holiday period.
> 
> Do you plan to provide the required input or should I close the issue?
> 
> Thank you.

Hello Emmanuel,

I've just returned to the country within the hour. I do plan on getting you the required input ASAP. Sorry for the delay.
Comment 24 Emilien Richard 2015-01-05 21:36:54 UTC
Hello,

I have the same issues with the 3.17.6-1-ARCH kernel. I have a Lenovo X240 with an Intel 7260AN network controller.

What can I do to help you?
Comment 25 Emmanuel Grumbach 2015-01-06 05:53:15 UTC
@Emilien

Please open a new bug. I prefer not to mix two issues unless I am completely sure they are identical.

In this new bug, please attach your dmesg output.

Thanks
Comment 26 Sebastian Jug 2015-01-11 16:01:54 UTC
Created attachment 163201 [details]
dmesg with new kernel debug as well as new ucode-10

After reloading the module with the monitor parameter enabled I'm unable to connect to my wifi network. Any suggestions?
Comment 27 Sebastian Jug 2015-01-11 16:05:24 UTC
Hey Emmanuel, 

I ran the two commands to reload the iwlwifi module as per above and the outcome is after the 84.048859 timestamp in the dmesg attached.
Comment 28 Emmanuel Grumbach 2015-01-11 17:37:24 UTC
Hi,

This is not related to the fw_monitor module parameter, but it is worrying...
Did you use the firmware I attached to this bug (Core7 FW with uSniffer)?
Comment 29 Sebastian Jug 2015-01-11 18:29:47 UTC
(In reply to Emmanuel Grumbach from comment #28)

Yes sir, I followed all instructions, including copying the Core7 FW with uSniffer to /lib/firmware/. When I boot up with the new firmware, I have no issues connecting or reconnecting to the network. However, after the modprobe with the parameter I'm no longer able to connect.
Comment 30 Emmanuel Grumbach 2015-01-11 18:39:00 UTC
Are you sure you aren't having conflicts between two user space applications trying to associate?

I guess you are using the supplicant. Did you kill the supplicant after reloading the iwlwifi module?

Another option is to add fw_monitor=1 to /etc/modprobe.d/iwlwifi.conf and reboot.
This avoids having to reload iwlwifi.
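
As a sketch of the config-file route: modprobe.d files take lines of the form "options <module> <parameter>=<value>", so the entry would be "options iwlwifi fw_monitor=1". The helper name set_fw_monitor below is hypothetical.

```shell
# Add the fw_monitor option to the modprobe config, once.
# $1: config file (default /etc/modprobe.d/iwlwifi.conf)
set_fw_monitor() {
    conf="${1:-/etc/modprobe.d/iwlwifi.conf}"
    line="options iwlwifi fw_monitor=1"
    # Append only if the exact line is not already present.
    grep -qxF "$line" "$conf" 2>/dev/null || printf '%s\n' "$line" >> "$conf"
}

# Usage (as root), then reboot:
#   set_fw_monitor
```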
Comment 31 Sebastian Jug 2015-01-11 22:04:12 UTC
(In reply to Emmanuel Grumbach from comment #30)
> are you sure you aren't having conflicts between 2 user space applications
> trying to associate?
> 
> I guess you are using the supplicant, did you kill the supplicant after
> having reloaded the iwlwifi module?
> 
> another option is to add fw_monitor=1 to /etc/modprobe.d/iwlwifi.conf and
> reboot.
> This avoids to reload iwlwifi

Hey Emmanuel,

I've added the fw_monitor parameter to the conf file as you suggested and I'm running the firmware you provided, but I am unable to reproduce the issue. However, I'm also running the latest mainline kernel, so perhaps it was a kernel-specific issue? Is that possible, given that nothing has changed other than the firmware and the kernel?
Comment 32 Emmanuel Grumbach 2015-01-12 06:04:53 UTC
Hi,

Yes - the latest kernel might have improved a few things.

In any case, 23.11.10.0 hit linux-firmware.git quite a while ago. I hope Arch will ship it soon.

I will close this bug for now.

Thanks for your help!
