Bug 108271 - iwlwifi-7265: connection freezes with firmware version 17.246894.0
Status: CLOSED DUPLICATE of bug 107471
Alias: None
Product: Drivers
Classification: Unclassified
Component: network-wireless
Hardware: Intel Linux
Importance: P1 normal
Assignee: DO NOT USE - assign "network-wireless-intel" component instead
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2015-11-21 21:33 UTC by Stijn Tintel
Modified: 2015-12-21 08:01 UTC
CC List: 0 users

See Also:
Kernel Version: 4.3.0
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
trace.dat (601.67 KB, application/x-xz)
2015-11-26 20:07 UTC, Stijn Tintel
trace.dat with uapsd_disable=1 (4.22 MB, application/x-ns-proxy-autoconfig)
2015-11-30 20:36 UTC, Stijn Tintel
trace.dat with uapsd_disable=1 (621.94 KB, application/x-xz)
2015-12-04 19:13 UTC, Stijn Tintel

Description Stijn Tintel 2015-11-21 21:33:29 UTC
A few days ago I noticed there were new firmware versions available for iwlwifi-7265, so I installed them on my machine. I was using version 15.195093.0 before, which was relatively stable. After reloading the kernel modules, the v17 firmware was used:

iwlwifi 0000:02:00.0: api_index larger than supported by driver
iwlwifi 0000:02:00.0: loaded firmware version 17.246894.0 op_mode iwlmvm

After a few hours I noticed my connection froze: the interface was still associated, but no traffic got through. An rfkill block followed by an rfkill unblock (via Fn+PrtScr on my XPS13) fixed it:

nov 19 06:22:36 sylvester.nomad.adlevio.net kernel: iwlwifi 0000:02:00.0: RF_KILL bit toggled to disable radio.
nov 19 06:22:36 sylvester.nomad.adlevio.net kernel: iwlwifi 0000:02:00.0: RF_KILL bit toggled to enable radio.
nov 19 06:22:40 sylvester.nomad.adlevio.net kernel: wlan0: associated

This happened a 2nd time:

nov 19 12:38:01 sylvester.nomad.adlevio.net kernel: wlan0: associated
nov 19 15:26:07 sylvester.nomad.adlevio.net kernel: iwlwifi 0000:02:00.0: RF_KILL bit toggled to disable radio.
nov 19 15:26:07 sylvester.nomad.adlevio.net kernel: iwlwifi 0000:02:00.0: RF_KILL bit toggled to enable radio.
nov 19 15:26:11 sylvester.nomad.adlevio.net kernel: wlan0: associated

I had this problem before with the v14 firmware (that was much worse, happened after a few minutes, sometimes even less than a minute), so I immediately suspected the new firmware to be the problem.

I then removed the iwlwifi-7265D-17.ucode file and reloaded the kernel modules again. Since then it has been using version 16.242414.0. I have not seen the problem with this firmware in the last 2 days.
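The fallback from v17 to v16 after deleting the -17.ucode file is consistent with the driver trying firmware files from its highest supported API number downward. A minimal Python sketch, assuming the files live in /lib/firmware and follow the iwlwifi-7265D-<api>.ucode naming seen above (the helper name and the directory default are illustrative, not part of the driver), lists which API numbers are installed so it is clear what a module reload can pick up:

```python
import re
from pathlib import Path

# Assumption: firmware files follow the iwlwifi-7265D-<api>.ucode pattern
# mentioned in this report; /lib/firmware is a common default location.
UCODE_RE = re.compile(r"^iwlwifi-7265D-(\d+)\.ucode$")


def installed_api_versions(firmware_dir="/lib/firmware"):
    """Return the API numbers of installed 7265D firmware files, highest first."""
    directory = Path(firmware_dir)
    if not directory.is_dir():
        return []
    versions = []
    for path in directory.iterdir():
        match = UCODE_RE.match(path.name)
        if match:
            versions.append(int(match.group(1)))
    return sorted(versions, reverse=True)
```

Calling `installed_api_versions()` before and after removing a file shows whether an older version is still available to fall back to.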
Comment 1 Emmanuel Grumbach 2015-11-22 08:21:00 UTC
Please run tracing as explained here:
https://wireless.wiki.kernel.org/en/users/drivers/iwlwifi/debugging

Thanks.
Comment 2 Emmanuel Grumbach 2015-11-22 10:28:36 UTC
If you need hours to reproduce the problem, tracing might not be the right tool.
Let's start with a dmesg output. You don't see anything special from iwlwifi there?
Comment 3 Stijn Tintel 2015-11-22 23:42:56 UTC
(In reply to Emmanuel Grumbach from comment #2)
> Let's start with a dmesg output. You don't see anything special from iwlwifi
> there?

Nothing in dmesg at all, just RF_KILL when I disable wireless to solve the problem:
nov 19 06:17:25 sylvester.nomad.adlevio.net kernel: kvm: zapping shadow pages for mmio generation wraparound
nov 19 06:22:36 sylvester.nomad.adlevio.net kernel: iwlwifi 0000:02:00.0: RF_KILL bit toggled to disable radio.

I will install the v17 firmware again, try to reproduce it and run tracing when it happens.
Comment 4 Emmanuel Grumbach 2015-11-25 21:10:05 UTC
Any news?
Did you try to disable 11n (11n_disable=1 as a module parameter to iwlwifi)?
Comment 5 Stijn Tintel 2015-11-26 20:07:24 UTC
Created attachment 195551
trace.dat
Comment 6 Stijn Tintel 2015-11-26 20:14:30 UTC
After installing the v17 firmware again, I was unable to reproduce the problem. I then remembered I had enabled U-APSD again to test whether it would fix the latency spikes I had on my AP (DAP-2695). It still took a while, but today the problem happened twice. The laptop is in a different location than where I usually use it, so it's possible that the U-APSD setting is unrelated; I'm not sure.

I ran "trace-cmd record -e iwlwifi" and pinged the default gateway when I noticed the connection hang.
Comment 7 Emmanuel Grumbach 2015-11-26 20:53:03 UTC
OK, thanks. I'll take a look. Note that there are serious interop issues with uAPSD, which is why it is disabled by default.
Comment 8 Emmanuel Grumbach 2015-11-29 08:22:27 UTC
If the problem doesn't reproduce with uapsd_disable=1 (the default), please close this issue.
Comment 9 Stijn Tintel 2015-11-30 20:35:24 UTC
Problem just happened again with uapsd_disable=1.

sylvester ~ # cat /sys/module/iwlwifi/parameters/uapsd_disable
Y
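The same check can be scripted. This is a minimal Python sketch, assuming the usual /sys/module/<module>/parameters/<name> layout used by the cat command above; the function names are illustrative. Boolean module parameters read back as Y or N, as in the output shown.

```python
from pathlib import Path


def read_module_param(module, param, sysfs_root="/sys/module"):
    """Return a loaded module's parameter value as a string, or None if absent."""
    path = Path(sysfs_root) / module / "parameters" / param
    try:
        return path.read_text().strip()
    except FileNotFoundError:
        # Module not loaded, or the parameter is not exposed through sysfs.
        return None


def is_enabled(value):
    """Interpret a boolean module parameter ('Y' or 'N') read from sysfs."""
    return value == "Y"
```

For example, `is_enabled(read_module_param("iwlwifi", "uapsd_disable"))` reports whether U-APSD is currently disabled for the loaded module.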
Comment 10 Stijn Tintel 2015-11-30 20:36:04 UTC
Created attachment 196161
trace.dat with uapsd_disable=1
Comment 11 Emmanuel Grumbach 2015-11-30 20:41:00 UTC
I don't see anything bad in this tracing.

What are the symptoms?
You run pings and the pings stop?
Comment 12 Stijn Tintel 2015-11-30 22:01:29 UTC
Associated but unable to ping default gateway. I only start the trace when I notice the problem (ssh sessions no longer respond, websites no longer open).
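The "associated but cannot ping the default gateway" check can be automated so that tracing starts as soon as the hang appears. A rough Python sketch, assuming a Linux /proc/net/route table (addresses stored as little-endian hex, as on x86) and an iputils-style ping; the function names are illustrative:

```python
import socket
import struct
import subprocess

RTF_GATEWAY = 0x2  # route flag: the entry points at a gateway


def default_gateway(route_table_text):
    """Return (interface, gateway_ip) for the default route, or None."""
    for line in route_table_text.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        iface, destination, gateway = fields[0], fields[1], fields[2]
        flags = int(fields[3], 16)
        # Destination 00000000 marks the default route.
        if destination == "00000000" and flags & RTF_GATEWAY:
            ip = socket.inet_ntoa(struct.pack("<L", int(gateway, 16)))
            return iface, ip
    return None


def gateway_reachable(ip, count=3, timeout_s=2):
    """Ping the gateway; True if ping exits successfully."""
    result = subprocess.run(
        ["ping", "-c", str(count), "-W", str(timeout_s), ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0
```

Feeding `default_gateway` the contents of /proc/net/route and polling `gateway_reachable` on the result gives a simple trigger for starting a trace.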
Comment 13 Emmanuel Grumbach 2015-12-01 06:16:09 UTC
Are you using Bluetooth?
Can you try to disable Bluetooth?

Please let trace-cmd run for more time and do some network operations while it is running.

Thanks
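One way to keep trace-cmd running while network traffic is generated is to hand it a command to run, so the recording lasts as long as that command does. A minimal Python sketch, assuming trace-cmd and ping are installed and the script runs as root; the target address, ping count, and function names are placeholders, and only the iwlwifi event system already used in this report is enabled:

```python
import subprocess


def build_trace_command(target_ip, ping_count=120, output="trace.dat"):
    """Build a trace-cmd invocation that records iwlwifi events while ping runs."""
    return [
        "trace-cmd", "record",
        "-e", "iwlwifi",   # event system used earlier in this report
        "-o", output,      # file to write the trace to
        "ping", "-c", str(ping_count), target_ip,
    ]


def run_trace(target_ip, **kwargs):
    """Run the trace (needs root) and return the process exit status."""
    return subprocess.run(build_trace_command(target_ip, **kwargs)).returncode
```

For instance, `run_trace("192.168.1.1", ping_count=300)` would record for roughly five minutes of pings to a hypothetical gateway address.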
Comment 14 Stijn Tintel 2015-12-04 19:13:58 UTC
Created attachment 196401
trace.dat with uapsd_disable=1

Still associated, unable to ping anything in local subnet
Tried disconnect + reconnect -> network not found
Toggle rfkill on/off -> reconnected fine
Comment 15 Stijn Tintel 2015-12-04 21:02:11 UTC
I missed the Bluetooth part. I am using Bluetooth, and I need it (I work remotely), so disabling it is difficult. Did you mean to completely disable Bluetooth, or just to disable bt_coex_active?
Comment 16 Emmanuel Grumbach 2015-12-05 19:28:17 UTC
I meant completely disable BT.

Please try with firmware -16.ucode. Please remove the -17.ucode version before doing so.
I also recommend you install the latest BT firmware from linux-firmware.git.
Comment 17 Emmanuel Grumbach 2015-12-21 08:01:53 UTC

*** This bug has been marked as a duplicate of bug 107471 ***
