Created attachment 169351 [details]
Contains logs that happen when the connection drops and then during reconnecting

I'm using a Dell Inspiron 15R (N5110) laptop with the following network card:

Lshw output:
       description: Wireless interface
       product: Centrino Wireless-N 1030 [Rainbow Peak]
       vendor: Intel Corporation
       physical id: 0
       bus info: pci@0000:09:00.0
       logical name: wlan0
       version: 34
       serial: bc:77:37:46:b1:0d
       width: 64 bits
       clock: 33MHz
       capabilities: pm msi pciexpress bus_master cap_list ethernet physical wireless
       configuration: broadcast=yes driver=iwlwifi driverversion=3.13.0-37-generic firmware=18.168.6.1 ip=192.168.1.4 latency=0 link=yes multicast=yes wireless=IEEE 802.11bgn
       resources: irq:56 memory:f7a00000-f7a01fff

Lspci output:
09:00.0 Network controller: Intel Corporation Centrino Wireless-N 1030 [Rainbow Peak] (rev 34)
        Subsystem: Intel Corporation Centrino Wireless-N 1030 BGN
        Flags: bus master, fast devsel, latency 0, IRQ 56
        Memory at f7a00000 (64-bit, non-prefetchable) [size=8K]
        Capabilities: [c8] Power Management version 3
        Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [e0] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [140] Device Serial Number bc-77-37-ff-ff-46-b1-0d
        Kernel driver in use: iwlwifi

The card randomly disconnects from the AP. After it disconnects I have to disable and enable wireless in network manager to be able to reconnect. What I see in the logs while this is happening can be seen in the attached file.

Adding the following module options to the modprobe configuration seems to mostly solve the problem:

options iwlmvm power_scheme=1
options iwlwifi bt_coex_active=N swcrypto=1 11n_disable=1

However, I'm still seeing some disconnects, and the card does not reconnect after suspend, so I have to go through the disable/enable process described above.

This card had been working flawlessly for several years with Ubuntu 12.04 and the 3.2 kernel.
I have also filed this bug here: https://bugs.launchpad.net/ubuntu/+source/linux-firmware/+bug/1420935 I'm more than happy to provide any additional information to get this fixed.
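For anyone trying the module options above, it can help to confirm that they were actually applied after the module was reloaded. The following is only a minimal sketch: it assumes the options were placed in a file under /etc/modprobe.d/ and relies on the standard /sys/module/<module>/parameters/ layout. The function name and its optional second argument (an alternative sysfs root, useful for trying the function against a test directory) are hypothetical.

```shell
#!/bin/sh
# Print every readable parameter of a loaded kernel module as "name=value".
# $1 = module name (e.g. iwlwifi)
# $2 = sysfs module root; hypothetical hook for testing, defaults to /sys/module
show_module_params() {
    mod=$1
    root=${2:-/sys/module}
    dir="$root/$mod/parameters"
    if [ ! -d "$dir" ]; then
        echo "module $mod is not loaded or exposes no parameters" >&2
        return 1
    fi
    for f in "$dir"/*; do
        # Some parameters are readable by root only; skip those silently.
        [ -r "$f" ] || continue
        printf '%s=%s\n' "$(basename "$f")" "$(cat "$f")"
    done
}

# Usage on the affected machine (not executed here):
#   show_module_params iwlwifi | grep -E '11n_disable|swcrypto|bt_coex_active'
```

If a value printed here does not match the option that was configured, the module was probably loaded before the configuration file was written.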
Please share your dmesg output.
Also - please make sure you have:

commit a0855054e59b0c5b2b00237fdb5147f7bcc18efb
Author: Emmanuel Grumbach <emmanuel.grumbach@intel.com>
Date:   Sun Oct 5 09:11:14 2014 +0300

    iwlwifi: dvm: drop non VO frames when flushing

This commit should have been ported to 3.16 by your distribution.
Created attachment 170021 [details] Dmesg output
If there is an issue (not clear from the dmesg), it will most likely be a firmware issue and hence will not be fixed. You can nevertheless try to record a WiFi sniffer of the problem so that we can get a better understanding of what's going on.
According to the changelog, the kernel I'm using contains the mentioned patch:

# zgrep -A1 dvm /usr/share/doc/linux-image-3.16.0-31-generic/changelog.Debian.gz
  * iwlwifi: dvm: fix flush support for old firmware
    - LP: #1419125
--
  * iwlwifi: dvm: drop non VO frames when flushing
    - LP: #1393401

This is the bug referred to in the changelog: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1393401

I also suspect that this might be related to suspending the laptop. I reinstalled my laptop yesterday and restarted it several times after the installation. After the final restart I used the laptop for hours without a disconnect. This morning, when the laptop came back from suspend, it was not able to reconnect, and I had to disable and enable wireless in network manager. I only used the laptop for a short time in the morning, then suspended it again. Now (in the evening) I woke the laptop up from hibernation and I'm seeing quite frequent disconnects.

Can you please suggest a specific program that I can use for the sniffing?

If this really is a firmware issue, what can I do?

Thanks.
For sniffing you'd need an additional WiFi device and you'd need to put it in monitor mode on the right frequency. If that's a firmware issue, there isn't much we can do. The only thing I can think about is to re-run the calibration after suspend resume. Can you test a patch?
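For reference, here is a rough sketch of what such a capture setup could look like with the `iw`, `ip` and `tcpdump` tools. It is a dry run that only prints the commands; nothing is executed. The interface name `wlan1` for the second WiFi device, the frequency 2437 MHz (channel 6) and the output file name are assumptions for illustration, and not every adapter supports monitor mode.

```shell
#!/bin/sh
# Dry run: print the commands that would put a second WiFi device into
# monitor mode on a given frequency and start a capture.
# $1 = monitor interface (hypothetical name, e.g. wlan1)
# $2 = frequency of the AP in MHz
# $3 = capture file (optional, defaults to capture.pcap)
sniff_cmds() {
    iface=$1
    freq=$2
    out=${3:-capture.pcap}
    cat <<EOF
ip link set $iface down
iw dev $iface set type monitor
ip link set $iface up
iw dev $iface set freq $freq
tcpdump -i $iface -w $out
EOF
}

# Example values only: second adapter "wlan1", AP on 2437 MHz.
sniff_cmds wlan1 2437
```

The frequency the AP is using can be read on the affected machine with `iw dev wlan0 link` while it is connected. The printed commands would need to be run as root, and the network manager may have to be told to leave the second device alone.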
Okay, sniffing isn't feasible right now, as I have no additional device for that at home. As for trying a patch: of course, I can do that. Would it be okay to patch the sources of the kernel I'm currently running (by downloading the source package) and compile it with my distribution's configuration? I just had a disconnect again. I'm attaching an additional dmesg snippet with comments about what happened when.
Created attachment 170031 [details] Dmesg snippet with comments
It'd be ok I think since the driver you are using hasn't changed much for a while.
please try this:

diff --git a/drivers/net/wireless/iwlwifi/dvm/ucode.c b/drivers/net/wireless/iwlwifi/dvm/ucode.c
index 4dbef7e..bb1322b 100644
--- a/drivers/net/wireless/iwlwifi/dvm/ucode.c
+++ b/drivers/net/wireless/iwlwifi/dvm/ucode.c
@@ -440,8 +440,6 @@ int iwl_run_init_ucode(struct iwl_priv *priv)
 	 */
 	ret = iwl_wait_notification(&priv->notif_wait, &calib_wait,
 					UCODE_CALIB_TIMEOUT);
-	if (!ret)
-		priv->init_ucode_run = true;
 
 	goto out;
Compiled and installed the patched kernel. Let's see if that fixes the problem.
Any news here?
Yes, it seems to be much better now, but let me also test it over the weekend and come back after that.
Created attachment 170711 [details] fix final version of the fix.
So I experienced some disconnects, but very few, maybe once a day. To be 100% sure I would have to go back to a previous kernel for a few days to gather more evidence and then return to the patched one for another few days. I'm a bit suspicious because I would have expected a complete fix based on the patch (the removed code), and I'm not sure whether the improvement is really caused by the patch or by some other unknown factor. But to summarize, it is certainly not worse than it was before in any way.
The fix seems trivial, but it is not. It affects the behavior of the physical layer, which is the least predictable component in the system. WiFi disconnects from time to time. This is totally fine, and I won't consider it a real bug. The disconnections you had were a bug. What I do with that patch is re-calibrate the PHY layer every time you exit suspend, which makes the PHY layer more accurate. I'll leave the bug open for a few more days, but I'll send the patch upstream anyway. An open bug consumes resources and this bug seems to be fixed from my point of view, so I'll close it next Sunday if I don't get any show stopper from your side.
In that case I think it is completely okay to close the bug and consider it fixed. The disconnects are really very rare. For example, I had none in the last 2-3 days while my laptop was running all night long. However, I restarted my laptop today and had one disconnect after 3 hours. This is what I saw in dmesg after the disconnect (maybe it helps to decide whether it is related to what we have seen before):

[10916.367646] cfg80211: Calling CRDA to update world regulatory domain
[10916.440559] cfg80211: World regulatory domain updated:
[10916.440564] cfg80211:  DFS Master region: unset
[10916.440566] cfg80211:   (start_freq - end_freq @ bandwidth), (max_antenna_gain, max_eirp), (dfs_cac_time)
[10916.440569] cfg80211:   (2402000 KHz - 2472000 KHz @ 40000 KHz), (300 mBi, 2000 mBm), (N/A)
[10916.440572] cfg80211:   (2457000 KHz - 2482000 KHz @ 40000 KHz), (300 mBi, 2000 mBm), (N/A)
[10916.440573] cfg80211:   (2474000 KHz - 2494000 KHz @ 20000 KHz), (300 mBi, 2000 mBm), (N/A)
[10916.440575] cfg80211:   (5170000 KHz - 5250000 KHz @ 40000 KHz), (300 mBi, 2000 mBm), (N/A)
[10916.440577] cfg80211:   (5735000 KHz - 5835000 KHz @ 40000 KHz), (300 mBi, 2000 mBm), (N/A)

Thanks for sorting this out and finally fixing it. That is great help and great support, and it is much appreciated.
Sorry for coming back to this thread again. I'm once more experiencing the frequent reconnects that I had before. I presumed (based on what you said about the patch) that suspending the laptop and waking it up again would stop these reconnects, and it actually did. So it seems that whatever causes this (either the firmware or something else) is still "hiding" there, but the suspend > wakeup cycle helped, maybe because of the re-calibration. Just wanted to add this here so you have accurate information.
yeah - so I am not really surprised. What it means is that the firmware can't really "stay calibrated" for too long. This is clearly a firmware bug and won't be fixed. What you can do is suspend and resume, or manually crash the firmware, and it will re-calibrate. To do so:

echo 1 > /sys/kernel/debug/iwlwifi/0000\:03\:00.0/iwldvm/debug/fw_restart

of course, you'd need to replace the 03\:... with your device's pci number.
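As a small convenience, the path for a particular device could be built like this. It is only a sketch: the helper names are hypothetical, the path layout is copied from the command above, and the lookup assumes the usual /sys/class/net/<interface>/device symlink. The example address 0000:09:00.0 is the one from the lshw output earlier in this report.

```shell
#!/bin/sh
# Build the debugfs path of the fw_restart trigger for a given PCI address.
# The layout is taken from the command quoted in this thread.
fw_restart_path() {
    printf '/sys/kernel/debug/iwlwifi/%s/iwldvm/debug/fw_restart\n' "$1"
}

# Look up the PCI address behind a wireless interface (e.g. wlan0) by
# resolving the "device" symlink in sysfs.
pci_addr_of() {
    basename "$(readlink -f "/sys/class/net/$1/device")"
}

# Print the path for the card described in the original report.
fw_restart_path 0000:09:00.0

# To actually trigger the restart (as root, with debugfs mounted; not run here):
#   echo 1 > "$(fw_restart_path "$(pci_addr_of wlan0)")"
```

Quoting the path as above avoids the need for the backslash escapes before the colons.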
Just to follow up on this. I upgraded to Ubuntu 15.10 a few weeks ago and it looks like this bug is completely fixed. It has the following kernel:

Linux 4.2.0-18-generic #22-Ubuntu SMP Fri Nov 6 18:25:50 UTC 2015 x86_64 GNU/Linux

WiFi is stable without any additional tweaks. Happy that this was fixed, thanks.