Bug 72601
Description
Ralf
2014-03-21 16:33:54 UTC
Created attachment 130171 [details]
syslog from one of the bursts of connection drops (this happened while I was not even at the machine)
Created attachment 130181 [details]
Kernel log from the same instance of the issue
Can you change the security parameters of your AP? I'd try to disable security to see what happens, and then set it up with WPA2 only (AES). The only thing I am sure about is that this is not related to the Intel NIC.
The NetGear AP does not support WPA2, but I could of course try unencrypted mode.
The FritzBox runs with WPA2 only.
> The only thing I am sure about is that this is not related to the Intel NIC.
Well, I had an Atheros chip in the same laptop before, and no such problems with the FritzBox.
But then, with the AP at my other place (which I cannot test currently), the issue is at least much less annoying - I would have to check again to see what's in the logs.
I am experiencing the exact same issues with an Intel 7260 and a Netgear CG3100D-RG AP. I have tried:
- Different security settings (WEP, WPA-PSK, WPA2-PSK, disabled)
- Disabling wifi N both in the AP and in the module (11n_disable=1)
- Additional module settings (iwlmvm power_scheme=1 / iwlwifi bt_coex_active=N swcrypto=1)
But the problem still happens: a few times a day the connection drops and reconnects immediately, losing lots of packets for several minutes and making the connection unusable.
The card is very similar to the original reporter's, but a different revision:
03:00.0 Network controller: Intel Corporation Wireless 7260 (rev 6b)
    Subsystem: Intel Corporation Wireless-N 7260
    Flags: bus master, fast devsel, latency 0, IRQ 46
    Memory at f0400000 (64-bit, non-prefetchable) [size=8K]
    Capabilities: [c8] Power Management version 3
    Capabilities: [d0] MSI: Enable+ Count=1/1 Maskable- 64bit+
    Capabilities: [40] Express Endpoint, MSI 00
    Capabilities: [100] Advanced Error Reporting
    Capabilities: [140] Device Serial Number 5c-51-4f-ff-ff-c9-60-49
    Capabilities: [14c] Latency Tolerance Reporting
    Capabilities: [154] Vendor Specific Information: ID=cafe Rev=1 Len=014 <?>
    Kernel driver in use: iwlwifi
    Kernel modules: iwlwifi
My kernel version is 3.13.1-031301-generic. Is there anything I can do to help debug this problem?
Ok, I just realised that for kernel 3.13 there is a newer firmware available (iwlwifi-7260-8.ucode). The connection problems are still there, but now I'm getting "Microcode SW error detected" when they happen:
Mar 29 17:25:49 cherokee2 kernel: [ 312.102815] iwlwifi 0000:03:00.0: Microcode SW error detected. Restarting 0x2000000.
Mar 29 17:25:49 cherokee2 kernel: [ 312.102819] iwlwifi 0000:03:00.0: CSR values: Mar 29 17:25:49 cherokee2 kernel: [ 312.102821] iwlwifi 0000:03:00.0: (2nd byte of CSR_INT_COALESCING is CSR_INT_PERIODIC_REG) Mar 29 17:25:49 cherokee2 kernel: [ 312.102829] iwlwifi 0000:03:00.0: CSR_HW_IF_CONFIG_REG: 0X00489204 Mar 29 17:25:49 cherokee2 kernel: [ 312.102842] iwlwifi 0000:03:00.0: CSR_INT_COALESCING: 0X8000ff40 Mar 29 17:25:49 cherokee2 kernel: [ 312.102853] iwlwifi 0000:03:00.0: CSR_INT: 0X00000000 Mar 29 17:25:49 cherokee2 kernel: [ 312.102866] iwlwifi 0000:03:00.0: CSR_INT_MASK: 0X00000000 Mar 29 17:25:49 cherokee2 kernel: [ 312.102880] iwlwifi 0000:03:00.0: CSR_FH_INT_STATUS: 0X00000000 Mar 29 17:25:49 cherokee2 kernel: [ 312.102894] iwlwifi 0000:03:00.0: CSR_GPIO_IN: 0X00000000 Mar 29 17:25:49 cherokee2 kernel: [ 312.102908] iwlwifi 0000:03:00.0: CSR_RESET: 0X00000000 Mar 29 17:25:49 cherokee2 kernel: [ 312.102922] iwlwifi 0000:03:00.0: CSR_GP_CNTRL: 0X080403c5 Mar 29 17:25:49 cherokee2 kernel: [ 312.102936] iwlwifi 0000:03:00.0: CSR_HW_REV: 0X00000144 Mar 29 17:25:49 cherokee2 kernel: [ 312.102950] iwlwifi 0000:03:00.0: CSR_EEPROM_REG: 0X00000000 Mar 29 17:25:49 cherokee2 kernel: [ 312.102963] iwlwifi 0000:03:00.0: CSR_EEPROM_GP: 0X80000000 Mar 29 17:25:49 cherokee2 kernel: [ 312.102977] iwlwifi 0000:03:00.0: CSR_OTP_GP_REG: 0X803a0000 Mar 29 17:25:49 cherokee2 kernel: [ 312.102991] iwlwifi 0000:03:00.0: CSR_GIO_REG: 0X00080042 Mar 29 17:25:49 cherokee2 kernel: [ 312.103005] iwlwifi 0000:03:00.0: CSR_GP_UCODE_REG: 0X00000000 Mar 29 17:25:49 cherokee2 kernel: [ 312.103019] iwlwifi 0000:03:00.0: CSR_GP_DRIVER_REG: 0X00000000 Mar 29 17:25:49 cherokee2 kernel: [ 312.103033] iwlwifi 0000:03:00.0: CSR_UCODE_DRV_GP1: 0X00000000 Mar 29 17:25:49 cherokee2 kernel: [ 312.103046] iwlwifi 0000:03:00.0: CSR_UCODE_DRV_GP2: 0X00000000 Mar 29 17:25:49 cherokee2 kernel: [ 312.103060] iwlwifi 0000:03:00.0: CSR_LED_REG: 0X00000060 Mar 29 17:25:49 cherokee2 kernel: [ 312.103074] 
iwlwifi 0000:03:00.0: CSR_DRAM_INT_TBL_REG: 0X88402d4c Mar 29 17:25:49 cherokee2 kernel: [ 312.103088] iwlwifi 0000:03:00.0: CSR_GIO_CHICKEN_BITS: 0X27800200 Mar 29 17:25:49 cherokee2 kernel: [ 312.103102] iwlwifi 0000:03:00.0: CSR_ANA_PLL_CFG: 0Xd55555d5 Mar 29 17:25:49 cherokee2 kernel: [ 312.103116] iwlwifi 0000:03:00.0: CSR_HW_REV_WA_REG: 0X0001001a Mar 29 17:25:49 cherokee2 kernel: [ 312.103130] iwlwifi 0000:03:00.0: CSR_DBG_HPET_MEM_REG: 0Xffff0000 Mar 29 17:25:49 cherokee2 kernel: [ 312.103131] iwlwifi 0000:03:00.0: FH register values: Mar 29 17:25:49 cherokee2 kernel: [ 312.103153] iwlwifi 0000:03:00.0: FH_RSCSR_CHNL0_STTS_WPTR_REG: 0X35feb800 Mar 29 17:25:49 cherokee2 kernel: [ 312.103167] iwlwifi 0000:03:00.0: FH_RSCSR_CHNL0_RBDCB_BASE_REG: 0X03c06570 Mar 29 17:25:49 cherokee2 kernel: [ 312.103181] iwlwifi 0000:03:00.0: FH_RSCSR_CHNL0_WPTR: 0X000000d0 Mar 29 17:25:49 cherokee2 kernel: [ 312.103195] iwlwifi 0000:03:00.0: FH_MEM_RCSR_CHNL0_CONFIG_REG: 0X80801114 Mar 29 17:25:49 cherokee2 kernel: [ 312.103209] iwlwifi 0000:03:00.0: FH_MEM_RSSR_SHARED_CTRL_REG: 0X000000fc Mar 29 17:25:49 cherokee2 kernel: [ 312.103222] iwlwifi 0000:03:00.0: FH_MEM_RSSR_RX_STATUS_REG: 0X07030000 Mar 29 17:25:49 cherokee2 kernel: [ 312.103236] iwlwifi 0000:03:00.0: FH_MEM_RSSR_RX_ENABLE_ERR_IRQ2DRV: 0X00000000 Mar 29 17:25:49 cherokee2 kernel: [ 312.103250] iwlwifi 0000:03:00.0: FH_TSSR_TX_STATUS_REG: 0X07fb0001 Mar 29 17:25:49 cherokee2 kernel: [ 312.103264] iwlwifi 0000:03:00.0: FH_TSSR_TX_ERROR_REG: 0X00000000 Mar 29 17:25:49 cherokee2 kernel: [ 312.103280] iwlwifi 0000:03:00.0: FW error in SYNC CMD SCAN_REQUEST_CMD Mar 29 17:25:49 cherokee2 kernel: [ 312.103284] CPU: 0 PID: 312 Comm: kworker/u16:4 Tainted: G W 3.13.1-031301-generic #201401291035 Mar 29 17:25:49 cherokee2 kernel: [ 312.103286] Hardware name: LENOVO 20AN0069US/20AN0069US, BIOS GLET43WW (1.18 ) 12/04/2013 Mar 29 17:25:49 cherokee2 kernel: [ 312.103301] Workqueue: phy7 ieee80211_scan_work [mac80211] Mar 29 
17:25:49 cherokee2 kernel: [ 312.103303] ffff88035d107218 ffff880401117b58 ffffffff8173356d 0000000000000000 Mar 29 17:25:49 cherokee2 kernel: [ 312.103306] ffff88035d104000 ffff880401117be8 ffffffffa0417150 00000000000000a0 Mar 29 17:25:49 cherokee2 kernel: [ 312.103308] 0000000000000019 0000001300000000 0000000000000019 0000000000000000 Mar 29 17:25:49 cherokee2 kernel: [ 312.103311] Call Trace: Mar 29 17:25:49 cherokee2 kernel: [ 312.103316] [<ffffffff8173356d>] dump_stack+0x46/0x58 Mar 29 17:25:49 cherokee2 kernel: [ 312.103324] [<ffffffffa0417150>] iwl_pcie_send_hcmd_sync+0x580/0x590 [iwlwifi] Mar 29 17:25:49 cherokee2 kernel: [ 312.103328] [<ffffffff810ad7d0>] ? __wake_up_sync+0x20/0x20 Mar 29 17:25:49 cherokee2 kernel: [ 312.103333] [<ffffffffa041858a>] iwl_trans_pcie_send_hcmd+0x2a/0x80 [iwlwifi] Mar 29 17:25:49 cherokee2 kernel: [ 312.103339] [<ffffffffa058fe46>] iwl_mvm_send_cmd_status+0x56/0x170 [iwlmvm] Mar 29 17:25:49 cherokee2 kernel: [ 312.103344] [<ffffffffa059681a>] iwl_mvm_scan_request+0x36a/0x460 [iwlmvm] Mar 29 17:25:49 cherokee2 kernel: [ 312.103347] [<ffffffffa058ae53>] iwl_mvm_mac_hw_scan+0x93/0xa0 [iwlmvm] Mar 29 17:25:49 cherokee2 kernel: [ 312.103355] [<ffffffffa04c2d5d>] __ieee80211_scan_completed+0x1bd/0x330 [mac80211] Mar 29 17:25:49 cherokee2 kernel: [ 312.103358] [<ffffffff8109fe2a>] ? arch_vtime_task_switch+0x8a/0x90 Mar 29 17:25:49 cherokee2 kernel: [ 312.103364] [<ffffffffa04c3ac2>] ieee80211_scan_work+0xe2/0x250 [mac80211] Mar 29 17:25:49 cherokee2 kernel: [ 312.103367] [<ffffffff8108443f>] process_one_work+0x17f/0x4c0 Mar 29 17:25:49 cherokee2 kernel: [ 312.103369] [<ffffffff8108566b>] worker_thread+0x11b/0x3d0 Mar 29 17:25:49 cherokee2 kernel: [ 312.103371] [<ffffffff81085550>] ? manage_workers.isra.21+0x190/0x190 Mar 29 17:25:49 cherokee2 kernel: [ 312.103374] [<ffffffff8108c5c9>] kthread+0xc9/0xe0 Mar 29 17:25:49 cherokee2 kernel: [ 312.103376] [<ffffffff8108c500>] ? 
flush_kthread_worker+0xb0/0xb0 Mar 29 17:25:49 cherokee2 kernel: [ 312.103379] [<ffffffff817489bc>] ret_from_fork+0x7c/0xb0 Mar 29 17:25:49 cherokee2 kernel: [ 312.103381] [<ffffffff8108c500>] ? flush_kthread_worker+0xb0/0xb0 Mar 29 17:25:49 cherokee2 kernel: [ 312.103384] iwlwifi 0000:03:00.0: Scan failed! status 0x1 ret -5 Mar 29 17:25:49 cherokee2 kernel: [ 312.103400] iwlwifi 0000:03:00.0: Start IWL Error Log Dump: Mar 29 17:25:49 cherokee2 kernel: [ 312.103402] iwlwifi 0000:03:00.0: Status: 0x00000000, count: 6 Mar 29 17:25:49 cherokee2 kernel: [ 312.103404] iwlwifi 0000:03:00.0: 0x000014F4 | ADVANCED_SYSASSERT Mar 29 17:25:49 cherokee2 kernel: [ 312.103406] iwlwifi 0000:03:00.0: 0x000002A0 | uPc Mar 29 17:25:49 cherokee2 kernel: [ 312.103408] iwlwifi 0000:03:00.0: 0x00000000 | branchlink1 Mar 29 17:25:49 cherokee2 kernel: [ 312.103410] iwlwifi 0000:03:00.0: 0x00000BEA | branchlink2 Mar 29 17:25:49 cherokee2 kernel: [ 312.103411] iwlwifi 0000:03:00.0: 0x00015EB4 | interruptlink1 Mar 29 17:25:49 cherokee2 kernel: [ 312.103413] iwlwifi 0000:03:00.0: 0x004ACCD9 | interruptlink2 Mar 29 17:25:49 cherokee2 kernel: [ 312.103415] iwlwifi 0000:03:00.0: 0x00000187 | data1 Mar 29 17:25:49 cherokee2 kernel: [ 312.103416] iwlwifi 0000:03:00.0: 0x00000024 | data2 Mar 29 17:25:49 cherokee2 kernel: [ 312.103418] iwlwifi 0000:03:00.0: 0x000000E0 | data3 Mar 29 17:25:49 cherokee2 kernel: [ 312.103420] iwlwifi 0000:03:00.0: 0x61C0F190 | beacon time Mar 29 17:25:49 cherokee2 kernel: [ 312.103422] iwlwifi 0000:03:00.0: 0x51BD5E89 | tsf low Mar 29 17:25:49 cherokee2 kernel: [ 312.103423] iwlwifi 0000:03:00.0: 0x00000004 | tsf hi Mar 29 17:25:49 cherokee2 kernel: [ 312.103425] iwlwifi 0000:03:00.0: 0x00000000 | time gp1 Mar 29 17:25:49 cherokee2 kernel: [ 312.103427] iwlwifi 0000:03:00.0: 0x0262D2CB | time gp2 Mar 29 17:25:49 cherokee2 kernel: [ 312.103428] iwlwifi 0000:03:00.0: 0x00000000 | time gp3 Mar 29 17:25:49 cherokee2 kernel: [ 312.103430] iwlwifi 0000:03:00.0: 0x00041618 | 
uCode version Mar 29 17:25:49 cherokee2 kernel: [ 312.103431] iwlwifi 0000:03:00.0: 0x00000144 | hw version Mar 29 17:25:49 cherokee2 kernel: [ 312.103433] iwlwifi 0000:03:00.0: 0x00489204 | board version Mar 29 17:25:49 cherokee2 kernel: [ 312.103435] iwlwifi 0000:03:00.0: 0x09330080 | hcmd Mar 29 17:25:49 cherokee2 kernel: [ 312.103437] iwlwifi 0000:03:00.0: 0x000220C4 | isr0 Mar 29 17:25:49 cherokee2 kernel: [ 312.103439] iwlwifi 0000:03:00.0: 0x00000000 | isr1 Mar 29 17:25:49 cherokee2 kernel: [ 312.103440] iwlwifi 0000:03:00.0: 0x00000002 | isr2 Mar 29 17:25:49 cherokee2 kernel: [ 312.103442] iwlwifi 0000:03:00.0: 0x0041C0C2 | isr3 Mar 29 17:25:49 cherokee2 kernel: [ 312.103444] iwlwifi 0000:03:00.0: 0x00000000 | isr4 Mar 29 17:25:49 cherokee2 kernel: [ 312.103445] iwlwifi 0000:03:00.0: 0x01080112 | isr_pref Mar 29 17:25:49 cherokee2 kernel: [ 312.103447] iwlwifi 0000:03:00.0: 0x00000000 | wait_event Mar 29 17:25:49 cherokee2 kernel: [ 312.103449] iwlwifi 0000:03:00.0: 0x00000080 | l2p_control Mar 29 17:25:49 cherokee2 kernel: [ 312.103451] iwlwifi 0000:03:00.0: 0x00018020 | l2p_duration Mar 29 17:25:49 cherokee2 kernel: [ 312.103452] iwlwifi 0000:03:00.0: 0x0000003F | l2p_mhvalid Mar 29 17:25:49 cherokee2 kernel: [ 312.103454] iwlwifi 0000:03:00.0: 0x00000080 | l2p_addr_match Mar 29 17:25:49 cherokee2 kernel: [ 312.103456] iwlwifi 0000:03:00.0: 0x00000005 | lmpm_pmg_sel Mar 29 17:25:49 cherokee2 kernel: [ 312.103458] iwlwifi 0000:03:00.0: 0x23021719 | timestamp Mar 29 17:25:49 cherokee2 kernel: [ 312.103460] iwlwifi 0000:03:00.0: 0x0000D0E0 | flow_handler Mar 29 17:25:49 cherokee2 kernel: [ 312.103463] ieee80211 phy7: Hardware restart was requested Mar 29 17:25:49 cherokee2 kernel: [ 312.107141] iwlwifi 0000:03:00.0: Failing on timeout while stopping DMA channel 2 [0x07fb0001] Mar 29 17:25:49 cherokee2 kernel: [ 312.107492] iwlwifi 0000:03:00.0: L1 Enabled; Disabling L0S Mar 29 17:25:49 cherokee2 kernel: [ 312.107723] iwlwifi 0000:03:00.0: L1 Enabled; 
Disabling L0S
(In reply to Ignacio Huerta from comment #6)
> Ok, I just realised that for kernel 3.13 there is a newer firmware available
> (iwlwifi-7260-8.ucode). The connection problems are still there, but now I'm
> getting "Microcode SW error detected" when they happen:
>
This is another issue, which has already been fixed in a later stable release of 3.13. Please update your kernel.
Thank you very much, Emmanuel. I updated my kernel to 3.13.7 and now that error is gone and my Wifi looks much more stable. That should teach me to install the latest stable kernel before bothering you. Thanks a lot!
I made another very... curious observation during the last few days: the error is very location-dependent. If one of these "events" happens and I move the laptop by three meters, it immediately stops disconnecting from the wireless. Once I move it back to the table, the disconnects happen again. I already tried removing various electronic devices from around the laptop, to no avail. I am quite at a loss, to be honest...
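The "IWL Error Log Dump" in comment #6 is a list of "value | label" pairs, which is easier to compare between crashes once it is in a table. A minimal sketch of pulling those pairs out of kernel log lines follows; the function name and the regular expression are illustrative assumptions based only on the line format shown in this thread, not part of the driver or of any existing tool.

```python
import re

# Matches the "<hex value> | <label>" pairs printed in the dump, e.g.
# "iwlwifi 0000:03:00.0: 0x000014F4 | ADVANCED_SYSASSERT".
# The pattern is an assumption derived from the log lines in this thread.
ENTRY = re.compile(r"iwlwifi \S+: (0x[0-9A-Fa-f]{8}) \| (.+?)\s*$")


def parse_error_dump(lines):
    """Return {label: integer value} for every dump entry found in lines."""
    fields = {}
    for line in lines:
        match = ENTRY.search(line)
        if match:
            fields[match.group(2)] = int(match.group(1), 16)
    return fields


# Three lines copied from the dump in comment #6.
sample = [
    "Mar 29 17:25:49 cherokee2 kernel: [ 312.103404] iwlwifi 0000:03:00.0: 0x000014F4 | ADVANCED_SYSASSERT",
    "Mar 29 17:25:49 cherokee2 kernel: [ 312.103406] iwlwifi 0000:03:00.0: 0x000002A0 | uPc",
    "Mar 29 17:25:49 cherokee2 kernel: [ 312.103430] iwlwifi 0000:03:00.0: 0x00041618 | uCode version",
]
dump = parse_error_dump(sample)
```

Lines that do not carry a "value | label" pair (the CSR and FH register lines, the call trace) are simply skipped, so the whole log can be fed in unfiltered.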
> Ok, I just realised that for kernel 3.13 there is a newer firmware available
> (iwlwifi-7260-8.ucode).
Which is the exact version you are using? "8" here is just the microcode "generation" or how to call it. I am trying version 22.24.8.0 (that's printed by dmesg on module load).
I am currently compiling kernel 3.13.7 to see if this helps. I am not seeing these error messages with the current older kernel (Debian kernel based on 3.13.5) though.
22.24.8.0 is the latest firmware available - please use this one.
Unfortunately, that did not help - with kernel 3.13.7 and firmware 22.24.8.0, the problem just happened again.
Can you share your dmesg again? Did you try to disable security as suggested above?
Created attachment 131021 [details]
dmesg of disconnects with latest stable kernel and firmware
New, fresh dmesg is attached.
I did not yet try to disable encryption, as I experimented with some other things during the week. It's the next option I am pursuing though. I just wonder whether my usual "fgrep" for these error messages is still of any use now, as wpa_supplicant won't be in the loop anymore, will it?
From your logs it is actually related to supplicant :) Need to check though - I am not a supplicant expert at all. I'll need to ask a few people here. Can you attach the syslog again? I see that the prints about VHT / HT being not compatible with TKIP have disappeared. BTW - don't bother to disable security for now.
Created attachment 131031 [details]
syslog from the same disconnects as the last dmesg
> Can you attach the syslog again?
Okay.
> I see that the prints about VHT / HT being not compatible with TKIP have
> disappeared.
That's probably due to the different base station - the last logs were with the NetGear, these are with the FritzBox. The NetGear can only do WPA1, the FritzBox does WPA2.
> BTW - don't bother to disable security for now.
Well, I already disabled it (and enabled MAC checking, whatever that helps), so I think I'll just leave it running till the evening. I don't feel good leaving it insecure for too long^^
I just had exactly the same issue (including the wpa_supplicant error) on a connection without security. Will attach dmesg and syslog shortly.
Created attachment 131051 [details]
Kernel log from the issue happening without connection security
Created attachment 131061 [details]
syslog from the issue happening without connection security
I now also have logs of the issue happening with Linux 3.14 and "iwlmvm power_scheme=1" - let me know if they are interesting for you.
Hi,
With kernel 3.13.7 and the "8" firmware the problem just happened again: it's the first time I notice a connection drop since I updated the kernel, but the symptoms seem to be the same. I also have power_scheme=1 for iwlmvm. Please say if I can provide any additional information.
(In reply to Ignacio Huerta from comment #20)
> Hi,
>
> With kernel 3.13.7 and the "8" firmware the problem just happened again:
> it's the first time I notice a connection drop since I updated the kernel,
> but the symptoms seem to be the same. I also have power_scheme=1 for iwlmvm.
> Please say if I can provide any additional information.
Please attach your dmesg...
Created attachment 131201 [details]
dmesg after disconnection problems with kernel 3.13.7 and firmware 8
Here comes the dmesg.
This time it looks like Ralf's issue. Can you please attach your syslog? Thanks.
Created attachment 131211 [details]
syslog of disconnection problems with firmware 8 and kernel 3.13.7
You are very welcome. I'm attaching my syslog (only the relevant time frame, since before the disconnections until the problem went away a few minutes later).
Ok - this looks like the exact same issue as Ralf's. Thanks. Can you run tracing?
sudo trace-cmd record -e iwlwifi -e mac80211 -e cfg80211
Yes, I'll try to run tracing the next time it arises. Let's hope it doesn't take too long :).
OK, it happened again and here's the trace: https://www.dropbox.com/s/t80ec54qfw2273j/trace1.dat
Created attachment 131311 [details]
add threshold to beacon loss
Can you please test the patch attached?
This patch is included in 3.14, and I can't be sure it'll apply on 3.13.
If you confirm that this patch helps (or that 3.14 fixes the issue), I'll backport this patch to 3.13.
Thanks!
Oh, I just noticed that Ralf reproduced the issue with 3.14.... Ok... I guess I'll need traces from him too.
(In reply to Ignacio Huerta from comment #27)
> OK, it happened again and here's the trace:
> https://www.dropbox.com/s/t80ec54qfw2273j/trace1.dat
You seem to have sw_crypto enabled - can you disable this please? Thanks.
There seems to be some problem about whether or not the firmware microcode gets into the iwlwifi module. Running Arch Linux with kernel 3.13.18, the problem shows in the following:
# pacman -Ss linux-firmware
core/linux-firmware 20140316.dec41bc-1 [installed]
    Firmware files for Linux
# ls /lib/firmware/iwlwifi*
/lib/firmware/iwlwifi-1000-3.ucode  /lib/firmware/iwlwifi-5000-2.ucode
/lib/firmware/iwlwifi-1000-5.ucode  /lib/firmware/iwlwifi-5000-5.ucode
/lib/firmware/iwlwifi-100-5.ucode   /lib/firmware/iwlwifi-5150-2.ucode
/lib/firmware/iwlwifi-105-6.ucode   /lib/firmware/iwlwifi-6000-4.ucode
/lib/firmware/iwlwifi-135-6.ucode   /lib/firmware/iwlwifi-6000g2a-5.ucode
/lib/firmware/iwlwifi-2000-6.ucode  /lib/firmware/iwlwifi-6000g2a-6.ucode
/lib/firmware/iwlwifi-2030-6.ucode  /lib/firmware/iwlwifi-6000g2b-5.ucode
/lib/firmware/iwlwifi-3160-7.ucode  /lib/firmware/iwlwifi-6000g2b-6.ucode
/lib/firmware/iwlwifi-3160-8.ucode  /lib/firmware/iwlwifi-6050-4.ucode
/lib/firmware/iwlwifi-3945-2.ucode  /lib/firmware/iwlwifi-6050-5.ucode
/lib/firmware/iwlwifi-4965-2.ucode  /lib/firmware/iwlwifi-7260-7.ucode
/lib/firmware/iwlwifi-5000-1.ucode  /lib/firmware/iwlwifi-7260-8.ucode
# modinfo iwlwifi | grep firmware
firmware: iwlwifi-100-5.ucode
firmware: iwlwifi-1000-5.ucode
firmware: iwlwifi-135-6.ucode
firmware: iwlwifi-105-6.ucode
firmware: iwlwifi-2030-6.ucode
firmware: iwlwifi-2000-6.ucode
firmware: iwlwifi-5150-2.ucode
firmware: iwlwifi-5000-5.ucode
firmware: iwlwifi-6000g2b-6.ucode
firmware: iwlwifi-6000g2a-5.ucode
firmware: iwlwifi-6050-5.ucode
firmware: iwlwifi-6000-4.ucode
firmware: iwlwifi-3160-7.ucode
firmware: iwlwifi-7260-7.ucode
parm: fw_restart:restart firmware in case of error (default true) (bool)
So in /lib/firmware there are several files for which there are two versions, but in the module not always the latest is loaded!
7260 is the older one, 3160 also, 6000g2a-5 also - but some are OK.
Could this be underlying this issue?
The kernel in my previous comment should be 3.13.8 not .18 - apologies for the typo.
Created attachment 131331 [details]
dmesg+syslog of the issue happening with linux 3.14 and power_scheme=1
Yes, the issue happened with kernel 3.14, dmesg and syslog are attached. Unfortunately, I am no longer at my parents' place where I experienced the problem daily. I will have to see whether it happens at all here at my own place. If it does, I will provide traces.
> So in /lib/firmware there are several files for which there are two versions,
> but in the module not always the latest is loaded!
>
> 7260 is the older one, 3160 also, 6000g2a-5 also - but some are OK.
>
> Could this be underlying this issue?
However, despite what modinfo says, according to dmesg, the new firmware is loaded:
iwlwifi 0000:03:00.0: loaded firmware version 22.24.8.0 op_mode iwlmvm
@Mike - this is totally unrelated.
@Ralf - you have latest firmware.
OK I will enter a new bug for the issue I posted in comment #31
It seems that my report was based on confusing output, and that the correct later firmware is in fact loaded. The issue was that Intel chooses to list the oldest usable firmware even though the later firmware is loaded. Apologies for the noise.
No - We load the latest available supported FW. 3.13 supports iwlwifi-7260-8.ucode and hence will load iwlwifi-7260-8.ucode.
I have connection drops and huge packet loss also with 7260 when just a couple of meters away from the router.
Kernel: 3.13.8 with iwlmvm power_scheme=1
Loaded Module: 22.24.8.0
It's probably not relevant, but the laptop is a Vostro 5470.
@Emmanuel: I have disabled sw_crypto, upgraded to kernel 3.14.0 and am currently monitoring the logs for the error to happen again. If it happens I'll send you the trace. Thanks for your support, I'll keep you updated.
I use kernel 3.14.
modinfo iwlwifi
filename: /lib/modules/3.14.0-2.gfa168d7-desktop/kernel/drivers/net/wireless/iwlwifi/iwlwifi.ko
license: GPL
author: Copyright(c) 2003- 2014 Intel Corporation <ilw@linux.intel.com>
version: in-tree:d
description: Intel(R) Wireless WiFi driver for Linux
firmware: iwlwifi-100-5.ucode
firmware: iwlwifi-1000-5.ucode
firmware: iwlwifi-135-6.ucode
firmware: iwlwifi-105-6.ucode
firmware: iwlwifi-2030-6.ucode
firmware: iwlwifi-2000-6.ucode
firmware: iwlwifi-5150-2.ucode
firmware: iwlwifi-5000-5.ucode
firmware: iwlwifi-6000g2b-6.ucode
firmware: iwlwifi-6000g2a-5.ucode
firmware: iwlwifi-6050-5.ucode
firmware: iwlwifi-6000-4.ucode
firmware: iwlwifi-3160-7.ucode
firmware: iwlwifi-7260-7.ucode
So .8 doesn't seem to be there. Will it be included in 3.14.1?
.8 seems to be there, but it's not used...
/lib/firmware/iwlwifi-7260-8.ucode
Why?
Marc: I had an answer about the apparently missing -8 microcode file in the modinfo output, in a reply to my Arch Linux bug report at https://bugs.archlinux.org/task/39722 - see the last comment before that bug was closed. It seems that the later microcode file is loaded but apparently the modinfo only lists the earliest usable one. It would be better if the modinfo output was a bit more logical.
Unfortunately (or not ;-), the problem does not seem to happen here at my own place. Maybe it's because the wireless AP is just 2m from my laptop (so I usually don't even use it), but it seems I can only test this when I'm at my parents' place.
cat /var/log/messages | grep iwlwifi
linux-rnkx kernel: [ 9.941312] iwlwifi 0000:03:00.0: loaded firmware version 22.24.8.0 op_mode iwlmvm
linux-rnkx kernel: [ 9.954234] iwlwifi 0000:03:00.0: Detected Intel(R) Dual Band Wireless AC 7260, REV=0x144
My router is less than 1m from my laptop.
I will add an attachment when I disconnect my cable and try to connect to a 5GHz network.
Created attachment 131491 [details]
not able to connect to 5ghz network with kernel 3.14
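The modinfo confusion in the comments above is resolved by the "loaded firmware version" line that the driver prints to the kernel log. A minimal sketch of extracting that version from dmesg or syslog lines is shown here. The helper names are made up for illustration, and reading the third dotted component as the "-N" suffix of the .ucode file is only an inference from the two pairs seen in this thread (22.24.8.0 with iwlwifi-7260-8.ucode, 23.214.9.0 with iwlwifi-7260-9.ucode), not a documented rule.

```python
import re

# Pattern taken from the log line quoted in this thread:
# "iwlwifi 0000:03:00.0: loaded firmware version 22.24.8.0 op_mode iwlmvm"
LOADED_FW = re.compile(r"iwlwifi \S+: loaded firmware version (\S+)")


def loaded_firmware_versions(log_lines):
    """Return every firmware version string the driver reported loading."""
    versions = []
    for line in log_lines:
        match = LOADED_FW.search(line)
        if match:
            versions.append(match.group(1))
    return versions


def ucode_generation(version):
    """Third dotted component; assumed (from this thread only) to match
    the "-N" suffix of the .ucode file name."""
    return int(version.split(".")[2])


# The two lines below are copied from the grep output earlier in the thread.
sample = [
    "linux-rnkx kernel: [ 9.941312] iwlwifi 0000:03:00.0: loaded firmware version 22.24.8.0 op_mode iwlmvm",
    "linux-rnkx kernel: [ 9.954234] iwlwifi 0000:03:00.0: Detected Intel(R) Dual Band Wireless AC 7260, REV=0x144",
]
versions = loaded_firmware_versions(sample)
```

Checking the log this way answers "which firmware is actually running", which the firmware list in modinfo does not.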
@Marc: Please open a new bug for this. What makes you think that you are experiencing the bug we are discussing here? The "characteristic" error message we talked about does not appear in your log, and the symptom is completely different. This bug is about trouble with a connection that has long been established: suddenly packet loss goes to 50% or even more (up to completely blocking the connection), and a huge amount of "DISCONNECT" messages from wpa_supplicant appear in the syslog. This goes on for 5-10min, then everything is normal again for a few hours.
Way too many people on the same bug adding unrelated logs.
@Marc - please, this bug is not what you are having. Please open a new bug.
@Ronnie Andrew - please open a new bug and add your logs there. Worst case, I'll close it as a duplicate of this bug.
Both - when you create a new bug, please CC ilw@linux.intel.com.
I wish I could delete all the unrelated noise on this bug.
Back to focus - Ralf and Ignacio have very similar logs. Thanks to Ignacio, I could see that he is suffering from a beacon loss which results in disconnections. This is the only usable data I had until now. So Ignacio, please reproduce the bug on 3.14 and again, add traces. This time, I'd like to have -e iwlwifi_msg along with the rest I already asked for in the previous round. Thanks.
Hi Emmanuel,
I have managed to reproduce the bug with kernel 3.14 and the latest firmware. I don't have swcrypto enabled anymore, and the only module option I have is "iwlmvm power_scheme=1". This is the trace created with "trace-cmd record -e iwlwifi -e iwlwifi_msg -e mac80211 -e cfg80211": https://www.dropbox.com/s/eop1zph852p19qp/trace_20140408.dat
Please say if I can do anything else to help.
Regards,
Ignacio
I haven't forgotten you - but I doubt I'll be able to look at this in the coming 2 weeks, sorry. Thank you for your patience.
Don't worry Emmanuel, there's no rush. Thanks for the update!
So I finally found time for this. I can't see anything bad besides the fact that we are missing beacons for a reason I can't understand. Would you be able to apply
https://git.kernel.org/cgit/linux/kernel/git/iwlwifi/iwlwifi-fixes.git/commit/?id=431031851ea72a25abb9ad4df56a0f3b997e3026
and then upgrade your firmware to this one
https://git.kernel.org/cgit/linux/kernel/git/egrumbach/linux-firmware.git/plain/iwlwifi-7260-9.ucode?id=7a67dbf9c087ef64b3d3fc9ce448c2efdb2e365f
Thanks.
Emmanuel,
I found this bug while searching for information about why the wireless card on my notebook won't connect to my flat's router.
lspci:
02:00.0 Network controller: Intel Corporation Wireless 7260 (rev 73)
Kernel:
Linux 3.14.3-031403-generic #201405061153 SMP Tue May 6 15:54:50 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
Firmware with md5 (downloaded from the driver's webpage):
1005c3b82879ecfa0d60544a78de7f92 /lib/firmware/iwlwifi-7260-9.ucode
I can't even connect to the router anymore, which is a "Fritz!Box 6340 Cable", some generic German brand of router, which I have no access to and cannot reconfigure if needed. I tried previous kernels and firmwares without success.
Let me know what information is useful and I can gather it.
Kind regards,
-Ciro
Ciro, please open a new bug and add your logs there. Thank you.
@Ignacio: can you please do the following:
echo "bf_enable_beacon_filter=0" > /sys/kernel/debug/iwlwifi/*/iwlmvm/netdev\:wlan0/bf_params
and tell me if it helps? Thanks.
Hi Emmanuel, thanks a lot for your answers. I haven't had time until now to look into this. I'm going to first test "bf_enable_beacon_filter=0" and if that doesn't help I'll test the patch and the upgraded firmware. The issue happens one or two times a day, so I will need a few days to come up with results.
Regards,
Ignacio
Hi, only twice a day? You seemed to see that several times a minute. Ok - in that case, it is probably something else. Anyway I noticed that the debugfs line I sent would be overridden after re-association so it is not relevant. But this should work:
diff --git a/drivers/net/wireless/iwlwifi/mvm/power.c b/drivers/net/wireless/iwlwifi/mvm/power.c
index 4aab126..e67271c 100644
--- a/drivers/net/wireless/iwlwifi/mvm/power.c
+++ b/drivers/net/wireless/iwlwifi/mvm/power.c
@@ -857,6 +857,7 @@ int iwl_mvm_enable_beacon_filter(struct iwl_mvm *mvm,
 		.bf_enable_beacon_filter = cpu_to_le32(1),
 	};

+	return 0;
 	return _iwl_mvm_enable_beacon_filter(mvm, vif, &cmd, flags, false);
 }
I meant twice a day for "bursts of errors". It happens a lot of times during some minutes and then stops. Hours later it starts again for some minutes. Thanks Emmanuel for the patch, I will test it and come back.
Created attachment 135921 [details]
dmesg.log
Hi,
I'm seeing the issue (disconnection bursts) with the latest .9 firmware, dmesg.log attached (this is a short occurrence from a couple of minutes ago, I've seen more severe ones an hour before).
The kernel I'm running does not have the latest patch you suggested (it does not apply on 3.14, I'll need to fetch a more recent kernel).
Hope it helps,
Florian
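The "bursts" described in this thread (many disconnects within a few minutes, then hours of quiet) can be located in a syslog by grouping matching lines whose timestamps lie close together. The following is a rough sketch under stated assumptions: the function name, the 2-minute gap and the sample lines are invented for illustration; matching on the substring "DISCONNECT" follows the wpa_supplicant messages mentioned above; and the year has to be supplied because classic syslog timestamps omit it.

```python
from datetime import datetime, timedelta


def find_bursts(lines, year=2014, marker="DISCONNECT",
                max_gap=timedelta(minutes=2)):
    """Group lines containing marker into bursts.

    Returns a list of (first timestamp, last timestamp, count) tuples.
    Assumes each line starts with a syslog timestamp like "Apr 15 10:00:01".
    """
    bursts = []
    for line in lines:
        if marker not in line:
            continue
        stamp = datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")
        if bursts and stamp - bursts[-1][1] <= max_gap:
            # Close enough to the previous event: extend the current burst.
            start, _, count = bursts[-1]
            bursts[-1] = (start, stamp, count + 1)
        else:
            bursts.append((stamp, stamp, 1))
    return bursts


# Hypothetical sample lines, not taken from any attachment in this bug.
sample = [
    "Apr 15 10:00:01 host wpa_supplicant[812]: wlan0: CTRL-EVENT-DISCONNECTED reason=4",
    "Apr 15 10:00:40 host wpa_supplicant[812]: wlan0: CTRL-EVENT-DISCONNECTED reason=4",
    "Apr 15 10:01:55 host wpa_supplicant[812]: wlan0: CTRL-EVENT-DISCONNECTED reason=4",
    "Apr 15 12:10:00 host kernel: wlan0: associated",
    "Apr 15 14:30:10 host wpa_supplicant[812]: wlan0: CTRL-EVENT-DISCONNECTED reason=4",
]
bursts = find_bursts(sample)
```

The start and end times of each burst make it easier to cut the matching window out of dmesg or a trace before attaching it.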
Can you run tracing? Or re-compile with CFG80211_REG_DEBUG and CFG80211_DEVELOPER_WARNINGS?
Hi Emmanuel,
I've built a 3.15 kernel: 3.15.0-rc5-g14186fe with your beacon filter patch and CFG80211_REG_DEBUG / CFG80211_DEVELOPER_WARNINGS enabled.
I'll keep you updated. However, I'm a bit concerned regarding these last two options: they do not seem to have had much effect, see:
$ dmesg | grep cfg80211
[ 19.550955] cfg80211: Calling CRDA to update world regulatory domain
[ 20.127536] cfg80211: Ignoring regulatory request set by core since the driver uses its own custom regulatory domain
[ 392.162117] cfg80211: All devices are disconnected, going to restore regulatory settings
[ 392.162120] cfg80211: Restoring regulatory settings
[ 392.162123] cfg80211: Kicking the queue
[ 392.162127] cfg80211: Calling CRDA to update world regulatory domain
I would have expected more logs.
The kernel config is here:
$ zcat /proc/config.gz | sprunge
http://sprunge.us/cGDb
trace-cmd seems to have more useful data:
$ trace-cmd record -e iwlwifi -e iwlwifi_msg -e mac80211 -e cfg80211
[..]
$ trace-cmd report | sprunge
http://sprunge.us/CQNf
Did I miss something for the DEBUG defines? I'll run the tracing command in the meantime.
Ok - on another bug, someone reported that 3.15 is fine regardless of the firmware version running. Can someone test this? Thanks.
Hi Emmanuel, I did not see any occurrence of the issue with 3.15 + beacon filter patch in two days. I'll try without the patch tonight.
Hi Florian, good. Thank you. What firmware are you using?
Hi Emmanuel,
iwlwifi 0000:03:00.0: loaded firmware version 23.214.9.0 op_mode iwlmvm
I'm reverting to an unpatched 3.15-rc5 right now, I'll run it over the weekend and let you know how it went by Monday :)
I am spending this weekend with the "problematic" wireless again. I'm currently running an unpatched 3.15-rc5 and firmware 22.24.8.0. So far, I've had no disconnects for 5h. That's pretty good, but that also occasionally happened with the older kernels. I will keep you posted.
@Florian, from another bug report it seems that plain 3.15-rc5 solves the issues too... OTOH, I am pretty sure there are several issues that lead to the same effect... Thanks for your help!
Ok - I think I have a lead. Can someone try this:
diff --git a/drivers/net/wireless/iwlwifi/mvm/mac80211.c b/drivers/net/wireless/iwlwifi/mvm/mac80211.c
index cd6ea2e..17c097d 100644
--- a/drivers/net/wireless/iwlwifi/mvm/mac80211.c
+++ b/drivers/net/wireless/iwlwifi/mvm/mac80211.c
@@ -619,7 +619,7 @@ static int iwl_mvm_mac_add_interface(struct ieee80211_hw *hw,
 	if (ret)
 		goto out_remove_mac;

-	if (!mvm->bf_allowed_vif &&
+	if (!mvm->bf_allowed_vif && false &&
 	    vif->type == NL80211_IFTYPE_STATION && !vif->p2p &&
 	    mvm->fw->ucode_capa.flags & IWL_UCODE_TLV_FLAGS_BF_UPDATED){
 		mvm->bf_allowed_vif = mvmvif;
on 3.13 / 3.14? Thanks!
Building now with that patch on 3.14. I have not seen the issue since my last comment. But it's been a sunny weekend so far, so I did not spend that much time with the laptop ;)
Created attachment 136601 [details]
disable beacon filtering
This is a fix candidate for 3.13 / 3.14.
Please test - thank you.
Not sure if this is still interesting, but: I have been running 3.15-rc5 the entire weekend and did not get a single disconnect.
(In reply to Ralf Jung from comment #70)
> Not sure if this is still interesting, but: I have been running 3.15-rc5 the
> entire weekend and did not get a single disconnect.
:) not very interesting indeed. But thank you! What I'd really like to see is 3.14 / 3.13 with the patch from comment 69. Thanks!
Well, I was only using that wireless for this weekend... I can try that patch next time I'm there, which is probably in three weeks. Or whatever patch is "current" then ;-)
I've been running the 3.13 / 3.14 patch for a few hours now:
21:04:53 ey3ball@omnicron ~ :) $ uname -a
Linux omnicron 3.14.2-1-custom #2 SMP PREEMPT Sun May 18 16:49:31 CEST 2014 x86_64 GNU/Linux
20:50:36 ey3ball@omnicron ~ :) $ uptime
21:04:53 up 4:02, 2 users, load average: 0.29, 0.22, 0.16
No issue to report so far. Will post an update later. Do you need logs of any sort?
@Florian - is this an improvement compared to 3.14 without the patch? If you don't have issues, no need for logs, but it can't hurt :)
4 hours without seeing the issue is nice, and feels like an improvement, but I haven't moved around much. I'm going to use the laptop at another location in the house which is usually more problematic; this should be a good test. Depending on "the weather" the issue could happen from multiple times per hour to once or twice every few hours, so we're not there yet ;).
$ uptime
00:34:56 up 7:32, 2 users, load average: 0.17, 0.16, 0.13
So far so good, it definitely looks like things have improved with that patch. After 8+ hours no sign of this issue. I've run into a new problem though; Emmanuel, you might want to have a look at bug #42978 (it does not look like you're in copy there ATM, although I might be mistaken).
Ok - I'll try to push this upstream. Regarding the second bug, this is really weird; I'll try to take a look.
I have not seen this bug (#72601) tonight; however, #42978 already bit me twice in a few hours. I've posted new information there.
Florian says that this issue is fixed. Closing this bug now. Please re-open if the proposed solution doesn't fix the issue for you. Also - the patch has been sent to GregKH and it will hit 3.13 / 3.14.
Hi there, I have bad news: I just arrived from work, booted the laptop (3.14 kernel with your patch), and ... disconnection bursts.
[..]
[ 179.109532] wlp3s0: authenticate with d4:ca:6d:25:f3:fc
[ 179.115839] wlp3s0: send auth to d4:ca:6d:25:f3:fc (try 1/3)
[ 179.119493] wlp3s0: authenticated
[ 179.120368] wlp3s0: associate with d4:ca:6d:25:f3:fc (try 1/3)
[ 179.133101] wlp3s0: RX AssocResp from d4:ca:6d:25:f3:fc (capab=0x411 status=0 aid=4)
[ 179.136458] wlp3s0: associated
[ 184.479257] cfg80211: Calling CRDA to update world regulatory domain
[ 188.196057] wlp3s0: authenticate with d4:ca:6d:25:f3:fc
[ 188.202696] wlp3s0: send auth to d4:ca:6d:25:f3:fc (try 1/3)
[ 188.205962] wlp3s0: authenticated
[ 188.207032] wlp3s0: associate with d4:ca:6d:25:f3:fc (try 1/3)
[ 188.211204] wlp3s0: RX AssocResp from d4:ca:6d:25:f3:fc (capab=0x411 status=0 aid=4)
[ 188.215095] wlp3s0: associated
[ 190.448612] cfg80211: Calling CRDA to update world regulatory domain
[ 194.165158] wlp3s0: authenticate with d4:ca:6d:25:f3:fc
[ 194.171331] wlp3s0: send auth to d4:ca:6d:25:f3:fc (try 1/3)
[ 194.174515] wlp3s0: authenticated
[ 194.176913] wlp3s0: associate with d4:ca:6d:25:f3:fc (try 1/3)
[ 194.180978] wlp3s0: RX AssocResp from d4:ca:6d:25:f3:fc (capab=0x411 status=0 aid=4)
Interestingly, I rebooted once to the same kernel, saw the bursts again, then switched to 3.15 and saw a "mini" burst (sadly small enough that I did not get a chance to save a trace).
There is a new message in syslog though:
[ 20.491876] psmouse serio2: trackpoint: IBM TrackPoint firmware: 0x0e, buttons: 3/3
[ 20.711555] input: TPPS/2 IBM TrackPoint as /devices/platform/i8042/serio1/serio2/input/input10
[ 23.348966] wlp3s0: authenticate with d4:ca:6d:25:f3:fc
[ 23.357247] wlp3s0: send auth to d4:ca:6d:25:f3:fc (try 1/3)
[ 23.359901] wlp3s0: authenticated
[ 23.360898] wlp3s0: associate with d4:ca:6d:25:f3:fc (try 1/3)
[ 23.364986] wlp3s0: RX AssocResp from d4:ca:6d:25:f3:fc (capab=0x411 status=0 aid=4)
[ 23.368852] wlp3s0: associated
[ 23.368908] IPv6: ADDRCONF(NETDEV_CHANGE): wlp3s0: link becomes ready
[ 23.664309] iwlwifi 0000:03:00.0: No association and the time event is over already...
[ 23.664348] wlp3s0: Connection to AP d4:ca:6d:25:f3:fc lost
[ 23.691365] cfg80211: Calling CRDA to update world regulatory domain
[ 27.408453] wlp3s0: authenticate with d4:ca:6d:25:f3:fc
[ 27.414487] wlp3s0: send auth to d4:ca:6d:25:f3:fc (try 1/3)
[ 27.417045] wlp3s0: authenticated
[ 27.418693] wlp3s0: associate with d4:ca:6d:25:f3:fc (try 1/3)
[ 27.422829] wlp3s0: RX AssocResp from d4:ca:6d:25:f3:fc (capab=0x411 status=0 aid=4)
[ 27.426200] wlp3s0: associated
[ 27.721541] iwlwifi 0000:03:00.0: No association and the time event is over already...
[ 27.721578] wlp3s0: Connection to AP d4:ca:6d:25:f3:fc lost
[ 27.799266] cfg80211: Calling CRDA to update world regulatory domain
[ 31.488935] wlp3s0: authenticate with d4:ca:6d:25:f3:fc
[ 31.494093] wlp3s0: send auth to d4:ca:6d:25:f3:fc (try 1/3)
[ 31.496679] wlp3s0: authenticated
[ 31.499832] wlp3s0: associate with d4:ca:6d:25:f3:fc (try 1/3)
[ 31.504101] wlp3s0: RX AssocResp from d4:ca:6d:25:f3:fc (capab=0x411 status=0 aid=4)
[ 31.507325] wlp3s0: associated
I'll send a 3.14 trace shortly.
I'll send a trace shortly.
Created attachment 136851 [details]
trace with 3.14 + patch during a burst
What is the version of your supplicant?
core/wpa_supplicant 2.1-3 [installed]
This one has real issues I think. Can you take the latest master branch?
git clone git://w1.fi/srv/git/hostap.git
Ok, built it from the following commit:
commit 4e0a94b7dc76db58cddbbcfe0be0bfef547f6dd0
Author: Jouni Malinen <jouni@qca.qualcomm.com>
Date: Fri May 16 19:24:47 2014 +0300
Rebooting now to a 3.14 kernel with this supplicant.
I've only seen one disconnection since my last message, no big burst yet.
Created attachment 137651 [details]
trace.dat with patch + latest supplicant
Hi Emmanuel,
I saw some big bursts tonight.
I can count at least 3 long series of
[48822.626368] wlp3s0: authenticate with d4:ca:6d:25:f3:fc
[48822.632610] wlp3s0: send auth to d4:ca:6d:25:f3:fc (try 1/3)
[48822.635836] wlp3s0: authenticated
[48822.639236] wlp3s0: associate with d4:ca:6d:25:f3:fc (try 1/3)
[48822.652132] wlp3s0: RX AssocResp from d4:ca:6d:25:f3:fc (capab=0x411 status=0 aid=3)
[48822.655948] wlp3s0: associated
[48824.246838] cfg80211: Calling CRDA to update world regulatory domain
[48827.953921] wlp3s0: authenticate with d4:ca:6d:25:f3:fc
[48827.960065] wlp3s0: send auth to d4:ca:6d:25:f3:fc (try 1/3)
[48827.963370] wlp3s0: authenticated
[48827.965032] wlp3s0: associate with d4:ca:6d:25:f3:fc (try 1/3)
[48827.969158] wlp3s0: RX AssocResp from d4:ca:6d:25:f3:fc (capab=0x411 status=0 aid=3)
[48827.980664] wlp3s0: associated
[48830.691704] cfg80211: Calling CRDA to update world regulatory domain
[48834.393664] wlp3s0: authenticate with d4:ca:6d:25:f3:fc
[48834.400271] wlp3s0: send auth to d4:ca:6d:25:f3:fc (try 1/3)
[48834.403484] wlp3s0: authenticated
[48834.405492] wlp3s0: associate with d4:ca:6d:25:f3:fc (try 1/3)
[48834.409509] wlp3s0: RX AssocResp from d4:ca:6d:25:f3:fc (capab=0x411 status=0 aid=3)
[48834.413842] wlp3s0: associated
[48836.217129] cfg80211: Calling CRDA to update world regulatory domain
Two of them occurred while I was AFK. I was eventually able to catch the third one live and record a trace (attached).
For reference :
kernel = 3.14.2-1-custom (3.14 + beacon patch)
firmware = 23.214.9.0
wpa_supplicant = wpa_supplicant v2.2-devel (git, 2014-05-20)
Any thoughts ?
Thank you,
Florian
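For anyone who wants to count such bursts in their own logs rather than by eye, here is a minimal Python sketch. It is not part of the driver or of any existing tool: the function name `find_reconnect_bursts` and both thresholds (`max_gap`, `min_events`) are assumptions made up for illustration. It only looks for the "associated" lines shown in the excerpt above and groups those that follow each other closely.

```python
import re

# Matches kernel log lines such as "[48822.655948] wlp3s0: associated".
# "associate with ..." and "authenticated" lines deliberately do not match.
ASSOC_RE = re.compile(r"\[\s*(\d+\.\d+)\]\s+(\S+): associated\s*$")


def find_reconnect_bursts(lines, max_gap=10.0, min_events=3):
    """Group 'associated' events into bursts.

    A burst is a run of at least `min_events` associations on the same
    interface where consecutive events are at most `max_gap` seconds apart.
    Both thresholds are illustrative assumptions, not values from the driver.
    Returns a list of (interface, first_timestamp, last_timestamp, count).
    """
    events = {}  # interface name -> list of timestamps, in log order
    for line in lines:
        match = ASSOC_RE.search(line)
        if match:
            events.setdefault(match.group(2), []).append(float(match.group(1)))

    bursts = []
    for iface, stamps in events.items():
        run = []
        for stamp in stamps:
            # A gap larger than max_gap closes the current run.
            if run and stamp - run[-1] > max_gap:
                if len(run) >= min_events:
                    bursts.append((iface, run[0], run[-1], len(run)))
                run = []
            run.append(stamp)
        if len(run) >= min_events:
            bursts.append((iface, run[0], run[-1], len(run)))
    return bursts
```

The function takes any iterable of log lines, for example the lines of a saved dmesg dump. Fed the excerpt above, the three associations a few seconds apart come out as a single burst.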
Additional information: I was caught in a "bad burst" (not stopping for more than 5 minutes), so I eventually rebooted to 3.15 and saw exactly the same behaviour as described in comment #82 (i.e. a mini burst after reboot, then I'm back to a stable connection). Sadly I failed to get a trace running fast enough under 3.15 again :(
Ok - thank you for that. I can see very clearly that this is a FW / radio / environment issue. The FW is reporting that it missed beacons. We give it a bit of time, but then when we missed 10 beacons we report that the connection is lost. I am trying to see what could have solved this in 3.15. Note that there are tons of patches that I tagged for 3.14 and that weren't merged even though they hit Linus's tree. Weird...
I'll close this as the issue was fixed in 3.14.6.
https://git.kernel.org/cgit/linux/kernel/git/stable/linux-stable.git/commit/?id=f47fc3c1b48dd8fc7a0a591551454459eca0ca94
So no further testing necessary on 3.14.6? If I understand correctly, the fix is only partial and no further action will be taken since 3.15 is out?
(In reply to Florian Vallee from comment #93)
> So no further testing necessary on 3.14.6 ? If I understand correctly the
> fix is only partial and no further action will be taken since 3.15 is out ?
Not entirely true. The original issues were:
1) bad uAPSD behavior with -8.ucode - uAPSD is disabled in 3.14.6
2) bad beacon filtering - beacon filtering is disabled in 3.14.6
So from my POV, 3.14.6 is fine, or at least not worse than 3.15. There is still this other bug: bug 42978, but I think we can close the current one. If not, let me know what I am missing here.
> If not, let me know what I am missing here.
Great, thanks for the clarification. I did not notice the uAPSD patch, hence my question.
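For readers trying to picture the missed-beacon behaviour Emmanuel describes earlier in the thread (the connection is reported lost once 10 beacons have been missed), the following Python sketch models it as a simple counter. This is only an illustration, not the iwlwifi or mac80211 code: the class and method names are hypothetical, and resetting the counter whenever a beacon is received (i.e. counting consecutive misses) is an assumption, since the thread does not say how the count is kept.

```python
class BeaconLossMonitor:
    """Toy model of 'report connection lost after N missed beacons'.

    Hypothetical names throughout; the threshold of 10 is the figure quoted
    in the thread, and the reset-on-receive rule is an assumption.
    """

    def __init__(self, threshold=10):
        self.threshold = threshold
        self.missed = 0
        self.connected = True

    def on_beacon_received(self):
        # Assumption: any received beacon clears the miss counter.
        self.missed = 0

    def on_beacon_missed(self):
        """Return True exactly once, when the loss threshold is reached."""
        if not self.connected:
            return False
        self.missed += 1
        if self.missed >= self.threshold:
            self.connected = False
            return True
        return False
```

Under this model a short run of missed beacons is tolerated, which matches the "we give it a bit of time" part of the explanation; only a sustained run of misses ends in a reported disconnect.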