Bug 15374
Summary: | iwlagn Microcode SW error detected | ||
---|---|---|---|
Product: | Drivers | Reporter: | Trenton D. Adams (trenton.d.adams) |
Component: | network-wireless | Assignee: | Reinette Chatre (reinette.chatre) |
Status: | RESOLVED CODE_FIX | ||
Severity: | normal | CC: | claudiomkd, linville, reinette.chatre |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | vanilla 2.6.33-rc6 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
Ensure aggregation cleanup succeeds after firmware restart
iwl (wlan0) log before the wifi stops working
iwl (wlan0) log after the wifi stops working |
Description
Trenton D. Adams
2010-02-22 20:25:33 UTC
1. the first message happens before I unload the module
2. the second message happens after I unload the module
3. my interface statistics are below.

wlan0     Link encap:Ethernet HWaddr 00:21:6a:11:3d:62
          inet addr:10.0.1.6 Bcast:10.0.1.255 Mask:255.255.255.0
          inet6 addr: fe80::221:6aff:fe11:3d62/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:3061987 errors:0 dropped:0 overruns:0 frame:0
          TX packets:3578411 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:2436016088 (2.2 GiB) TX bytes:530748011 (506.1 MiB)

I'll give another set of stats next time it happens.

Which version of the firmware are you using? This is usually printed after driver is loaded when interface is brought up.

Feb 28 21:35:50 tdanotebook kernel: iwlagn: Intel(R) Wireless WiFi Link AGN driver for Linux, 2.6.33-rc6-ks
Feb 28 21:35:50 tdanotebook kernel: iwlagn: Copyright(c) 2003-2009 Intel Corporation
Feb 28 21:35:50 tdanotebook kernel: iwlagn 0000:04:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
Feb 28 21:35:50 tdanotebook kernel: iwlagn 0000:04:00.0: setting latency timer to 64
Feb 28 21:35:50 tdanotebook kernel: iwlagn 0000:04:00.0: Detected Intel Wireless WiFi Link 5300AGN REV=0x24
Feb 28 21:35:50 tdanotebook kernel: iwlagn 0000:04:00.0: Tunable channels: 13 802.11bg, 24 802.11a channels
Feb 28 21:35:50 tdanotebook kernel: iwlagn 0000:04:00.0: irq 25 for MSI/MSI-X
Feb 28 21:35:50 tdanotebook kernel: phy1: Selected rate control algorithm 'iwl-agn-rs'
Feb 28 21:35:50 tdanotebook kernel: iwlagn 0000:04:00.0: firmware: requesting iwlwifi-5000-2.ucode
Feb 28 21:35:50 tdanotebook kernel: iwlagn 0000:04:00.0: loaded firmware version 8.24.2.12
Feb 28 21:35:50 tdanotebook kernel: ADDRCONF(NETDEV_UP): wlan0: link is not ready

Created attachment 25346 [details]
Ensure aggregation cleanup succeeds after firmware restart
Tracking down the firmware error that triggers the problem will take some time. In the meantime, could you please extract and apply the attached patch series to your kernel? It was created against 2.6.33; can you test that kernel?
This series ensures that aggregation is cleaned up properly after a firmware restart.
Yes, I can test a new kernel. I have the linus git tree, is it on there somewhere? If so, just let me know what to do. Otherwise, I'll apply the patch. Thanks.

(In reply to comment #5)
> Yes, I can test a new kernel. I have the linus git tree, is it on there
> somewhere? If so, just let me know what to do. Otherwise, I'll apply the
> patch.
>

Please apply these patches on top of vanilla 2.6.33. Thanks

Hello,

Is this bug fixed in 2.6.33? I have a strange problem as well. I was using a 2.6.31.x without problems. All the problems started with 2.6.32. After some time of using wifi, my card got dead, meaning that it was associated but there was no response whatsoever. Anyway, I had wifi for some hours, but I suspect it depended on the volume of traffic.

Now, with 2.6.33, my wifi card (iwl5000) works for a few seconds (when I start downloading something) and then it stops working, just as with 2.6.32. I tried all versions of 2.6.32 and 2.6.33 but I had no luck.

Today I decided to try 2.6.31, since I thought it might be something different from the kernel, but I can confirm that it is the kernel. I am now running a 2.6.31.12 and wifi has been working for some hours and I have downloaded/uploaded huge amount of data (in order to test it), but everything works ok.

Could this problem be related to this issue? Shall I provide more details or info about what is happening?

Thanks in advance,
Claudio M. Camacho

(In reply to comment #7)
> Could this problem be related to this issue? Shall I provide more details or
> info about what is happening?

Does your logs show the same symptoms? Specifically, are you seeing the same firmware error followed by similar errors? If so, then this could be the same problem, but it is hard to say without logs.

(In reply to comment #6)
> (In reply to comment #5)
> > Yes, I can test a new kernel. I have the linus git tree, is it on there
> > somewhere? If so, just let me know what to do. Otherwise, I'll apply the
> > patch.
> >
>
> Please apply these patches on top of vanilla 2.6.33. Thanks

Okay, done. Now I'm just waiting to see what happens.

Created attachment 25411 [details]
iwl (wlan0) log before the wifi stops working
Created attachment 25412 [details]
iwl (wlan0) log after the wifi stops working
Please notice that I don't get any error in dmesg (messages), I just get an additional line saying:

iwlagn 0000:04:00.0: iwl_tx_agg_start on ra = 00:1c:f0:f0:16:c4 tid = 0

After this, the wifi stops working. Actually, if I just leave a transfer open, it will download at irregular intervals, meaning that every now and then the wifi card works and some data is downloaded (for a couple of seconds) and then the wifi card stops again.

This is very strange.. I actually don't know how to profile this..

(In reply to comment #12)
> Please notice that I don't get any error in dmesg (messages), I just get an
> additional line saying:
>
> iwlagn 0000:04:00.0: iwl_tx_agg_start on ra = 00:1c:f0:f0:16:c4 tid = 0
>
>
> After this, the wifi stops working. Actually, if I just leave a transfer open,
> it will download at irregular intervals, meaning that every now and then the
> wifi card works and some data is downloaded (for a couple of seconds) and then
> the wifi card stops again.
>
> This is very strange.. I actually don't know how to profile this..

The logs you provided have nothing in common with the original report. Please do stop confusing this bug report with this issue. You can submit a new bug report, but before you do so, please first take a look if it is not a duplicate of http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2120 or http://bugzilla.intellinuxwireless.org/show_bug.cgi?id=2129 . It may then be better to add yourself to CC list and add your logs to that bug report since work is in progress on those issues.

I see, I know the logs are completely different. I think the correct bug is #2129. Sorry for the misunderstanding.

(In reply to comment #9)
> (In reply to comment #6)
> >
> > Please apply these patches on top of vanilla 2.6.33. Thanks
>
> Okay, done. Now I'm just waiting to see what happens.

Trenton, how is the testing going?
(In reply to comment #15)
> (In reply to comment #9)
> > (In reply to comment #6)
> > >
> > > Please apply these patches on top of vanilla 2.6.33. Thanks
> >
> > Okay, done. Now I'm just waiting to see what happens.
>
> Trenton, how is the testing going?

Nothing is happening so far. My interface has received almost a GIG now, so I'm not quite up to what I was last time. Did you fix something, or was the patch only for getting information at the time of the problem?

# dirty implying your patched version.
uname -a
Linux tdanotebook 2.6.33-dirty #9 SMP Mon Mar 8 12:26:52 CST 2010 x86_64 Intel(R) Core(TM)2 Duo CPU P8700 @ 2.53GHz GenuineIntel GNU/Linux

wlan0     Link encap:Ethernet HWaddr 00:21:6a:11:3d:62
          inet addr:10.0.1.5 Bcast:10.0.1.255 Mask:255.255.255.0
          inet6 addr: fe80::221:6aff:fe11:3d62/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:2458152 errors:0 dropped:0 overruns:0 frame:0
          TX packets:2847427 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:993953444 (947.9 MiB) TX bytes:709225297 (676.3 MiB)

oops, using two different emails apparently, removing one

(In reply to comment #16)
> Nothing is happening so far. My interface has received almost a GIG now, so
> I'm not quite up to what I was last time. Did you fix something, or was the
> patch only for getting information at the time of the problem?

The patch series was a fix. The firmware error may still occur, but there is nothing we can do about this at this time. What the patches do is recover well if ever a firmware error occurs. Can you check your logs if there was perhaps a firmware error as before? You can search for the string "Microcode SW error"

Your initial report stated "I can reproduce these problems every time, as long as I leave my machine running; it's just a matter of time." ... so it really looks as though these patches are working for you.

Hi Reinette,

Nothing in the log yet, since the kernel file datestamp. I was taking a break from working though, for almost a week. So, if it is tied to the amount of data transferred, I haven't reached the limit that I did previously. I'll watch for it in the logs, and see if that comes up. I'll also look into possibly doing a large transfer today.

Hello Reinette,

Last night I did a large data transfer, and it did happen. My network is still working.

Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: Microcode SW error detected. Restarting 0x2000000.
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: Start IWL Error Log Dump:
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: Status: 0x000212E4, count: 5
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: Desc Time data1 data2 line
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: NMI_INTERRUPT_WDG (#04) 2976450706 0x00000002 0x07030000 61630
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: blink1 blink2 ilink1 ilink2
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: 0x005AA 0x006E8 0x008B2 0x0E5EA
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: Start IWL Event Log Dump: display last 20 entries
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: EVT_LOGT:3557267118:0x00000436:0323
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: EVT_LOGT:3557267143:0x00000000:1350
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: EVT_LOGT:3557267144:0x00000000:1351
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: EVT_LOGT:3557267144:0x00000000:1352
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: EVT_LOGT:3557267145:0x00000002:1353
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: EVT_LOGT:3557267691:0x00000094:0322
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: EVT_LOGT:3557267879:0x0a17001c:0206
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: EVT_LOGT:3557267881:0x00000001:0204
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: EVT_LOGT:3557267885:0x2a100dfa:0227
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: EVT_LOGT:3557267886:0xc0f00c00:0228
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: EVT_LOGT:3557269748:0x00000000:0263
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: EVT_LOGT:3557271373:0x0000010f:0106
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: EVT_LOGT:3557271374:0x00000000:0302
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: EVT_LOGT:3557271632:0x00000000:0351
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: EVT_LOGT:3557271707:0x0a17001c:0206
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: EVT_LOGT:3557271709:0x00000001:0204
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: EVT_LOGT:3557271713:0x2a100dfa:0227
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: EVT_LOGT:3557271714:0xc0f00930:0228
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: EVT_LOGT:3557471682:0x000000d7:0123
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: EVT_LOGT:3557471690:0x00000000:0125
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: Stopping AGG while state not ON or starting
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: queue number out of range: 0, must be 10 to 19
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: Stopping AGG while state not ON or starting
Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: queue number out of range: 0, must be 10 to 19
Mar 21 00:07:56 tdanotebook kernel: iwlagn 0000:04:00.0: iwl_tx_agg_start on ra = 00:1f:f3:c3:88:b4 tid = 0

wlan0     Link encap:Ethernet HWaddr 00:21:6a:11:3d:62
          inet addr:10.0.1.5 Bcast:10.0.1.255 Mask:255.255.255.0
          inet6 addr: fe80::221:6aff:fe11:3d62/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
          RX packets:3533628 errors:0 dropped:0 overruns:0 frame:0
          TX packets:4388833 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:1618662921 (1.5 GiB) TX bytes:1130521808 (1.0 GiB)

(In reply to comment #20)
> Last night I did a large data transfer, and it did happen. My network is
> still working.
>
> Mar 21 00:00:16 tdanotebook kernel: iwlagn 0000:04:00.0: Microcode SW error
> detected. Restarting 0x2000000.

This was the goal of the patches. A firmware error occurs, which we cannot do anything about at this time, and the new changes enable the driver to recover without affecting the user. |
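
For anyone wanting to repeat the check suggested earlier in this thread (searching the logs for the string "Microcode SW error"), the following is a minimal sketch rather than an official tool. It assumes syslog-style lines shaped like the ones pasted above. The default log path `/var/log/messages` and the names `find_firmware_errors` and `FirmwareError` are illustrative assumptions, not part of the driver or of any distribution.

```python
import re
import sys
from typing import Iterable, List, NamedTuple

# String the thread suggests searching for.
MARKER = "Microcode SW error"

# Assumed line shape, taken from the log excerpts in this report:
#   "Mar 21 00:00:16 <host> kernel: iwlagn <pci-address>: <message>"
LINE_RE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+kernel:\s+iwlagn\s+(?P<device>\S+):\s+(?P<message>.*)$"
)


class FirmwareError(NamedTuple):
    """One matching log line (hypothetical helper type)."""
    timestamp: str
    device: str
    message: str


def find_firmware_errors(lines: Iterable[str]) -> List[FirmwareError]:
    """Return every iwlagn log line that contains the firmware error marker."""
    hits = []
    for line in lines:
        if MARKER not in line:
            continue
        match = LINE_RE.match(line.strip())
        if match:
            hits.append(
                FirmwareError(match["timestamp"], match["device"], match["message"])
            )
    return hits


if __name__ == "__main__":
    # The log location is an assumption; pass the real path as the first argument.
    path = sys.argv[1] if len(sys.argv) > 1 else "/var/log/messages"
    with open(path, errors="replace") as fh:
        for hit in find_firmware_errors(fh):
            print(hit.timestamp, hit.device, hit.message)
```

Run against the system log, this prints one line per firmware error. An empty result over a period of heavy traffic would suggest the error has not recurred, while a hit with the network still working matches the recovery behaviour described above.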