Bug 33852 - Regression of AR2413 802.11bg in 2.6.38.4
Summary: Regression of AR2413 802.11bg in 2.6.38.4
Status: RESOLVED PATCH_ALREADY_AVAILABLE
Alias: None
Product: Process Management
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 normal
Assignee: process_other
URL:
Keywords:
Depends on:
Blocks: 27352
Reported: 2011-04-23 12:12 UTC by Boris Popov
Modified: 2011-06-07 13:36 UTC
CC List: 9 users

See Also:
Kernel Version: 2.6.38
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
/var/log/messages (5.39 KB, text/plain)
2011-04-23 12:12 UTC, Boris Popov
lspci and iwlist in kernel 2.6.38.4 (2.58 KB, text/plain)
2011-04-27 03:28 UTC, Boris Popov
lspci and iwlist in kernel 2.6.37.6 (21.49 KB, text/plain)
2011-04-27 03:30 UTC, Boris Popov
first bad commit (763 bytes, text/plain)
2011-04-30 07:16 UTC, Boris Popov
log of bisecting (3.63 KB, text/plain)
2011-04-30 07:17 UTC, Boris Popov
first bad commit (new) (726 bytes, text/plain)
2011-05-07 18:14 UTC, Boris Popov
log of bisecting (new) (1.94 KB, text/plain)
2011-05-07 18:15 UTC, Boris Popov

Description Boris Popov 2011-04-23 12:12:24 UTC
Created attachment 55122
/var/log/messages

AR2413 works well in my notebook with kernels up to and including 2.6.37.6,
but doesn't work with kernel 2.6.38.4.
Comment 1 John W. Linville 2011-04-25 13:42:20 UTC
I don't see anything unusual in your /var/log/messages output.  Can you describe the problem more precisely?  What "doesn't work"?
Comment 2 Boris Popov 2011-04-26 09:47:17 UTC
(In reply to comment #1)
> I don't see anything unusual in your /var/log/messages output.  Can you
> describe the problem more precisely?  What "doesn't work"?

I think the problem is in this line: Apr 23 15:39:27 laptop kernel: [   15.668598] ADDRCONF(NETDEV_UP): wlan0: link is not ready.

I can't connect to my router.

What info do you need?
Comment 3 Boris Popov 2011-04-26 13:19:25 UTC
iwconfig wlan0 essid router.popov.net key XXXXXXXXXXXXXXXXXXXXXXXXXX

iwconfig wlan0

wlan0     IEEE 802.11bg  ESSID:"router.popov.net"  
          Mode:Managed  Frequency:2.452 GHz  Access Point: Not-Associated   
          Tx-Power=20 dBm   
          Retry  long limit:7   RTS thr:off   Fragment thr:off
          Encryption key:XXXX-XXXX-XXXX-XXXX-XXXX-XXXX-XX
          Power Management:off
Comment 4 Boris Popov 2011-04-26 13:26:45 UTC
iwconfig wlan0 essid router.popov.net key XXXXXXXXXXXXXXXXXXXXXXXXXX

iwconfig wlan0

wlan0     IEEE 802.11bg  ESSID:"router.popov.net"  
          Mode:Managed  Frequency:2.452 GHz  Access Point: Not-Associated   
          Tx-Power=20 dBm   
          Retry  long limit:7   RTS thr:off   Fragment thr:off
          Encryption key:XXXX-XXXX-XXXX-XXXX-XXXX-XXXX-XX
          Power Management:off
Comment 5 Boris Popov 2011-04-26 13:27:54 UTC
kernel 2.6.37.6:

boris@laptop:~$ /sbin/iwconfig wlan0
wlan0     IEEE 802.11bg  ESSID:"router.popov.net"  
          Mode:Managed  Frequency:2.452 GHz  Access Point: 00:18:E7:F7:B3:9D   
          Bit Rate=54 Mb/s   Tx-Power=20 dBm   
          Retry  long limit:7   RTS thr:off   Fragment thr:off
          Power Management:off
          Link Quality=68/70  Signal level=-42 dBm  
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:53   Missed beacon:0
Comment 6 John W. Linville 2011-04-26 15:07:10 UTC
Still not much information...perhaps we could see the output of "lspci -n" and "iwlist wlan0 scan"?
Comment 7 Boris Popov 2011-04-27 03:28:46 UTC
Created attachment 55592
lspci and iwlist in kernel 2.6.38.4
Comment 8 Boris Popov 2011-04-27 03:30:07 UTC
Created attachment 55602
lspci and iwlist in kernel 2.6.37.6
Comment 9 Boris Popov 2011-04-27 03:40:00 UTC
(In reply to comment #6)
> Still not much information...perhaps we could see the output of "lspci -n"
> and
> "iwlist wlan0 scan"?

Ok John, please see attachment.
Can I help you any more (debuginfo etc...)?
Comment 10 John W. Linville 2011-04-27 19:08:52 UTC
So, it looks like you aren't receiving any scan information (and possibly nothing at all).

Could you do a git bisect between 2.6.37 and 2.6.38.4?
Comment 11 Boris Popov 2011-04-28 12:15:35 UTC
I will try.
Comment 12 Boris Popov 2011-04-30 04:19:12 UTC
> Could you do a git bisect between 2.6.37 and 2.6.38.4?

I am doing the bisect.

In 42c025f3de9042d9c9abd9a6f6205d1a0f4bcadf:

iwlist wlan0 scan works well, but
in kern.log:

Apr 30 07:58:24 laptop kernel: [   67.073537] ADDRCONF(NETDEV_UP): wlan0: link is not ready
Apr 30 07:58:25 laptop kernel: [   67.258316] wlan0: authenticate with 00:18:e7:f7:b3:9d (try 1)
Apr 30 07:58:25 laptop kernel: [   67.261474] wlan0: authenticated
Apr 30 07:58:25 laptop kernel: [   67.263641] wlan0: associate with 00:18:e7:f7:b3:9d (try 1)
Apr 30 07:58:25 laptop kernel: [   67.274361] wlan0: RX AssocResp from 00:18:e7:f7:b3:9d (capab=0x431 status=0 aid=1)
Apr 30 07:58:25 laptop kernel: [   67.274369] wlan0: associated
Apr 30 07:58:25 laptop kernel: [   67.275894] ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
Apr 30 07:58:25 laptop kernel: [   67.619988] ath5k phy0: gain calibration timeout (2452MHz)
Apr 30 07:58:27 laptop kernel: [   69.328244] cfg80211: Calling CRDA to update world regulatory domain
Apr 30 07:58:31 laptop kernel: [   73.637083] ath5k phy0: gain calibration timeout (2452MHz)
Apr 30 07:58:35 laptop kernel: [   77.768144] wlan0: no IPv6 routers present
Apr 30 07:59:10 laptop kernel: [  112.276479] ath5k phy0: gain calibration timeout (2452MHz)
Apr 30 07:59:10 laptop kernel: [  112.624093] ath5k phy0: gain calibration timeout (2452MHz)
Apr 30 07:59:10 laptop kernel: [  112.626185] ADDRCONF(NETDEV_UP): wlan0: link is not ready
Apr 30 07:59:10 laptop kernel: [  113.102671] ath5k phy0: gain calibration timeout (2452MHz)
Apr 30 07:59:11 laptop kernel: [  113.608067] ath5k phy0: gain calibration timeout (2452MHz)
Apr 30 08:00:31 laptop kernel: [  193.276325] ath5k phy0: gain calibration timeout (2452MHz)
Apr 30 08:00:31 laptop kernel: [  193.623698] ath5k phy0: gain calibration timeout (2452MHz)
Apr 30 08:00:31 laptop kernel: [  193.625725] ADDRCONF(NETDEV_UP): wlan0: link is not ready
Apr 30 08:00:31 laptop kernel: [  194.102689] ath5k phy0: gain calibration timeout (2452MHz)
Apr 30 08:00:32 laptop kernel: [  194.617677] ath5k phy0: gain calibration timeout (2452MHz)
Apr 30 08:03:50 laptop kernel: [  392.757340] ath5k phy0: gain calibration timeout (2452MHz)
Apr 30 08:03:50 laptop kernel: [  393.104117] ath5k phy0: gain calibration timeout (2452MHz)
Apr 30 08:03:50 laptop kernel: [  393.106053] ADDRCONF(NETDEV_UP): wlan0: link is not ready
Apr 30 08:03:51 laptop kernel: [  393.577207] ath5k phy0: gain calibration timeout (2452MHz)
Apr 30 08:03:51 laptop kernel: [  394.077608] ath5k phy0: gain calibration timeout (2452MHz)
Comment 13 Boris Popov 2011-04-30 07:16:47 UTC
Created attachment 55932
first bad commit
Comment 14 Boris Popov 2011-04-30 07:17:39 UTC
Created attachment 55942
log of bisecting
Comment 15 Rafael J. Wysocki 2011-04-30 20:08:25 UTC
First-Bad-Commit : 42c025f3de9042d9c9abd9a6f6205d1a0f4bcadf
Comment 16 Tejun Heo 2011-05-01 13:48:21 UTC
Umm... I'm sorry, but that bisection has got to be incorrect.  The only thing the commit does is add a comment.

 kernel/workqueue.c |    6 +++++-
 1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index 930c239..11869fa 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -768,7 +768,11 @@ static inline void worker_clr_flags(struct worker *worker, unsigned int flags)
 
        worker->flags &= ~flags;
 
-       /* if transitioning out of NOT_RUNNING, increment nr_running */
+       /*
+        * If transitioning out of NOT_RUNNING, increment nr_running.  Note
+        * that the nested NOT_RUNNING is not a noop.  NOT_RUNNING is mask
+        * of multiple flags, not a single flag.
+        */
        if ((flags & WORKER_NOT_RUNNING) && (oflags & WORKER_NOT_RUNNING))
                if (!(worker->flags & WORKER_NOT_RUNNING))
                        atomic_inc(get_gcwq_nr_running(gcwq->cpu));
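As an aside on what the new comment in that diff describes: NOT_RUNNING is a mask covering several flags, so clearing one of them only counts as leaving the not-running state once no other bit of the mask remains set. The following is a minimal Python sketch of that condition, not the kernel code; the flag names and bit values are assumptions chosen purely for illustration.

```python
from typing import Tuple

# Illustrative bit values; the names and numbers are assumptions, not the
# kernel's definitions.
FLAG_PREP = 1 << 0
FLAG_CPU_INTENSIVE = 1 << 1
FLAG_UNBOUND = 1 << 2
NOT_RUNNING = FLAG_PREP | FLAG_CPU_INTENSIVE | FLAG_UNBOUND  # mask of several flags


def clr_flags(worker_flags: int, flags: int, nr_running: int) -> Tuple[int, int]:
    """Clear `flags` and return (new_flags, new_nr_running).

    Follows the shape of the condition in the diff: nr_running is
    incremented only when a NOT_RUNNING bit is cleared and no
    NOT_RUNNING bit remains set afterwards.
    """
    oflags = worker_flags
    worker_flags &= ~flags
    if (flags & NOT_RUNNING) and (oflags & NOT_RUNNING):
        if not (worker_flags & NOT_RUNNING):
            nr_running += 1
    return worker_flags, nr_running


# A worker holding two NOT_RUNNING bits at once.
state, running = FLAG_PREP | FLAG_UNBOUND, 0
state, running = clr_flags(state, FLAG_PREP, running)     # one bit still set
after_first = running
state, running = clr_flags(state, FLAG_UNBOUND, running)  # last bit cleared
after_second = running
```

In this sketch the counter stays unchanged after the first clear and is incremented only by the second one, which is the "nested NOT_RUNNING is not a noop" point made in the comment.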
Comment 17 Florian Mickler 2011-05-01 14:42:37 UTC
Boris, can you base your good/bad decision only on whether scanning works? It's expected that other issues crop up and get fixed while bisecting through a range of commits. If they don't interfere with the scanning, ignore them; if you can't test the wifi scanning (maybe a commit does not compile), just skip that commit.

[You can always check where you're at with the command 'git bisect visualize', which will show you all commits you have not yet eliminated]
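To illustrate why a single good/bad criterion matters, here is a rough Python sketch of a bisection over a linear list of commits. It is a simplification under stated assumptions: real git bisect works on the commit graph and supports skipping, and the commit names, breakpoints, and helper functions below are all made up for the example.

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad() is true.

    Assumes commits[0] is known good, commits[-1] is known bad, and the
    verdict flips from good to bad exactly once along the list.
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid
        else:
            lo = mid
    return commits[hi]


# Hypothetical linear history c0..c15.
history = [f"c{i}" for i in range(16)]
SCAN_BREAKS_AT = 11   # scanning stops working from c11 onwards
OTHER_ISSUE = {7, 8}  # unrelated problem present only in c7 and c8


def scan_broken(commit: str) -> bool:
    """Single criterion: is scanning broken at this commit?"""
    return int(commit[1:]) >= SCAN_BREAKS_AT


def anything_wrong(commit: str) -> bool:
    """Mixed criterion: mark a commit bad if anything at all looks wrong."""
    idx = int(commit[1:])
    return idx >= SCAN_BREAKS_AT or idx in OTHER_ISSUE


culprit = first_bad(history, scan_broken)    # lands on c11
misled = first_bad(history, anything_wrong)  # lands on c7, an innocent commit
```

With the mixed criterion the "bad" verdict no longer flips exactly once, so the search converges on a commit that has nothing to do with the scanning regression.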
Comment 18 Boris Popov 2011-05-07 18:12:01 UTC
I repeated the bisect and have a new first bad commit:

573cfde7aaeaadb0fd356ff2a14bdf9238967661 is the first bad commit
commit 573cfde7aaeaadb0fd356ff2a14bdf9238967661
Author: Nick Kossifidis <mickflemm@gmail.com>
Date:   Fri Feb 4 01:41:02 2011 +0200

    ath5k: Fix fast channel switching
    
    Fast channel change fixes:
    
    a) Always set OFDM timings
    b) Don't re-activate PHY
    c) Enable only NF calibration, not AGC
    
    https://bugzilla.kernel.org/show_bug.cgi?id=27382
    
    Signed-off-by: Nick Kossifidis <mickflemm@gmail.com>
    Signed-off-by: John W. Linville <linville@tuxdriver.com>

:040000 040000 47e60fb64921ea8eb2a91f5baa60b8fcd699a39e c9676b82bdca218cce2988c4952f2fcaac935d36 M      drivers
Comment 19 Boris Popov 2011-05-07 18:14:21 UTC
Created attachment 56922
first bad commit (new)
Comment 20 Boris Popov 2011-05-07 18:15:10 UTC
Created attachment 56932
log of bisecting (new)
Comment 21 John W. Linville 2011-05-10 17:13:45 UTC
Did you try reverting that commit?  Does that resolve the issue?
Comment 22 Bob Copeland 2011-05-11 04:14:28 UTC
Perhaps just turning off fast channel switching would help; it's good but nonessential, and a few people have reported problems with it.
Comment 23 Boris Popov 2011-05-11 06:29:37 UTC
(In reply to comment #21)
> Did you try reverting that commit?  Does that resolve the issue?

Reverting that commit resolves _only_ the iwlist scan problem.
The access point is still "Not-Associated", as in comment #3 (https://bugzilla.kernel.org/show_bug.cgi?id=33852#c3).

Can I help you any more?
Comment 24 Nick Kossifidis 2011-05-13 00:19:24 UTC
O.K., let's turn fast channel switching into a module parameter. The question is whether we should use a blacklist or a whitelist approach (enable by default or disable by default). Also, we still don't have any reports of failures on AR5413 hardware; could we at least limit this to AR2413?
Comment 25 Nick Kossifidis 2011-05-14 14:25:55 UTC
Anyway I'll send a patch later today that disables fast channel switching by default and adds a module parameter to enable it...
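The policy described here (off by default, enabled only on request) could be sketched roughly as follows. This is a Python illustration of the decision logic only, not the driver code; the parameter name `fastchanswitch`, the function name, and the set of radios are assumptions made for the example.

```python
# Radio names come from the discussion; treating them as plain strings and
# the parameter name `fastchanswitch` are assumptions made for this sketch.
FAST_SWITCH_CAPABLE = {"AR2413", "AR5413"}


def use_fast_channel_switch(radio: str, fastchanswitch: bool = False) -> bool:
    """Opt-in policy: fast channel switching stays off unless the
    (hypothetical) parameter is set and the radio is one it applies to."""
    return fastchanswitch and radio in FAST_SWITCH_CAPABLE
```

Defaulting the parameter to off means affected hardware works without any user action, while users whose hardware handles fast channel switching can still opt in.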
Comment 27 Boris Popov 2011-05-15 07:09:01 UTC
(In reply to comment #26)
> Try this out...
>
> http://www.kernel.org/pub/linux/kernel/people/mickflemm/01-fast-chan-switch-modparm

It works great! Thanks so much!
Comment 28 Rafał Miłecki 2011-05-15 07:52:43 UTC
(In reply to comment #27)
> (In reply to comment #26)
> > Try this out...
> >
> http://www.kernel.org/pub/linux/kernel/people/mickflemm/01-fast-chan-switch-modparm
> 
> It works great! Thanks so much!

Boris: what about your association problem? Does it still occur? Is this also a regression between 2.6.37 and 2.6.38?

Could you do bisecting between:
a) GOOD: 2.6.37
b) BAD: One commit before "ath5k: Fix fast channel switching"
and perhaps create new bug report.
Comment 29 Boris Popov 2011-05-15 10:40:00 UTC
(In reply to comment #28)

> Boris: what about your association problem? Does it still occur? Is this also
> a regression between 2.6.37 and 2.6.38?

I applied the patch to the latest commit in Linus' kernel tree and have no problem.


> Could you do bisecting between:
> a) GOOD: 2.6.37
> b) BAD: One commit before "ath5k: Fix fast channel switching"
> and perhaps create new bug report.

I checked the commit before "fix fast..." (b5f737...) and it is working well at the moment.
Comment 30 Florian Mickler 2011-05-23 17:23:53 UTC
So do I understand correctly that this issue is now resolved and the 
Patch: http://www.kernel.org/pub/linux/kernel/people/mickflemm/01-fast-chan-switch-modparm 

resolves this issue?
Comment 31 Boris Popov 2011-05-24 03:21:29 UTC
(In reply to comment #30)
> So do I understand correctly that this issue is now resolved and the 
> Patch:
>
> http://www.kernel.org/pub/linux/kernel/people/mickflemm/01-fast-chan-switch-modparm 
> 
> resolves this issue?

Yes of course.
Comment 32 lucio.pinese 2011-06-07 13:36:48 UTC
Sorry for the newbie question... In which kernel release will this patch be included? Thanks. I am new to Linux and I don't know how to patch manually!


Thanks a lot!
