Bug 12414

Summary: iwl4965 cannot use "ap auto" on latest 2.6.28/29?
Product: Drivers Reporter: Rafael J. Wysocki (rjw)
Component: network-wireless Assignee: John W. Linville (linville)
Status: CLOSED DOCUMENTED    
Severity: normal CC: jeff.chua.linux, reinette.chatre
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.28-git Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 12398    
Attachments: revert-remove-ssid-knowledge-from-driver-series.patch
Debug patch

Description Rafael J. Wysocki 2009-01-10 16:32:34 UTC
Subject    : iwl4965 cannot use "ap auto" on latest 2.6.28/29?
Submitter  : "Jeff Chua" <jeff.chua.linux@gmail.com>
Date       : 2009-01-05 4:13
References : http://marc.info/?l=linux-kernel&m=123112882127823&w=4

This entry is being used for tracking a regression from 2.6.28.  Please don't
close it until the problem is fixed in the mainline.
Comment 1 John W. Linville 2009-01-20 07:16:47 UTC
Could you add a line like this to /etc/modprobe.conf?

   options iwlagn disable_hw_scan=1

After that, either reboot or 'modprobe -r iwlagn ; modprobe iwlagn'.  Does that change the situation?
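
A minimal sketch of that step, for anyone scripting it. The helper below appends the option line only when it is not already present, so repeated runs do not duplicate it. The function name `add_modprobe_option` is hypothetical; the option line, the config path, and the reload commands are the ones given above.

```shell
# Append an option line to a modprobe config file unless it is already there.
# $1: config file path, $2: exact line to add.
# (add_modprobe_option is a hypothetical helper name, not an existing tool.)
add_modprobe_option() {
    conf_file=$1
    option_line=$2
    # -x matches the whole line, -F treats the pattern as a fixed string.
    grep -qxF -- "$option_line" "$conf_file" 2>/dev/null ||
        printf '%s\n' "$option_line" >> "$conf_file"
}

# On a real system this would be run as root, followed by a module reload:
#   add_modprobe_option /etc/modprobe.conf "options iwlagn disable_hw_scan=1"
#   modprobe -r iwlagn ; modprobe iwlagn
```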
Comment 2 Rafael J. Wysocki 2009-01-20 12:19:09 UTC
On Tuesday 20 January 2009, Jeff Chua wrote:
> On Tue, Jan 20, 2009 at 5:32 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> 
> > Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=12414
> > Subject         : iwl4965 cannot use "ap auto" on latest 2.6.28/29?
> > Submitter       : Jeff Chua <jeff.chua.linux@gmail.com>
> > Date            : 2009-01-05 4:13 (15 days old)
> > References      : http://marc.info/?l=linux-kernel&m=123112882127823&w=4
> 
> Still not fixed. It seems to work if the AP is not hidden, but doesn't
> work if the AP is hidden -- on 2.6.27, it works fine.
Comment 3 Reinette Chatre 2009-01-29 11:50:56 UTC
Could you please try with patch http://marc.info/?l=linux-wireless&m=123320213115108&w=2 ? Even though it is a patch against latest wireless-testing it does apply to 2.6.29-rc3.
Comment 4 Rafael J. Wysocki 2009-02-04 17:37:42 UTC
On Wednesday 04 February 2009, Jeff Chua wrote:
> On Wed, Feb 4, 2009 at 6:23 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=12414
> > Subject         : iwl4965 cannot use "ap auto" on latest 2.6.28/29?
> > Submitter       : Jeff Chua <jeff.chua.linux@gmail.com>
> > Date            : 2009-01-05 4:13 (31 days old)
> > References      : http://marc.info/?l=linux-kernel&m=123112882127823&w=4
> 
> Latest linux git pull seems to behave slightly better. Restarting the
> interface a few times fixes the problem 70% of the time. The rest of
> the time, it can't associate with the AP.
Comment 5 Rafael J. Wysocki 2009-02-14 12:10:37 UTC
On Saturday 14 February 2009, Jeff Chua wrote:
> On Sat, Feb 14, 2009 at 7:41 PM, Jeff Chua <jeff.chua.linux@gmail.com> wrote:
> > On Wed, Feb 4, 2009 at 6:23 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >> Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=12414
> >> Subject         : iwl4965 cannot use "ap auto" on latest 2.6.28/29?
> >> Submitter       : Jeff Chua <jeff.chua.linux@gmail.com>
> >> Date            : 2009-01-05 4:13 (31 days old)
> >> References      : http://marc.info/?l=linux-kernel&m=123112882127823&w=4
> >
> > I applied the patch
> > http://marc.info/?l=linux-wireless&m=123320213115108&w=2 and it seems
> > better now. One failure to associate out of 20 tries, but
> > reloading the modules seems to make it work again. So, let's just
> > close the case?
> 
> I tested on another AP (WAG200G), and on this one it's failing to
> associate more often. And I noticed that when it does associate, I
> don't get these two lines ...
> 
> iwlagn: TX Power requested while scanning!
> iwlagn: Error sending TX power (-11)
> 
> 
> 
> Here's a good run ...
> 
> iwlagn: Intel(R) Wireless WiFi Link AGN driver for Linux, 1.3.27ks
> iwlagn: Copyright(c) 2003-2008 Intel Corporation
> iwlagn 0000:03:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
> iwlagn 0000:03:00.0: setting latency timer to 64
> iwlagn: Detected Intel Wireless WiFi Link 4965AGN REV=0x4
> iwlagn: Tunable channels: 11 802.11bg, 13 802.11a channels
> wmaster0 (iwlagn): not using net_device_ops yet
> phy0: Selected rate control algorithm 'iwl-agn-rs'
> wlan0 (iwlagn): not using net_device_ops yet
> iwlagn 0000:03:00.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
> iwlagn 0000:03:00.0: irq 28 for MSI/MSI-X
> iwlagn 0000:03:00.0: firmware: requesting iwlwifi-4965-2.ucode
> iwlagn loaded firmware version 228.57.2.23
> Registered led device: iwl-phy0:radio
> Registered led device: iwl-phy0:assoc
> Registered led device: iwl-phy0:RX
> Registered led device: iwl-phy0:TX
> wlan0: authenticate with AP xxxxxxxxxxxxx
> wlan0: authenticate with AP xxxxxxxxxxxxx
> wlan0: authenticate with AP xxxxxxxxxxxxx
> wlan0: authenticated
> wlan0: associate with AP xxxxxxxxxxxx
> wlan0: RX AssocResp from xxxxxxxxxxxx (capab=0x471 status=0 aid=1)
> wlan0: associated
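
To compare good and bad runs like the one above, a small filter over the kernel log can help. This is only a sketch: the function name `classify_assoc_log` is hypothetical, and the interface name `wlan0` and the message strings are taken from the log excerpts in this comment.

```shell
# Classify a kernel log excerpt read from stdin.
# Prints "associated" or "not-associated", followed by ",tx-power-error"
# when the "Error sending TX power" message is present.
# (classify_assoc_log is a hypothetical helper name.)
classify_assoc_log() {
    log=$(cat)
    if printf '%s\n' "$log" | grep -q 'wlan0: associated'; then
        result=associated
    else
        result=not-associated
    fi
    if printf '%s\n' "$log" | grep -q 'iwlagn: Error sending TX power'; then
        result="$result,tx-power-error"
    fi
    printf '%s\n' "$result"
}

# Example on a live system:
#   dmesg | classify_assoc_log
```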
Comment 6 John W. Linville 2009-02-24 08:03:45 UTC
Any chance we could get the actual bug reporter Cc'ed on this bug?

Honestly, this seems like a rambling description of every glitch or problem the user is experiencing, rather than a focused report on a single problem.

From comment 5, it seems like maybe this one should be closed and a new bug should be opened for the "iwlagn: Error sending TX power (-11)" bits?
Comment 7 Rafael J. Wysocki 2009-02-25 14:39:35 UTC
On Tuesday 24 February 2009, you wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=12414
>
> ------- Comment #6 from linville@tuxdriver.com  2009-02-24 08:03 -------
> Any chance we could get the actual bug reporter Cc'ed on this bug?
> 
> Honestly, this seems like a rambling description of every glitch or problem
> the user is experiencing, rather than a focused report on a single problem.
> 
> From comment 5, it seems like maybe this one should be closed and a new bug
> should be opened for the "iwlagn: Error sending TX power (-11)" bits?

Jeff, this is a "NEEDINFO" for you, please respond.
Comment 8 Anonymous Emailer 2009-02-25 17:38:42 UTC
Reply-To: jeff.chua.linux@gmail.com

On Thu, Feb 26, 2009 at 6:39 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> http://bugzilla.kernel.org/show_bug.cgi?id=12414
> Jeff, this is a "NEEDINFO" for you, please respond.

I just tried it and it still doesn't work. Maybe it's this particular
AP, but I tried on another hidden AP and it's not much better.

I'll try to bisect it. I know it works in 2.6.28.

Jeff.
Comment 9 Anonymous Emailer 2009-02-27 15:24:49 UTC
Reply-To: jeff.chua.linux@gmail.com

On Thu, Feb 26, 2009 at 9:38 AM, Jeff Chua <jeff.chua.linux@gmail.com> wrote:
> On Thu, Feb 26, 2009 at 6:39 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>>> http://bugzilla.kernel.org/show_bug.cgi?id=12414
>> Jeff, this is a "NEEDINFO" for you, please respond.
>
> I just tried it and it still doesn't work. Maybe it's this particular
> AP, but I tried on another hidden AP and it's not much better.
>
> I'll try to bisect it. I know it works in 2.6.28.

Here's the commit that causes the Intel 4965 to not associate with a
hidden AP. Reverting this commit makes 2.6.28-rc3 work again very
reliably -- getting associated to the AP after every iwlagn module
reload.

But when I reverted this commit in 2.6.29-rc6, my X61 can't associate
with the hidden AP. Something else is broken between 2.6.28 and
2.6.29.

The only way to associate in 2.6.29-rcX is to force it manually with
"iwconfig wlan0 ap <mac addr of hidden ap>".

In 2.6.28, I could just specify the ssid and encrypted key using
iwconfig, and then set "iwconfig wlan0 ap auto channel auto" and
"iwlist scan" will find the hidden ap.

Tested on both 2.6.28 and 2.6.29-rcX using the same
iwlwifi-4965-ucode-228.57.2.23 and wireless_tools.30.pre7.

What's the best way to get this fixed? I've tried git bisect and
trying to revert the bad commit, but it doesn't compile.

drivers/net/wireless/iwlwifi/iwl-scan.c:746: error: 'struct iwl_priv'
has no member named 'essid_len'
drivers/net/wireless/iwlwifi/iwl-scan.c:750: error: 'struct iwl_priv'
has no member named 'essid_len'
drivers/net/wireless/iwlwifi/iwl-scan.c:751: error: 'struct iwl_priv'
has no member named 'essid_len'


Thanks,
Jeff.



commit a57a59f247b651e8ed6d3eeb7e2f9d83b83134c9
Author: Johannes Berg <johannes@sipsolutions.net>
Date:   Tue Oct 28 18:21:05 2008 +0100

    iwlwifi: remove implicit direct scan

    When an undirected scan is requested and iwlwifi is not associated but
    the user has set an SSID (and maybe was associated with that network at
    some point) then iwlwifi will assume the user wanted to scan for this
    SSID which seems wrong. Remove this code.

    Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
    Signed-off-by: John W. Linville <linville@tuxdriver.com>
Comment 10 Anonymous Emailer 2009-02-27 15:28:36 UTC
Reply-To: jeff.chua.linux@gmail.com

On Sat, Feb 28, 2009 at 7:24 AM, Jeff Chua <jeff.chua.linux@gmail.com> wrote:

>> I just tried it and it still doesn't work. Maybe it's this particular
>> AP, but I tried on another hidden AP and it's not much better.

I just want to clarify that this association problem only happens with
a "hidden AP". 2.6.28 to 2.6.29-rcX works just fine with a non-hidden AP.

Thanks,
Jeff.
Comment 11 Rafael J. Wysocki 2009-03-15 03:39:21 UTC
On Sunday 15 March 2009, Jeff Chua wrote:
> On Sun, Mar 15, 2009 at 3:01 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=12414
> > Subject         : iwl4965 cannot use "ap auto" on latest 2.6.28/29?
> > Submitter       : Jeff Chua <jeff.chua.linux@gmail.com>
> > Date            : 2009-01-05 4:13 (69 days old)
> > References      : http://marc.info/?l=linux-kernel&m=123112882127823&w=4
> 
> Still not working in linux-2.6.29-rc8. Broken after the commit below.
> There were many changes to wireless after this commit, and simply
> reverting this commit will break compiling.
 
commit 41bb73eeac5ff5fb217257ba33b654747b3abf11
Author: Johannes Berg <johannes@sipsolutions.net>
Date:   Wed Oct 29 01:09:37 2008 +0100
 
     mac80211: remove SSID driver code


First-Bad-Commit : 41bb73eeac5ff5fb217257ba33b654747b3abf11
Comment 12 Rafael J. Wysocki 2009-03-15 03:42:05 UTC
On Sunday 15 March 2009, Jeff Chua wrote:
> On Sun, Mar 15, 2009 at 10:58 AM, Jeff Chua <jeff.chua.linux@gmail.com>
> wrote:
> > On Sun, Mar 15, 2009 at 3:01 AM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> >> Bug-Entry       : http://bugzilla.kernel.org/show_bug.cgi?id=12414
> >> Subject         : iwl4965 cannot use "ap auto" on latest 2.6.28/29?
> >> Submitter       : Jeff Chua <jeff.chua.linux@gmail.com>
> >> Date            : 2009-01-05 4:13 (69 days old)
> >> References      : http://marc.info/?l=linux-kernel&m=123112882127823&w=4
> >
> The commit below is causing problems with associating with the hidden AP as
> well.
> 
> 71c11fb57b924c160297ccd9e1761db598d00ac2 is first bad commit
> commit 71c11fb57b924c160297ccd9e1761db598d00ac2
> Author: Johannes Berg <johannes@sipsolutions.net>
> Date:   Tue Oct 28 18:29:48 2008 +0100
> 
>     b43/legacy: remove SSID code
> 
>     The SSID programmed into the device is used by the ucode only
>     to reply to probe requests, a functionality we disable anyway
>     because it doesn't fit with the mac80211/hostapd programming
>     model. Therefore, it isn't useful to program the SSID into
>     device.
> 
>     Signed-off-by: Johannes Berg <johannes@sipsolutions.net>
>     Signed-off-by: John W. Linville <linville@tuxdriver.com>
Comment 13 John W. Linville 2009-03-16 08:28:17 UTC
Comment 9 identified "iwlwifi: remove implicit direct scan" as the culprit, which seems reasonable.  The other cited commits seem questionable.
Comment 14 John W. Linville 2009-03-17 07:51:58 UTC
Can you try a 2.6.29-rc8 build with this on top?

   git revert a57a59f247b651e8ed6d3eeb7e2f9d83b83134c9

Does that restore operation with hidden SSIDs for you?
Comment 15 John W. Linville 2009-03-17 08:25:06 UTC
Sorry, obviously you have to revert a few more...

git revert 41bb73eeac5ff5fb217257ba33b654747b3abf11
git revert b23f99bcfa12c7b452f7ad201ea5921534d4e9ff
git revert 71c11fb57b924c160297ccd9e1761db598d00ac2
git revert 4607816f608b42a5379aca97ceed08378804c99f
git revert a57a59f247b651e8ed6d3eeb7e2f9d83b83134c9
Comment 16 John W. Linville 2009-03-17 08:25:49 UTC
The first one generates a conflict, just take the hunk.
Comment 17 John W. Linville 2009-03-17 09:04:22 UTC
Created attachment 20567
revert-remove-ssid-knowledge-from-driver-series.patch

Equivalent patch to prescribed git revert sequence...
Comment 18 Reinette Chatre 2009-03-17 15:29:55 UTC
When in 2.6.29-rc8, can you see the hidden AP in your scan results?

What happens if you specify the ssid in iwconfig:
iwconfig wlanX ap auto channel auto essid <your essid>
Comment 19 Jeff Chua 2009-03-18 09:46:18 UTC
On Wed, Mar 18, 2009 at 6:29 AM,  <bugme-daemon@bugzilla.kernel.org> wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=12414
> ------- Comment #18 from reinette.chatre@intel.com 
Comment 20 John W. Linville 2009-03-18 10:03:54 UTC
FWIW, "my" patch is just removing other pre-existing patches.  We would prefer to keep all those patches and fix this issue.  Please continue to cooperate with Johannes on trying to find a fix against current 2.6.29-rc8 or later kernels.
Comment 21 Reinette Chatre 2009-03-18 10:18:03 UTC
(In reply to comment #20)
> FWIW, "my" patch is just removing other pre-existing patches.  We would
> prefer to keep all those patches and fix this issue.  Please continue to
> cooperate with Johannes on trying to find a fix against current 2.6.29-rc8
> or later kernels.
> 

I am digging into this, but it takes a while as I am not that familiar with the MLME. What I have found so far is that when the iwconfig command is issued with the essid, an association is indeed attempted, but because the scan information does not contain the essid, the association may be attempted with the wrong AP. You can see this if you enable mac80211 debugging (CONFIG_MAC80211_VERBOSE_DEBUG). After this association fails, it does not try again with a different AP that also matches the requested connection parameters.

What the removed patch did was ensure that the scan information contains the essid, so that the first association attempt is made to the correct AP.

In some testing I have found that MLME does indeed at some point include essid of hidden AP in scan results ... I am currently trying to figure out how to get this done at the time of association request.
Comment 22 Jeff Chua 2009-03-18 11:06:45 UTC
On Thu, Mar 19, 2009 at 1:03 AM,  <bugme-daemon@bugzilla.kernel.org> wrote:
> Please continue to cooperate
> with Johannes on trying to find a fix against current 2.6.29-rc8 or later
> kernels.

Definitely. I'm just tied up. Just compiled with debug info and will
be testing shortly.


Jeff.
Comment 23 Reinette Chatre 2009-03-18 16:02:08 UTC
Created attachment 20586
Debug patch

I have learned more about how the hidden AP stuff works and was able to verify connection to a hidden AP with 2.6.28-rc8 on 4965. Could you please run a test with the attached patch to obtain more debugging information about your case? Please run this together with mac80211 debugging enabled.

This patch is on top of vanilla 2.6.28-rc8.
Comment 24 Jeff Chua 2009-03-18 19:46:24 UTC
On Thu, Mar 19, 2009 at 7:02 AM,  <bugme-daemon@bugzilla.kernel.org> wrote:
> I have learned more about how the hidden AP stuff works and was able to
> verify connection to hidden AP with 2.6.28-rc8 on 4965.
> Could you please run a test with the attached patch to obtain
> more debugging about your case? Please run this together with
> mac80211 debugging enabled.

I'm also discovering new things.

I can now get vanilla 2.6.28-rc8 to work (7/10 times) by changing the
sequence of iwconfig.


This loop does not work at all without John's patch, but will work
100% when patched.

        iwconfig wlan0 mode Managed essid xxx key restricted xxx
        for((i = 0; i < 5; i++))
        do
                iwconfig wlan0 ap auto channel auto
                iwconfig wlan0 | grep -q "Access Point: Not-Associated"
                [ $? -ne 0 ] && break
                echo ".\c"
                sleep 1
        done


This loop only works 8 of 10 times with/without the patch.

        iwconfig wlan0 mode Managed essid xxx key restricted xxx
        iwconfig wlan0 ap auto channel auto
        for((i = 0; i < 5; i++))
        do
                iwconfig wlan0 | grep -q "Access Point: Not-Associated"
                [ $? -ne 0 ] && break
                echo ".\c"
                sleep 1
        done


The only difference is having "iwconfig wlan0 ap auto channel auto"
inside the loop.
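
The polling part of both loops can be written in portable shell, which avoids `echo ".\c"` (its handling differs between shells). This is only a sketch: `iwconfig_status` and `wait_for_association` are hypothetical helper names, and the matched string is the "Access Point: Not-Associated" text used in the loops above. It does not change what `iwconfig` itself does.

```shell
# Thin wrapper so the status source can be swapped out when testing.
# (Both function names in this sketch are hypothetical.)
iwconfig_status() {
    iwconfig "$1" 2>/dev/null
}

# Poll until the interface reports an access point.
# $1: interface name, $2: maximum number of attempts.
# Returns 0 once associated, 1 if no attempt saw an association.
wait_for_association() {
    iface=$1
    tries=$2
    i=0
    while [ "$i" -lt "$tries" ]; do
        status=$(iwconfig_status "$iface") || status=
        case $status in
            *'Access Point: Not-Associated'*)
                # Not associated yet: fall through to the retry below.
                ;;
            *'Access Point: '*)
                printf '\n'
                return 0
                ;;
        esac
        # No association (or no usable output): print a dot and retry.
        printf '.'
        sleep 1
        i=$((i + 1))
    done
    printf '\n'
    return 1
}

# Example, mirroring the second loop above:
#   iwconfig wlan0 mode Managed essid xxx key restricted xxx
#   iwconfig wlan0 ap auto channel auto
#   wait_for_association wlan0 5
```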
Comment 25 Jeff Chua 2009-03-18 21:52:27 UTC
On Thu, Mar 19, 2009 at 7:02 AM,  <bugme-daemon@bugzilla.kernel.org> wrote:
> Please run
> this together with mac80211 debugging enabled.
>
> This patch is on top of vanilla 2.6.28-rc8.

Attached. Two runs

mac.fail
mac.pass

Thanks,
Jeff.
Comment 26 Reinette Chatre 2009-03-19 08:51:26 UTC
(In reply to comment #25)
> On Thu, Mar 19, 2009 at 7:02 AM,  <bugme-daemon@bugzilla.kernel.org> wrote:
> > Please run
> > this together with mac80211 debugging enabled.
> >
> > This patch is on top of vanilla 2.6.28-rc8.
> 
> Attached. Two runs
> 
> mac.fail
> mac.pass
> 
> Thanks,
> Jeff.
> 

The attachments did not make it through ... perhaps it needs to be done through the web interface?
Comment 27 Reinette Chatre 2009-03-19 08:54:19 UTC
(In reply to comment #24)
> On Thu, Mar 19, 2009 at 7:02 AM,  <bugme-daemon@bugzilla.kernel.org> wrote:
> > I have learned more about how the hidden AP stuff works and was able to
> > verify connection to hidden AP with 2.6.28-rc8 on 4965.
> > Could you please run a test with the attached patch to obtain
> > more debugging about your case? Please run this together with
> > mac80211 debugging enabled.
> 
> I'm also discovering new things.
> 
> I can now get vanilla 2.6.28-rc8 to work (7/10 times) by changing the
> sequence of iwconfig.
> 
> 
> This loop does not work at all without John's patch, but will work
> 100% when patched.
> 
>         iwconfig wlan0 mode Managed essid xxx key restricted xxx
>         for((i = 0; i < 5; i++))
>         do
>                 iwconfig wlan0 ap auto channel auto
>                 iwconfig wlan0 | grep -q "Access Point: Not-Associated"
>                 [ $? -ne 0 ] && break
>                 echo ".\c"
>                 sleep 1
>         done
> 
> 
> This loop only works 8 of 10 times with/without the patch.
> 
>         iwconfig wlan0 mode Managed essid xxx key restricted xxx
>         iwconfig wlan0 ap auto channel auto
>         for((i = 0; i < 5; i++))
>         do
>                 iwconfig wlan0 | grep -q "Access Point: Not-Associated"
>                 [ $? -ne 0 ] && break
>                 echo ".\c"
>                 sleep 1
>         done
> 
> 
> The only difference is having "iwconfig wlan0 ap auto channel auto"
> inside the loop.
> 

Could you please try this sequence:

iwconfig wlan0 mode Managed key restricted xxx
iwconfig wlan0 ap auto channel auto essid xxx
Comment 28 Jeff Chua 2009-03-19 09:41:15 UTC
On Thu, Mar 19, 2009 at 11:54 PM,  <bugme-daemon@bugzilla.kernel.org> wrote:
> Could you please try this sequence:
> iwconfig wlan0 mode Managed key restricted xxx
> iwconfig wlan0 ap auto channel auto essid xxx

This works on the command line (with delays between commands), but putting
it in a script would not work.

But an interesting discovery is that the 2nd iwconfig for "ap auto
channel auto" is not needed, and will cause it not to associate. So,
just the first iwconfig is enough to make it work.

I'm using this and it's working on 2.6.29-rc8 without patch.

modprobe iwlagn
iwconfig wlan0 essid xxx key restricted xxx
ifconfig wlan0 up


Thanks,
Jeff.
Comment 29 Reinette Chatre 2009-03-19 10:37:45 UTC
(In reply to comment #28)
> On Thu, Mar 19, 2009 at 11:54 PM,  <bugme-daemon@bugzilla.kernel.org> wrote:
> > Could you please try this sequence:
> > iwconfig wlan0 mode Managed key restricted xxx
> > iwconfig wlan0 ap auto channel auto essid xxx
> 
> This works on the command line (with delays between commands), but putting
> it in a script would not work.

It should work. Could you please send debug log of this case? 

> 
> But an interesting discovery is that the 2nd iwconfig for "ap auto
> channel auto" is not needed, and will cause it not to associate. So,
> just the first iwconfig is enough to make it work.
> 
> I'm using this and it's working on 2.6.29-rc8 without patch.
> 
> modprobe iwlagn
> iwconfig wlan0 essid xxx key restricted xxx
> ifconfig wlan0 up
> 
> 

What if you try the following:

modprobe iwlagn
ifconfig wlan0 up
iwconfig wlan0 essid xxx key restricted xxx
Comment 30 Reinette Chatre 2009-03-20 11:51:16 UTC
(In reply to comment #28)
> 
> I'm using this and it's working on 2.6.29-rc8 without patch.
> 
> modprobe iwlagn
> iwconfig wlan0 essid xxx key restricted xxx
> ifconfig wlan0 up
> 

Considering the above comment ... can this bug be closed?
Comment 31 Jeff Chua 2009-03-20 12:26:21 UTC
On Sat, Mar 21, 2009 at 2:51 AM,  <bugme-daemon@bugzilla.kernel.org> wrote:
> Considering the above comment ... can this bug be closed?

Yes please. I've just submitted tcp dumps to Johannes, but for now,
it's all working.

Thanks,
Jeff.