Bug 14206

Summary: ipw2200 driver does not associate to AP on resume
Product: Drivers Reporter: Ritesh Raj Sarraf (linux-kernel-bugs)
Component: network-wireless Assignee: drivers_network-wireless (drivers_network-wireless)
Status: CLOSED UNREPRODUCIBLE    
Severity: high CC: linville, yi.zhu
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.30 Subsystem:
Regression: No Bisected commit-id:
Attachments: WORKAROUND
dmesg
dmesg log

Description Ritesh Raj Sarraf 2009-09-22 11:17:19 UTC
I have an Intel Pro Wireless card on my IBM T43 ThinkPad.

0b:02.0 Network controller: Intel Corporation PRO/Wireless 2200BG [Calexico2] Network Connection (rev 05)
        Subsystem: Intel Corporation IBM ThinkPad R50e
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
        Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=medium >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 64 (750ns min, 6000ns max), Cache Line Size: 32 bytes
        Interrupt: pin A routed to IRQ 21
        Region 0: Memory at b4001000 (32-bit, non-prefetchable) [size=4K]
        Capabilities: [dc] Power Management version 2
                Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
                Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=1 PME-
        Kernel driver in use: ipw2200


When I suspend and resume my laptop, the wifi card doesn't connect to hidden (unencrypted) networks. The iwconfig status remains "unassociated". The ESSID can be set to that of the relevant AP, but the interface still shows as unassociated.

If I reload the device driver (ipw2200), then it associates to the AP and can then connect to the network.

rrs@champaran:~ $ sudo iwconfig eth1
[sudo] password for rrs:            
eth1      unassociated  ESSID:"netapp"  
          Mode:Managed  Frequency=2.437 GHz  Access Point: 00:1A:A2:82:6B:70   
          Bit Rate:0 kb/s   Tx-Power=20 dBm   Sensitivity=8/0                  
          Retry limit:7   RTS thr:off   Fragment thr:off                       
          Encryption key:off                                                   
          Power Management:off                                                 
          Link Quality:0  Signal level:0  Noise level:0                        
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0             
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0             

It shows that the essid is the correct one, but it lists it as "unassociated".
Comment 1 Ritesh Raj Sarraf 2009-09-22 12:13:51 UTC
Created attachment 23139 [details]
WORKAROUND

Adding this script to pm-utils is a workaround for now, in case someone else is also facing this annoying problem.
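
The attachment itself is not reproduced in this thread. A minimal sketch of such a hook follows, assuming pm-utils runs executables from /etc/pm/sleep.d/ with the action (suspend, hibernate, resume or thaw) as the first argument. The file name and the MODPROBE override are hypothetical and only there for illustration:

```shell
#!/bin/sh
# Hypothetical /etc/pm/sleep.d/49ipw2200-reload (name chosen for illustration).
# This is NOT the attached script, only a sketch of the same idea:
# unload ipw2200 before sleeping and load it again on wake-up.

# Allow the modprobe command to be replaced for dry runs (assumption).
MODPROBE="${MODPROBE:-modprobe}"

reload_ipw2200() {
    case "$1" in
        suspend|hibernate)
            # Remove the driver before the machine goes to sleep.
            "$MODPROBE" -r ipw2200
            ;;
        resume|thaw)
            # Load it again so it starts from a clean state.
            "$MODPROBE" ipw2200
            ;;
    esac
}

# pm-utils passes the action as $1; any other value is ignored.
reload_ipw2200 "${1:-}"
```

The hook would need to be executable and run as root for modprobe to succeed.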
Comment 2 Zhu Yi 2009-09-30 01:42:07 UTC
Can you scan after resume, btw? Please make sure the interface is UP.
Comment 3 Ritesh Raj Sarraf 2009-09-30 04:48:19 UTC
Here are the relevant logs:
2009/09/16 17:37:15 :: iwconfig eth1 mode managed
2009/09/16 17:37:15 :: Putting interface up...
2009/09/16 17:37:15 :: ifconfig eth1 up
2009/09/16 17:37:15 :: enctype is None
2009/09/16 17:37:15 :: ['iwconfig', 'eth1', 'essid', 'netapp']
2009/09/16 17:37:15 :: iwconfig eth1 channel 11
2009/09/16 17:37:15 :: iwconfig eth1 ap 00:1A:A2:82:64:C0
2009/09/16 17:37:15 :: Running DHCP
2009/09/16 17:37:15 :: /sbin/dhclient eth1
2009/09/16 17:37:15 :: Internet Systems Consortium DHCP Client V3.1.2p1
2009/09/16 17:37:15 :: Copyright 2004-2009 Internet Systems Consortium.
2009/09/16 17:37:15 :: All rights reserved.
2009/09/16 17:37:15 :: For info, please visit
http://www.isc.org/sw/dhcp/
2009/09/16 17:37:15 ::
2009/09/16 17:37:16 :: Listening on LPF/eth1/00:16:6f:ba:2f:12
2009/09/16 17:37:16 :: Sending on   LPF/eth1/00:16:6f:ba:2f:12
2009/09/16 17:37:16 :: Sending on   Socket/fallback
2009/09/16 17:37:16 :: DHCPDISCOVER on eth1 to 255.255.255.255 port 67
interval 8
2009/09/16 17:37:24 :: DHCPDISCOVER on eth1 to 255.255.255.255 port 67
interval 10
2009/09/16 17:37:34 :: DHCPDISCOVER on eth1 to 255.255.255.255 port 67
interval 21
2009/09/16 17:37:55 :: DHCPDISCOVER on eth1 to 255.255.255.255 port 67
interval 17
2009/09/16 17:38:12 :: DHCPDISCOVER on eth1 to 255.255.255.255 port 67
interval 5
2009/09/16 17:38:17 :: No DHCPOFFERS received.
2009/09/16 17:38:17 :: No working leases in persistent database -
sleeping.
2009/09/16 17:38:17 :: DHCP connection failed
2009/09/16 17:38:17 :: exiting connection thread
2009/09/16 17:38:17 :: Sending connection attempt result dhcp_failed
2009/09/16 17:38:17 :: ifconfig eth0
2009/09/16 17:38:17 :: ifconfig eth1
2009/09/16 17:38:17 :: iwconfig eth1
2009/09/16 17:38:19 :: ifconfig eth0
2009/09/16 17:38:19 :: ifconfig eth1
2009/09/16 17:38:21 :: ifconfig eth0
2009/09/16 17:38:21 :: ifconfig eth1
2009/09/16 17:38:23 :: ifconfig eth0


All I see is that no DHCP lease was offered. At this point, if I reload the driver, everything works.


You can find more information @ Debian BTS #546909
Comment 4 Ritesh Raj Sarraf 2009-09-30 04:50:30 UTC
(In reply to comment #2)
> Can you scan after resume, btw? Please make sure the interface is UP.

And yes, scanning succeeds after resume. And I can also see my hidden essid listed there. The command `iwconfig eth1 essid ESSID` also succeeds. But as I mentioned in the beginning, it shows it as "unassociated".
Comment 5 Zhu Yi 2009-09-30 06:03:31 UTC
It works here on my 31-rc7 kernel. Can you please attach your dmesg?
Comment 6 Ritesh Raj Sarraf 2009-09-30 06:49:13 UTC
Created attachment 23206 [details]
dmesg

dmesg log from today.
Comment 7 Zhu Yi 2009-09-30 07:01:07 UTC
(In reply to comment #6)
> Created an attachment (id=23206) [details]
> dmesg

Can you post the one after resume?
Comment 8 Ritesh Raj Sarraf 2009-09-30 08:08:56 UTC
Oh!! This one was after a resume. The only thing is that my helper script unloads/loads during suspend/resume.

I'll get you a clean one.
Comment 9 Ritesh Raj Sarraf 2009-09-30 09:09:02 UTC
Created attachment 23207 [details]
dmesg log

New one without the driver reload on resume.
Comment 10 Ritesh Raj Sarraf 2009-10-06 05:57:02 UTC
Hi Zhu, Do you need any more information ?
Comment 11 Zhu Yi 2009-10-10 01:03:57 UTC
Sorry for the delay. Can you please load the driver with parameter "debug=255" and attach the dmesg? (make sure CONFIG_IPW2200_DEBUG is enabled).
Comment 12 Ritesh Raj Sarraf 2009-10-14 12:33:06 UTC
Hi Zhu,

I have been able to reproduce the problem (with debug=255). I will send you the log personally.
Comment 13 Zhu Yi 2009-10-15 03:38:42 UTC
From the log:

> ipw2200: U ipw_best_network Network 'rrs (00:16:b6:a5:e7:1c)' excluded
> because of age: 638796ms.

Is it the correct AP? Apparently, the driver refused to associate with it because its last beacon was received too long ago. Can you please try the steps below?

1. load libipw.ko with param "debug=3"
2. load ipw2200.ko with param "debug=255"
3. suspend -> resume
4. iwlist eth1 scan (after resume)
5. iwconfig eth1 essid rrs (try to associate manually)
6. attach dmesg
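
The steps above can be sketched as a small script. This is an illustration only: the IFACE, ESSID and OUT defaults, the DRY_RUN switch and the function names are assumptions, and step 3 (suspend and resume) stays manual. The debug=3 parameter is only accepted once libipw has been built with CONFIG_LIBIPW_DEBUG.

```shell
#!/bin/sh
# Sketch of the debugging steps; the defaults below are illustrative.
IFACE="${IFACE:-eth1}"
ESSID="${ESSID:-rrs}"
OUT="${OUT:-dmesg-after-resume.log}"

# Print commands instead of running them when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Steps 1-2: reload both modules with debug output enabled.
load_debug_modules() {
    # Removing ipw2200 should also drop libipw if nothing else uses it.
    run modprobe -r ipw2200
    run modprobe libipw debug=3
    run modprobe ipw2200 debug=255
}

# Steps 4-6: run these after the suspend/resume cycle.
collect_after_resume() {
    run iwlist "$IFACE" scan
    run iwconfig "$IFACE" essid "$ESSID"
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "dmesg > $OUT"
    else
        dmesg > "$OUT"
    fi
}

# Usage (as root): "$0 load", suspend and resume by hand, then "$0 collect".
case "${1:-}" in
    load)    load_debug_modules ;;
    collect) collect_after_resume ;;
esac
```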
Comment 14 Ritesh Raj Sarraf 2009-10-15 04:38:29 UTC
(In reply to comment #13)
> From the log:
> 
> > ipw2200: U ipw_best_network Network 'rrs (00:16:b6:a5:e7:1c)' excluded
> because of age: 638796ms.
> 
> Is it the correct AP? Apparently, the driver refused to associate with it
> because its last beacon was received too long ago. Can you please try below
> steps?
> 

This is my home network AP which is okay. I don't have problems with this AP at all. 
The problem is with the work network where upon resume, it doesn't connect to the open unencrypted but hidden essid 'netapp'. That essid has many APs available there.

> 1. load libipw.ko with param "debug=3"
> 2. load ipw2200.ko with param "debug=255"
> 3. suspend -> resume
> 4. iwlist eth1 scan (after resume)
> 5. iwconfig eth1 essid rrs (try to associate manually)
> 6. attach dmesg

libipw doesn't have any parameter called debug

rrs@champaran:~ $ sudo modinfo libipw
filename:       /lib/modules/2.6.31-trunk-686/kernel/drivers/net/wireless/ipw2x00/libipw.ko
license:        GPL
author:         Copyright (C) 2004-2005 Intel Corporation <jketreno@linux.intel.com>
description:    802.11 data/management/control stack
version:        git-1.1.13
srcversion:     9E5AE56848208EE8AF19CA7
depends:        lib80211
vermagic:       2.6.31-trunk-686 SMP mod_unload modversions 686
rrs@champaran:~ $ sudo modprobe libipw debug=3
WARNING: All config files need .conf: /etc/modprobe.d/arch, it will be ignored in a future release.
FATAL: Error inserting libipw (/lib/modules/2.6.31-trunk-686/kernel/drivers/net/wireless/ipw2x00/libipw.ko): Unknown symbol in module, or unknown parameter (see dmesg)
Comment 15 Zhu Yi 2009-10-15 05:39:03 UTC
But I saw your driver lock to essid rrs. What was the card associated with before suspend? Is it possible that some script or NM set the fixed essid "rrs"?

Please select CONFIG_LIBIPW_DEBUG and recompile the kernel.
Comment 16 Ritesh Raj Sarraf 2009-10-15 06:05:50 UTC
(In reply to comment #15)
> But I saw your driver lock to essid rrs. What is the card associated to
> before
> suspend? Is it possible that some script or NM set the fixed essid "rrs"?
> 
I should have put the steps to reproduce.

* Connected at home network (rrs)
* Suspend laptop
* Resume laptop at work
* Connect to workplace's open, hidden, unencrypted (netapp) network.

Result: Does not connect to the work network.
Workaround: Reload the driver.

> Please select CONFIG_LIBIPW_DEBUG and recompile the kernel.
Zhu, I remember there were steps for cases like this where you don't need to compile the entire kernel for just a single module. Can you please direct me to any such doc? That would help a lot.
Comment 17 Zhu Yi 2009-10-15 06:10:56 UTC
(In reply to comment #16)
> * Connected at home network (rrs)
> * Suspend laptop
> * Resume laptop at work
> * Connect to workplace's open, hidden, unencrypted (netapp) network.

OK. I saw the driver kept trying with rrs after resume. How did you connect to netapp AP? via iwconfig or wpa_supplicant?

> > Please select CONFIG_LIBIPW_DEBUG and recompile the kernel.
> Zhu, I remember there were steps for requirements like these where you don't
> need to compile the entire kernel for just a single module. Can you please
> direct me any such doc ? That helps a lot.

You can change the config with make menuconfig and then make. After all 'scripts' files are compiled, ctrl-c the make process. Then you can do "make M=drivers/net/wireless/ipw2x00" to compile the module directly.
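
The procedure above can be sketched as follows. The KSRC location, the DRY_RUN switch and the function names are assumptions. The sketch swaps in the kbuild target `modules_prepare` in place of interrupting a full `make`; on a kernel built with modversions a full build may still be needed so that Module.symvers exists. The install step copies the result to the `updates` directory and is likewise only an illustration.

```shell
#!/bin/sh
# Sketch: rebuild only the ipw2x00 modules from a configured kernel tree.
# KSRC is an assumed path to the source of the running kernel.
KSRC="${KSRC:-/usr/src/linux}"
MODDIR="drivers/net/wireless/ipw2x00"

# Print commands instead of running them when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

rebuild_ipw2x00() {
    # Enable CONFIG_LIBIPW_DEBUG (and CONFIG_IPW2200_DEBUG) interactively.
    run make -C "$KSRC" menuconfig
    # Assumption: used here instead of starting "make" and pressing ctrl-c.
    run make -C "$KSRC" modules_prepare
    # Build just the ipw2x00 directory.
    run make -C "$KSRC" M="$MODDIR"
}

install_ipw2x00() {
    dest="/lib/modules/$(uname -r)/updates"
    run mkdir -p "$dest"
    run cp "$KSRC/$MODDIR/libipw.ko" "$KSRC/$MODDIR/ipw2200.ko" "$dest/"
    run depmod -a
}

# Usage: "$0 build" as a normal user, then "$0 install" as root.
case "${1:-}" in
    build)   rebuild_ipw2x00 ;;
    install) install_ipw2x00 ;;
esac
```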
Comment 18 Ritesh Raj Sarraf 2009-10-15 06:13:20 UTC
(In reply to comment #15)
> But I saw your driver lock to essid rrs. What is the card associated to
> before
> suspend? Is it possible that some script or NM set the fixed essid "rrs"?

Zhu, I think there might be more going on. You could be right about the
locking part.
Can you please have a look at this bug:
http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=535507

From the first read, it looks like a cosmetic bug. But now that you have
hinted, it could be the wicd daemon at fault.
Comment 19 Ritesh Raj Sarraf 2009-10-15 06:14:32 UTC
(In reply to comment #17)
> OK. I saw the driver kept trying with rrs after resume. How did you connect
> to
> netapp AP? via iwconfig or wpa_supplicant?
>
For now, it is just a reload of the driver on every resume.
 
> You can change the config with make menuconfig and then make. After all
> 'scripts' files are compiled, ctrl-c the make process. Then you can do "make
> M=drivers/net/wireless/ipw2x00" to compile the module directly.

Sure. Will build and get the information soon.
Comment 20 Ritesh Raj Sarraf 2009-10-26 12:08:55 UTC
Sorry for the delay.

I tried with the steps you mentioned. Looks like libipw does not have a debug option

rrs@champaran:/proc/net $ sudo modinfo /lib/modules/2.6.31-trunk-686/updates/libipw.ko
filename:       /lib/modules/2.6.31-trunk-686/updates/libipw.ko
license:        GPL
author:         Copyright (C) 2004-2005 Intel Corporation <jketreno@linux.intel.com>
description:    802.11 data/management/control stack
version:        git-1.1.13
srcversion:     9E5AE56848208EE8AF19CA7
depends:        lib80211
vermagic:       2.6.31-trunk-686 SMP mod_unload modversions 686


rrs@champaran:/proc/net $ sudo insmod /lib/modules/2.6.31-trunk-686/updates/libipw.ko debug=3
insmod: error inserting '/lib/modules/2.6.31-trunk-686/updates/libipw.ko': -1 Unknown symbol in module



CONFIG_LIBIPW_DEBUG:

This option will enable debug tracing output for the
libipw component.

This will result in the kernel module being ~70k larger.  You
can control which debug output is sent to the kernel log by
setting the value in

/proc/net/ieee80211/debug_level

For example:

% echo 0x00000FFO > /proc/net/ieee80211/debug_level

For a list of values you can assign to debug_level, you
can look at the bit mask values in ieee80211.h

If you are not trying to debug or develop the libipw
component, you most likely want to say N here.


But even with the (hopefully debug-enabled) module, I don't see the proc file that is mentioned in the doc.

rrs@champaran:/proc/net $ ls /proc/net/ieee80211/debug_level
ls: cannot access /proc/net/ieee80211/debug_level: No such file or directory
Comment 21 Zhu Yi 2009-10-27 01:14:44 UTC
(In reply to comment #20)
> Sorry for the delay.
> 
> I tried with the steps you mentioned. Looks like libipw does not have a debug
> option
> 
> rrs@champaran:/proc/net $ sudo modinfo
> /lib/modules/2.6.31-trunk-686/updates/libipw.ko
> filename:       /lib/modules/2.6.31-trunk-686/updates/libipw.ko
> license:        GPL
> author:         Copyright (C) 2004-2005 Intel Corporation
> <jketreno@linux.intel.com>
> description:    802.11 data/management/control stack
> version:        git-1.1.13
> srcversion:     9E5AE56848208EE8AF19CA7
> depends:        lib80211
> vermagic:       2.6.31-trunk-686 SMP mod_unload modversions 686

You need to do "make menuconfig" and select CONFIG_LIBIPW_DEBUG option. Then recompile and reinstall the modules. You'll see a "debug" parameter then.

> But even with the (hopefully debug-enabled) module, I don't see the proc file
> that is mentioned in the doc.
> 
> rrs@champaran:/proc/net $ ls /proc/net/ieee80211/debug_level
> ls: cannot access /proc/net/ieee80211/debug_level: No such file or directory

The doc is out of date. The file is now moved to sysfs.
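
The comment does not say where in sysfs the setting now lives, so the sketch below sticks to what can be checked generically: whether a module file actually carries a `debug` parameter. It assumes that `modinfo -p` prints one `name:description` line per parameter; the function name and the MODINFO override are hypothetical.

```shell
#!/bin/sh
# Sketch: check whether a module exposes a given parameter.
# Assumes each line printed by "modinfo -p" starts with "<name>:".
MODINFO="${MODINFO:-modinfo}"

has_param() {
    # $1 = module name or path to a .ko file, $2 = parameter name
    "$MODINFO" -p "$1" 2>/dev/null | grep -q "^$2:"
}

# Usage: "$0 /lib/modules/.../updates/libipw.ko" (parameter defaults to debug).
if [ -n "${1:-}" ]; then
    if has_param "$1" "${2:-debug}"; then
        echo "parameter '${2:-debug}' is available"
    else
        echo "parameter '${2:-debug}' is missing; the module was probably built without debug support"
    fi
fi
```

Once a module is loaded, parameters declared with non-zero permissions are listed under /sys/module/<module>/parameters/. Whether the libipw debug level appears there on a given kernel is something to verify rather than assume.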
Comment 22 Ritesh Raj Sarraf 2009-10-27 07:31:18 UTC
Hi Zhu,

I have emailed you the logs. They show the problem being reproduced (with libipw loaded with debug=3 and ipw2200 loaded with debug=255). They also cover the subsequent driver reload and successful connection.
Comment 23 Ritesh Raj Sarraf 2009-10-27 08:21:30 UTC
Zhu, the logs I sent you are probably large. The bug is triggered from timestamp 10:30 onwards.
Comment 24 Zhu Yi 2009-10-27 08:54:21 UTC
> I ipw_rx_notification association failed (0x6078): Unknown status value.

Seems like the association failed for an unknown reason.

Can you please check whether running "ifconfig eth1 down; ifconfig eth1 up" after resume fixes the problem? Just to narrow down the issue.
Comment 25 Ritesh Raj Sarraf 2009-10-27 09:20:03 UTC
Sure. I will do that when the bug is next triggered. Hopefully in the next hour.

But, IMO, this has nothing to do with ifconfig. In the Description of this bug report, I mentioned that the interface remains *unassociated* with any carrier.

rrs@champaran:~ $ sudo iwconfig eth1
[sudo] password for rrs:            
eth1      unassociated  ESSID:"netapp"  
          Mode:Managed  Frequency=2.437 GHz  Access Point: 00:1A:A2:82:6B:70   
          Bit Rate:0 kb/s   Tx-Power=20 dBm   Sensitivity=8/0                  
          Retry limit:7   RTS thr:off   Fragment thr:off                       
          Encryption key:off                                                   
          Power Management:off                                                 
          Link Quality:0  Signal level:0  Noise level:0                        
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0             
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0             

It shows that the essid is the correct one, but it lists it as "unassociated".


Here, even manually doing an "iwconfig eth1 essid netapp" succeeds. But it is still reported as "unassociated". If the same is done after a driver reload, the status changes to IEEE 802.11 as soon as I associate to an essid.


I think this is the problem. But I'll report the results to you once it is triggered.
Comment 26 Ritesh Raj Sarraf 2009-10-28 13:06:05 UTC
(In reply to comment #24)
> > I ipw_rx_notification association failed (0x6078): Unknown status value.
> 
> Seems like the association failed for unknown reason.
> 
> Can you please try if "ifconfig eth1 down; ifconfig eth1 up" after resume can
> fix the problem? Just to narrow down the issue.

Zhu, I tried what you said. But that changed nothing for me.

My /etc/network/interfaces (Debian setup) has no settings defined for network interfaces.
I use wicd to manage network interfaces. So, I also stopped and started the wicd daemon but that too did not help.

Anything else you want me to try ?
If you need the logs when the above activity was done, please let me know.
Comment 27 Zhu Yi 2009-10-30 01:52:03 UTC
(In reply to comment #26)
> My /etc/network/interfaces (Debian setup) has no settings defined for network
> interfaces.
> I use wicd to manage network interfaces. So, I also stopped and started the
> wicd daemon but that too did not help.

Did you run `ifconfig eth1 down; ifconfig eth1 up` on the command line? I'm not sure wicd will do the same. If it can be confirmed that the interface reset doesn't fix the problem for you after resume (but reloading the driver does), it must be a driver bug.
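
The check described above can be sketched as a script. It assumes the first line of `iwconfig <iface>` output contains the word "unassociated" while the card has no association, as in the output pasted in the Description. The IFACE, ESSID and WAIT defaults and the IWCONFIG override are hypothetical.

```shell
#!/bin/sh
# Sketch: reset the interface after resume and report the association state.
IFACE="${IFACE:-eth1}"
ESSID="${ESSID:-netapp}"
IWCONFIG="${IWCONFIG:-iwconfig}"

is_unassociated() {
    # Succeeds when the first line of iwconfig output says "unassociated".
    "$IWCONFIG" "$1" 2>/dev/null | head -n 1 | grep -q 'unassociated'
}

check_after_reset() {
    ifconfig "$IFACE" down
    ifconfig "$IFACE" up
    "$IWCONFIG" "$IFACE" essid "$ESSID"
    # Give the card some time to associate; the delay is arbitrary.
    sleep "${WAIT:-10}"
    if is_unassociated "$IFACE"; then
        echo "$IFACE is still unassociated after the interface reset"
    else
        echo "$IFACE associated after the interface reset"
    fi
}

# Usage (as root, after resume): "$0 check"
case "${1:-}" in
    check) check_after_reset ;;
esac
```

If the first message appears but a driver reload restores the connection, that matches the driver-bug case described above.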
Comment 28 Ritesh Raj Sarraf 2009-10-30 04:53:51 UTC
Yes, I did run the commands:
ifconfig eth1 down
ifconfig eth1 up

But that did not help.
Comment 29 John W. Linville 2010-03-04 19:14:18 UTC
Ritesh, are you still experiencing this problem on 2.6.33?  FWIW, I use a T41 w/ ipw2200 all the time for my "surf the web in front of the TV" box and I have no such problems.
Comment 30 Ritesh Raj Sarraf 2010-03-04 20:03:53 UTC
I don't have access to the laptop anymore. I have a new machine now, a T400 which uses the iwlagn driver, with which I haven't seen the problem.

But on the old laptop with the ipw2200 driver, it was pretty much a persistent problem. And the workaround (of reloading the driver) served me well.
Comment 31 John W. Linville 2010-03-04 22:37:23 UTC
I see...well, if you happen to revive the laptop please do test and reopen this as appropriate...thanks!