Bug 14400

Summary: disable/enable wlan broken with ath5k
Product: Networking Reporter: Daniel Bumke (post)
Component: Wireless Assignee: networking_wireless (networking_wireless)
Status: CLOSED CODE_FIX    
Severity: normal CC: linville, mcgrof, me, nbigaouette, post, rjw
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.31.3 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 13615    
Attachments: Logfile

Description Daniel Bumke 2009-10-13 12:35:25 UTC
This bug was initially reported to http://bugs.archlinux.org/task/16591

The wireless on my Asus eee 901 works. Using FN-F2 to disable the wlan works. However, using FN-F2 afterwards to re-enable the wlan fails. Wlan works again after a reboot.

This problem did not occur with version 2.6.30.6 (downgrading fixes the issue).

Additional info:
kernel26 2.6.31.3-1
acpi-eeepc-generic 0.9.2-1
networkmanager 0.7.1-1

lspci | grep -i wireless
01:00.0 Ethernet controller: Atheros Communications Inc. AR5001 Wireless Network Adapter (rev 01)

/var/log/error.log
<snip>
kernel: ath5k phy0: failed to wakeup the MAC Chip
kernel: ath5k phy0: can't reset hardware (-5)
<snip>

kernel.log attached

Steps to reproduce:
-start eeepc 901
-connect to wireless
-disable wireless
-try to enable wireless (fails)
Comment 1 John W. Linville 2009-10-13 17:24:50 UTC
There aren't any attachments, BTW...could you include those? :-)
Comment 2 Daniel Bumke 2009-10-13 17:36:18 UTC
Created attachment 23392 [details]
Logfile
Comment 3 Bob Copeland 2009-10-14 13:52:53 UTC
Hmm... the kernel log doesn't show the error?

I wonder if this is some issue with the platform rfkill code.
Comment 4 Bob Copeland 2009-10-14 13:54:12 UTC
By the way, after rfkill the network will be down until you manually bring it up again; that is by design (but getting "can't reset hardware (-5)" is not).
Comment 5 Daniel Bumke 2009-10-15 10:19:56 UTC
Ok, I don't seem to get the "can't reset hardware (-5)" any more. Perhaps that was a one-off. However, wireless still does not come back up.

What do you mean by manually bringing it up?

The device does seem to be present but no wireless networks are found.

iwconfig

lo        no wireless extensions.

eth0      no wireless extensions.

wmaster0  no wireless extensions.

wlan0     IEEE 802.11bg  ESSID:""  
          Mode:Managed  Frequency:2.412 GHz  Access Point: Not-Associated   
          Tx-Power=off   
          Retry  long limit:7   RTS thr:off   Fragment thr:off
          Encryption key:off
          Power Management:off
          Link Quality:0  Signal level:0  Noise level:0
          Rx invalid nwid:0  Rx invalid crypt:0  Rx invalid frag:0
          Tx excessive retries:0  Invalid misc:0   Missed beacon:0

Neither networkmanager nor wicd see any networks.

Typical kernel.log output:

Oct 15 12:05:29 lhostname kernel: ath5k 0000:01:00.0: PCI INT A disabled
Oct 15 12:13:15 lhostname kernel: pci 0000:01:00.0: reg 10 64bit mmio: [0x000000-0x00ffff]
Oct 15 12:13:15 lhostname kernel: ath5k 0000:01:00.0: enabling device (0000 -> 0002)
Oct 15 12:13:15 lhostname kernel: ath5k 0000:01:00.0: PCI INT A -> GSI 19 (level, low) -> IRQ 19
Oct 15 12:13:15 lhostname kernel: ath5k 0000:01:00.0: setting latency timer to 64
Oct 15 12:13:15 lhostname kernel: ath5k 0000:01:00.0: registered as 'phy3'
Oct 15 12:13:15 lhostname kernel: rt2860sta: module is from the staging directory, the quality is unknown, you have been warned.
Oct 15 12:13:15 lhostname kernel: ath: EEPROM regdomain: 0x60
Oct 15 12:13:15 lhostname kernel: ath: EEPROM indicates we should expect a direct regpair map
Oct 15 12:13:15 lhostname kernel: ath: Country alpha2 being used: 00
Oct 15 12:13:15 lhostname kernel: ath: Regpair used: 0x60
Oct 15 12:13:15 lhostname kernel: phy3: Selected rate control algorithm 'minstrel'
Oct 15 12:13:15 lhostname kernel: Registered led device: ath5k-phy3::rx
Oct 15 12:13:15 lhostname kernel: Registered led device: ath5k-phy3::tx
Oct 15 12:13:15 lhostname kernel: ath5k phy3: Atheros AR2425 chip found (MAC: 0xe2, PHY: 0x70)
Comment 6 Bob Copeland 2009-10-15 12:50:16 UTC
(In reply to comment #5)
> Ok, I don't seem to get the "can't reset hardware (-5)" any more. Perhaps
> that was a one-off. However, wireless still does not come back up.
> 
> What do you mean by manually bringing it up?

Can you try (note first one is i_f_config, second is i_w_config):

$ sudo ifconfig wlan0 up

If that doesn't work, then

$ sudo iwconfig wlan0 txpower 20

> wlan0     IEEE 802.11bg  ESSID:""  
>           Mode:Managed  Frequency:2.412 GHz  Access Point: Not-Associated   
>           Tx-Power=off

Tx power is set to 0, so scanning won't work, but I think ifconfig up
will restore it.
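
For anyone scripting this check, here is a minimal sketch (not part of any existing tool) that reads the Tx-Power field for one interface out of iwconfig text like the output quoted above. The function name tx_power_state and the return values "absent" and "unknown" are made up for illustration, and it assumes the field is printed as "Tx-Power=..." or "Tx-Power:...".

```python
import re


def tx_power_state(iwconfig_output, iface="wlan0"):
    """Return the Tx-Power value for iface ('off', '20', ...),
    'unknown' if the interface is listed without that field,
    or 'absent' if the interface is not listed at all."""
    in_block = False   # True while we are inside iface's block of lines
    found = False      # True once iface has been seen at all
    for line in iwconfig_output.splitlines():
        # A new interface block starts on a non-indented, non-empty line.
        if line and not line[0].isspace():
            in_block = line.split()[0] == iface
            found = found or in_block
        if in_block:
            match = re.search(r"Tx-Power[=:]\s*(\S+)", line)
            if match:
                return match.group(1)
    return "unknown" if found else "absent"


if __name__ == "__main__":
    sample = (
        "lo        no wireless extensions.\n"
        "\n"
        'wlan0     IEEE 802.11bg  ESSID:""\n'
        "          Mode:Managed  Frequency:2.412 GHz  Access Point: Not-Associated\n"
        "          Tx-Power=off\n"
    )
    print(tx_power_state(sample))
```

A result of "off" corresponds to the state in which scanning will not work; "absent" means iwconfig did not list the interface at all.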
Comment 7 Daniel Bumke 2009-10-20 12:35:38 UTC
Ok, I've had another chance to try this again. Thank you very much for your suggestions, they make the wireless usable at least!

The behaviour is as follows:

Hitting Fn-F2 once sets txpower to off, but the wireless card is still on (returned by iwconfig).

Hitting Fn-F2 again turns the wireless card fully off (no entry in iwconfig).

Hitting Fn-F2 again turns the wireless card back on, with txpower off.

Hitting Fn-F2 again turns the wireless card fully off (no entry in iwconfig).

Hitting Fn-F2 again turns the wireless card back on, with txpower off.

Rinse, repeat....

In the txpower=off state "ifconfig wlan0 up" returns:

SIOCSIFFLAGS: Unknown error 132

"iwconfig wlan0 txpower 20" sets txpower to 20 but the card still does not scan.
At this stage "ifconfig wlan0 up" works and usually brings the card back up. Sometimes it seems to be necessary to repeat one or more of these commands.

So, is this a driver problem? Or is the card functioning as designed but the standby script not using it correctly (i.e. toggling between the three states)?
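
The manual workaround described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper name bring_up, the retry count of 3 and the injectable run parameter are hypothetical, and the two commands are exactly the ones suggested in comment 6 (they need root privileges on a real system).

```python
import subprocess


def bring_up(iface="wlan0", attempts=3, run=subprocess.call):
    """Repeat the two commands from this thread until 'ifconfig up' succeeds.

    'run' takes an argument list and returns the exit status; it defaults
    to subprocess.call and can be replaced with a stub for dry runs.
    """
    for _ in range(attempts):
        # Restore transmit power first: in the txpower=off state,
        # 'ifconfig wlan0 up' was reported to fail.
        run(["iwconfig", iface, "txpower", "20"])
        # Then bring the interface up; exit status 0 means success.
        if run(["ifconfig", iface, "up"]) == 0:
            return True
    return False


if __name__ == "__main__":
    # Dry run with a stub that only logs the commands and reports success.
    def log_only(cmd):
        print(" ".join(cmd))
        return 0

    print(bring_up(run=log_only))
```

The retry loop mirrors the observation that the commands sometimes have to be repeated; it does not address why the toggle script leaves the card in this state.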
Comment 8 Nicolas Bigaouette 2009-10-24 17:28:05 UTC
I have a similar problem on my 1000 (using Arch too).
The wireless toggle is controlled by acpi-eeepc-generic's script and I have not tried any manual steps yet.
What I wanted to say is that I replaced the wireless card in my 1000 with an Intel 4965.
I think it started with 2.6.31.
Comment 9 Daniel Bumke 2009-10-24 18:09:23 UTC
Could you try removing acpi-eeepc-generic to see what happens?

I have removed this and now enable/disable works flawlessly. I'm beginning to suspect that the kernel update made some changes that are not compatible with acpi-eeepc-generic. If that's the case, this could be closed and reported to the acpi-eeepc-generic devs.
Comment 10 Nicolas Bigaouette 2009-10-24 18:43:56 UTC
I am the acpi-eeepc-generic dev ;)
I realized it was broken last week, the .31 kernel probably changed something which made the script's behaviour go wrong.

I'm still trying to find the best combination for activation/deactivation. The main goal is to save power. What is the least intrusive way of achieving this? ifconfig down, rfkill, module unloading, anything else?

An issue is open for acpi-eeepc-generic:
http://code.google.com/p/acpi-eeepc-generic/issues/detail?id=42
Comment 11 Bob Copeland 2009-10-24 19:34:09 UTC
My understanding is the best way is to just set the interface down.  If ath5k uses more power in down interface vs rfkilled interface, then that would indicate that we need to turn off the radios in the idle state or similar.  I haven't measured it but would welcome someone carrying out that experiment if they can measure it accurately.
Comment 12 Nicolas Bigaouette 2009-10-24 19:38:08 UTC
What's the use of rfkill then? I thought it was supposed to force the antenna off.
Comment 13 Bob Copeland 2009-10-24 19:55:23 UTC
Yeah, it does, but the chief purpose is compliance (think airplanes) rather than power saving.  If the interface is down, the driver should be putting the hardware into low-power states already (one hopes) since it isn't being used.
Comment 14 Nicolas Bigaouette 2009-10-25 19:41:08 UTC
You can forget my comments: I have found the problem _I had_.
Basically the BIOS upgrade (version 1003) made the BIOS control the card directly. So the Fn+F2 shortcut to disable the wireless is controlled by the BIOS now. That confused the script, which then thought the card was down and tried to put it back on.

As for the original problem, could something like this have happened?
Comment 15 Luis Chamberlain 2009-11-17 15:57:54 UTC
Is this still an issue? Is this a BIOS regression? What's the issue? You said you upgraded your BIOS and now the BIOS controls the FN+f2 stuff, what's the issue on the kernel side?
Comment 16 Luis Chamberlain 2009-11-17 15:58:20 UTC
BTW have you checked out:

http://wireless.kernel.org/en/users/Documentation/rfkill

to help debug rfkill issues?
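
As a starting point for that kind of debugging, here is a small sketch that lists rfkill switches and their states. It assumes the kernel exposes /sys/class/rfkill/rfkillN/ with "name" and "state" files, where state 0 means soft blocked, 1 unblocked and 2 hard blocked; the function name rfkill_states is made up for illustration, and the layout may differ on other kernel versions.

```python
import os

# Assumed meaning of the sysfs "state" values.
STATE_NAMES = {"0": "soft blocked", "1": "unblocked", "2": "hard blocked"}


def rfkill_states(root="/sys/class/rfkill"):
    """Return {switch name: state description} for every rfkill entry under root."""
    result = {}
    if not os.path.isdir(root):
        return result  # rfkill class not present on this system
    for entry in sorted(os.listdir(root)):
        base = os.path.join(root, entry)
        try:
            with open(os.path.join(base, "name")) as f:
                name = f.read().strip()
            with open(os.path.join(base, "state")) as f:
                raw = f.read().strip()
        except OSError:
            continue  # skip entries without readable name/state files
        result[name] = STATE_NAMES.get(raw, "unknown (%s)" % raw)
    return result


if __name__ == "__main__":
    for name, state in rfkill_states().items():
        print("%s: %s" % (name, state))
```

Running it before and after each Fn-F2 press would show whether the platform switch and the phy switch change state together.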
Comment 17 Daniel Bumke 2009-11-18 10:43:43 UTC
As stated above, this now works fine for me. The issue seems to have been that the update broke the functionality of acpi-eeepc-generic (which I no longer use).