Bug 18172 - ath5k driver reports calibration timeouts and/or failures
Summary: ath5k driver reports calibration timeouts and/or failures
Status: CLOSED INSUFFICIENT_DATA
Alias: None
Product: Networking
Classification: Unclassified
Component: Wireless (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: networking_wireless@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2010-09-09 14:02 UTC by Iordan Iordanov
Modified: 2011-07-29 16:32 UTC (History)
7 users (show)

See Also:
Kernel Version: 2.6.32
Subsystem:
Regression: No
Bisected commit-id:


Attachments
dmesg output of "gain clibration"-error (10.28 KB, text/plain)
2011-01-22 13:00 UTC, Søren Holm
Details

Description Iordan Iordanov 2010-09-09 14:02:34 UTC
I am running Ubuntu 10.04. The stock ath5k driver on my card (dmesg info):

[   27.675749] ath5k 0000:02:03.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[   27.675805] ath5k 0000:02:03.0: registered as 'phy0'
[   28.334721] ath5k phy0: Atheros AR5213A chip found (MAC: 0x59, PHY: 0x43)
[   28.334726] ath5k phy0: RF5112B multiband radio found (0x36)

reports calibration timeouts, resulting in a total wireless hang for up to 1 minute following the report:

...
[66091.283758] ath5k phy0: noise floor calibration timeout (2452MHz)
[66092.010644] ath5k phy0: noise floor calibration timeout (2452MHz)
[66095.455892] ath5k phy0: noise floor calibration timeout (2452MHz)
[79194.212339] ath5k phy0: noise floor calibration timeout (2452MHz)
[80358.549190] ath5k phy0: noise floor calibration timeout (2452MHz)
[80519.111457] ath5k phy0: noise floor calibration timeout (2452MHz)
[80524.295615] ath5k phy0: noise floor calibration timeout (2452MHz)
[80525.023471] ath5k phy0: noise floor calibration timeout (2452MHz)
... etc, etc.

Compiling the latest stable release of the ath5k driver resulted in the *same* behavior, with a different message!

[84792.792435] ath5k 0000:02:03.0: PCI INT A disabled
[84799.111729] ath5k 0000:02:03.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
[84799.111792] ath5k 0000:02:03.0: registered as 'phy0'
[84799.350105] ath5k phy0: Atheros AR5213A chip found (MAC: 0x59, PHY: 0x43)
[84799.350110] ath5k phy0: RF5112B multiband radio found (0x36)
[84872.868826] ath5k phy0: calibration of channel 9 failed
[84983.204498] ath5k phy0: calibration of channel 9 failed
[84993.244220] ath5k phy0: calibration of channel 9 failed
[85352.833334] ath5k phy0: calibration of channel 9 failed
[85362.832485] ath5k phy0: calibration of channel 9 failed
[85372.843618] ath5k phy0: calibration of channel 9 failed
[85382.850579] ath5k phy0: calibration of channel 9 failed
.... etc, etc.

The start of the second dmesg output shows the loading of the newly built ath5k module. This has been a long-standing issue by the looks of it. Can somebody help us ath5k users out? This is happening when I'm connected to different wireless networks, so I would bet it's not an access-point related issue.

I'm willing to try lots of stuff, so please let me know what information you need.

Thanks!
Iordan
Comment 1 Iordan Iordanov 2010-09-09 14:03:37 UTC
The exact version of the "newest stable" ath5k driver I tried is:
compat-wireless-2.6.36-rc3-1
Comment 2 Frans Schreuder 2010-11-12 12:35:29 UTC
Similar problem here
[   20.450150] ath5k 0000:07:00.0: PCI INT A -> GSI 19 (level, low) -> IRQ 19
[   20.450163] ath5k 0000:07:00.0: setting latency timer to 64
[   20.450274] ath5k 0000:07:00.0: registered as 'phy0'
[   21.006108] ath: EEPROM regdomain: 0x60
[   21.006110] ath: EEPROM indicates we should expect a direct regpair map
[   21.006115] ath: Country alpha2 being used: 00
[   21.006117] ath: Regpair used: 0x60
[   21.031592] Registered led device: ath5k-phy0::rx
[   21.031865] Registered led device: ath5k-phy0::tx
[   21.031871] ath5k phy0: Atheros AR2425 chip found (MAC: 0xe2, PHY: 0x70)
[   22.234823] ath5k phy0: gain calibration timeout (2412MHz)
[   22.281569] ath5k phy0: gain calibration timeout (2412MHz)
(.....)
[  166.689764] ath5k phy0: gain calibration timeout (2457MHz)
[  168.814595] ath5k phy0: gain calibration timeout (2412MHz)

My connection is already broken from startup, mostly when I boot on battery power.
When I boot and it works, my connection is usually stable and will stay on forever. (also suspend / hibernate will sometimes cause this error).

kernel version: 2.6.35-22 (ubuntu 10.10). (but it already happened from version 2.6.32 (31 was still alright).

Is there a way how I can supply more information? I have built debug versions of ath5k.ko and ath.ko, but how do I actually debug them?
Comment 3 theewk 2011-01-12 06:41:07 UTC
I also have this issue, using Ubuntu 10.10 and kernel 2.6.35-24-generic. My laptop is Acer Travelmate 5520G and I have internal mini PCI-E card with Atheros 5001:

05:00.0 Ethernet controller: Atheros Communications Inc. AR5001 Wireless Network Adapter (rev 01)

My card works always when I boot up the machine, signal is a bit poor in this location I mainly use my machine but connection is stable with very rare drop outs. However the problems appear when I do suspend and resume, I cannot reconnect and this card starts to output "Gain calibration error" -messages, phy0 then takes high cpu load. I can usually get the card to work again if I do suspend and resume couple times so it sometimes works. After I get it working again there is no problems until possibly when I do suspend and resume again.
Comment 4 neo.tida 2011-01-15 13:45:23 UTC
I have this problem too. I use AR5414 based card on a routerstation board with Openwrt trunk( it uses compat-wireless-2011-01-05 )
Comment 5 Nick Kossifidis 2011-01-16 00:35:14 UTC
Do you see this on only one or two channels or on all channels (eg when scanning) ? I only see channel 9 and 1 in your reports.

Also please for gain calibration timeout go to this bug...
https://bugzilla.kernel.org/show_bug.cgi?id=16436

This is not about calibration timeouts it's about failed calibrations (there is a difference).
Comment 6 neo.tida 2011-01-16 06:39:06 UTC
Nick, sorry for my mistake. My problem was gain calibration timeout and I went to https://bugzilla.kernel.org/show_bug.cgi?id=16436

Thanks
Comment 7 Søren Holm 2011-01-22 12:58:51 UTC
Hi 

This issue started happening for me after upgrading from 2.6.37 to 2.6.38-rc1.

I have this wireless adapter:

01:00.0 Ethernet controller: Atheros Communications Inc. AR5001 Wireless Network Adapter (rev 01)
        Subsystem: Atheros Communications Inc. AR5006EG 802.11bg NIC (2.4GHz, PCI Express)
        Flags: bus master, fast devsel, latency 0, IRQ 19
        Memory at fedf0000 (64-bit, non-prefetchable) [size=64K]
        Capabilities: <access denied>
        Kernel driver in use: ath5k
        Kernel modules: ath5k
Comment 8 Søren Holm 2011-01-22 13:00:07 UTC
Created attachment 44762 [details]
dmesg output of "gain clibration"-error
Comment 9 Iordan Iordanov 2011-01-23 20:17:56 UTC
Hey Nick, I have calibration failures, but also noise floor calibration timeouts. So isn't this post both about failures and time-outs?

I wanted to tell you that for some reason, with the stock Ubuntu 10.04 kernel, and this driver:

$ modinfo ath5k

filename:       /lib/modules/2.6.32-25-generic/updates/compat-wireless-2.6.35/ath5k.ko
version:        0.6.0 (EXPERIMENTAL)

I haven't seen the problem for a while. I will report if it comes back.

The ubuntu package which supplies this drive is:

linux-backports-modules-compat-wireless-2.6.35-2.6.32-25-generic
Comment 10 John W. Linville 2011-03-29 18:13:11 UTC
I suspect that "calibration timeouts and/or failures" is too generic to be useful.  If you believe you are experiencing such problems, please use git bisect to narrow-down the commit that might be causing the issues you are experiencing.  Alternatively, open a new bug more specific to your situation.
Comment 11 Iordan Iordanov 2011-05-01 21:53:48 UTC
The problem finally showed up again. In case it's relevant, it happened after 13 days uptime, and 22 suspend/resume cycles.

I think that we should first discover how to reliably reproduce this problem before we attempt to git bisect. Otherwise, each git bisect will take few months :). Any suggestions anyone?

Here is the dmesg output with the ugly WARNING: 

[309047.507810] wlan0: deauthenticating from 00:1d:60:9f:21:ae by local choice (reason=3)
[309047.719434] wlan0: direct probe to AP 00:1d:60:9f:21:ae (try 1)
[309047.721631] wlan0: direct probe responded
[309047.721638] wlan0: authenticate with AP 00:1d:60:9f:21:ae (try 1)
[309047.723374] wlan0: authenticated
[309047.723404] wlan0: associate with AP 00:1d:60:9f:21:ae (try 1)
[309047.725711] wlan0: RX AssocResp from 00:1d:60:9f:21:ae (capab=0x411 status=0 aid=2)
[309047.725718] wlan0: associated
[309047.727181] ADDRCONF(NETDEV_CHANGE): wlan0: link becomes ready
[309057.932033] wlan0: no IPv6 routers present
[312365.723943] wlan0: deauthenticated from 00:1d:60:9f:21:ae (Reason: 7)
[312368.225598] wlan0: direct probe to AP 00:1d:60:9f:21:ae (try 1)
[312368.227944] wlan0: direct probe responded
[312368.227949] wlan0: authenticate with AP 00:1d:60:9f:21:ae (try 1)
[312368.229641] wlan0: authenticated
[312368.229663] wlan0: associate with AP 00:1d:60:9f:21:ae (try 1)
[312368.231970] wlan0: RX AssocResp from 00:1d:60:9f:21:ae (capab=0x411 status=0 aid=2)
[312368.231975] wlan0: associated
[313559.624056] No probe response from AP 00:1d:60:9f:21:ae after 500ms, disconnecting.
[313562.140764] wlan0: direct probe to AP 00:1d:60:9f:21:ae (try 1)
[313562.340059] wlan0: direct probe to AP 00:1d:60:9f:21:ae (try 2)
[313562.540062] wlan0: direct probe to AP 00:1d:60:9f:21:ae (try 3)
[313562.740047] wlan0: direct probe to AP 00:1d:60:9f:21:ae timed out
[313574.552712] wlan0: direct probe to AP 00:1d:60:9f:21:ae (try 1)
[313574.752051] wlan0: direct probe to AP 00:1d:60:9f:21:ae (try 2)
[313574.952053] wlan0: direct probe to AP 00:1d:60:9f:21:ae (try 3)
[313575.152050] wlan0: direct probe to AP 00:1d:60:9f:21:ae timed out
[313577.578983] wlan0: direct probe to AP 00:1d:60:9f:21:ae (try 1)
[313577.581299] wlan0: direct probe responded
[313577.581305] wlan0: authenticate with AP 00:1d:60:9f:21:ae (try 1)
[313577.584703] wlan0: authenticated
[313577.584726] wlan0: associate with AP 00:1d:60:9f:21:ae (try 1)
[313577.586960] wlan0: RX AssocResp from 00:1d:60:9f:21:ae (capab=0x411 status=0 aid=2)
[313577.586965] wlan0: associated
[320828.800855] wlan0: deauthenticated from 00:1d:60:9f:21:ae (Reason: 7)
[320831.320791] wlan0: direct probe to AP 00:1d:60:9f:21:ae (try 1)
[320831.520061] wlan0: direct probe to AP 00:1d:60:9f:21:ae (try 2)
[320831.720053] wlan0: direct probe to AP 00:1d:60:9f:21:ae (try 3)
[320831.920052] wlan0: direct probe to AP 00:1d:60:9f:21:ae timed out
[320843.724047] wlan0: direct probe to AP 00:1d:60:9f:21:ae (try 1)
[320843.726490] wlan0: direct probe responded
[320843.726494] wlan0: authenticate with AP 00:1d:60:9f:21:ae (try 1)
[320843.728405] wlan0: authenticated
[320843.728426] wlan0: associate with AP 00:1d:60:9f:21:ae (try 1)
[320843.730463] wlan0: RX AssocResp from 00:1d:60:9f:21:ae (capab=0x411 status=0 aid=2)
[320843.730467] wlan0: associated
[330021.500056] No probe response from AP 00:1d:60:9f:21:ae after 500ms, disconnecting.
[330038.734436] wlan0: direct probe to AP 00:1d:60:9f:21:ae (try 1)
[330038.737200] wlan0: direct probe responded
[330038.737204] wlan0: authenticate with AP 00:1d:60:9f:21:ae (try 1)
[330038.738890] wlan0: authenticated
[330038.738909] wlan0: associate with AP 00:1d:60:9f:21:ae (try 1)
[330038.741214] wlan0: RX AssocResp from 00:1d:60:9f:21:ae (capab=0x411 status=0 aid=2)
[330038.741217] wlan0: associated
[331156.332075] ------------[ cut here ]------------
[331156.332105] WARNING: at /build/buildd/linux-2.6.32/include/net/mac80211.h:1035 ath5k_txbuf_setup+0x429/0x440 [ath5k]()
[331156.332113] Hardware name: Latitude D410                   
[331156.332118] Modules linked in: nls_iso8859_1 nls_cp437 vfat fat usb_storage aes_i586 aes_generic binfmt_misc ppdev pata_pcmcia fbcon tileblit font bitblit softcursor vga16fb vgastate arc4 joydev snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm_oss pcmcia snd_mixer_oss snd_pcm snd_seq_dummy snd_seq_oss snd_seq_midi snd_rawmidi snd_seq_midi_event snd_seq snd_timer snd_seq_device ath5k dell_wmi mac80211 snd i915 yenta_socket ath drm_kms_helper dell_laptop rsrc_nonstatic intel_agp cfg80211 drm i2c_algo_bit dcdbas soundcore psmouse serio_raw pcmcia_core led_class agpgart snd_page_alloc video output lp parport tg3
[331156.332227] Pid: 845, comm: dbus-daemon Tainted: G        W  2.6.32-30-generic #59-Ubuntu
[331156.332234] Call Trace:
[331156.332249]  [<c014cf72>] warn_slowpath_common+0x72/0xa0
[331156.332266]  [<f83d1fa9>] ? ath5k_txbuf_setup+0x429/0x440 [ath5k]
[331156.332281]  [<f83d1fa9>] ? ath5k_txbuf_setup+0x429/0x440 [ath5k]
[331156.332290]  [<c014cfba>] warn_slowpath_null+0x1a/0x20
[331156.332305]  [<f83d1fa9>] ath5k_txbuf_setup+0x429/0x440 [ath5k]
[331156.332314]  [<c0153e99>] ? irq_exit+0x39/0x70
[331156.332326]  [<c059392c>] ? smp_apic_timer_interrupt+0x5c/0x8b
[331156.332336]  [<c0103df1>] ? apic_timer_interrupt+0x31/0x40
[331156.332345]  [<c01200e0>] ? clear_local_APIC+0x60/0x2b0
[331156.332354]  [<c058f525>] ? _spin_unlock_irqrestore+0x15/0x20
[331156.332370]  [<f83d20b4>] ath5k_tx_queue+0xf4/0x1f0 [ath5k]
[331156.332394]  [<f829b0dd>] ? ccmp_encrypt_skb+0x1bd/0x1e0 [mac80211]
[331156.332410]  [<f83d21c6>] ath5k_tx+0x16/0x20 [ath5k]
[331156.332436]  [<f82ab47e>] __ieee80211_tx+0xce/0x150 [mac80211]
[331156.332463]  [<f82ac6ca>] ieee80211_tx+0x9a/0x1a0 [mac80211]
[331156.332473]  [<c04bcf20>] ? skb_release_data+0x70/0xa0
[331156.332500]  [<f82ac8a4>] ieee80211_xmit+0xd4/0x2a0 [mac80211]
[331156.332509]  [<c01457f0>] ? default_wake_function+0x0/0x20
[331156.332520]  [<c012a768>] ? default_spin_lock_flags+0x8/0x10
[331156.332549]  [<f82ace5c>] ieee80211_subif_start_xmit+0x2dc/0x670 [mac80211]
[331156.332565]  [<c04c6489>] ? dev_queue_xmit_nit+0xc9/0x130
[331156.332573]  [<c04bc64d>] ? skb_queue_head+0x3d/0x50
[331156.332583]  [<c04c68a9>] dev_hard_start_xmit+0x129/0x240
[331156.332594]  [<c04db5b5>] sch_direct_xmit+0x105/0x170
[331156.332604]  [<c0514ef5>] ? arp_create+0x55/0x290
[331156.332614]  [<c04c972f>] dev_queue_xmit+0x37f/0x4b0
[331156.332622]  [<c04d94b0>] ? eth_header+0x0/0xb0
[331156.332631]  [<c0514b3f>] arp_xmit+0x5f/0x70
[331156.332639]  [<c0515185>] arp_send+0x55/0x60
[331156.332648]  [<c0515356>] arp_solicit+0x1c6/0x220
[331156.332659]  [<c04cfa2c>] neigh_timer_handler+0xcc/0x2e0
[331156.332668]  [<c015c21e>] run_timer_softirq+0x13e/0x2c0
[331156.332679]  [<c0177424>] ? tick_dev_program_event+0x74/0xd0
[331156.332688]  [<c04cf960>] ? neigh_timer_handler+0x0/0x2e0
[331156.332697]  [<c0153c18>] __do_softirq+0x98/0x1b0
[331156.332706]  [<c02099f4>] ? rw_verify_area+0x64/0xe0
[331156.332714]  [<c0153d75>] do_softirq+0x45/0x50
[331156.332722]  [<c0153ec5>] irq_exit+0x65/0x70
[331156.332732]  [<c059392c>] smp_apic_timer_interrupt+0x5c/0x8b
[331156.332741]  [<c0103df1>] apic_timer_interrupt+0x31/0x40
[331156.332748] ---[ end trace 04ff3a9f26b70233 ]---
[338637.412135] ath5k phy0: noise floor calibration timeout (2452MHz)
[339317.779640] ath5k phy0: noise floor calibration timeout (2452MHz)
Comment 12 John W. Linville 2011-06-07 17:41:20 UTC
2.6.32 is ~2 years old -- can you recreate this issue on a current kernel (i.e. 2.6.39 or later)?
Comment 13 John W. Linville 2011-07-29 16:32:35 UTC
Closing due to lack of response...

Note You need to log in before you can comment on or make changes to this bug.