Bug 11901

Summary: ath5k triggers WARN_ON in __ieee80211_rx
Product: Networking
Reporter: Bob Copeland (me)
Component: Wireless
Assignee: networking_wireless (networking_wireless)
Status: CLOSED CODE_FIX
Severity: high
CC: johannes, rjw
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.28-rc
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 11808    
Attachments: Fix detection of jumbo frames

Description Bob Copeland 2008-10-30 09:15:09 UTC
Latest working kernel version: 2.6.27
Earliest failing kernel version: 2.6.28-rc

This appears to be a problem with the rate table; mac80211 is complaining that the rate index is out of bounds.  Indeed it is -1, i.e. uninitialized, for that combination of band and rate index.  I have the following via printk in ath5k_tasklet_rx:

rxs.rate_idx = -1
sc->curband->band = 0  (2 ghz)
rs.rs_rate = 0      <-- that looks wrong
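
For illustration, the kind of bounds check that rejects the values above can be sketched as follows. This is a simplified stand-alone model, not the mac80211 source; the helper name rate_idx_in_bounds and the n_bitrates parameter are assumptions made up for the example.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (not a kernel function): returns true when
 * rate_idx is a usable index into a bitrate table that holds
 * n_bitrates entries.  A value of -1, as printed above, fails it.
 */
static bool rate_idx_in_bounds(int rate_idx, int n_bitrates)
{
	return rate_idx >= 0 && rate_idx < n_bitrates;
}
```

With rate_idx = -1 the check fails regardless of the table size, which is consistent with the warning firing on every such frame.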

Reported by kerneloops:

http://www.kerneloops.org/guilty.php?guilty=ath5k_tasklet_rx&version=2.6.28-rc&start=1835008&end=1867775&class=warn


WARNING: at /srv/devel/kernel/net/mac80211/rx.c:2203
ath5k_tasklet_rx+0x318/0x5c0()
Modules linked in:
Pid: 0, comm: swapper Not tainted 2.6.28-rc1-wl #46
Call Trace:
 [<c014744f>] warn_on_slowpath+0x5f/0xa0
 [<c013ef69>] enqueue_task_fair+0xb9/0x100
 [<c013ef69>] enqueue_task_fair+0xb9/0x100
 [<c0137835>] default_spin_lock_flags+0x5/0x10
 [<c0574fbb>] _spin_lock_irqsave+0x2b/0x40
 [<c0151347>] lock_timer_base+0x27/0x60
 [<c01514e0>] __mod_timer+0x90/0xa0
 [<c03e8dea>] rexmit_timer+0x36a/0x3d0
 [<c03aebe8>] ath5k_tasklet_rx+0x318/0x5c0
 [<c014bfb0>] tasklet_action+0x70/0x100
 [<c014ca97>] __do_softirq+0x97/0x160
 [<c014ca00>] __do_softirq+0x0/0x160
 <IRQ>  [<c0170c50>] handle_fasteoi_irq+0x0/0xe0
 [<c014c7bd>] irq_exit+0x5d/0x80
 [<c012297c>] do_IRQ+0xcc/0x100
 [<c0120aeb>] common_interrupt+0x23/0x28
 [<c0334f88>] acpi_idle_enter_bm+0x315/0x39e
 [<c04346c4>] cpuidle_idle_call+0x74/0xc0
 [<c011eded>] cpu_idle+0x6d/0xd0
---[ end trace 8ac847f454371229 ]---
Comment 1 Bob Copeland 2008-10-31 15:01:30 UTC
This has triggered since 63266a6535... "ath5k: rates cleanup".  In particular, in hw_to_driver_rix(), if "something went wrong" it used to fall back to the basic rate index of 1, whereas now it returns -1 and triggers the WARN_ON.

Not sure yet whether or not hw rate index of 0 is valid and we just don't know about it, or there's some error condition we aren't catching earlier.

Incidentally, the easiest way to trigger this is to run kismet without joining a network.  In recent kernels it happens almost immediately; on older stuff it can take up to 15 minutes.
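
The behaviour change described in this comment can be sketched roughly as below. This is not the ath5k code: the table size, the table contents, and the names rate_map, rate_map_init, lookup_with_fallback and lookup_strict are all hypothetical, chosen only to contrast "fall back to index 1" with "return -1".

```c
#include <stddef.h>

#define HW_RATE_CODES 32	/* assumed table size for this example */

/* Hypothetical map from hardware rate code to driver rate index. */
static int rate_map[HW_RATE_CODES];

/* Mark every slot as unknown (-1), then fill in two made-up entries. */
static void rate_map_init(void)
{
	for (size_t i = 0; i < HW_RATE_CODES; i++)
		rate_map[i] = -1;
	rate_map[3] = 0;	/* illustrative values, not real rate codes */
	rate_map[5] = 1;
}

/* Old-style behaviour: an unknown code silently becomes index 1. */
static int lookup_with_fallback(unsigned int hw_rix)
{
	if (hw_rix >= HW_RATE_CODES || rate_map[hw_rix] < 0)
		return 1;
	return rate_map[hw_rix];
}

/* New-style behaviour: an unknown code is reported to the caller as -1. */
static int lookup_strict(unsigned int hw_rix)
{
	if (hw_rix >= HW_RATE_CODES)
		return -1;
	return rate_map[hw_rix];
}
```

In this model a hardware code of 0 yields 1 from the fallback variant and -1 from the strict one, so the strict variant exposes bad frames that the fallback used to hide.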
Comment 2 Bob Copeland 2008-11-01 07:22:52 UTC
Created attachment 18576 [details]
Fix detection of jumbo frames

More digging and I found that these were jumbo frames, and our code for discarding them was horribly broken.  This patch fixes the WARN_ON and replaces it with the intended error message.  However there may be a separate bug leading to the fact that we now get tons of these.
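
The attachment itself is not reproduced here. As a generic illustration of how a "more fragments follow" test can silently stop working, the sketch below stores a masked 32-bit descriptor word into an 8-bit field. Whether this matches the actual patch is an assumption; the flag position DESC_MORE_FLAG, the struct, and both function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical flag bit, deliberately placed above bit 7. */
#define DESC_MORE_FLAG 0x00001000u

/* Simplified stand-in for a driver rx status struct. */
struct rx_status_sketch {
	uint8_t more;	/* nonzero: frame continues in the next descriptor */
};

/* Broken variant: the masked value is truncated to 8 bits, so it is always 0. */
static void set_more_truncating(struct rx_status_sketch *rs, uint32_t word)
{
	rs->more = (uint8_t)(word & DESC_MORE_FLAG);
}

/* Fixed variant: reduce the masked value to 0 or 1 before storing it. */
static void set_more_normalized(struct rx_status_sketch *rs, uint32_t word)
{
	rs->more = (word & DESC_MORE_FLAG) ? 1 : 0;
}
```

With the truncating variant, a later "if (rs->more)" test never sees a jumbo frame, so such frames would continue down the normal receive path.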
Comment 3 Rafael J. Wysocki 2008-11-02 12:44:07 UTC
Handled-By : Bob Copeland <me@bobcopeland.com>
Patch : http://lkml.org/lkml/2008/11/2/157
Comment 4 Rafael J. Wysocki 2008-11-03 23:08:03 UTC
Ignore-Patch : http://lkml.org/lkml/2008/11/2/157
Patch : http://marc.info/?l=linux-kernel&m=122576849517013&w=4