Bug 207093 - b43legacy disconnects after receiving some Unexpected value for chanstat (0x7C00)
Status: RESOLVED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: network-wireless
Hardware: PPC-32 Linux
Importance: P1 normal
Assignee: drivers_network-wireless@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-04-03 19:41 UTC by Erhard F.
Modified: 2020-06-23 11:09 UTC

See Also:
Kernel Version: 5.6.2
Tree: Mainline
Regression: No


Attachments
dmesg (kernel 5.6.2, PowerMac G4 DP) (81.22 KB, text/plain)
2020-04-03 19:41 UTC, Erhard F.
kernel .config (kernel 5.6.2, PowerMac G4 DP) (97.80 KB, text/plain)
2020-04-03 19:45 UTC, Erhard F.
dmesg (kernel 5.7.0, PowerMac G4 DP) (98.75 KB, text/plain)
2020-06-04 20:24 UTC, Erhard F.

Description Erhard F. 2020-04-03 19:41:15 UTC
Created attachment 288191
dmesg (kernel 5.6.2, PowerMac G4 DP)

This happens after a while and goes on and on: the connection drops after "Unexpected value for chanstat (0x7C00)", reconnects, works for a while, drops again, and so on.

[...]
[  978.136378] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[  995.051033] b43legacy-phy0 debug: Removing Interface type 2
[  995.058474] b43legacy-phy0 debug: Wireless interface stopped
[  995.066593] b43legacy-phy0 debug: DMA-30 0x0260 (RX) max used slots: 1/64
[  995.074175] b43legacy-phy0 debug: DMA-30 0x0200 (RX) max used slots: 1/64
[  995.081800] b43legacy-phy0 debug: DMA-30 0x02A0 (TX) max used slots: 0/128
[  995.103463] b43legacy-phy0 debug: DMA-30 0x0280 (TX) max used slots: 0/128
[  995.120108] b43legacy-phy0 debug: DMA-30 0x0260 (TX) max used slots: 0/128
[  995.136785] b43legacy-phy0 debug: DMA-30 0x0240 (TX) max used slots: 0/128
[  995.150083] b43legacy-phy0 debug: DMA-30 0x0220 (TX) max used slots: 104/128
[  995.263501] b43legacy-phy0 debug: DMA-30 0x0200 (TX) max used slots: 0/128
[  995.276742] b43legacy-phy0 debug: Radio initialized
[  995.282846] b43legacy-phy0 debug: Radio initialized
[  995.700089] b43legacy-phy0: Loading firmware version 0x127, patch level 14 (2005-04-18 02:36:27)
[  995.790576] b43legacy-phy0 debug: Chip initialized
[  995.797353] b43legacy-phy0 debug: 30-bit DMA initialized
[  995.809609] b43legacy-phy0 debug: Wireless interface started
[  995.833784] b43legacy-phy0 debug: Adding Interface type 2
[ 1093.918787] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 1094.584074] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 1159.037837] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 1159.265413] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 1311.074322] b43legacy-phy0 debug: Removing Interface type 2
[ 1311.080552] b43legacy-phy0 debug: Wireless interface stopped
[ 1311.096408] b43legacy-phy0 debug: DMA-30 0x0260 (RX) max used slots: 1/64
[ 1311.115973] b43legacy-phy0 debug: DMA-30 0x0200 (RX) max used slots: 1/64
[ 1311.127705] b43legacy-phy0 debug: DMA-30 0x02A0 (TX) max used slots: 0/128
[ 1311.150091] b43legacy-phy0 debug: DMA-30 0x0280 (TX) max used slots: 0/128
[ 1311.173395] b43legacy-phy0 debug: DMA-30 0x0260 (TX) max used slots: 0/128
[ 1311.190018] b43legacy-phy0 debug: DMA-30 0x0240 (TX) max used slots: 0/128
[ 1311.206677] b43legacy-phy0 debug: DMA-30 0x0220 (TX) max used slots: 110/128
[ 1311.323464] b43legacy-phy0 debug: DMA-30 0x0200 (TX) max used slots: 0/128
[ 1311.340012] b43legacy-phy0 debug: Radio initialized
[ 1311.345493] b43legacy-phy0 debug: Radio initialized
[ 1311.780026] b43legacy-phy0: Loading firmware version 0x127, patch level 14 (2005-04-18 02:36:27)
[ 1311.873866] b43legacy-phy0 debug: Chip initialized
[ 1311.881297] b43legacy-phy0 debug: 30-bit DMA initialized
[ 1311.895017] b43legacy-phy0 debug: Wireless interface started
[ 1311.910348] b43legacy-phy0 debug: Adding Interface type 2
[ 1378.691130] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 1477.669455] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 1478.539075] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 1510.483406] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 1510.489642] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 1627.074260] b43legacy-phy0 debug: Removing Interface type 2
[...]

Machine is a PowerMac G4 DP 3,6:

 # inxi -b
System:    Kernel: 5.6.2-gentoo-PowerMacG4 ppc bits: 32 Console: tty 1 
           Distro: Gentoo Base System release 2.6 
Machine:   Type: PowerPC Device System: PowerMac3 6 details: PowerMac3 6 rev: 3.3 (pvr 8001 0303) 
           serial: P6N 
CPU:       Dual Core: 7455 altivec supported type: MCP speed: 1417 MHz 
Graphics:  Device-1: AMD RV350 [Radeon 9550/9600/X1050 Series] driver: radeon v: kernel 
           Display: server: X.org 1.20.7 driver: ati,radeon unloaded: fbdev,modesetting tty: 104x53 
           Message: Advanced graphics data unavailable in console for root. 
Network:   Device-1: Broadcom and subsidiaries BCM4306 802.11b/g Wireless LAN driver: b43-pci-bridge 
           Device-2: Apple UniNorth 2 GMAC driver: gem 
           Device-3: gmac driver: gem 
Drives:    Local Storage: total: 689.82 GiB used: 3.95 GiB (0.6%) 
Info:      Processes: 165 Uptime: 38m Memory: 1.96 GiB used: 573.6 MiB (28.6%) Init: systemd 
           Shell: bash inxi: 3.0.36 

 # lspci 
0000:00:0b.0 Host bridge: Apple Inc. UniNorth 2 AGP
0000:00:10.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV350 [Radeon 9550/9600/X1050 Series]
0001:10:0b.0 Host bridge: Apple Inc. UniNorth 2 PCI
0001:10:13.0 USB controller: NEC Corporation OHCI USB Controller (rev 43)
0001:10:13.1 USB controller: NEC Corporation OHCI USB Controller (rev 43)
0001:10:13.2 USB controller: NEC Corporation uPD72010x USB 2.0 Controller (rev 04)
0001:10:15.0 Mass storage controller: Silicon Image, Inc. SiI 3112 [SATALink/SATARaid] Serial ATA Controller (rev 02)
0001:10:16.0 Network controller: Broadcom Inc. and subsidiaries BCM4306 802.11b/g Wireless LAN Controller (rev 02)
0001:10:17.0 Unassigned class [ff00]: Apple Inc. KeyLargo Mac I/O (rev 03)
0001:10:18.0 USB controller: Apple Inc. KeyLargo USB
0001:10:19.0 USB controller: Apple Inc. KeyLargo USB
0001:10:1b.0 USB controller: NEC Corporation OHCI USB Controller (rev 43)
0001:10:1b.1 USB controller: NEC Corporation OHCI USB Controller (rev 43)
0001:10:1b.2 USB controller: NEC Corporation uPD72010x USB 2.0 Controller (rev 04)
0002:20:0b.0 Host bridge: Apple Inc. UniNorth 2 Internal PCI
0002:20:0d.0 Unassigned class [ff00]: Apple Inc. UniNorth 2 ATA/100
0002:20:0e.0 FireWire (IEEE 1394): Apple Inc. UniNorth 2 FireWire (rev 01)
0002:20:0f.0 Ethernet controller: Apple Inc. UniNorth 2 GMAC (Sun GEM)
Comment 1 Erhard F. 2020-04-03 19:45:05 UTC
Created attachment 288193
kernel .config (kernel 5.6.2, PowerMac G4 DP)
Comment 2 Larry Finger 2020-04-04 20:42:22 UTC
Judging by the bus ID for the BCM4306, I think you are using the one built into the PowerMac G4. Is that correct? That device on my G4 is broken, but I have tested using a PCMCIA BCM4306 card. That works fine for me, other than being really slow.

I have duplicated your value using kernel 5.6.0-rc5, thus we can rule out a hardware problem.

I do not use the G4 for much of anything other than testing to make certain that kernel updates boot as there are some differences between the virtual systems that the developers use and real Apple hardware. As a result, I do not save a lot of old kernels. I am currently testing kernel 4.20, which is the latest one I have saved. Do you know which version you were using before the failures started?
Comment 3 Erhard F. 2020-04-04 21:01:58 UTC
Yes, it's a proprietary Apple one.

# lspci -vv -s 0001:10:16.0
0001:10:16.0 Network controller: Broadcom Inc. and subsidiaries BCM4306 802.11b/g Wireless LAN Controller (rev 02)
	Subsystem: Apple Inc. AirPort Extreme
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 16
	Interrupt: pin A routed to IRQ 57
	Region 0: Memory at 8008a000 (32-bit, non-prefetchable) [size=8K]
	Capabilities: [40] Power Management version 2
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=2 PME+
	Kernel driver in use: b43-pci-bridge
	Kernel modules: ssb

I bought it later on, though, so I can't remember at which kernel these failures started. But of course I could test earlier kernels and try to bisect this issue. Would there be any good starting point?
Comment 4 Larry Finger 2020-04-04 23:15:03 UTC
I'm seeing these bad chanstat problems in kernel 4.20, thus this is not a new problem. On my system, it drops the connection, but recovers without problem.

I have a 3.2 kernel to try. I'll see if it sees the problem. It is possible that the driver has always had this problem, but I do not remember ever seeing it, and it got considerable testing in the early days.
Comment 5 Larry Finger 2020-04-04 23:46:29 UTC
Unfortunately, kernel v3.2 has the same problem. That means no simple bisection. :{

My next step will be to see if it happens with x86. I still have one laptop with a PCMCIA card slot. If so, debugging will be easier than on the G4. I will be in touch.
Comment 6 Larry Finger 2020-04-05 15:49:10 UTC
It happens with x86 kernels as well.

My thinking right now is that the firmware is screwing up under certain conditions. Thus far, I have not thought of any other cause of the RX header getting destroyed. If it were a buffer overrun, then the faulty chanstat would be random rather than always having the same value.

One thing the code should do under these conditions is drop the packet. If you build your own kernels, that would be accomplished by adding a single statement as follows:

diff --git a/drivers/net/wireless/broadcom/b43legacy/xmit.c b/drivers/net/wireless/broadcom/b43legacy/xmit.c
index e9b23c2e5bd4..efd63f4ce74f 100644
--- a/drivers/net/wireless/broadcom/b43legacy/xmit.c
+++ b/drivers/net/wireless/broadcom/b43legacy/xmit.c
@@ -558,6 +558,7 @@ void b43legacy_rx(struct b43legacy_wldev *dev,
        default:
                b43legacywarn(dev->wl, "Unexpected value for chanstat (0x%X)\n",
                       chanstat);
+               goto drop;
        }
 
        memcpy(IEEE80211_SKB_RXCB(skb), &status, sizeof(status));

I am currently testing that change. From the posted info, it seems that you are running Gentoo 2.6. If you can generate your own kernel, please add that one line to your source. If not, I am running Ubuntu 12.04, and a kernel .deb that I generate should be installable on your system. If I placed that file on the cloud somewhere, would you be able to install it?
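
For readers without the kernel source at hand, the effect of the added "goto drop" can be sketched in plain user-space C. This is only an illustration under stated assumptions: the mask and PHY-type values, the rx_status structure, the frequency arithmetic, and the function names are hypothetical stand-ins, not the driver's actual definitions. The point it shows is that a channel status whose PHY-type field is not recognised is now rejected instead of being passed on to mac80211.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-ins for the driver's constants. The real values
 * live in the b43legacy headers and are not shown in this report. */
#define RX_CHAN_PHYTYPE_MASK 0x0007u
#define PHYTYPE_B            0x01u
#define PHYTYPE_G            0x02u

/* Hypothetical, simplified receive status. */
struct rx_status {
	unsigned int freq_mhz;
};

/* Returns true if the frame may be passed up the stack, false if it
 * must be dropped because the channel status is not recognised. */
static bool rx_fill_status(uint16_t chanstat, unsigned int chanid,
			   struct rx_status *status)
{
	switch (chanstat & RX_CHAN_PHYTYPE_MASK) {
	case PHYTYPE_B:
	case PHYTYPE_G:
		/* Assumed frequency arithmetic, for illustration only. */
		status->freq_mhz = 2400 + chanid;
		return true;
	default:
		fprintf(stderr, "Unexpected value for chanstat (0x%X)\n",
			(unsigned int)chanstat);
		return false;	/* plays the role of the added "goto drop" */
	}
}

/* Small self-check: the value from this report is rejected, while a
 * value carrying the (assumed) G-PHY type is accepted. */
static void rx_selfcheck(void)
{
	struct rx_status st = { 0 };

	assert(!rx_fill_status(0x7C00, 0, &st));
	assert(rx_fill_status(PHYTYPE_G, 12, &st) && st.freq_mhz == 2412);
}
```

Under these assumed constants, 0x7C00 has an all-zero PHY-type field, so it falls through to the default branch every time. That is consistent with the observation above that the faulty value is always the same rather than random.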
Comment 7 Erhard F. 2020-04-05 21:42:10 UTC
Thank you for your efforts!

Applied your patch on top of kernel 5.6.2. Up to now the G4 is running fine and those "Unexpected value for chanstat (0x7C00)" messages disappeared so far.
Comment 8 Larry Finger 2020-04-05 23:24:02 UTC
The total lack of messages means the problem is not happening. My system has been running for about 6 hours and has gotten 1 so far. The main thing is that dropping the packet should eliminate the associated disconnect as we are forcing a retransmit of the faulty data as opposed to letting mac80211 try to process junk. I'm still deciding whether to keep logging the errors, or delete that output statement.

The rough draft of the commit message is as follows:

b43legacy: Fix case where channel status is corrupted
  
In https://bugzilla.kernel.org/show_bug.cgi?id=207093, a defect in
b43legacy is reported. Upon testing, this problem exists on PPC and
X86 platforms and is present in the oldest kernel tested (3.2).

The problem is a corrupted channel status received from the device.
Both the internal card in a PowerBook G4 and the PCMCIA version
(Broadcom BCM4306 with PCI ID 14e4:4320) have the problem. Only Rev. 2
(revision 4 of the 802.11 core) of the chip has been tested. No other
devices using b43legacy are available for testing.

Various sources of the problem were considered. Buffer overrun and
other sources of corruption within the driver were rejected because
the faulty channel status is always the same, not a random value.
I concluded that the faulty data is coming from the device, probably
due to a firmware bug. As that source is not available, the driver
must take appropriate action to recover.

At present, the driver reports the error, and then continues to process
the bad packet. I believe that to be a mistake, thus this patch causes
the driver to drop the corrupted packet.

Cc: Stable <stable@vger.kernel.org>
Reported-and-tested by: F. Erhard <erhard_f@mailbox.org>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
Comment 9 Erhard F. 2020-06-04 20:24:28 UTC
Created attachment 289509
dmesg (kernel 5.7.0, PowerMac G4 DP)

Though there are far fewer reconnects with the patch, the issue is not completely gone yet:

[...]
[  364.183971] b43legacy-phy0 debug: Radio initialized
[  364.608706] b43legacy-phy0: Loading firmware version 0x127, patch level 14 (2005-04-18 02:36:27)
[  364.702532] b43legacy-phy0 debug: Chip initialized
[  364.710644] b43legacy-phy0 debug: 30-bit DMA initialized
[  364.718869] b43legacy-phy0 debug: Wireless interface started
[  364.735695] b43legacy-phy0 debug: Adding Interface type 2
[  395.674810] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[  395.683134] b43legacy-phy0 debug: RX: Packet dropped
[  395.879068] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[  395.886634] b43legacy-phy0 debug: RX: Packet dropped
[  395.894033] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[  395.901580] b43legacy-phy0 debug: RX: Packet dropped
[  494.868092] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[  494.875741] b43legacy-phy0 debug: RX: Packet dropped
[  495.472894] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[  495.480728] b43legacy-phy0 debug: RX: Packet dropped
[  529.303677] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[  529.311845] b43legacy-phy0 debug: RX: Packet dropped
[  658.672137] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[  658.680607] b43legacy-phy0 debug: RX: Packet dropped
[  680.037445] b43legacy-phy0 debug: Removing Interface type 2
[  680.046099] b43legacy-phy0 debug: Wireless interface stopped
[  680.056033] b43legacy-phy0 debug: DMA-30 0x0260 (RX) max used slots: 0/64
[  680.064908] b43legacy-phy0 debug: DMA-30 0x0200 (RX) max used slots: 2/64
[  680.083475] b43legacy-phy0 debug: DMA-30 0x02A0 (TX) max used slots: 0/128
[  680.106619] b43legacy-phy0 debug: DMA-30 0x0280 (TX) max used slots: 0/128
[  680.123288] b43legacy-phy0 debug: DMA-30 0x0260 (TX) max used slots: 0/128
[  680.139947] b43legacy-phy0 debug: DMA-30 0x0240 (TX) max used slots: 0/128
[  680.156640] b43legacy-phy0 debug: DMA-30 0x0220 (TX) max used slots: 110/128
[  680.273346] b43legacy-phy0 debug: DMA-30 0x0200 (TX) max used slots: 0/128
[  680.289941] b43legacy-phy0 debug: Radio initialized
[  680.297559] b43legacy-phy0 debug: Radio initialized
[  680.716623] b43legacy-phy0: Loading firmware version 0x127, patch level 14 (2005-04-18 02:36:27)
[...]
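
To compare logs like the one above before and after the patch, the relevant events can be tallied mechanically. The following is a minimal sketch, not part of the driver or of any existing tool: the log_tally structure and the function names are hypothetical, and the match strings are simply the message texts that appear in the dmesg excerpts in this report.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical counters for the three events visible in the excerpts. */
struct log_tally {
	unsigned int chanstat_warnings;   /* "Unexpected value for chanstat" */
	unsigned int dropped_packets;     /* "RX: Packet dropped" */
	unsigned int interface_removals;  /* "Removing Interface type" */
};

/* Classify a single dmesg line by substring match. */
static void tally_line(const char *line, struct log_tally *t)
{
	if (strstr(line, "Unexpected value for chanstat"))
		t->chanstat_warnings++;
	else if (strstr(line, "RX: Packet dropped"))
		t->dropped_packets++;
	else if (strstr(line, "Removing Interface type"))
		t->interface_removals++;
}

/* Tally every line of a stream. Lines longer than the buffer are read
 * in pieces, which is acceptable for typical dmesg output. */
static struct log_tally tally_stream(FILE *in)
{
	struct log_tally t = { 0, 0, 0 };
	char line[512];

	while (fgets(line, sizeof line, in))
		tally_line(line, &t);
	return t;
}
```

Calling tally_stream(stdin) from a small main() and piping dmesg output into it would give the number of warnings, dropped packets, and interface removals for a given uptime, which makes it easier to judge how much the patch reduces the reconnect cycles.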
Comment 10 Larry Finger 2020-06-05 00:01:30 UTC
I believe this to be a firmware bug. Unfortunately, we have no source for the firmware, thus there is little chance to fix it. The only "fix" will be unloading/reloading the driver.

Surprisingly, in my environment, I rarely see these events. From that I conclude that my residential environment is relatively quiet in the wifi band, and that reduces the incidence of such events.
Comment 11 Erhard F. 2020-06-05 13:50:01 UTC
Hm, I see...

Still, it would be nice if your patch got upstreamed, as (at least on my machine) it reduces the number of disconnect-reconnect cycles.
Comment 12 Larry Finger 2020-06-05 16:48:53 UTC
It was merged into mainline some time ago. In the git log is the following:

commit ec4d3e3a054578de34cd0b587ab8a1ac36f629d9
Author: Larry Finger <Larry.Finger@lwfinger.net>
Date:   Tue Apr 7 14:00:43 2020 -0500

    b43legacy: Fix case where channel status is corrupted
    
    This patch fixes commit 75388acd0cd8 ("add mac80211-based driver for
    legacy BCM43xx devices")

--snip--
    
    At present, the driver reports the error, and them continues to process
    the bad packet. This is believed that to be a mistake, and the correct
    action is to drop the corrupted packet.
    
    Fixes: 75388acd0cd8 ("add mac80211-based driver for legacy BCM43xx devices")
    Cc: Stable <stable@vger.kernel.org>
    Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
    Reported-and-tested by: F. Erhard <erhard_f@mailbox.org>
    Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
    Link: https://lore.kernel.org/r/20200407190043.1686-1-Larry.Finger@lwfinger.net

    
Note that the patch is annotated with a reference to "stable", which is the correct way to get it pushed to all stable kernels. I have done all that I can. You need to check with your distro as to why this patch is not applied.
Comment 13 Erhard F. 2020-06-08 23:02:52 UTC
Seems it simply has not landed in stable kernels yet. Sorry for being impatient!

I will close here once it gets into stable.
Comment 14 Erhard F. 2020-06-23 11:09:20 UTC
Fix has landed in stable series. Closing.
