Bug 207093
| Summary: | b43legacy disconnects after receiving some Unexpected value for chanstat (0x7C00) | | |
|---|---|---|---|
| Product: | Drivers | Reporter: | Erhard F. (erhard_f) |
| Component: | network-wireless | Assignee: | drivers_network-wireless (drivers_network-wireless) |
| Status: | RESOLVED CODE_FIX | | |
| Severity: | normal | CC: | Larry.Finger |
| Priority: | P1 | | |
| Hardware: | PPC-32 | | |
| OS: | Linux | | |
| Kernel Version: | 5.6.2 | Subsystem: | |
| Regression: | No | Bisected commit-id: | |
| Attachments: | dmesg (kernel 5.6.2, PowerMac G4 DP); kernel .config (kernel 5.6.2, PowerMac G4 DP); dmesg (kernel 5.7.0, PowerMac G4 DP) | | |
Created attachment 288193 [details]
kernel .config (kernel 5.6.2, PowerMac G4 DP)
Judging by the bus ID for the BCM4306, I think you are using the one built into the PowerMac G4. Is that correct? That device on my G4 is broken, but I have tested using a PCMCIA BCM4306 card. That works fine for me, other than being really slow. I have duplicated your value using kernel 5.6.0-rc5, thus we can rule out a hardware problem.

I do not use the G4 for much of anything other than testing to make certain that kernel updates boot, as there are some differences between the virtual systems that the developers use and real Apple hardware. As a result, I do not save a lot of old kernels. I am currently testing kernel 4.20, which is the latest one I have saved. Do you know which version you were using before the failures started?

Yes, it's a proprietary Apple one.

# lspci -vv -s 0001:10:16.0
0001:10:16.0 Network controller: Broadcom Inc. and subsidiaries BCM4306 802.11b/g Wireless LAN Controller (rev 02)
	Subsystem: Apple Inc. AirPort Extreme
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 16
	Interrupt: pin A routed to IRQ 57
	Region 0: Memory at 8008a000 (32-bit, non-prefetchable) [size=8K]
	Capabilities: [40] Power Management version 2
		Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=375mA PME(D0+,D1+,D2+,D3hot+,D3cold+)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=2 PME+
	Kernel driver in use: b43-pci-bridge
	Kernel modules: ssb

I bought it later on, though, so I can't remember at which kernel these failures started. But of course I could test earlier kernels and try to bisect this issue. Would there be any good starting point?

I'm seeing these bad chanstat problems in kernel 4.20, thus this is not a new problem. On my system, it drops the connection, but recovers without problem. I have a 3.2 kernel to try. I'll see if it sees the problem.
It is possible that the driver has always had this problem, but I do not remember ever seeing it, and it got considerable testing in the early days.

Unfortunately, kernel v3.2 has the same problem. That means no simple bisection. :{ My next step will be to see if it happens with x86. I still have one laptop with a PCMCIA card slot. If so, debugging will be easier than on the G4. I will be in touch.

It happens with x86 kernels as well. My thinking right now is that it is the firmware screwing up on certain conditions. Thus far, I have not thought of any other cause of the RX header getting destroyed. If it were a buffer overrun, then the faulty chanstat would be random rather than always having the same value. One thing the code should do under these conditions is drop the packet. If you build your own kernels, that would be accomplished by adding a single statement as follows:

diff --git a/drivers/net/wireless/broadcom/b43legacy/xmit.c b/drivers/net/wireless/broadcom/b43legacy/xmit.c
index e9b23c2e5bd4..efd63f4ce74f 100644
--- a/drivers/net/wireless/broadcom/b43legacy/xmit.c
+++ b/drivers/net/wireless/broadcom/b43legacy/xmit.c
@@ -558,6 +558,7 @@ void b43legacy_rx(struct b43legacy_wldev *dev,
 	default:
 		b43legacywarn(dev->wl, "Unexpected value for chanstat (0x%X)\n", chanstat);
+		goto drop;
 	}
 	memcpy(IEEE80211_SKB_RXCB(skb), &status, sizeof(status));

I am currently testing that change. From the posted info, it seems that you are running Gentoo 2.6. If you can generate your own kernel, please add that one line to your source. If not, I am running Ubuntu 12.04, and a kernel .deb that I generate should be installable on your system. If I placed that file on the cloud somewhere, would you be able to install it?

Thank you for your efforts! Applied your patch on top of kernel 5.6.2. Up to now the G4 is running fine and those "Unexpected value for chanstat (0x7C00)" messages disappeared so far.

The total lack of messages means the problem is not happening.
My system has been running for about 6 hours and has gotten 1 so far. The main thing is that dropping the packet should eliminate the associated disconnect, as we are forcing a retransmit of the faulty data as opposed to letting mac80211 try to process junk. I'm still deciding whether to keep logging the errors, or delete that output statement. The rough draft of the commit message is as follows:

b43legacy: Fix case where channel status is corrupted

In https://bugzilla.kernel.org/show_bug.cgi?id=207093, a defect in b43legacy is reported. Upon testing, this problem exists on PPC and X86 platforms and is present in the oldest kernel tested (3.2). The problem is a corrupted channel status received from the device. Both the internal card in a PowerBook G4 and the PCMCIA version (Broadcom BCM4306 with PCI ID 14e4:4320) have the problem. Only Rev. 2 (revision 4 of the 802.11 core) of the chip has been tested. No other devices using b43legacy are available for testing.

Various sources of the problem were considered. Buffer overrun and other sources of corruption within the driver were rejected because the faulty channel status is always the same, not a random value. I concluded that the faulty data is coming from the device, probably due to a firmware bug. As that source is not available, the driver must take appropriate action to recover.

At present, the driver reports the error, and then continues to process the bad packet. I believe that to be a mistake, thus this patch causes the driver to drop the corrupted packet.

Cc: Stable <stable@vger.kernel.org>
Reported-and-tested by: F. Erhard <erhard_f@mailbox.org>
Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>

Created attachment 289509 [details]
dmesg (kernel 5.7.0, PowerMac G4 DP)
Though there are far fewer reconnects with the patch, the issue is not completely gone yet:
[...]
[ 364.183971] b43legacy-phy0 debug: Radio initialized
[ 364.608706] b43legacy-phy0: Loading firmware version 0x127, patch level 14 (2005-04-18 02:36:27)
[ 364.702532] b43legacy-phy0 debug: Chip initialized
[ 364.710644] b43legacy-phy0 debug: 30-bit DMA initialized
[ 364.718869] b43legacy-phy0 debug: Wireless interface started
[ 364.735695] b43legacy-phy0 debug: Adding Interface type 2
[ 395.674810] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 395.683134] b43legacy-phy0 debug: RX: Packet dropped
[ 395.879068] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 395.886634] b43legacy-phy0 debug: RX: Packet dropped
[ 395.894033] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 395.901580] b43legacy-phy0 debug: RX: Packet dropped
[ 494.868092] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 494.875741] b43legacy-phy0 debug: RX: Packet dropped
[ 495.472894] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 495.480728] b43legacy-phy0 debug: RX: Packet dropped
[ 529.303677] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 529.311845] b43legacy-phy0 debug: RX: Packet dropped
[ 658.672137] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 658.680607] b43legacy-phy0 debug: RX: Packet dropped
[ 680.037445] b43legacy-phy0 debug: Removing Interface type 2
[ 680.046099] b43legacy-phy0 debug: Wireless interface stopped
[ 680.056033] b43legacy-phy0 debug: DMA-30 0x0260 (RX) max used slots: 0/64
[ 680.064908] b43legacy-phy0 debug: DMA-30 0x0200 (RX) max used slots: 2/64
[ 680.083475] b43legacy-phy0 debug: DMA-30 0x02A0 (TX) max used slots: 0/128
[ 680.106619] b43legacy-phy0 debug: DMA-30 0x0280 (TX) max used slots: 0/128
[ 680.123288] b43legacy-phy0 debug: DMA-30 0x0260 (TX) max used slots: 0/128
[ 680.139947] b43legacy-phy0 debug: DMA-30 0x0240 (TX) max used slots: 0/128
[ 680.156640] b43legacy-phy0 debug: DMA-30 0x0220 (TX) max used slots: 110/128
[ 680.273346] b43legacy-phy0 debug: DMA-30 0x0200 (TX) max used slots: 0/128
[ 680.289941] b43legacy-phy0 debug: Radio initialized
[ 680.297559] b43legacy-phy0 debug: Radio initialized
[ 680.716623] b43legacy-phy0: Loading firmware version 0x127, patch level 14 (2005-04-18 02:36:27)
[...]
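Excerpts like the one above can be summarized mechanically. The following is a minimal sketch in C, not part of the driver, that tallies chanstat warnings, checks whether every warning carries the same value (the observation used in this thread to argue against a buffer overrun), and counts interface teardowns. The names `demo_log_tally` and `demo_tally_line` are hypothetical, and treating each "Removing Interface type" line as one reconnect cycle is an assumption based on the logs in this report.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type for this sketch; not part of b43legacy. */
struct demo_log_tally {
	unsigned int warnings;     /* "Unexpected value for chanstat" lines seen */
	unsigned int other_values; /* warnings whose value differs from the first */
	unsigned int teardowns;    /* "Removing Interface type" lines seen */
	unsigned int first_value;  /* first chanstat value seen (valid if warnings > 0) */
};

/* Update the tally with one dmesg line. */
static void demo_tally_line(const char *line, struct demo_log_tally *t)
{
	static const char warn[] = "Unexpected value for chanstat (0x";
	const char *p = strstr(line, warn);
	unsigned int value;

	/* Parse the hex digits that follow the "(0x" prefix. */
	if (p && sscanf(p + strlen(warn), "%x", &value) == 1) {
		if (t->warnings == 0)
			t->first_value = value;
		else if (value != t->first_value)
			t->other_values++;
		t->warnings++;
		return;
	}

	/* Assumption: an interface teardown marks one reconnect cycle. */
	if (strstr(line, "Removing Interface type"))
		t->teardowns++;
}
```

To use it, read the log line by line (for example with `fgets` on standard input), pass each line to `demo_tally_line`, and print the fields afterwards. If `other_values` stays at zero while `warnings` grows, the corrupted status is always the same value, which is what was observed here.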
I believe this to be a firmware bug. Unfortunately, we have no source for the firmware, thus there is little chance to fix it. The only "fix" will be unloading/reloading the driver. Surprisingly, in my environment, I rarely see these events. From that I conclude that my residential environment is relatively quiet in the wifi band, and that reduces the incidence of such events.

Hm, I see... Still it would be nice if your patch gets upstreamed as (at least on my machine) it reduces the disconnect-reconnect-cycles necessary.

It was merged into mainline some time ago. In the git log is the following:

commit ec4d3e3a054578de34cd0b587ab8a1ac36f629d9
Author: Larry Finger <Larry.Finger@lwfinger.net>
Date:   Tue Apr 7 14:00:43 2020 -0500

    b43legacy: Fix case where channel status is corrupted

    This patch fixes commit 75388acd0cd8 ("add mac80211-based driver for legacy BCM43xx devices")

    --snip--

    At present, the driver reports the error, and them continues to process the bad packet. This is believed that to be a mistake, and the correct action is to drop the corrupted packet.

    Fixes: 75388acd0cd8 ("add mac80211-based driver for legacy BCM43xx devices")
    Cc: Stable <stable@vger.kernel.org>
    Signed-off-by: Larry Finger <Larry.Finger@lwfinger.net>
    Reported-and-tested by: F. Erhard <erhard_f@mailbox.org>
    Signed-off-by: Kalle Valo <kvalo@codeaurora.org>
    Link: https://lore.kernel.org/r/20200407190043.1686-1-Larry.Finger@lwfinger.net

Note that the patch is annotated with a reference to "stable", which is the correct way to get it pushed to all stable kernels. I have done all that I can. You need to check with your distro as to why this patch is not applied.

Seems it simply has not landed in stable kernels yet. Sorry for being impatient! I will close here once it gets into stable.

Fix has landed in stable series. Closing.
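The behaviour change discussed in this thread (warn, then drop the frame instead of passing it on) can be modelled outside the kernel. The sketch below is a user-space illustration only: the mask and PHY-type constants, `struct demo_rx_stats`, and `demo_rx_accept` are hypothetical names and values chosen for the example, and the real driver's definitions may differ.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative constants only; the real driver's masks and values may differ. */
#define DEMO_CHAN_PHYTYPE_MASK 0x0007u
#define DEMO_PHYTYPE_B         0x0001u
#define DEMO_PHYTYPE_G         0x0002u

/* Hypothetical counters for this sketch. */
struct demo_rx_stats {
	unsigned int delivered; /* frames handed to the upper layer */
	unsigned int dropped;   /* frames rejected because the status looked corrupt */
};

/*
 * Return true if a frame with this channel status may be passed up,
 * false if it must be dropped. Mirrors the idea of the patch: log the
 * unexpected value, then jump to the drop path rather than continuing.
 */
static bool demo_rx_accept(uint16_t chanstat, struct demo_rx_stats *stats)
{
	switch (chanstat & DEMO_CHAN_PHYTYPE_MASK) {
	case DEMO_PHYTYPE_B:
	case DEMO_PHYTYPE_G:
		break;
	default:
		fprintf(stderr, "Unexpected value for chanstat (0x%X)\n",
			(unsigned int)chanstat);
		goto drop;
	}

	stats->delivered++;
	return true;

drop:
	stats->dropped++;
	return false;
}
```

With these illustrative constants, a status of 0x7C00 has none of the low PHY-type bits set, so it takes the default branch and is counted as dropped. Without the `goto drop`, the same frame would fall through and be counted as delivered, which corresponds to the pre-patch behaviour of handing a corrupted frame to the upper layer.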
Created attachment 288191 [details]
dmesg (kernel 5.6.2, PowerMac G4 DP)

This happens after a while and goes on and on. Disconnecting after "", reconnecting, working for a while, disconnecting again, etc.

[...]
[ 978.136378] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 995.051033] b43legacy-phy0 debug: Removing Interface type 2
[ 995.058474] b43legacy-phy0 debug: Wireless interface stopped
[ 995.066593] b43legacy-phy0 debug: DMA-30 0x0260 (RX) max used slots: 1/64
[ 995.074175] b43legacy-phy0 debug: DMA-30 0x0200 (RX) max used slots: 1/64
[ 995.081800] b43legacy-phy0 debug: DMA-30 0x02A0 (TX) max used slots: 0/128
[ 995.103463] b43legacy-phy0 debug: DMA-30 0x0280 (TX) max used slots: 0/128
[ 995.120108] b43legacy-phy0 debug: DMA-30 0x0260 (TX) max used slots: 0/128
[ 995.136785] b43legacy-phy0 debug: DMA-30 0x0240 (TX) max used slots: 0/128
[ 995.150083] b43legacy-phy0 debug: DMA-30 0x0220 (TX) max used slots: 104/128
[ 995.263501] b43legacy-phy0 debug: DMA-30 0x0200 (TX) max used slots: 0/128
[ 995.276742] b43legacy-phy0 debug: Radio initialized
[ 995.282846] b43legacy-phy0 debug: Radio initialized
[ 995.700089] b43legacy-phy0: Loading firmware version 0x127, patch level 14 (2005-04-18 02:36:27)
[ 995.790576] b43legacy-phy0 debug: Chip initialized
[ 995.797353] b43legacy-phy0 debug: 30-bit DMA initialized
[ 995.809609] b43legacy-phy0 debug: Wireless interface started
[ 995.833784] b43legacy-phy0 debug: Adding Interface type 2
[ 1093.918787] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 1094.584074] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 1159.037837] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 1159.265413] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 1311.074322] b43legacy-phy0 debug: Removing Interface type 2
[ 1311.080552] b43legacy-phy0 debug: Wireless interface stopped
[ 1311.096408] b43legacy-phy0 debug: DMA-30 0x0260 (RX) max used slots: 1/64
[ 1311.115973] b43legacy-phy0 debug: DMA-30 0x0200 (RX) max used slots: 1/64
[ 1311.127705] b43legacy-phy0 debug: DMA-30 0x02A0 (TX) max used slots: 0/128
[ 1311.150091] b43legacy-phy0 debug: DMA-30 0x0280 (TX) max used slots: 0/128
[ 1311.173395] b43legacy-phy0 debug: DMA-30 0x0260 (TX) max used slots: 0/128
[ 1311.190018] b43legacy-phy0 debug: DMA-30 0x0240 (TX) max used slots: 0/128
[ 1311.206677] b43legacy-phy0 debug: DMA-30 0x0220 (TX) max used slots: 110/128
[ 1311.323464] b43legacy-phy0 debug: DMA-30 0x0200 (TX) max used slots: 0/128
[ 1311.340012] b43legacy-phy0 debug: Radio initialized
[ 1311.345493] b43legacy-phy0 debug: Radio initialized
[ 1311.780026] b43legacy-phy0: Loading firmware version 0x127, patch level 14 (2005-04-18 02:36:27)
[ 1311.873866] b43legacy-phy0 debug: Chip initialized
[ 1311.881297] b43legacy-phy0 debug: 30-bit DMA initialized
[ 1311.895017] b43legacy-phy0 debug: Wireless interface started
[ 1311.910348] b43legacy-phy0 debug: Adding Interface type 2
[ 1378.691130] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 1477.669455] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 1478.539075] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 1510.483406] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 1510.489642] b43legacy-phy0 warning: Unexpected value for chanstat (0x7C00)
[ 1627.074260] b43legacy-phy0 debug: Removing Interface type 2
[...]

Machine is a PowerMac G4 DP 3,6:

# inxi -b
System:    Kernel: 5.6.2-gentoo-PowerMacG4 ppc bits: 32 Console: tty 1 Distro: Gentoo Base System release 2.6
Machine:   Type: PowerPC Device System: PowerMac3 6 details: PowerMac3 6 rev: 3.3 (pvr 8001 0303) serial: P6N
CPU:       Dual Core: 7455 altivec supported type: MCP speed: 1417 MHz
Graphics:  Device-1: AMD RV350 [Radeon 9550/9600/X1050 Series] driver: radeon v: kernel
           Display: server: X.org 1.20.7 driver: ati,radeon unloaded: fbdev,modesetting tty: 104x53
           Message: Advanced graphics data unavailable in console for root.
Network:   Device-1: Broadcom and subsidiaries BCM4306 802.11b/g Wireless LAN driver: b43-pci-bridge
           Device-2: Apple UniNorth 2 GMAC driver: gem
           Device-3: gmac driver: gem
Drives:    Local Storage: total: 689.82 GiB used: 3.95 GiB (0.6%)
Info:      Processes: 165 Uptime: 38m Memory: 1.96 GiB used: 573.6 MiB (28.6%) Init: systemd Shell: bash inxi: 3.0.36

# lspci
0000:00:0b.0 Host bridge: Apple Inc. UniNorth 2 AGP
0000:00:10.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] RV350 [Radeon 9550/9600/X1050 Series]
0001:10:0b.0 Host bridge: Apple Inc. UniNorth 2 PCI
0001:10:13.0 USB controller: NEC Corporation OHCI USB Controller (rev 43)
0001:10:13.1 USB controller: NEC Corporation OHCI USB Controller (rev 43)
0001:10:13.2 USB controller: NEC Corporation uPD72010x USB 2.0 Controller (rev 04)
0001:10:15.0 Mass storage controller: Silicon Image, Inc. SiI 3112 [SATALink/SATARaid] Serial ATA Controller (rev 02)
0001:10:16.0 Network controller: Broadcom Inc. and subsidiaries BCM4306 802.11b/g Wireless LAN Controller (rev 02)
0001:10:17.0 Unassigned class [ff00]: Apple Inc. KeyLargo Mac I/O (rev 03)
0001:10:18.0 USB controller: Apple Inc. KeyLargo USB
0001:10:19.0 USB controller: Apple Inc. KeyLargo USB
0001:10:1b.0 USB controller: NEC Corporation OHCI USB Controller (rev 43)
0001:10:1b.1 USB controller: NEC Corporation OHCI USB Controller (rev 43)
0001:10:1b.2 USB controller: NEC Corporation uPD72010x USB 2.0 Controller (rev 04)
0002:20:0b.0 Host bridge: Apple Inc. UniNorth 2 Internal PCI
0002:20:0d.0 Unassigned class [ff00]: Apple Inc. UniNorth 2 ATA/100
0002:20:0e.0 FireWire (IEEE 1394): Apple Inc. UniNorth 2 FireWire (rev 01)
0002:20:0f.0 Ethernet controller: Apple Inc. UniNorth 2 GMAC (Sun GEM)