Bug 198089 - brcmfmac fails on ASUS T100TAM, apparently due to recent CLM download support changes
Status: RESOLVED CODE_FIX
Alias: None
Product: Networking
Classification: Unclassified
Component: Wireless
Hardware: All Linux
Importance: P1 normal
Assignee: networking_wireless@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-12-05 23:38 UTC by Robert R. Howell
Modified: 2018-03-22 15:54 UTC

See Also:
Kernel Version: 4.15-rc1
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
Patch which restores brcmfmac wireless using 4.15-rc1 (618 bytes, patch)
2017-12-05 23:38 UTC, Robert R. Howell
Portion of dmesg output showing brcmfmac CLM errors (4.07 KB, text/plain)
2017-12-07 01:55 UTC, Robert R. Howell

Description Robert R. Howell 2017-12-05 23:38:27 UTC
Created attachment 261029
Patch which restores brcmfmac wireless using 4.15-rc1

Beginning with 4.15-rc1, the brcmfmac driver fails to fully initialize on an ASUS T100TAM, and WiFi never becomes available. The failure appears to be caused by the code added to support CLM download. As described at <http://www.spinics.net/lists/linux-wireless/msg167637.html>, that code SHOULD just bypass the CLM download if the CLM blob file isn't available. However, the actual code in drivers/net/wireless/broadcom/brcm80211/brcmfmac/common.c appears to skip the rest of the initialization if that CLM file isn't found, and it isn't found on the T100TAM.

I've managed to get wireless working on the T100TAM by applying the attached patch, which DOES just continue initialization rather than skipping it. I can't say whether that's really the correct approach for all situations, but at least for the T100TAM it appears to restore the pre-4.15-rc1 behavior and allow WiFi to work correctly.
Comment 1 Robert R. Howell 2017-12-07 01:55:06 UTC
Created attachment 261049
Portion of dmesg output showing brcmfmac CLM errors
Comment 2 Robert R. Howell 2017-12-07 01:56:35 UTC
I've done some further testing of how the brcmf_c_process_clm_blob function fails, causing the above problem. I've attached the portion of dmesg containing all the brcmfmac messages when that happens. It appears that the 3rd line from the end of that dmesg file is the first relevant error message:

brcmfmac: brcmf_c_process_clm_blob: request CLM blob file failed (-11)

Apparently that error code -11 (EAGAIN on Linux) comes from a call to request_firmware within brcmf_c_process_clm_blob. If request_firmware had returned a simple -ENOENT, the code apparently WOULD have just skipped the CLM loading and continued with the rest of the brcmfmac initialization. However, this error code causes the added CLM code to decide there is some more serious problem and (if my patch isn't applied) abort the rest of the brcmfmac initialization.

I don't yet understand the request_firmware mechanism and its return codes well enough to know why I get this error code -11, but it looks like the newly added CLM code should test not only for ENOENT but also for a broader set of errors, and back out of the CLM load more gracefully (rather than just aborting) when those other errors occur as well.
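
To make the difference between the two error-handling policies concrete, here is a minimal user-space sketch in C. It is not the driver code and not the attached patch: fw_request_fn is a hypothetical stand-in for request_firmware, the fake_fw_* helpers and the clm_load_* function names are invented for illustration, and the actual download of the blob to the device is left out. It only assumes what is described above: the request returns 0 or a negative errno, and the CLM blob is optional.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for the kernel's request_firmware():
 * returns 0 on success or a negative errno value on failure. */
typedef int (*fw_request_fn)(const char *name);

/* Strict policy, as this report describes the 4.15-rc1 behavior:
 * only -ENOENT means "no blob, carry on"; any other error is
 * passed up and aborts the rest of initialization. */
static int clm_load_strict(fw_request_fn request, const char *name)
{
	int err;

	assert(request != NULL);
	err = request(name);
	if (err == -ENOENT)
		return 0;	/* blob missing: skip the CLM download */
	return err;		/* 0 on success, otherwise abort */
}

/* Tolerant policy: the blob is optional, so any failure to obtain
 * it is logged and initialization continues without it. */
static int clm_load_tolerant(fw_request_fn request, const char *name)
{
	int err;

	assert(request != NULL);
	err = request(name);
	if (err) {
		fprintf(stderr, "no CLM blob (%s, err=%d), skipping download\n",
			name, err);
		return 0;
	}
	/* A real driver would download the blob to the device here. */
	return 0;
}

/* Fake firmware loaders (illustration only) for each outcome. */
static int fake_fw_ok(const char *name)     { (void)name; return 0; }
static int fake_fw_enoent(const char *name) { (void)name; return -ENOENT; }
static int fake_fw_eagain(const char *name) { (void)name; return -EAGAIN; }
```

With these definitions, clm_load_strict(fake_fw_eagain, "example.clm_blob") returns -EAGAIN, which corresponds to the aborted initialization seen in the dmesg output, while clm_load_tolerant with the same arguments returns 0 and lets initialization proceed. The file name "example.clm_blob" is a placeholder, not the name the driver actually requests.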
Comment 3 Robert R. Howell 2018-03-01 06:49:33 UTC
The bug which caused this problem has been fixed in 4.15-rc9 (and the released 4.15.0) by a patch posted by Wright Feng at <https://patchwork.kernel.org/patch/10166257>.  That patch is in some ways similar to my kluge version posted above.  

One problem remains: the CLM blob handling now causes a 60-second delay in WiFi coming up after the system boots, apparently because some part of the CLM blob loading software waits for a timeout.
Comment 4 Robert R. Howell 2018-03-22 15:54:45 UTC
I've discovered that the 60-second delay in bringing up WiFi, mentioned in Comment 3, is triggered by having CONFIG_FW_LOADER_USER_HELPER_FALLBACK set when I built the kernel. If that config option is NOT set (and apparently it is not set in most distributions), then the 60-second delay does NOT occur.
