Bug 213937 - ath11k: QCA6390: crash in ath11k_mac_op_unassign_vif_chanctx upon connect to 5 GHz network
Status: RESOLVED DUPLICATE of bug 212059
Alias: None
Product: Drivers
Classification: Unclassified
Component: network-wireless
Hardware: All Linux
Importance: P1 normal
Assignee: drivers_network-wireless@kernel-bugs.osdl.org
Reported: 2021-08-01 16:43 UTC by Dr. Jan-Philip Gehrcke
Modified: 2021-11-20 17:11 UTC

See Also:
Kernel Version: 5.13.5
Subsystem:
Regression: No
Bisected commit-id:


Description Dr. Jan-Philip Gehrcke 2021-08-01 16:43:46 UTC
I am following https://wireless.wiki.kernel.org/en/users/drivers/ath11k/bugreport


>     Exact kernel version. Is it a distro kernel or have you compiled it
>     yourself? Any extra patches?

$ uname -a
Linux xps15 5.13.5-200.fc34.x86_64 #1 SMP Sun Jul 25 16:19:01 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux


>    Linux distribution version.

Fedora 34


>    Host device information, for example if it's a laptop make and model,
>    architecture, CPU etc.

Dell Inc. XPS 15 9500/0XWT2F


>    BIOS version.

BIOS 1.8.1 05/26/2021


>    How many times have you seen the bug and how many times did you try to
>    reproduce it?

The crash happens upon connecting to WiFi. After today's kernel and BIOS updates I tried this about 10 times and saw the same problem every time.


>    Output from: uname -a

See above.


>    Output from: lspci -mnn

Slightly modified to just show the wifi hardware -- let me know if more is needed.

$ lspci -mnn | grep -i network
6c:00.0 "Network controller [0280]" "Qualcomm [17cb]" "QCA6390 Wireless Network Adapter [AX500-DBS (2x2)] [1101]" -r01 "Rivet Networks [1a56]" "Device [a501]"


>   Output from: find /lib/firmware/ath11k/ -type f | xargs md5sum

Slightly modified the command to account for xz-compressed binary files:

$ find /lib/firmware/ath11k/QCA6390 -type f -print -exec sh -c 'cat {} | unxz | md5sum' \;
/lib/firmware/ath11k/QCA6390/hw2.0/amss.bin.xz
a101dc90f8e876f39383b60c9da64ec4  -
/lib/firmware/ath11k/QCA6390/hw2.0/board-2.bin.xz
4c0781f659d2b7d6bef10a2e3d457728  -
/lib/firmware/ath11k/QCA6390/hw2.0/m3.bin.xz
d4c912a3501a3694a3f460d13de06d28  -

These are the same as shown by Kalle Valo here: https://bugzilla.kernel.org/show_bug.cgi?id=210923#c9
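
For anyone who wants to repeat the checksum comparison without a shell pipeline, here is a minimal Python sketch of the same idea: decompress each xz file and hash the payload rather than the compressed file. The function names `md5_of_decompressed` and `firmware_checksums` are made up for this illustration and are not part of any ath11k tooling.

```python
import hashlib
import lzma
from pathlib import Path
from typing import Dict


def md5_of_decompressed(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the MD5 hex digest of the decompressed contents of an .xz file."""
    digest = hashlib.md5()
    with lzma.open(path, "rb") as stream:
        # Read in chunks so large firmware images are never fully in memory.
        while True:
            chunk = stream.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()


def firmware_checksums(root: Path) -> Dict[str, str]:
    """Map every *.xz file below root to the MD5 of its decompressed payload."""
    return {str(p): md5_of_decompressed(p) for p in sorted(root.rglob("*.xz"))}
```

Calling `firmware_checksums(Path("/lib/firmware/ath11k/QCA6390"))` should give digests that can be compared directly with the md5sum values of the uncompressed files in other reports.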


>    Output from: dmesg | grep ath11k

Slightly modified command to also show the "authenticate with ..." message, 0.2 seconds before the crash:

$ dmesg | grep -e ath11 -e authenticate
[   13.594798] ath11k_pci 0000:6c:00.0: BAR 0: assigned [mem 0xb4200000-0xb42fffff 64bit]
[   13.594813] ath11k_pci 0000:6c:00.0: enabling device (0000 -> 0002)
[   13.595013] ath11k_pci 0000:6c:00.0: qca6390 hw2.0
[   13.936017] ath11k_pci 0000:6c:00.0: chip_id 0x0 chip_family 0xb board_id 0xff soc_id 0xffffffff
[   13.936021] ath11k_pci 0000:6c:00.0: fw_version 0x101c06cc fw_build_timestamp 2020-06-24 19:50 fw_build_id 
[   14.123182] ath11k_pci 0000:6c:00.0 wlp108s0: renamed from wlan0
[   23.764211] wlp108s0: authenticate with 2c:91:ab:75:cf:be
[   23.927028] ath11k_pci 0000:6c:00.0: firmware crashed: MHI_CB_SYS_ERROR
[   24.852899] ath11k_pci 0000:6c:00.0: failed to synchronize setup for vdev 0 start: -110
[   24.852912] ath11k_pci 0000:6c:00.0: failed to start vdev 0 addr 9c:b6:d0:3e:b2:95 on freq 5180: -110
[   24.852919] ath11k_pci 0000:6c:00.0: failed to delay vdev start: -110
[   24.852929] ath11k_pci 0000:6c:00.0: failed to send WMI_PEER_DELETE cmd
[   24.852934] ath11k_pci 0000:6c:00.0: failed to delete peer vdev_id 0 addr 2c:91:ab:75:cf:be ret -108
[   24.852939] ath11k_pci 0000:6c:00.0: Failed to add station: 2c:91:ab:75:cf:be for VDEV: 0
[   24.862787] WARNING: CPU: 14 PID: 1544 at drivers/net/wireless/ath/ath11k/mac.c:5582 ath11k_mac_op_unassign_vif_chanctx+0x17e/0x1d0 [ath11k]
[   24.862905]  ath11k_pci snd_intel_sdw_acpi kvm_intel ath11k snd_hda_codec dell_laptop ee1004 iTCO_wdt ledtrig_audio qmi_helpers intel_pmc_bxt snd_hda_core iTCO_vendor_support mei_hdcp intel_rapl_msr kvm dell_smm_hwmon snd_hwdep snd_seq mac80211 snd_seq_device irqbypass dell_wmi snd_pcm rapl dell_smbios intel_cstate dcdbas cfg80211 intel_uncore pcspkr hci_uart uvcvideo snd_timer videobuf2_vmalloc btqca mhi videobuf2_memops dell_wmi_sysman snd btrtl videobuf2_v4l2 wmi_bmof dell_wmi_descriptor btbcm intel_wmi_thunderbolt i2c_i801 btintel videobuf2_common soundcore libarc4 i2c_smbus bluetooth videodev thunderbolt hid_sensor_als joydev processor_thermal_device hid_sensor_trigger mc mei_me hid_sensor_iio_common processor_thermal_rfim processor_thermal_mbox industrialio_triggered_buffer ucsi_acpi processor_thermal_rapl kfifo_buf typec_ucsi intel_rapl_common idma64 mei industrialio intel_pch_thermal intel_soc_dts_iosf typec int3403_thermal rfkill int340x_thermal_zone int3400_thermal
[   24.863103] RIP: 0010:ath11k_mac_op_unassign_vif_chanctx+0x17e/0x1d0 [ath11k]
[   24.863299]  ? ath11k_mac_op_bss_info_changed+0x3c/0xc70 [ath11k]
[   24.863780]  nl80211_authenticate+0x296/0x2f0 [cfg80211]
[   24.864132] ath11k_pci 0000:6c:00.0: failed to submit WMI_VDEV_STOP cmd
[   24.864138] ath11k_pci 0000:6c:00.0: failed to stop WMI vdev 0: -108
[   24.864142] ath11k_pci 0000:6c:00.0: failed to stop vdev 0: -108
[   28.970516] ath11k_warn: 8 callbacks suppressed
[   28.970525] ath11k_pci 0000:6c:00.0: failed to send WMI_START_SCAN_CMDID
[   28.970534] ath11k_pci 0000:6c:00.0: failed to start hw scan: -108
[   29.972227] ath11k_pci 0000:6c:00.0: failed to send WMI_START_SCAN_CMDID
[   29.972238] ath11k_pci 0000:6c:00.0: failed to start hw scan: -108
[   30.973870] ath11k_pci 0000:6c:00.0: failed to send WMI_START_SCAN_CMDID
[   30.973882] ath11k_pci 0000:6c:00.0: failed to start hw scan: -108
[   31.975468] ath11k_pci 0000:6c:00.0: failed to send WMI_START_SCAN_CMDID
[   31.975478] ath11k_pci 0000:6c:00.0: failed to start hw scan: -108
[   32.977082] ath11k_pci 0000:6c:00.0: failed to send WMI_START_SCAN_CMDID
[   32.977093] ath11k_pci 0000:6c:00.0: failed to start hw scan: -108
[   33.978699] ath11k_pci 0000:6c:00.0: failed to send WMI_START_SCAN_CMDID
[   33.978711] ath11k_pci 0000:6c:00.0: failed to start hw scan: -108
[   34.980350] ath11k_pci 0000:6c:00.0: failed to send WMI_START_SCAN_CMDID
[   34.980361] ath11k_pci 0000:6c:00.0: failed to start hw scan: -108
[   35.981992] ath11k_pci 0000:6c:00.0: failed to send WMI_START_SCAN_CMDID
[   35.982003] ath11k_pci 0000:6c:00.0: failed to start hw scan: -108
[   36.983416] ath11k_pci 0000:6c:00.0: failed to send WMI_START_SCAN_CMDID
[   36.983426] ath11k_pci 0000:6c:00.0: failed to start hw scan: -108
[   37.985057] ath11k_pci 0000:6c:00.0: failed to send WMI_START_SCAN_CMDID
[   37.985067] ath11k_pci 0000:6c:00.0: failed to start hw scan: -108
[   38.986509] ath11k_pci 0000:6c:00.0: failed to send WMI_START_SCAN_CMDID
[   38.986521] ath11k_pci 0000:6c:00.0: failed to start hw scan: -108
[   39.988162] ath11k_pci 0000:6c:00.0: failed to send WMI_START_SCAN_CMDID
[   39.988172] ath11k_pci 0000:6c:00.0: failed to start hw scan: -108
[   40.989782] ath11k_pci 0000:6c:00.0: failed to send WMI_START_SCAN_CMDID
[   40.989793] ath11k_pci 0000:6c:00.0: failed to start hw scan: -108
[   41.991418] ath11k_pci 0000:6c:00.0: failed to send WMI_START_SCAN_CMDID
[   41.991429] ath11k_pci 0000:6c:00.0: failed to start hw scan: -108
[   42.993036] ath11k_pci 0000:6c:00.0: failed to send WMI_START_SCAN_CMDID
[   42.993047] ath11k_pci 0000:6c:00.0: failed to start hw scan: -108
[   43.994543] ath11k_pci 0000:6c:00.0: failed to send WMI_START_SCAN_CMDID
[   43.994554] ath11k_pci 0000:6c:00.0: failed to start hw scan: -108
[   44.996123] ath11k_pci 0000:6c:00.0: failed to send WMI_START_SCAN_CMDID
[   44.996134] ath11k_pci 0000:6c:00.0: failed to start hw scan: -108
[   45.059192] ath11k_pci 0000:6c:00.0: fail to set monitor filter: -108
[   45.059300] ath11k_pci 0000:6c:00.0: failed to submit WMI_VDEV_DELETE_CMDID
[   45.059307] ath11k_pci 0000:6c:00.0: failed to delete WMI vdev 0: -108
[   45.059314] ath11k_pci 0000:6c:00.0: removing stale peer 2c:91:ab:75:cf:be from vdev_id 0
[   45.059367] ath11k_pci 0000:6c:00.0: failed to clear rx_filter for monitor status ring: (-108)
[   45.068445] ath11k_pci 0000:6c:00.0: failed to send WMI_PDEV_SET_PARAM cmd
[   45.068455] ath11k_pci 0000:6c:00.0: failed to enable PMF QOS: (-108
[  457.152902] ath11k_pci 0000:6c:00.0: failed to send WMI_PDEV_SET_PARAM cmd
[  457.152913] ath11k_pci 0000:6c:00.0: failed to enable PMF QOS: (-108
[  870.045979] ath11k_pci 0000:6c:00.0: failed to send WMI_PDEV_SET_PARAM cmd
[  870.045989] ath11k_pci 0000:6c:00.0: failed to enable PMF QOS: (-108
[ 1282.075619] ath11k_pci 0000:6c:00.0: failed to send WMI_PDEV_SET_PARAM cmd
[ 1282.075628] ath11k_pci 0000:6c:00.0: failed to enable PMF QOS: (-108
[ 1694.049019] ath11k_pci 0000:6c:00.0: failed to send WMI_PDEV_SET_PARAM cmd
[ 1694.049029] ath11k_pci 0000:6c:00.0: failed to enable PMF QOS: (-108
[ 2106.075271] ath11k_pci 0000:6c:00.0: failed to send WMI_PDEV_SET_PARAM cmd
[ 2106.075281] ath11k_pci 0000:6c:00.0: failed to enable PMF QOS: (-108
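
To make the log above easier to triage, here is a rough Python sketch that summarizes the trailing error codes and measures the delay between two messages. It assumes the log lines are available as a list of strings. The helper names (`parse_dmesg`, `errno_summary`, `seconds_between`) are invented for this example, and the errno table covers only the two Linux codes that appear in this report.

```python
import re
from collections import Counter

# Linux errno names for the two codes in the log above (values as defined in
# the Linux errno headers; other operating systems number them differently).
LINUX_ERRNO = {108: "ESHUTDOWN", 110: "ETIMEDOUT"}

LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s*(.*)$")
# A negative number at the end of the message, e.g. ": -110", "(-108)", "(-108".
ERRNO_RE = re.compile(r"[\s(]-(\d+)\)?\s*$")


def parse_dmesg(lines):
    """Yield (timestamp_seconds, message) for every well-formed dmesg line."""
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            yield float(match.group(1)), match.group(2)


def errno_summary(lines):
    """Count trailing negative error codes, keyed by errno name when known."""
    counts = Counter()
    for _, message in parse_dmesg(lines):
        match = ERRNO_RE.search(message)
        if match:
            code = int(match.group(1))
            counts[LINUX_ERRNO.get(code, f"errno {code}")] += 1
    return counts


def seconds_between(lines, first_marker, second_marker):
    """Delay from the first line containing first_marker to the next line
    containing second_marker, or None if either marker is missing."""
    start = None
    for timestamp, message in parse_dmesg(lines):
        if start is None:
            if first_marker in message:
                start = timestamp
        elif second_marker in message:
            return timestamp - start
    return None
```

Applied to the log above, `seconds_between(lines, "authenticate with", "firmware crashed")` gives roughly 0.16 s. On Linux, -110 is ETIMEDOUT (the vdev start timed out) and -108 is ESHUTDOWN (every later WMI command fails because the firmware is gone).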



Ideally, this hardware would soon work out of the box with Fedora 34. I understand if that is unrealistic; in that case I would appreciate any advice on how to work around this issue.

For the record, a bit of insight into the somewhat painful journey:

- I had previously tried to report problems via https://bugzilla.redhat.com/show_bug.cgi?id=1950410 -- but said ticket got too complex.

- I reported a crash in ath11k_mac_op_unassign_vif_chanctx via https://lore.kernel.org/ath11k/5b80d6b4-2956-0213-eeb8-c2229c7ec67c@googlemail.com/T/ -- but that was with kernel 5.12.0 and happened upon _searching_ for networks, not upon connecting.
Comment 1 Dr. Jan-Philip Gehrcke 2021-08-01 17:05:31 UTC
I wanted to see if the difference between 2.4 GHz and 5 GHz matters. To that end I removed the 'saved' 5 GHz WiFi connection via:

    $ nmcli connection delete id <connection uuid>

I then rebooted to confirm again that `dmesg` does not show a crash as long as I don't connect to a wireless network. The last few ath11k-related lines in dmesg output:

[   14.034352] ath11k_pci 0000:6c:00.0: chip_id 0x0 chip_family 0xb board_id 0xff soc_id 0xffffffff
[   14.034355] ath11k_pci 0000:6c:00.0: fw_version 0x101c06cc fw_build_timestamp 2020-06-24 19:50 fw_build_id 
[   14.223440] ath11k_pci 0000:6c:00.0 wlp108s0: renamed from wlan0


I then browsed the wireless networks and connected to a 2.4 GHz network. No crash. Relevant lines from `dmesg`:

[  122.869683] wlp108s0: authenticate with 2c:91:ab:75:cf:bd
[  123.027099] wlp108s0: send auth to 2c:91:ab:75:cf:bd (try 1/3)
[  123.032424] wlp108s0: authenticated
[  123.034883] wlp108s0: associate with 2c:91:ab:75:cf:bd (try 1/3)
[  123.039912] wlp108s0: RX AssocResp from 2c:91:ab:75:cf:bd (capab=0x1431 status=0 aid=131)
[  123.061465] wlp108s0: associated
[  123.100517] wlp108s0: Limiting TX power to 20 (20 - 0) dBm as advertised by 2c:91:ab:75:cf:bd
[  125.360926] IPv6: ADDRCONF(NETDEV_CHANGE): wlp108s0: link becomes ready


I then disconnected the wired network; at least for basic communication, the 2.4 GHz connection works:

    $ ping 1.1.1.1
    PING 1.1.1.1 (1.1.1.1) 56(84) bytes of data.
    64 bytes from 1.1.1.1: icmp_seq=1 ttl=57 time=12.7 ms
    ...


I then searched specifically for 5 GHz issues and found https://bugzilla.kernel.org/show_bug.cgi?id=212059, which might be the actual underlying issue. I am updating the title of this bug accordingly.
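
To tell the two bands apart directly from a log line such as "failed to start vdev 0 ... on freq 5180", a small Python sketch can map a center frequency in MHz to its band and channel number. The function name `wifi_band_and_channel` is hypothetical, and the accepted frequency ranges are a simplification that ignores regulatory differences and the 6 GHz band.

```python
def wifi_band_and_channel(freq_mhz: int):
    """Return (band, channel) for a 2.4 GHz or 5 GHz center frequency in MHz."""
    if freq_mhz == 2484:
        # Channel 14 does not follow the 5 MHz spacing of channels 1-13.
        return "2.4 GHz", 14
    if 2412 <= freq_mhz <= 2472 and (freq_mhz - 2407) % 5 == 0:
        return "2.4 GHz", (freq_mhz - 2407) // 5
    if 5160 <= freq_mhz <= 5885 and freq_mhz % 5 == 0:
        return "5 GHz", (freq_mhz - 5000) // 5
    raise ValueError(f"unsupported frequency: {freq_mhz} MHz")
```

For the failing connection in the description, `wifi_band_and_channel(5180)` returns `("5 GHz", 36)`.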
Comment 2 Kalle Valo 2021-11-20 17:11:26 UTC

*** This bug has been marked as a duplicate of bug 212059 ***
