Bug 218276 - ath11k: QCNFA765: Bug with non-standard router setting. Crashes, terrible latency and speed.
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: network-wireless
Hardware: AMD Linux
Importance: P3 high
Assignee: drivers_network-wireless@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-12-16 11:46 UTC by Evgenii Ilchenko
Modified: 2023-12-21 03:55 UTC
CC List: 2 users

See Also:
Kernel Version: 6.5.13 6.7.0rc5
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Dmesg output (131.13 KB, text/plain)
2023-12-16 11:46 UTC, Evgenii Ilchenko

Description Evgenii Ilchenko 2023-12-16 11:46:12 UTC
Created attachment 305615
Dmesg output

Hardware:
Lenovo Thinkpad P14s (21K5001JUS)
AMD Ryzen 7450u with Qualcomm QCNFA765 Wireless Network Adapter
Router: Huawei HG8245X6-10

Software:
Debian Testing (trixie).
Tested with kernels 6.1.0, 6.5.0, and 6.5.13.


The problem is reproduced in the following environment:

802.11ax is turned off on the router.
In this case, a lot of messages like this are printed to the logs:

ath11k_pci 0000:02:00.0: Received with invalid mcs in VHT mode 11
ath11k_pci 0000:02:00.0: Received with invalid mcs in VHT mode 10

and:

[   19.498035] ------------[ cut here ]------------
[   19.498039] Rate marked as a VHT rate but data is invalid: MCS: 10, NSS: 0
[   19.498138] WARNING: CPU: 12 PID: 3107 at net/mac80211/rx.c:5337 ieee80211_rx_list+0x2b3/0xda0 [mac80211]
.........
[   19.498631] RIP: 0010:ieee80211_rx_list+0x2b3/0xda0 [mac80211]
[   19.498684] Code: 00 00 80 3d 96 a7 07 00 00 0f 85 2d ff ff ff 0f b6 53 4a 40 0f b6 f7 48 c7 c7 e0 a4 e2 c1 c6 05 7a a7 07 00 01 e8 dd 5d b6 e3 <0f> 0b e9 0b ff ff ff 40 80 ff 0b 0f 86 26 03 00 00 80 3d 5c a7 07
.......
[   19.498724] Call Trace:
[   19.498731]  <IRQ>
[   19.498735]  ? ieee80211_rx_list+0x2b3/0xda0 [mac80211]
[   19.498785]  ? __warn+0x81/0x130
[   19.498799]  ? ieee80211_rx_list+0x2b3/0xda0 [mac80211]
[   19.498852]  ? report_bug+0x171/0x1a0
[   19.498861]  ? prb_read_valid+0x1b/0x30
[   19.498871]  ? srso_alias_return_thunk+0x5/0x7f
[   19.498882]  ? handle_bug+0x3c/0x80
[   19.498891]  ? exc_invalid_op+0x17/0x70
[   19.498897]  ? asm_exc_invalid_op+0x1a/0x20
[   19.498910]  ? ieee80211_rx_list+0x2b3/0xda0 [mac80211]
[   19.498941]  ? srso_alias_return_thunk+0x5/0x7f
[   19.498944]  ? _dev_warn+0x79/0xa0
[   19.498952]  ? srso_alias_return_thunk+0x5/0x7f
[   19.498956]  ? ath11k_peer_find_by_id+0x100/0x1c0 [ath11k]
[   19.498978]  ieee80211_rx_napi+0x53/0xe0 [mac80211]
[   19.498999]  ath11k_dp_rx_process_received_packets+0x23e/0x660 [ath11k]
[   19.499013]  ath11k_dp_process_rx+0x2cf/0x3c0 [ath11k]
[   19.499026]  ath11k_dp_service_srng+0x2e0/0x320 [ath11k]
[   19.499037]  ath11k_pcic_ext_grp_napi_poll+0x25/0x80 [ath11k]
[   19.499047]  __napi_poll+0x28/0x1b0
[   19.499055]  net_rx_action+0x2a4/0x380
[   19.499058]  ? srso_alias_return_thunk+0x5/0x7f
[   19.499060]  ? __napi_schedule+0xb0/0xc0
[   19.499065]  __do_softirq+0xc7/0x2ae
[   19.499070]  ? handle_edge_irq+0x8b/0x230
[   19.499076]  __irq_exit_rcu+0x96/0xb0
[   19.499083]  common_interrupt+0x86/0xa0
[   19.499086]  </IRQ>
[   19.499087]  <TASK>
[   19.499089]  asm_common_interrupt+0x26/0x40
.........
[   19.499179] ---[ end trace 0000000000000000 ]---
The full dmesg output is attached.
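
To see how often each invalid MCS index shows up in a saved dmesg file, something like the following could be used. This is only a rough sketch: it assumes the warning text is exactly the ath11k message quoted above, and the function name count_invalid_mcs is made up for illustration.

```python
import re
from collections import Counter

# Matches the ath11k warning quoted in this report; the trailing number is the MCS index.
INVALID_MCS_RE = re.compile(r"Received with invalid mcs in VHT mode (\d+)")


def count_invalid_mcs(dmesg_text: str) -> Counter:
    """Return how many times each invalid MCS index appears in the log text."""
    return Counter(int(m.group(1)) for m in INVALID_MCS_RE.finditer(dmesg_text))


# The two sample lines are copied from the report above.
sample = """\
ath11k_pci 0000:02:00.0: Received with invalid mcs in VHT mode 11
ath11k_pci 0000:02:00.0: Received with invalid mcs in VHT mode 10
"""
counts = count_invalid_mcs(sample)
print(dict(counts))
```

For a real log, read the attached file and pass its contents to count_invalid_mcs() instead of the sample string.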

Under these conditions, packet loss is high and network throughput is very poor.
--- 8.8.8.8 ping statistics ---
1897 packets transmitted, 1868 received, 1.52873% packet loss, time 1899829ms
rtt min/avg/max/mdev = 8.361/19.235/182.594/10.237 ms
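
For reference, the loss percentage that ping prints follows directly from the transmitted and received counts. A minimal check using the numbers above (the helper name packet_loss_percent is just for illustration):

```python
def packet_loss_percent(transmitted: int, received: int) -> float:
    """Share of transmitted packets that got no reply, in percent."""
    if transmitted <= 0:
        raise ValueError("transmitted must be positive")
    return (transmitted - received) / transmitted * 100.0


# Counts taken from the ping statistics line above: 29 of 1897 packets were lost.
loss = packet_loss_percent(1897, 1868)
print(f"{loss:.5f}%")  # prints 1.52873%
```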

Workaround:
Enabling 802.11ax in the router settings makes the problem go away.

From my side, it looks like the router is sending MCS values that are not valid in 802.11ac (VHT) mode, and this causes the problem.
However, many other devices (including a ThinkPad T14 Gen 2 with Intel AX201 Wi-Fi) work well with this router and this setting.
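
To illustrate the hypothesis: 802.11ac defines VHT MCS indices 0 through 9 and 1 to 8 spatial streams, so a frame reported as VHT with MCS 10 or 11, or with NSS 0, falls outside that range. The sketch below is only a model of such a range check, written to match the values in the log messages above. It is not the kernel's code, and the exact limits the driver and mac80211 apply may differ.

```python
# Illustrative limits, assumed from the 802.11ac (VHT) definition:
VHT_MCS_MAX = 9                  # VHT MCS indices run from 0 to 9
VHT_NSS_MIN, VHT_NSS_MAX = 1, 8  # 1 to 8 spatial streams


def is_valid_vht_rate(mcs: int, nss: int) -> bool:
    """Return True if (mcs, nss) is inside the standard VHT range."""
    return 0 <= mcs <= VHT_MCS_MAX and VHT_NSS_MIN <= nss <= VHT_NSS_MAX


# (10, 0) is the pair from the WARNING in the trace above;
# (9, 2) and (11, 2) are made-up values for comparison.
for mcs, nss in [(9, 2), (10, 0), (11, 2)]:
    print(f"MCS {mcs}, NSS {nss}: valid={is_valid_vht_rate(mcs, nss)}")
```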
Comment 1 Bagas Sanjaya 2023-12-17 05:39:29 UTC
(In reply to Evgenii Ilchenko from comment #0)

Can you check current mainline (v6.7-rc5)?
Comment 2 Evgenii Ilchenko 2023-12-17 22:43:54 UTC
(In reply to Bagas Sanjaya from comment #1)
> Can you check current mainline (v6.7-rc5)?
Of course.
At first glance it seemed to be better, but the problem is still reproducible.
1800 packets transmitted, 1715 received, 4.72222% packet loss, time 1803370ms

Dmesg:
https://drive.proton.me/urls/ANXKYVSSE0#1UAg2yv5RbvD
Ping with timestamps:
https://drive.proton.me/urls/0X1YVJ0QEG#HWiaF4ZtM2YZ

There appears to be a correlation between log messages (ath11k_pci ... Received with invalid mcs) and packet loss.
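
One rough way to quantify that correlation is to count how many lost pings have a warning within a short time window. The sketch below makes several assumptions that should be checked against the real files: the ping log was produced with `ping -D` (lines starting with an epoch timestamp in square brackets), pings were sent once per second, the warning timestamps have already been converted to epoch seconds, and icmp_seq does not wrap. The function names and sample values are hypothetical.

```python
import re
from bisect import bisect_left

# Assumed `ping -D` reply format: "[<epoch seconds>] 64 bytes from ...: icmp_seq=<n> ..."
PING_RE = re.compile(r"^\[(\d+\.\d+)\] .*icmp_seq=(\d+)")


def lost_ping_times(ping_lines, interval=1.0):
    """Estimate send times of lost pings from gaps in icmp_seq."""
    lost = []
    prev_ts = prev_seq = None
    for line in ping_lines:
        m = PING_RE.match(line)
        if not m:
            continue  # skip headers, statistics, and anything unrecognized
        ts, seq = float(m.group(1)), int(m.group(2))
        if prev_seq is not None:
            # Every sequence number skipped between two replies counts as lost.
            for missing in range(prev_seq + 1, seq):
                lost.append(prev_ts + (missing - prev_seq) * interval)
        prev_ts, prev_seq = ts, seq
    return lost


def losses_near_warnings(lost_times, warning_times, window=1.0):
    """Count lost pings with at least one warning within +/- window seconds."""
    warnings = sorted(warning_times)
    hits = 0
    for t in lost_times:
        i = bisect_left(warnings, t - window)
        if i < len(warnings) and warnings[i] <= t + window:
            hits += 1
    return hits


# Made-up sample data, only to show the call pattern.
ping_lines = [
    "[1702850000.10] 64 bytes from 8.8.8.8: icmp_seq=1 ttl=117 time=12.1 ms",
    "[1702850001.10] 64 bytes from 8.8.8.8: icmp_seq=2 ttl=117 time=11.8 ms",
    "[1702850004.10] 64 bytes from 8.8.8.8: icmp_seq=5 ttl=117 time=13.0 ms",
]
warning_times = [1702850002.4, 1702850010.0]

lost = lost_ping_times(ping_lines)
print(len(lost), losses_near_warnings(lost, warning_times, window=1.0))
```

A high ratio of hits to total losses would support the correlation, though it would not by itself show which one causes the other.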
Comment 3 Bagas Sanjaya 2023-12-21 03:55:08 UTC
(In reply to Evgenii Ilchenko from comment #0)

Forwarded to LKML [1].

[1]: https://lore.kernel.org/lkml/ZYO12aX3RpWzWuDs@archie.me/
