Bug 217310

Summary: mt7921e swiotlb buffer is full
Product: Drivers Reporter: Mike Lothian (mike)
Component: network-wireless Assignee: drivers_network-wireless (drivers_network-wireless)
Status: NEW ---    
Severity: normal CC: petr.tesarik.ext
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: Subsystem:
Regression: No Bisected commit-id:
Attachments: dmesg
dmesg swiotlb=noforce
dmesg latest

Description Mike Lothian 2023-04-08 12:03:01 UTC
Created attachment 304097 [details]
dmesg

Since the start of the 6.3 kernel cycle I've been having issues with my wifi when it's running at high speeds. When it happens I usually have to disable and re-enable wifi to get it going again.

When this happens I see this in the logs:

mt7921e 0000:05:00.0: swiotlb buffer is full (sz: 4096 bytes), total 32768 (slots), used 32160 (slots)
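
For readers triaging similar reports, the numbers in that message can be turned into sizes with a few lines of Python. This is only a sketch: it assumes each swiotlb slot is 2 KiB (IO_TLB_SHIFT = 11 in the kernel sources), and the regex and the `parse_swiotlb_full` helper are illustrative names, not part of any kernel tooling.

```python
import re

# Assumption: one swiotlb slot is 2 KiB (IO_TLB_SHIFT = 11).
SLOT_BYTES = 1 << 11

LINE = ("mt7921e 0000:05:00.0: swiotlb buffer is full (sz: 4096 bytes), "
        "total 32768 (slots), used 32160 (slots)")

PATTERN = re.compile(
    r"swiotlb buffer is full \(sz: (?P<size>\d+) bytes\), "
    r"total (?P<total>\d+) \(slots\), used (?P<used>\d+) \(slots\)"
)

def parse_swiotlb_full(line):
    """Extract request size and pool usage from a 'swiotlb buffer is full' line."""
    m = PATTERN.search(line)
    if m is None:
        return None
    size, total, used = (int(m.group(k)) for k in ("size", "total", "used"))
    return {
        "request_bytes": size,
        "total_mib": total * SLOT_BYTES / (1 << 20),
        "used_mib": used * SLOT_BYTES / (1 << 20),
        "free_slots": total - used,
        "used_fraction": used / total,
    }

info = parse_swiotlb_full(LINE)
print(info)
```

Under that assumption the pool is 64 MiB with 608 slots still free, even though a 4096-byte (two-slot) request failed. That may point at fragmentation or alignment constraints rather than plain exhaustion, but the message alone does not settle it.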

I'm running the latest firmware from https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git

And I don't see this on 6.2

I've attached my dmesg and the device is listed as:

05:00.0 Network controller [0280]: MEDIATEK Corp. MT7921 802.11ax PCI Express Wireless Network Adapter [14c3:7961]
Comment 1 Mike Lothian 2023-04-08 15:39:33 UTC
I increased the number of slots but still had the same issue. Running with swiotlb=noforce gave the following errors:

[    4.534440] ------------[ cut here ]------------
[    4.534502] mt7921e 0000:05:00.0: DMA addr 0x00000001010c9000+4096 overflow (mask ffffffff, bus limit 0).
[    4.534581] WARNING: CPU: 1 PID: 1 at kernel/dma/direct.h:105 dma_map_page_attrs+0x1d3/0x1f0
[    4.534651] Modules linked in:
[    4.534680] CPU: 1 PID: 1 Comm: swapper/0 Not tainted 6.3.0-rc5-tip+ #3574
[    4.536367] Hardware name: ASUSTeK COMPUTER INC. ROG Strix G513QY_G513QY/G513QY, BIOS G513QY.320 09/07/2022
[    4.536922] RIP: 0010:dma_map_page_attrs+0x1d3/0x1f0
[    4.537446] Code: 49 8b 8e 40 02 00 00 48 c7 c7 02 ff ed 82 48 89 c6 49 89 d8 4c 8b 09 48 89 e1 41 ff b6 50 02 00 00 e8 31 70 f7 ff 48 83 c4 08 <0f> 0b e9 5d ff ff ff 0f 0b 49 c7 c4 ff ff ff ff e9 4f ff ff ff 0f
[    4.538526] RSP: 0018:ffff8881009078a8 EFLAGS: 00010282
[    4.539062] RAX: 000000000000005d RBX: 0000000000001000 RCX: 0000000000000203
[    4.539599] RDX: ffff8881009077a8 RSI: 0000000000000002 RDI: 00000000ffffffff
[    4.540130] RBP: 0000000000000002 R08: 000000000001ffff R09: ffff888fddd00000
[    4.540662] R10: 000000000005fffd R11: 0000000000000004 R12: ffffffffffffffff
[    4.541184] R13: 00000001010c9000 R14: ffff88810102e0d0 R15: 0000000000000020
[    4.541701] FS:  0000000000000000(0000) GS:ffff888fde440000(0000) knlGS:0000000000000000
[    4.542223] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    4.542740] CR2: 0000000000000000 CR3: 00000000b340b000 CR4: 0000000000350ee0
[    4.543260] Call Trace:
[    4.543766]  <TASK>
[    4.544258]  ? __page_pool_alloc_pages_slow+0xfa/0x240
[    4.544753]  ? page_pool_alloc_frag+0x13a/0x1f0
[    4.545240]  ? mt76_dma_rx_fill+0x156/0x440
[    4.545723]  ? mt76_dma_init+0x115/0x130
[    4.546201]  ? mt7921_dma_init+0x2d0/0x2d0
[    4.546666]  ? mt7921_irq_tasklet+0x1b0/0x1b0
[    4.547129]  ? mt7921_dma_init+0x276/0x2d0
[    4.547581]  ? mt7921_pci_probe+0x323/0x370
[    4.548024]  ? pci_device_probe+0xba/0x150
[    4.548463]  ? really_probe+0x13a/0x310
[    4.548898]  ? __driver_probe_device+0x91/0xd0
[    4.549329]  ? driver_probe_device+0x19/0x160
[    4.549754]  ? __driver_attach+0xdf/0x1a0
[    4.550177]  ? driver_attach+0x20/0x20
[    4.550599]  ? bus_for_each_dev+0xfd/0x130
[    4.551024]  ? bus_add_driver+0x160/0x240
[    4.551450]  ? driver_register+0x62/0x100
[    4.551871]  ? __initstub__kmod_r8169__599_5369_rtl8169_pci_driver_init6+0x20/0x20
[    4.552304]  ? do_one_initcall+0x103/0x280
[    4.552730]  ? do_initcall_level+0x77/0xe0
[    4.553143]  ? do_initcalls+0x54/0x90
[    4.553553]  ? kernel_init_freeable+0xc2/0x110
[    4.553962]  ? rest_init+0xc0/0xc0
[    4.554370]  ? kernel_init+0x15/0x140
[    4.554773]  ? ret_from_fork+0x1f/0x30
[    4.555177]  </TASK>
[    4.555561] ---[ end trace 0000000000000000 ]---
[    4.556291] mt7921e 0000:05:00.0: Failed to get patch semaphore
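
The "DMA addr ... overflow" warning above can be checked by hand. The sketch below is a simplified version of the range check, written for illustration only: it treats "bus limit 0" as "no bus limit" and ignores everything else the kernel's own check may consider. The `dma_fits` name is hypothetical.

```python
def dma_fits(addr, size, mask):
    """True if the whole range [addr, addr + size) is addressable under mask."""
    return addr + size - 1 <= mask

# Values taken from the warning line in the trace above.
addr = 0x00000001010c9000
size = 4096
mask = 0xffffffff

fits = dma_fits(addr, size, mask)
excess = addr - (mask + 1)  # how far the start address lies above 4 GiB

print(fits)
print(hex(excess))
```

The page sits above the 4 GiB boundary while the device's DMA mask is 32 bits, so it cannot be mapped directly. With swiotlb=noforce no bounce buffer is available as a fallback, which would explain why the mapping fails here.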
Comment 2 Mike Lothian 2023-04-08 15:40:33 UTC
Created attachment 304098 [details]
dmesg swiotlb=noforce
Comment 3 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-04-09 07:59:01 UTC
Which 6.3 version exactly are you using? There were a few swiotlb fixes that were merged shortly before -rc5. And one fix was merged just yesterday:

https://git.kernel.org/torvalds/c/bbb73a103fbbed6f63cb738d3783261c4241b4b2

I wonder if those might help in your case as well, but I'm not a developer in the affected area, so this comment might be misleading. Still, it might be worth giving latest mainline or -rc6 (out later today) a quick try.
Comment 4 Mike Lothian 2023-04-09 15:58:40 UTC
Created attachment 304102 [details]
dmesg latest

Latest git tip
Comment 5 Mike Lothian 2023-04-09 15:59:21 UTC
I managed to work around it by setting swiotlb=131072
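
For reference, the swiotlb= boot parameter takes a slot count, not a byte size. A rough conversion, assuming 2 KiB per slot (IO_TLB_SHIFT = 11); the helper names are illustrative only:

```python
# Assumption: one swiotlb slot is 2 KiB (IO_TLB_SHIFT = 11).
SLOT_BYTES = 1 << 11

def slots_to_mib(slots):
    """Size in MiB of a swiotlb pool with the given number of slots."""
    return slots * SLOT_BYTES / (1 << 20)

def mib_to_slots(mib):
    """Number of slots to pass as swiotlb=<n> for a pool of `mib` MiB."""
    return mib * (1 << 20) // SLOT_BYTES

print(slots_to_mib(32768))   # pool size reported in the original message
print(slots_to_mib(131072))  # pool size with the workaround
print(mib_to_slots(256))
```

Under that assumption, swiotlb=131072 quadruples the pool from 64 MiB to 256 MiB.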
Comment 6 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-04-12 10:56:23 UTC
Sorry it took me so long, but I finally forwarded the issue to the developers:

https://lore.kernel.org/regressions/c63dab2b-906d-5383-39f9-b02e7d7d2659@leemhuis.info/T/#u
Comment 7 Petr Tesarik 2023-04-13 10:55:26 UTC
FWIW this commit changed the behaviour of page-sized (or bigger) bounce buffers:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/dma/swiotlb.c?id=39e7d2ab6ea9fd6b389091ec223d566934fe7be5

Before this commit, page-sized buffers were not always page-aligned (contrary to comments in the code); they only had to be properly aligned for the device.

OTOH it cannot be the root cause here, because this is an x86 system, i.e. the page size is 4K. Device alignment can be either 2K or smaller (in which case slots are never skipped), or 4K or greater (and hence page-aligned), so except for the above-mentioned and already fixed regression, nothing really changes for this system.
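
The case split in that argument can be spelled out as a small check. This is an illustration of the reasoning only, not the kernel's slot search; it assumes 2K swiotlb slots, 4K pages, and power-of-two device alignments, and both helper names are hypothetical.

```python
SLOT_BYTES = 2048  # swiotlb slot size; assumed from IO_TLB_SHIFT = 11
PAGE_BYTES = 4096  # page size on x86

def slots_per_step(device_align):
    """Slots between candidate start positions for a power-of-two alignment."""
    return max(1, device_align // SLOT_BYTES)

def always_page_aligned(device_align):
    """True if every address satisfying device_align is also page-aligned."""
    return device_align % PAGE_BYTES == 0

for shift in range(0, 17):
    align = 1 << shift
    step = slots_per_step(align)
    # Either no slot is skipped, or the alignment already implies page alignment.
    assert step == 1 or always_page_aligned(align)
    print(align, step, always_page_aligned(align))
```

For every power-of-two alignment from 1 byte to 64K, one of the two cases holds. No alignment skips slots without already being page-aligned.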