Bug 204053

Summary: RTW88 driver crashes system when memory is low
Product: Drivers
Reporter: jian-hong
Component: network-wireless
Assignee: drivers_network-wireless (drivers_network-wireless)
Status: CLOSED PATCH_ALREADY_AVAILABLE
Severity: normal
CC: stf_xl
Priority: P1
Hardware: x86-64
OS: Linux
Kernel Version: 5.2-rc7
Subsystem:
Regression: No
Bisected commit-id:
Attachments: dmesg for this bug

Description jian-hong 2019-07-03 07:09:38 UTC
Created attachment 283523: dmesg for this bug

We have an ASUS X512DK equipped with an AMD Ryzen 5 3500U and an RTL8822BE WiFi chip. I tested kernel 5.2-rc7 on it. The new Realtek module, rtw_pci, is loaded for WiFi:

02:00.0 Network controller [0280]: Realtek Semiconductor Co., Ltd. RTL8822BE 802.11a/b/g/n/ac WiFi adapter [10ec:b822]
	Subsystem: AzureWave RTL8822BE 802.11a/b/g/n/ac WiFi adapter [1a3b:2950]
	Flags: bus master, fast devsel, latency 0, IRQ 55
	I/O ports at e000 [size=256]
	Memory at fe900000 (64-bit, non-prefetchable) [size=64K]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable- Count=1/1 Maskable- 64bit+
	Capabilities: [70] Express Endpoint, MSI 00
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [148] Device Serial Number 00-e0-4c-ff-fe-b8-22-01
	Capabilities: [158] Latency Tolerance Reporting
	Capabilities: [160] L1 PM Substates
	Kernel driver in use: rtw_pci
	Kernel modules: rtwpci

To create memory pressure, I play many YouTube videos at the same time in Chromium. The system appears to stall once free memory drops below 60 MB. If I keep opening more videos to increase the memory pressure, the system may hang completely. The hang is caused by rtw88:

[ 2356.580239] rx routine starvation
[ 2356.580292] WARNING: CPU: 7 PID: 9871 at drivers/net/wireless/realtek/rtw88/pci.c:822 rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci]
[ 2356.580294] Modules linked in: efi_pstore rfcomm bnep rtwpci rtw88 btusb btrtl btbcm btintel mac80211 bluetooth cfg80211 amdgpu gpu_sched ttm efivarfs
[ 2356.580309] CPU: 7 PID: 9871 Comm: chromium-browse Tainted: G        W         5.2.0-rc7 #3
[ 2356.580310] Hardware name: ASUSTeK COMPUTER INC. VivoBook_ASUSLaptop X512DK_X512DK/X512DK, BIOS X512DK.200 01/04/2019
[ 2356.580313] RIP: 0010:rtw_pci_rx_isr.constprop.25+0x35a/0x370 [rtwpci]
[ 2356.580316] Code: 89 f7 eb a2 44 8b 6c 24 3c 45 03 ac 24 34 02 00 00 41 29 c5 44 89 6c 24 04 e9 31 fd ff ff 48 c7 c7 db f1 4b c0 e8 ed 23 9a d2 <0f> 0b e9 b3 fe ff ff e8 9a 21 9a d2 66 2e 0f 1f 84 00 00 00 00 00
[ 2356.580317] RSP: 0000:ffffbccfc1ae4e10 EFLAGS: 00010082
[ 2356.580319] RAX: 0000000000000000 RBX: ffff9ef12f5a15e0 RCX: 0000000000000006
[ 2356.580320] RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffff9ef134bd63d0
[ 2356.580321] RBP: 000000000000012e R08: 000000000000064e R09: 0000000000000000
[ 2356.580322] R10: 0000000000000000 R11: 0000000000000001 R12: ffff9ef12f5a3290
[ 2356.580322] R13: ffff9ef12f5a3c00 R14: ffff9eef058daa00 R15: ffff9eef3881ab00
[ 2356.580326] FS:  00007ffac2655440(0000) GS:ffff9ef134bc0000(0000) knlGS:0000000000000000
[ 2356.580327] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2356.580328] CR2: 00005592c83cfed0 CR3: 0000000085b18000 CR4: 00000000003406e0
[ 2356.580329] Call Trace:
[ 2356.580334]  <IRQ>
[ 2356.580340]  rtw_pci_interrupt_handler+0x102/0x190 [rtwpci]
[ 2356.580350]  __handle_irq_event_percpu+0x35/0x160
[ 2356.580353]  handle_irq_event_percpu+0x2b/0x70
[ 2356.580355]  handle_irq_event+0x22/0x3f
[ 2356.580357]  handle_fasteoi_irq+0x81/0x120
[ 2356.580361]  handle_irq+0x11/0x20
[ 2356.580367]  do_IRQ+0x3c/0xc0
[ 2356.580370]  common_interrupt+0xf/0xf
[ 2356.580371]  </IRQ>
[ 2356.580376] RIP: 0010:filemap_map_pages+0xaa/0x340
[ 2356.580378] Code: 00 00 49 81 ff 06 04 00 00 0f 84 e2 00 00 00 49 81 ff 02 04 00 00 0f 84 75 01 00 00 41 f6 c7 01 0f 85 cb 00 00 00 49 8b 57 08 <48> 8d 42 ff 83 e2 01 49 0f 44 c7 48 8b 00 a8 01 0f 85 b1 00 00 00
[ 2356.580378] RSP: 0000:ffffbccfc9abfdd0 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffdb
[ 2356.580379] RAX: 000000000000003f RBX: 0000000000006abf RCX: ffff9ef0ee29e6c8
[ 2356.580380] RDX: fffffcc9c646e1c8 RSI: 0000000000000000 RDI: ffff9ef0ee29e6c8
[ 2356.580381] RBP: ffff9ef0ee0116e8 R08: fffffcc9c1f6bec0 R09: 0000000000006abf
[ 2356.580382] R10: 0000000000000000 R11: 0000000000000000 R12: ffffbccfc9abfe58
[ 2356.580382] R13: 0000000000006ab0 R14: ffff9eee9f925800 R15: fffffcc9c1f6bec0
[ 2356.580388]  __handle_mm_fault+0x935/0xae0
[ 2356.580393]  __do_page_fault+0x242/0x4b0
[ 2356.580395]  ? page_fault+0x8/0x30
[ 2356.580396]  page_fault+0x1e/0x30
[ 2356.580398] RIP: 0033:0x5592c83cfed0
[ 2356.580405] Code: Bad RIP value.
[ 2356.580405] RSP: 002b:00007ffe2eea3cb8 EFLAGS: 00010202
[ 2356.580406] RAX: 00000b1a089c8908 RBX: 00000b1a08e17a10 RCX: 00000b1a089c8900
[ 2356.580407] RDX: 00000b1a089c8800 RSI: 0000384e468cbab8 RDI: 00000b1a08e17a10
[ 2356.580408] RBP: 00007ffe2eea3d40 R08: 00005592c7b2c600 R09: 00005592c7ca56d0
[ 2356.580409] R10: 0000000000000800 R11: 000000009c8edc09 R12: 000000000000092c
[ 2356.580410] R13: 0000384e468cbab8 R14: 00005592c83cfed0 R15: 00000b1a08590380
[ 2356.580411] ---[ end trace 7a976d7d82e50407 ]---
[ 2356.581343] rtw_pci 0000:02:00.0: pci bus timeout, check dma status
[ 2356.581353] skbuff: skb_over_panic: text:00000000091b6e66 len:415 put:415 head:00000000d2880c6f data:000000007a02b1ea tail:0x1df end:0xc0 dev:<NULL>
[ 2356.581372] ------------[ cut here ]------------
[ 2356.581373] kernel BUG at net/core/skbuff.c:105!
[ 2356.581380] invalid opcode: 0000 [#1] SMP NOPTI
[ 2356.581383] CPU: 7 PID: 9871 Comm: chromium-browse Tainted: G        W         5.2.0-rc7 #3
[ 2356.581384] Hardware name: ASUSTeK COMPUTER INC. VivoBook_ASUSLaptop X512DK_X512DK/X512DK, BIOS X512DK.200 01/04/2019
[ 2356.581390] RIP: 0010:skb_panic+0x43/0x45
[ 2356.581393] Code: 4f 70 50 8b 87 bc 00 00 00 50 8b 87 b8 00 00 00 50 ff b7 c8 00 00 00 4c 8b 8f c0 00 00 00 48 c7 c7 d8 73 10 94 e8 60 7f 80 ff <0f> 0b 48 8b 14 24 48 c7 c1 a0 dc f2 93 e8 ab ff ff ff 48 c7 c6 e0
[ 2356.581393] RSP: 0000:ffffbccfc1ae4de0 EFLAGS: 00010046
[ 2356.581395] RAX: 0000000000000088 RBX: ffff9ef12f5a15e0 RCX: 0000000000000006
[ 2356.581396] RDX: 0000000000000000 RSI: 0000000000000092 RDI: ffff9ef134bd63d0
[ 2356.581396] RBP: 000000000000012e R08: 000000000000067f R09: 0000000000000000
[ 2356.581397] R10: 0000000000000000 R11: 0000000000000001 R12: ffff9ef12f5a3290
[ 2356.581398] R13: ffff9ef12f5a3c00 R14: ffff9eef058daa00 R15: ffff9eef3881b400
[ 2356.581399] FS:  00007ffac2655440(0000) GS:ffff9ef134bc0000(0000) knlGS:0000000000000000
[ 2356.581399] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2356.581400] CR2: 00005592c9803870 CR3: 0000000085b18000 CR4: 00000000003406e0
[ 2356.581401] Call Trace:
[ 2356.581403]  <IRQ>
[ 2356.581406]  skb_put.cold.91+0x10/0x10
[ 2356.581410]  rtw_pci_rx_isr.constprop.25+0x2b8/0x370 [rtwpci]
[ 2356.581415]  rtw_pci_interrupt_handler+0x102/0x190 [rtwpci]
[ 2356.581420]  __handle_irq_event_percpu+0x35/0x160
[ 2356.581422]  handle_irq_event_percpu+0x2b/0x70
[ 2356.581424]  handle_irq_event+0x22/0x3f
[ 2356.581426]  handle_fasteoi_irq+0x81/0x120
[ 2356.581430]  handle_irq+0x11/0x20
[ 2356.581435]  do_IRQ+0x3c/0xc0
[ 2356.581437]  common_interrupt+0xf/0xf
[ 2356.581438]  </IRQ>
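
The skb_over_panic line in the trace above already carries the numbers needed to see how far the receive path wrote past the buffer. The following is a minimal sketch for reading them, not part of the driver or the fix. It assumes that tail and end are offsets from head (as on 64-bit kernels) and that tail is printed after skb_put() has already advanced it. The function names are hypothetical.

```python
import re

# The skb_over_panic line copied from the dmesg excerpt above.
LOG_LINE = (
    "skbuff: skb_over_panic: text:00000000091b6e66 len:415 put:415 "
    "head:00000000d2880c6f data:000000007a02b1ea tail:0x1df end:0xc0 dev:<NULL>"
)


def parse_skb_panic(line):
    """Extract len, put, tail and end from an skb_over_panic message."""
    m = re.search(
        r"len:(\d+) put:(\d+) .*tail:(0x[0-9a-f]+) end:(0x[0-9a-f]+)", line
    )
    if m is None:
        raise ValueError("not an skb_over_panic line")
    return {
        "len": int(m.group(1)),        # skb->len after the put
        "put": int(m.group(2)),        # bytes requested from skb_put()
        "tail": int(m.group(3), 16),   # assumed: offset from head, after the put
        "end": int(m.group(4), 16),    # assumed: offset from head
    }


def overflow_bytes(fields):
    """Bytes by which the tail offset passed the end of the buffer."""
    return fields["tail"] - fields["end"]


fields = parse_skb_panic(LOG_LINE)
print("tail before put:", fields["tail"] - fields["put"])
print("overflow:", overflow_bytes(fields))
```

Under those assumptions, the tail sat at offset 64 before the put, the buffer ended at offset 192, and the 415-byte put would have overrun it by 287 bytes. In other words, the skb being filled was much smaller than the received frame.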

This bug is easier to reproduce if PSI (pressure stall information) is disabled in the kernel config.
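
When reproducing, it may help to know when the machine has entered the low-memory range described above. The following is a minimal sketch, assuming that "free memory" corresponds to the MemFree field of /proc/meminfo and using the roughly 60 MB figure from this report as the threshold. The helper names are hypothetical.

```python
import os

# Threshold taken from the report: trouble starts below about 60 MB free.
THRESHOLD_KIB = 60 * 1024


def mem_free_kib(meminfo_text):
    """Return the MemFree value in KiB from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemFree:"):
            # Line format: "MemFree:   123456 kB"
            return int(line.split()[1])
    raise ValueError("MemFree not found")


def under_pressure(meminfo_text, threshold_kib=THRESHOLD_KIB):
    """True when free memory is below the threshold."""
    return mem_free_kib(meminfo_text) < threshold_kib


# Only read the real file where it exists (Linux).
if os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as f:
        text = f.read()
    print("MemFree (KiB):", mem_free_kib(text))
    print("below threshold:", under_pressure(text))
```

Running this in a loop while opening more videos shows when the system crosses the threshold, which is the point to start watching dmesg for the "rx routine starvation" warning.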

Comment 1 Stanislaw Gruszka 2019-07-08 07:22:55 UTC
Please report this on the linux-wireless@vger.kernel.org mailing list and
cc: Yan-Hsuan Chuang <yhchuang@realtek.com>.

Comment 2 jian-hong 2019-07-08 07:34:55 UTC
I have sent the patch upstream and cc'ed Yan-Hsuan Chuang:
https://patchwork.kernel.org/patch/11034617/