Bug 215496
Summary: | dmesg flood: realtek wi-fi: rtw_8822ce firmware failed to leave lps state | ||
---|---|---|---|
Product: | Drivers | Reporter: | Roman Evstifeev (someuniquename) |
Component: | network-wireless | Assignee: | drivers_network-wireless (drivers_network-wireless) |
Status: | NEW --- | ||
Severity: | normal | CC: | fw, ketchupfelipe, pbrobinson, pkshih, tom, vedran |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 5.16.0 | Subsystem: | |
Regression: | No | Bisected commit-id: |
Description
Roman Evstifeev
2022-01-15 08:58:50 UTC
lspci: 04:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8822CE 802.11ac PCIe Wireless Network Adapter Subsystem: Lenovo Device c123 Flags: bus master, fast devsel, latency 0, IRQ 79, IOMMU group 12 I/O ports at 2000 [size=256] Memory at c3600000 (64-bit, non-prefetchable) [size=64K] Capabilities: [40] Power Management version 3 Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+ Capabilities: [70] Express Endpoint, MSI 00 Capabilities: [100] Advanced Error Reporting Capabilities: [148] Device Serial Number 00-e0-4c-ff-fe-c8-22-01 Capabilities: [158] Latency Tolerance Reporting Capabilities: [160] L1 PM Substates Kernel driver in use: rtw_8822ce Kernel modules: rtw88_8822ce Originally reported here: https://github.com/lwfinger/rtw88/issues/61 CC'ing Ping-Ke Shih Please take a look. Other Realtek employees are not present in this bugzilla :( Please help to collect log by setting module parameter rtw_debug_mask=0x10 of rtw_core.ko. Then, stay each state you mentioned at least 1 minute. If possible, please mark the timeline in log. In state (2), "Message flood stopped." means no weird messages, even no "failed to send h2c command", right? I would also like to know if it still works when you see these messages. I think the messages are only appeared if your computer stays idle, so please try to make it busy, like flooding ping (ping -i 0.01). Then, check if the messages again. > setting module parameter rtw_debug_mask=0x10 of rtw_core.ko.
How can i set this parameter? I have to compile driver myself?
> In state (2), "Message flood stopped." means no weird messages, even no
> "failed to send h2c command", right?
Yes, this is correct.
(In reply to Roman Evstifeev from comment #5) > > setting module parameter rtw_debug_mask=0x10 of rtw_core.ko. > > How can i set this parameter? I have to compile driver myself? Similar to add options rtw88_pci disable_aspm=1 to my /etc/modprobe.d/local.conf. Add another option to rtw88_core. Or, you can set and check the value via /sys/modules/rtw88_core/parameter/debug_mask. If CONFIG_RTW88_DEBUG is off in your existing kernel, you have to compile driver. > I would also like to know if it still works when you see these messages. Yes, it works, but at least web browsing seems to be slower. > I think the messages are only appeared if your computer stays idle, so please try to make it busy, like flooding ping (ping -i 0.01). Then, check if the messages again. With ping -i 0.01 running, the "firmware failed to leave lps state" does not appear, but "failed to send h2c command" message still does. I have enabled debug_mask in my /etc/modprobe.d/local.conf , and checked it with: # cat /sys/module/rtw88_core/parameters/debug_mask 16 here is the full logs, as i stayed in each mode for ~1 minute: https://gist.github.com/Fak3/172ac902dbab898d3cb580ffcd441ffe The dmesg flood starts when battery charge is below 83%. With charge above this level, I have not encountered this issue yet. (In reply to Roman Evstifeev from comment #8) > > # cat /sys/module/rtw88_core/parameters/debug_mask > 16 > > here is the full logs, as i stayed in each mode for ~1 minute: > https://gist.github.com/Fak3/172ac902dbab898d3cb580ffcd441ffe I think your kernel turns on dynamic_debug, so please do below commands: sudo bash -c 'echo -n "module rtw88 +p" > /sys/kernel/debug/dynamic_debug/control' sudo bash -c 'echo -n "module rtwpci +p" > /sys/kernel/debug/dynamic_debug/control' Then, we can see H2C content, like: rtw_8822ce 0000:03:00.0: send H2C content 29000042 00000002 (In reply to Roman Evstifeev from comment #9) > The dmesg flood starts when battery charge is below 83%. With charge above > this level, I have not encountered this issue yet. I wonder if the platform turns off power_save when battery is over 83%. Please check power_save status in both situations battery is low and fully charged. > sudo iw wlan0 get power_save > sudo bash -c 'echo -n "module rtw88 +p" >
> /sys/kernel/debug/dynamic_debug/control'
> sudo bash -c 'echo -n "module rtwpci +p" >
> /sys/kernel/debug/dynamic_debug/control'
Just did that, but the error from rtw_8822ce looks the same as before, and does not have any additional info.
> I wonder if the platform turns off power_save when battery is over 83%.
> Please check power_save status in both situations battery is low and fully
> charged.
Checked that, and powersave is always on, regardless of the charge level:
# iw wlp4s0 get power_save
Power save: on
May be relevant? Both of us, who encountered the issue, use bluetooth mouse. (In reply to Roman Evstifeev from comment #12) > > Just did that, but the error from rtw_8822ce looks the same as before, and > does not have any additional info. Please check if CONFIG_RTW88_DEBUG turn off by your kernel build. Without additional info, it's hard to me to dig it further. (In reply to Roman Evstifeev from comment #14) > May be relevant? Both of us, who encountered the issue, use bluetooth mouse. Yes, it is possible. If you disconnect bluetooth mouse. Does it work well without messages? Another try is to connect wifi to a 5G AP, because bluetooth is working on 2G that needs additional mechanism to be coexistent with 2G AP. > Please check if CONFIG_RTW88_DEBUG turn off by your kernel build
localhost:/home/u1 # cat /proc/config.gz | gunzip | grep CONFIG_DYNAMIC_DEBUG
CONFIG_DYNAMIC_DEBUG=y
CONFIG_DYNAMIC_DEBUG_CORE=y
localhost:/home/u1 # cat /proc/config.gz | gunzip | grep RTW
CONFIG_RTW88=m
CONFIG_RTW88_CORE=m
CONFIG_RTW88_PCI=m
CONFIG_RTW88_8822B=m
CONFIG_RTW88_8822C=m
CONFIG_RTW88_8723D=m
CONFIG_RTW88_8821C=m
CONFIG_RTW88_8822BE=m
CONFIG_RTW88_8822CE=m
CONFIG_RTW88_8723DE=m
CONFIG_RTW88_8821CE=m
# CONFIG_RTW88_DEBUG is not set
# CONFIG_RTW88_DEBUGFS is not set
I switched to kernel-debug, and now I see bunch more debug output like
> rtw_8822ce 0000:04:00.0: send H2C content 32000042 00000002
Will try to reproduce the bug, and report.
Here you go: https://gist.github.com/Fak3/81424c5786bacf6f3cdc331557c5a09d > If you disconnect bluetooth mouse. Does it work well without messages? No, disconnecting mouse and "systemctl stop bluetooth" does not fix the issue. But If I put laptop to sleep and then wake up, the messages stop to come in. It looks like firmware command queue get stuck after "firmware failed to leave lps state", so driver can't send out more commands, then kernel log shows "failed to send h2c command" periodically. But, firmware can still send something to driver, like "recv C2H, id=0x0c, seq=0x3a, len=8", so firmware doesn't hang. Since next week is the Lunar New Year, many people are offline. I will track this issue weeks later. Another person reproduced the bug: https://github.com/lwfinger/rtw88/issues/61#issuecomment-1023504711 And he does not use bluetooth, so it is irrelevant. The people tried: [connection] wifi.powersave = 2 Please try if this also works to you. And, please also try below two ko options, and do cold reboot. rtw88_core disable_lps_deep=y rtw88_pci disable_aspm=y > [connection]
> wifi.powersave = 2
>
> And, please also try below two ko options, and do cold reboot.
>
> rtw88_core disable_lps_deep=y
> rtw88_pci disable_aspm=y
Tried both methods separately, they both worked, no dmesg flood detected.
The method 2 can save more power than method 1, so I suggest you use method 2. While debugging this problem, I started a ping to a reliable host and surprisingly, this kept the wifi link up for six hours without data loss (streaming audio did not fail) nor ICMP loss. I ultimately added the NetworkManager 'wifi.powersave = 2' fix and problem seems solved. This message also appears on resuming from suspend with rtw_8822bu, but it seems that using disable_lps_deep=y works. |