Bug 215496 - dmesg flood: realtek wi-fi: rtw_8822ce firmware failed to leave lps state
Summary: dmesg flood: realtek wi-fi: rtw_8822ce firmware failed to leave lps state
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: network-wireless (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: drivers_network-wireless@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-01-15 08:58 UTC by Roman Evstifeev
Modified: 2024-09-07 07:29 UTC (History)
6 users (show)

See Also:
Kernel Version: 5.16.0
Subsystem:
Regression: No
Bisected commit-id:


Attachments

Description Roman Evstifeev 2022-01-15 08:58:50 UTC
When laptop charger is not, connected, dmesg is flooded with errors from realtek driver

Dmesg log init:
[    3.516129] rtw_8822ce 0000:04:00.0: enabling device (0000 -> 0003)
[    3.523097] rtw_8822ce 0000:04:00.0: Firmware version 9.9.4, H2C version 15
[    3.530951] rtw_8822ce 0000:04:00.0: Firmware version 9.9.10, H2C version 15
[    3.999011] rtw_8822ce 0000:04:00.0 wlp4s0: renamed from wlan0
[    5.324088] rtw_8822ce 0000:04:00.0: start vif dc:e9:94:7e:5f:85 on port 0
[    5.325431] rtw_8822ce 0000:04:00.0: stop vif dc:e9:94:7e:5f:85 on port 0
[    5.561473] rtw_8822ce 0000:04:00.0: start vif ee:c7:7c:0e:6b:3c on port 0
[    9.847260] rtw_8822ce 0000:04:00.0: stop vif ee:c7:7c:0e:6b:3c on port 0
[   10.085443] rtw_8822ce 0000:04:00.0: start vif dc:e9:94:7e:5f:85 on port 0
[   14.738024] rtw_8822ce 0000:04:00.0: sta 10:fe:ed:51:2a:94 joined with macid 0

Repeated messages are:
[   49.030634] rtw_8822ce 0000:04:00.0: firmware failed to leave lps state
[   49.033828] rtw_8822ce 0000:04:00.0: failed to send h2c command
[   49.036994] rtw_8822ce 0000:04:00.0: failed to send h2c command

1) I connected the charger. The "firmware failed to leave lps state" message stopped to appear, but "failed to send h2c command" message still continuously flooded my dmesg
2) I put laptop to sleep, then wake up. Message flood stopped.
3) I disconnected the charger, and after few minutes message started to come in again.
Comment 1 Roman Evstifeev 2022-01-15 09:00:55 UTC
lspci:
04:00.0 Network controller: Realtek Semiconductor Co., Ltd. RTL8822CE 802.11ac PCIe Wireless Network Adapter
        Subsystem: Lenovo Device c123
        Flags: bus master, fast devsel, latency 0, IRQ 79, IOMMU group 12
        I/O ports at 2000 [size=256]
        Memory at c3600000 (64-bit, non-prefetchable) [size=64K]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
        Capabilities: [70] Express Endpoint, MSI 00
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [148] Device Serial Number 00-e0-4c-ff-fe-c8-22-01
        Capabilities: [158] Latency Tolerance Reporting
        Capabilities: [160] L1 PM Substates
        Kernel driver in use: rtw_8822ce
        Kernel modules: rtw88_8822ce
Comment 2 Roman Evstifeev 2022-01-15 09:02:41 UTC
Originally reported here: https://github.com/lwfinger/rtw88/issues/61
Comment 3 Artem S. Tashkinov 2022-01-15 11:24:20 UTC
CC'ing Ping-Ke Shih

Please take a look.

Other Realtek employees are not present in this bugzilla :(
Comment 4 Ping-Ke Shih 2022-01-17 03:18:25 UTC
Please help to collect log by setting module parameter rtw_debug_mask=0x10 of rtw_core.ko. Then, stay each state you mentioned at least 1 minute.
If possible, please mark the timeline in log.

In state (2), "Message flood stopped." means no weird messages, even no "failed to send h2c command", right?

I would also like to know if it still works when you see these messages.
I think the messages are only appeared if your computer stays idle, so
please try to make it busy, like flooding ping (ping -i 0.01). Then,
check if the messages again.
Comment 5 Roman Evstifeev 2022-01-17 07:00:05 UTC
> setting module parameter rtw_debug_mask=0x10 of rtw_core.ko.

How can i set this parameter? I have to compile driver myself?
Comment 6 Roman Evstifeev 2022-01-17 07:00:40 UTC
> In state (2), "Message flood stopped." means no weird messages, even no
> "failed to send h2c command", right?

Yes, this is correct.
Comment 7 Ping-Ke Shih 2022-01-17 07:09:53 UTC
(In reply to Roman Evstifeev from comment #5)
> > setting module parameter rtw_debug_mask=0x10 of rtw_core.ko.
> 
> How can i set this parameter? I have to compile driver myself?

Similar to add options rtw88_pci disable_aspm=1 to my /etc/modprobe.d/local.conf.
Add another option to rtw88_core.
Or, you can set and check the value via /sys/modules/rtw88_core/parameter/debug_mask.


If CONFIG_RTW88_DEBUG is off in your existing kernel, you have to compile driver.
Comment 8 Roman Evstifeev 2022-01-17 08:30:32 UTC
> I would also like to know if it still works when you see these messages.

Yes, it works, but at least web browsing seems to be slower.


> I think the messages are only appeared if your computer stays idle, so
please try to make it busy, like flooding ping (ping -i 0.01). Then,
check if the messages again.

With ping -i 0.01 running, the "firmware failed to leave lps state" does not appear, but "failed to send h2c command" message still does.

I have enabled debug_mask in my /etc/modprobe.d/local.conf , and checked it with:

# cat /sys/module/rtw88_core/parameters/debug_mask 
16

here is the full logs, as i stayed in each mode for ~1 minute: https://gist.github.com/Fak3/172ac902dbab898d3cb580ffcd441ffe
Comment 9 Roman Evstifeev 2022-01-17 08:36:54 UTC
The dmesg flood starts when battery charge is below 83%. With charge above this level, I have not encountered this issue yet.
Comment 10 Ping-Ke Shih 2022-01-18 01:16:09 UTC
(In reply to Roman Evstifeev from comment #8)
> 
> # cat /sys/module/rtw88_core/parameters/debug_mask 
> 16
> 
> here is the full logs, as i stayed in each mode for ~1 minute:
> https://gist.github.com/Fak3/172ac902dbab898d3cb580ffcd441ffe

I think your kernel turns on dynamic_debug, so please do below commands:

sudo bash -c 'echo -n "module rtw88 +p" > /sys/kernel/debug/dynamic_debug/control'
sudo bash -c 'echo -n "module rtwpci +p" > /sys/kernel/debug/dynamic_debug/control'

Then, we can see H2C content, like:

rtw_8822ce 0000:03:00.0: send H2C content 29000042 00000002
Comment 11 Ping-Ke Shih 2022-01-18 01:19:32 UTC
(In reply to Roman Evstifeev from comment #9)
> The dmesg flood starts when battery charge is below 83%. With charge above
> this level, I have not encountered this issue yet.

I wonder if the platform turns off power_save when battery is over 83%.
Please check power_save status in both situations battery is low and fully charged.

> sudo iw wlan0 get power_save
Comment 12 Roman Evstifeev 2022-01-18 09:25:00 UTC
> sudo bash -c 'echo -n "module rtw88 +p" >
> /sys/kernel/debug/dynamic_debug/control'
> sudo bash -c 'echo -n "module rtwpci +p" >
> /sys/kernel/debug/dynamic_debug/control'

Just did that, but the error from rtw_8822ce looks the same as before, and does not have any additional info.
Comment 13 Roman Evstifeev 2022-01-18 10:42:51 UTC
> I wonder if the platform turns off power_save when battery is over 83%.
> Please check power_save status in both situations battery is low and fully
> charged.

Checked that, and powersave is always on, regardless of the charge level:

# iw wlp4s0 get power_save
Power save: on
Comment 14 Roman Evstifeev 2022-01-20 12:00:56 UTC
May be relevant? Both of us, who encountered the issue, use bluetooth mouse.
Comment 15 Ping-Ke Shih 2022-01-21 00:36:36 UTC
(In reply to Roman Evstifeev from comment #12)
> 
> Just did that, but the error from rtw_8822ce looks the same as before, and
> does not have any additional info.

Please check if CONFIG_RTW88_DEBUG turn off by your kernel build.

Without additional info, it's hard to me to dig it further.
Comment 16 Ping-Ke Shih 2022-01-21 00:40:57 UTC
(In reply to Roman Evstifeev from comment #14)
> May be relevant? Both of us, who encountered the issue, use bluetooth mouse.

Yes, it is possible.
If you disconnect bluetooth mouse. Does it work well without messages?

Another try is to connect wifi to a 5G AP, because bluetooth is working on 2G that needs additional mechanism to be coexistent with 2G AP.
Comment 17 Roman Evstifeev 2022-01-21 05:16:53 UTC
> Please check if CONFIG_RTW88_DEBUG turn off by your kernel build

localhost:/home/u1 # cat /proc/config.gz | gunzip | grep CONFIG_DYNAMIC_DEBUG
CONFIG_DYNAMIC_DEBUG=y
CONFIG_DYNAMIC_DEBUG_CORE=y
localhost:/home/u1 # cat /proc/config.gz | gunzip | grep RTW
CONFIG_RTW88=m
CONFIG_RTW88_CORE=m
CONFIG_RTW88_PCI=m
CONFIG_RTW88_8822B=m
CONFIG_RTW88_8822C=m
CONFIG_RTW88_8723D=m
CONFIG_RTW88_8821C=m
CONFIG_RTW88_8822BE=m
CONFIG_RTW88_8822CE=m
CONFIG_RTW88_8723DE=m
CONFIG_RTW88_8821CE=m
# CONFIG_RTW88_DEBUG is not set
# CONFIG_RTW88_DEBUGFS is not set
Comment 18 Roman Evstifeev 2022-01-21 05:59:46 UTC
I switched to kernel-debug, and now I see bunch more debug output like

> rtw_8822ce 0000:04:00.0: send H2C content 32000042 00000002

Will try to reproduce the bug, and report.
Comment 19 Roman Evstifeev 2022-01-21 06:18:47 UTC
Here you go: https://gist.github.com/Fak3/81424c5786bacf6f3cdc331557c5a09d

> If you disconnect bluetooth mouse. Does it work well without messages?

No, disconnecting mouse and "systemctl stop bluetooth" does not fix the issue. 

But If I put laptop to sleep and then wake up, the messages stop to come in.
Comment 20 Ping-Ke Shih 2022-01-26 06:45:36 UTC
It looks like firmware command queue get stuck after "firmware failed to leave lps state", so driver can't send out more commands, then kernel log shows "failed to send h2c command" periodically.

But, firmware can still send something to driver, like "recv C2H, id=0x0c, seq=0x3a, len=8", so firmware doesn't hang.

Since next week is the Lunar New Year, many people are offline. 
I will track this issue weeks later.
Comment 21 Roman Evstifeev 2022-02-09 13:38:43 UTC
Another person reproduced the bug: https://github.com/lwfinger/rtw88/issues/61#issuecomment-1023504711

And he does not use bluetooth, so it is irrelevant.
Comment 22 Ping-Ke Shih 2022-02-10 02:03:49 UTC
The people tried:
  [connection]
  wifi.powersave = 2

Please try if this also works to you.


And, please also try below two ko options, and do cold reboot.

rtw88_core
  disable_lps_deep=y
rtw88_pci
  disable_aspm=y
Comment 23 Roman Evstifeev 2022-02-11 07:02:11 UTC
>   [connection]
>   wifi.powersave = 2
> 
> And, please also try below two ko options, and do cold reboot.
> 
> rtw88_core disable_lps_deep=y
> rtw88_pci disable_aspm=y

Tried both methods separately, they both worked, no dmesg flood detected.
Comment 24 Ping-Ke Shih 2022-02-11 08:45:10 UTC
The method 2 can save more power than method 1, so I suggest you use method 2.
Comment 25 tom jennings 2024-06-02 00:23:18 UTC
While debugging this problem, I started a ping to a reliable host and surprisingly, this kept the wifi link up for six hours without data loss (streaming audio did not fail) nor ICMP loss.


I ultimately added the NetworkManager 'wifi.powersave = 2' fix and problem seems solved.
Comment 26 Vedran Miletić 2024-09-07 07:29:08 UTC
This message also appears on resuming from suspend with rtw_8822bu, but it seems that using disable_lps_deep=y works.

Note You need to log in before you can comment on or make changes to this bug.