Bug 215881

Summary: ath11k: QCA6390: hardware rfkill enablement can break suspend (S3)
Product: Drivers Reporter: Tyler S (stachecki.tyler)
Component: network-wireless Assignee: Kalle Valo (kvalo)
Status: RESOLVED CODE_FIX    
Severity: high CC: kvalo
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 5.15.35 + patches from kvalo/ath.git Subsystem:
Regression: No Bisected commit-id:
Attachments: dmesg while running kvalo-ath/master and entering/exiting S3

Description Tyler S 2022-04-24 16:12:33 UTC
Sorry, I've known about this breakage for a while, but only got the chance to bisect it this weekend.

In summary, the below patch results in at least one platform (an L390 Yoga Thinkpad) *immediately* powering back on after it enters ACPI S3:

    ath11k: add support for hardware rfkill for QCA6390
    
    When hardware rfkill is enabled in the firmware it will report the
    capability via using WMI_SYS_CAP_INFO_RFKILL bit in the WMI_SERVICE_READY
    event to the host. ath11k will check the capability, and if it is enabled then
    ath11k will set the GPIO information to firmware using WMI_PDEV_SET_PARAM. When
    the firmware detects hardware rfkill is enabled by the user, it will report it
    via WMI_RFKILL_STATE_CHANGE_EVENTID. Once ath11k receives the event it will
    send wmi command WMI_PDEV_SET_PARAM to the firmware and also notifies cfg80211.
    
    This only enable rfkill feature for QCA6390, rfkill_pin is all initialized to 0
    for other chips in ath11k_hw_params.
    
    Tested-on: QCA6390 hw2.0 PCI WLAN.HST.1.0.1-01740-QCAHSTSWPLZ_V2_TO_X86-1
    
    Signed-off-by: Wen Gong <quic_wgong@quicinc.com>
    Signed-off-by: Kalle Valo <quic_kvalo@quicinc.com>
    Link: https://lore.kernel.org/r/20211217102334.14907-1-quic_wgong@quicinc.com

Looking at dmesg with my untrained eye, there's not a whole lot to go off of... I'll try re-verifying this with Kalle's tree instead of my backported tree just to be sure, and will shortly ratchet up the ath11k debug information for additional clues/output.

Verified with firmware: WLAN.HST.1.0.1-05266-QCAHSTSWPLZ_V2_TO_X86-1
Comment 1 Tyler S 2022-04-24 17:42:13 UTC
Created attachment 300799 [details]
dmesg while running kvalo-ath/master and entering/exiting S3

$ sudo rfkill
ID TYPE DEVICE      SOFT      HARD
 0 wlan phy0   unblocked unblocked

$ grep RFKILL /boot/config...
CONFIG_RFKILL=y
CONFIG_RFKILL_LEDS=y
CONFIG_RFKILL_INPUT=y
CONFIG_RFKILL_GPIO=y

dmesg attached around time of suspend/resume with ath11k.debug_mask=0xffffffff:
$ date; echo '0xffffffff' > /sys/module/ath11k/parameters/debug_mask ; echo mem > /sys/power/state; sleep 5; echo 0 > /sys/module/ath11k/parameters/debug_mask;
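
The one-liner above can be wrapped so that the length of the suspend is also reported, which makes an "immediate" resume easy to spot. This is only a minimal sketch: the function name `timed_suspend` and its arguments are hypothetical, it must run as root against the real sysfs files, and it assumes the wall clock is updated on resume so that `date +%s` reflects the time spent suspended.

```shell
#!/bin/sh
# Hypothetical helper: suspend once with full ath11k debug logging and
# print how long the system stayed suspended.
#   $1  debug_mask file (defaults to the ath11k module parameter)
#   $2  power state file (defaults to /sys/power/state)
#   $3  seconds to keep logging after resume (defaults to 5, as above)
timed_suspend() {
    mask_file=${1:-/sys/module/ath11k/parameters/debug_mask}
    state_file=${2:-/sys/power/state}
    settle=${3:-5}

    # Remember the current mask so it can be restored afterwards.
    old_mask=$(cat "$mask_file") || return 1
    echo 0xffffffff > "$mask_file" || return 1

    start=$(date +%s)
    echo mem > "$state_file"    # on real sysfs this blocks until resume
    status=$?
    end=$(date +%s)

    # Keep debug logging on briefly to capture the resume path.
    sleep "$settle"
    echo "$old_mask" > "$mask_file"

    echo "suspend lasted $((end - start)) s"
    return "$status"
}
```

Calling `timed_suspend` with no arguments and then saving `dmesg` gives the same log as the command above; a reported duration of only a few seconds matches the behaviour described in this report.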
Comment 2 Tyler S 2022-04-24 17:47:29 UTC
The rfkill by itself actually works, but even `rfkill --block wlan` and then an attempt to sleep via S3 still results in the platform resuming immediately without user intervention.
Comment 3 Kalle Valo 2022-04-25 06:05:03 UTC
(In reply to Tyler S from comment #0)
> Sorry, I've known about this breakage for a while, but only got the chance to
> bisect it this weekend.
> 
> In summary, the below patch results in at least one platform (an L390 Yoga
> Thinkpad) *immediately* powering back on after it enters ACPI S3:
> 
>     ath11k: add support for hardware rfkill for QCA6390

This is commit id:

ec038c6127fa ath11k: add support for hardware rfkill for QCA6390

Introduced in v5.17-rc1.
Comment 4 Kalle Valo 2022-04-29 17:10:02 UTC
(In reply to Tyler S from comment #0)
> Sorry, I've known about this breakage for a while, but only got the chance to
> bisect it this weekend.
> 
> In summary, the below patch results in at least one platform (an L390 Yoga
> Thinkpad) *immediately* powering back on after it enters ACPI S3:
> 
>     ath11k: add support for hardware rfkill for QCA6390

So just to confirm, if you revert that commit the problem goes away?
Comment 5 Tyler S 2022-05-01 02:07:24 UTC
Correct, that single commit makes or breaks functional suspend.
Comment 6 Tyler S 2022-05-01 02:14:45 UTC
As an added data point, I just swapped out the QCA6390 radio for a WCN6855 in the same exact laptop and kernel tonight. Soft rfkill with WCN6855 works fine and there's no issue with suspend, either.
Comment 7 Kalle Valo 2022-05-01 08:44:52 UTC
bugzilla-daemon@kernel.org writes:

> As an added data point, I just swapped out the QCA6390 radio for a WCN6855 in
> the same exact laptop and kernel tonight. soft rfkill with WCN6855 works fine
> and there's no issue with suspend, either.

Thanks, that's good to know. We will investigate more.
Comment 8 Kalle Valo 2022-05-06 14:37:00 UTC
Tyler, what version of the QCA6390 firmware are you using? Ideally, provide the md5sums:

md5sum /lib/firmware/ath11k/QCA6390/hw2.0/*
Comment 9 Tyler S 2022-05-06 14:51:03 UTC
The Jenkins pipeline producing the builds is using these firmwares:

$ md5sum ../ath11k-firmware/QCA6390/hw2.0/1.0.1/WLAN.HST.1.0.1-05266-QCAHSTSWPLZ_V2_TO_X86-1/*
682f7ca2e0b3ea16644c3585772a2cba  ../ath11k-firmware/QCA6390/hw2.0/1.0.1/WLAN.HST.1.0.1-05266-QCAHSTSWPLZ_V2_TO_X86-1/amss.bin
fd4aa4a58f33854a751ec7d14d77ce91  ../ath11k-firmware/QCA6390/hw2.0/1.0.1/WLAN.HST.1.0.1-05266-QCAHSTSWPLZ_V2_TO_X86-1/m3.bin
152ae5cac6bd0017877c2b778d962409  ../ath11k-firmware/QCA6390/hw2.0/1.0.1/WLAN.HST.1.0.1-05266-QCAHSTSWPLZ_V2_TO_X86-1/Notice.txt

The firmwares are compiled into the vmlinuz (via CONFIG_EXTRA_FIRMWARE).
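
For anyone comparing their own firmware files against a known list, a minimal sketch follows. It assumes GNU coreutils `md5sum`, whose `-c` mode reads a checksum list from standard input when no list file is named; the function name `verify_fw` is hypothetical, and the checksum list is expected to use file names relative to the firmware directory.

```shell
#!/bin/sh
# Hypothetical helper: verify the files in a firmware directory against
# a saved md5sum list.
#   $1  directory holding the firmware files (e.g. amss.bin, m3.bin)
#   $2  checksum list in md5sum output format, names relative to $1
# Returns 0 only if every listed file matches.
verify_fw() {
    dir=$1
    sums=$2
    # The list is opened before the cd, so a relative $2 still works.
    ( cd "$dir" && md5sum -c ) < "$sums"
}
```

A list can be produced with `md5sum amss.bin m3.bin > fw.md5` inside a known-good directory and then checked elsewhere with `verify_fw /path/to/firmware fw.md5`.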
Comment 10 Kalle Valo 2022-06-21 08:32:45 UTC
My plan is to revert commit ec038c6127fa in v5.19.
Comment 11 Tyler S 2022-06-26 21:03:52 UTC
Sorry for the late reply, and thanks!

I have had this patch reverted and have continued to rebase your tree on top of 5.15 for a while... no issues seen and S3 continues to work.
Comment 12 Kalle Valo 2022-07-08 16:58:46 UTC
Patch submitted, most likely will be in v5.20:

https://patchwork.kernel.org/project/linux-wireless/patch/20220708164656.29549-1-kvalo@kernel.org/
Comment 13 Kalle Valo 2022-08-04 06:45:08 UTC
Patch applied now:

https://git.kernel.org/linus/169ede1f5948
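
To check whether a particular tree is affected, something along these lines may help. It is a sketch under assumptions: the tree must share history with mainline so that the two commit ids mentioned in this report resolve by hash (backported or cherry-picked copies get different ids and will not be detected), and the function names are hypothetical. The only git call used is `git merge-base --is-ancestor`.

```shell
#!/bin/sh
# Hypothetical helper: succeed if commit $2 is reachable from HEAD in
# the git repository at $1. Unknown commit ids count as "not present".
has_commit() {
    git -C "$1" merge-base --is-ancestor "$2" HEAD 2>/dev/null
}

# Hypothetical helper: report whether the tree at $1 carries the
# hardware rfkill commit and the later fix commit linked above.
rfkill_state() {
    repo=$1
    if ! has_commit "$repo" ec038c6127fa; then
        echo "rfkill commit not present"
    elif has_commit "$repo" 169ede1f5948; then
        echo "rfkill commit present, fix applied"
    else
        echo "rfkill commit present, fix missing"
    fi
}
```

Running `rfkill_state /path/to/linux` on a checkout prints one of the three states; only the last one corresponds to the configuration in which the suspend problem was reported.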