Bug 217894 - iwlwifi: AX210 Timeout waiting for hardware access (CSR_GP_CNTRL 0xffffffff)
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: network-wireless-intel
Hardware: All Linux
Importance: P3 normal
Assignee: Default virtual assignee for network-wireless-intel
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-09-10 04:23 UTC by Ronan Pigott
Modified: 2024-03-07 12:25 UTC

See Also:
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:


Attachments
journal from linux 6.5.2 where firmware 83 fails to load (148.40 KB, text/plain)
2023-09-10 04:23 UTC, Ronan Pigott
journal from linux 6.4.12 where firmware 78 loads successfully (95.74 KB, text/plain)
2023-09-10 04:24 UTC, Ronan Pigott
journal from linux 6.6-rc1, firmware 83 loads successfully (101.71 KB, text/plain)
2023-09-11 22:50 UTC, Ronan Pigott
journal from linux 6.5.2, firmware 83 loads successfully this time (96.25 KB, text/plain)
2023-09-11 23:02 UTC, Ronan Pigott
crash logs 1 (16.49 KB, text/plain)
2023-09-30 05:59 UTC, Pawel

Description Ronan Pigott 2023-09-10 04:23:29 UTC
Created attachment 305076 [details]
journal from linux 6.5.2 where firmware 83 fails to load

On Linux 6.5.2, loading firmware version 83 on an AX210 device appears to fail (repeatedly) with a timeout. Reverting to Linux 6.4.12 successfully loads firmware 78.

My nic:
$ lspci -kd::280
08:00.0 Network controller: Intel Corporation Wi-Fi 6 AX210/AX211/AX411 160MHz (rev 1a)
	Subsystem: Rivet Networks Wi-Fi 6 AX210/AX211/AX411 160MHz
	Kernel driver in use: iwlwifi
	Kernel modules: iwlwifi

The failing kernel:
$ pacman -Qp /var/cache/pacman/pkg/linux-6.5.2.arch1-1-x86_64.pkg.tar.zst 
linux 6.5.2.arch1-1

The error:
$ journalctl -b -1 _KERNEL_DEVICE=+pci:${$(lspci -Dd::280)[(w)1]} + _TRANSPORT=kernel > dmesg65.log

[..attached..]
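
Note for anyone reproducing this from bash or another POSIX shell: the `${$(lspci -Dd::280)[(w)1]}` expansion in the command above is zsh-specific. It takes the first word of the `lspci` output, which is the PCI address. A rough portable equivalent might look like the sketch below. The helper names `first_pci_addr` and `kernel_device_match` are made up for illustration, and the commented usage assumes that `lspci -Dd::280` matches exactly one device.

```shell
# Hypothetical helpers, not part of the original report.

# Print the first word of the first line on stdin (the PCI address
# when fed the output of `lspci -D`).
first_pci_addr() {
    awk 'NR == 1 { print $1 }'
}

# Build the journalctl match expression for a PCI device address.
kernel_device_match() {
    printf '_KERNEL_DEVICE=+pci:%s\n' "$1"
}

# Usage on a live system (assumes a single matching device):
#   addr=$(lspci -Dd::280 | first_pci_addr)
#   journalctl -b -1 "$(kernel_device_match "$addr")" + _TRANSPORT=kernel > dmesg65.log
```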

The interesting bit:
Sep 09 20:30:28 kernel: iwlwifi 0000:08:00.0: WRT: Invalid buffer destination
Sep 09 20:30:29 kernel: ------------[ cut here ]------------
Sep 09 20:30:29 kernel: Timeout waiting for hardware access (CSR_GP_CNTRL 0xffffffff)
Sep 09 20:30:29 kernel: WARNING: CPU: 13 PID: 679 at drivers/net/wireless/intel/iwlwifi/pcie/trans.c:2190 __iwl_trans_pcie_gr>
Sep 09 20:30:29 kernel: Modules linked in: iwlmvm(+) snd_sof_pci_intel_tgl snd_sof_intel_hda_common soundwire_intel snd_sof_i>
Sep 09 20:30:29 kernel:  snd_hwdep intel_rapl_msr dell_smm_hwmon processor_thermal_rfim i2c_i801 realtek btmtk alienware_wmi >
Sep 09 20:30:29 kernel: CPU: 13 PID: 679 Comm: modprobe Not tainted 6.5.2-arch1-1 #1 d2912f929551bc8e9b95af790b8285a77c25fa29
Sep 09 20:30:29 kernel: Hardware name: Dell Inc. XPS 8950/0R6PCT, BIOS 1.2.1 03/25/2022
[...]
Sep 09 20:30:29 kernel: Call Trace:
Sep 09 20:30:29 kernel:  <TASK>
Sep 09 20:30:29 kernel:  ? __iwl_trans_pcie_grab_nic_access+0x14a/0x150 [iwlwifi 25a8da985d322177fdc2dbc451d4271c449a7a6f]
Sep 09 20:30:29 kernel:  ? __warn+0x81/0x130
Sep 09 20:30:29 kernel:  ? __iwl_trans_pcie_grab_nic_access+0x14a/0x150 [iwlwifi 25a8da985d322177fdc2dbc451d4271c449a7a6f]
Sep 09 20:30:29 kernel:  ? report_bug+0x171/0x1a0
Sep 09 20:30:29 kernel:  ? prb_read_valid+0x1b/0x30
Sep 09 20:30:29 kernel:  ? handle_bug+0x3c/0x80
Sep 09 20:30:29 kernel:  ? exc_invalid_op+0x17/0x70
Sep 09 20:30:29 kernel:  ? asm_exc_invalid_op+0x1a/0x20
Sep 09 20:30:29 kernel:  ? __iwl_trans_pcie_grab_nic_access+0x14a/0x150 [iwlwifi 25a8da985d322177fdc2dbc451d4271c449a7a6f]
Sep 09 20:30:29 kernel:  iwl_trans_pcie_grab_nic_access+0x1a/0x40 [iwlwifi 25a8da985d322177fdc2dbc451d4271c449a7a6f]
Sep 09 20:30:29 kernel:  iwl_read_prph+0x1d/0x60 [iwlwifi 25a8da985d322177fdc2dbc451d4271c449a7a6f]
Sep 09 20:30:29 kernel:  iwl_mvm_load_ucode_wait_alive+0x2d9/0x620 [iwlmvm 7d9113127caff2df016f1a19aad637aa20200412]
[...]
Sep 09 20:30:29 kernel: ---[ end trace 0000000000000000 ]---
Sep 09 20:30:29 kernel: iwlwifi 0000:08:00.0: iwlwifi transaction failed, dumping registers
[...]

See attachment for full log.
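
A register value of 0xffffffff is worth calling out: reads from a PCI device that is not responding typically come back as all ones, so this may point at the device having dropped off the bus rather than at a meaningful register state. That is an interpretation, not something the log proves. As a minimal sketch, a saved journal can be checked for this signature as follows; the function name `csr_all_ones` is hypothetical, and the pattern assumes the exact message wording shown above.

```shell
# Hypothetical helper: read kernel log lines on stdin and succeed
# (exit status 0) if any hardware-access timeout line reports
# CSR_GP_CNTRL as all ones.
csr_all_ones() {
    grep -q -i 'Timeout waiting for hardware access (CSR_GP_CNTRL 0xffffffff)'
}

# Usage:
#   csr_all_ones < dmesg65.log && echo "CSR_GP_CNTRL read back as all ones"
```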

Boot after revert to 6.4.12 with working firmware:
$ journalctl --no-hostname -b _KERNEL_DEVICE=+pci:${$(lspci -Dd::280)[(w)1]} + _TRANSPORT=kernel > dmesg64.log

[..attached..]

The interesting bit:
Sep 09 20:48:23 kernel: iwlwifi 0000:08:00.0: loaded firmware version 78.3bfdc55f.0 ty-a0-gf-a0-78.ucode op_mode iwlmvm


$ pacman -Ql linux-firmware | grep ty.a0.gf.a0
linux-firmware /usr/lib/firmware/iwlwifi-ty-a0-gf-a0-59.ucode.zst
linux-firmware /usr/lib/firmware/iwlwifi-ty-a0-gf-a0-66.ucode.zst
linux-firmware /usr/lib/firmware/iwlwifi-ty-a0-gf-a0-72.ucode.zst
linux-firmware /usr/lib/firmware/iwlwifi-ty-a0-gf-a0-73.ucode.zst
linux-firmware /usr/lib/firmware/iwlwifi-ty-a0-gf-a0-74.ucode.zst
linux-firmware /usr/lib/firmware/iwlwifi-ty-a0-gf-a0-77.ucode.zst
linux-firmware /usr/lib/firmware/iwlwifi-ty-a0-gf-a0-78.ucode.zst
linux-firmware /usr/lib/firmware/iwlwifi-ty-a0-gf-a0-79.ucode.zst
linux-firmware /usr/lib/firmware/iwlwifi-ty-a0-gf-a0-81.ucode.zst
linux-firmware /usr/lib/firmware/iwlwifi-ty-a0-gf-a0-83.ucode.zst
linux-firmware /usr/lib/firmware/iwlwifi-ty-a0-gf-a0.pnvm.zst
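
The number before `.ucode` in each file name is the firmware API version, so the listing above shows that a version 83 image is installed alongside version 78. As a rough sketch, the highest installed version can be picked out of such a listing like this; `highest_fw_api` is a hypothetical name, and the parsing assumes the `-<number>.ucode` naming seen above.

```shell
# Hypothetical helper: read firmware file names on stdin and print the
# highest numeric suffix found before ".ucode". Lines without such a
# suffix (for example the .pnvm file) are ignored.
highest_fw_api() {
    sed -n 's/.*-\([0-9][0-9]*\)\.ucode.*/\1/p' | sort -n | tail -n 1
}

# Usage:
#   pacman -Ql linux-firmware | grep ty.a0.gf.a0 | highest_fw_api
```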
Comment 1 Ronan Pigott 2023-09-10 04:24:10 UTC
Created attachment 305077 [details]
journal from linux 6.4.12 where firmware 78 loads successfully
Comment 2 Ronan Pigott 2023-09-10 04:25:43 UTC
Most likely since:

commit 399762de769c4ec7d82220feb83de9bca30e5ef0
Author: Gregory Greenman <gregory.greenman@intel.com>
Date:   Wed Jun 21 13:12:19 2023 +0300

    wifi: iwlwifi: bump FW API to 83 for AX/BZ/SC devices
    
    Start supporting API version 83 for new devices.
    
    Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
    Link: https://lore.kernel.org/r/20230621130444.267a136ea57f.Iaef9f04b9655c5c1b8bdee3b89cc3361ab621bcf@changeid
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>
Comment 3 Bagas Sanjaya 2023-09-11 00:39:31 UTC
(In reply to Ronan Pigott from comment #0)
> Created attachment 305076 [details]
> journal from linux 6.5.2 where firmware 83 fails to load
> 
> In linux 6.5.2, loading firmware version 83 on AX210 device appears to fail
> (repeatedly) with a timeout. Reverting to linux 6.4.12 successfully loads
> firmware 78.
> 
> My nic:
> $ lspci -kd::280
> 08:00.0 Network controller: Intel Corporation Wi-Fi 6 AX210/AX211/AX411
> 160MHz (rev 1a)
>       Subsystem: Rivet Networks Wi-Fi 6 AX210/AX211/AX411 160MHz
>       Kernel driver in use: iwlwifi
>       Kernel modules: iwlwifi
> 
> The failing kernel:
> $ pacman -Qp /var/cache/pacman/pkg/linux-6.5.2.arch1-1-x86_64.pkg.tar.zst 
> linux 6.5.2.arch1-1
> 

Please test the current mainline (v6.6-rc1).
Comment 4 Ronan Pigott 2023-09-11 22:50:39 UTC
Created attachment 305091 [details]
journal from linux 6.6-rc1, firmware 83 loads successfully

Hi,

It didn't reproduce on 6.6-rc1 (0bb80ecc33a8) actually.

$ journalctl -b --no-hostname _KERNEL_DEVICE=+pci:${$(lspci -Dd::280)[(w)1]} + _TRANSPORT=kernel > dmesg66.log

[..attached..]
Comment 5 Ronan Pigott 2023-09-11 23:02:33 UTC
Created attachment 305092 [details]
journal from linux 6.5.2, firmware 83 loads successfully this time

In fact, it seems to be working on 6.5.2 atm as well... Maybe it was some kind of spurious error? It reproduced a couple times before, but now it doesn't seem to. Attached another log from 6.5.2.

I will report if it happens again but maybe it was just a false alarm?
Comment 6 Elena 2023-09-19 10:15:05 UTC
I have a similar problem, not sure if I should log a different bug, as it's for a different card (AX201). I suspect it has to do with the Wi-Fi 7 support added in 6.5?
Same thing: it's fine on 6.4.12 and broken since 6.5. We've tried 3 releases so far, but may have to try the latest one, which is not available in local repos and will be harder to push to users.

[   10.025581] iwlwifi 0000:00:14.3: enabling device (0000 -> 0002)
[   10.026887] iwlwifi 0000:00:14.3: Detected crf-id 0x3617, cnv-id 0x20000302 wfpm id 0x80000000
[   10.026907] iwlwifi 0000:00:14.3: PCI dev a0f0/4070, rev=0x351, rfid=0x10a100
[   10.027017] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-78.ucode failed with error -2
[   10.027033] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-77.ucode failed with error -2
[   10.027042] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-76.ucode failed with error -2
[   10.027048] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-75.ucode failed with error -2
[   10.027056] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-74.ucode failed with error -2
[   10.027062] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-73.ucode failed with error -2
[   10.027068] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-72.ucode failed with error -2
[   10.027074] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-71.ucode failed with error -2
[   10.027081] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-70.ucode failed with error -2
[   10.027087] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-69.ucode failed with error -2
[   10.027092] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-68.ucode failed with error -2
[   10.027098] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-67.ucode failed with error -2
[   10.027103] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-66.ucode failed with error -2
[   10.027110] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-65.ucode failed with error -2
[   10.027117] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-64.ucode failed with error -2
[   10.027135] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-63.ucode failed with error -2
[   10.027143] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-62.ucode failed with error -2
[   10.027152] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-61.ucode failed with error -2
[   10.027161] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-60.ucode failed with error -2
[   10.027171] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-59.ucode failed with error -2
[   10.027181] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-58.ucode failed with error -2
[   10.027190] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-57.ucode failed with error -2
[   10.027200] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-56.ucode failed with error -2
[   10.027210] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-55.ucode failed with error -2
[   10.027219] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-54.ucode failed with error -2
[   10.027228] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-53.ucode failed with error -2
[   10.027237] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-52.ucode failed with error -2
[   10.027246] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-51.ucode failed with error -2
[   10.027255] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-50.ucode failed with error -2
[   10.027265] iwlwifi 0000:00:14.3: Direct firmware load for iwlwifi-QuZ-a0-hr-b0-49.ucode failed with error -2
[   10.033774] iwlwifi 0000:00:14.3: TLV_FW_FSEQ_VERSION: FSEQ Version: 43.2.23.17
[   10.033779] iwlwifi 0000:00:14.3: Found debug destination: EXTERNAL_DRAM
[   10.033780] iwlwifi 0000:00:14.3: Found debug configuration: 0
[   10.033979] iwlwifi 0000:00:14.3: loaded firmware version 48.4fa0041f.0 QuZ-a0-hr-b0-48.ucode op_mode iwlmvm
[   10.155575] iwlwifi 0000:00:14.3: Detected Intel(R) Wi-Fi 6 AX201 160MHz, REV=0x351
[   10.168122] iwlwifi 0000:00:14.3: Applying debug destination EXTERNAL_DRAM
[   10.168847] iwlwifi 0000:00:14.3: Allocated 0x00400000 bytes for firmware monitor.
[   10.258715] iwlwifi 0000:00:14.3: Detected RF HR B3, rfid=0x10a100
[   10.319155] iwlwifi 0000:00:14.3: base HW address: cc:15:31:32:46:fa
[   10.787739] iwlwifi 0000:00:14.3: Applying debug destination EXTERNAL_DRAM
[   10.934814] iwlwifi 0000:00:14.3: FW already configured (0) - re-configuring
[   10.939279] iwlwifi 0000:00:14.3: Registered PHC clock: iwlwifi-PTP, with index: 1
[   10.988060] iwlwifi 0000:00:14.3: Applying debug destination EXTERNAL_DRAM
[   11.134901] iwlwifi 0000:00:14.3: FW already configured (0) - re-configuring
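
For readers skimming the log above: error -2 is -ENOENT (file not found). The driver requests each firmware file from API version 78 downwards and only finds version 48. The sketch below condenses such a log into one summary line. It assumes the message wording shown above, and the function name `fw_load_summary` is made up.

```shell
# Hypothetical helper: read dmesg/journal lines on stdin and print how
# many firmware files were requested but not found (error -2), followed
# by the firmware file that was eventually loaded, if any.
fw_load_summary() {
    awk '
        /Direct firmware load for .* failed with error -2$/ { missing++ }
        /loaded firmware version/ {
            # The .ucode file name is the second field after "version".
            for (i = 1; i <= NF; i++)
                if ($i == "version") loaded = $(i + 2)
        }
        END {
            printf "missing=%d loaded=%s\n", missing, (loaded == "" ? "none" : loaded)
        }
    '
}

# Usage:
#   dmesg | grep iwlwifi | fw_load_summary
```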
Comment 7 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-09-29 11:51:58 UTC
(In reply to Ronan Pigott from comment #5)
> I will report if it happens again but maybe it was just a false alarm?

Maybe that, or maybe your linux-firmware package was updated in between.
Comment 8 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-09-29 11:53:47 UTC
(In reply to Elena from comment #6)
> I have a similar problem, not sure if I should log a different bug

Different bug, as explained here:
https://linux-regtracking.leemhuis.info/post/frequent-reasons-why-linux-kernel-bug-reports-are-ignored/#you-reported-your-issue-in-a-reply-to-an-earlier-report
Afterwards, mention it here. And please let us know if your linux-firmware package is up to date (and if updating it helps).
Comment 9 Pawel 2023-09-30 05:59:43 UTC
Created attachment 305164 [details]
crash logs 1
Comment 10 Pawel 2023-09-30 06:09:35 UTC
Whatever the problem is, it seems to manifest in different ways.
Things work quite well on 6.4.14 (loads firmware 77).
They don't on 6.5.5 (loads firmware 81).
I'm on F37, linux-firmware is linux-firmware-20230919-1.fc37.noarch, no updates available ATM.

I've attached the crash for the first boot attempt.
For the second, the kernel just GPFed.
I couldn't preserve the trace, but I swear whatever output I saw on the console indicated it came from the iwlwifi driver.

Sep 29 22:18:56 hornet kernel: BUG: kernel NULL pointer dereference, address: 0000000000000048
Sep 29 22:18:56 hornet kernel: #PF: supervisor write access in kernel mode
Sep 29 22:18:56 hornet kernel: #PF: error_code(0x0002) - not-present page

I'm gonna boot 6.4.x for now, but let me know if I can help debug this in any way.
Comment 11 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-09-30 06:38:54 UTC
(In reply to Pawel from comment #10)
> Whatever the problem is,

Could you please create a separate report for this? That would help a lot, as this might be a different issue. Best to keep them separated, otherwise things can get messy quickly.
Comment 12 Pawel 2023-10-02 06:38:28 UTC
(In reply to The Linux kernel's regression tracker (Thorsten Leemhuis) from comment #11)
> (In reply to Pawel from comment #10)
> > Whatever the problem is,
> 
> Could you please create a separate report for this? That would help a lot,
> as this might be a different issue. Best to keep them separated, otherwise
> things can get messy quickly

sure, #217963
Comment 13 The Linux kernel's regression tracker (Thorsten Leemhuis) 2023-10-04 09:17:38 UTC
FWIW, this patch is supposed to fix a bug that might cause the problems seen by those of you who use an older linux-firmware package:

https://lore.kernel.org/linux-wireless/20230926165546.086e635fbbe6.Ia660f35ca0b1079f2c2ea92fd8d14d8101a89d03@changeid/

Not yet in mainline (and thus not yet en route to 6.5.y), but hopefully soon.
Comment 14 Lev Lybin 2024-03-06 13:46:10 UTC
Hello, the same problem.
Kernel: 6.7.8
dmesg: https://0x0.st/H7xy.txt
Comment 15 The Linux kernel's regression tracker (Thorsten Leemhuis) 2024-03-07 12:25:10 UTC
(In reply to Lev Lybin from comment #14)
> Hello, the same problem.

Or maybe a totally different issue. That's why you likely want to report it separately (CC me if it is a regression), as there is otherwise a big risk that it will be ignored or not even be seen by the right people; see "You reported the problem in a reply to an earlier report" on https://linux-regtracking.leemhuis.info/post/frequent-reasons-why-linux-kernel-bug-reports-are-ignored/
