Bug 217570 - ath11k: QCN9074: doesn't work in a VM
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: network-wireless
Hardware: All Linux
Importance: P5 normal
Assignee: drivers_network-wireless@kernel-bugs.osdl.org
URL:
Keywords:
Depends on: 216055
Blocks:
Reported: 2023-06-17 20:23 UTC by Nazar Mokrynskyi
Modified: 2023-11-14 02:52 UTC

See Also:
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:


Description Nazar Mokrynskyi 2023-06-17 20:23:14 UTC
I'm trying to get a Wallys DR9074-6E card working in a KVM VM, and unfortunately it doesn't work even with the 6.3 kernel.

I have provided extra details in this thread: https://forum.openwrt.org/t/qcn9074-doesnt-initialize-on-x86-64/163288

But basically, if vfio-pci is bound to the device from boot, it fails in the VM like this:
[    6.168990] ath11k_pci 0000:01:00.0: BAR 0: assigned [mem 0xfc600000-0xfc7fffff 64bit]
[    6.182193] ath11k_pci 0000:01:00.0: MSI vectors: 1
[    6.186952] ath11k_pci 0000:01:00.0: qcn9074 hw1.0
[    6.187843] ath11k_pci 0000:01:00.0: FW memory mode: 2
[    6.344359] ath11k_pci 0000:01:00.0: failed to set pcie link register 0x01e0e0a8: 0xffffffff != 0x00000010
[    6.345731] ath11k_pci 0000:01:00.0: failed to set sysclk: -110
[    6.386927] ath11k_pci 0000:01:00.0: link down error during global reset
[    6.399508] mhi mhi0: BHI offset: 0xffffffff is out of range: 0x200000
[    6.401949] ath11k_pci 0000:01:00.0: failed to prepare mhi: -22
[    6.403109] ath11k_pci 0000:01:00.0: failed to start mhi: -22
[    6.404164] ath11k_pci 0000:01:00.0: failed to power up :-22
[    6.435324] ath11k_pci 0000:01:00.0: failed to create soc core: -22
[    6.436631] ath11k_pci 0000:01:00.0: failed to init core: -22
[    6.868277] ath11k_pci: probe of 0000:01:00.0 failed with error -22

If ath11k_pci was loaded on the host, things change a bit but still end in failure (this is with kernel 6.3 on both the host and the guest):
[    6.117987] ath11k_pci 0000:01:00.0: BAR 0: assigned [mem 0xfc600000-0xfc7fffff 64bit]
[    6.136995] ath11k_pci 0000:01:00.0: MSI vectors: 1
[    6.138325] ath11k_pci 0000:01:00.0: qcn9074 hw1.0
[    6.139120] ath11k_pci 0000:01:00.0: FW memory mode: 2
[    6.300925] mhi mhi0: Requested to power ON
[    6.301945] mhi mhi0: Power on setup success
[    6.474052] mhi mhi0: Wait for device to enter SBL or Mission mode
[    6.818322] kmodloader: done loading kernel modules from /etc/modules.d/*
[    6.842030] ath11k_pci 0000:01:00.0: chip_id 0x0 chip_family 0x0 board_id 0xa2 soc_id 0xffffffff
[    6.843913] ath11k_pci 0000:01:00.0: fw_version 0x290c8569 fw_build_timestamp 2023-03-25 06:50 fw_build_id
[    7.426195] ath: EEPROM regdomain: 0x8324
[    7.427010] ath: EEPROM indicates we should expect a country code
[    7.429907] ath: doing EEPROM country->regdmn map search
[    7.430921] ath: country maps to regdmn code: 0x3e
[    7.431771] ath: Country alpha2 being used: UA
[    7.432783] ath: Regpair used: 0x3e
[    7.433527] ath: regdomain 0x8324 dynamically updated by user
[    8.298726] ath11k_pci 0000:01:00.0: leaving PCI ASPM disabled to avoid MHI M2 problems
[    9.343586] ath11k_pci 0000:01:00.0: failed to receive control response completion, polling..
[    9.819256] ath10k_pci 0000:04:00.0: pdev param 0 not supported by firmware
[    9.843293] device phy0-ap0 entered promiscuous mode
[   10.383573] ath11k_pci 0000:01:00.0: Service connect timeout
[   10.384445] ath11k_pci 0000:01:00.0: failed to connect to HTT: -110
[   10.385551] ath11k_pci 0000:01:00.0: failed to start core: -110
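
The negative numbers at the end of the failure lines in the two logs above are Linux kernel errno values: -22 is EINVAL and -110 is ETIMEDOUT. A minimal sketch for decoding them from a saved log follows. The helper name `decode_errnos` and the small lookup table are illustrative assumptions, not part of any ath11k tooling:

```python
import re

# Linux errno values that appear in the logs in this report.
# The table is deliberately small; extend it as needed.
LINUX_ERRNO = {5: "EIO", 22: "EINVAL", 110: "ETIMEDOUT"}

# Matches a negative decimal number at the end of a line, e.g. "...: -22".
TRAILING_ERRNO = re.compile(r"(-\d+)\s*$")

def decode_errnos(lines):
    """Return (errno name, line) pairs for lines ending in a negative code."""
    decoded = []
    for line in lines:
        match = TRAILING_ERRNO.search(line)
        if match:
            code = -int(match.group(1))
            name = LINUX_ERRNO.get(code, f"errno {code}")
            decoded.append((name, line.strip()))
    return decoded

sample = [
    "[    6.345731] ath11k_pci 0000:01:00.0: failed to set sysclk: -110",
    "[    6.401949] ath11k_pci 0000:01:00.0: failed to prepare mhi: -22",
    "[    6.186952] ath11k_pci 0000:01:00.0: qcn9074 hw1.0",
]
for name, line in decode_errnos(sample):
    print(name, "<-", line)
```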
Comment 1 Nazar Mokrynskyi 2023-06-20 04:59:03 UTC
I think https://bugzilla.kernel.org/show_bug.cgi?id=216055 describes a similar issue.
Comment 2 Cristian C 2023-06-20 13:07:44 UTC
Yes, I agree this bug is generic across multiple platforms that use the ath11k driver.
I can confirm that I see the same problem with a WCN6855 card, reported in https://bugzilla.kernel.org/show_bug.cgi?id=216055, as the one reported here for QCN9074.
Comment 3 Kalle Valo 2023-07-11 09:30:04 UTC
Unfortunately I'm not really able to look at this bug in detail, but patches are very welcome.
Comment 4 Nazar Mokrynskyi 2023-07-29 17:57:13 UTC
Is there anything we (users) can run/test to collect useful information that would help to resolve this?
Comment 5 Nazar Mokrynskyi 2023-07-29 21:49:06 UTC
I tried to apply https://patchwork.kernel.org/project/linux-wireless/patch/20230601033840.2997-1-quic_bqiang@quicinc.com/ to the latest OpenWRT snapshot and unfortunately got this crash:

[    6.457703] ath11k_pci 0000:01:00.0: BAR 0: assigned [mem 0xfc600000-0xfc7fffff 64bit]
[    6.881308] ath11k_pci 0000:01:00.0: MSI vectors: 1
[    6.943131] ath11k_pci 0000:01:00.0: qcn9074 hw1.0
[    6.944100] ath11k_pci 0000:01:00.0: FW memory mode: 2
[    7.527115] mhi mhi0: Requested to power ON
[    7.541628] mhi mhi0: Power on setup success
[    7.710257] mhi mhi0: Image transfer failed
[    7.759374] mhi mhi0: Reg: ERROR_CODE value: 0xffffffff
[    7.765251] mhi mhi0: Reg: ERROR_DBG1 value: 0x5500000
[    7.767446] mhi mhi0: Reg: ERROR_DBG2 value: 0x0
[    7.769928] mhi mhi0: Reg: ERROR_DBG3 value: 0x80000
[    7.770867] mhi mhi0: MHI did not load image over BHI, ret: -5
[    7.772267] ------------[ cut here ]------------
[    7.773197] WARNING: CPU: 1 PID: 960 at free_irq+0x343/0x390
[    7.774187] Modules linked in: ath11k_pci(+) ath11k ath10k_pci ath10k_core ath pppox ppp_generic nft_reject_ipv6 nft_reject_ipv4 nft_reject_inet nft_reject_bridge nft_reject nft_redir nft_quota nft_objref nft_numgen nft_nat nft_meta_bridge nft_masq nft_log nft_limit nft_hash nft_flow_offload nft_fib_ipv6 nft_fib_ipv4 nft_fib nft_ct nft_counter nft_chain_nat nf_tables nf_nat nf_flow_table nf_conntrack_bridge nf_conntrack mac80211 lzo cfg80211 slhc r8169 qrtr_mhi qrtr qmi_helpers nfnetlink nf_reject_ipv6 nf_reject_ipv4 nf_log_syslog nf_defrag_ipv6 nf_defrag_ipv4 mhi lzo_rle lzo_decompress lzo_compress libcrc32c igc forcedeth e1000e crc_ccitt compat bnx2 i2c_dev ixgbe e1000 amd_xgbe mdio nls_utf8 nls_iso8859_1 nls_cp437 ena sha512_ssse3 sha512_generic seqiv jitterentropy_rng drbg michael_mic hmac cmac crypto_acompress igb vfat fat button_hotplug tg3 ptp realtek pps_core mii
[    7.785413] CPU: 1 PID: 960 Comm: kworker/u5:0 Not tainted 5.15.120 #0
[    7.786337] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.2-0-gea1b7a073390-prebuilt.qemu.org 04/01/2014
[    7.788169] Workqueue: mhi_hiprio_wq mhi_pm_st_worker [mhi]
[    7.789032] RIP: 0010:free_irq+0x343/0x390
[    7.789789] Code: 00 e9 cb fd ff ff be 03 00 00 00 48 89 d7 e8 a4 01 2d 00 e9 7b fe ff ff 49 8d bc 24 98 00 00 00 ff d0 0f 1f 00 e9 e1 fe ff ff <0f> 0b 49 c7 84 24 28 01 00 00 00 00 00 00 e9 a3 fd ff ff 4c 89 ef
[    7.792703] RSP: 0018:ffffc90000463d90 EFLAGS: 00010082
[    7.793644] RAX: 0000000000000880 RBX: ffff8880038d4980 RCX: ffff888004222298
[    7.794879] RDX: ffff888003fd3600 RSI: ffff8880038d4980 RDI: ffff888004a13c00
[    7.795945] RBP: ffffc90000463dc8 R08: ffffffff824ae4e0 R09: 0000000000000000
[    7.797165] R10: 0000000000000000 R11: ffffffff824ae4e8 R12: ffff888004a13c00
[    7.798295] R13: ffff888003fd3600 R14: ffff888004a13e08 R15: ffff888004a13d14
[    7.799500] FS:  0000000000000000(0000) GS:ffff88801f500000(0000) knlGS:0000000000000000
[    7.803546] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    7.804593] CR2: 00007faaeab71b7c CR3: 0000000004170004 CR4: 0000000000370ee0
[    7.805882] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    7.807200] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    7.808521] Call Trace:
[    7.809012]  <TASK>
[    7.809655]  ? show_regs.part.0+0x1e/0x24
[    7.810393]  ? show_regs.cold+0x8/0xd
[    7.811311]  ? __warn+0x74/0xf0
[    7.812042]  ? free_irq+0x343/0x390
[    7.812774]  ? report_bug+0x86/0xa0
[    7.813462]  ? handle_bug+0x38/0x90
[    7.814293]  ? exc_invalid_op+0x18/0x70
[    7.815119]  ? asm_exc_invalid_op+0x1b/0x20
[    7.815816]  ? free_irq+0x343/0x390
[    7.816486]  mhi_pm_st_worker+0x1af/0xab0 [mhi]
[    7.817357]  ? __switch_to+0x129/0x370
[    7.818240]  process_one_work+0x1f8/0x360
[    7.819013]  worker_thread+0x20f/0x410
[    7.819764]  ? process_one_work+0x360/0x360
[    7.820697]  kthread+0x128/0x150
[    7.821318]  ? set_kthread_struct+0x50/0x50
[    7.822037]  ret_from_fork+0x1f/0x30
[    7.822713]  </TASK>
[    7.823205] ---[ end trace ec40cee90c7b3c1a ]---
[    7.824220] ath11k_pci 0000:01:00.0: failed to power up mhi: -110
[    7.825730] ath11k_pci 0000:01:00.0: failed to start mhi: -110
[    7.827063] ath11k_pci 0000:01:00.0: failed to power up :-110
[    7.863504] ath11k_pci 0000:01:00.0: failed to create soc core: -110
[    7.864555] ath11k_pci 0000:01:00.0: failed to init core: -110
[    8.511716] ath11k_pci: probe of 0000:01:00.0 failed with error -110

At least it is different now.
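
Several values in the logs in this report read back as 0xffffffff (the pcie link register, the BHI offset, soc_id, ERROR_CODE). On PCI, an all-ones value is what a read typically returns when the device does not respond, so it can be useful to pull those lines out together. A small sketch for doing that on a saved log follows. The function name is hypothetical, and whether the reads here really fail for that reason is an assumption, not something confirmed in this report:

```python
import re

# A 32-bit all-ones value written in hex, as printed by the drivers.
ALL_ONES = re.compile(r"\b0xffffffff\b", re.IGNORECASE)

def all_ones_lines(lines):
    """Return the log lines that contain a 0xffffffff value."""
    return [line.strip() for line in lines if ALL_ONES.search(line)]

sample = [
    "[    6.399508] mhi mhi0: BHI offset: 0xffffffff is out of range: 0x200000",
    "[    7.759374] mhi mhi0: Reg: ERROR_CODE value: 0xffffffff",
    "[    7.769928] mhi mhi0: Reg: ERROR_DBG3 value: 0x80000",
]
for line in all_ones_lines(sample):
    print(line)
```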
Comment 6 Nazar Mokrynskyi 2023-11-14 02:52:57 UTC
We have found a way to get the driver to initialize in a VM, with intremap and caching_mode enabled in QEMU: https://forum.openwrt.org/t/qcn9074-doesnt-initialize-on-x86-64/163288

One user confirmed it working on Debian. For me, with Ubuntu 23.10.1 and the latest OpenWRT snapshot, it appears to initialize, but the AP I created is not visible anywhere.

Here are kernel logs from boot:
```
[    6.423278] ath11k_pci 0000:04:00.0: BAR 0: assigned [mem 0xc9400000-0xc95fffff 64bit]
[    6.439883] ath11k_pci 0000:04:00.0: MSI vectors: 16
[    6.476537] ath11k_pci 0000:04:00.0: qcn9074 hw1.0
[    6.481163] ath11k_pci 0000:04:00.0: FW memory mode: 2
[    6.642931] mhi mhi0: Requested to power ON
[    6.652253] mhi mhi0: Power on setup success
[    6.763362] mhi mhi0: Wait for device to enter SBL or Mission mode
[    7.057705] ath11k_pci 0000:04:00.0: chip_id 0x0 chip_family 0x0 board_id 0xa2 soc_id 0xffffffff
[    7.063431] ath11k_pci 0000:04:00.0: fw_version 0x29010762 fw_build_timestamp 2023-08-02 19:50 fw_build_id 
[    7.133541] kmodloader: 1 module could not be probed
[    7.137747] kmodloader: - leds-mlxcpld - 0
[    8.751932] ath11k_pci 0000:04:00.0: htt event 48 not handled

```

Still looking for a way to work around this fully, but maybe the above information will help maintainers make it boot with the default VM config.
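
For readers who launch QEMU directly, here is a rough sketch of what enabling intremap and caching mode can look like. It is not the exact configuration from the forum thread, which is not reproduced here. QEMU's intel-iommu device spells the property caching-mode (libvirt's domain XML uses caching_mode), and interrupt remapping needs a split irqchip. The helper name and the host PCI address are hypothetical:

```python
def qemu_viommu_args(host_bdf):
    """Build QEMU arguments for a Q35 guest with an emulated Intel IOMMU."""
    return [
        # Interrupt remapping in the emulated IOMMU needs a split irqchip.
        "-machine", "q35,accel=kvm,kernel-irqchip=split",
        # Emulated Intel IOMMU with interrupt remapping and caching mode on,
        # listed before the passthrough device.
        "-device", "intel-iommu,intremap=on,caching-mode=on",
        # Passthrough of the Wi-Fi card; host_bdf is the host PCI address.
        "-device", f"vfio-pci,host={host_bdf}",
    ]

# Hypothetical host address; check lspci on the host for the real one.
print(" ".join(qemu_viommu_args("0000:04:00.0")))
```

These arguments only cover the IOMMU and passthrough parts; memory, disks and the rest of the guest definition are left out.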
