Bug 217506

Summary: mt7921e 0000:02:00.0: hardware init failed
Product: Drivers
Component: network-wireless
Status: RESOLVED CODE_FIX
Severity: high
Priority: P1
Hardware: All
OS: Linux
Reporter: Kai-Heng Feng (kai.heng.feng)
Assignee: drivers_network-wireless (drivers_network-wireless)
CC: aros, deren.g, max.lee, stanislaw.barzowski
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:
Attachments: dmesg, ftrace, lspci

Description Kai-Heng Feng 2023-05-29 23:35:11 UTC
This is not a regression, tried on all kernel versions.

MCU init doesn't work:
[59010.889773] mt7921e 0000:02:00.0: Message 00000010 (seq 1) timeout
[59010.889786] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59014.217839] mt7921e 0000:02:00.0: Message 00000010 (seq 2) timeout
[59014.217852] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59017.545880] mt7921e 0000:02:00.0: Message 00000010 (seq 3) timeout
[59017.545893] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59020.874086] mt7921e 0000:02:00.0: Message 00000010 (seq 4) timeout
[59020.874099] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59024.202019] mt7921e 0000:02:00.0: Message 00000010 (seq 5) timeout
[59024.202033] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59027.530082] mt7921e 0000:02:00.0: Message 00000010 (seq 6) timeout
[59027.530096] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59030.857888] mt7921e 0000:02:00.0: Message 00000010 (seq 7) timeout
[59030.857904] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59034.185946] mt7921e 0000:02:00.0: Message 00000010 (seq 8) timeout
[59034.185961] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59037.514249] mt7921e 0000:02:00.0: Message 00000010 (seq 9) timeout
[59037.514262] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59040.842362] mt7921e 0000:02:00.0: Message 00000010 (seq 10) timeout
[59040.842375] mt7921e 0000:02:00.0: Failed to get patch semaphore
[59040.923845] mt7921e 0000:02:00.0: hardware init failed
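
For anyone comparing their own logs against this pattern, here is a minimal sketch that summarises the failure from kernel log text. The function name mt7921e_init_summary is made up for illustration; the matched strings are copied from the log above, and other kernel versions may word these messages differently.

```shell
# Summarise mt7921e MCU init failures from kernel log text read on stdin.
# Hypothetical helper: the name and the output format are illustrative only.
mt7921e_init_summary() {
    awk '
        # Each failed attempt to grab the patch semaphore logs this line.
        /mt7921e .*Failed to get patch semaphore/ { retries++ }
        # Logged once after the driver gives up.
        /mt7921e .*hardware init failed/          { failed = 1 }
        END { printf "semaphore_failures=%d init_failed=%d\n", retries, failed }
    '
}

# Typical use (may need root, depending on kernel.dmesg_restrict):
#   dmesg | mt7921e_init_summary
```

Applied to the log above, this should report ten semaphore failures followed by the final init failure.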
Comment 1 Kai-Heng Feng 2023-05-29 23:35:44 UTC
Created attachment 304352: dmesg
Comment 2 Kai-Heng Feng 2023-05-29 23:36:10 UTC
Created attachment 304353: ftrace
Comment 3 Kai-Heng Feng 2023-05-29 23:37:36 UTC
Created attachment 304354: lspci
Comment 4 Deren Wu 2023-05-30 03:19:30 UTC
(In reply to Kai-Heng Feng from comment #0)
> This is not a regression, tried on all kernel versions.
> 
> MCU init doesn't work:
> [59010.889773] mt7921e 0000:02:00.0: Message 00000010 (seq 1) timeout
> [59010.889786] mt7921e 0000:02:00.0: Failed to get patch semaphore
> [59014.217839] mt7921e 0000:02:00.0: Message 00000010 (seq 2) timeout
> [59014.217852] mt7921e 0000:02:00.0: Failed to get patch semaphore
.....
> [59040.923845] mt7921e 0000:02:00.0: hardware init failed

Does the issue show up on all hardware platforms, or only on a specific hardware configuration?

Could you please try force-disabling ASPM to check whether the problem is still there?
Example: insmod mt7921-common.ko disable_clc=1
Comment 5 Kai-Heng Feng 2023-05-30 03:55:41 UTC
> insmod mt7921-common.ko disable_clc=1
Doesn't help. The same issue remains.
Comment 6 Deren Wu 2023-05-30 04:31:50 UTC
(In reply to Kai-Heng Feng from comment #5)
> > insmod mt7921-common.ko disable_clc=1
> Doesn't help. The same issue remains.

Sorry, I gave the wrong command.
It should be: insmod mt7921e.ko disable_aspm=1

Please try again.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/net/wireless/mediatek/mt76/mt7921/pci.c#n28
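
To confirm the option actually took effect after loading the module, the parameter can be read back from sysfs. A minimal sketch, assuming the parameter is exported under /sys/module/mt7921e/parameters/ (which depends on how the driver declares it); the helper name mt7921e_param and the SYSFS_MODULE_ROOT override are hypothetical and exist only to make the snippet testable.

```shell
# Print the current value of an mt7921e module parameter as exposed in sysfs.
# SYSFS_MODULE_ROOT is a hypothetical override for testing; it defaults to /sys/module.
mt7921e_param() {
    param_file="${SYSFS_MODULE_ROOT:-/sys/module}/mt7921e/parameters/$1"
    if [ -r "$param_file" ]; then
        cat "$param_file"      # boolean parameters read back as Y or N
    else
        echo "unavailable"     # module not loaded, or parameter not exported
        return 1
    fi
}

# Typical use after loading the driver:
#   modprobe mt7921e disable_aspm=1
#   mt7921e_param disable_aspm
```

Using modprobe instead of insmod also loads the module's dependencies, which avoids ordering problems when loading by hand.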
Comment 7 Kai-Heng Feng 2023-05-30 04:52:15 UTC
That doesn't help either, and neither does disabling ASPM entirely (i.e. setting the pcie_aspm policy to performance).
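
For reference, the active policy can be checked by reading /sys/module/pcie_aspm/parameters/policy, where the kernel marks the current choice with square brackets. A minimal sketch of extracting it; the function name active_aspm_policy is made up here, and the sample line in the comment only illustrates the expected format.

```shell
# Print the active PCIe ASPM policy from the policy file's contents on stdin.
# The kernel lists every policy and brackets the active one, for example:
#   default [performance] powersave powersupersave
active_aspm_policy() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Typical use:
#   active_aspm_policy < /sys/module/pcie_aspm/parameters/policy
# Switching policy at runtime (as root; the firmware may forbid it):
#   echo performance > /sys/module/pcie_aspm/parameters/policy
```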
Comment 8 Artem S. Tashkinov 2023-07-14 07:59:29 UTC
Count me in:

This is what I'm getting on boot with 6.5-rc1:

mt7921e 0000:01:00.0: ASIC revision: 79220010
mt7921e 0000:01:00.0: Message 00000010 (seq 1) timeout
mt7921e 0000:01:00.0: Failed to get patch semaphore
mt7921e 0000:01:00.0: Message 00000010 (seq 2) timeout
mt7921e 0000:01:00.0: Failed to get patch semaphore
mt7921e 0000:01:00.0: Message 00000010 (seq 3) timeout
mt7921e 0000:01:00.0: Failed to get patch semaphore
mt7921e 0000:01:00.0: Message 00000010 (seq 4) timeout
mt7921e 0000:01:00.0: Failed to get patch semaphore
mt7921e 0000:01:00.0: Message 00000010 (seq 5) timeout
mt7921e 0000:01:00.0: Failed to get patch semaphore
mt7921e 0000:01:00.0: Message 00000010 (seq 6) timeout
mt7921e 0000:01:00.0: Failed to get patch semaphore
mt7921e 0000:01:00.0: Message 00000010 (seq 7) timeout
mt7921e 0000:01:00.0: Failed to get patch semaphore
mt7921e 0000:01:00.0: Message 00000010 (seq 8) timeout
mt7921e 0000:01:00.0: Failed to get patch semaphore
mt7921e 0000:01:00.0: Message 00000010 (seq 9) timeout
mt7921e 0000:01:00.0: Failed to get patch semaphore
mt7921e 0000:01:00.0: Message 00000010 (seq 10) timeout
mt7921e 0000:01:00.0: Failed to get patch semaphore
mt7921e 0000:01:00.0: hardware init failed

Older released kernels show the same errors. Windows 10 works just fine.

01:00.0 Network controller: MEDIATEK Corp. MT7922 802.11ax PCI Express Wireless Network Adapter
        Subsystem: Foxconn International, Inc. Device e0db
        Flags: bus master, fast devsel, latency 0, IRQ 121, IOMMU group 12
        Memory at 2810900000 (64-bit, prefetchable) [size=1M]
        Memory at 94c00000 (64-bit, non-prefetchable) [size=32K]
        Capabilities: [80] Express Endpoint, MSI 00
        Capabilities: [e0] MSI: Enable+ Count=1/32 Maskable+ 64bit+
        Capabilities: [f8] Power Management version 3
        Capabilities: [100] Vendor Specific Information: ID=1556 Rev=1 Len=008 <?>
        Capabilities: [108] Latency Tolerance Reporting
        Capabilities: [110] L1 PM Substates
        Capabilities: [200] Advanced Error Reporting
        Kernel driver in use: mt7921e
        Kernel modules: mt7921e

Running rmmod on the driver and then modprobing it again crashes the system (nothing appears on the screen and the system freezes right away, probably a kernel panic).

https://unix.stackexchange.com/questions/723828/mt7921e-00000900-0-hardware-init-failed
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2006797

I cannot unplug my battery unfortunately.
Comment 9 Artem S. Tashkinov 2023-07-14 08:06:33 UTC
There's a patch which mentions this issue specifically but I'm not sure if Fedora's 6.5-rc1 kernel includes it or if it requires newer firmware:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/drivers/net/wireless/mediatek/mt76/mt7921?id=525c469e5de9bf7e53574396196e80fc716ac9eb

commit	525c469e5de9bf7e53574396196e80fc716ac9eb
mt7921e: fix init command fail with enabled device
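
One rough way to check whether a given kernel tree or changelog carries this patch is to search for its subject line, since backports to stable trees get a different commit hash. A minimal sketch; the helper name has_mt7921e_init_fix is made up, and the git usage in the comment assumes a local clone of a kernel tree with the relevant tags fetched.

```shell
# Succeed if the text on stdin mentions the patch subject quoted above.
# Matching on the subject rather than the hash also catches stable backports.
has_mt7921e_init_fix() {
    grep -q -F 'fix init command fail with enabled device'
}

# Typical use, assuming a kernel git checkout (tag names are illustrative):
#   git log --oneline v6.4.3..v6.4.4 | has_mt7921e_init_fix && echo "fix present"
```

This says nothing about distribution kernels that carry extra patches; for those, the package changelog is the better source.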
Comment 10 Artem S. Tashkinov 2023-07-14 08:17:42 UTC
Fedora's 6.5-rc1 kernel doesn't yet include this patch. Darn, I absolutely need to use the laptop and I don't know what to do. There's no LAN either.

Running Fedora's rawhide kernel is not an option either. It's full of debugging, it's slow and rc1 doesn't sound even remotely stable.
Comment 11 Artem S. Tashkinov 2023-07-14 09:40:22 UTC
Bluetooth also works atrociously: it often disconnects, the maximum speed is 700 kbps, and pings see delays of up to 4000 ms. That's not how Bluetooth 5.3 should work, so I'll probably replace my laptop's M.2 WiFi module with Intel's AX210. Mediatek seemingly doesn't care much about Linux.
Comment 12 Artem S. Tashkinov 2023-07-20 08:44:10 UTC
Must be fixed in 6.4.4.