Bug 214557

Summary: mt7921e: probe of 0000:05:00.0 failed with error -110
Product: Drivers
Component: network-wireless
Status: RESOLVED PATCH_ALREADY_AVAILABLE
Severity: normal
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 5.15-rc2
Subsystem:
Regression: No
Bisected commit-id:
Reporter: Mike Lothian (mike)
Assignee: drivers_network-wireless (drivers_network-wireless)
CC: bruno.n.pagani, jifengshenmo, kernel, mapengyu, mike, realskorpion, sean.mcauliffe, sephiroth_pk, thanhdatwarriorok, urugang
Attachments: Full dmesg, Bug

Description Mike Lothian 2021-09-28 10:56:03 UTC
Created attachment 298995 [details]
Full dmesg

Sometimes when I reboot my laptop the driver doesn't initialise properly

mt7921e 0000:05:00.0: enabling device (0000 -> 0002)
mt7921e 0000:05:00.0: disabling ASPM  L1
mt7921e 0000:05:00.0: ASIC revision: ff0c0000
mt7921e: probe of 0000:05:00.0 failed with error -110
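
For anyone triaging similar logs: in the generic Linux errno numbering, 110 is ETIMEDOUT, so "error -110" means the probe timed out. Below is a minimal Python sketch, not part of the driver or of this report, that pulls the ASIC revision and probe result out of mt7921e dmesg lines. The function name and the one-entry errno table are illustrative assumptions.

```python
import re

# Subset of the generic Linux errno numbering; only the value seen in this report.
LINUX_ERRNO_NAMES = {110: "ETIMEDOUT"}

ASIC_RE = re.compile(r"mt7921e (\S+): ASIC revision: ([0-9a-fA-F]+)")
PROBE_FAIL_RE = re.compile(r"mt7921e: probe of (\S+) failed with error (-?\d+)")
INIT_DONE_RE = re.compile(r"mt7921e (\S+): Firmware init done")


def _entry(devices, address):
    """Return the record for a PCI address, creating it on first use."""
    return devices.setdefault(address, {"asic": None, "status": "unknown"})


def summarize_mt7921e(lines):
    """Map each PCI address to its last reported ASIC revision and probe status."""
    devices = {}
    for line in lines:
        match = ASIC_RE.search(line)
        if match:
            _entry(devices, match.group(1))["asic"] = match.group(2).lower()
            continue
        match = PROBE_FAIL_RE.search(line)
        if match:
            code = abs(int(match.group(2)))
            name = LINUX_ERRNO_NAMES.get(code, f"errno {code}")
            _entry(devices, match.group(1))["status"] = f"probe failed ({name})"
            continue
        match = INIT_DONE_RE.search(line)
        if match:
            _entry(devices, match.group(1))["status"] = "firmware init done"
    return devices
```

The patterns use `search`, so lines carrying a syslog or timestamp prefix are handled the same way as bare kernel messages.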

lspci -nn

05:00.0 Network controller [0280]: MEDIATEK Corp. Device [14c3:7961]

Rebooting tends not to fix things, but switching off then switching on works

This is on a new laptop and I'm not sure if it's a regression

The module and firmware are built into the kernel and it's a small EFI stub setup

Here are the successful messages after a cold start:

mt7921e 0000:05:00.0: enabling device (0000 -> 0002)
mt7921e 0000:05:00.0: disabling ASPM  L1
mt7921e 0000:05:00.0: ASIC revision: 79610010
mt7921e 0000:05:00.0: HW/SW Version: 0x8a108a10, Build Time: 20210612122717a
mt7921e 0000:05:00.0: WM Firmware Version: ____010000, Build Time: 20210612122753
mt7921e 0000:05:00.0: Firmware init done
<info>  [1632825882.0267] rfkill0: found Wi-Fi radio killswitch (at /sys/devices/pci0000:00/0000:00:02.2/0000:05:00.0/ieee80211/phy0/rfkill0) (driver mt7921e)
Comment 1 Pengyu Ma 2021-10-21 15:26:17 UTC
error in log:
mt7921e 0000:05:00.0: ASIC revision: ff0c0000

It could be a firmware error.
Please try the latest firmware.
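
One way to flag a suspicious readback, sketched in Python: in the logs of this report, the successful boot prints ASIC revision 79610010, whose upper 16 bits equal the PCI device ID 7961 shown by lspci, while the failing boots print ff0c0000. The check below assumes that relationship holds in general, which this thread does not confirm. The function name is hypothetical.

```python
def asic_revision_matches_device(asic_revision: str, pci_device_id: str) -> bool:
    """Return True if the upper 16 bits of the ASIC revision equal the PCI device ID.

    Both arguments are hexadecimal strings without a 0x prefix, as printed
    by the driver ("79610010") and by lspci -nn ("7961").
    Assumption: a healthy readback embeds the device ID in the upper half.
    """
    revision = int(asic_revision, 16)
    device_id = int(pci_device_id, 16)
    return (revision >> 16) == device_id
```

This only flags an implausible value. A plausible revision does not by itself mean the probe will succeed.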
Comment 2 Mike Lothian 2021-10-21 15:28:21 UTC
I already tested the latest firmware when I noticed it had been updated; unfortunately there was no improvement
Comment 3 Mike Lothian 2021-10-25 03:00:38 UTC
I've also tested the updates for the next kernel here:

https://github.com/nbd168/wireless

But the issue still happens:

Oct 25 03:56:21 axion.fireburn.co.uk kernel: mt7921e 0000:05:00.0: enabling device (0000 -> 0002)
Oct 25 03:56:21 axion.fireburn.co.uk kernel: mt7921e 0000:05:00.0: ASIC revision: ff000000
Oct 25 03:56:21 axion.fireburn.co.uk kernel: mt7921e: probe of 0000:05:00.0 failed with error -110
Comment 4 Mike Lothian 2021-12-03 09:56:52 UTC
Re-tested with all the patches at https://github.com/nbd168/wireless which I guess is for 5.17 - same issue
Comment 5 Mike Simos 2021-12-06 20:43:35 UTC
I'm running into the same problem. On reboot with my Asus g533qs I get:

Dec  6 08:40:44 localhost kernel: [   12.230358] mt7921e 0000:03:00.0: enabling device (0000 -> 0002)
Dec  6 08:40:44 localhost kernel: [   12.230692] mt7921e 0000:03:00.0: disabling ASPM  L1
Dec  6 08:40:44 localhost kernel: [   12.230751] mt7921e 0000:03:00.0: ASIC revision: 79610010
Dec  6 08:40:44 localhost kernel: [   13.304760] mt7921e: probe of 0000:03:00.0 failed with error -110

However when I power off and power on, I get:

Dec  6 08:42:37 localhost kernel: [   11.394899] mt7921e 0000:03:00.0: enabling device (0000 -> 0002)
Dec  6 08:42:37 localhost kernel: [   11.395114] mt7921e 0000:03:00.0: disabling ASPM  L1
Dec  6 08:42:37 localhost kernel: [   11.395143] mt7921e 0000:03:00.0: ASIC revision: 79610010
Dec  6 08:42:37 localhost kernel: [   11.474166] mt7921e 0000:03:00.0: HW/SW Version: 0x8a108a10, Build Time: 20211014150838a
Dec  6 08:42:37 localhost kernel: [   11.731801] mt7921e 0000:03:00.0: WM Firmware Version: ____010000, Build Time: 20211014150922
Dec  6 08:42:37 localhost kernel: [   11.757965] mt7921e 0000:03:00.0: Firmware init done
Dec  6 08:42:37 localhost kernel: [   12.574961] mt7921e 0000:03:00.0 wlp3s0: renamed from wlan0


This problem doesn't happen if I swap out the wifi card and use an Intel AX200NGW, so the issue seems to be isolated to the MediaTek 7921.
Comment 6 urugang 2021-12-13 22:30:02 UTC
On 5.15.7, the driver only initialises wlan0 successfully if I shut down the machine, unplug the power cable, and then power on.
Comment 7 tuwuna 2022-01-02 06:24:51 UTC
Same issue here. I have to do a cold boot without charger plugged in to restore wifi

Discussion on github: https://github.com/openwrt/mt76/issues/548
Comment 8 tuwuna 2022-01-02 06:31:52 UTC
The problem reappears every time I reboot with the charger plugged in.
Comment 10 Mike Lothian 2022-01-20 09:25:06 UTC
When booting with this patch applied to Linus's tree, the boot stops with a BUG
Comment 11 Mike Lothian 2022-01-20 09:25:34 UTC
Created attachment 300291 [details]
Bug
Comment 12 Mike Lothian 2022-01-20 09:48:30 UTC
I don't see this issue if I compile mt76 as a module
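
To tell which case a given system is in, a small Python sketch can look up a driver in the kernel's `modules.builtin` list, normally found at `/lib/modules/<release>/modules.builtin`. The function name is hypothetical, and the file may be absent on minimal setups that install no modules directory.

```python
from pathlib import Path


def is_builtin(module_name: str, modules_builtin_text: str) -> bool:
    """Return True if module_name appears in the given modules.builtin contents.

    Each line of modules.builtin is a path ending in "<name>.ko".
    Dashes and underscores are treated as equivalent in module names.
    """
    wanted = module_name.replace("-", "_")
    for line in modules_builtin_text.splitlines():
        file_name = Path(line.strip()).name
        if not file_name.endswith(".ko"):
            continue
        if file_name[:-3].replace("-", "_") == wanted:
            return True
    return False
```

A typical call would read the file for the running kernel, for example `Path("/lib/modules", platform.release(), "modules.builtin").read_text()`, and pass the text together with `"mt7921e"`.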
Comment 13 Mike Lothian 2022-01-25 22:57:14 UTC
That patch does fix the issue; the BUG I was seeing came from an unrelated problem

The fix for that is https://patchwork.kernel.org/project/linux-usb/patch/20220124090228.41396-2-heikki.krogerus@linux.intel.com/
Comment 14 Bruno Pagani 2022-02-12 21:51:45 UTC
Would it be possible to backport it to the current kernel lines? AFAICS it is merged for 5.17, which will likely not be out for another month. Also, not all distros upgrade their kernel to newer lines, so…
Comment 15 Mike Lothian 2022-02-12 23:08:15 UTC
This patch should work against 5.17-rc

https://raw.githubusercontent.com/FireBurn/KernelStuff/master/04-mt76.patch

git am 04-mt76.patch

And should get warm reboots working 

Not sure what would be needed to get this working on 5.16

The patches I think are due for 5.18 😭
Comment 16 Bruno Pagani 2022-02-13 07:50:45 UTC
No, you can see here https://github.com/torvalds/linux/commit/7817adb03cfb52ebb5bdb25fd9fc8f683a1a09d9 that it has been published as part of 5.17-rc2 (and thus if you use -rc2 or -rc3 instead of -rc, you don’t need to manually add the patch). So it should be in 5.17 release.
Comment 17 shenmo 2022-02-15 00:44:17 UTC
Thanks a lot for the solution.
I'll wait for the 5.17 release
Comment 18 Riccardo Robecchi 2022-03-23 09:49:02 UTC
As far as my ability to read code goes, I would say that the commit linked by Bruno does not have anything to do with this issue, as it looks to be about USB Type-C and not about MediaTek network devices. My apologies if I am mistaken.

The issue appears to be there still with Linux 5.17.0. dmesg says:
mt7921e 0000:04:00.0: enabling device (0000 -> 0002)
mt7921e 0000:04:00.0: ASIC revision: 79610010
mt7921e: probe of 0000:04:00.0 failed with error -110

This is on my desktop system which includes an ASUS ROG Strix X570-E motherboard.
Comment 19 shenmo 2022-03-23 10:04:34 UTC
(In reply to Riccardo Robecchi from comment #18)
> As far as my ability to read code goes, I would say that the commit linked
> by Bruno does not have anything to do with this issue, as it looks to be
> about USB Type-C and not about MediaTek network devices. My apologies if I
> am mistaken.
> 
> The issue appears to be there still with Linux 5.17.0. dmesg says:
> mt7921e 0000:04:00.0: enabling device (0000 -> 0002)
> mt7921e 0000:04:00.0: ASIC revision: 79610010
> mt7921e: probe of 0000:04:00.0 failed with error -110
> 
> This is on my desktop system which includes an ASUS ROG Strix X570-E
> motherboard.

It's so sad...

So the bug is still unsolved?
Plus, I noticed that in some situations, for example when I reboot my laptop from Linux into Windows, the device also stops working. It seems to be a hardware bug.
Comment 20 Mike Lothian 2022-03-23 10:10:35 UTC
The patch that fixes this will be included in 5.18-rc1. I've asked about it being backported but I've not heard back

Alternatively you can apply https://raw.githubusercontent.com/FireBurn/KernelStuff/master/04-mt76.patch onto 5.17
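
As a rough way to check a running kernel against that cut-off, here is a Python sketch that orders mainline-style version strings so that release candidates sort before the matching release. The helper names are hypothetical, and the version alone is only an indicator, since a stable or distro kernel may carry a backport or the manually applied patch.

```python
import math
import re

VERSION_RE = re.compile(r"^(\d+)\.(\d+)(?:\.(\d+))?(?:-rc(\d+))?")


def kernel_version_key(version: str):
    """Turn '5.17.0', '5.18-rc1' or '5.15.7-arch1-1' into a sortable tuple."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError(f"unrecognised kernel version: {version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    patch = int(match.group(3) or 0)
    # A final release sorts after every release candidate of the same version.
    rc = int(match.group(4)) if match.group(4) else math.inf
    return (major, minor, patch, rc)


def is_at_least(version: str, minimum: str) -> bool:
    """Return True if version is the same as or newer than minimum."""
    return kernel_version_key(version) >= kernel_version_key(minimum)
```

For example, `is_at_least(platform.release(), "5.18-rc1")` compares the running kernel with the version named above. Any distro suffix after the numeric part is ignored.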
Comment 21 Bruno Pagani 2022-03-23 11:11:49 UTC
(In reply to Riccardo Robecchi from comment #18)
> As far as my ability to read code goes, I would say that the commit linked
> by Bruno does not have anything to do with this issue, as it looks to be
> about USB Type-C and not about MediaTek network devices. My apologies if I
> am mistaken.

Yes sorry, I mixed things up with the other patch linked by Mike about his unrelated issue…