Bug 215787 - mt7921: panic on unbind - drivers/net/wireless/mediatek/mt76/mt7921/pci.c:mt7921_pci_remove
Status: RESOLVED PATCH_ALREADY_AVAILABLE
Alias: None
Product: Drivers
Classification: Unclassified
Component: network-wireless
Hardware: AMD Linux
Importance: P1 high
Assignee: drivers_network-wireless@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2022-04-01 03:49 UTC by Thiner Logoer
Modified: 2023-03-08 19:50 UTC

See Also:
Kernel Version: latest as of 2022-04-01, and 5.17, 5.16, 5.15 ...
Subsystem:
Regression: No
Bisected commit-id:


Description Thiner Logoer 2022-04-01 03:49:22 UTC
The issue is reported at: https://github.com/QubesOS/qubes-issues/issues/7294

`mt7921_pci_remove` seems to always crash whenever it is called.

This can be reproduced by echoing the PCI address of the Wi-Fi device (for example `0000:00:07.0`) to `/sys/bus/pci/drivers/mt7921e/unbind`.

I have read the kernel source code and have a guess.

```
static void mt7921_pci_remove(struct pci_dev *pdev)
{
	struct mt76_dev *mdev = pci_get_drvdata(pdev);
	struct mt7921_dev *dev = container_of(mdev, struct mt7921_dev, mt76);

	mt7921e_unregister_device(dev);
	devm_free_irq(&pdev->dev, pdev->irq, dev);
	pci_free_irq_vectors(pdev);
}
```

From my newbie kernel knowledge, I suspect that `mt7921_pci_remove` should call `devm_free_irq` first and `mt7921e_unregister_device` second. `devm_free_irq` calls `free_irq`, which "does not return until any executing interrupts for this IRQ have completed" according to the comment there. The IRQ handler for mt7921 certainly uses some fields in `dev`, so `dev` should not be unregistered until the IRQ has been freed.
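To illustrate the ordering concern, here is a minimal userspace model. It is not kernel code. Every `fake_*` name is a hypothetical stand-in rather than an mt76 or kernel API, and it assumes that unregistering invalidates state the interrupt handler reads. It only shows why an interrupt arriving between the unregister step and the IRQ release would see torn-down state, and why releasing the IRQ first closes that window. It is a sketch of my guess, not necessarily what the actual fix does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the driver's private state (not struct mt7921_dev). */
struct fake_dev {
	bool registered;  /* false once the unregister step has torn state down */
	int irq_count;    /* interrupts handled while the state was still valid */
	int stale_count;  /* interrupts handled after teardown (the suspected bug) */
};

/* Hypothetical stand-in for an IRQ line with a single handler attached. */
struct fake_irq {
	void (*handler)(struct fake_dev *dev);
	struct fake_dev *dev;
};

static void fake_irq_handler(struct fake_dev *dev)
{
	if (dev->registered)
		dev->irq_count++;
	else
		dev->stale_count++;  /* in a real driver this would touch freed state */
}

/* Models free_irq(): once this returns, the handler can no longer run. */
static void fake_free_irq(struct fake_irq *irq)
{
	irq->handler = NULL;
}

/* Models the unregister step invalidating state the handler depends on. */
static void fake_unregister(struct fake_dev *dev)
{
	dev->registered = false;
}

/* Models the hardware raising an interrupt at an arbitrary moment. */
static void fake_raise_irq(struct fake_irq *irq)
{
	if (irq->handler)
		irq->handler(irq->dev);
}

/* Ordering in the quoted function: unregister first, then free the IRQ. */
static int remove_unregister_first(void)
{
	struct fake_dev dev = { .registered = true };
	struct fake_irq irq = { fake_irq_handler, &dev };

	fake_unregister(&dev);
	fake_raise_irq(&irq);  /* interrupt lands inside the unsafe window */
	fake_free_irq(&irq);
	return dev.stale_count;
}

/* Suggested ordering: free the IRQ first, then unregister. */
static int remove_free_irq_first(void)
{
	struct fake_dev dev = { .registered = true };
	struct fake_irq irq = { fake_irq_handler, &dev };

	fake_free_irq(&irq);
	fake_unregister(&dev);
	fake_raise_irq(&irq);  /* handler already detached, so nothing runs */
	return dev.stale_count;
}
```

In this model, `remove_unregister_first()` reports one interrupt handled after teardown, while `remove_free_irq_first()` reports none.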

My original email is at https://lore.kernel.org/linux-wireless/153f1a0c.36a0.17fba8be75c.Coremail.logoerthiner1@163.com/T/ but it seems the mailing list was not the correct place.
Comment 1 Iakunin Andrei 2022-05-13 15:35:05 UTC
I have the same issue with my ThinkPad E14 Gen 3 with AMD Ryzen 3 5300U.
Laptop did not wake up after suspend.

Device-2: MEDIATEK MT7921 802.11ax PCI Express Wireless Network Adapter driver: mt7921e
 

The proposed patch is good for the 5.18+ kernel branch, but it can easily be adapted for kernels 5.17 and earlier.
I patched my 5.15.25 kernel with it and it fixed the problem.
Comment 2 Mario Limonciello (AMD) 2023-03-08 19:49:52 UTC
I came across this issue and wanted to share that it was fixed last year, in mainline kernel 5.19 and later, with this commit:

ad483ed9dd51 ("mt76: mt7921: fix kernel crash at mt7921_pci_remove")
