Bug 202399
Summary: | iwlwifi: 8260: wireless module found on cold boot but not on reboot | ||
---|---|---|---|
Product: | Drivers | Reporter: | Sylvain Leroux (sylvain) |
Component: | network-wireless-intel | Assignee: | DO NOT USE - assign "network-wireless-intel" component instead (linuxwifi) |
Status: | ASSIGNED --- | ||
Severity: | normal | CC: | bjorn, hkallweit1, luca, sylvain |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 4.9 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
dmesg after a cold-boot
dmesg after reboot
hwinfo after a cold-boot
hwinfo after a reboot
lspci -xxxvvv after coldboot
lspci -xxxvvv after reboot
acpidump (after coldboot FWIW) |
Description
Sylvain Leroux
2019-01-24 14:38:41 UTC
Created attachment 280721 [details]
dmesg after reboot
Created attachment 280723 [details]
hwinfo after a cold-boot
Created attachment 280725 [details]
hwinfo after a reboot
Attached above are the dmesg/hwinfo gathered after a cold- and warm-boot.
The Intel 8260 is on /devices/pci0000:00/0000:00:13.1.
I am running the 4.9 kernel on a Debian Stretch distribution:
> $ uname -a
> Linux nosferapti 4.9.0-8-amd64 #1 SMP Debian 4.9.130-2 (2018-10-27) x86_64 GNU/Linux
Let me know if you need more info.
Regards,
- Sylvain
Can you please add the output of `sudo lspci -xxxvvv` after a cold boot and after reboot? Thanks.

Created attachment 280727 [details]
lspci -xxxvvv after coldboot
Created attachment 280729 [details]
lspci -xxxvvv after reboot

Thanks for your interest in this issue, Emmanuel. I attached the `lspci -xxxvvv` output above. FWIW, as a workaround, adding the `reboot=pci` kernel boot option ensures the 8260 is consistently detected after a reboot.

Adding the PCI maintainer.

There have been fixes in PCI lately. Those fixes aren't included in mainline yet. Could you please test mainline + patches from: https://bugzilla.kernel.org/show_bug.cgi?id=201469#c48
Thanks.

I doubt the ASPM fixes from bug 201469 are related, but this does seem to have some PCI wrinkle to it. This system has:

  00:13.1 bridge to [bus 02]
  02:00.0 iwlwifi NIC

After the reboot, the 00:13.1 bridge has its secondary bus number programmed (02), but we didn't enumerate 02:00.0 and the bridge memory window is left disabled:

  - Memory behind bridge: a1300000-a13fffff
  + Memory behind bridge: fff00000-000fffff

After the reboot, BIOS would enumerate PCI devices and configure things, then Linux would enumerate everything again. My guess is that the NIC isn't responding to config reads after the reboot. That would mean the BIOS wouldn't find the NIC, so it would leave the bridge window disabled, and Linux also wouldn't find the NIC.

If the NIC doesn't respond to config accesses, Linux doesn't know it even exists, so the possibilities for a workaround are somewhat limited.

You might be able to fiddle with this theory by attempting another reset of the NIC by asserting the bridge's Secondary Bus Reset bit, e.g.,

  # setpci -s00:13.1 BRIDGE_CONTROL.w=0x40
  # setpci -s00:13.1 BRIDGE_CONTROL.w=0x00
  # echo 1 > /sys/bus/pci/rescan

If that's the case, then it may be an "integration problem". I don't know much about all this, but I have heard that the device needs the platform/BIOS to write to a specific place on the device during certain flows. I don't have more details about this, but when we have such bugs on self-made systems, it may be related.
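
For readers unfamiliar with the `fff00000-000fffff` notation in the lspci diff quoted above, here is a minimal sketch in C of how a PCI-to-PCI bridge's 32-bit memory window is decoded and why that range means "disabled". The struct and function names are made up for illustration, and the raw register values in the usage note are inferred from the two lspci lines, not read from the attachments.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded 32-bit (non-prefetchable) memory window of a PCI-to-PCI bridge. */
struct bridge_mem_window {
	uint32_t base;   /* first address forwarded to the secondary bus */
	uint32_t limit;  /* last address forwarded to the secondary bus */
};

/*
 * Decode the 16-bit Memory Base (config offset 0x20) and Memory Limit
 * (config offset 0x22) registers. Bits 15:4 of each register hold
 * address bits 31:20; the low 20 address bits are implied: all zeros
 * for the base, all ones for the limit.
 */
struct bridge_mem_window decode_mem_window(uint16_t base_reg, uint16_t limit_reg)
{
	struct bridge_mem_window w;

	w.base = (uint32_t)(base_reg & 0xfff0) << 16;
	w.limit = ((uint32_t)(limit_reg & 0xfff0) << 16) | 0x000fffffu;
	return w;
}

/* A window whose base lies above its limit forwards nothing downstream. */
bool mem_window_enabled(struct bridge_mem_window w)
{
	return w.base <= w.limit;
}
```

Under these assumptions, `decode_mem_window(0xa130, 0xa130)` gives the cold-boot window a1300000-a13fffff (enabled), while `decode_mem_window(0xfff0, 0x0000)` gives fff00000-000fffff, where the base is above the limit, so the bridge forwards no memory transactions to bus 02.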

The platform/BIOS has no way to touch the NIC itself if it's not responding to config reads. There could certainly be something else in the chipset, e.g., in the ICH, that is relevant.

We do have several DMI quirks that use set_pci_reboot() to essentially do "reboot=pci" automatically. We could add a similar quirk for this system. I don't think that's *ideal* because presumably Windows reboots cleanly without such a quirk, and there may be many systems with this issue and we may be adding such quirks frequently. But maybe that's the only option, since we don't know any other way to fix this.

ACPI provides information about how to reboot, so it'd be nice to have an acpidump attached here just in case we can figure out a more generic fix in the future.

Sylvain, do you want to provide the required information here? I am not sure we'll be able to do much without this. If not, I'll close.

I think the dmesg log has enough information to write the quirk. The acpidump is a "nice to have" for possible future improvements. If we do go the quirk route, who wants to write it? It's not hard and I think there are existing ones we can copy, but it would take me a couple days before I have a chance.

Emmanuel, I do not have access to the device for now. Probably next week. Sorry for the delay.

(In reply to Bjorn Helgaas from comment #16)
> I think the dmesg log has enough information to write the quirk. The
> acpidump is a "nice to have" for possible future improvements. If we do go
> the quirk route, who wants to write it? It's not hard and I think there are
> existing ones we can copy, but it would take me a couple days before I have
> a chance.

I don't really know how to do that. I can learn, sure, but it'll have to wait since I am busy as well. Another thing is that I am not even sure we need to go there. After all, this system is a self-made system and we probably don't want to add a quirk for every system a user may build?

From Emmanuel Grumbach in comment 18:
> Another thing is that I am not even sure we need to go there. After all, this
> system is a self made system [...]

Well, it all depends on your definition of a self-made system. It is not some exotic system made of brandless parts. The motherboard is an ASRock with an embedded Intel CPU. The only real change compared to the stock motherboard was the addition of a couple of RAM and WiFi modules.

From Bjorn Helgaas in comment 14:
> I don't think that's *ideal* because presumably Windows reboots cleanly
> without such a quirk, and there may be many systems with this issue and we
> may be adding such quirks frequently. But maybe that's the only option,
> since we don't know any other way to fix this.
>
> ACPI provides information about how to reboot, so it'd be nice to have an
> acpidump attached here just in case we can figure out a more generic fix in
> the future.

Surely, a generic fix would be better. I will do what is necessary to send you the `acpidump` you've requested ASAP.

That being said, if the only solution is to add a system-specific quirk, well, at least that would make the system work "out of the box".

Created attachment 281353 [details]
acpidump (after coldboot FWIW)

We need to double-check whether there's anything we can do about this.

I have the same issue with an AX210 card on a Zotac ZBOX CI327 nano (N3450 CPU, linux-next kernel from 11/27/2020). The card isn't listed by lspci after a reboot; `reboot=pci` helps. So far I'm going with the following private change. However, it may be unfair to blame the system if the root cause turns out to be the card's behavior.

diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index db115943e..9991c5920 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -477,6 +477,15 @@ static const struct dmi_system_id reboot_dmi_table[] __initconst = {
 		},
 	},
 
+	{	/* PCIe Wifi card isn't detected after reboot otherwise */
+		.callback = set_pci_reboot,
+		.ident = "Zotac ZBOX CI327 nano",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "NA"),
+			DMI_MATCH(DMI_PRODUCT_NAME, "ZBOX-CI327NANO-GS-01"),
+		},
+	},
+
 	/* Sony */
 	{	/* Handle problems with rebooting on Sony VGN-Z540N */
 		.callback = set_bios_reboot,
-- 
2.29.2
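
The quirk in the patch above depends on how the kernel matches DMI table entries: as far as I know, `DMI_MATCH` is a substring test, and an entry applies only when every listed field matches. The following is a rough user-space sketch of that rule in C, not the kernel's implementation; `enum dmi_slot`, `struct dmi_rule`, `struct reboot_quirk` and `quirk_matches()` are hypothetical stand-ins invented for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's DMI field identifiers. */
enum dmi_slot { SLOT_SYS_VENDOR, SLOT_PRODUCT_NAME, SLOT_COUNT };

/* One "field must contain this substring" rule, in the spirit of DMI_MATCH. */
struct dmi_rule {
	enum dmi_slot slot;
	const char *substr;
};

/* Simplified quirk entry: a label plus the rules that must all hold. */
struct reboot_quirk {
	const char *ident;
	struct dmi_rule rules[2];
};

/*
 * Return true only if every rule's substring occurs in the corresponding
 * DMI field. A missing (NULL) field never matches.
 */
bool quirk_matches(const struct reboot_quirk *q,
		   const char *const fields[SLOT_COUNT])
{
	for (size_t i = 0; i < sizeof(q->rules) / sizeof(q->rules[0]); i++) {
		const char *field = fields[q->rules[i].slot];

		if (!field || !strstr(field, q->rules[i].substr))
			return false;
	}
	return true;
}
```

With rules equivalent to the patch (`"NA"` for the vendor, `"ZBOX-CI327NANO-GS-01"` for the product name), a board reporting exactly those strings matches and one with a different product name does not. Because the vendor string `"NA"` is so generic, the product name is what actually identifies the machine in this sketch. On a running system the two strings can be read from `/sys/class/dmi/id/sys_vendor` and `/sys/class/dmi/id/product_name`.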