Bug 201319 - iwlwifi: 9260: can't get MAC READY after INIT DONE
Status: CLOSED DOCUMENTED
Alias: None
Product: Drivers
Classification: Unclassified
Component: network-wireless
Hardware: Intel Linux
Importance: P1 normal
Assignee: DO NOT USE - assign "network-wireless-intel" component instead
Reported: 2018-10-03 10:05 UTC by Andreas Backx
Modified: 2020-08-16 23:57 UTC

Kernel Version: 4.18.10-arch1-1-ARCH
Regression: No

Attachments
Output of proposed dump_stack() call (4.86 KB, text/plain)
2019-01-18 19:40 UTC, Taneli Huuskonen
Debug outputs, with and without bug (18.08 KB, application/octet-stream)
2019-03-03 09:32 UTC, Taneli Huuskonen
Corrected debug output (1.46 KB, text/plain)
2019-03-03 10:29 UTC, Taneli Huuskonen
Debug output again (1.47 KB, text/plain)
2019-03-03 11:00 UTC, Taneli Huuskonen
debug-output_2019-03-03-16-00-CET.txt (1.47 KB, text/plain)
2019-03-03 15:07 UTC, David Niehues
print fseq version (777 bytes, patch)
2019-05-17 12:50 UTC, Emmanuel Grumbach

Description Andreas Backx 2018-10-03 10:05:05 UTC
My Intel 9260 card seems to be timing out on boot. The first time I connected the card it was recognised and showed up as wspXXsX but I didn't configure it immediately and it was gone after a reboot. (I don't remember whether I did anything that might've caused this.) This is the exact product I got: https://www.gigabyte.com/Motherboard/GC-WB1733D-I-rev-10#ov

Bluetooth works perfectly fine, but Wi-Fi does not seem to work.


Some information:

$ lspci -k
23:00.0 Network controller: Intel Corporation Wireless-AC 9260 (rev 29)
Subsystem: Intel Corporation Wireless-AC 9260
Kernel modules: iwlwifi

$ dmesg | grep wifi
[    4.395631] iwlwifi 0000:23:00.0: enabling device (0000 -> 0002)
[    4.450137] iwlwifi: probe of 0000:23:00.0 failed with error -110
[   18.548039] Modules linked in: hid_logitech_hidpp cmac bnep hid_logitech_dj btusb nls_iso8859_1 nls_cp437 btrtl btbcm vfat btintel fat bluetooth hid_sony ff_memless edac_mce_amd joydev kvm_amd input_leds mousedev led_class iwlwifi ecdh_generic snd_hda_codec_realtek kvm snd_hda_codec_generic snd_hda_codec_hdmi snd_hda_intel cfg80211 snd_usb_audio snd_hda_codec irqbypass crct10dif_pclmul crc32_pclmul snd_usbmidi_lib snd_hda_core ghash_clmulni_intel snd_rawmidi pcbc snd_hwdep snd_seq_device snd_pcm aesni_intel snd_timer ccp aes_x86_64 crypto_simd snd cryptd sp5100_tco ppdev wmi_bmof r8169 glue_helper pcspkr i2c_piix4 k10temp soundcore rng_core mii rfkill parport_pc parport wmi gpio_amdpt evdev pinctrl_amd mac_hid vboxnetflt(OE) vboxnetadp(OE) vboxpci(OE) vboxdrv(OE) i2c_dev crypto_user ip_tables x_tables

$ dmesg | grep firmware
[    1.251085] [drm] Found UVD firmware Version: 1.87 Family ID: 17
[    1.251088] [drm] PSP loading UVD firmware
[    1.251552] [drm] Found VCE firmware Version: 55.3 Binary ID: 4
[    1.251555] [drm] PSP loading VCE firmware

I've set up crda and my regulatory domain, which fixed an error in these logs. However, it didn't change the error I get from iwlwifi.

If there's any more information I can provide, feel free to ask.
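
For readers decoding the log above: on Linux, error -110 is -ETIMEDOUT, so the probe gave up waiting on the device rather than rejecting it. A minimal sketch of decoding such lines follows. The helper names `decode_errno` and `decode_probe_line` are made up for illustration, and only the two codes that appear in this thread are handled.

```shell
#!/bin/sh
# Map the (negative) kernel error codes seen in this thread to errno names.
# Hypothetical helper; it covers only -2 and -110.
decode_errno() {
    case "${1#-}" in
        2)   echo "ENOENT (no such file or directory)" ;;
        110) echo "ETIMEDOUT (connection timed out)" ;;
        *)   echo "unknown" ;;
    esac
}

# Pull the code out of a "failed with error -N" log line and decode it.
decode_probe_line() {
    code=$(printf '%s\n' "$1" |
        sed -n 's/.*failed with error \(-\{0,1\}[0-9][0-9]*\).*/\1/p')
    [ -n "$code" ] && decode_errno "$code"
}

decode_probe_line "iwlwifi: probe of 0000:23:00.0 failed with error -110"
```

The same helper can be fed lines from `dmesg | grep 'failed with error'`.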
Comment 1 Andreas Backx 2018-10-08 14:49:37 UTC
I've updated to kernel 4.18.12-arch1-1-ARCH and am no longer encountering the issue. Wi-Fi is working fine now and so is Bluetooth.
Comment 2 Luca Coelho 2018-10-08 17:48:36 UTC
Thanks for reporting back!
Comment 3 Andreas Backx 2018-10-09 12:58:27 UTC
Unfortunately, today I'm getting the same issue again. I have a dual-boot setup with Windows. Maybe booting into Windows and then back into Arch caused something to change?
Comment 4 Andreas Backx 2018-10-12 15:22:25 UTC
And now it works again. I guess it's a lottery with me given the lowest odds.
Comment 5 Luca Coelho 2018-10-13 07:08:32 UTC
We'll assign this to someone to see if we can figure anything out.  Please keep us posted in case you find anything out.
Comment 6 hgminh95 2018-10-14 12:17:51 UTC
I have the same issue. From what I see, if I restart from Windows, I can use Wi-Fi afterward. But if I shut down from Windows, I get the timeout error in iwlwifi.
Comment 7 Andreas Backx 2018-10-16 13:51:46 UTC
I have tested what hgminh95 says and I am experiencing the same. Restarting from Windows means that I can use it in Linux, and shutting down from Windows means I cannot. Very odd.
Comment 8 Heikki Parviainen 2018-11-13 04:16:12 UTC
I have been experiencing the same thing when dual-booting with Windows. After rebooting from Windows into Ubuntu 18.04.1 there is no Wi-Fi; after rebooting from Ubuntu into Ubuntu, Wi-Fi starts working.

My wifi hw is:
~$ sudo lspci -knn | grep Wire -A3
Network controller [0280]: Intel Corporation Wireless 8265 / 8275 [8086:24fd] (rev 78)
Subsystem: Intel Corporation Wireless 8265 / 8275 [8086:8110]
Kernel driver in use: iwlwifi
Kernel modules: iwlwifi
Comment 9 Colin Tucker 2018-11-13 21:57:19 UTC
I also dual boot with Windows and KDE neon 18.04, and I too have issues with iwlwifi.  However, I have noticed it appears to be from cold boot only.  The first time I power on and boot into KDE neon, my wifi almost certainly will not work.  After rebooting, it works fine. The mobo is a Gigabyte X470 Aorus Gaming 7 Wifi, with BIOS F4g (BIOS F4 and F5 are newer, but they break virtualization and suspend within Linux due to an AGESA bug).

Here are my details:

$ lspci -k

07:00.0 Network controller: Intel Corporation Device 2526 (rev 29)
        Subsystem: Intel Corporation Device 0014
        Kernel driver in use: iwlwifi
        Kernel modules: iwlwifi

$ dmesg | grep wifi

[    6.668309] iwlwifi 0000:07:00.0: enabling device (0000 -> 0002)
[    6.792193] iwlwifi: probe of 0000:07:00.0 failed with error -110

$ dmesg | grep firmware

[    6.599143] iwlwifi 0000:07:00.0: loaded firmware version 38.c0e03d94.0 op_mode iwlmvm
Comment 10 Emmanuel Grumbach 2018-12-04 17:34:31 UTC
Can you please try to blacklist btusb and do a cold reboot?
This will of course disable BT, but will let us know a bit more about this issue.

Also, if someone can try this:


diff --git a/drivers/net/wireless/intel/iwlwifi/iwl-io.c b/drivers/net/wireless/intel/iwlwifi/iwl-io.c
index fccb63a32..370f602b5 100644
--- a/drivers/net/wireless/intel/iwlwifi/iwl-io.c
+++ b/drivers/net/wireless/intel/iwlwifi/iwl-io.c
@@ -113,6 +113,7 @@ int iwl_poll_bit(struct iwl_trans *trans, u32 addr,
                t += IWL_POLL_INTERVAL;
        } while (t < timeout);

+       dump_stack();
        return -ETIMEDOUT;
 }
 IWL_EXPORT_SYMBOL(iwl_poll_bit);
@@ -164,6 +165,7 @@ int iwl_poll_direct_bit(struct iwl_trans *trans, u32 addr, u32 mask,
                t += IWL_POLL_INTERVAL;
        } while (t < timeout);

+       dump_stack();
        return -ETIMEDOUT;
 }
 IWL_EXPORT_SYMBOL(iwl_poll_direct_bit);
@@ -227,6 +229,7 @@ int iwl_poll_prph_bit(struct iwl_trans *trans, u32 addr,
                t += IWL_POLL_INTERVAL;
        } while (t < timeout);

+       dump_stack();
        return -ETIMEDOUT;
 }






Thanks.
Comment 11 Robert Van Voorhees 2018-12-12 20:38:50 UTC
I get a slightly different response from `dmesg`:



[    4.630318] iwlwifi 0000:3b:00.0: enabling device (0000 -> 0002)
[    4.652342] iwlwifi 0000:3b:00.0: Direct firmware load for iwlwifi-9260-th-b0-jf-b0-38.ucode failed with error -2
[    4.652351] iwlwifi 0000:3b:00.0: Direct firmware load for iwlwifi-9260-th-b0-jf-b0-37.ucode failed with error -2
[    4.652363] iwlwifi 0000:3b:00.0: Direct firmware load for iwlwifi-9260-th-b0-jf-b0-36.ucode failed with error -2
[    4.652369] iwlwifi 0000:3b:00.0: Direct firmware load for iwlwifi-9260-th-b0-jf-b0-35.ucode failed with error -2
[    4.652374] iwlwifi 0000:3b:00.0: Direct firmware load for iwlwifi-9260-th-b0-jf-b0-34.ucode failed with error -2
[    4.652380] iwlwifi 0000:3b:00.0: Direct firmware load for iwlwifi-9260-th-b0-jf-b0-33.ucode failed with error -2
[    4.652385] iwlwifi 0000:3b:00.0: Direct firmware load for iwlwifi-9260-th-b0-jf-b0-32.ucode failed with error -2
[    4.652391] iwlwifi 0000:3b:00.0: Direct firmware load for iwlwifi-9260-th-b0-jf-b0-31.ucode failed with error -2
[    4.652397] iwlwifi 0000:3b:00.0: Direct firmware load for iwlwifi-9260-th-b0-jf-b0-30.ucode failed with error -2
[    4.652397] iwlwifi 0000:3b:00.0: no suitable firmware found!
[    4.652398] iwlwifi 0000:3b:00.0: minimum version required: iwlwifi-9260-th-b0-jf-b0-30
[    4.652399] iwlwifi 0000:3b:00.0: maximum version supported: iwlwifi-9260-th-b0-jf-b0-38
[    4.652400] iwlwifi 0000:3b:00.0: check git://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git

lspci -k
...
3b:00.0 Network controller: Intel Corporation Wireless-AC 9260 (rev 29)
	Subsystem: Intel Corporation Device 4010
	Kernel modules: iwlwifi
...
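
The log above shows the driver walking the firmware versions from the maximum (38) down to the minimum (30) and failing each load with -2 (ENOENT, file not found). A rough sketch of the same walk done by hand, to see which file, if any, is installed. The function name `find_firmware` is hypothetical, and the sketch assumes uncompressed `.ucode` files directly under the given directory (typically /lib/firmware); some distributions ship compressed firmware, which this does not detect.

```shell
#!/bin/sh
# Print the newest firmware file in the range max..min that exists in dir.
# Returns non-zero if none is found. Hypothetical helper for illustration.
find_firmware() {
    dir=$1; prefix=$2; max=$3; min=$4
    v=$max
    while [ "$v" -ge "$min" ]; do
        f="$dir/$prefix-$v.ucode"
        if [ -e "$f" ]; then
            echo "$f"
            return 0
        fi
        v=$((v - 1))
    done
    return 1
}

# Example: find_firmware /lib/firmware iwlwifi-9260-th-b0-jf-b0 38 30
```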
Comment 12 Emmanuel Grumbach 2018-12-12 20:41:09 UTC
@Robert, this is absolutely not related to this bug.
Please install the firmware, but please do not reply further, so as not to hijack this bug.

Thank you.
Comment 13 Robert Van Voorhees 2018-12-12 20:59:56 UTC
@Emmanuel, apologies, brief moment of insanity,

dnf groupinstall hardware-support

resolved that issue unrelated to this bug report.
Comment 14 hgminh95 2018-12-13 12:16:14 UTC
@Emmanuel, blacklisting btusb makes the Wi-Fi work.

$ tail /etc/modprobe.d/blacklist.conf -n1
blacklist btusb

$ dmesg | grep iwlwifi
[    2.908494] iwlwifi 0000:03:00.0: enabling device (0000 -> 0002)
[    2.913394] iwlwifi 0000:03:00.0: loaded firmware version 38.c0e03d94.0 op_mode iwlmvm
[    2.933919] iwlwifi 0000:03:00.0: Detected Intel(R) Dual Band Wireless AC 9260, REV=0x324

I will try to figure out how to add these dump_stack() calls.
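
For anyone repeating the btusb test, a minimal sketch of adding the blacklist entry without duplicating it on repeated runs. The function name `add_blacklist` is made up for illustration, and the file path in the usage line is the one shown above. As noted in comment 10, this disables Bluetooth, and the test calls for a cold boot afterwards.

```shell
#!/bin/sh
# Append "blacklist <module>" to a modprobe config file unless the exact
# line is already present. Hypothetical helper for illustration.
add_blacklist() {
    module=$1
    conf=$2
    touch "$conf"
    if ! grep -qx "blacklist $module" "$conf"; then
        printf 'blacklist %s\n' "$module" >> "$conf"
    fi
}

# Usage (as root): add_blacklist btusb /etc/modprobe.d/blacklist.conf
```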
Comment 15 Andreas Backx 2018-12-13 21:32:52 UTC
Unfortunately, I do not have time over the coming two months to provide feedback because of exams. For now I've resorted to installing a wired socket by my desk so I can use that. Sorry for not being able to help debug the problem.
Comment 16 Emmanuel Grumbach 2018-12-18 06:08:10 UTC
Ok, I'll close this for now. If someone can help, they can reopen the bug.
Comment 17 Taneli Huuskonen 2019-01-18 19:40:35 UTC
Created attachment 280589 [details]
Output of proposed dump_stack() call
Comment 18 Taneli Huuskonen 2019-01-18 19:56:51 UTC
For me, the bug occurs whenever I boot into Windows and then back into Linux. The bug goes away when I boot again into Linux, no matter whether I shut down and turn the machine back on, or just reboot.

Here are the outputs of lspci -vv , merged into a unified diff:

--- ok-lspci-vv.txt	2019-01-18 18:05:14.682005050 +0200
+++ bug-lspci-vv.txt	2019-01-18 18:07:56.435000630 +0200
@@ -1,34 +1,32 @@
 00:14.3 Network controller: Intel Corporation Device a370 (rev 10)
 	Subsystem: Intel Corporation Device 02a4
-	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
+	Control: I/O- Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
-	Latency: 0, Cache Line Size: 64 bytes
 	Interrupt: pin A routed to IRQ 16
 	Region 0: Memory at a4314000 (64-bit, non-prefetchable) [size=16K]
 	Capabilities: [c8] Power Management version 3
 		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
 		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
 	Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
 		Address: 0000000000000000  Data: 0000
 	Capabilities: [40] Express (v2) Root Complex Integrated Endpoint, MSI 00
 		DevCap:	MaxPayload 128 bytes, PhantFunc 0
 			ExtTag- RBE-
 		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
 			RlxdOrd+ ExtTag- PhantFunc- AuxPwr+ NoSnoop+
 			MaxPayload 128 bytes, MaxReadReq 128 bytes
 		DevSta:	CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr+ TransPend-
 		DevCap2: Completion Timeout: Range B, TimeoutDis+, LTR+, OBFF Via WAKE#
 			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
 		DevCtl2: Completion Timeout: 16ms to 55ms, TimeoutDis-, LTR+, OBFF Disabled
 			 AtomicOpsCtl: ReqEn-
-	Capabilities: [80] MSI-X: Enable+ Count=16 Masked-
+	Capabilities: [80] MSI-X: Enable- Count=16 Masked-
 		Vector table: BAR=0 offset=00002000
 		PBA: BAR=0 offset=00003000
 	Capabilities: [100 v0] #00
 	Capabilities: [14c v1] Latency Tolerance Reporting
 		Max snoop latency: 0ns
 		Max no snoop latency: 0ns
 	Capabilities: [164 v1] Vendor Specific Information: ID=0010 Rev=0 Len=014 <?>
-	Kernel driver in use: iwlwifi
 	Kernel modules: iwlwifi
Comment 19 David Niehues 2019-02-04 20:32:33 UTC
I have the same or a very similar issue. Since I booted Windows again, I cannot access any Wi-Fi anymore, no matter how often I boot into Linux again.
dmesg | grep iwlwifi shows the following output.

[    3.445668] iwlwifi: probe of 0000:17:00.0 failed with error -110

dmesg | grep firmware shows the following output (it appears unrelated to me, but I give it for completeness).

[    3.532814] [drm] Found UVD firmware Version: 1.130 Family ID: 16
[    3.534169] [drm] Found VCE firmware Version: 53.26 Binary ID: 3

Here is the output of lspci -k regarding wifi.

17:00.0 Network controller: Intel Corporation Wireless-AC 9260 (rev 29)
        DeviceName: Broadcom 5762
        Subsystem: Intel Corporation Wireless-AC 9260
        Kernel modules: iwlwifi

I run kernel version 4.20.6. I am not that familiar with kernel debugging, but if I can help to fix this issue in any way, I will be happy to.
Comment 21 Emmanuel Grumbach 2019-02-21 19:02:23 UTC
Ping?
Comment 22 David Niehues 2019-02-24 09:44:07 UTC
(In reply to Emmanuel Grumbach from comment #20)
> Please install the latest firmware from BT:
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/
> plain/intel/ibt-18-16-1.sfi
> https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/
> plain/intel/ibt-18-2.sfi
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/
> plain/intel/ibt-17-16-1.sfi
> https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/
> plain/intel/ibt-17-2.sfi
> 
> 
> Then shutdown the system (not reboot!) and then test again.
> 
> Thanks.

Sorry it took me so long to answer you. Here is a short update.

In the meantime, the WiFi started to work normally again, even though I did not change anything. So, it seems to me, that the behavior of my pc is pretty much the same as the behavior Andreas, Heikki, Colin and Taneli observed. Just that for me, it takes more boot-reboot cycles to work again.

However, booting into Windows and then into Linux again reproduced the problem.
As a (maybe unrelated) side note: When I booted into Windows, the WiFi did not connect automatically even though I configured it earlier.

After I reproduced the problem, I downloaded the latest version of the files you linked and placed them in /lib/firmware/intel/, shut down the PC and then restarted it. However, nothing changed. The WiFi still does not show up and all the outputs I listed above are still the same.
Comment 23 Emmanuel Grumbach 2019-02-25 20:59:50 UTC
> 
> In the meantime, the WiFi started to work normally again, even though I did
> not change anything. So, it seems to me, that the behavior of my pc is
> pretty much the same as the behavior Andreas, Heikki, Colin and Taneli
> observed. Just that for me, it takes more boot-reboot cycles to work again.
> 
> However, booting into Windows and then into Linux again reproduced the
> problem.
> As a (maybe unrelated) side note: When I booted into Windows, the WiFi did
> not connect automatically even though I configured it earlier.

This is clearly not related. This is an internal Windows problem (if we consider this as a problem).

> 
> After I reproduced the problem, I downloaded the latest version of the files
> you linked and placed them in /lib/firmware/intel/, shutdown the pc and then
> restarted it. However, nothing changed. The WiFi still does not show up and
> all the outputs I listed above are still the same.

Ok, at this point, I guess that what would help would be to try to take our backport driver and put lots of prints in the probe function to see where we get stuck.
Would it be something doable on your side? (I'd provide the code to compile of course).
Comment 24 Taneli Huuskonen 2019-02-26 09:02:38 UTC
(In reply to Emmanuel Grumbach from comment #23)

> Ok, at this point, I guess that what would help would be to try to take our
> backport driver and put lots of prints in the probe function to see where we
> get stuck.
> Would it be something doable on your side? (I'd provide the code to compile
> of course).

I'm somewhat busy these days, but I know how to do that. I'll do it later this week or maybe the next, if nobody else does it quicker.

I also installed the latest firmware without any difference, as far as I could tell. For me, the bug occurs exactly when I've shut down the computer in Windows and next boot into Linux. If I use Windows and reboot instead of shutting down, it works. Booting into Linux and either shutting down or rebooting fixes it. Even rebooting straight from rEFInd, without booting any system, fixes it.
Comment 25 David Niehues 2019-02-28 21:59:58 UTC
(In reply to Emmanuel Grumbach from comment #23)
> Ok, at this point, I guess that what would help would be to try to take our
> backport driver and put lots of prints in the probe function to see where we
> get stuck.
> Would it be something doable on your side? (I'd provide the code to compile
> of course).

That should be doable for me. Also, it's always great to learn something new.
I should have enough time at my hands next week, if that fits your schedule.
Comment 26 Emmanuel Grumbach 2019-03-03 08:00:22 UTC
Please take the debug_201319 branch of:

git://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/backport-iwlwifi.git

I added debug prints.

Please compile with IWLWIFI_DEBUG enabled and load iwlwifi with:
debug=0xffffffff

Thanks.
Comment 27 Taneli Huuskonen 2019-03-03 09:32:15 UTC
Created attachment 281471 [details]
Debug outputs, with and without bug

I rmmod'ed iwlmvm and iwlwifi, which had been autoloaded at boot, and modprobe'd them back with debug enabled. First I did that with the wireless card working, then with the bug. After that, I grepped "iwlwifi" from my debug log. As you can see, the bug made the module fail almost immediately, with only four output lines.

Please let me know if you wish me to do any further testing.
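
The reload procedure described above can be sketched as follows. This is only an illustration: the `run` wrapper and `reload_iwlwifi_debug` are made-up names, and DRY_RUN defaults to 1 so that the commands are printed rather than executed. It assumes that loading iwlwifi pulls iwlmvm back in by itself; if it does not, add a `modprobe iwlmvm` step.

```shell
#!/bin/sh
# Print the command when DRY_RUN=1 (the default here), otherwise run it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Unload the autoloaded modules, then load iwlwifi with all debug bits set,
# as requested in comment 26.
reload_iwlwifi_debug() {
    run rmmod iwlmvm
    run rmmod iwlwifi
    run modprobe iwlwifi debug=0xffffffff
}

# Usage (as root): DRY_RUN=0 reload_iwlwifi_debug
# Then save the complete, unfiltered log: dmesg > debug-output.txt
```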
Comment 28 Emmanuel Grumbach 2019-03-03 09:38:37 UTC
Problem is that the prints I added didn't include iwlwifi :)
Can you please send the logs w/o grepping iwlwifi?
Comment 29 Taneli Huuskonen 2019-03-03 10:29:42 UTC
Created attachment 281473 [details]
Corrected debug output

Sorry, I'd accidentally checked out the main branch, so the extra debug outputs weren't there in the first place. Here's the relevant snippet from dmesg after installing the debug_201319 version.
Comment 30 Emmanuel Grumbach 2019-03-03 10:37:07 UTC
I pushed another patch in the same branch.

can you produce the same logs with the new content there?
Thanks.
Comment 31 Taneli Huuskonen 2019-03-03 11:00:24 UTC
Created attachment 281475 [details]
Debug output again

Snippet of dmesg output, with latest version
Comment 32 Emmanuel Grumbach 2019-03-03 11:12:13 UTC
Pushed again, but I don't think it'll help.
Still worth trying.

You have updated the BT firmware as mentioned in comment #20, right?
Comment 33 Emmanuel Grumbach 2019-03-03 11:13:46 UTC
I just noticed that the BT patches were updated:

https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=55fa1fbae78512c305153aeb66ef5cd948ea3709
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=30b931413997178cab3163c1cf15f54c59c486c9

Please take those.
Note that you need a cold boot for the BT to take the new FW.
Comment 34 Taneli Huuskonen 2019-03-03 14:44:09 UTC
Yes, I had updated the firmware. Updating it again and compiling the latest version of the driver ("read after write") didn't make any visible difference.
Comment 35 David Niehues 2019-03-03 15:07:22 UTC
Created attachment 281477 [details]
debug-output_2019-03-03-16-00-CET.txt

I compiled your branch and installed the latest patches for BT.

Here is the output of dmesg | grep iwl it yielded.

If you need any other output, let me know.
Comment 36 Emmanuel Grumbach 2019-03-03 19:00:47 UTC
Thanks.
Your two logs are exactly identical.

I need to get more info internally.
Comment 37 David Niehues 2019-03-03 19:18:39 UTC
As a side note: I just found out why for all others rebooting into Linux helps but not for me. It seems that I need to completely unplug the computer. When I then reboot into Linux the problem is fixed.

This might not be crucial information, but I wanted to share it nevertheless if it maybe helps in fixing the bug.
Comment 38 David Niehues 2019-04-26 15:55:25 UTC
Ping? Is there any progress on this issue?
Comment 39 Emmanuel Grumbach 2019-04-29 13:13:36 UTC
Unfortunately no.

This is constantly bypassed by higher priority tasks.
Comment 40 David Niehues 2019-04-30 06:58:25 UTC
(In reply to Emmanuel Grumbach from comment #39)
> Unfortunately no.
> 
> This is constantly bypassed by higher priority tasks.

That's understandable, thanks for the update.
Comment 41 David Goodlad 2019-05-16 10:35:30 UTC
For the record, I am experiencing the same issue.

I have a dual-boot machine, Debian Stretch and Windows 10.

I installed a new Intel 9260 card, and booted into Debian first running the kernel (4.19.0-0.bpo.4-amd64) & firmware (20190114-1~bpo9+2) from stretch-backports. The card worked fine, through multiple reboots.

I then booted into Windows, updated Intel's drivers, and it worked there too. I powered down the machine for the night.

Booting into Linux after that resulted in the -110 timeout behavior, which wouldn't resolve across either hard or soft reboots, nor with manually removing/re-adding the kernel module a few times.

Booting into Windows, then soft-rebooting into Linux led to a working card.

Happy to help with testing in the future whenever someone has time to work on this - no rush, just happy to see that it's had some attention so far.
Comment 42 Emmanuel Grumbach 2019-05-16 20:40:46 UTC
After you boot into Linux with a working device, can you shut down the machine, power it up again in Linux, and get a working device again?

Only booting to windows makes the problem appear?

Fwiw, this is the exact opposite of what I'd expect. There is some state that stays alive across reboots. So I'd expect that once you boot from fresh (as opposed to reboot) things would fall in place. But what you described is the exact opposite.
Comment 43 David Goodlad 2019-05-16 21:00:49 UTC
> After you boot into Linux with a working device, can you shut down the
> machine. Power it up again in Linux and get a working device again?

Yes, I just powered it off now and back on again, the device works without issue:

$ dmesg | grep iwlwifi

[    4.789382] iwlwifi 0000:27:00.0: enabling device (0000 -> 0002)
[    4.797452] iwlwifi 0000:27:00.0: firmware: direct-loading firmware iwlwifi-9260-th-b0-jf-b0-38.ucode
[    4.798053] iwlwifi 0000:27:00.0: loaded firmware version 38.755cfdd8.0 op_mode iwlmvm
[    4.846147] iwlwifi 0000:27:00.0: Detected Intel(R) Dual Band Wireless AC 9260, REV=0x324
[    4.896223] iwlwifi 0000:27:00.0: base HW address: a4:c3:f0:88:bb:e8
[    4.971072] iwlwifi 0000:27:00.0 wlp39s0: renamed from wlan0

I'll run a few more tests tonight and write them up for you.
Comment 44 Taneli Huuskonen 2019-05-17 11:10:01 UTC
For me, the bug shows up exactly when the last thing I did before starting up Linux was to shut down Windows. A warm reboot from Windows, Linux, or even straight from the bootloader clears the bug, and so does shutting down from Linux and cold booting. As long as I don't touch Windows (which is most of the time), the bug never comes back.
Comment 45 Emmanuel Grumbach 2019-05-17 12:50:14 UTC
Created attachment 282805 [details]
print fseq version

Can you all please apply the attached patch and tell me what the print there says?

I need the TLV_FW_FSEQ_VERSION: output.

Thank you.
Comment 46 Taneli Huuskonen 2019-05-17 13:16:56 UTC
The patched version didn't compile. IWL_UCODE_TLV_FW_FSEQ_VERSION was undefined.
Comment 47 Emmanuel Grumbach 2019-05-17 14:07:12 UTC
Hmmm... it is defined in my code. Apparently, we have different versions of the code.

Please add:

       IWL_UCODE_TLV_FW_FSEQ_VERSION   = 60

in  enum iwl_ucode_tlv_type {
which is in drivers/net/wireless/intel/iwlwifi/fw/file.h

Thanks.
Comment 48 David Goodlad 2019-05-18 02:54:04 UTC
I compiled the latest rev of the backport-iwlwifi driver (ac5faf65ba515d3f7ac4ca29de59dce5baf3f85f) with your patch to print the fseq version. I also installed the latest firmware, iwlwifi-9260-th-b0-jf-b0-46.ucode.

Unfortunately, loading the module doesn't seem to print the line we're looking for:

[ 1958.208284] iwlwifi 0000:27:00.0: U _iwl_disable_interrupts Disabled interrupts
[ 1958.208288] iwlwifi 0000:27:00.0: U iwl_pcie_prepare_card_hw iwl_trans_prepare_card_hw enter
[ 1958.208296] iwlwifi 0000:27:00.0: U iwl_pcie_set_hw_ready hardware ready
[ 1958.208321] iwlwifi 0000:27:00.0: U iwl_trans_pcie_alloc HW REV: 0x324
[ 1958.208569] iwlwifi 0000:27:00.0: U iwl_pcie_set_interrupt_capa MSI-X enabled. 16 interrupt vectors were allocated
[ 1958.209368] iwlwifi 0000:27:00.0: firmware: failed to load iwl-dbg-cfg.ini (-2)
[ 1958.209371] iwlwifi 0000:27:00.0: Direct firmware load for iwl-dbg-cfg.ini failed with error -2
[ 1958.209376] iwlwifi 0000:27:00.0: U iwl_request_firmware attempting to load firmware 'iwlwifi-9260-th-b0-jf-b0-46.ucode'
[ 1958.209702] iwlwifi 0000:27:00.0: firmware: direct-loading firmware iwlwifi-9260-th-b0-jf-b0-46.ucode
[ 1958.209705] iwlwifi 0000:27:00.0: U iwl_req_fw_callback Loaded firmware file 'iwlwifi-9260-th-b0-jf-b0-46.ucode' (1456088 bytes).
[ 1958.209708] iwlwifi 0000:27:00.0: U iwl_parse_tlv_firmware Found debug memory segment: 0
[ 1958.209710] iwlwifi 0000:27:00.0: U iwl_parse_tlv_firmware Found debug memory segment: 1
[ 1958.209711] iwlwifi 0000:27:00.0: U iwl_parse_tlv_firmware Found debug memory segment: 2
[ 1958.209713] iwlwifi 0000:27:00.0: U iwl_parse_tlv_firmware Found debug memory segment: 0
[ 1958.209714] iwlwifi 0000:27:00.0: U iwl_parse_tlv_firmware Found debug memory segment: 1
[ 1958.209714] iwlwifi 0000:27:00.0: U iwl_parse_tlv_firmware Found debug memory segment: 2
[ 1958.209716] iwlwifi 0000:27:00.0: U iwl_parse_tlv_firmware unknown TLV: 58
[ 1958.209718] iwlwifi 0000:27:00.0: U iwl_parse_tlv_firmware Paging: paging enabled (size = 241664 bytes)
[ 1958.209719] iwlwifi 0000:27:00.0: U iwl_parse_tlv_firmware unknown TLV: 58

Maybe I've missed something here?
Comment 49 glebtv 2019-09-03 08:46:40 UTC
I have exactly the same problem (error -110) with an Intel(R) Wi-Fi 6 AX200 160MHz, REV=0x340 both on 5.2.11 and 5.3.0-rc6

The problem only happens for me after powering off from Windows, then powering on and booting into Linux. A reboot from Windows clears the problem, and once it's working it doesn't happen again unless I power off from Windows. Rebooting Linux, powering off from Linux, etc. does not help once the problem has occurred, and does not cause it.

I think the Windows driver does something when powering off the machine that the Linux driver can't undo.

Here is the log with TLV_FW_FSEQ_VERSION when it is working:

[    2.587563] iwlwifi 0000:03:00.0: enabling device (0000 -> 0002)
[    2.592047] iwlwifi 0000:03:00.0: TLV_FW_FSEQ_VERSION: FSEQ Version: 43.2.23.17
[    2.592049] iwlwifi 0000:03:00.0: Found debug destination: EXTERNAL_DRAM
[    2.592050] iwlwifi 0000:03:00.0: Found debug configuration: 0
[    2.592177] iwlwifi 0000:03:00.0: loaded firmware version 48.954cff6d.0 op_mode iwlmvm
[    2.992221] iwlmvm: unknown parameter 'power_save' ignored
[    2.992300] iwlwifi 0000:03:00.0: Detected Intel(R) Wi-Fi 6 AX200 160MHz, REV=0x340
[    3.002880] iwlwifi 0000:03:00.0: Applying debug destination EXTERNAL_DRAM
[    3.003027] iwlwifi 0000:03:00.0: Allocated 0x00400000 bytes for firmware monitor.
[    3.159180] iwlwifi 0000:03:00.0: base HW address: 3c:f0:11:d9:cc:13
[    3.173805] iwlwifi 0000:03:00.0 wlp3s0: renamed from wlan0
[   22.181142] iwlwifi 0000:03:00.0: Applying debug destination EXTERNAL_DRAM
[   22.331987] iwlwifi 0000:03:00.0: FW already configured (0) - re-configuring
[   22.341352] iwlwifi 0000:03:00.0: BIOS contains WGDS but no WRDS
[   25.756449] wlp3s0: authenticate with 7c:8b:ca:59:05:8c
[   25.758586] wlp3s0: send auth to 7c:8b:ca:59:05:8c (try 1/3)
[   25.782952] wlp3s0: authenticated
[   25.784190] wlp3s0: associate with 7c:8b:ca:59:05:8c (try 1/3)
[   25.797300] wlp3s0: RX AssocResp from 7c:8b:ca:59:05:8c (capab=0x431 status=0 aid=9)
[   25.800772] wlp3s0: associated
Comment 50 Mantas 2019-09-29 21:00:33 UTC
Booting into Windows and turning Wi-Fi off solved the issue for me.
Comment 51 Luca Coelho 2019-10-11 06:27:39 UTC
This is probably due to "fast boot" on Windows.  Can you try to follow these instructions from our wiki and see if it solves the problem?

https://wireless.wiki.kernel.org/en/users/drivers/iwlwifi#about_dual-boot_with_windows_and_fast-boot_enabled

Additionally, you should make sure that the Bluetooth firmwares are also up-to-date.
Comment 52 hgminh95 2019-11-02 04:38:39 UTC
Confirming that disabling "fast boot" on Windows solves the problem for me.

Thanks.
Comment 53 dglt 2019-11-19 19:04:12 UTC
With the more recent iwlwifi drivers, I noticed after a while of troubleshooting that the LAR (Location Aware Regulatory) "feature" of the iwlwifi driver was incorrectly detecting the wireless regulatory domain info, or rather that it detects the correct one and also disables most of the channels with another.

disabling lar fixes the dropped or no-connection issue

iwlwifi.lar_disable=1
Comment 54 Luca Coelho 2019-12-16 19:44:02 UTC
(In reply to dglt from comment #53)
> with the more recent iwlwifi drivers i noticed after a while of
> troubleshooting that the LAR (Location Aware Regulatory) "feature" of the
> iwlwifi drivers were incorrectly detecting wireless regulatory domain info
> or rather that it detects the correct one and it also disables most of the
> channels with another. 
> 
> disabling lar fixes the dropped or no-connection issue
> 
> iwlwifi.lar_disable=1

LAR is used to detect the current location and handle the regulatory settings accordingly.  But it's not related to this bug and, in any case, you're not supposed to disable LAR; this option was there just for initial testing and will be removed.

I'm going to close this bug now, since people have been reporting that disabling fastboot on Windows has solved the problem for them.  And this bug has so many comments from different things that it is hard to track.

Please open a new bug if you still have problems.  Especially the original author of this bug entry.
Comment 55 Kilian Cavalotti 2020-08-16 23:57:06 UTC
In case someone's still interested in this, I found out that initiating a PCI reset of the device allowed it to initialize properly (and the iwlwifi module to load) when coming from a Windows hibernation or Fast Startup state (-110 error).

This seems to be working fine, with no visible side-effect, either on the Linux or Windows side (tested on kernel 5.8.1):

$ cat /etc/modprobe.d/iwlwifi.conf
# PCIe reset device before loading the module.
#   Windows Hibernation/FastBoot may leave it in an unusable state
#   Replace 0000:00:14.3 below by the actual PCI address of the device
install iwlwifi echo 1 > /sys/bus/pci/devices/0000\:00\:14.3/reset ; \
        /sbin/modprobe --ignore-install iwlwifi

Hope this helps.
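
To fill in the PCI address for the modprobe snippet above, a rough sketch that looks the device up in sysfs. The function names `find_intel_wifi` and `pci_reset_path` are hypothetical. The match assumes the card reports vendor 0x8086 and a network-controller class starting with 0x0280, as in the `lspci -knn` output in comment 8. The `reset` file is only present for devices where the kernel supports a reset method.

```shell
#!/bin/sh
# Build the sysfs reset path for a PCI address such as 0000:00:14.3.
pci_reset_path() {
    printf '/sys/bus/pci/devices/%s/reset\n' "$1"
}

# List PCI addresses of Intel (vendor 0x8086) network controllers
# (class 0x0280xx) under a sysfs root, which defaults to /sys.
find_intel_wifi() {
    root=${1:-/sys}
    for dev in "$root"/bus/pci/devices/*; do
        [ -r "$dev/vendor" ] && [ -r "$dev/class" ] || continue
        vendor=$(cat "$dev/vendor")
        class=$(cat "$dev/class")
        case "$vendor:$class" in
            0x8086:0x0280*) basename "$dev" ;;
        esac
    done
}

# Usage (as root), resetting the first match by hand:
#   echo 1 > "$(pci_reset_path "$(find_intel_wifi | head -n 1)")"
```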
