Bug 205055 - iwlwifi: 9260: crash in loop on 5Ghz wifi with a 80Mhz width
Summary: iwlwifi: 9260: crash in loop on 5Ghz wifi with a 80Mhz width
Status: NEEDINFO
Alias: None
Product: Drivers
Classification: Unclassified
Component: network-wireless-intel
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: DO NOT USE - assign "network-wireless-intel" component instead
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2019-09-30 22:14 UTC by Arnaud Astruc
Modified: 2021-11-26 11:06 UTC
CC List: 11 users

See Also:
Kernel Version: 5.3.1-arch1-1-ARCH
Subsystem:
Regression: No
Bisected commit-id:


Attachments
iwlwifi firmware version (71 bytes, text/plain), 2019-09-30 22:14 UTC, Arnaud Astruc
dmesg (12.67 KB, text/plain), 2019-09-30 22:16 UTC, Arnaud Astruc
iw wlp58s0 info (332 bytes, text/plain), 2019-09-30 22:18 UTC, Arnaud Astruc
iwconfig wlp58s0 (527 bytes, text/plain), 2019-09-30 22:19 UTC, Arnaud Astruc
dmesg without call trace (9.18 KB, text/plain), 2019-09-30 22:21 UTC, Arnaud Astruc
ping 8.8.8.8 (891 bytes, text/plain), 2019-09-30 22:22 UTC, Arnaud Astruc
lspci -vvv (3.34 KB, text/plain), 2019-09-30 22:25 UTC, Arnaud Astruc
journalctl -f output during crash (13.85 KB, text/plain), 2019-10-20 20:03 UTC, Alex Kampas
journalctl -f output during crash (other version) (5.94 KB, text/plain), 2019-10-20 20:07 UTC, Alex Kampas
journalctl -f output during crash (2.4GHz with different output) (16.33 KB, text/plain), 2019-10-20 20:56 UTC, Alex Kampas
journal with segfault (6.39 KB, text/plain), 2019-11-25 10:50 UTC, dgfguida
Journal, different error (6.15 KB, text/plain), 2019-11-25 10:51 UTC, dgfguida
dmesg and other logs with a 5.4.7 kernel (16.18 KB, text/plain), 2020-01-20 17:06 UTC, Arnaud Astruc

Description Arnaud Astruc 2019-09-30 22:14:16 UTC
Created attachment 285259 [details]
iwlwifi firmware version

When I try to connect to a 5 GHz wifi network with an 80 MHz channel width, my Intel 9260 crashes in a loop.
Comment 1 Arnaud Astruc 2019-09-30 22:16:54 UTC
Created attachment 285261 [details]
dmesg
Comment 2 Arnaud Astruc 2019-09-30 22:18:57 UTC
Created attachment 285263 [details]
iw wlp58s0 info
Comment 3 Arnaud Astruc 2019-09-30 22:19:26 UTC
Created attachment 285265 [details]
iwconfig wlp58s0
Comment 4 Arnaud Astruc 2019-09-30 22:21:15 UTC
Created attachment 285267 [details]
dmesg without call trace
Comment 5 Arnaud Astruc 2019-09-30 22:22:06 UTC
Created attachment 285269 [details]
ping 8.8.8.8
Comment 6 Arnaud Astruc 2019-09-30 22:25:23 UTC
Created attachment 285271 [details]
lspci -vvv
Comment 7 Alex Kampas 2019-10-20 18:28:12 UTC
This problem affects me too.
9260 in T460p (same in T480s) Thinkpads.

Identical dmesg output. 
I have tried Ubuntu 19.10 and Fedora 31 beta. It seems to be linked to kernel 5.3.
Everything works fine on earlier kernels (tested on Ubuntu 19.04 and Fedora 30).

Crashes occur as soon as traffic starts on 5 GHz. 80 MHz or 40 MHz width makes no difference.
Comment 8 Luca Coelho 2019-10-20 19:43:46 UTC
We have a suspicion about what this may be, and we will look into it.  But to be sure, can you confirm whether that's the first SYSASSERT (or any iwlwifi error) that you see? The dmesg is truncated, so I can't be sure there was nothing before it that could be relevant.
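One way to answer the "first error" question is to scan a complete, untruncated dmesg capture for the earliest iwlwifi error line. The following is a minimal sketch, not an official tool: the marker strings are the ones mentioned in this thread (SYSASSERT, "HW error", "SW Error"), matching them case-insensitively is an assumption, and the function name `first_iwlwifi_error` is hypothetical.

```python
import re

# Substrings treated as error markers. "SYSASSERT", "HW error" and "SW Error"
# are mentioned in this thread; case-insensitive matching is an assumption.
ERROR_MARKERS = ("sysassert", "hw error", "sw error")

# dmesg lines look like "[    3.233876] iwlwifi 0000:03:00.0: message"
LINE_RE = re.compile(r"^\[\s*(?P<ts>\d+\.\d+)\]\s+(?P<msg>iwlwifi\b.*)$")


def first_iwlwifi_error(dmesg_text):
    """Return (timestamp, message) of the earliest iwlwifi error line, or None."""
    for line in dmesg_text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # not an iwlwifi line
        msg = match.group("msg")
        if any(marker in msg.lower() for marker in ERROR_MARKERS):
            return float(match.group("ts")), msg
    return None
```

The function only helps if the capture starts at boot; a truncated dmesg, as in the original attachment, can hide an earlier error.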
Comment 9 Alex Kampas 2019-10-20 19:54:12 UTC
This is dmesg | grep iwlw (Ubuntu 19.10 kernel 5.3.0-18-generic)

[    3.233876] iwlwifi 0000:03:00.0: enabling device (0000 -> 0002)
[    3.279689] iwlwifi 0000:03:00.0: Found debug destination: EXTERNAL_DRAM
[    3.279692] iwlwifi 0000:03:00.0: Found debug configuration: 0
[    3.279973] iwlwifi 0000:03:00.0: loaded firmware version 46.6bf1df06.0 op_mode iwlmvm
[    3.307919] iwlwifi 0000:03:00.0: Detected Intel(R) Wireless-AC 9260 160MHz, REV=0x324
[    3.315655] iwlwifi 0000:03:00.0: Applying debug destination EXTERNAL_DRAM
[    3.316007] iwlwifi 0000:03:00.0: Allocated 0x00400000 bytes for firmware monitor.
[    3.357200] iwlwifi 0000:03:00.0: base HW address: 64:5d:86:92:e2:d5
[    3.428766] iwlwifi 0000:03:00.0 wlp3s0: renamed from wlan0
[    4.718761] iwlwifi 0000:03:00.0: Applying debug destination EXTERNAL_DRAM
[    4.833787] iwlwifi 0000:03:00.0: Applying debug destination EXTERNAL_DRAM
[    4.900970] iwlwifi 0000:03:00.0: FW already configured (0) - re-configuring
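When comparing reports like the one above, it can help to pull the firmware version and detected device out of the log automatically. This is a rough sketch that assumes the two line formats shown in this comment ("loaded firmware version ... op_mode ..." and "Detected ..., REV=..."); the optional `.ucode` file name between version and op_mode is an assumption about other kernels, and `summarize_iwlwifi` is a hypothetical helper name.

```python
import re

# Matches e.g. "loaded firmware version 46.6bf1df06.0 op_mode iwlmvm".
# The optional "<name>.ucode" token is an assumption about other kernel versions.
FW_RE = re.compile(
    r"loaded firmware version (?P<version>\S+)(?: \S+\.ucode)? op_mode (?P<op_mode>\S+)"
)
# Matches e.g. "Detected Intel(R) Wireless-AC 9260 160MHz, REV=0x324".
DEV_RE = re.compile(r"Detected (?P<device>.+), REV=(?P<rev>0x[0-9a-fA-F]+)")


def summarize_iwlwifi(dmesg_text):
    """Collect firmware version, op_mode, device name and revision from dmesg text."""
    summary = {}
    for line in dmesg_text.splitlines():
        fw = FW_RE.search(line)
        if fw:
            summary["firmware"] = fw.group("version")
            summary["op_mode"] = fw.group("op_mode")
        dev = DEV_RE.search(line)
        if dev:
            summary["device"] = dev.group("device")
            summary["rev"] = dev.group("rev")
    return summary
```

Fed the output above, this would report firmware 46.6bf1df06.0 with op_mode iwlmvm on an Intel(R) Wireless-AC 9260 160MHz, REV=0x324.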
Comment 10 Alex Kampas 2019-10-20 20:03:31 UTC
Created attachment 285587 [details]
journalctl -f output during crash

This is the complete journalctl -f output when the problem manifests.
This is the worst version of it: it leads to a complete system freeze with very little control, as the touchpad is also lost (see the end of the file for psmouse.1).

The message starts with a mention of "HW error, resetting before reading", but the wifi works fine in older kernels (5.0.32, Ubuntu 19.04) and in Windows 10.

There is a different version of the crash with different output and a less severe lock-up (I can barely type sudo reboot, compared to having to hard-reset when the above lock-up takes place). I will post it next.
Comment 11 Alex Kampas 2019-10-20 20:07:32 UTC
Created attachment 285589 [details]
journalctl -f output during crash (other version)

This crash is "softer" than the other: I still have control of the touchpad/trackpoint. It is choppy, but enough to get to a terminal and type sudo reboot.

There is no mention of HW error, but rather a SW Error at the beginning.

Needless to say bluetooth connectivity is gone in both cases. 

These logs are from Ubuntu 19.10, but exactly the same logs were produced with Fedora 31 beta (also kernel 5.3).

I was not able to reproduce this with older kernels (Fedora 30 kernel 5.1 and Ubuntu 19.04 kernel 5.0.32).

I hope this helps.
Comment 12 Alex Kampas 2019-10-20 20:56:18 UTC
Created attachment 285591 [details]
journalctl -f output during crash (2.4GHz with different output)

This is fresh. Not as catastrophic. After it completed the system works fine. It even re-established wifi connection and bluetooth works. 
Softer version of crash but with some different output.

(Ubuntu 19.10, kernel 5.3.0-18)
Comment 13 Alex Kampas 2019-10-21 08:52:53 UTC
I can confirm that using this dkms backport here: https://gitlab.com/vicamo/backport-iwlwifi-dkms/tree/ubuntu/eoan

for Ubuntu 18.10 has worked. Wifi works flawlessly with this. 

Obviously not a general solution (and I have no idea what this does), but a way out for Ubuntu 18.10. 

Maybe this info helps.

Thanks.
Comment 14 Alex Kampas 2019-10-27 22:17:23 UTC
(In reply to Alex Kampas from comment #13)
> I can confirm that using this dkms backport here:
> https://gitlab.com/vicamo/backport-iwlwifi-dkms/tree/ubuntu/eoan
> 
> for Ubuntu 18.10 has worked. Wifi works flawlessly with this. 
> 
> Obviously not a general solution (and I have no idea what this does), but a
> way out for Ubuntu 18.10. 
> 
> May this info helps.
> 
> Thanks.

Sadly this did not work. Some bootups will be fine for as long as I work (few hours). Then I reboot and the problem manifests again.

As such this backport is not a solution. 

Any updates on this issue?
Thanks
Comment 15 sunsi.lucas 2019-10-28 10:52:02 UTC
I have also been suffering from this bug (although I'm not exactly sure it's just this bug). For the past month the wifi driver has been acting up, both crashing and dropping the connection under high traffic on 5 GHz (downloading a torrent, for example), and giving really low speeds (<1 Mbps) on 2.4 GHz.
Comment 16 dgfguida 2019-11-25 10:50:43 UTC
Created attachment 286041 [details]
journal with segfault

I am also experiencing this same bug. I attach two different iwlwifi segfaults; they both happened after ~30 min of intense wifi usage (using Steam Remote Play over wifi).

If you want me to test anything it's quite easy for me to reproduce the bug.
Comment 17 dgfguida 2019-11-25 10:51:37 UTC
Created attachment 286043 [details]
Journal, different error
Comment 18 labaunti3 2019-11-27 01:14:50 UTC
I am also experiencing this bug on kernel 5.4, with an AC 9260 on a Dell XPS 15 9570. This doesn't happen at all on kernel 5.2 or lower, but it happens on kernels 5.3 or higher.
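To check whether a given machine falls in the range described in this comment (5.3 or higher), kernel release strings such as those quoted in this thread can be compared numerically. This is only a sketch of that one observation, with hypothetical helper names; later comments report mixed results on individual 5.4.x releases, so it is not a statement of which kernels are actually affected.

```python
import re


def kernel_version(release):
    """Parse a release string like '5.3.1-arch1-1-ARCH' into (major, minor, patch)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def at_or_after_5_3(release):
    """True if the release is 5.3 or newer (the range reported in this comment)."""
    return kernel_version(release) >= (5, 3, 0)
```

Comparing integer tuples rather than raw strings matters here: as text, "5.13.0" sorts before "5.3.0", while `(5, 13, 0) >= (5, 3, 0)` is correctly true.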
Comment 19 dgfguida 2020-01-02 12:05:24 UTC
It seems to have been fixed in kernel 5.4.6. Yesterday I tested it for a long time and it didn't segfault even once, where usually it would have at least 5-7 times.

Lenovo T490 with AC 9260. Using iwd and systemd-network.
Comment 20 Luca Coelho 2020-01-02 13:09:46 UTC
Great, thanks for reporting!

I hope the other users who had this issue can also confirm so we can close this bug.
Comment 21 Alex Kampas 2020-01-02 13:45:34 UTC
I will try it and report back. 

If you dual boot Windows 10, can you please test booting to Windows, then back to Linux, and see if it still works?

My experience was that I had much more trouble after I had rebooted from Windows compared to long stints of using Linux.

Thanks.
Comment 22 dgfguida 2020-01-02 17:13:50 UTC
(In reply to Alex Kampas from comment #21)
> I will try it and report back. 
> 
> If you dual boot Windows 10, can you please test booting to Windows, then
> back to Linux and see if it still works.
> 
> My experience was that I had much more trouble after I had rebooted from
> Windows compared to long stints of using linux.
> 
> Thanks.

I am not dual-booting Windows, so I can't test it, sorry :/

I usually saw the error after 20-30min of streaming a game from my PC to my laptop via Parsec/Steam remote play on the 5Ghz wifi. Yesterday I played for a few hours and didn't see the error even once, so it seems fixed on my setup.
Comment 23 Onur Aslan 2020-01-14 21:52:21 UTC
I can confirm this issue has been fixed in 5.4.8. Today I switched to this kernel, and after using a 5 GHz network for more than 8 hours, not a single panic or slowdown occurred; wifi works like a charm now.
Comment 24 Luca Coelho 2020-01-14 22:07:48 UTC
Great, thanks for reporting!

We have two users confirming this is fixed now.  Closing the bug.
Comment 25 Arnaud Astruc 2020-01-20 17:05:35 UTC
(In reply to Luca Coelho from comment #24)
> Great, thanks for reporting!
> 
> We have two users confirming this is fixed now.  Closing the bug.

The problem is still here in kernel 5.4.7 (openSUSE Tumbleweed); see the attachment below.

PS: Something strange: I tried to make a hotspot with my Xiaomi phone (5 GHz band), and in this case there is no dmesg output and the wifi works great if I connect to it, but with all the other hotspots I tried I had no luck.
Comment 26 Arnaud Astruc 2020-01-20 17:06:45 UTC
Created attachment 286913 [details]
dmesg and other logs with a 5.4.7 kernel
Comment 27 Luca Coelho 2020-01-20 17:52:04 UTC
Onur said in comment #23 that it doesn't happen for him anymore on kernel 5.4.8.  Can you try that?
Comment 28 fcayre 2020-11-30 12:43:38 UTC
I'm also affected (same firmware version, same traceback) with kernel Ubuntu 5.4.0-52.57~18.04.1-generic 5.4.65 
Has anyone at Intel really worked on this more-than-one-year-old bug?
Comment 29 comio 2021-11-26 11:06:39 UTC
Same error on my AC 9560, but I see the error only when channel 36 is used with 80 MHz bandwidth.

Ciao

luigi

Some info:

Linux abc 5.13.0-21-generic #21-Ubuntu SMP Tue Oct 19 08:59:28 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Firmware version: 46.6b541b68.0 9000-pu-b0-jf-b0-46.ucode op_mode iwlmvm

00:14.3 Network controller: Intel Corporation Cannon Lake PCH CNVi WiFi (rev 10)
	DeviceName: Onboard - Ethernet
	Subsystem: Intel Corporation Wireless-AC 9560 [Jefferson Peak]
	Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 16
	IOMMU group: 8
	Region 0: Memory at ed31c000 (64-bit, non-prefetchable) [size=16K]
	Capabilities: [c8] Power Management version 3
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold+)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [d0] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Capabilities: [40] Express (v2) Root Complex Integrated Endpoint, MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0
			ExtTag- RBE- FLReset+
		DevCtl:	CorrErr- NonFatalErr- FatalErr- UnsupReq-
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr+ NoSnoop+ FLReset-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq- AuxPwr+ TransPend-
		DevCap2: Completion Timeout: Range B, TimeoutDis+ NROPrPrP- LTR+
			 10BitTagComp- 10BitTagReq- OBFF Via WAKE#, ExtFmt- EETLPPrefix-
			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			 FRS-
			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 16ms to 55ms, TimeoutDis- LTR+ OBFF Disabled,
			 AtomicOpsCtl: ReqEn-
	Capabilities: [80] MSI-X: Enable+ Count=16 Masked-
		Vector table: BAR=0 offset=00002000
		PBA: BAR=0 offset=00003000
	Capabilities: [100 v0] Null
	Capabilities: [14c v1] Latency Tolerance Reporting
		Max snoop latency: 0ns
		Max no snoop latency: 0ns
	Capabilities: [164 v1] Vendor Specific Information: ID=0010 Rev=0 Len=014 <?>
	Kernel driver in use: iwlwifi
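
For context on the channel 36 / 80 MHz combination mentioned in this comment, the frequency range involved can be worked out from standard 5 GHz channel numbering (centre frequency in MHz = 5000 + 5 × channel). The sketch below uses hypothetical helper names; the list of 80 MHz centre channels is the commonly used set, and which of them are actually permitted varies by regulatory domain.

```python
# Centre channels of the 80 MHz blocks commonly defined in the 5 GHz band.
# Regulatory availability varies by region; treat this list as an assumption.
CENTER_CHANNELS_80MHZ = (42, 58, 106, 122, 138, 155)


def channel_to_mhz(channel):
    """Centre frequency in MHz of a 5 GHz channel number."""
    return 5000 + 5 * channel


def segment_80mhz(primary_channel):
    """Return (centre_channel, low_edge_mhz, high_edge_mhz) of the 80 MHz block
    that contains the given 20 MHz primary channel."""
    for center in CENTER_CHANNELS_80MHZ:
        # Each 80 MHz block spans four 20 MHz channels around its centre.
        if primary_channel in (center - 6, center - 2, center + 2, center + 6):
            center_mhz = channel_to_mhz(center)
            return center, center_mhz - 40, center_mhz + 40
    raise ValueError(f"channel {primary_channel} is not in a known 80 MHz block")
```

For primary channel 36 this gives centre channel 42 and a span of 5170 to 5250 MHz, which covers channels 36, 40, 44 and 48.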
