Bug 197591 - iwlwifi: 8265: ucode 34 crashes with kernel 4.14 rc6
Status: CLOSED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: network-wireless
Hardware: x86-64 Linux
Importance: P1 high
Assignee: DO NOT USE - assign "network-wireless-intel" component instead
URL:
Keywords:
Duplicates: 197983 197995
Depends on:
Blocks:
 
Reported: 2017-10-30 21:35 UTC by GoodMirek
Modified: 2017-11-27 13:37 UTC
CC List: 9 users

See Also:
Kernel Version: 4.14 RC6
Tree: Mainline
Regression: Yes


Attachments
dmesg with kernel 4.14 rc6 (crashing) (136.68 KB, text/plain)
2017-10-30 21:35 UTC, GoodMirek
dmesg with kernel 4.13.10 (75.20 KB, text/plain)
2017-10-30 21:36 UTC, GoodMirek
dmesg with kernel 4.13.10, coexistence switched on (74.76 KB, text/plain)
2017-10-30 21:38 UTC, GoodMirek
dmesg with kernel 4.14 rc7 (crashing) (84.23 KB, text/plain)
2017-10-31 08:40 UTC, GoodMirek
Patch with the potential fix. (11.66 KB, patch)
2017-11-10 19:55 UTC, Luca Coelho

Description GoodMirek 2017-10-30 21:35:03 UTC
Created attachment 260439 [details]
dmesg with kernel 4.14 rc6 (crashing)

dmesg attached, as requested by emmanuel.grumbach@intel.com

inxi -Fxzc0
System:    Host: laptop Kernel: 4.14.0-0.rc6.git3.1.fc28.x86_64 x86_64 bits: 64 gcc: 7.2.1 Desktop: Gnome 3.24.3
           Distro: Fedora release 26 (Twenty Six)
Machine:   Device: laptop System: HP product: HP EliteBook 850 G4 serial: <filter>
           Mobo: HP model: 828C v: KBC Version 45.3C serial: <filter> UEFI: HP v: P78 Ver. 01.08 date: 10/17/2017
Battery    BAT0: charge: 48.0 Wh 100.0% condition: 48.0/48.0 Wh (100%)
           model: Hewlett-Packard Primary status: Full
CPU:       Dual core Intel Core i5-7200U (-HT-MCP-) arch: Kaby Lake rev.9 cache: 3072 KB
           flags: (lm nx sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx) bmips: 10848
           clock speeds: max: 3100 MHz 1: 2700 MHz 2: 2700 MHz 3: 2700 MHz 4: 2700 MHz
Graphics:  Card: Intel HD Graphics 620 bus-ID: 00:02.0
           Display Server: x11 (X.org 1.19.3 ) driver: i915 Resolution: 1920x1080@60.05hz
           OpenGL: renderer: Mesa DRI Intel HD Graphics 620 (Kaby Lake GT2)
           version: 4.5 Mesa 17.2.2 Direct Render: Yes
Audio:     Card Intel Sunrise Point-LP HD Audio driver: snd_hda_intel bus-ID: 00:1f.3
           Sound: Advanced Linux Sound Architecture v: k4.14.0-0.rc6.git3.1.fc28.x86_64
Network:   Card-1: Intel Ethernet Connection (4) I219-V driver: e1000e v: 3.2.6-k bus-ID: 00:1f.6
           IF: enp0s31f6 state: down mac: <filter>
           Card-2: Intel Wireless 8265 / 8275 driver: iwlwifi bus-ID: 02:00.0
           IF: wlp2s0 state: up mac: <filter>
Drives:    HDD Total Size: 256.1GB (39.1% used)
           ID-1: /dev/nvme0n1 model: SAMSUNG_MZVLW256HEHP size: 256.1GB temp: 32C
Partition: ID-1: / size: 120G used: 93G (78%) fs: xfs dev: /dev/dm-2
           ID-2: /boot size: 1018M used: 426M (42%) fs: xfs dev: /dev/nvme0n1p6
RAID:      No RAID devices: /proc/mdstat, md_mod kernel module present
Sensors:   System Temperatures: cpu: 32.5C mobo: 0.0C
           Fan Speeds (in rpm): cpu: N/A
Info:      Processes: 317 Uptime: 11 min Memory: 3927.2/15806.8MB Init: systemd runlevel: 5 Gcc sys: 7.2.1
           Client: Shell (bash 4.4.121) inxi: 2.3.40
Comment 1 GoodMirek 2017-10-30 21:36:26 UTC
Created attachment 260441 [details]
dmesg with kernel 4.13.10

coexistence switched off
Comment 2 GoodMirek 2017-10-30 21:38:09 UTC
Created attachment 260443 [details]
dmesg with kernel 4.13.10, coexistence switched on

A connected Bluetooth mouse stops working as soon as the WiFi network connects (already the subject of bug https://bugzilla.kernel.org/show_bug.cgi?id=197061).
Comment 3 GoodMirek 2017-10-31 08:39:29 UTC
Tested with 4.14 rc7, crashing too. dmesg attached.
Comment 4 GoodMirek 2017-10-31 08:40:12 UTC
Created attachment 260447 [details]
dmesg with kernel 4.14 rc7 (crashing)
Comment 5 Luca Coelho 2017-11-03 10:24:16 UTC
I'll take a look.  There were other similar reports too.
Comment 6 Luca Coelho 2017-11-10 19:55:08 UTC
Created attachment 260597 [details]
Patch with the potential fix.

Okay, I finally had the time to look into it and found what the problem was.  The FW scan command API has changed, but I mistakenly held back the commits that adapted to it in our internal tree, because they depended on other parts that are not yet upstream. :(

Anyway, I have created a patch and touch-tested it and it seems to solve the problem.

Could you please try this patch and see if it works for you?
Comment 7 mart.b 2017-11-10 21:40:00 UTC
@Luca  

As I wrote in my email, your patches work fine with the latest rc8 (39dae59) and my 8260.
Thanks for taking the time to fix this.

Kind regards
Martin
Comment 8 Luca Coelho 2017-11-10 21:46:34 UTC
Great, thanks for testing! I'm marking this as resolved.  We can reopen it if needed.
Comment 9 GoodMirek 2017-11-13 12:23:29 UTC
Is there any chance to download the bin firmware file with this patch?
Comment 10 Luca Coelho 2017-11-13 12:50:10 UTC
This is not a FW patch (and you can't build the firmware anyway -- it's proprietary).

This is a kernel patch and you need to apply it to your kernel sources and rebuild the iwlwifi/iwlmvm modules.
Comment 11 GoodMirek 2017-11-13 18:13:50 UTC
As described in bug 197061, I was able to build iwlwifi with the provided patch. Firmware 34 loads and WiFi connects, but BT does not work properly.
Kernel used for testing was:
uname -r
4.14.0-0.rc8.git3.1.fc28.x86_64
Comment 12 Emmanuel Grumbach 2017-11-13 18:43:01 UTC
This bug is about a problem in the scan API implementation; let's not mix it with BT issues.
Comment 13 mart.b 2017-11-13 19:00:28 UTC
I cannot confirm any Bluetooth problems on my 8260 using the patches Luca made. Streaming audio to my AV receiver using A2DP works flawlessly, as do transferring files, pairing, and anything else I throw at it.

But I have to agree with Emmanuel: this should get its own bug report, and if it already has one, just stick to that one.
Comment 14 Luca Coelho 2017-11-13 19:08:49 UTC
Yes, I agree.  This is already fixed and verified by a couple of people.  Let's only comment here if the *same* problem pops up again, which it should not. :)
Comment 15 GoodMirek 2017-11-13 20:01:41 UTC
I have successfully applied the patch on kernel 4.14.0 (4.14.0-1.fc28.x86_64).
WiFi works with the patch and firmware 34.ucode.
Per the previous comments, it seems the change in BT behavior is not in the scope of this ticket. Thanks for the explanation.
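
To confirm which firmware version the driver actually loaded, the kernel log can be filtered. This is a minimal sketch, assuming the driver logs a line containing "loaded firmware version" followed by the version string; the function name `fw_versions` is hypothetical and not part of any tool mentioned in this report.

```shell
# Hypothetical helper: read kernel log lines on stdin and print the version
# string that follows "loaded firmware version" on iwlwifi lines.
# Assumption: the version is the first whitespace-delimited word after that phrase.
fw_versions() {
    sed -n 's/.*iwlwifi.*loaded firmware version \([^ ]*\).*/\1/p'
}
```

It would typically be used as `dmesg | fw_versions`. A leading 34 in the output would indicate the newer firmware discussed here, and a leading 31 the older one.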
Comment 16 Luca Coelho 2017-11-25 10:56:30 UTC
*** Bug 197983 has been marked as a duplicate of this bug. ***
Comment 17 sandy.8925 2017-11-26 16:09:43 UTC
Hello, will these changes be pushed out in a point release for 4.14? WiFi is completely broken and not working for my Thinkpad X1 Carbon (4th gen) running Arch Linux.
Comment 18 Luca Coelho 2017-11-26 16:31:39 UTC
Yes, I'm hoping it will make it into 4.14.2.  It's already in Linus' tree and, from there, it should be picked up for 4.14 stable.

Meanwhile, you can either cherry-pick this commit from Linus' tree:

dac4df1c5f2c ("iwlwifi: mvm: support version 7 of the SCAN_REQ_UMAC FW command")

...or you can just remove iwlwifi-8265-34.ucode from your /lib/firmware directory, so the driver will load the older version 31 instead.
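
The firmware workaround above can be sketched as a small helper that moves the file to a backup location instead of deleting it, so it can be restored once a fixed kernel is installed. The function name `sideline_firmware` and the backup directory are assumptions made for illustration.

```shell
# Hypothetical helper: move one firmware file out of the firmware directory
# so the driver cannot find it and falls back to an older version.
# Arguments: firmware directory, file name, backup directory (caller's choice).
sideline_firmware() {
    fw_dir=$1
    fw_file=$2
    backup_dir=$3

    # Refuse to continue if the file is not there.
    if [ ! -f "$fw_dir/$fw_file" ]; then
        echo "not found: $fw_dir/$fw_file" >&2
        return 1
    fi

    mkdir -p "$backup_dir" || return 1
    mv "$fw_dir/$fw_file" "$backup_dir/$fw_file"
}
```

On an affected system this would be run as root, for example `sideline_firmware /lib/firmware iwlwifi-8265-34.ucode /root/fw-backup`, followed by a reboot or a driver reload (`modprobe -r iwlmvm iwlwifi`, then `modprobe iwlwifi`). A later update of the distribution's firmware package will probably put the file back.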
Comment 19 sandy.8925 2017-11-26 16:53:05 UTC
Uh - 4.14.2 has already been released. Based on the changelog (and my experience trying it out) the fix was not included - https://cdn.kernel.org/pub/linux/kernel/v4.x/ChangeLog-4.14.2

Since Arch Linux's linux-firmware package was updated to include version 34 of the firmware, WiFi on my laptop stopped working (and there was no fallback).

IMO linux-firmware git repo should follow the kernel's versioning scheme to ensure the correct firmware is used with the appropriate kernel versions. Otherwise, drivers will start failing in a seemingly random fashion due to differences in packaging between different distros.
Comment 20 Luca Coelho 2017-11-26 18:12:50 UTC
Yeah, sorry, typo about the stable version.  It will probably be in the next one to be released next week (or the next, as I can't really guarantee when Greg will pick it up).

About linux-firmware.git, we already have a scheme that solves the problem you are mentioning.  We have the FW versions.  We bump up the version when the API actually changes, so old drivers can't work with it anymore.  But newer drivers can also handle a bunch of older versions too, and that's why I recommended that you take iwlwifi-8265-31.ucode instead.  This older version *is* in the linux-firmware.git tree.  So, if you remove the newer version (that was broken without my driver patch), the driver will load the older version and work happily.

It was an actual *bug* that caused this problem.  There is nothing wrong with our process of getting the correct firmware to work with the driver you chose.
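
The version fallback described above can be sketched as follows: try the newest firmware API version the driver supports, and step down until a matching file exists. This illustrates the idea only and is not the driver's code; the function name `pick_firmware` and the version range passed to it are assumptions.

```shell
# Sketch of the fallback: walk from the highest supported firmware API
# version down to the lowest and print the first file that exists.
# Function name and arguments are illustrative, not taken from the driver.
pick_firmware() {
    fw_dir=$1      # directory holding the firmware files
    prefix=$2      # file name prefix, e.g. iwlwifi-8265
    api=$3         # highest API version the driver supports
    api_min=$4     # lowest API version the driver supports

    while [ "$api" -ge "$api_min" ]; do
        candidate="$fw_dir/$prefix-$api.ucode"
        if [ -f "$candidate" ]; then
            echo "$candidate"
            return 0
        fi
        api=$((api - 1))
    done

    echo "no suitable firmware for $prefix (API $api_min..$3)" >&2
    return 1
}
```

For example, `pick_firmware /lib/firmware iwlwifi-8265 34 22` prints the path of the version 34 file when it is present, and the path of the version 31 file once the newer one has been removed and nothing in between exists. The range 22..34 is a placeholder, not a value taken from the driver source.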
Comment 21 sandy.8925 2017-11-26 19:03:26 UTC
Ah got it. I installed a custom linux-firmware package with the newer firmware files removed. Once the relevant change is available in a stable release, I'll try it out. Thank you!
Comment 22 Emmanuel Grumbach 2017-11-27 13:37:05 UTC
*** Bug 197995 has been marked as a duplicate of this bug. ***
