Bug 109531

Summary: iwlwifi: 7265: can't load D version of FW - MWG100252580
Product: Drivers Reporter: gordan
Component: network-wireless Assignee: DO NOT USE - assign "network-wireless-intel" component instead (linuxwifi)
Status: CLOSED PATCH_ALREADY_AVAILABLE    
Severity: normal CC: gordan, linuxwifi, linville
Priority: P1    
Hardware: Intel   
OS: Linux   
Kernel Version: 3.18.26 Subsystem:
Regression: Yes Bisected commit-id:
Attachments: lspci.txt
iwlwifi dmesg output
fix rebased to 3.18

Description gordan 2015-12-17 19:18:43 UTC
3.18.25 update appears to break the iwlwifi driver.

3.18.24 works:
Intel(R) Wireless WiFi driver for Linux, in-tree:
Copyright(c) 2003- 2014 Intel Corporation
iwlwifi 0000:04:00.0: enabling device (0000 -> 0002)
iwlwifi 0000:04:00.0: irq 34 for MSI/MSI-X
iwlwifi 0000:04:00.0: loaded firmware version 23.15.10.0 op_mode iwlmvm
iwlwifi 0000:04:00.0: Detected Intel(R) Dual Band Wireless AC 7265, REV=0x210
iwlwifi 0000:04:00.0: L1 Enabled - LTR Enabled
iwlwifi 0000:04:00.0: L1 Enabled - LTR Enabled
ieee80211 phy0: Selected rate control algorithm 'iwl-mvm-rs'
iwlwifi 0000:04:00.0 wlp4s0: renamed from wlan0

3.18.25 doesn't:
Intel(R) Wireless WiFi driver for Linux, in-tree:
Copyright(c) 2003- 2014 Intel Corporation
iwlwifi 0000:04:00.0: irq 34 for MSI/MSI-X
iwlwifi 0000:04:00.0: loaded firmware version 23.15.10.0 op_mode iwlmvm
iwlwifi 0000:04:00.0: Detected Intel(R) Dual Band Wireless AC 7265, REV=0x210
iwlwifi 0000:04:00.0: L1 Enabled - LTR Enabled
iwlwifi 0000:04:00.0: L1 Enabled - LTR Enabled
iwlwifi 0000:04:00.0: Failed to start INIT ucode: -110
iwlwifi 0000:04:00.0: Failed to run INIT ucode: -110

It looks like the problem is with firmware handling. The firmware appears to be the latest one listed for 3.17+ kernels.
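On x86 Linux, errno 110 is ETIMEDOUT, so the -110 in the failing log means that something timed out while the INIT ucode was starting. Below is a minimal sketch for spotting that failure line in a kernel log. The function name init_ucode_timed_out is hypothetical and not part of any existing tool.

```shell
# Read a kernel log on stdin and report, via exit status, whether it contains
# the INIT ucode timeout line quoted in this report (0 = found).
init_ucode_timed_out() {
    grep -q 'iwlwifi .*Failed to start INIT ucode: -110'
}

# Usage (assumes permission to read the kernel log):
#   dmesg | init_ucode_timed_out && echo "INIT ucode timed out"
```

The pattern ignores any leading timestamp, so it should work on both plain and timestamped dmesg output.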
Comment 1 Johannes Berg 2015-12-18 13:30:54 UTC
Likely due to my commit 8babe12bfba735a55fd42e3ba3d0178eb3383c37
Author: Johannes Berg <johannes.berg@intel.com>
Date:   Tue Nov 18 15:39:51 2014 +0100

    iwlwifi: pcie: support 7265-D devices
    
    [ Upstream commit 3fd0d3c170ad6ba8b64e16938f699d0b43cc782e ]

does the bug go away if you install the 7265D firmware image(s)?
Comment 2 gordan 2015-12-18 13:58:06 UTC
7265D firmware blobs are already there:

# ls -la /lib/firmware/*7265*
-rw-r--r--. 1 root root  736844 Mar  5  2015 /lib/firmware/iwlwifi-7265-10.ucode
-rw-r--r--. 1 root root  880604 May  3  2015 /lib/firmware/iwlwifi-7265-12.ucode
-rw-r--r--. 1 root root  740436 Mar  5  2015 /lib/firmware/iwlwifi-7265D-10.ucode
-rw-r--r--. 1 root root 1002800 May  3  2015 /lib/firmware/iwlwifi-7265D-12.ucode
Comment 3 Johannes Berg 2015-12-18 14:05:16 UTC
Oh, right, it's actually loading it, sorry.

What if you rename the 7265 files to 7265D? That would load the original one - I just want to be sure that it's really due to this change; this obviously isn't really a fix.
Comment 4 gordan 2015-12-18 14:15:38 UTC
Yes, renaming iwlwifi-7265-10.ucode to iwlwifi-7265D-10.ucode does appear to make it work. So it looks like it is picking the wrong firmware file.
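The rename test can be sketched as a small helper that keeps a backup of the original 7265D image, so the change is easy to undo. This is a diagnostic workaround only, not a fix. The function name use_non_d_firmware and the ".orig" backup suffix are hypothetical choices for illustration.

```shell
# Copy the non-D firmware image over the 7265D file name, backing up the
# original 7265D image first.
#   $1 = firmware directory (e.g. /lib/firmware)
#   $2 = firmware API version (e.g. 10)
use_non_d_firmware() {
    dir=$1
    api=$2
    src="$dir/iwlwifi-7265-$api.ucode"
    dst="$dir/iwlwifi-7265D-$api.ucode"
    [ -f "$src" ] || { echo "missing $src" >&2; return 1; }
    # Keep the original D image (only once) so it can be restored later.
    if [ -f "$dst" ] && [ ! -e "$dst.orig" ]; then
        mv "$dst" "$dst.orig" || return 1
    fi
    cp "$src" "$dst"
}

# Usage (needs root for /lib/firmware):
#   use_non_d_firmware /lib/firmware 10
```

The driver has to be reloaded, or the machine rebooted, before the copied file is picked up.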
Comment 5 Emmanuel Grumbach 2015-12-19 20:02:17 UTC
Thank you for your report.

This is extremely strange. It'd mean that you have a device that identifies itself as 7265D (through the register) but is not a real 7265D device. I'll try to get information internally. Note that the 7265D is "newer" than 7265 and it has more room for firmware, hence the different firmware versions.

Are you sure your device is not an engineering sample?
Did it come from a known OEM, or did you buy it on the internet?

can you share the output of:

sudo lspci -vvvv -xxxx

thank you.
Comment 6 gordan 2015-12-20 15:16:12 UTC
Created attachment 197881
lspci.txt
Comment 7 gordan 2015-12-20 15:18:58 UTC
Output of lspci is attached.

The WiFi module came with a new Clevo laptop a couple of months ago, so I don't think it is an engineering sample.
Comment 8 Emmanuel Grumbach 2015-12-20 15:22:23 UTC
Thanks for the data.
Checking internally.
Comment 9 Emmanuel Grumbach 2015-12-20 19:56:14 UTC
Can you please do:
echo 1 >  /sys/kernel/debug/iwlwifi/0000\:04\:00.0/trans/csr

and attach the dmesg output?

Thanks.
Comment 10 gordan 2015-12-20 19:59:43 UTC
There is no /sys/kernel/debug/iwlwifi directory.
Comment 11 Emmanuel Grumbach 2015-12-20 20:10:52 UTC
Hmm... probably you don't have CONFIG_IWLWIFI_DEBUGFS set?
Comment 12 Emmanuel Grumbach 2015-12-20 20:11:44 UTC
If you want, I can send you a link to a backport based tree with our driver. This will save you the compilation of the kernel to change the Kconfig option.
Comment 13 gordan 2015-12-20 20:15:17 UTC
Indeed:
# CONFIG_IWLWIFI_DEBUGFS is not set

I'll recompile my kernel and report back.
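For reference, a minimal sketch of checking a kernel config option without opening the file by hand. The function name kconfig_state is hypothetical, and the config file location is an assumption, since distributions store it in different places.

```shell
# Print "set" or "not set" for a kernel config option.
#   $1 = option name, $2 = path to a plain-text kernel config file
kconfig_state() {
    if grep -Eq "^$1=(y|m)\$" "$2"; then
        echo "set"
    else
        echo "not set"
    fi
}

# Usage (the path below is an assumption and may differ on your system):
#   kconfig_state CONFIG_IWLWIFI_DEBUGFS "/boot/config-$(uname -r)"
```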
Comment 14 gordan 2015-12-20 20:51:35 UTC
Created attachment 197901
iwlwifi dmesg output

dmesg CSR output attached.
Comment 15 Emmanuel Grumbach 2015-12-20 20:59:52 UTC
To rule out bad firmware, can you please install our backport-based driver so that you can use the latest firmware?

https://wireless.wiki.kernel.org/en/users/drivers/iwlwifi/core_release

I'd like to make sure that the bug isn't in the firmware itself.
Thanks!
Comment 16 Emmanuel Grumbach 2015-12-20 21:01:24 UTC
CSR_HW_REV: 0X00000210

This register tells us that you have a 7265D device which should be able to use 7265D.ucode
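The same value also appears as REV= in the "Detected ..." log line. Below is a minimal sketch that pulls it out of a log and compares it with 0x210, the value identified above as a 7265D. The function name is_7265d_rev is hypothetical, and the comparison only mirrors what is said in this thread; it is not taken from the driver source.

```shell
# Read log lines on stdin, extract the first REV=0x... value, and succeed
# (exit status 0) only if it equals 0x210.
is_7265d_rev() {
    rev=$(sed -n 's/.*REV=\(0[xX][0-9A-Fa-f][0-9A-Fa-f]*\).*/\1/p' | head -n 1)
    # No REV= value found means "not a 7265D as far as we can tell".
    [ -n "$rev" ] && [ $(($rev)) -eq $((0x210)) ]
}

# Usage:
#   dmesg | grep 'Detected Intel' | is_7265d_rev && echo "7265D"
```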
Comment 17 gordan 2015-12-20 21:08:33 UTC
Which version of the driver should I build? 14? 15?

iwlwifi-7265-10.ucode works with the driver in 3.18.25 (when I copy the file to iwlwifi-7265D-10.ucode)
iwlwifi-7265D-10.ucode results in the error above. The firmwares are the latest from:

https://wireless.wiki.kernel.org/_media/en/users/drivers/iwlwifi-7265-ucode-23.15.10.0.tgz
Comment 18 Emmanuel Grumbach 2015-12-20 21:11:58 UTC
(In reply to gordan from comment #17)
> Which version of the driver should I build? 14? 15?
> 

Let's go for 15 which means -18.ucode

> iwlwifi-7265-10.ucode works with the driver in 3.18.25 (when I copy the file
> to iwlwifi-7265D-10.ucode)
> iwlwifi-7265D-10.ucode results in the error above. The firmwares are the
> latest from:
> 
> https://wireless.wiki.kernel.org/_media/en/users/drivers/iwlwifi-7265-ucode-
> 23.15.10.0.tgz

Right - you are doing the right thing :) You have the latest firmware your kernel supports :)
Comment 19 gordan 2015-12-20 21:22:55 UTC
With the v15 driver it seems to work (downloaded the firmware blobs listed on the same page).

$ dmesg | grep iwl
Loading modules backported from iwlwifi
iwlwifi-stack-public:release/LinuxCore15:4768:96149b8f
iwlwifi 0000:04:00.0: enabling device (0000 -> 0002)
iwlwifi 0000:04:00.0: irq 34 for MSI/MSI-X
iwlwifi 0000:04:00.0: Direct firmware load for iwl-dbg-cfg.ini failed with
iwlwifi 0000:04:00.0: loaded firmware version 18.261294.0 op_mode iwlmvm
iwlwifi 0000:04:00.0: Detected Intel(R) Dual Band Wireless AC 7265, REV=0x210
iwlwifi 0000:04:00.0: L1 Enabled - LTR Enabled
iwlwifi 0000:04:00.0: L1 Enabled - LTR Enabled
ieee80211 phy0: Selected rate control algorithm 'iwl-mvm-rs'
iwlwifi 0000:04:00.0 wlp4s0: renamed from wlan0
iwlwifi 0000:04:00.0: L1 Enabled - LTR Enabled
iwlwifi 0000:04:00.0: L1 Enabled - LTR Enabled
iwlwifi 0000:04:00.0: L1 Enabled - LTR Enabled
iwlwifi 0000:04:00.0: L1 Enabled - LTR Enabled

It loaded firmware v18, so it seems to be the correct driver (the one I just built).

I'm not sure what this proves, though (new driver with new firmware works, but it isn't obvious from dmesg output whether the new driver tried to load the 7265 blob or the 7265D blob).
Comment 20 Emmanuel Grumbach 2015-12-20 21:29:36 UTC
Great.

The easiest is just to remove 7265-18.ucode and leave only 7265D-18.ucode.
This will prove that 7265D.ucode was loaded.
Comment 21 gordan 2015-12-20 21:39:41 UTC
OK, it still works with the iwlwifi-7265-18.ucode file moved out of the way.

I'm not sure, however, how that indicates whether the 3.18.25 driver or v10 firmware are at fault. All we still know is that the 3.18.25 driver works with the v10 7265 firmware but not the v10 7265D firmware.
Comment 22 Emmanuel Grumbach 2015-12-20 21:48:32 UTC
v10 is clearly at fault.
The driver doesn't do anything at that stage, just waits for the firmware to finish the INIT phase... and it doesn't.
We clearly need to revert the patch Johannes mentioned above.
I just need to think about how to do it...
We also need to delete 7265D-10.ucode from linux-firmware.git.

Oh well.. I'll need to check with Greg / firmware folks how to handle that.

Thanks for your cooperation!
Comment 23 Emmanuel Grumbach 2015-12-21 07:34:05 UTC
I replaced 7265D-10.ucode with 7265-10.ucode in http://git.kernel.org/cgit/linux/kernel/git/iwlwifi/linux-firmware.git/

Pull request has been sent upstream and communicated to our OSV collaboration team to accelerate the process of introducing this change in distributions.

Thanks for reporting this bug.
Comment 24 gordan 2016-01-26 11:33:22 UTC
This appears to still be broken in 3.18.26.

The API version has been bumped up to 12 (from 10) (which means that it also now needs to be pointed out on the firmware download page that kernels 3.18.26+ need iwlwifi-7265-ucode-25.17.12.0.tgz firmware blobs).

However, the only way to make my WiFi card work with 3.18.26 kernel is to copy
iwlwifi-7265-12.ucode
to
iwlwifi-7265D-12.ucode

# dmesg | grep iwl
[    3.900196] iwlwifi 0000:04:00.0: enabling device (0000 -> 0002)
[    3.900327] iwlwifi 0000:04:00.0: irq 33 for MSI/MSI-X
[    3.904409] iwlwifi 0000:04:00.0: loaded firmware version 25.17.12.0 op_mode iwlmvm
[    3.910980] iwlwifi 0000:04:00.0: Detected Intel(R) Dual Band Wireless AC 7265, REV=0x210
[    3.911049] iwlwifi 0000:04:00.0: L1 Enabled - LTR Enabled
[    3.911226] iwlwifi 0000:04:00.0: L1 Enabled - LTR Enabled
[    3.998163] ieee80211 phy0: Selected rate control algorithm 'iwl-mvm-rs'
[    4.000879] iwlwifi 0000:04:00.0 wlp4s0: renamed from wlan0
[    6.496631] iwlwifi 0000:04:00.0: L1 Enabled - LTR Enabled
[    6.496814] iwlwifi 0000:04:00.0: L1 Enabled - LTR Enabled

This reports 7265 but tries to load the 7265D firmware blob which fails.
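The file names in this report follow the pattern iwlwifi-<device>-<api>.ucode. Below is a minimal sketch of working out which file a given adapter and kernel would be expected to need. The function name fw_file_for is hypothetical, and the REV-to-device mapping is a simplification based only on what is stated in this thread, not on the driver's actual selection code.

```shell
# Print the firmware file name expected for a 7265-family adapter.
#   $1 = REV value as a hex literal (e.g. 0x210)
#   $2 = firmware API version (e.g. 12)
fw_file_for() {
    rev=$1
    api=$2
    # Per this thread, REV=0x210 denotes the 'D' variant.
    if [ $(($rev)) -eq $((0x210)) ]; then
        dev=7265D
    else
        dev=7265
    fi
    printf 'iwlwifi-%s-%s.ucode\n' "$dev" "$api"
}

# Usage:
#   ls -l "/lib/firmware/$(fw_file_for 0x210 12)"
```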
Comment 25 Emmanuel Grumbach 2016-01-26 12:51:04 UTC
Hi again,

We never print:
Detected Intel(R) Dual Band Wireless AC 7265D, REV=0x210
Only 7265 even for 'D' devices.

According to the REV= value (0x210), your device is a 'D' model.

Can I please ask you to try to install our Core release bundle?
I would like to know what happens on latest firmware and for that you'd need to update your driver:

https://wireless.wiki.kernel.org/en/users/drivers/iwlwifi/core_release#core_release

Please try to install the Core16 bundle (driver + firmware, no need to update the supplicant).

If that does not work, then we will try to see how come you can't load the 'D' firmware on your device.

Thank you.
Comment 26 gordan 2016-01-26 13:57:22 UTC
This is a bit Deja Vu-ish. Can we not debug the problem with 3.18.26 using the driver in 3.18.26, with the prescribed firmware for it (.10 up to 3.18.25 and .12 from 3.18.26)?

It seems that last time troubleshooting with the latest driver and firmware resulted in backport of a fix that doesn't seem to have fixed the problem.
Comment 27 Emmanuel Grumbach 2016-01-26 14:07:10 UTC
oh right. Sorry.

Will refresh my memory with the bug and get back to you.
Comment 28 Emmanuel Grumbach 2016-02-07 15:21:59 UTC
OK... I reproduced the bug. I was convinced it was a firmware issue, but ...it is a driver bug. In fact there is a dependency that is not satisfied in 3.18.

I hope I'll have some time tonight to debug this a bit further.
Comment 29 Emmanuel Grumbach 2016-02-07 20:53:07 UTC
Created attachment 203101
fix rebased to 3.18

I just sent this patch to stable for 3.18.

This should fix your issue. Can you test?

thanks
Comment 30 Emmanuel Grumbach 2016-02-08 07:29:35 UTC
Will close when I get feedback. Moving to resolved for now.
Comment 31 Emmanuel Grumbach 2016-02-14 08:27:57 UTC
Assuming all is good now.
Comment 32 gordan 2016-03-07 16:27:36 UTC
Still broken in the same way on 3.18.28 released on Friday:

# dmesg | grep -i iwl
[   38.705879] iwlwifi 0000:04:00.0: enabling device (0000 -> 0002)
[   38.706041] iwlwifi 0000:04:00.0: irq 34 for MSI/MSI-X
[   38.710581] iwlwifi 0000:04:00.0: loaded firmware version 25.17.12.0 op_mode iwlmvm
[   38.721267] iwlwifi 0000:04:00.0: Detected Intel(R) Dual Band Wireless AC 7265, REV=0x210
[   38.721336] iwlwifi 0000:04:00.0: L1 Enabled - LTR Enabled
[   38.721528] iwlwifi 0000:04:00.0: L1 Enabled - LTR Enabled
[   39.727255] iwlwifi 0000:04:00.0: Failed to start INIT ucode: -110
[   39.729699] iwlwifi 0000:04:00.0: Failed to run INIT ucode: -110

Same workaround as before still works:
[root@localhost /lib/firmware]# cp iwlwifi-7265-12.ucode iwlwifi-7265D-12.ucode
Comment 33 Emmanuel Grumbach 2016-03-07 16:59:18 UTC
I sent the fix to stable:

http://www.spinics.net/lists/stable/msg118348.html
But it was not applied to 3.18 yet.
This patch is already upstream.

re-closing the bug.