Bug 202163

Summary: iwlwifi: 9260: Failed to init the card
Product: Drivers
Reporter: bernard.gautier4
Component: network-wireless-intel
Assignee: DO NOT USE - assign "network-wireless-intel" component instead (linuxwifi)
Status: RESOLVED UNREPRODUCIBLE
Severity: blocking
CC: bernard.gautier4, cordleztoaster, dyegomb, golan.ben.ami, pbrobinson
Priority: P1
Hardware: Intel
OS: Linux
Kernel Version: 4.18.0-13-generic
Subsystem:
Regression: No
Bisected commit-id:
Attachments: dmesg log
dmesg with backport driver (release/core40)
Failed init with dump of stack.
Dump of CSR registers when wifi won't work
Dump of CSR register when wifi will work
Dump of CSR registers when wifi will work with bt blacklisted
dmesg with bt not probed
ignore this

Description bernard.gautier4 2019-01-06 11:39:02 UTC
Created attachment 280283 [details]
dmesg log

The wifi part of my Intel AC 9260 card is not working on my PC.
The log shows:

[   11.417253] iwlwifi 0000:04:00.0: enabling device (0000 -> 0002)
[   11.452900] iwlwifi 0000:04:00.0: loaded firmware version 38.c0e03d94.0 op_mode iwlmvm
[   11.482132] iwlwifi 0000:04:00.0: Detected Intel(R) Dual Band Wireless AC 9260, REV=0x324
[   11.516997] iwlwifi 0000:04:00.0: Failed to init the card
[   11.546638] iwlwifi 0000:04:00.0: Failed to init the card

It worked once but does not work anymore.
Bluetooth is working.

I don't expect this to be a hardware problem, as the same PC also boots Windows, and wifi works properly there.

By the way, the AC 9260 was not working at all with Ubuntu 18.04 LTS and kernel 4.15. After reading about some issues with this kernel, I upgraded Ubuntu to 18.10 (with the default kernel 4.18). It worked once, but in the days after I was not able to get the card working again.
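
A minimal sketch for checking whether a given boot hit this failure, assuming the kernel log is available as text on stdin (for example from dmesg). The function name count_init_failures is hypothetical; the matched string is the error message quoted in the log above.

```shell
#!/bin/sh
# Hypothetical helper: count iwlwifi "Failed to init the card" lines
# in a kernel log read from stdin and print the count.
count_init_failures() {
    # grep -c prints 0 but exits non-zero when nothing matches;
    # "|| true" keeps the helper usable under "set -e".
    grep -c 'iwlwifi .*Failed to init the card' || true
}
```

On a live system this could be run as `dmesg | count_init_failures`; reading the kernel log may require root on some distributions.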
Comment 1 Emmanuel Grumbach 2019-01-06 11:42:42 UTC
Please install our latest backport driver from:

https://wireless.wiki.kernel.org/en/users/drivers/iwlwifi/core_release

And install the Core40 firmware.
Comment 2 bernard.gautier4 2019-01-06 12:50:29 UTC
I did it and also installed the corresponding firmware (43), but the problem is still there.

[   11.392840] Loading modules backported from iwlwifi
[   11.392841] iwlwifi-stack-public:release/core40:7323:c29b654e
[   11.422441] iwlwifi 0000:04:00.0: enabling device (0000 -> 0002)
[   11.444252] iwlwifi 0000:04:00.0: Direct firmware load for iwl-dbg-cfg.ini failed with error -2
[   11.458686] iwlwifi 0000:04:00.0: loaded firmware version 43.95eb4e97.0 op_mode iwlmvm
[   11.477578] iwlwifi 0000:04:00.0: Detected Intel(R) Wireless-AC 9260 160MHz, REV=0x324
[   11.512580] iwlwifi 0000:04:00.0: Failed to init the card
[   11.541330] iwlwifi 0000:04:00.0: Failed to init the card
Comment 3 bernard.gautier4 2019-01-06 12:52:09 UTC
Created attachment 280285 [details]
dmesg with backport driver (release/core40)
Comment 4 Emmanuel Grumbach 2019-01-09 04:19:06 UTC
Oh well.

So even that didn't help.
I'll need to get back to you, but I am really busy with other stuff.
Comment 5 Emmanuel Grumbach 2019-01-09 04:23:15 UTC
Sorry, ignore that comment.
Comment 6 Emmanuel Grumbach 2019-01-09 07:04:49 UTC
Please try this:


diff --git a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c
index 57ae208fe1d5..9ca59bdb50ce 100644
--- a/drivers/net/wireless/intel/iwlwifi/pcie/trans.c
+++ b/drivers/net/wireless/intel/iwlwifi/pcie/trans.c
@@ -386,7 +386,7 @@ static int iwl_pcie_apm_init(struct iwl_trans *trans)
        ret = iwl_poll_bit(trans, CSR_GP_CNTRL,
                           BIT(trans->cfg->csr->flag_mac_clock_ready),
                           BIT(trans->cfg->csr->flag_mac_clock_ready),
-                          25000);
+                          50000);
        if (ret < 0) {
                IWL_ERR(trans, "Failed to init the card\n");
                return ret;
Comment 7 bernard.gautier4 2019-01-09 07:20:32 UTC
Yesterday evening I tried to investigate the problem and had already tried your proposed change (with 100000 instead of 50000). Unfortunately it did not change anything: the 9260 still did not init.
Then I realized that my bug may be a duplicate of https://bugzilla.kernel.org/show_bug.cgi?id=201319.
So I launched Windows and rebooted into Ubuntu. Since then, the wifi card has been working fine. Even after switching the computer off (from Ubuntu) and switching it back on 1 min later into Ubuntu, the wifi card still works.
I left the computer switched off all night, but this morning wifi is still working.

I have also added the dump_stack() lines in the driver code, but so far wifi is working, so I have no dumps.

I will do some more tests to try to make the card init fail again.
Comment 8 Emmanuel Grumbach 2019-01-09 07:36:41 UTC
Oh great.

All this means I need more sleep :)

Is your platform a vPRO platform?
Do you have AMT provisioned?
Comment 9 bernard.gautier4 2019-01-09 07:39:56 UTC
At first I did not fully understand the comments on bug 201319, but I get it now.
So:
1- If I switch off the computer from Windows and switch on into Ubuntu => wifi card does not work.
2- If I reboot the computer from Windows to Ubuntu => wifi card will work with Ubuntu.
3- If I switch off/on or reboot from Ubuntu to Ubuntu and the card is working => card will continue to work.
4- If I switch off/on or reboot from Ubuntu to Ubuntu and the card is not working => card will never work.

I will do more tests to confirm it.

I will reply to your question just after.
Comment 10 bernard.gautier4 2019-01-09 07:41:26 UTC
Created attachment 280355 [details]
Failed init with dump of stack.
Comment 11 Emmanuel Grumbach 2019-01-09 08:20:47 UTC
This is helpful - thanks.

The problem is that I don't have the .ko of iwlwifi, so I can't really do much with the stack trace. I believe your kernel doesn't have frame pointers, so even if you send me the .ko I won't be able to know where in iwl_trans_pcie_alloc we failed.

Can you please add prints through iwl_trans_pcie_alloc so that we know where we fail here?

Thank you.

What you describe about the reboots with Windows and all seems to point to a known issue of BT / WiFi race in INIT flow. After a cold boot BT runs INIT, but not upon reboot. And it seems that we have a problem with BT in INIT in Linux.
Comment 12 bernard.gautier4 2019-01-09 08:33:13 UTC
I am not familiar with vPRO or AMT but:
My CPU is an i7-5960X, so according to ark.intel.com it does not seem to be a vPRO platform.
For AMT, I did not see anything in my BIOS (my motherboard is a Gigabyte X99 SLI).
Comment 13 bernard.gautier4 2019-01-09 08:36:52 UTC
I will add some prints in iwl_trans_pcie_alloc and will let you know.

Regarding your last point, even a power off -> on from Ubuntu to Ubuntu (which I assume is a cold boot) does not make the init fail. Wifi is still working.
Is a power off in Ubuntu not a cold reboot, as it is in Windows?
Comment 14 Emmanuel Grumbach 2019-01-09 10:31:31 UTC
depends on BT....


What is the md5sum output of /lib/firmware/intel/ibt-18-16-1.ddc ?

I want to check that you have:
https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git/commit/?id=d8e8163cf7c082bda83887adf246e866919682f9
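
A minimal sketch of that check, assuming the expected digest has been computed separately from the copy of the file in the commit above. The helper name md5_matches is hypothetical; md5sum is the standard coreutils tool.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the MD5 digest of file $1
# equals the expected hex digest given as $2.
md5_matches() {
    # md5sum prints "<digest>  <filename>"; keep only the digest.
    actual=$(md5sum "$1" | cut -d ' ' -f 1)
    [ "$actual" = "$2" ]
}
```

For example, `md5_matches /lib/firmware/intel/ibt-18-16-1.ddc "$expected" && echo up-to-date`, where `$expected` holds the digest of the reference copy.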
Comment 15 Emmanuel Grumbach 2019-01-09 20:34:35 UTC
Hm... I just checked on my Ubuntu 18.04 system, and I don't have the latest BT FW installed.

Don't know what there is in 18.10.
Can you check?
Comment 16 bernard.gautier4 2019-01-09 20:59:53 UTC
md5sum /lib/firmware/intel/ibt-18-16-1.ddc 
2eeea5dfc754edcc1f701b73f7d89139  ibt-18-16-1.ddc

xxd /lib/firmware/intel/ibt-18-16-1.ddc
00000000: 0328 0118 0442 0145 8004 2901 0300 0327  .(...B.E..)....'
00000010: 0109 0a26 01ff 0000 0000 0000 00         ...&.........

xxd ibt-18-16-1.ddc from commit
00000000: 0328 0118 0442 0145 8004 2901 0300       .(...B.E..)...

These files are short.

I also checked both .sfi files, and their md5sums differ between the commit and what I have on my PC.
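
The same comparison can be done byte for byte against a local copy of the files from the commit. A minimal sketch, assuming the reference files have been downloaded into some directory; the helper name list_differing and the directory layout are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: given an installed directory, a reference directory
# and a list of file names, print each name whose installed copy differs
# from the reference copy or cannot be compared (e.g. a file is missing).
list_differing() {
    installed_dir=$1
    reference_dir=$2
    shift 2
    for name in "$@"; do
        # cmp -s is silent and exits non-zero on any difference or error.
        if ! cmp -s "$installed_dir/$name" "$reference_dir/$name"; then
            printf '%s\n' "$name"
        fi
    done
}
```

For example, `list_differing /lib/firmware/intel /path/to/reference ibt-18-16-1.ddc`, with the .sfi file names appended as further arguments.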
Comment 17 Emmanuel Grumbach 2019-01-09 21:05:14 UTC
yeah sorry, /lib/firmware/intel/ibt-18-16-1.ddc  is not very useful.

so this is good news, you don't have the latest version.
Please install the latest version and do a cold reboot.

Then, please check if that helped with the flows you identified earlier.

thanks.
Comment 18 bernard.gautier4 2019-01-09 21:36:44 UTC
So I copied the 2 .ddc files and the 2 .sfi files from the commit to my /lib/firmware/intel/ and did some tests, but no change: wifi is still not working after a power off from Windows.

I checked iwl_trans_pcie_alloc() and the problem is always in the call to iwl_poll_bit().
So I dumped some registers before this call (by calling iwl_pcie_dump_csr()).

When wifi will work: see attachment reg_ok.txt
When wifi won't work: see attachment reg_bad.txt

By the way, I fear that there is also a problem during the stop/suspend of this driver. I saw some errors with DMA and a few others...
Comment 19 bernard.gautier4 2019-01-09 21:37:39 UTC
Created attachment 280367 [details]
Dump of CSR registers when wifi won't work
Comment 20 bernard.gautier4 2019-01-09 21:37:58 UTC
Created attachment 280369 [details]
Dump of CSR register when wifi will work
Comment 21 Emmanuel Grumbach 2019-01-10 04:00:34 UTC
Thanks for all this!

Did you do a cold reboot?
Shutdown / start?
Comment 22 bernard.gautier4 2019-01-10 08:36:11 UTC
I don't really know whether a cold reboot was done or not, but I switched off from Windows, then powered on and booted into Ubuntu => wifi still not working.

But I could not figure out which firmware is used by BT. Could it be that the files I copied are not the ones used by BT, as there are several other firmware files available in /lib/firmware/intel?

I blacklisted Bluetooth and now wifi is working (as expected).
Find attached the list of CSR registers.
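
A minimal sketch of one common way to blacklist the Bluetooth driver for such a test, assuming the module involved is btusb and that modprobe reads *.conf fragments from /etc/modprobe.d. The helper name and the file name used below are hypothetical, and this is not necessarily how it was done here.

```shell
#!/bin/sh
# Hypothetical helper: write a modprobe.d fragment that blacklists btusb
# to the path given as $1 (overwrites the file if it exists).
write_btusb_blacklist() {
    printf '%s\n' \
        '# Keep btusb from being loaded automatically (test setup only)' \
        'blacklist btusb' > "$1"
}
```

As root, `write_btusb_blacklist /etc/modprobe.d/blacklist-btusb.conf` followed by a reboot would apply it; deleting the file restores normal Bluetooth loading.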
Comment 23 bernard.gautier4 2019-01-10 08:37:22 UTC
Created attachment 280387 [details]
Dump of CSR registers when wifi will work with bt blacklisted
Comment 24 bernard.gautier4 2019-01-15 18:27:23 UTC
Created attachment 280515 [details]
dmesg with bt not probed

Finally, even with btusb blacklisted, wifi does not always start.
Comment 25 dyegomb 2019-07-19 18:59:42 UTC
Created attachment 283845 [details]
ignore this

The same with me:

I've tried the latest version from https://git.kernel.org/pub/scm/linux/kernel/git/firmware/linux-firmware.git

version: 5.2.1-arch1-1-ARCH
firmware-version: 46.a41adfe7.0


working well with:
firmware-version: 46.3cfab8da.0
Comment 26 Golan Ben Ami 2021-12-23 07:57:14 UTC
Sorry for this bug starvation, I'll try to catch up.
This is the latest FW:
https://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/linux-firmware.git/commit/?id=564d97abfa7d0071c47be16e7b691fdc7c6cf22b

I'd like to know if this issue still exists. If so, let's see how we can progress.
Comment 27 Golan Ben Ami 2022-01-04 18:26:30 UTC
Still here, Bernard? :)
Comment 28 bernard.gautier4 2022-01-05 07:49:25 UTC
Yes :)

It was a long time ago.
As far as I remember, the issue was mostly related to dual boot (Win10 and Ubuntu).
What I found is that when Win10 powers down, by default it does not really power down: it goes into a kind of state which enables fast startup. But this seems to leave wifi in a state which prevents it from working when booting into Ubuntu instead of back into Win10.
There is an option in Win10 to do a real power down. Once I enabled it, wifi worked fine with Ubuntu.

So currently my issue is solved with that setup.
Comment 29 Golan Ben Ami 2022-01-05 09:01:48 UTC
Ok, thanks for the update.
With your permission, I'm moving the status to resolved.