Hello! So, after boot to kernel 4.12.0, I found that my Intel 8260 WiFi cannot get IP address. I've restarted dhcpcd manually, and it worked, but after suspend-resume cycle, problem reappears - no address. I've rebooted - same problem. After reboot to 4.11.6 - no such problem, dhcpcd can get IP address. My hardware is Thinkpad T460s
Here is a dmesg: [ 1.572417] iwlwifi 0000:04:00.0: enabling device (0000 -> 0002) [ 1.579121] iwlwifi 0000:04:00.0: Direct firmware load for iwlwifi-8000C-30.ucode failed with error -2 [ 1.579142] iwlwifi 0000:04:00.0: Direct firmware load for iwlwifi-8000C-29.ucode failed with error -2 [ 1.579157] iwlwifi 0000:04:00.0: Direct firmware load for iwlwifi-8000C-28.ucode failed with error -2 [ 1.591271] iwlwifi 0000:04:00.0: capa flags index 3 larger than supported by driver [ 1.591746] iwlwifi 0000:04:00.0: loaded firmware version 27.455470.0 op_mode iwlmvm [ 1.611860] iwlwifi 0000:04:00.0: Detected Intel(R) Dual Band Wireless AC 8260, REV=0x208 [ 1.613747] iwlwifi 0000:04:00.0: L1 Enabled - LTR Enabled [ 1.614840] iwlwifi 0000:04:00.0: L1 Enabled - LTR Enabled [ 1.760741] iwlwifi 0000:04:00.0 wlp4s0: renamed from wlan0 [ 4.004068] iwlwifi 0000:04:00.0: L1 Enabled - LTR Enabled [ 4.005272] iwlwifi 0000:04:00.0: L1 Enabled - LTR Enabled [ 4.149677] iwlwifi 0000:04:00.0: L1 Enabled - LTR Enabled [ 4.151074] iwlwifi 0000:04:00.0: L1 Enabled - LTR Enabled
Hi, can we have the syslog? I can't see why dhcpd works only after being restarted...
Created attachment 257477 [details] syslog before and after suspend
Attached log file. I've used static IP address now, and the bug is worse - traffic is not working at all. I do not understand how dhcpcd is fixing it, but starting it afterwards fixed traffic again.
I can't see anything related to the dhcp ok your syslog. Weird. I'll ask tracing with -e iwlwifi -e iwlwifi_data. I have to inform you that this will show ask the data in plaintext.
Yeah, in this log I'm using static IP, and the traffic is not working. Trying now udhcpc client
Can you try also with power_scheme=1 passed as a module parameter to iwlmvm? Please add this as an option in iwlwifi.conf.
I found the cause - it's preassoc_mac_addr=2 in wpa_supplicant.conf. When it's disabled - no problem
I'm not sure where this iwlwifi.conf should be placed - I'm on Gentoo, and there is no such flag. Can I just pass iwlmvm.power_scheme=1 on kernel cmdline?
Tried to reboot with iwlmvm.power_scheme=1 and preassoc_mac_addr=2 - same result, no address. So it's definitely a cause. But it worked in 4.11...
Please check the firmware version in both kernel. We print the firmware version when iwlwifi loads.
They're the same: yar@thinkpad ~ $ sudo journalctl -b | egrep "(Linux version|iwlwifi.*firmware version)" июл 13 18:36:15 thinkpad.lan kernel: Linux version 4.12.0-gentoo (root@thinkpad.lan) (gcc version 5.4.0 (Gentoo 5.4.0-r3 p1.3, pie-0.6.5) ) #6 SMP PREEMPT Thu Jul 13 02:48:16 MSK 2017 июл 13 18:36:16 thinkpad.lan kernel: iwlwifi 0000:04:00.0: loaded firmware version 27.455470.0 op_mode iwlmvm yar@thinkpad ~ $ sudo journalctl -b -1 | egrep "(Linux version|iwlwifi.*firmware version)" июл 13 18:35:24 thinkpad.lan kernel: Linux version 4.11.6-gentoo (root@thinkpad.lan) (gcc version 5.4.0 (Gentoo 5.4.0 p1.0, pie-0.6.5) ) #1 SMP PREEMPT Fri Jun 23 18:40:50 MSK 2017 июл 13 18:35:25 thinkpad.lan kernel: iwlwifi 0000:04:00.0: loaded firmware version 27.455470.0 op_mode iwlmvm
All right, I'll take a look at the diff in the driver a bit later.
I can't have access to my source code right now... It'll have to wait until Sunday... What you can do in the meantime is to try our backport tree on 4.11 (it won't work on 4.12) and check what happens. If you can repro the bug there, then it'll super easy to bisect. The tree is here: https://git.kernel.org/pub/scm/linux/kernel/git/iwlwifi/backport-iwlwifi.git/
Created attachment 257479 [details] New error I've tried to use 4.11.6 with iwlwifi-backport (master branch), but found another error
Ok - that's embarrassing :) This seems to be happening after another firmware crash, no? iwlwifi 0000:04:00.0: Firmware error during reconfiguration - reprobe! This happens when we have a firmware crash while recovering from another crash. I guess I can fix this crash, but I'd prefer to see the first problem first. Does this happen all the time?
FWIW - I was able to use -27.ucode on the master branch. But I had to load iwlwifi with led_mode=3 Can you try this and bisect? thanks!
Emmanuel, you were right about error - and iwlwifi.led_mode=3 fixed this problem, but I was not able to reproduce original problem - with preassoc_mac_addr=2 everything works properly. I'm not sure how to bisect this... At least in 4.11 it works with and without backport (and in 4.12 it doesn't work, checked this again)
The preassoc_mac_addr thing makes no sense to me. It is supposed to be impacting scanning only and in your case, I can see that you succeed to connect, to the preassoc_mac_addr parameter shouldn't be impacting anything. I'll take it with my wpa_s colleagues. So, I understand that: 4.11 / preassoc_mac_addr=0: works 4.11 / preassoc_mac_addr=2: works 4.12 / preassoc_mac_addr=0: does NOT work 4.12 / preassoc_mac_addr=2: works Am I correct?
for 4.11 - right. For 4.12 - preassoc_mac_addr=2 - problem, preassoc_mac_addr commented (I think that by default it's 0) - works. My guess is that it happens like that: 1) wpa_supplicant is programming random mac, which is used for scanning 2) on connect, wpa_supplicant is trying to set MAC, but firmware and/or driver are still keeping old random MAC 3) But dhcpcd and wlan interface are using real mac - so packets are just not returning. Or maybe firmware is setting some kind of filter... I'll try to do bisect on vanilla sources - 14k commits, but it's probably 14 compiles and reboots...
Ok - this starts to be understandable :) The fact that it works with 4.11 and backport means that our latest driver works.... or that the bug is elsewhere... The best way to debug this is just to leave tracing open during suspend resume. We'd need the data as well so: sudo trace-cmd record -e iwlwifi -e iwlwifi_data -e mac80211 -e cfg80211 *beware* this will include tons of private data. Our privacy notice is here: https://wireless.wiki.kernel.org/en/users/drivers/iwlwifi/debugging#privacy_aspects
It's not quite reproducible on suspend-resume, but on boot I can reproduce it 100%. So I'm not quite sure that there is a way to do tracing. I'll try to do this bisecting.
So, I was wrong about iwlwifi - sorry! Here is a commit which broke my wifi - https://github.com/torvalds/linux/commit/cd8966e75ed3c6b41a37047a904617bc44fa481f rtnetlink: Do not generate notifications for CHANGEADDR event
Thanks! I am moving the bug to the right component. We'll still be CC'ed.
Which code base processes the rtnetlink notifications? I'd like to understand why the redundant notification is needed.
I don;t quite understand the question. I'm using wpa_supplicant-2.6 with kernel 4.12
Can you enable debugging for wpa_supplicant and collect the logs for 4.11.6 and 4.12? The revert is easy to do, but I want to understand why wpa_supplicant is not responding to the first rtnetlink message. Hoping that the debug logs will shed some light. Thanks.
So, I was probably wrong about wpa_supplicant - it looks like on 4.11 dhcpcd is getting two events on mac change, and on 4.12 - only one. But with wpa_supplicant option preassoc_mac_addr=2, MAC is changing two times - first to random (for scan) and back to real one. I'm attaching dhcpcd logs
Created attachment 257605 [details] dhcpcd log from 4.11 two mac changes - correct
Created attachment 257607 [details] dhcpcd log from 4.12 only one mac change - wrong
I see why it breaks. thanks for the logs. patch sent
Can you attach it here for educational purposes? :)
The patch is just a revert of the change. The explanation of why: "The duplicate CHANGEADDR event message is sent regardless of link status whereas the setlink changes only generate a notification when the link is up. Not sending a notification when the link is down breaks dhcpcd which only processes hwaddr changes when the link is down."
Thanks. Want to close the bug?
Waiting for the patch to be picked up.