Bug 208599

Summary: iwlwifi: warning nl80211_get_reg_do
Product: Drivers
Reporter: Serg Podtynnyi (serg)
Component: network-wireless-intel
Assignee: Default virtual assignee for network-wireless-intel (drivers_network-wireless-intel)
Status: RESOLVED CODE_FIX
Severity: normal
CC: amonakov, bkohler, chris2553, cruzki123, golan.ben.ami, linux, m1027, martin.stolpe, mike+lists, serg, t.clastres
Priority: P1
Hardware: Intel
OS: Linux
Kernel Version: 5.7.9
Subsystem:
Regression: No
Bisected commit-id:

Description Serg Podtynnyi 2020-07-17 11:48:56 UTC
```
[   15.981636] ------------[ cut here ]------------
[   15.981643] WARNING: CPU: 7 PID: 492 at net/wireless/nl80211.c:7260 nl80211_get_reg_do+0x1cd/0x1f0
[   15.981644] Modules linked in: algif_aead des_generic libdes md4 mei_hdcp i2c_designware_platform(+) i2c_designware_core rtsx_pci_sdmmc dell_wmi wmi_bmof intel_wmi_thunderbolt dell_laptop(+) snd_hda_codec_hdmi dell_smbios dell_wmi_descriptor dcdbas dell_smm_hwmon snd_sof_pci snd_sof_intel_byt snd_sof_intel_ipc snd_sof_intel_hda_common snd_sof_xtensa_dsp snd_sof snd_sof_nocodec snd_hda_codec_realtek snd_soc_acpi_intel_match snd_soc_acpi snd_hda_codec_generic snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine ledtrig_audio psmouse snd_hda_intel snd_intel_dspcfg serio_raw snd_hda_codec snd_hda_core iwlmvm snd_hwdep snd_pcm rtsx_pci snd_timer snd i2c_i801(+) mei_me soundcore intel_lpss_pci mei intel_lpss iwlwifi idma64 virt_dma hid_sensor_magn_3d hid_sensor_rotation hid_sensor_incl_3d hid_sensor_gyro_3d hid_sensor_als hid_sensor_custom cros_ec_ishtp processor_thermal_device cros_ec intel_soc_dts_iosf ucsi_acpi typec_ucsi typec wmi i2c_hid soc_button_array int3403_thermal int340x_thermal_zone
[   15.981672]  intel_pmc_core intel_hid sparse_keymap int3400_thermal acpi_thermal_rel sdhci_pltfm pkcs8_key_parser atkbd libps2 i8042
[   15.981677] CPU: 7 PID: 492 Comm: iwd Tainted: G     U            5.7.9-13372.native #1
[   15.981678] Hardware name: Dell Inc. XPS 13 7390 2-in-1/06CDVY, BIOS 1.5.0 06/05/2020
[   15.981680] RIP: 0010:nl80211_get_reg_do+0x1cd/0x1f0
[   15.981682] Code: 00 00 00 4c 89 e7 e8 e2 29 7d ff 85 c0 0f 84 10 ff ff ff eb a8 4c 89 e7 48 89 45 c8 e8 dc 83 e1 ff 48 8b 45 c8 e9 53 ff ff ff <0f> 0b 4c 89 e7 e8 c9 83 e1 ff b8 ea ff ff ff e9 3f ff ff ff e9 7a
[   15.981683] RSP: 0018:ffffa99240d47b98 EFLAGS: 00010202
[   15.981684] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[   15.981685] RDX: ffff99efc0990008 RSI: 0000000000000000 RDI: ffff99efc0990300
[   15.981686] RBP: ffffa99240d47bd0 R08: ffff99efc0990300 R09: ffff99f1baf5e014
[   15.981686] R10: 0000000000000000 R11: 000000000000001c R12: ffff99f00f553f00
[   15.981687] R13: ffffa99240d47bf0 R14: ffff99f1baf5e014 R15: 0000000000000000
[   15.981688] FS:  00007f6c42760740(0000) GS:ffff99f1bf7c0000(0000) knlGS:0000000000000000
[   15.981689] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   15.981690] CR2: 00007f6c424905e0 CR3: 00000001bdfc0003 CR4: 0000000000760ee0
[   15.981691] PKRU: 55555554
[   15.981691] Call Trace:
[   15.981697]  genl_family_rcv_msg+0x168/0x250
[   15.981699]  genl_rcv_msg+0x47/0x90
[   15.981701]  ? genl_family_rcv_msg+0x250/0x250
[   15.981702]  netlink_rcv_skb+0x49/0x110
[   15.981703]  genl_rcv+0x24/0x40
[   15.981705]  netlink_unicast+0x1e1/0x2f0
[   15.981706]  netlink_sendmsg+0x21f/0x420
[   15.981709]  sock_sendmsg+0x60/0x70
[   15.981710]  __sys_sendto+0x10e/0x190
[   15.981714]  ? __secure_computing+0x35/0xb0
[   15.981717]  ? syscall_trace_enter+0xaf/0x270
[   15.981719]  __x64_sys_sendto+0x24/0x30
[   15.981721]  do_syscall_64+0x55/0xf0
[   15.981723]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   15.981725] RIP: 0033:0x7f6c4288f0c3
[   15.981726] Code: 1f 84 00 00 00 00 00 f3 0f 1e fa 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 1d 45 31 c9 45 31 c0 b8 2c 00 00 00 c5 fc 77 0f 05 <48> 3d 00 f0 ff ff 77 65 c3 0f 1f 40 00 55 48 83 ec 20 48 89 54 24
[   15.981727] RSP: 002b:00007ffd3373a388 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[   15.981728] RAX: ffffffffffffffda RBX: 0000556a2373d870 RCX: 00007f6c4288f0c3
[   15.981729] RDX: 000000000000001c RSI: 0000556a23754610 RDI: 0000000000000004
[   15.981729] RBP: 00007ffd3373a3a0 R08: 0000000000000000 R09: 0000000000000000
[   15.981730] R10: 0000000000000000 R11: 0000000000000246 R12: 0000556a23747450
[   15.981731] R13: 00007ffd3373a3fc R14: 00007ffd3373a3f8 R15: 0000556a23747500
[   15.981733] ---[ end trace fc11e98d7b0c62fa ]---
```

```
00:14.3 Network controller: Intel Corporation Killer Wi-Fi 6 AX1650i 160MHz Wireless Network Adapter (201NGW) (rev 30)
```

```
[   15.719195] iwlwifi 0000:00:14.3: api flags index 2 larger than supported by driver
[   15.719226] iwlwifi 0000:00:14.3: TLV_FW_FSEQ_VERSION: FSEQ Version: 65.3.35.22
[   15.719230] iwlwifi 0000:00:14.3: Found debug destination: EXTERNAL_DRAM
[   15.719231] iwlwifi 0000:00:14.3: Found debug configuration: 0
[   15.719487] iwlwifi 0000:00:14.3: loaded firmware version 53.c31ac674.0 Qu-c0-hr-b0-53.ucode op_mode iwlmvm
[   15.719510] iwlwifi 0000:00:14.3: Direct firmware load for iwl-debug-yoyo.bin failed with error -2
[   15.720486] intel-lpss 0000:00:15.0: enabling device (0000 -> 0002)
[   15.720830] idma64 idma64.0: Found Intel integrated DMA 64-bit
[   15.725999] iwlwifi 0000:00:14.3: Detected Killer(R) Wi-Fi 6 AX1650i 160MHz Wireless Network Adapter (201NGW), REV=0x338
```

Dell Inc. XPS 13 7390 2-in-1 notebook
Comment 1 m1027 2020-10-21 10:26:14 UTC
Hope it's okay to add another warning here from dmesg. It appears to me to be
either the same or very close.

We've been discussing and bisecting it on Gentoo's bug tracker [1] as well as on
iwd's mailing list [2].

[1] https://bugs.gentoo.org/746539

[2] https://lists.01.org/hyperkitty/list/iwd@lists.01.org/thread/PSQEBUVXJLMR7TB2DDVY2R6JNXYIQLSD/

The summary:

iwd-1.7 is not affected, but 1.8 and 1.9 are hit by the issue.

The issue has been bisected in iwd down to this commit:
https://git.kernel.org/pub/scm/network/wireless/iwd.git/commit/?id=b43e915b989dcbb0fa763fb7f256e30fe7426f14

However, it may not be the real cause; it may only be revealing the
issue. It still happens with Linux 5.9.1.

In the testbed iwd gets started via systemd (at least here). Note:
Starting the service *after* the boot process manually does *not*
crash. So, currently everything works, as the service gets restarted
after the initial crash.

Last but not least, the output of dmesg:

```
[   16.086329] ------------[ cut here ]------------
[   16.086333] WARNING: CPU: 0 PID: 370 at net/wireless/nl80211.c:7284 nl80211_get_reg_do+0x1fc/0x230
[   16.086334] CPU: 0 PID: 370 Comm: iwd Not tainted 5.8.13 #15
[   16.086335] Hardware name: LENOVO 20KFCTO1WW/20KFCTO1WW, BIOS N20ET55W (1.40 ) 06/01/2020
[   16.086336] RIP: 0010:nl80211_get_reg_do+0x1fc/0x230
[   16.086338] Code: 00 00 00 48 89 ef e8 13 ff 85 ff 85 c0 0f 84 01 ff ff ff eb a6 48 89 ef 48 89 04 24 e8 4d ff e0 ff 48 8b 04 24 e9 43 ff ff ff <0f> 0b 48 89 ef e8 3a ff e0 ff b8 ea ff ff ff e9 2f ff ff ff e9 78
[   16.086338] RSP: 0018:ffff9ab9c041bb98 EFLAGS: 00010202
[   16.086339] RAX: 0000000000000000 RBX: 0000000000000001 RCX: 0000000000000000
[   16.086340] RDX: ffff95472b7b0008 RSI: 0000000000000000 RDI: ffff95472b7b02e0
[   16.086340] RBP: ffff9547264c1d00 R08: 0000000000000004 R09: ffff954728585014
[   16.086340] R10: ffff954728581000 R11: 0000000000000001 R12: ffff9ab9c041bbf0
[   16.086341] R13: 0000000000000000 R14: ffff954728585014 R15: ffff95472b7b02e0
[   16.086341] FS:  00007f346c24e740(0000) GS:ffff95472e400000(0000) knlGS:0000000000000000
[   16.086342] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   16.086342] CR2: 000055f2e8fc3010 CR3: 00000004222c2002 CR4: 00000000003606f0
[   16.086343] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   16.086343] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   16.086344] Call Trace:
[   16.086346]  genl_rcv_msg+0x1ae/0x2f0
[   16.086348]  ? genl_family_rcv_msg_attrs_parse.isra.0+0xd0/0xd0
[   16.086349]  netlink_rcv_skb+0x46/0x110
[   16.086350]  genl_rcv+0x1f/0x30
[   16.086351]  netlink_unicast+0x197/0x230
[   16.086352]  netlink_sendmsg+0x1ed/0x400
[   16.086353]  __sys_sendto+0x1d3/0x1f0
[   16.086355]  __x64_sys_sendto+0x21/0x30
[   16.086356]  do_syscall_64+0x4d/0x1d0
[   16.086358]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   16.086360] RIP: 0033:0x7f346c3d862c
[   16.086361] Code: 89 02 48 c7 c0 ff ff ff ff eb b8 0f 1f 00 41 89 ca 64 8b 04 25 18 00 00 00 85 c0 75 21 45 31 c9 45 31 c0 b8 2c 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 6c e9 91 0c 04 00 0f 1f 80 00 00 00 00 55 48
[   16.086361] RSP: 002b:00007fff38eb6978 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
[   16.086362] RAX: ffffffffffffffda RBX: 000055f2e8faab00 RCX: 00007f346c3d862c
[   16.086362] RDX: 000000000000001c RSI: 000055f2e8fc2ac0 RDI: 0000000000000004
[   16.086363] RBP: 000055f2e8fc1910 R08: 0000000000000000 R09: 0000000000000000
[   16.086363] R10: 0000000000000000 R11: 0000000000000246 R12: 00007fff38eb6a08
[   16.086364] R13: 000055f2e8fb4910 R14: 000055f2e8fb4790 R15: 0000000000000000
[   16.086364] ---[ end trace a7bf7b4f3a41fe0a ]---
```

Thanks!
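
For anyone wanting to check whether two splats come from the same warning site, here is a minimal Python sketch. It parses only the `WARNING:` line; the regular expression and the helper name `parse_warning` are illustrative assumptions, not part of any kernel tooling. The two input lines are copied from the traces in this report.

```python
import re

# Matches the "WARNING:" line of a kernel splat, for example:
# "WARNING: CPU: 7 PID: 492 at net/wireless/nl80211.c:7260 nl80211_get_reg_do+0x1cd/0x1f0"
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) (?P<func>[\w.]+)\+"
)


def parse_warning(dmesg_line):
    """Return (source file, line number, function), or None if the line does not match."""
    m = WARNING_RE.search(dmesg_line)
    if m is None:
        return None
    return m.group("file"), int(m.group("line")), m.group("func")


first = parse_warning(
    "[   15.981643] WARNING: CPU: 7 PID: 492 at net/wireless/nl80211.c:7260 "
    "nl80211_get_reg_do+0x1cd/0x1f0"
)
second = parse_warning(
    "[   16.086333] WARNING: CPU: 0 PID: 370 at net/wireless/nl80211.c:7284 "
    "nl80211_get_reg_do+0x1fc/0x230"
)

# Same file and function; the line number differs between kernel versions.
same_site = first[0] == second[0] and first[2] == second[2]
print(first, second, same_site)
```

Both traces resolve to `nl80211_get_reg_do` in `net/wireless/nl80211.c`. The line numbers differ (7260 on 5.7.9, 7284 on 5.8.13), which is expected when the same check moves between kernel releases.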
Comment 2 Chris Clayton 2021-01-20 16:10:27 UTC
I've been trying to track down the cause of this warning for a few days now. I think it's a combination of changes to both the kernel and iwd. The warning seems to arise because when iwd is launched, it tries to get regulatory domain information from the kernel. If the adapter (phy0) has not been set up at this point, the warning is produced. In my case, at this point in the boot wlan0 exists, but the associated phy0 does not. For now, I "fix" that situation by inserting "ip link set wlan0 up" into the init script immediately before iwd is launched. Bringing up the WLAN appears to set up the missing phy0, and the warning is no longer produced. This explains why the warning is seen during boot, but does not appear if the network is manually restarted after login.

You may have guessed from my use of the term init script, that I don't use systemd on my system (which is based on Linux From Scratch). My init system is sysvinit-2.98.
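
As an illustration of the ordering problem described above, here is a rough Python sketch of a pre-start check that an init script could run before launching iwd. It assumes that a registered adapter shows up as an entry such as `phy0` under `/sys/class/ieee80211`; whether that corresponds exactly to the state that triggers the warning is an assumption. The helper names `wiphy_names` and `wait_for_wiphy` are hypothetical, and this is neither the `ip link set wlan0 up` workaround nor the fix that was eventually merged.

```python
import os
import time


def wiphy_names(sysfs_dir="/sys/class/ieee80211"):
    """List registered wiphys (e.g. ['phy0']); empty if the directory is missing."""
    try:
        return sorted(os.listdir(sysfs_dir))
    except FileNotFoundError:
        return []


def wait_for_wiphy(sysfs_dir="/sys/class/ieee80211", timeout=5.0, interval=0.1):
    """Poll until at least one wiphy exists or the timeout expires.

    Returns the list of wiphy names, or an empty list on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        names = wiphy_names(sysfs_dir)
        if names:
            return names
        if time.monotonic() >= deadline:
            return []
        time.sleep(interval)
```

An init script could call `wait_for_wiphy()` and start iwd only once it returns a non-empty list. The timeout and polling interval are arbitrary values chosen for the sketch.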
Comment 3 Alexander Monakov 2021-04-27 21:37:00 UTC
The following (currently unreviewed?) patch explains that the warning is over-zealous and removes it: https://lore.kernel.org/linux-wireless/iwlwifi.20210409123755.ba2ea961f4ae.I8fde32d3196e860efa3b4ec464c42194195b42ec@changeid/
Comment 4 Chris Clayton 2021-04-30 08:16:58 UTC
I've applied the patch at comment 3 to linux-5.12. Obviously, the warning splat has gone but more importantly, iwd still connects my laptop to my wireless network and data transfer over the network works fine. Thanks, Alexander.
Comment 5 Chris Clayton 2021-04-30 08:18:35 UTC
I've applied the patch at comment 3 to linux-5.12. Obviously, the warning splat has gone but more importantly, iwd still connects my laptop to my wireless network and data transfer over the network works fine. Thanks, Alexander.
Comment 6 Chris Clayton 2021-06-10 06:30:32 UTC
Is there any progress on getting this fix upstream, please? I can't find the change in any tree on kernel.org.
Comment 7 Chris Clayton 2021-07-03 08:05:59 UTC
Ping.
Comment 8 Chris Clayton 2021-08-24 19:38:37 UTC
Any progress on moving this upstream, please?
Comment 9 Chris Clayton 2021-09-07 09:21:55 UTC
<frustrated_rant>Well, I give up! I provided an analysis of what I think the problem is nearly 8 months ago and nothing has happened to materially change things! Nobody at Intel seems to care that a driver they maintain is out of step with a daemon which, it seems, they substantially maintain. Utter crap!!!!!</frustrated_rant>

I have a better workaround than the one I described in comment 2 above. I have created the file /etc/modprobe.d/iwlwifi.conf with the following contents:

```
softdep iwlwifi pre: cfg80211
options cfg80211 ieee80211_regdom=GB
```

Obviously, change iwlwifi on the first line to the name of the wireless driver on your system, and change GB on the second line to the two-letter code for the country you live in.

You can name the file whatever you want, as long as it has the .conf suffix.
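
To adapt the workaround to another driver or country, a small Python sketch like the following could generate the file contents. The function name `render_modprobe_conf` and the validation rules are assumptions made for illustration; only the two output lines come from the workaround itself.

```python
import re


def render_modprobe_conf(driver, country):
    """Build the two-line modprobe.d snippet from the workaround.

    driver  -- kernel module name of the wireless driver, e.g. "iwlwifi"
    country -- two-letter uppercase country code, e.g. "GB"
    """
    # Assumed sanity checks; modprobe itself does not require them.
    if not re.fullmatch(r"[A-Za-z0-9_-]+", driver):
        raise ValueError(f"unexpected driver name: {driver!r}")
    if not re.fullmatch(r"[A-Z]{2}", country):
        raise ValueError(
            f"expected a two-letter uppercase country code, got {country!r}"
        )
    return (
        f"softdep {driver} pre: cfg80211\n"
        f"options cfg80211 ieee80211_regdom={country}\n"
    )


print(render_modprobe_conf("iwlwifi", "GB"), end="")
```

The output can be written to any file ending in .conf under /etc/modprobe.d/. The checks only guard against obvious typos and do not verify that the module or the country code actually exists.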
Comment 10 Golan Ben Ami 2021-11-18 10:16:07 UTC
Sorry for not replying here, well, ever.
Eventually, we uploaded a different version of the fix for the issue and reverted the patch mentioned above.
This is the fix that was merged:
eb09ae9 iwlwifi: mvm: load regdomain at INIT stage
Comment 11 Chris Clayton 2021-11-18 19:40:27 UTC
Thanks Golan.



Are there any plans to backport the patch to stable? It looks to me as if 5.4 and 5.10 both need the fix.

Chris
Comment 12 Chris Clayton 2021-11-21 19:43:27 UTC
Sorry, my mistake - the fix has been backported to 5.10-stable.