Bug 218095 - iwlmvm: Linux 6.7 pre-rc1 prints stack trace of iwl_op_mode_mvm_start; caused by commit b6e3d1ba4fcf
Summary: iwlmvm: Linux 6.7 pre-rc1 prints stack trace of iwl_op_mode_mvm_start; caused...
Status: CLOSED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: network-wireless-intel (show other bugs)
Hardware: All Linux
: P3 normal
Assignee: Default virtual assignee for network-wireless-intel
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2023-11-02 21:30 UTC by Niklāvs Koļesņikovs
Modified: 2023-11-18 19:23 UTC (History)
2 users (show)

See Also:
Kernel Version:
Subsystem:
Regression: Yes
Bisected commit-id: b6e3d1ba4fcf02176846d03a930203d8133c0aaf


Attachments
The full kernel stack trace from dmesg (4.97 KB, text/plain)
2023-11-02 21:30 UTC, Niklāvs Koļesņikovs
Details

Description Niklāvs Koļesņikovs 2023-11-02 21:30:22 UTC
Created attachment 305350 [details]
The full kernel stack trace from dmesg

When testing the branch that will become Linux 6.7-rc1, ever since commit b6e3d1ba4fcf02176846d03a930203d8133c0aaf I have observed the following abbreviated kernel stack trace (please see attachments for the full trace):

```
[    6.733281]  ? __warn+0xc3/0x1d0
[    6.733284]  ? iwl_op_mode_mvm_start+0x9c4/0x9d0 [iwlmvm]
[    6.733299]  ? report_bug+0x167/0x220
[    6.733301]  ? handle_bug+0x3d/0x90
[    6.733304]  ? exc_invalid_op+0x1a/0x60
[    6.733306]  ? asm_exc_invalid_op+0x1a/0x20
[    6.733309]  ? iwl_op_mode_mvm_start+0x9c4/0x9d0 [iwlmvm]
[    6.733325]  _iwl_op_mode_start+0x66/0xe0 [iwlwifi]
[    6.733338]  iwl_opmode_register+0x87/0xd0 [iwlwifi]
[    6.733350]  ? __cfi_init_module+0x10/0x10 [iwlmvm]
[    6.733364]  init_module+0x22/0xff0 [iwlmvm]
[    6.733378]  ? __cfi_init_module+0x10/0x10 [iwlmvm]
[    6.733392]  do_one_initcall+0x129/0x380
[    6.733395]  ? idr_alloc_cyclic+0x148/0x1e0
[    6.733397]  ? security_kernfs_init_security+0x41/0x80
[    6.733399]  ? __kernfs_new_node+0x1be/0x250
[    6.733401]  ? preempt_count_add+0x55/0xb0
[    6.733404]  ? up_write+0x4a/0xe0
[    6.733406]  ? preempt_count_add+0x55/0xb0
[    6.733408]  ? sysvec_call_function+0xa4/0xb0
[    6.733410]  ? asm_sysvec_call_function+0x1a/0x20
[    6.733411]  ? free_unref_page_prepare+0xf3/0x410
[    6.733413]  ? preempt_count_add+0x62/0xb0
[    6.733415]  ? _raw_spin_trylock+0x19/0x60
[    6.733417]  ? _raw_spin_unlock+0x11/0x30
[    6.733419]  ? __kmem_cache_free+0x29a/0x3c0
[    6.733421]  ? vfree+0xd2/0x150
[    6.733422]  ? slab_post_alloc_hook+0x76/0x3d0
[    6.733425]  ? do_init_module+0x3f/0x230
[    6.733427]  ? __kmem_cache_alloc_node+0x1fe/0x2f0
[    6.733430]  do_init_module+0x7a/0x230
[    6.733432]  __se_sys_init_module+0x1a0/0x220
[    6.733435]  do_syscall_64+0x7a/0x100
[    6.733438]  ? syscall_exit_to_user_mode+0x2d/0x1d0
[    6.733440]  ? do_syscall_64+0x89/0x100
[    6.733442]  ? do_user_addr_fault+0x4e7/0x680
[    6.733444]  ? exc_page_fault+0x61/0x150
[    6.733446]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
```

I did try reverting the offending commit but the result did not compile and I did not want to dig deeper to identify any subsequent commits what would need reverting to fix compilation.

If it's relevant, I'm building the kernel with Clang 17.0.4 toolchain by setting the LLVM=1 variable. CFI is enabled but I'm not sure if it's actually functional.
Comment 1 Niklāvs Koļesņikovs 2023-11-02 21:34:00 UTC
Here's the result from git bisect that I forgot to include in the initial report:

```
b6e3d1ba4fcf02176846d03a930203d8133c0aaf is the first bad commit
commit b6e3d1ba4fcf02176846d03a930203d8133c0aaf
Author: Anjaneyulu <pagadala.yesu.anjaneyulu@intel.com>
Date:   Sun Oct 22 17:55:47 2023 +0300

    wifi: iwlwifi: mvm: implement new firmware API for statistics
    
    The new firmware API uses a new command and notification,
    the command configures in which statistics types driver is
    interested and the notification is sent periodically.
    An additional change in the API is that most of the statistics
    data is accumulated and reported by the firmware per MLO link.
    Implement new command and notification handlers and adjust to
    per-link statistics.
    
    Signed-off-by: Anjaneyulu <pagadala.yesu.anjaneyulu@intel.com>
    Signed-off-by: Gregory Greenman <gregory.greenman@intel.com>
    Link: https://lore.kernel.org/r/20231022173519.8cc7df0ebff2.If1dcb57145841c5b3c68ed112bbfcd0201f7acc3@changeid
    Signed-off-by: Johannes Berg <johannes.berg@intel.com>

 .../net/wireless/intel/iwlwifi/fw/api/commands.h   |  30 ++++
 drivers/net/wireless/intel/iwlwifi/fw/api/stats.h  | 153 ++++++++++++++++++--
 drivers/net/wireless/intel/iwlwifi/mvm/mac80211.c  |  46 ++++--
 .../net/wireless/intel/iwlwifi/mvm/mld-mac80211.c  |   6 +-
 drivers/net/wireless/intel/iwlwifi/mvm/mvm.h       |  12 ++
 drivers/net/wireless/intel/iwlwifi/mvm/ops.c       |  24 ++++
 drivers/net/wireless/intel/iwlwifi/mvm/rx.c        | 157 ++++++++++++++++++++-
 drivers/net/wireless/intel/iwlwifi/mvm/utils.c     |  61 ++++++++
 8 files changed, 451 insertions(+), 38 deletions(-)
```
Comment 2 Fabio Comolli 2023-11-03 11:07:58 UTC
Similar trace on my laptop:

Nov  1 18:58:31 fedora kernel: intel_rapl_msr: PL4 support detected.
Nov  1 18:58:31 fedora kernel: intel_rapl_common: Found RAPL domain package
Nov  1 18:58:31 fedora kernel: intel_rapl_common: Found RAPL domain core
Nov  1 18:58:31 fedora kernel: intel_rapl_common: Found RAPL domain uncore
Nov  1 18:58:31 fedora kernel: intel_rapl_common: Found RAPL domain psys
Nov  1 18:58:31 fedora kernel: ------------[ cut here ]------------
Nov  1 18:58:31 fedora kernel: WARNING: CPU: 0 PID: 772 at drivers/net/wireless/intel/iwlwifi/mvm/../iwl-trans.h:1158 iwl_op_mode_mvm_start+0x9f2/0xa10 [iwlmvm]
Nov  1 18:58:31 fedora kernel: Modules linked in: snd_sof_pci_intel_tgl iwlmvm(+) snd_sof_intel_hda_common soundwire_intel snd_sof_intel_hda_mlink soundwire_cadence snd_sof_intel_hda snd_sof_pci snd_sof_xtensa_dsp snd_sof intel_rapl_msr intel_rapl_common mac80211 snd_sof_utils snd_soc_hdac_hda snd_hda_ext_core snd_soc_acpi_intel_match snd_soc_acpi soundwire_generic_allocation soundwire_bus x86_pkg_temp_thermal intel_powerclamp snd_hda_codec_hdmi coretemp snd_soc_core kvm_intel snd_hda_codec_realtek snd_compress snd_hda_codec_generic ac97_bus ledtrig_audio libarc4 snd_pcm_dmaengine snd_hda_intel kvm snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec iwlwifi snd_hda_core btusb btrtl btintel snd_hwdep btbcm iTCO_wdt intel_pmc_bxt btmtk irqbypass mei_hdcp mei_pxp uvcvideo ee1004 cfg80211 iTCO_vendor_support snd_seq bluetooth snd_seq_device uvc videobuf2_vmalloc videobuf2_memops rapl snd_pcm videobuf2_v4l2 videobuf2_common intel_cstate videodev snd_timer intel_uncore pcspkr wmi_bmof snd i2c_i801 mei_me soundcore mc i2c_smbus idma64 mei
Nov  1 18:58:31 fedora kernel: igen6_edac rfkill acpi_pad acpi_tad loop zram i915 i2c_algo_bit drm_buddy ttm crct10dif_pclmul crc32_pclmul crc32c_intel drm_display_helper polyval_clmulni nvme polyval_generic spi_pxa2xx_platform nvme_core ghash_clmulni_intel sha512_ssse3 dw_dmac nvme_common cec i2c_hid_acpi i2c_hid video wmi pinctrl_tigerlake serio_raw ip6_tables ip_tables fuse
Nov  1 18:58:31 fedora kernel: CPU: 0 PID: 772 Comm: modprobe Not tainted 6.7.0-0.rc0.20231101gt8bc9e651.302.vanilla.fc39.x86_64 #1
Nov  1 18:58:31 fedora kernel: Hardware name: Intel TU45B/TU45B, BIOS 100E 07/19/2022
Nov  1 18:58:31 fedora kernel: RIP: 0010:iwl_op_mode_mvm_start+0x9f2/0xa10 [iwlmvm]
Nov  1 18:58:31 fedora kernel: Code: 49 c7 c0 48 2a 52 c1 31 d2 be 40 00 00 00 48 c7 c1 b0 8b 51 c1 48 8b 78 40 e8 3a aa 86 ff e9 0d fd ff ff 0f 0b e9 39 fe ff ff <0f> 0b e9 ff fb ff ff 0f 0b e9 2b fe ff ff e8 6b 0e b4 e3 66 66 2e
Nov  1 18:58:31 fedora kernel: RSP: 0018:ffffc900007cfca0 EFLAGS: 00010286
Nov  1 18:58:31 fedora kernel: RAX: 00000000ffffffff RBX: ffff8881211d1fc0 RCX: ffffffffc1519210
Nov  1 18:58:31 fedora kernel: RDX: 0000000000000070 RSI: 00000000000000ff RDI: ffffffffc15191a0
Nov  1 18:58:31 fedora kernel: RBP: ffff88811ee28028 R08: 0000000000000008 R09: ffffffffc1518d20
Nov  1 18:58:31 fedora kernel: R10: ffffffffc1518e10 R11: 0000000000000100 R12: ffff8881211d1fc8
Nov  1 18:58:31 fedora kernel: R13: 0000000fffffffe0 R14: ffff8881043cb810 R15: ffffc900007cfca8
Nov  1 18:58:31 fedora kernel: FS:  00007f27b54e7740(0000) GS:ffff88849fa00000(0000) knlGS:0000000000000000
Nov  1 18:58:31 fedora kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov  1 18:58:31 fedora kernel: CR2: 000055d0f1edb000 CR3: 000000010401a003 CR4: 0000000000f70ef0
Nov  1 18:58:31 fedora kernel: PKRU: 55555554
Nov  1 18:58:31 fedora kernel: Call Trace:
Nov  1 18:58:31 fedora kernel: <TASK>
Nov  1 18:58:31 fedora kernel: ? iwl_op_mode_mvm_start+0x9f2/0xa10 [iwlmvm]
Nov  1 18:58:31 fedora kernel: ? __warn+0x81/0x130
Nov  1 18:58:31 fedora kernel: ? iwl_op_mode_mvm_start+0x9f2/0xa10 [iwlmvm]
Nov  1 18:58:31 fedora kernel: ? report_bug+0x171/0x1a0
Nov  1 18:58:31 fedora kernel: ? handle_bug+0x3c/0x80
Nov  1 18:58:31 fedora kernel: ? exc_invalid_op+0x17/0x70
Nov  1 18:58:31 fedora kernel: ? asm_exc_invalid_op+0x1a/0x20
Nov  1 18:58:31 fedora kernel: ? iwl_op_mode_mvm_start+0x9f2/0xa10 [iwlmvm]
Nov  1 18:58:31 fedora kernel: _iwl_op_mode_start+0xa8/0xd0 [iwlwifi]
Nov  1 18:58:31 fedora kernel: iwl_opmode_register+0x6c/0xe0 [iwlwifi]
Nov  1 18:58:31 fedora kernel: ? __pfx_iwl_mvm_init+0x10/0x10 [iwlmvm]
Nov  1 18:58:31 fedora kernel: iwl_mvm_init+0x26/0xff0 [iwlmvm]
Nov  1 18:58:31 fedora kernel: ? __pfx_iwl_mvm_init+0x10/0x10 [iwlmvm]
Nov  1 18:58:31 fedora kernel: do_one_initcall+0x58/0x320
Nov  1 18:58:31 fedora kernel: do_init_module+0x60/0x240
Nov  1 18:58:31 fedora kernel: __do_sys_init_module+0x17f/0x1b0
Nov  1 18:58:31 fedora kernel: do_syscall_64+0x61/0xe0
Nov  1 18:58:31 fedora kernel: ? exc_page_fault+0x7f/0x180
Nov  1 18:58:31 fedora kernel: entry_SYSCALL_64_after_hwframe+0x6e/0x76
Nov  1 18:58:31 fedora kernel: RIP: 0033:0x7f27b4f347fe
Nov  1 18:58:31 fedora kernel: Code: 48 8b 0d 35 16 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 af 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 02 16 0c 00 f7 d8 64 89 01 48
Nov  1 18:58:31 fedora kernel: RSP: 002b:00007ffce70c3dc8 EFLAGS: 00000246 ORIG_RAX: 00000000000000af
Nov  1 18:58:31 fedora kernel: RAX: ffffffffffffffda RBX: 000055d0f11fcc40 RCX: 00007f27b4f347fe
Nov  1 18:58:31 fedora kernel: RDX: 000055d0f0d24fc9 RSI: 00000000001a2766 RDI: 000055d0f1d3a7c0
Nov  1 18:58:31 fedora kernel: RBP: 00007ffce70c3e80 R08: 000055d0f11fc010 R09: 0000000000000007
Nov  1 18:58:31 fedora kernel: R10: 0000000000000001 R11: 0000000000000246 R12: 000055d0f0d24fc9
Nov  1 18:58:31 fedora kernel: R13: 0000000000040000 R14: 000055d0f11fcd50 R15: 0000000000000000
Nov  1 18:58:31 fedora kernel: </TASK>
Nov  1 18:58:31 fedora kernel: ---[ end trace 0000000000000000 ]---
Nov  1 18:58:31 fedora kernel: iwlwifi 0000:01:00.0: Detected Intel(R) Dual Band Wireless AC 3165, REV=0x210


Kernel version:

Nov  1 18:58:27 fedora kernel: Linux version 6.7.0-0.rc0.20231101gt8bc9e651.302.vanilla.fc39.x86_64 (mockbuild@a15d6eccee0a4a14a16384dfba114599) (gcc (GCC) 13.2.1 20230918 (Red Hat 13.2.1-3)
, GNU ld version 2.40-13.fc39) #1 SMP PREEMPT_DYNAMIC Wed Nov  1 09:52:19 UTC 2023
Comment 4 Niklāvs Koļesņikovs 2023-11-12 20:58:37 UTC
This should have been fixed by commit e257da57153 . Therefore I'm closing it now, since this bug probably never went beyond the pre-rc stage.

Note You need to log in before you can comment on or make changes to this bug.