just found in the dmesg log:

[73727.062496] ==================================================================
[73727.062531] BUG: KASAN: global-out-of-bounds in _rs_collect_tx_data.isra.5+0x2a4/0x2c0 [iwlmvm]
[73727.062539] Read of size 2 at addr ffffffffc0f5a95e by task irq/43-iwlwifi/415
[73727.062551] CPU: 0 PID: 415 Comm: irq/43-iwlwifi Not tainted 4.18.0-rc6-00002-g2b32ed5865f5 #4
[73727.062556] Hardware name: HP HP ProBook 645 G2/80FE, BIOS N77 Ver. 01.15 03/26/2018
[73727.062560] Call Trace:
[73727.062571]  dump_stack+0x5b/0x90
[73727.062582]  print_address_description+0x60/0x229
[73727.062597]  ? _rs_collect_tx_data.isra.5+0x2a4/0x2c0 [iwlmvm]
[73727.062626]  kasan_report.cold.5+0x241/0x2ff
[73727.062660]  _rs_collect_tx_data.isra.5+0x2a4/0x2c0 [iwlmvm]
[73727.062676]  iwl_mvm_rs_tx_status+0x1534/0x4e30 [iwlmvm]
[73727.062690]  ? iwl_pcie_rx_handle+0x66a/0x1fd0 [iwlwifi]
[73727.062700]  ? iwl_pcie_irq_handler+0x2df/0x1160 [iwlwifi]
[73727.062736]  ? ieee80211_report_used_skb+0x10d/0x1150 [mac80211]
[73727.062744]  ? _raw_spin_lock_irqsave+0x1f/0x40
[73727.062759]  ? iwl_mvm_rs_rate_init+0x2f80/0x2f80 [iwlmvm]
[73727.062775]  ? iwl_mvm_check_ratid_empty+0x26d/0x3c0 [iwlmvm]
[73727.062789]  iwl_mvm_tx_reclaim+0x822/0xac0 [iwlmvm]
[73727.062809]  ? ieee80211_tx_status+0x1dd/0x390 [mac80211]
[73727.062829]  ? __ieee80211_tx_status+0x2360/0x2360 [mac80211]
[73727.062843]  ? iwl_mvm_hwrate_to_tx_rate+0x560/0x560 [iwlmvm]
[73727.062857]  iwl_mvm_rx_ba_notif+0xb14/0xd90 [iwlmvm]
[73727.062866]  ? iommu_unmap_page+0xfd/0x1f0
[73727.062873]  ? bsearch+0x52/0x80
[73727.062887]  ? iwl_mvm_rx_tx_cmd+0x1b90/0x1b90 [iwlmvm]
[73727.062893]  ? queue_iova+0x2d9/0x490
[73727.062904]  iwl_pcie_rx_handle+0x66a/0x1fd0 [iwlwifi]
[73727.062917]  ? iwl_pcie_rxq_alloc_rbs+0x830/0x830 [iwlwifi]
[73727.062927]  iwl_pcie_irq_handler+0x2df/0x1160 [iwlwifi]
[73727.062936]  ? iwl_pcie_handle_rfkill_irq+0x370/0x370 [iwlwifi]
[73727.062943]  ? irq_forced_thread_fn+0x140/0x140
[73727.062947]  irq_thread_fn+0x7d/0x120
[73727.062952]  irq_thread+0x280/0x340
[73727.062957]  ? irq_thread_dtor+0x1c0/0x1c0
[73727.062964]  ? ___preempt_schedule+0x16/0x18
[73727.062969]  ? preempt_schedule_common+0x1a/0xc0
[73727.062974]  ? ___preempt_schedule+0x16/0x18
[73727.062979]  ? wake_threads_waitq+0x40/0x40
[73727.062984]  ? _raw_spin_unlock_irqrestore+0x50/0x70
[73727.062989]  ? irq_thread_dtor+0x1c0/0x1c0
[73727.062995]  kthread+0x2cf/0x380
[73727.063001]  ? kthread_create_worker+0xd0/0xd0
[73727.063006]  ret_from_fork+0x22/0x40
[73727.063015] The buggy address belongs to the variable:
[73727.063026]  iwl_mvm_exit+0xc0bd/0x75f [iwlmvm]
[73727.063031] Memory state around the buggy address:
[73727.063038]  ffffffffc0f5a800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa
[73727.063042]  ffffffffc0f5a880: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00
[73727.063046] >ffffffffc0f5a900: 00 00 00 fa fa fa fa fa 00 00 00 06 fa fa fa fa
[73727.063050]                                                     ^
[73727.063055]  ffffffffc0f5a980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[73727.063059]  ffffffffc0f5aa00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[73727.063062] ==================================================================
[73727.063065] Disabling lock debugging due to kernel taint
some more debug info:

(gdb) list *(_rs_collect_tx_data+0x2a4)
0xffffffffc0f270f4 is in _rs_collect_tx_data (drivers/net/wireless/intel/iwlwifi/mvm/rs.c:748).
743	    (window->success_counter >= IWL_MVM_RS_RATE_MIN_SUCCESS_TH))
744		window->average_tpt = (window->success_ratio * tpt + 64) / 128;
745	else
746		window->average_tpt = IWL_INVALID_VALUE;
747
748	return 0;
749	}
750
751	static int rs_collect_tpc_data(struct iwl_mvm *mvm,
752				       struct iwl_lq_sta *lq_sta,
How easy is it to reproduce this?

Do you still have the binary of this warning? If yes, can you send it to us or check the variable that KASAN is mentioning (iwl_mvm_exit+0xc0bd/0x75f [iwlmvm])?

It seems that we are somehow trying to access an invalid index in our rate tables. The index comes from the FW/HW in the TX response and we translate it to an internal index. I did find one problem: when converting, we may return -EINVAL, in which case the index is left undefined, but we don't check the return value at the call sites.

If this is easy to reproduce, any chance you could provide trace-cmd logs as explained in our wiki[1]?

[1] https://wireless.wiki.kernel.org/en/users/drivers/iwlwifi/debugging#tracing
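To illustrate the suspected failure mode described above, here is a minimal userspace sketch of a rate lookup that checks the conversion result before indexing a table. It is not the driver code and not the eventual patch: the names `hw_rate_to_index`, `lookup_expected_tpt`, `RATE_COUNT`, the rate codes and the table values are all hypothetical stand-ins.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table size and contents, for illustration only. */
enum { RATE_COUNT = 4 };

static const uint16_t expected_tpt[RATE_COUNT] = { 7, 13, 35, 58 };

/*
 * Translate a (made-up) hardware rate code into an internal table index.
 * Returns 0 on success. On failure returns -EINVAL and leaves *idx
 * untouched, so a caller that ignores the return value would go on to
 * use an uninitialized index.
 */
static int hw_rate_to_index(uint32_t hw_rate, int *idx)
{
	static const uint8_t known_codes[RATE_COUNT] = { 0x10, 0x11, 0x12, 0x13 };
	size_t i;

	for (i = 0; i < RATE_COUNT; i++) {
		if (known_codes[i] == (hw_rate & 0xff)) {
			*idx = (int)i;
			return 0;
		}
	}
	return -EINVAL;
}

/*
 * Call site that checks the conversion result before touching the table.
 * Returns the table entry, or a negative errno for an unknown rate.
 */
int lookup_expected_tpt(uint32_t hw_rate)
{
	int idx;
	int ret = hw_rate_to_index(hw_rate, &idx);

	if (ret)
		return ret;	/* idx is undefined here, do not use it */

	/* Defensive bounds check in case the helper and table ever disagree. */
	if (idx < 0 || idx >= RATE_COUNT)
		return -EINVAL;

	return expected_tpt[idx];
}
```

With this shape, an unknown code such as `lookup_expected_tpt(0xff)` yields `-EINVAL` instead of reading past the end of `expected_tpt`, which is the kind of read KASAN flags as global-out-of-bounds.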
(In reply to Luca Coelho from comment #2)
> How easy is it to reproduce this?

I've seen this only once.

> Do you still have the binary of this warning?

Unfortunately not. I'm trying to catch another occurrence with a current kernel, but as I don't know how to reproduce it, I can't say how long this may take.
(In reply to Luca Coelho from comment #2)
> How easy is it to reproduce this?
>
> Do you still have the binary of this warning? If yes, can you send it to us
> or check the variable that KASAN is mentioning (iwl_mvm_exit+0xc0bd/0x75f
> [iwlmvm])?

OK, now I've seen it again and I have the binary. Do you need only the module or the vmlinux too?
Cool! Please send us all the binaries that may be involved: vmlinux, cfg80211.ko, mac80211.ko, iwlwifi.ko and iwlmvm.ko.
(In reply to Luca Coelho from comment #5)
> Cool!
>
> Please send us all the binaries that may be involved: vmlinux, cfg80211.ko,
> mac80211.ko, iwlwifi.ko and iwlmvm.ko.

I've sent them to linuxwifi@intel.com
Hm, the mails were rejected by the mail system because of their size. Is there a way to upload the files, or do I have to provide some kind of download for you?
Can you just attach them here? You can leave vmlinux out for now, we will most likely not need it anyway and, if we do, I'll ask again. ;)
Created attachment 278757
iwlmvm.ko
Created attachment 278759
iwlwifi.ko
Created attachment 278761
mac80211.ko.xz
Created attachment 278763
cfg80211.ko.xz
corresponding kasan output for provided modules:

[59114.833692] ==================================================================
[59114.833753] BUG: KASAN: global-out-of-bounds in _rs_collect_tx_data.isra.5+0x2a4/0x2c0 [iwlmvm]
[59114.833762] Read of size 2 at addr ffffffffc098a918 by task irq/43-iwlwifi/402
[59114.833776] CPU: 0 PID: 402 Comm: irq/43-iwlwifi Not tainted 4.18.8-00002-g1b41fee3fbd4 #23
[59114.833782] Hardware name: HP HP ProBook 645 G2/80FE, BIOS N77 Ver. 01.15 03/26/2018
[59114.833787] Call Trace:
[59114.833803]  dump_stack+0x5b/0x90
[59114.833817]  print_address_description+0x60/0x229
[59114.833837]  ? _rs_collect_tx_data.isra.5+0x2a4/0x2c0 [iwlmvm]
[59114.833843]  kasan_report.cold.6+0x241/0x2ff
[59114.833859]  _rs_collect_tx_data.isra.5+0x2a4/0x2c0 [iwlmvm]
[59114.833877]  iwl_mvm_rs_tx_status+0x1155/0x4e30 [iwlmvm]
[59114.833897]  ? iwl_mvm_rs_rate_init+0x2f80/0x2f80 [iwlmvm]
[59114.833904]  ? save_stack+0x8c/0xb0
[59114.833910]  ? __kasan_slab_free+0x125/0x170
[59114.833916]  ? kmem_cache_free+0x73/0x200
[59114.833924]  ? irq_thread_fn+0x7d/0x120
[59114.833929]  ? irq_thread+0x280/0x340
[59114.833936]  ? kthread+0x2cf/0x380
[59114.833943]  ? ret_from_fork+0x22/0x40
[59114.833951]  ? lock_timer_base+0xbc/0x150
[59114.833966]  ? iwl_mvm_rs_tx_status+0x4e30/0x4e30 [iwlmvm]
[59114.834038]  rate_control_tx_status+0x1ff/0x2b0 [mac80211]
[59114.834049]  ? del_timer+0xa4/0xe0
[59114.834072]  __ieee80211_tx_status+0xa43/0x2360 [mac80211]
[59114.834099]  ieee80211_tx_status+0x1d8/0x390 [mac80211]
[59114.834122]  ? __ieee80211_tx_status+0x2360/0x2360 [mac80211]
[59114.834129]  ? __kasan_slab_free+0x13a/0x170
[59114.834146]  iwl_mvm_rx_tx_cmd+0xc6f/0x1b90 [iwlmvm]
[59114.834157]  ? skb_partial_csum_set+0x201/0x2d0
[59114.834175]  ? iwl_mvm_tx_reclaim+0xac0/0xac0 [iwlmvm]
[59114.834182]  ? memcpy+0x34/0x50
[59114.834198]  iwl_pcie_rx_handle+0x66a/0x1fd0 [iwlwifi]
[59114.834213]  ? iwl_pcie_rxq_alloc_rbs+0x830/0x830 [iwlwifi]
[59114.834224]  iwl_pcie_irq_handler+0x2df/0x1160 [iwlwifi]
[59114.834235]  ? iwl_pcie_handle_rfkill_irq+0x370/0x370 [iwlwifi]
[59114.834241]  ? irq_forced_thread_fn+0x140/0x140
[59114.834246]  irq_thread_fn+0x7d/0x120
[59114.834252]  irq_thread+0x280/0x340
[59114.834258]  ? irq_thread_dtor+0x1c0/0x1c0
[59114.834263]  ? __switch_to_asm+0x34/0x70
[59114.834268]  ? __switch_to_asm+0x40/0x70
[59114.834273]  ? __switch_to_asm+0x34/0x70
[59114.834281]  ? __sched_text_start+0x8/0x8
[59114.834309]  ? __wake_up_common+0x108/0x4f0
[59114.834334]  ? wake_threads_waitq+0x40/0x40
[59114.834340]  ? _raw_spin_unlock_irqrestore+0x3a/0x70
[59114.834346]  ? irq_thread_dtor+0x1c0/0x1c0
[59114.834351]  kthread+0x2cf/0x380
[59114.834356]  ? kthread_create_worker_on_cpu+0xc0/0xc0
[59114.834361]  ret_from_fork+0x22/0x40
[59114.834371] The buggy address belongs to the variable:
[59114.834383]  iwl_mvm_exit+0xc077/0x75f [iwlmvm]
[59114.834390] Memory state around the buggy address:
[59114.834399]  ffffffffc098a800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa
[59114.834404]  ffffffffc098a880: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00
[59114.834409] >ffffffffc098a900: 00 00 00 fa fa fa fa fa 00 00 00 06 fa fa fa fa
[59114.834414]                             ^
[59114.834419]  ffffffffc098a980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[59114.834425]  ffffffffc098aa00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[59114.834428] ==================================================================
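The "Memory state" rows in the two reports can be decoded by hand. The sketch below assumes generic KASAN as shipped in these 4.18 kernels, where each shadow byte describes 8 bytes of memory, each printed row holds 16 shadow bytes, `00` means the whole 8-byte granule is accessible, `01` to `07` means only that many leading bytes are, and values such as `fa` mark a redzone. The function names are hypothetical, and the validity check assumes the access does not cross a granule boundary.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed generic-KASAN granule size: one shadow byte per 8 bytes. */
enum { KASAN_GRANULE = 8 };

/* Position (0..15) of the shadow byte for addr within a printed row. */
size_t shadow_index_in_row(uint64_t row_base, uint64_t addr)
{
	return (size_t)((addr - row_base) / KASAN_GRANULE);
}

/*
 * Returns 1 if an access of `size` bytes at `addr` is valid according to
 * the given shadow byte, 0 otherwise. Only handles accesses that stay
 * inside a single granule.
 */
int access_is_valid(uint8_t shadow, uint64_t addr, size_t size)
{
	if (shadow == 0)
		return 1;	/* all 8 bytes of the granule are accessible */
	if (shadow >= KASAN_GRANULE)
		return 0;	/* poisoned, e.g. 0xfa for a redzone */
	/* Partially accessible granule: only the first `shadow` bytes. */
	return (addr % KASAN_GRANULE) + size <= shadow;
}
```

Applied to the first report, `shadow_index_in_row(0xffffffffc0f5a900ULL, 0xffffffffc0f5a95eULL)` is 11, the `06` byte, so the 2-byte read at offset 6 of that granule starts just past the last valid byte of the variable. In the second report the address `ffffffffc098a918` maps to index 3, an `fa` redzone byte.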
Is there any more information I can provide for debugging this?
I have fixed this issue and sent it upstream. It should go into 4.20 and is CCed to stable, so it should trickle down to all the affected stable releases. https://patchwork.kernel.org/patch/10639957/ Thanks for reporting and helping with this!