Bug 200659

Summary: BUG: KASAN: global-out-of-bounds in _rs_collect_tx_data.isra.5+0x2a4/0x2c0 [iwlmvm]
Product: Drivers Reporter: Johannes Hirte (johannes.hirte)
Component: network-wireless Assignee: DO NOT USE - assign "network-wireless-intel" component instead (linuxwifi)
Status: CLOSED CODE_FIX    
Severity: normal CC: luca
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 4.18-rc6 Subsystem:
Regression: No Bisected commit-id:
Attachments: iwlmvm.ko
iwlwifi.ko
mac80211.ko.xz
cfg80211.ko.xz

Description Johannes Hirte 2018-07-26 13:43:30 UTC
just found in the dmesg log:

[73727.062496] ==================================================================
[73727.062531] BUG: KASAN: global-out-of-bounds in _rs_collect_tx_data.isra.5+0x2a4/0x2c0 [iwlmvm]
[73727.062539] Read of size 2 at addr ffffffffc0f5a95e by task irq/43-iwlwifi/415

[73727.062551] CPU: 0 PID: 415 Comm: irq/43-iwlwifi Not tainted 4.18.0-rc6-00002-g2b32ed5865f5 #4
[73727.062556] Hardware name: HP HP ProBook 645 G2/80FE, BIOS N77 Ver. 01.15 03/26/2018
[73727.062560] Call Trace:
[73727.062571]  dump_stack+0x5b/0x90
[73727.062582]  print_address_description+0x60/0x229
[73727.062597]  ? _rs_collect_tx_data.isra.5+0x2a4/0x2c0 [iwlmvm]
[73727.062626]  kasan_report.cold.5+0x241/0x2ff
[73727.062660]  _rs_collect_tx_data.isra.5+0x2a4/0x2c0 [iwlmvm]
[73727.062676]  iwl_mvm_rs_tx_status+0x1534/0x4e30 [iwlmvm]
[73727.062690]  ? iwl_pcie_rx_handle+0x66a/0x1fd0 [iwlwifi]
[73727.062700]  ? iwl_pcie_irq_handler+0x2df/0x1160 [iwlwifi]
[73727.062736]  ? ieee80211_report_used_skb+0x10d/0x1150 [mac80211]
[73727.062744]  ? _raw_spin_lock_irqsave+0x1f/0x40
[73727.062759]  ? iwl_mvm_rs_rate_init+0x2f80/0x2f80 [iwlmvm]
[73727.062775]  ? iwl_mvm_check_ratid_empty+0x26d/0x3c0 [iwlmvm]
[73727.062789]  iwl_mvm_tx_reclaim+0x822/0xac0 [iwlmvm]
[73727.062809]  ? ieee80211_tx_status+0x1dd/0x390 [mac80211]
[73727.062829]  ? __ieee80211_tx_status+0x2360/0x2360 [mac80211]
[73727.062843]  ? iwl_mvm_hwrate_to_tx_rate+0x560/0x560 [iwlmvm]
[73727.062857]  iwl_mvm_rx_ba_notif+0xb14/0xd90 [iwlmvm]
[73727.062866]  ? iommu_unmap_page+0xfd/0x1f0
[73727.062873]  ? bsearch+0x52/0x80
[73727.062887]  ? iwl_mvm_rx_tx_cmd+0x1b90/0x1b90 [iwlmvm]
[73727.062893]  ? queue_iova+0x2d9/0x490
[73727.062904]  iwl_pcie_rx_handle+0x66a/0x1fd0 [iwlwifi]
[73727.062917]  ? iwl_pcie_rxq_alloc_rbs+0x830/0x830 [iwlwifi]
[73727.062927]  iwl_pcie_irq_handler+0x2df/0x1160 [iwlwifi]
[73727.062936]  ? iwl_pcie_handle_rfkill_irq+0x370/0x370 [iwlwifi]
[73727.062943]  ? irq_forced_thread_fn+0x140/0x140
[73727.062947]  irq_thread_fn+0x7d/0x120
[73727.062952]  irq_thread+0x280/0x340
[73727.062957]  ? irq_thread_dtor+0x1c0/0x1c0
[73727.062964]  ? ___preempt_schedule+0x16/0x18
[73727.062969]  ? preempt_schedule_common+0x1a/0xc0
[73727.062974]  ? ___preempt_schedule+0x16/0x18
[73727.062979]  ? wake_threads_waitq+0x40/0x40
[73727.062984]  ? _raw_spin_unlock_irqrestore+0x50/0x70
[73727.062989]  ? irq_thread_dtor+0x1c0/0x1c0
[73727.062995]  kthread+0x2cf/0x380
[73727.063001]  ? kthread_create_worker+0xd0/0xd0
[73727.063006]  ret_from_fork+0x22/0x40

[73727.063015] The buggy address belongs to the variable:
[73727.063026]  iwl_mvm_exit+0xc0bd/0x75f [iwlmvm]

[73727.063031] Memory state around the buggy address:
[73727.063038]  ffffffffc0f5a800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa
[73727.063042]  ffffffffc0f5a880: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00
[73727.063046] >ffffffffc0f5a900: 00 00 00 fa fa fa fa fa 00 00 00 06 fa fa fa fa
[73727.063050]                                                     ^
[73727.063055]  ffffffffc0f5a980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[73727.063059]  ffffffffc0f5aa00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[73727.063062] ==================================================================
[73727.063065] Disabling lock debugging due to kernel taint
Comment 1 Johannes Hirte 2018-07-26 14:40:00 UTC
some more debug info:

(gdb) list *(_rs_collect_tx_data+0x2a4)
0xffffffffc0f270f4 is in _rs_collect_tx_data (drivers/net/wireless/intel/iwlwifi/mvm/rs.c:748).
743                 (window->success_counter >= IWL_MVM_RS_RATE_MIN_SUCCESS_TH))
744                     window->average_tpt = (window->success_ratio * tpt + 64) / 128;
745             else
746                     window->average_tpt = IWL_INVALID_VALUE;
747
748             return 0;
749     }
750
751     static int rs_collect_tpc_data(struct iwl_mvm *mvm,
752                                    struct iwl_lq_sta *lq_sta,
Comment 2 Luca Coelho 2018-08-10 06:22:45 UTC
How easy is it to reproduce this?

Do you still have the binary of this warning? If yes, can you send it to us or check the variable that KASAN is mentioning (iwl_mvm_exit+0xc0bd/0x75f [iwlmvm])?

It seems that we are somehow trying to access an invalid index in our rate tables.  The index comes from the FW/HW in the TX response and we translate it to an internal index.  But I found one problem: when converting, we may return -EINVAL, leaving the index undefined, and we don't check the return value at the call sites.
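
The pattern described above can be sketched in plain C. This is only an illustration of checking a translation result before using it as a table index; it is not the driver's code. Every name here (demo_expected_tpt, demo_hw_rate_to_index, demo_collect_tx_data) and the table contents are hypothetical.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical global rate table; the real driver tables differ. */
static const uint16_t demo_expected_tpt[] = { 7, 13, 35, 58, 40, 57, 72, 98 };
#define DEMO_RATE_COUNT \
	(sizeof(demo_expected_tpt) / sizeof(demo_expected_tpt[0]))

/*
 * Hypothetical translation from a hardware rate code to an internal index.
 * Returns 0 and sets *idx on success. Returns -EINVAL and leaves *idx
 * untouched (so possibly uninitialized in the caller) on failure.
 */
static int demo_hw_rate_to_index(uint32_t hw_rate, size_t *idx)
{
	uint32_t code = hw_rate & 0xff;

	if (code >= DEMO_RATE_COUNT)
		return -EINVAL;

	*idx = code;
	return 0;
}

/*
 * Call site that checks the return value. Without the check, an
 * undefined idx could be used to read past the end of the global table.
 */
static int demo_collect_tx_data(uint32_t hw_rate, uint16_t *tpt)
{
	size_t idx;
	int ret = demo_hw_rate_to_index(hw_rate, &idx);

	if (ret)
		return ret; /* propagate the error, do not touch the table */

	*tpt = demo_expected_tpt[idx];
	return 0;
}
```

With this shape, a rate code outside the table makes demo_collect_tx_data() return -EINVAL and leaves the caller's output untouched, instead of reading whatever happens to sit next to the table in memory.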

If this is easy to reproduce, any chance you could provide trace-cmd logs as explained in our wiki[1]?

[1] https://wireless.wiki.kernel.org/en/users/drivers/iwlwifi/debugging#tracing
Comment 3 Johannes Hirte 2018-08-10 13:32:13 UTC
(In reply to Luca Coelho from comment #2)
> How easy is it to reproduce this?

I've seen this only one time. 

> Do you still have the binary of this warning? 

Unfortunately no. I'm trying to catch another error with a current kernel. But as I don't know how to reproduce it, I can't say how long this may take.
Comment 4 Johannes Hirte 2018-09-23 08:41:40 UTC
(In reply to Luca Coelho from comment #2)
> How easy is it to reproduce this?
> 
> Do you still have the binary of this warning? If yes, can you send it to us
> or check the variable that KASAN is mentioning (iwl_mvm_exit+0xc0bd/0x75f
> [iwlmvm])?

Ok, now I've seen it again and I have the binary. Do you need only the module or the vmlinux too?
Comment 5 Luca Coelho 2018-09-23 09:23:25 UTC
Cool!

Please send us all the binaries that may be involved: vmlinux, cfg80211.ko, mac80211.ko, iwlwifi.ko and iwlmvm.ko.
Comment 6 Johannes Hirte 2018-09-25 12:36:42 UTC
(In reply to Luca Coelho from comment #5)
> Cool!
> 
> Please send us all the binaries that may be involved: vmlinux, cfg80211.ko,
> mac80211.ko, iwlwifi.ko and iwlmvm.ko.

I've sent them to linuxwifi@intel.com
Comment 7 Johannes Hirte 2018-09-25 13:18:06 UTC
Hm, the mails didn't pass the mail system because of size. Is there any possibility for uploading the files or do I have to provide some kind of download for you?
Comment 8 Luca Coelho 2018-09-25 13:38:43 UTC
Can you just attach them here? You can leave vmlinux out for now, we will most likely not need it anyway and, if we do, I'll ask again. ;)
Comment 9 Johannes Hirte 2018-09-25 18:49:29 UTC
Created attachment 278757 [details]
iwlmvm.ko
Comment 10 Johannes Hirte 2018-09-25 18:50:53 UTC
Created attachment 278759 [details]
iwlwifi.ko
Comment 11 Johannes Hirte 2018-09-25 18:51:44 UTC
Created attachment 278761 [details]
mac80211.ko.xz
Comment 12 Johannes Hirte 2018-09-25 18:52:07 UTC
Created attachment 278763 [details]
cfg80211.ko.xz
Comment 13 Johannes Hirte 2018-09-25 18:53:30 UTC
corresponding kasan output for provided modules:

[59114.833692] ==================================================================
[59114.833753] BUG: KASAN: global-out-of-bounds in _rs_collect_tx_data.isra.5+0x2a4/0x2c0 [iwlmvm]
[59114.833762] Read of size 2 at addr ffffffffc098a918 by task irq/43-iwlwifi/402

[59114.833776] CPU: 0 PID: 402 Comm: irq/43-iwlwifi Not tainted 4.18.8-00002-g1b41fee3fbd4 #23
[59114.833782] Hardware name: HP HP ProBook 645 G2/80FE, BIOS N77 Ver. 01.15 03/26/2018
[59114.833787] Call Trace:
[59114.833803]  dump_stack+0x5b/0x90
[59114.833817]  print_address_description+0x60/0x229
[59114.833837]  ? _rs_collect_tx_data.isra.5+0x2a4/0x2c0 [iwlmvm]
[59114.833843]  kasan_report.cold.6+0x241/0x2ff
[59114.833859]  _rs_collect_tx_data.isra.5+0x2a4/0x2c0 [iwlmvm]
[59114.833877]  iwl_mvm_rs_tx_status+0x1155/0x4e30 [iwlmvm]
[59114.833897]  ? iwl_mvm_rs_rate_init+0x2f80/0x2f80 [iwlmvm]
[59114.833904]  ? save_stack+0x8c/0xb0
[59114.833910]  ? __kasan_slab_free+0x125/0x170
[59114.833916]  ? kmem_cache_free+0x73/0x200
[59114.833924]  ? irq_thread_fn+0x7d/0x120
[59114.833929]  ? irq_thread+0x280/0x340
[59114.833936]  ? kthread+0x2cf/0x380
[59114.833943]  ? ret_from_fork+0x22/0x40
[59114.833951]  ? lock_timer_base+0xbc/0x150
[59114.833966]  ? iwl_mvm_rs_tx_status+0x4e30/0x4e30 [iwlmvm]
[59114.834038]  rate_control_tx_status+0x1ff/0x2b0 [mac80211]
[59114.834049]  ? del_timer+0xa4/0xe0
[59114.834072]  __ieee80211_tx_status+0xa43/0x2360 [mac80211]
[59114.834099]  ieee80211_tx_status+0x1d8/0x390 [mac80211]
[59114.834122]  ? __ieee80211_tx_status+0x2360/0x2360 [mac80211]
[59114.834129]  ? __kasan_slab_free+0x13a/0x170
[59114.834146]  iwl_mvm_rx_tx_cmd+0xc6f/0x1b90 [iwlmvm]
[59114.834157]  ? skb_partial_csum_set+0x201/0x2d0
[59114.834175]  ? iwl_mvm_tx_reclaim+0xac0/0xac0 [iwlmvm]
[59114.834182]  ? memcpy+0x34/0x50
[59114.834198]  iwl_pcie_rx_handle+0x66a/0x1fd0 [iwlwifi]
[59114.834213]  ? iwl_pcie_rxq_alloc_rbs+0x830/0x830 [iwlwifi]
[59114.834224]  iwl_pcie_irq_handler+0x2df/0x1160 [iwlwifi]
[59114.834235]  ? iwl_pcie_handle_rfkill_irq+0x370/0x370 [iwlwifi]
[59114.834241]  ? irq_forced_thread_fn+0x140/0x140
[59114.834246]  irq_thread_fn+0x7d/0x120
[59114.834252]  irq_thread+0x280/0x340
[59114.834258]  ? irq_thread_dtor+0x1c0/0x1c0
[59114.834263]  ? __switch_to_asm+0x34/0x70
[59114.834268]  ? __switch_to_asm+0x40/0x70
[59114.834273]  ? __switch_to_asm+0x34/0x70
[59114.834281]  ? __sched_text_start+0x8/0x8
[59114.834309]  ? __wake_up_common+0x108/0x4f0
[59114.834334]  ? wake_threads_waitq+0x40/0x40
[59114.834340]  ? _raw_spin_unlock_irqrestore+0x3a/0x70
[59114.834346]  ? irq_thread_dtor+0x1c0/0x1c0
[59114.834351]  kthread+0x2cf/0x380
[59114.834356]  ? kthread_create_worker_on_cpu+0xc0/0xc0
[59114.834361]  ret_from_fork+0x22/0x40

[59114.834371] The buggy address belongs to the variable:
[59114.834383]  iwl_mvm_exit+0xc077/0x75f [iwlmvm]

[59114.834390] Memory state around the buggy address:
[59114.834399]  ffffffffc098a800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 fa
[59114.834404]  ffffffffc098a880: fa fa fa fa 00 00 00 00 00 00 00 00 00 00 00 00
[59114.834409] >ffffffffc098a900: 00 00 00 fa fa fa fa fa 00 00 00 06 fa fa fa fa
[59114.834414]                             ^
[59114.834419]  ffffffffc098a980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[59114.834425]  ffffffffc098aa00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[59114.834428] ==================================================================
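
For reading the "Memory state" dumps in this report, here is a minimal C sketch. It assumes generic KASAN, where one shadow byte describes 8 bytes of memory and each dump row of 16 shadow bytes therefore covers 128 bytes. A shadow value of 00 means all 8 bytes are accessible, 01 to 07 means only the first N bytes are, and fa appears to be the redzone marker placed around global variables in this kernel generation. The helper names are hypothetical and the check assumes the access does not cross into the next 8-byte granule.

```c
#include <stdbool.h>
#include <stdint.h>

#define DEMO_GRANULE   8u   /* bytes of memory per shadow byte (assumed) */
#define DEMO_ROW_BYTES 128u /* memory covered by one 16-entry dump row */

/* Column (0..15) of the shadow byte for addr within its dump row. */
static unsigned demo_shadow_column(uint64_t addr)
{
	return (unsigned)((addr % DEMO_ROW_BYTES) / DEMO_GRANULE);
}

/*
 * Would an access of `size` bytes at addr be allowed by shadow value
 * `shadow`? Only valid for accesses contained in a single granule.
 */
static bool demo_access_ok(uint64_t addr, unsigned size, uint8_t shadow)
{
	unsigned last = (unsigned)(addr % DEMO_GRANULE) + size - 1;

	if (shadow == 0)
		return last < DEMO_GRANULE; /* whole granule accessible */
	if (shadow < DEMO_GRANULE)
		return last < shadow;       /* only first `shadow` bytes */
	return false;                       /* poisoned, e.g. 0xfa redzone */
}
```

Applied to the two reports: address ffffffffc0f5a95e maps to column 11 of the row at ffffffffc0f5a900, whose shadow value is 06, and a 2-byte read at offset 6 of that granule touches bytes 6 and 7, just past the 6 valid bytes. Address ffffffffc098a918 maps to column 3, whose shadow value is fa, so that read lands inside a redzone. Both match the caret positions in the dumps.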
Comment 14 Johannes Hirte 2018-10-11 18:58:12 UTC
Is there any more information I can provide for debugging this?
Comment 15 Luca Coelho 2018-10-13 07:02:01 UTC
I have fixed this issue and sent it upstream.  It should go into 4.20 and is CCed to stable, so it should trickle down to all the affected stable releases.

https://patchwork.kernel.org/patch/10639957/

Thanks for reporting and helping with this!