Bug 215682

Summary: Long iperf test of wireguard interface causes kernel panic
Product: Networking Reporter: Alexey Kunitskiy (alexey.kv)
Component: OtherAssignee: Stephen Hemminger (stephen)
Status: NEW ---    
Severity: normal CC: regressions
Priority: P1    
Hardware: ARM   
OS: Linux   
Kernel Version: 5.15.27 Subsystem:
Regression: No Bisected commit-id:

Description Alexey Kunitskiy 2022-03-14 16:51:08 UTC
I have setup of two Rock64 SBC which I connected via lan and created a wireguard interface on it to do some benchmarking with iperf3. After long run (~3 days) on one instance following kernel panic was observed:

[266706.637097] Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: netif_receive_skb_list_internal+0x2b0/0x2b0       
[266706.638171] CPU: 3 PID: 9861 Comm: kworker/3:0 Tainted: G         C        5.15.27-rockchip64 #trunk
[266706.638999] Hardware name: Pine64 Rock64 (DT)
[266706.639406] Workqueue: wg-crypt-wg0 wg_packet_decrypt_worker [wireguard]
[266706.640084] Call trace:
[266706.640316]  dump_backtrace+0x0/0x200
[266706.640686]  show_stack+0x18/0x28
[266706.640997]  dump_stack_lvl+0x68/0x84
[266706.641348]  dump_stack+0x18/0x34
[266706.641661]  panic+0x164/0x35c
[266706.641955]  __stack_chk_fail+0x3c/0x40
[266706.642309]  netif_receive_skb_list+0x0/0x158
[266706.642711]  gro_normal_list.part.159+0x20/0x40
[266706.643129]  napi_complete_done+0xc0/0x1e8
[266706.643512]  wg_packet_rx_poll+0x45c/0x8c0 [wireguard]
[266706.644000]  __napi_poll+0x38/0x230
[266706.644324]  net_rx_action+0x284/0x2c8
[266706.644675]  _stext+0x160/0x3f8
[266706.644986]  do_softirq+0xa8/0xb8
[266706.645297]  __local_bh_enable_ip+0xac/0xb8
[266706.645682]  _raw_spin_unlock_bh+0x34/0x60
[266706.646059]  wg_packet_decrypt_worker+0x50/0x1a8 [wireguard]
[266706.646589]  process_one_work+0x20c/0x4c8
[266706.646961]  worker_thread+0x48/0x478
[266706.647300]  kthread+0x138/0x150
[266706.647599]  ret_from_fork+0x10/0x20
[266706.647931] SMP: stopping secondary CPUs
[266706.648297] Kernel Offset: disabled
[266706.648615] CPU features: 0x00001001,00000846
[266706.649009] Memory Limit: none
[266706.649293] ---[ end Kernel panic - not syncing: stack-protector: Kernel stack is corrupted in: netif_receive_skb_list_internal+0x2b0/0x2b0 ]---
Comment 1 The Linux kernel's regression tracker (Thorsten Leemhuis) 2022-04-04 13:40:26 UTC
Is this still happening? And did earlier 5.15.y versions or older series (say 5.14) work fine?

Note: This is just an inquiry, I'm not one of the developers of the code in question.