Created attachment 258499 [details] kernel panic screenshoot Just after few seconds when host is starting to forward traffic i have same kernel panic. Type of traffic TCP/UDP normal Internet traffic from users.
Created attachment 258501 [details] Kernel panic Kernel 4.14.0-rc1-next-20170919 trying linux-next tree - also panic
Created attachment 258503 [details] Kernel 4.14-rc1+ panic Added new screenshoot with panic for Kernel 4.14-rc1+ (net-next from 20-09-2017) Difference to the first one is that now iommu is enabled - previous have iommu=off
Created attachment 258507 [details] Kernel 4.14-rc1+ Trying other boot options idle=poll iommu=off Kernel panic attached dst_destroy+0x4c/0x83
Some information about enviroment: Server is acting as a ip router with bgp There are 6x bgp sessions - each with full bgp table ~600k prefixes And it looks like panic is appearing after bgp sessions are connected - not by traffic - cause at time when panic occured there is almost no traffic. Also when I run tris server without turning on BGP and push thru this server traffic by pktgen there is no panic. just after it learn routes it panick
Created attachment 258509 [details] Kernel 4.13.2 Trying another kernels - and soo far same situation with 4.13.2 - panic just after bgp sessions are connected and kernel learning routes.
Latest working kernel without this problem 4.12.13
Created attachment 258521 [details] Kernel 4.13.2 - panic with two more lines
Created attachment 258525 [details] kernel panic screenshoot pinctrl-v4.13-1 first bisection panic git bisect good v4.12 Bisecting: 7105 revisions left to test after this (roughly 13 steps) [ac7b75966c9c86426b55fe1c50ae148aa4571075] Merge tag 'pinctrl-v4.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl
Created attachment 258527 [details] kernel panic / bisection kernel 4.12-rc7+ bisect log git bisect start # bad: [ac7b75966c9c86426b55fe1c50ae148aa4571075] Merge tag 'pinctrl-v4.13-1' of git://git.kernel.org/pub/scm/linux/kernel/git/linusw/linux-pinctrl git bisect bad ac7b75966c9c86426b55fe1c50ae148aa4571075 # good: [e24dd9ee5399747b71c1d982a484fc7601795f31] Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security git bisect good e24dd9ee5399747b71c1d982a484fc7601795f31 # bad: [9cc9a5cb176ccb4f2cda5ac34da5a659926f125f] datapath: Avoid using stack larger than 1024. git bisect bad 9cc9a5cb176ccb4f2cda5ac34da5a659926f125f Going to next bisects
Some warning at bisection - not connected to main problem but somebody want to know maybee - (no panic) good: [073cf9e20c333ab29744717a23f9e43ec7512a20] Merge branch 'udp-reduce-cache-pressure' [ 639.883924] INFO: rcu_sched self-detected stall on CPU [ 639.883930] 0-...: (5999 ticks this GP) idle=ae2/140000000000001/0 softirq=43421/43421 fqs=2992 [ 639.883931] (t=6000 jiffies g=8207 c=8206 q=1035171) [ 639.883937] NMI backtrace for cpu 0 [ 639.883939] CPU: 0 PID: 2886 Comm: md2_raid1 Not tainted 4.12.0-rc4+ #2 [ 639.883940] Call Trace: [ 639.883942] <IRQ> [ 639.883947] dump_stack+0x63/0x7f [ 639.883949] nmi_cpu_backtrace+0x7a/0x89 [ 639.883953] ? irq_force_complete_move+0xe9/0xe9 [ 639.883955] nmi_trigger_cpumask_backtrace+0x4e/0xc6 [ 639.883956] arch_trigger_cpumask_backtrace+0x14/0x17 [ 639.883958] rcu_dump_cpu_stacks+0x6d/0xb9 [ 639.883961] rcu_check_callbacks+0x220/0x5fb [ 639.883964] ? raise_softirq+0x29/0x30 [ 639.883966] update_process_times+0x2a/0x4f [ 639.883968] tick_sched_timer+0xfa/0x12f [ 639.883969] __hrtimer_run_queues+0x6e/0xcf [ 639.883970] hrtimer_interrupt+0x69/0x10d [ 639.883972] local_apic_timer_interrupt+0x49/0x4d [ 639.883973] smp_apic_timer_interrupt+0x25/0x36 [ 639.883975] apic_timer_interrupt+0x89/0x90 [ 639.883977] RIP: 0010:atomic64_cmpxchg+0xc/0xe [ 639.883978] RSP: 0018:ffffc90008607d20 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10 [ 639.883979] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 639.883979] RDX: ffff88086cb4f000 RSI: 0000000000000000 RDI: ffff88046b1a29e8 [ 639.883980] RBP: ffffc90008607d20 R08: 0000000000000000 R09: ffff88086be4c0d8 [ 639.883980] R10: 000000000000001e R11: 000000000000001e R12: ffff88086cb4f000 [ 639.883981] R13: ffff88046b1a29e8 R14: ffff88086be4c0d0 R15: 0000000000000000 [ 639.883982] </IRQ> [ 639.883983] __mutex_trylock_or_owner+0x4b/0x63 [ 639.883984] __mutex_trylock+0x9/0x11 [ 639.883986] mutex_trylock+0x9/0xe [ 639.883989] md_check_recovery+0xde/0x3d5 [ 639.883991] raid1d+0x35/0xc2c [ 639.883993] md_thread+0x11d/0x13b [ 639.883995] ? add_wait_queue_exclusive+0x43/0x43 [ 639.883996] ? md_do_sync+0xc61/0xc61 [ 639.883998] kthread+0xf4/0xfc [ 639.883999] ? __kthread_init_worker+0x2a/0x2a [ 639.884001] ret_from_fork+0x25/0x30 [ 745.520198] INFO: task kworker/1:1:598 blocked for more than 120 seconds. [ 745.520201] Not tainted 4.12.0-rc4+ #2 [ 745.520201] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 745.520202] kworker/1:1 D 0 598 2 0x00000000 [ 745.520209] Workqueue: events_long flush_old_commits [ 745.520210] Call Trace: [ 745.520225] __schedule+0x40e/0x4ea [ 745.520226] schedule+0x80/0x8b [ 745.520228] queue_log_writer+0x81/0xb5 [ 745.520231] ? try_to_wake_up+0x1f9/0x1f9 [ 745.520232] do_journal_end+0x1c3/0xb74 [ 745.520233] ? do_journal_begin_r+0x219/0x24d [ 745.520235] journal_end_sync+0x60/0x65 [ 745.520236] reiserfs_sync_fs+0x3a/0x57 [ 745.520237] flush_old_commits+0x7d/0x8e [ 745.520240] process_one_work+0x134/0x1df [ 745.520241] ? process_scheduled_works+0x2e/0x2e [ 745.520242] worker_thread+0x198/0x25a [ 745.520244] ? process_scheduled_works+0x2e/0x2e [ 745.520245] kthread+0xf4/0xfc [ 745.520246] ? __kthread_init_worker+0x2a/0x2a [ 745.520247] ? __request_module+0x1d5/0x1d5 [ 745.520249] ? call_usermodehelper_exec_async+0x11e/0x129 [ 745.520251] ret_from_fork+0x25/0x30 [ 745.520265] INFO: task snmpd:4410 blocked for more than 120 seconds. [ 745.520266] Not tainted 4.12.0-rc4+ #2 [ 745.520266] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 745.520267] snmpd D 0 4410 1 0x00000000 [ 745.520268] Call Trace: [ 745.520269] __schedule+0x40e/0x4ea [ 745.520271] schedule+0x80/0x8b [ 745.520272] queue_log_writer+0x81/0xb5 [ 745.520273] ? try_to_wake_up+0x1f9/0x1f9 [ 745.520274] do_journal_begin_r+0x18d/0x24d [ 745.520275] journal_begin+0xbb/0xf3 [ 745.520277] reiserfs_dirty_inode+0x4f/0x7c [ 745.520287] ? current_kernel_time64+0xd/0x31 [ 745.520288] ? kfree+0x24/0xb1 [ 745.520291] __mark_inode_dirty+0x2c/0x1cf [ 745.520292] generic_update_time+0x9f/0xaa [ 745.520294] file_update_time+0xad/0xce [ 745.520296] ? virt_to_head_page+0x3b/0x3d [ 745.520299] __generic_file_write_iter+0x5d/0x152 [ 745.520300] ? destroy_inode+0x39/0x52 [ 745.520301] generic_file_write_iter+0x4b/0x95 [ 745.520303] __vfs_write+0x79/0xa2 [ 745.520304] vfs_write+0x96/0xfc [ 745.520306] SyS_write+0x4d/0x7f [ 745.520307] entry_SYSCALL_64_fastpath+0x1a/0xa5 [ 745.520308] RIP: 0033:0x7f2da122e640 [ 745.520309] RSP: 002b:00007ffe3443e4e8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001 [ 745.520310] RAX: ffffffffffffffda RBX: 00007ffe3443b478 RCX: 00007f2da122e640 [ 745.520311] RDX: 000000000000002c RSI: 00007f2da20e4000 RDI: 0000000000000003 [ 745.520311] RBP: 0000000000000000 R08: 746e6f6362757320 R09: 69272072656e6961 [ 745.520312] R10: 2027726464615f61 R11: 0000000000000246 R12: 0000000000000048 [ 745.520313] R13: 0000000000000000 R14: 0000000000000000 R15: 00007ffe3443b6b8 [ 745.520324] INFO: task objcopy:24869 blocked for more than 120 seconds. [ 745.520325] Not tainted 4.12.0-rc4+ #2 [ 745.520325] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 745.520326] objcopy D 0 24869 24868 0x00000004 [ 745.520327] Call Trace: [ 745.520329] __schedule+0x40e/0x4ea [ 745.520330] ? bit_wait_timeout+0x44/0x44 [ 745.520331] schedule+0x80/0x8b [ 745.520333] io_schedule+0x12/0x35 [ 745.520334] bit_wait_io+0xd/0x2c [ 745.520335] __wait_on_bit+0x4c/0x86 [ 745.520337] ? submit_bio+0xf6/0xff [ 745.520338] out_of_line_wait_on_bit+0x6d/0x78 [ 745.520340] ? bit_wait_timeout+0x44/0x44 [ 745.520342] ? wake_atomic_t_function+0x28/0x28 [ 745.520344] __wait_on_buffer+0x21/0x23 [ 745.520346] search_by_key+0x19f/0x77c [ 745.520348] search_for_position_by_key+0x34/0x62 [ 745.520349] ? journal_transaction_should_end+0x24/0x95 [ 745.520350] reiserfs_do_truncate+0x330/0x3ad [ 745.520352] ? reiserfs_write_lock_nested+0x1f/0x35 [ 745.520354] reiserfs_delete_object+0x2a/0x68 [ 745.520355] reiserfs_evict_inode+0xc2/0x146 [ 745.520356] evict+0xb5/0x168 [ 745.520357] iput+0x134/0x139 [ 745.520359] do_unlinkat+0x10a/0x1a0 [ 745.520360] SyS_unlink+0x11/0x13 [ 745.520361] entry_SYSCALL_64_fastpath+0x1a/0xa5 [ 745.520362] RIP: 0033:0x2b4f96f6e787 [ 745.520363] RSP: 002b:00007fff6c35fcd8 EFLAGS: 00000206 ORIG_RAX: 0000000000000057 [ 745.520364] RAX: ffffffffffffffda RBX: 00002b4f97232640 RCX: 00002b4f96f6e787 [ 745.520364] RDX: 0000000000008000 RSI: 00007fff6c35fce0 RDI: 00007fff6c3607c9 [ 745.520365] RBP: 0000000000000ff1 R08: 0000000000000058 R09: 0000000000140000 [ 745.520366] R10: 00007fff6c35fa90 R11: 0000000000000206 R12: 0000000001133930 [ 745.520366] R13: 000000000000003f R14: 00002b4f97232698 R15: 000000000000e6e1
Another bisect (some warnings): with: # good: [2fae5d0e647c6470d206e72b5fc24972bb900f70] Merge branch 'bpf-ctx-narrow' git bisect good 2fae5d0e647c6470d206e72b5fc24972bb900f70 [ 171.393948] NETDEV WATCHDOG: enp4s0f0 (ixgbe): transmit queue 0 timed out [ 171.393957] ------------[ cut here ]------------ [ 171.393962] WARNING: CPU: 24 PID: 0 at dev_watchdog+0xcf/0x139 [ 171.393962] Modules linked in: bonding ipmi_si x86_pkg_temp_thermal [ 171.393967] CPU: 24 PID: 0 Comm: swapper/24 Not tainted 4.12.0-rc4+ #6 [ 171.393968] task: ffff88046d98c000 task.stack: ffffc90003320000 [ 171.393970] RIP: 0010:dev_watchdog+0xcf/0x139 [ 171.393970] RSP: 0018:ffff88046fb83ea8 EFLAGS: 00010246 [ 171.393971] RAX: 000000000000003d RBX: ffff88046bf10000 RCX: 0000000000000000 [ 171.393972] RDX: ffff88046fb92f01 RSI: ffffffff81e07102 RDI: ffff88046fb8da08 [ 171.393972] RBP: ffff88046fb83eb8 R08: ffffffff81e07102 R09: 000000000000003d [ 171.393973] R10: 0000000000000000 R11: 0000000000000001 R12: 0000000000000000 [ 171.393973] R13: ffff88046fb83ef8 R14: ffff88046d98c000 R15: ffffffff81c06008 [ 171.393974] FS: 0000000000000000(0000) GS:ffff88046fb80000(0000) knlGS:0000000000000000 [ 171.393975] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 171.393976] CR2: 0000000001c060c8 CR3: 000000086b910000 CR4: 00000000001406e0 [ 171.393976] Call Trace: [ 171.393978] <IRQ> [ 171.393979] ? netif_tx_lock+0x79/0x79 [ 171.393983] call_timer_fn.isra.24+0x17/0x77 [ 171.393984] run_timer_softirq+0x118/0x161 [ 171.393986] ? netif_tx_lock+0x79/0x79 [ 171.393988] ? arch_local_irq_enable+0xb/0xd [ 171.393990] __do_softirq+0xa8/0x19d [ 171.393991] irq_exit+0x5d/0x6b [ 171.393994] smp_apic_timer_interrupt+0x2a/0x36 [ 171.393995] apic_timer_interrupt+0x89/0x90 [ 171.393998] RIP: 0010:mwait_idle+0x4e/0x6a [ 171.393999] RSP: 0018:ffffc90003323e98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff10 [ 171.394000] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 171.394000] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88046d98c000 [ 171.394001] RBP: ffffc90003323e98 R08: 00000000000000af R09: 0000000000000001 [ 171.394001] R10: 0000000000000001 R11: 0000000000000001 R12: 0000000000000000 [ 171.394002] R13: ffff88046d98c000 R14: ffff88046d98c000 R15: ffff88046d98c000 [ 171.394002] </IRQ> [ 171.394006] arch_cpu_idle+0xa/0xc [ 171.394007] default_idle_call+0x19/0x1b [ 171.394009] do_idle+0xb7/0x191 [ 171.394010] cpu_startup_entry+0x1d/0x21 [ 171.394012] start_secondary+0xd8/0xdc [ 171.394014] secondary_startup_64+0x9f/0x9f [ 171.394015] Code: cc 75 bd eb 35 48 89 df c6 05 b2 ba 75 00 01 e8 f8 60 fe ff 44 89 e1 48 89 de 48 89 c2 48 c7 c7 03 32 a4 81 31 c0 e8 8f 1a b6 ff <0f> ff 48 8b 83 e0 01 00 00 48 89 df ff 50 78 48 8b 05 3f 9c 6b [ 171.394033] ---[ end trace b78b2168985afd72 ]---
Created attachment 258529 [details] kernel panic net-next After aditional patch for debug: diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index f535779d9dc1dfe36934c2abba4e43d053ac5d6f..1eaa3553a724dc8c048f67b556337072d5addc82 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -3331,7 +3331,14 @@ void netdev_run_todo(void); */ static inline void dev_put(struct net_device *dev) { - this_cpu_dec(*dev->pcpu_refcnt); + int __percpu *pref = READ_ONCE(dev->pcpu_refcnt); + + if (!pref) { + pr_err("no pcpu_refcnt on dev %p(%s) state %d dismantle %d\n", + dev, dev->name, dev->reg_state, dev->dismantle); + BUG(); + } + this_cpu_dec(*pref); } /**
Created attachment 258531 [details] Kernel panic Kernel 4.14.0-rc1-next-20170919 A little more in new attachement from ipmi recording
After adding patch : diff --git a/include/net/dst.h b/include/net/dst.h index 93568bd0a352..82aff41c6f63 100644 --- a/include/net/dst.h +++ b/include/net/dst.h @@ -271,7 +271,7 @@ static inline void dst_use_noref(struct dst_entry *dst, unsigned long time) static inline struct dst_entry *dst_clone(struct dst_entry *dst) { if (dst) - atomic_inc(&dst->__refcnt); + dst_hold(dst); return dst; } dmesg output [ 222.276139] WARNING: CPU: 25 PID: 0 at dst_hold+0x25/0x29 [ 222.276140] Modules linked in: bonding ipmi_si x86_pkg_temp_thermal [ 222.276159] CPU: 25 PID: 0 Comm: swapper/25 Tainted: G W 4.14.0-rc1+ #1 [ 222.276160] task: ffff88046da5d000 task.stack: ffffc90003328000 [ 222.276161] RIP: 0010:dst_hold+0x25/0x29 [ 222.276162] RSP: 0018:ffff88046fbc39d8 EFLAGS: 00010246 [ 222.276163] RAX: 0000000000000000 RBX: ffff88045b7dc400 RCX: 0000000000000b84 [ 222.276164] RDX: ffff88084be83280 RSI: 0000000000000036 RDI: ffff88084be83200 [ 222.276165] RBP: ffff88046fbc39d8 R08: 0000000000000000 R09: ffff8804573880e2 [ 222.276165] R10: 00000000000000c0 R11: ffffffff815921b2 R12: ffff88046d7f8000 [ 222.276166] R13: ffff88046d7f8000 R14: 0000000000000000 R15: 0000000000000002 [ 222.276167] FS: 0000000000000000(0000) GS:ffff88046fbc0000(0000) knlGS:0000000000000000 [ 222.276167] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 222.276168] CR2: 00007fdd4c81d580 CR3: 000000086b737003 CR4: 00000000001606e0 [ 222.276168] Call Trace: [ 222.276169] <IRQ> [ 222.276170] skb_dst_force+0x32/0x34 [ 222.276172] __dev_queue_xmit+0x1ad/0x482 [ 222.276173] ? eth_header+0x28/0xc6 [ 222.276174] dev_queue_xmit+0xb/0xd [ 222.276175] neigh_connected_output+0x9b/0xb4 [ 222.276177] ip_finish_output2+0x234/0x294 [ 222.276178] ? ipt_do_table+0x369/0x388 [ 222.276179] ip_finish_output+0x12c/0x13f [ 222.276181] ip_output+0x53/0x87 [ 222.276182] ip_forward_finish+0x53/0x5a [ 222.276183] ip_forward+0x2cb/0x3e6 [ 222.276185] ? pskb_trim_rcsum.part.9+0x4b/0x4b [ 222.276186] ip_rcv_finish+0x2e2/0x321 [ 222.276187] ip_rcv+0x26f/0x2eb [ 222.276189] ? vlan_do_receive+0x4f/0x289 [ 222.276190] ? virt_to_head_page+0x3b/0x3d [ 222.276192] ? skb_gro_receive+0x13c/0x29e [ 222.276193] __netif_receive_skb_core+0x467/0x50b [ 222.276194] __netif_receive_skb+0x4d/0x5f [ 222.276196] netif_receive_skb_internal+0x5c/0xaf [ 222.276197] napi_gro_complete+0x7e/0x88 [ 222.276198] napi_gro_flush+0x4f/0x6b [ 222.276199] napi_complete_done+0x4b/0xbc [ 222.276201] ixgbe_poll+0xd2e/0xf09 [ 222.276202] ? kmem_cache_free_bulk+0x1b6/0x1f7 [ 222.276204] net_rx_action+0xf4/0x266 [ 222.276205] __do_softirq+0xa8/0x19d [ 222.276207] irq_exit+0x5d/0x6b [ 222.276208] do_IRQ+0x9c/0xb5 [ 222.276209] common_interrupt+0x93/0x93 [ 222.276210] </IRQ> [ 222.276211] RIP: 0010:mwait_idle+0x4e/0x6a [ 222.276212] RSP: 0018:ffffc9000332be98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff56 [ 222.276213] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 222.276213] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88046da5d000 [ 222.276214] RBP: ffffc9000332be98 R08: fffffffffffffffc R09: 0000000000000000 [ 222.276214] R10: ffff88046da5d000 R11: ffff88046da5d000 R12: 0000000000000000 [ 222.276215] R13: ffff88046da5d000 R14: ffff88046da5d000 R15: ffff88046da5d000 [ 222.276217] arch_cpu_idle+0xa/0xc [ 222.276218] default_idle_call+0x19/0x1b [ 222.276220] do_idle+0xbc/0x196 [ 222.276221] ? do_idle+0x194/0x196 [ 222.276222] cpu_startup_entry+0x1d/0x20 [ 222.276224] start_secondary+0xf6/0xfb [ 222.276225] secondary_startup_64+0xa5/0xa5 [ 222.276226] Code: 08 0f 4f c2 5d c3 8b 87 80 00 00 00 55 48 8d 97 80 00 00 00 48 89 e5 85 c0 74 10 8d 48 01 f0 0f b1 0a 0f 94 c1 84 c9 75 04 eb ec <0f> ff 5d c3 48 8b 47 58 31 d2 a8 01 74 0b 31 d2 48 a9 fe ff ff [ 222.276246] ---[ end trace c197aa7ded9b7547 ]--- [ 222.299282] ------------[ cut here ]------------ [ 222.299288] WARNING: CPU: 19 PID: 0 at dst_hold+0x25/0x29 [ 222.299288] Modules linked in: bonding ipmi_si x86_pkg_temp_thermal [ 222.299292] CPU: 19 PID: 0 Comm: swapper/19 Tainted: G W 4.14.0-rc1+ #1 [ 222.299293] task: ffff88086d919000 task.stack: ffffc900032f8000 [ 222.299294] RIP: 0010:dst_hold+0x25/0x29 [ 222.299295] RSP: 0018:ffff88087fa43a28 EFLAGS: 00010246 [ 222.299296] RAX: 0000000000000000 RBX: ffff88045aceb000 RCX: 0000000000000000 [ 222.299297] RDX: ffff880459d40180 RSI: 0000000000000000 RDI: ffff880459d40100 [ 222.299297] RBP: ffff88087fa43a28 R08: ffff88084dd13800 R09: 00000000000005c0 [ 222.299298] R10: ffff88084dd138c0 R11: ffffffff815921b2 R12: ffff88086cc4d000 [ 222.299299] R13: ffff88086cc4d000 R14: 0000000000000000 R15: 0000000000000002 [ 222.299300] FS: 0000000000000000(0000) GS:ffff88087fa40000(0000) knlGS:0000000000000000 [ 222.299301] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 222.299301] CR2: 00007f83f3ce1580 CR3: 0000000866dca002 CR4: 00000000001606e0 [ 222.299302] Call Trace: [ 222.299303] <IRQ> [ 222.299305] skb_dst_force+0x32/0x34 [ 222.299306] __dev_queue_xmit+0x1ad/0x482 [ 222.299308] ? eth_header+0x28/0xc6 [ 222.299310] dev_queue_xmit+0xb/0xd [ 222.299312] neigh_connected_output+0x9b/0xb4 [ 222.299314] ip_finish_output2+0x234/0x294 [ 222.299316] ? ipt_do_table+0x369/0x388 [ 222.299317] ip_finish_output+0x12c/0x13f [ 222.299319] ip_output+0x53/0x87 [ 222.299320] ip_forward_finish+0x53/0x5a [ 222.299321] ip_forward+0x2cb/0x3e6 [ 222.299323] ? pskb_trim_rcsum.part.9+0x4b/0x4b [ 222.299324] ip_rcv_finish+0x2e2/0x321 [ 222.299325] ip_rcv+0x26f/0x2eb [ 222.299337] ? vlan_do_receive+0x4f/0x289 [ 222.299338] __netif_receive_skb_core+0x467/0x50b [ 222.299341] ? tcp_gro_receive+0x239/0x239 [ 222.299344] ? inet_gro_receive+0x36/0x238 [ 222.299345] __netif_receive_skb+0x4d/0x5f [ 222.299346] netif_receive_skb_internal+0x5c/0xaf [ 222.299348] napi_gro_receive+0x45/0x81 [ 222.299351] ixgbe_poll+0xc8a/0xf09 [ 222.299354] ? kmem_cache_free_bulk+0x1b6/0x1f7 [ 222.299356] net_rx_action+0xf4/0x266 [ 222.299358] __do_softirq+0xa8/0x19d [ 222.299361] irq_exit+0x5d/0x6b [ 222.299362] do_IRQ+0x9c/0xb5 [ 222.299363] common_interrupt+0x93/0x93 [ 222.299364] </IRQ> [ 222.299366] RIP: 0010:mwait_idle+0x4e/0x6a [ 222.299367] RSP: 0018:ffffc900032fbe98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffc4 [ 222.299368] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 222.299368] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88086d919000 [ 222.299369] RBP: ffffc900032fbe98 R08: 0000000000000000 R09: 0000000000000000 [ 222.299370] R10: ffff88086d919000 R11: ffff88086d919000 R12: 0000000000000000 [ 222.299370] R13: ffff88086d919000 R14: ffff88086d919000 R15: ffff88086d919000 [ 222.299374] arch_cpu_idle+0xa/0xc [ 222.299375] default_idle_call+0x19/0x1b [ 222.299377] do_idle+0xbc/0x196 [ 222.299379] cpu_startup_entry+0x1d/0x20 [ 222.299380] start_secondary+0xf6/0xfb [ 222.299383] secondary_startup_64+0xa5/0xa5 [ 222.299384] Code: 08 0f 4f c2 5d c3 8b 87 80 00 00 00 55 48 8d 97 80 00 00 00 48 89 e5 85 c0 74 10 8d 48 01 f0 0f b1 0a 0f 94 c1 84 c9 75 04 eb ec <0f> ff 5d c3 48 8b 47 58 31 d2 a8 01 74 0b 31 d2 48 a9 fe ff ff [ 222.299402] ---[ end trace c197aa7ded9b7548 ]--- [ 222.324407] ------------[ cut here ]------------ [ 222.324410] WARNING: CPU: 32 PID: 0 at dst_hold+0x25/0x29 [ 222.324411] Modules linked in: bonding ipmi_si x86_pkg_temp_thermal [ 222.324414] CPU: 32 PID: 0 Comm: swapper/32 Tainted: G W 4.14.0-rc1+ #1 [ 222.324415] task: ffff88086d91d000 task.stack: ffffc90003360000 [ 222.324416] RIP: 0010:dst_hold+0x25/0x29 [ 222.324417] RSP: 0018:ffff88087fb039d8 EFLAGS: 00010246 [ 222.324418] RAX: 0000000000000000 RBX: ffff880459bd4900 RCX: 00000000000011be [ 222.324418] RDX: ffff880459bf0280 RSI: 0000000000000036 RDI: ffff880459bf0200 [ 222.324419] RBP: ffff88087fb039d8 R08: 0000000000000000 R09: ffff880848a920e2 [ 222.324420] R10: 00000000000000c0 R11: ffffffff815921b2 R12: ffff88046d7f8000 [ 222.324420] R13: ffff88046d7f8000 R14: 0000000000000000 R15: 0000000000000002 [ 222.324421] FS: 0000000000000000(0000) GS:ffff88087fb00000(0000) knlGS:0000000000000000 [ 222.324422] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 222.324422] CR2: 0000000011f99008 CR3: 000000086b737000 CR4: 00000000001606e0 [ 222.324423] Call Trace: [ 222.324433] <IRQ> [ 222.324435] skb_dst_force+0x32/0x34 [ 222.324436] __dev_queue_xmit+0x1ad/0x482 [ 222.324438] ? eth_header+0x28/0xc6 [ 222.324439] dev_queue_xmit+0xb/0xd [ 222.324441] neigh_connected_output+0x9b/0xb4 [ 222.324443] ip_finish_output2+0x234/0x294 [ 222.324444] ? ipt_do_table+0x369/0x388 [ 222.324445] ip_finish_output+0x12c/0x13f [ 222.324447] ip_output+0x53/0x87 [ 222.324448] ip_forward_finish+0x53/0x5a [ 222.324449] ip_forward+0x2cb/0x3e6 [ 222.324451] ? pskb_trim_rcsum.part.9+0x4b/0x4b [ 222.324452] ip_rcv_finish+0x2e2/0x321 [ 222.324453] ip_rcv+0x26f/0x2eb [ 222.324455] ? vlan_do_receive+0x4f/0x289 [ 222.324457] ? virt_to_head_page+0x3b/0x3d [ 222.324458] ? skb_gro_receive+0x13c/0x29e [ 222.324460] __netif_receive_skb_core+0x467/0x50b [ 222.324461] __netif_receive_skb+0x4d/0x5f [ 222.324462] netif_receive_skb_internal+0x5c/0xaf [ 222.324464] napi_gro_complete+0x7e/0x88 [ 222.324465] napi_gro_flush+0x4f/0x6b [ 222.324466] napi_complete_done+0x4b/0xbc [ 222.324468] ixgbe_poll+0xd2e/0xf09 [ 222.324470] ? kmem_cache_free_bulk+0x1b6/0x1f7 [ 222.324471] net_rx_action+0xf4/0x266 [ 222.324473] __do_softirq+0xa8/0x19d [ 222.324475] irq_exit+0x5d/0x6b [ 222.324476] do_IRQ+0x9c/0xb5 [ 222.324477] common_interrupt+0x93/0x93 [ 222.324478] </IRQ> [ 222.324480] RIP: 0010:mwait_idle+0x4e/0x6a [ 222.324480] RSP: 0018:ffffc90003363e98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffffa6 [ 222.324482] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 222.324482] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88086d91d000 [ 222.324483] RBP: ffffc90003363e98 R08: fffffffffffffffc R09: 0000000000000000 [ 222.324483] R10: ffff88086d91d000 R11: ffff88086d91d000 R12: 0000000000000000 [ 222.324484] R13: ffff88086d91d000 R14: ffff88086d91d000 R15: ffff88086d91d000 [ 222.324486] arch_cpu_idle+0xa/0xc [ 222.324488] default_idle_call+0x19/0x1b [ 222.324490] do_idle+0xbc/0x196 [ 222.324491] ? complete+0x48/0x50 [ 222.324493] cpu_startup_entry+0x1d/0x20 [ 222.324494] start_secondary+0xf6/0xfb [ 222.324497] secondary_startup_64+0xa5/0xa5 [ 222.324497] Code: 08 0f 4f c2 5d c3 8b 87 80 00 00 00 55 48 8d 97 80 00 00 00 48 89 e5 85 c0 74 10 8d 48 01 f0 0f b1 0a 0f 94 c1 84 c9 75 04 eb ec <0f> ff 5d c3 48 8b 47 58 31 d2 a8 01 74 0b 31 d2 48 a9 fe ff ff [ 222.324516] ---[ end trace c197aa7ded9b7549 ]---
update [ 961.873764] dst_release: dst:ffff88086bfcfe00 refcnt:-1 [ 989.919469] ------------[ cut here ]------------ [ 989.919476] WARNING: CPU: 4 PID: 0 at dst_hold+0x25/0x29 [ 989.919476] Modules linked in: bonding ipmi_si x86_pkg_temp_thermal [ 989.919481] CPU: 4 PID: 0 Comm: swapper/4 Tainted: G W 4.14.0-rc1+ #1 [ 989.919482] task: ffff88046da32000 task.stack: ffffc90003280000 [ 989.919484] RIP: 0010:dst_hold+0x25/0x29 [ 989.919484] RSP: 0018:ffff88046f903a28 EFLAGS: 00010246 [ 989.919485] RAX: 0000000000000000 RBX: ffff880467bd2900 RCX: 0000000000000000 [ 989.919486] RDX: ffff88045ab85a80 RSI: 0000000000000000 RDI: ffff88045ab85a00 [ 989.919487] RBP: ffff88046f903a28 R08: ffff880457e84000 R09: 0000000000000344 [ 989.919488] R10: ffff880457e840c0 R11: ffffffff815921b2 R12: ffff88046d305000 [ 989.919488] R13: ffff88046d305000 R14: 0000000000000000 R15: 0000000000000002 [ 989.919489] FS: 0000000000000000(0000) GS:ffff88046f900000(0000) knlGS:0000000000000000 [ 989.919490] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 989.919491] CR2: 00007f5a78e3a000 CR3: 0000000863016001 CR4: 00000000001606e0 [ 989.919492] Call Trace: [ 989.919493] <IRQ> [ 989.919496] skb_dst_force+0x32/0x34 [ 989.919498] __dev_queue_xmit+0x1ad/0x482 [ 989.919501] ? eth_header+0x28/0xc6 [ 989.919502] dev_queue_xmit+0xb/0xd [ 989.919504] neigh_connected_output+0x9b/0xb4 [ 989.919507] ip_finish_output2+0x234/0x294 [ 989.919509] ? ipt_do_table+0x369/0x388 [ 989.919510] ip_finish_output+0x12c/0x13f [ 989.919512] ip_output+0x53/0x87 [ 989.919513] ip_forward_finish+0x53/0x5a [ 989.919515] ip_forward+0x2cb/0x3e6 [ 989.919516] ? pskb_trim_rcsum.part.9+0x4b/0x4b [ 989.919518] ip_rcv_finish+0x2e2/0x321 [ 989.919519] ip_rcv+0x26f/0x2eb [ 989.919522] ? vlan_do_receive+0x4f/0x289 [ 989.919523] __netif_receive_skb_core+0x467/0x50b [ 989.919526] ? tcp_gro_receive+0x239/0x239 [ 989.919529] ? inet_gro_receive+0x226/0x238 [ 989.919530] __netif_receive_skb+0x4d/0x5f [ 989.919532] netif_receive_skb_internal+0x5c/0xaf [ 989.919533] napi_gro_receive+0x45/0x81 [ 989.919536] ixgbe_poll+0xc8a/0xf09 [ 989.919539] ? kmem_cache_free_bulk+0x1b6/0x1f7 [ 989.919540] net_rx_action+0xf4/0x266 [ 989.919543] __do_softirq+0xa8/0x19d [ 989.919545] irq_exit+0x5d/0x6b [ 989.919546] do_IRQ+0x9c/0xb5 [ 989.919548] common_interrupt+0x93/0x93 [ 989.919548] </IRQ> [ 989.919551] RIP: 0010:mwait_idle+0x4e/0x6a [ 989.919551] RSP: 0018:ffffc90003283e98 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff47 [ 989.919553] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 989.919553] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88046da32000 [ 989.919554] RBP: ffffc90003283e98 R08: 0000000000000001 R09: ffffffff81c06020 [ 989.919554] R10: ffff88046da32000 R11: ffff88046da32000 R12: 0000000000000000 [ 989.919555] R13: ffff88046da32000 R14: ffff88046da32000 R15: ffff88046da32000 [ 989.919559] arch_cpu_idle+0xa/0xc [ 989.919561] default_idle_call+0x19/0x1b [ 989.919565] do_idle+0xbc/0x196 [ 989.919566] cpu_startup_entry+0x1d/0x20 [ 989.919568] start_secondary+0xf6/0xfb [ 989.919571] secondary_startup_64+0xa5/0xa5 [ 989.919572] Code: 08 0f 4f c2 5d c3 8b 87 80 00 00 00 55 48 8d 97 80 00 00 00 48 89 e5 85 c0 74 10 8d 48 01 f0 0f b1 0a 0f 94 c1 84 c9 75 04 eb ec <0f> ff 5d c3 48 8b 47 58 31 d2 a8 01 74 0b 31 d2 48 a9 fe ff ff [ 989.919593] ---[ end trace c197aa7ded9b754f ]--- [ 989.919594] dst_release: dst:ffff88045ab85a00 refcnt:-1
Problem solved by patch: https://www.mail-archive.com/netdev@vger.kernel.org/msg189383.html