We have 10 identical servers that do only traffic shaping (tc with htb, u32 and sfq, plus iptables). A few of them hang about once a week.
Timer settings in the kernel config: HZ=300, NO_HZ=n, HIGH_RES_TIMERS=n
Server 1:
[321478.840858] BUG: NMI Watchdog detected LOCKUP on CPU3, ip c01fafc7, registers: [321478.840858] Modules linked in: netconsole e1000e i2c_i801 e1000 i2c_core [321478.840858] [321478.840858] Pid: 0, comm: swapper Not tainted (2.6.26.5-fw #1) [321478.840858] EIP: 0060:[<c01fafc7>] EFLAGS: 00000082 CPU: 3 [321478.840858] EIP is at rb_insert_color+0x17/0xc0 [321478.840858] EAX: f10294a4 EBX: f10294a4 ECX: 00000000 EDX: f10294a4 [321478.840858] ESI: f10294a4 EDI: f10294a4 EBP: c202d0d4 ESP: f7c5fcac [321478.840858] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 [321478.840858] Process swapper (pid: 0, ti=f7c5e000 task=f7c32940 task.ti=f7c5e000) [321478.840858] Stack: f10294a4 00000000 c202d0cc c202d0d4 c013a8ff f10294a4 c202d0cc c20230cc [321478.840858] c04450a0 c013adea 00000000 f7c5fcfc 406ce400 00012461 00000001 00000286 [321478.840858] f1029000 ffffffff 00000000 00000000 c02d15fe 00000000 f1029000 c02d6da6 [321478.840858] Call Trace: [321478.840858] [<c013a8ff>] enqueue_hrtimer+0x5f/0x80 [321478.840858] [<c013adea>] hrtimer_start+0xaa/0x130 [321478.840858] [<c02d15fe>] qdisc_watchdog_schedule+0x1e/0x30 [321478.840858] [<c02d6da6>] htb_dequeue+0x6a6/0x810 [321478.840858] [<c02d77f7>] sfq_drop+0x1a7/0x260 [321478.840858] [<c02d77f7>] sfq_drop+0x1a7/0x260 [321478.840858] [<c02d14f2>] tc_classify+0x42/0x90 [321478.840858] [<c02d71c0>] htb_enqueue+0x0/0x1e0 [321478.840858] [<c02d047c>] __qdisc_run+0x19c/0x1d0 [321478.840858] [<c02d71c0>] htb_enqueue+0x0/0x1e0 [321478.840858] [<c02c4cb7>] dev_queue_xmit+0x267/0x380 [321478.840858] [<c02e6040>] ip_forward_finish+0x0/0x40 [321478.840858] [<c02e8bef>] ip_finish_output+0x11f/0x280 [321478.840858] [<c02e630f>] ip_forward+0x28f/0x2d0 [321478.840858] [<c02e6065>] ip_forward_finish+0x25/0x40 [321478.840858] [<c02e4ba2>] 
ip_rcv_finish+0x122/0x360 [321478.840858] [<c016d7e9>] add_partial+0x19/0x60 [321478.840858] [<c016e8d9>] __slab_free+0x169/0x290 [321478.840858] [<c016e8d9>] __slab_free+0x169/0x290 [321478.840858] [<c02e5020>] ip_rcv+0x0/0x290 [321478.840858] [<c02c1b4b>] netif_receive_skb+0x26b/0x470 [321478.840858] [<f886b74d>] e1000_receive_skb+0x4d/0x1b0 [e1000e] [321478.840858] [<f886e9bc>] e1000_clean_rx_irq+0x23c/0x300 [e1000e] [321478.840858] [<f886af59>] e1000_clean+0x49/0x1f0 [e1000e] [321478.840858] [<c02c3f58>] net_rx_action+0xf8/0x1b0 [321478.840858] [<c012a062>] __do_softirq+0x82/0x100 [321478.840858] [<c012a117>] do_softirq+0x37/0x40 [321478.840858] [<c0107120>] do_IRQ+0x40/0x80 [321478.840858] [<c01055a3>] common_interrupt+0x23/0x28 [321478.840858] [<c010a5e2>] mwait_idle+0x32/0x40 [321478.840858] [<c010a5b0>] mwait_idle+0x0/0x40 [321478.840858] [<c01036e8>] cpu_idle+0x48/0xc0 [321478.840858] ======================= [321478.840858] Code: 24 83 c4 0c c3 89 56 04 eb e3 8d 76 00 8d bc 27 00 00 00 00 55 89 d5 57 89 c7 56 53 90 8d b4 26 00 00 00 00 8b 1f 83 e3 fc 74 32 <8b> 03 89 d9 a8 01 75 2a 89 c6 83 e6 fc 8b 56 08 39 d3 74 45 85 [ 2251.728719] BUG: NMI Watchdog detected LOCKUP on CPU3, ip c01fafd4, registers: [ 2251.728719] Modules linked in: netconsole i2c_i801 i2c_core e1000e e1000 [ 2251.728719] [ 2251.728719] Pid: 0, comm: swapper Not tainted (2.6.26.5-fw #1) [ 2251.728719] EIP: 0060:[<c01fafd4>] EFLAGS: 00000082 CPU: 3 [ 2251.728719] EIP is at rb_insert_color+0x24/0xc0 [ 2251.728719] EAX: f6c134a4 EBX: f6c134a4 ECX: f6c134a4 EDX: f6c134a4 [ 2251.728719] ESI: f6c134a4 EDI: f6c134a4 EBP: c202d0d4 ESP: f7c5fcac [ 2251.728719] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 [ 2251.728719] Process swapper (pid: 0, ti=f7c5e000 task=f7c32940 task.ti=f7c5e000) [ 2251.728719] Stack: f6c134a4 00000000 c202d0cc c202d0d4 c013a8ff f6c134a4 c202d0cc c20230cc [ 2251.728719] c04450a0 c013adea 00000000 f7c5fcfc 392e7c00 0000020c 00000001 00000286 [ 2251.728719] f6c13000 ffffffff 
00000000 00000000 c02d15fe 00000000 f6c13000 c02d6da6 [ 2251.728719] Call Trace: [ 2251.728719] [<c013a8ff>] enqueue_hrtimer+0x5f/0x80 [ 2251.728719] [<c013adea>] hrtimer_start+0xaa/0x130 [ 2251.728719] [<c02d15fe>] qdisc_watchdog_schedule+0x1e/0x30 [ 2251.728719] [<c02d6da6>] htb_dequeue+0x6a6/0x810 [ 2251.728719] [<c02d047c>] __qdisc_run+0x19c/0x1d0 [ 2251.728719] [<c02d71c0>] htb_enqueue+0x0/0x1e0 [ 2251.728719] [<c02c4cb7>] dev_queue_xmit+0x267/0x380 [ 2251.728719] [<c02e6040>] ip_forward_finish+0x0/0x40 [ 2251.728719] [<c02e8bef>] ip_finish_output+0x11f/0x280 [ 2251.728719] [<c02e630f>] ip_forward+0x28f/0x2d0 [ 2251.728719] [<c02e6065>] ip_forward_finish+0x25/0x40 [ 2251.728719] [<c02e4ba2>] ip_rcv_finish+0x122/0x360 [ 2251.728719] [<c016d7e9>] add_partial+0x19/0x60 [ 2251.728719] [<c016e8d9>] __slab_free+0x169/0x290 [ 2251.728719] [<c02e5020>] ip_rcv+0x0/0x290 [ 2251.728719] [<c02c1b4b>] netif_receive_skb+0x26b/0x470 [ 2251.728719] [<f886c74d>] e1000_receive_skb+0x4d/0x1b0 [e1000e] [ 2251.728719] [<f886f9bc>] e1000_clean_rx_irq+0x23c/0x300 [e1000e] [ 2251.728719] [<f886bf59>] e1000_clean+0x49/0x1f0 [e1000e] [ 2251.728719] [<c02c3f58>] net_rx_action+0xf8/0x1b0 [ 2251.728719] [<c012a062>] __do_softirq+0x82/0x100 [ 2251.728719] [<c012a117>] do_softirq+0x37/0x40 [ 2251.728719] [<c0107120>] do_IRQ+0x40/0x80 [ 2251.728719] [<c01055a3>] common_interrupt+0x23/0x28 [ 2251.728719] [<c010a5e2>] mwait_idle+0x32/0x40 [ 2251.728719] [<c010a5b0>] mwait_idle+0x0/0x40 [ 2251.728719] [<c01036e8>] cpu_idle+0x48/0xc0 [ 2251.728719] ======================= [ 2251.728719] Code: 8d bc 27 00 00 00 00 55 89 d5 57 89 c7 56 53 90 8d b4 26 00 00 00 00 8b 1f 83 e3 fc 74 32 8b 03 89 d9 a8 01 75 2a 89 c6 83 e6 fc <8b> 56 08 39 d3 74 45 85 d2 74 25 8b 02 a8 01 75 1f 83 c8 01 89 [196496.545559] BUG: NMI Watchdog detected LOCKUP on CPU3, ip c01faf4c, registers: [196496.545559] Modules linked in: netconsole i2c_i801 e1000e e1000 i2c_core [196496.545559] [196496.545559] Pid: 0, comm: swapper 
Not tainted (2.6.26.5-fw #1) [196496.545559] EIP: 0060:[<c01faf4c>] EFLAGS: 00000096 CPU: 3 [196496.545559] EIP is at __rb_rotate_right+0xc/0x70 [196496.545559] EAX: f741f4a4 EBX: f741f4a4 ECX: f741f4a4 EDX: c202d0d4 [196496.545559] ESI: f741f4a4 EDI: f741f4a4 EBP: c202d0d4 ESP: f7c5fc9c [196496.545559] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 [196496.545559] Process swapper (pid: 0, ti=f7c5e000 task=f7c32940 task.ti=f7c5e000) [196496.545559] Stack: f741f4a4 f741f4a4 f741f4a4 c01fb041 f741f4a4 00000000 c202d0cc c202d0d4 [196496.545559] c013a8ff f741f4a4 c202d0cc c20230cc c04450a0 c013adea 00000000 f7c5fcfc [196496.545559] 8af6d400 0000b2b5 00000001 00000286 f741f000 ffffffff 00000000 00000000 [196496.545559] Call Trace: [196496.545559] [<c01fb041>] rb_insert_color+0x91/0xc0 [196496.545559] [<c013a8ff>] enqueue_hrtimer+0x5f/0x80 [196496.545559] [<c013adea>] hrtimer_start+0xaa/0x130 [196496.545559] [<c02d15fe>] qdisc_watchdog_schedule+0x1e/0x30 [196496.545559] [<c02d6da6>] htb_dequeue+0x6a6/0x810 [196496.545559] [<c02d047c>] __qdisc_run+0x19c/0x1d0 [196496.545559] [<c02d71c0>] htb_enqueue+0x0/0x1e0 [196496.545559] [<c02c4cb7>] dev_queue_xmit+0x267/0x380 [196496.545559] [<c02e6040>] ip_forward_finish+0x0/0x40 [196496.545559] [<c02e8bef>] ip_finish_output+0x11f/0x280 [196496.545559] [<c02e630f>] ip_forward+0x28f/0x2d0 [196496.545559] [<c02e6065>] ip_forward_finish+0x25/0x40 [196496.545559] [<c02e4ba2>] ip_rcv_finish+0x122/0x360 [196496.545559] [<c02be072>] __netdev_alloc_skb+0x22/0x50 [196496.545559] [<c033091c>] notifier_call_chain+0x3c/0x80 [196496.545559] [<c02e5020>] ip_rcv+0x0/0x290 [196496.545559] [<c02c1b4b>] netif_receive_skb+0x26b/0x470 [196496.545559] [<c02be072>] __netdev_alloc_skb+0x22/0x50 [196496.545559] [<f886774d>] e1000_receive_skb+0x4d/0x1b0 [e1000e] [196496.545559] [<f886a9bc>] e1000_clean_rx_irq+0x23c/0x300 [e1000e] [196496.545559] [<f8866f59>] e1000_clean+0x49/0x1f0 [e1000e] [196496.545559] [<c02c3f58>] net_rx_action+0xf8/0x1b0 [196496.545559] 
[<c012a062>] __do_softirq+0x82/0x100 [196496.545559] [<c012a117>] do_softirq+0x37/0x40 [196496.545559] [<c0107120>] do_IRQ+0x40/0x80 [196496.545559] [<c0114077>] smp_apic_timer_interrupt+0x57/0x90 [196496.545559] [<c01055a3>] common_interrupt+0x23/0x28 [196496.545559] [<c010a5e2>] mwait_idle+0x32/0x40 [196496.545559] [<c010a5b0>] mwait_idle+0x0/0x40 [196496.545559] [<c01036e8>] cpu_idle+0x48/0xc0 [196496.545559] ======================= [196496.545559] Code: 24 08 83 e0 03 09 d0 89 03 8b 1c 24 83 c4 0c c3 89 56 08 eb e3 8d 76 00 8d bc 27 00 00 00 00 83 ec 0c 89 1c 24 89 c3 89 7c 24 08 <89> d7 89 74 24 04 8b 50 08 8b 30 8b 4a 04 83 e6 fc 85 c9 89 48 [23749.920305] BUG: NMI Watchdog detected LOCKUP on CPU3, ip c01faed0, registers: [23749.920305] Modules linked in: netconsole e1000e i2c_i801 e1000 i2c_core [23749.920305] [23749.920305] Pid: 0, comm: swapper Not tainted (2.6.26.5-fw #1) [23749.920305] EIP: 0060:[<c01faed0>] EFLAGS: 00000082 CPU: 3 [23749.920305] EIP is at __rb_rotate_left+0x0/0x70 [23749.920305] EAX: f72914a4 EBX: f72914a4 ECX: f72914a4 EDX: c202d0d4 [23749.920305] ESI: f72914a4 EDI: f72914a4 EBP: c202d0d4 ESP: f7c5fca8 [23749.920305] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 [23749.920305] Process swapper (pid: 0, ti=f7c5e000 task=f7c32940 task.ti=f7c5e000) [23749.920305] Stack: c01fb018 f72914a4 00000000 c202d0cc c202d0d4 c013a8ff f72914a4 c202d0cc [23749.920305] c20190cc c04450a0 c013adea 00000000 f7c5fcfc d5dfdc00 00001598 00000001 [23749.920305] 00000286 f7291000 ffffffff 00000000 00000000 c02d15fe 00000000 f7291000 [23749.920305] Call Trace: [23749.920305] [<c01fb018>] rb_insert_color+0x68/0xc0 [23749.920305] [<c013a8ff>] enqueue_hrtimer+0x5f/0x80 [23749.920305] [<c013adea>] hrtimer_start+0xaa/0x130 [23749.920305] [<c02d15fe>] qdisc_watchdog_schedule+0x1e/0x30 [23749.920305] [<c02d6da6>] htb_dequeue+0x6a6/0x810 [23749.920305] [<c02d047c>] __qdisc_run+0x19c/0x1d0 [23749.920305] [<c02d71c0>] htb_enqueue+0x0/0x1e0 [23749.920305] [<c02c4cb7>] 
dev_queue_xmit+0x267/0x380 [23749.920305] [<c02e6040>] ip_forward_finish+0x0/0x40 [23749.920305] [<c02e8bef>] ip_finish_output+0x11f/0x280 [23749.920305] [<c02e630f>] ip_forward+0x28f/0x2d0 [23749.920305] [<c02e6065>] ip_forward_finish+0x25/0x40 [23749.920305] [<c02e4ba2>] ip_rcv_finish+0x122/0x360 [23749.920305] [<c02bd007>] __alloc_skb+0x57/0x120 [23749.920305] [<c02e5020>] ip_rcv+0x0/0x290 [23749.920305] [<c02c1b4b>] netif_receive_skb+0x26b/0x470 [23749.920305] [<c02be072>] __netdev_alloc_skb+0x22/0x50 [23749.920305] [<f886774d>] e1000_receive_skb+0x4d/0x1b0 [e1000e] [23749.920305] [<f886a9bc>] e1000_clean_rx_irq+0x23c/0x300 [e1000e] [23749.920305] [<f8866f59>] e1000_clean+0x49/0x1f0 [e1000e] [23749.920305] [<c02c3f58>] net_rx_action+0xf8/0x1b0 [23749.920305] [<c012a062>] __do_softirq+0x82/0x100 [23749.920305] [<c012a117>] do_softirq+0x37/0x40 [23749.920305] [<c0107120>] do_IRQ+0x40/0x80 [23749.920305] [<c01055a3>] common_interrupt+0x23/0x28 [23749.920305] [<c010a5e2>] mwait_idle+0x32/0x40 [23749.920305] [<c010a5b0>] mwait_idle+0x0/0x40 [23749.920305] [<c01036e8>] cpu_idle+0x48/0xc0 [23749.920305] ======================= [23749.920305] Code: 35 3a c0 e8 c3 b0 f2 ff b8 01 00 00 00 8b 5c 24 0c 8b 74 24 10 8b 7c 24 14 83 c4 18 c3 90 90 90 90 90 90 90 90 90 90 90 90 90 90 <83> ec 0c 89 1c 24 89 c3 89 7c 24 08 89 d7 89 74 24 04 8b 50 04 Server 2: [17053.718192] BUG: NMI Watchdog detected LOCKUP on CPU1, ip c01fb02d, registers: [17053.718192] Modules linked in: netconsole e1000e i2c_i801 e1000 i2c_core [17053.718192] [17053.718192] Pid: 0, comm: swapper Not tainted (2.6.26.5-fw #1) [17053.718192] EIP: 0060:[<c01fb02d>] EFLAGS: 00000046 CPU: 1 [17053.718192] EIP is at rb_insert_color+0x7d/0xc0 [17053.718192] EAX: f3412ca4 EBX: f3412ca4 ECX: f3412ca4 EDX: 00000000 [17053.718192] ESI: f3412ca4 EDI: f3412ca4 EBP: c20190d4 ESP: f7c4dcac [17053.718192] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 [17053.718192] Process swapper (pid: 0, ti=f7c4c000 task=f7c314a0 
task.ti=f7c4c000) [17053.718192] Stack: f3412ca4 00000000 c20190cc c20190d4 c013a8ff f3412ca4 c20190cc c200f0cc [17053.718192] c044b0a0 c013adea 00000000 f7c4dcfc c121d800 00000f81 00000001 00000286 [17053.718192] f3412800 ffffffff 00000000 00000000 c02d527e 00000000 f3412800 c02daa26 [17053.718192] Call Trace: [17053.718192] [<c013a8ff>] enqueue_hrtimer+0x5f/0x80 [17053.718192] [<c013adea>] hrtimer_start+0xaa/0x130 [17053.718192] [<c02d527e>] qdisc_watchdog_schedule+0x1e/0x30 [17053.718192] [<c02daa26>] htb_dequeue+0x6a6/0x810 [17053.718192] [<c02d40fc>] __qdisc_run+0x19c/0x1d0 [17053.718192] [<c02dae40>] htb_enqueue+0x0/0x1e0 [17053.718192] [<c02c86b7>] dev_queue_xmit+0x267/0x380 [17053.718192] [<c02e9cc0>] ip_forward_finish+0x0/0x40 [17053.718192] [<c02ec86f>] ip_finish_output+0x11f/0x280 [17053.718192] [<c02e9f8f>] ip_forward+0x28f/0x2d0 [17053.718192] [<c02e9ce5>] ip_forward_finish+0x25/0x40 [17053.718192] [<c02e8822>] ip_rcv_finish+0x122/0x360 [17053.718192] [<c02c0777>] __alloc_skb+0x57/0x120 [17053.718192] [<c02bfe68>] __kfree_skb+0x8/0x80 [17053.718192] [<f883565b>] e1000_unmap_and_free_tx_resource+0x5b/0x80 [e1000] [17053.718192] [<c02e8ca0>] ip_rcv+0x0/0x290 [17053.718192] [<c02c54fb>] netif_receive_skb+0x26b/0x470 [17053.718192] [<f886b74d>] e1000_receive_skb+0x4d/0x1b0 [e1000e] [17053.718192] [<f886e9bc>] e1000_clean_rx_irq+0x23c/0x300 [e1000e] [17053.718192] [<f886af59>] e1000_clean+0x49/0x1f0 [e1000e] [17053.718192] [<c02c790b>] net_rx_action+0xfb/0x200 [17053.718192] [<c012a062>] __do_softirq+0x82/0x100 [17053.718192] [<c012a117>] do_softirq+0x37/0x40 [17053.718192] [<c0107120>] do_IRQ+0x40/0x80 [17053.718192] [<c01055a3>] common_interrupt+0x23/0x28 [17053.718192] [<c010a5e2>] mwait_idle+0x32/0x40 [17053.718192] [<c010a5b0>] mwait_idle+0x0/0x40 [17053.718192] [<c01036e8>] cpu_idle+0x48/0xc0 [17053.718192] ======================= [17053.718192] Code: 5d c3 3b 7b 08 74 3d 83 09 01 89 ea 89 f0 83 26 fe e8 b8 fe ff ff eb a6 8d b6 00 00 00 00 8b 56 04 85 
d2 74 06 8b 02 a8 01 74 b8 <3b> 7b 04 74 23 83 09 01 89 ea 89 f0 83 26 fe e8 ff fe ff ff e9 [75131.217107] BUG: NMI Watchdog detected LOCKUP on CPU0, ip c01fb037, registers: [75131.217107] Modules linked in: netconsole i2c_i801 e1000e i2c_core e1000 [75131.217107] [75131.217107] Pid: 0, comm: swapper Not tainted (2.6.26.5-fw #1) [75131.217107] EIP: 0060:[<c01fb037>] EFLAGS: 00000086 CPU: 0 [75131.217107] EIP is at rb_insert_color+0x87/0xc0 [75131.217107] EAX: f56bb4a4 EBX: f56bb4a4 ECX: f56bb4a4 EDX: c200f0d4 [75131.217107] ESI: f56bb4a4 EDI: f56bb4a4 EBP: c200f0d4 ESP: c0411cdc [75131.217107] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 [75131.217107] Process swapper (pid: 0, ti=c0410000 task=c03df340 task.ti=c0410000) [75131.217107] Stack: f56bb4a4 00000000 c200f0cc c200f0d4 c013a8ff f56bb4a4 c200f0cc c200f0cc [75131.217107] c044b0a0 c013adea c013d06d c0411d2c d940e400 00004454 00000000 00000286 [75131.217107] f56bb000 ffffffff 00000000 00000000 c02d527e 00000000 f56bb000 c02daa26 [75131.217107] Call Trace: [75131.217107] [<c013a8ff>] enqueue_hrtimer+0x5f/0x80 [75131.217107] [<c013adea>] hrtimer_start+0xaa/0x130 [75131.217107] [<c013d06d>] getnstimeofday+0x3d/0xe0 [75131.217107] [<c02d527e>] qdisc_watchdog_schedule+0x1e/0x30 [75131.217107] [<c02daa26>] htb_dequeue+0x6a6/0x810 [75131.217107] [<c02d40fc>] __qdisc_run+0x19c/0x1d0 [75131.217107] [<c02dae40>] htb_enqueue+0x0/0x1e0 [75131.217107] [<c02c86b7>] dev_queue_xmit+0x267/0x380 [75131.217107] [<c02e9cc0>] ip_forward_finish+0x0/0x40 [75131.217107] [<c02ec86f>] ip_finish_output+0x11f/0x280 [75131.217107] [<c02e9f8f>] ip_forward+0x28f/0x2d0 [75131.217107] [<c02e9ce5>] ip_forward_finish+0x25/0x40 [75131.217107] [<c02e8822>] ip_rcv_finish+0x122/0x360 [75131.217107] [<c02c17e2>] __netdev_alloc_skb+0x22/0x50 [75131.217107] [<c02e8ca0>] ip_rcv+0x0/0x290 [75131.217107] [<c02c54fb>] netif_receive_skb+0x26b/0x470 [75131.217107] [<c02c17e2>] __netdev_alloc_skb+0x22/0x50 [75131.217107] [<f886c74d>] 
e1000_receive_skb+0x4d/0x1b0 [e1000e] [75131.217107] [<f886f9bc>] e1000_clean_rx_irq+0x23c/0x300 [e1000e] [75131.217107] [<f886bf59>] e1000_clean+0x49/0x1f0 [e1000e] [75131.217107] [<c02c790b>] net_rx_action+0xfb/0x200 [75131.217107] [<c012a062>] __do_softirq+0x82/0x100 [75131.217107] [<c012a117>] do_softirq+0x37/0x40 [75131.217107] [<c0107120>] do_IRQ+0x40/0x80 [75131.217107] [<c01055a3>] common_interrupt+0x23/0x28 [75131.217107] [<c010a5e2>] mwait_idle+0x32/0x40 [75131.217107] [<c010a5b0>] mwait_idle+0x0/0x40 [75131.217107] [<c01036e8>] cpu_idle+0x48/0xc0 [75131.217107] ======================= [75131.217107] Code: 89 ea 89 f0 83 26 fe e8 b8 fe ff ff eb a6 8d b6 00 00 00 00 8b 56 04 85 d2 74 06 8b 02 a8 01 74 b8 3b 7b 04 74 23 83 09 01 89 ea <89> f0 83 26 fe e8 ff fe ff ff e9 7a ff ff ff 89 ea 89 d8 e8 f1 [176617.218140] BUG: NMI Watchdog detected LOCKUP on CPU1, ip c01faee7, registers: [176617.218140] Modules linked in: netconsole e1000e e1000 i2c_i801 i2c_core [176617.218140] [176617.218140] Pid: 0, comm: swapper Not tainted (2.6.26.5-fw #1) [176617.218140] EIP: 0060:[<c01faee7>] EFLAGS: 00000096 CPU: 1 [176617.218140] EIP is at __rb_rotate_left+0x17/0x70 [176617.218140] EAX: f6c104a4 EBX: f6c104a4 ECX: f6c104a4 EDX: f6c104a4 [176617.218140] ESI: f6c104a4 EDI: c20190d4 EBP: c20190d4 ESP: f7c4dc9c [176617.218140] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 [176617.218140] Process swapper (pid: 0, ti=f7c4c000 task=f7c314a0 task.ti=f7c4c000) [176617.218140] Stack: f6c104a4 f6c104a4 f6c104a4 c01fb018 f6c104a4 00000000 c20190cc c20190d4 [176617.218140] c013a8ff f6c104a4 c20190cc c200f0cc c044b0a0 c013adea 00000000 f7c4dcfc [176617.218140] 06da7c00 0000a0a1 00000001 00000286 f6c10000 ffffffff 00000000 00000000 [176617.218140] Call Trace: [176617.218140] [<c01fb018>] rb_insert_color+0x68/0xc0 [176617.218140] [<c013a8ff>] enqueue_hrtimer+0x5f/0x80 [176617.218140] [<c013adea>] hrtimer_start+0xaa/0x130 [176617.218140] [<c02d527e>] qdisc_watchdog_schedule+0x1e/0x30 
[176617.218140] [<c02daa26>] htb_dequeue+0x6a6/0x810 [176617.218140] [<c02d40fc>] __qdisc_run+0x19c/0x1d0 [176617.218140] [<c02dae40>] htb_enqueue+0x0/0x1e0 [176617.218140] [<c02c86b7>] dev_queue_xmit+0x267/0x380 [176617.218140] [<c02e9cc0>] ip_forward_finish+0x0/0x40 [176617.218140] [<c02ec86f>] ip_finish_output+0x11f/0x280 [176617.218140] [<c02e9f8f>] ip_forward+0x28f/0x2d0 [176617.218140] [<c02e9ce5>] ip_forward_finish+0x25/0x40 [176617.218140] [<c02e8822>] ip_rcv_finish+0x122/0x360 [176617.218140] [<c016007b>] split_vma+0xb/0x130 [176617.218140] [<c016d7e9>] add_partial+0x19/0x60 [176617.218140] [<c016e8d9>] __slab_free+0x169/0x290 [176617.218140] [<c02e8ca0>] ip_rcv+0x0/0x290 [176617.218140] [<c02c54fb>] netif_receive_skb+0x26b/0x470 [176617.218140] [<c02c17e2>] __netdev_alloc_skb+0x22/0x50 [176617.218140] [<f886774d>] e1000_receive_skb+0x4d/0x1b0 [e1000e] [176617.218140] [<f886a9bc>] e1000_clean_rx_irq+0x23c/0x300 [e1000e] [176617.218140] [<f8866f59>] e1000_clean+0x49/0x1f0 [e1000e] [176617.218140] [<c02c790b>] net_rx_action+0xfb/0x200 [176617.218140] [<c012a062>] __do_softirq+0x82/0x100 [176617.218140] [<c012a117>] do_softirq+0x37/0x40 [176617.218140] [<c0107120>] do_IRQ+0x40/0x80 [176617.218140] [<c01055a3>] common_interrupt+0x23/0x28 [176617.218140] [<c010a5e2>] mwait_idle+0x32/0x40 [176617.218140] [<c010a5b0>] mwait_idle+0x0/0x40 [176617.218140] [<c01036e8>] cpu_idle+0x48/0xc0 [176617.218140] ======================= [176617.218140] Code: 24 14 83 c4 18 c3 90 90 90 90 90 90 90 90 90 90 90 90 90 90 83 ec 0c 89 1c 24 89 c3 89 7c 24 08 89 d7 89 74 24 04 8b 50 04 8b 30 <8b> 4a 08 83 e6 fc 85 c9 89 48 04 74 09 8b 01 83 e0 03 09 d8 89 [121302.565021] BUG: NMI Watchdog detected LOCKUP on CPU1, ip c01faf10, registers: [121302.565021] Modules linked in: netconsole e1000e i2c_i801 e1000 i2c_core [121302.565021] [121302.565021] Pid: 0, comm: swapper Not tainted (2.6.26.5-fw #1) [121302.565021] EIP: 0060:[<c01faf10>] EFLAGS: 00000046 CPU: 1 [121302.565021] EIP is at 
__rb_rotate_left+0x40/0x70 [121302.565021] EAX: f7d5c4a4 EBX: f7d5c4a4 ECX: 00000000 EDX: f7d5c4a4 [121302.565021] ESI: f7d5c4a4 EDI: c20190d4 EBP: c20190d4 ESP: f7c4dc9c [121302.565021] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 [121302.565021] Process swapper (pid: 0, ti=f7c4c000 task=f7c314a0 task.ti=f7c4c000) [121302.565021] Stack: f7d5c4a4 f7d5c4a4 f7d5c4a4 c01fb018 f7d5c4a4 00000000 c20190cc c20190d4 [121302.565021] c013a8ff f7d5c4a4 c20190cc c20230cc c044b0a0 c013adea 00000000 f7c4dcfc [121302.565021] 1467b400 00006e52 00000001 00000286 f7d5c000 ffffffff 00000000 00000000 [121302.565021] Call Trace: [121302.565021] [<c01fb018>] rb_insert_color+0x68/0xc0 [121302.565021] [<c013a8ff>] enqueue_hrtimer+0x5f/0x80 [121302.565021] [<c013adea>] hrtimer_start+0xaa/0x130 [121302.565021] [<c02d527e>] qdisc_watchdog_schedule+0x1e/0x30 [121302.565021] [<c02daa26>] htb_dequeue+0x6a6/0x810 [121302.565021] [<c02d40fc>] __qdisc_run+0x19c/0x1d0 [121302.565021] [<c02dae40>] htb_enqueue+0x0/0x1e0 [121302.565021] [<c02c86b7>] dev_queue_xmit+0x267/0x380 [121302.565021] [<c02e9cc0>] ip_forward_finish+0x0/0x40 [121302.565021] [<c02ec86f>] ip_finish_output+0x11f/0x280 [121302.565021] [<c02e9f8f>] ip_forward+0x28f/0x2d0 [121302.565021] [<c02e9ce5>] ip_forward_finish+0x25/0x40 [121302.565021] [<c02e8822>] ip_rcv_finish+0x122/0x360 [121302.565021] [<c013a8ff>] enqueue_hrtimer+0x5f/0x80 [121302.565021] [<c02e8ca0>] ip_rcv+0x0/0x290 [121302.565021] [<c02c54fb>] netif_receive_skb+0x26b/0x470 [121302.565021] [<f886774d>] e1000_receive_skb+0x4d/0x1b0 [e1000e] [121302.565021] [<f886a9bc>] e1000_clean_rx_irq+0x23c/0x300 [e1000e] [121302.565021] [<f8866f59>] e1000_clean+0x49/0x1f0 [e1000e] [121302.565021] [<c02c790b>] net_rx_action+0xfb/0x200 [121302.565021] [<c012a062>] __do_softirq+0x82/0x100 [121302.565021] [<c012a117>] do_softirq+0x37/0x40 [121302.565021] [<c0107120>] do_IRQ+0x40/0x80 [121302.565021] [<c01055a3>] common_interrupt+0x23/0x28 [121302.565021] [<c010a5e2>] mwait_idle+0x32/0x40 
[121302.565021] [<c010a5b0>] mwait_idle+0x0/0x40 [121302.565021] [<c01036e8>] cpu_idle+0x48/0xc0 [121302.565021] ======================= [121302.565021] Code: 8b 30 8b 4a 08 83 e6 fc 85 c9 89 48 04 74 09 8b 01 83 e0 03 09 d8 89 01 8b 02 89 5a 08 83 e0 03 09 f0 85 f6 89 02 74 0a 3b 5e 08 <74> 1f 89 56 04 eb 02 89 17 8b 03 8b 74 24 04 8b 7c 24 08 83 e0 [96112.953448] BUG: NMI Watchdog detected LOCKUP on CPU1, ip c01faf93, registers: [96112.953448] Modules linked in: netconsole e1000e i2c_i801 e1000 i2c_core [96112.953448] [96112.953448] Pid: 0, comm: swapper Not tainted (2.6.26.5-fw #1) [96112.953448] EIP: 0060:[<c01faf93>] EFLAGS: 00000046 CPU: 1 [96112.953448] EIP is at __rb_rotate_right+0x53/0x70 [96112.953448] EAX: f3287ca4 EBX: f3287ca4 ECX: 00000000 EDX: f3287ca4 [96112.953448] ESI: f3287ca4 EDI: f3287ca4 EBP: c20190d4 ESP: f7c4dc9c [96112.953448] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 [96112.953448] Process swapper (pid: 0, ti=f7c4c000 task=f7c314a0 task.ti=f7c4c000) [96112.953448] Stack: f3287ca4 f3287ca4 f3287ca4 c01fb041 f3287ca4 00000000 c20190cc c20190d4 [96112.953448] c013a8ff f3287ca4 c20190cc c200f0cc c044b0a0 c013adea 00000000 f7c4dcfc [96112.953448] 2ac32800 00005769 00000001 00000286 f3287800 ffffffff 00000000 00000000 [96112.953448] Call Trace: [96112.953448] [<c01fb041>] rb_insert_color+0x91/0xc0 [96112.953448] [<c013a8ff>] enqueue_hrtimer+0x5f/0x80 [96112.953448] [<c013adea>] hrtimer_start+0xaa/0x130 [96112.953448] [<c02d527e>] qdisc_watchdog_schedule+0x1e/0x30 [96112.953448] [<c02daa26>] htb_dequeue+0x6a6/0x810 [96112.953448] [<c02d40fc>] __qdisc_run+0x19c/0x1d0 [96112.953448] [<c02dae40>] htb_enqueue+0x0/0x1e0 [96112.953448] [<c02c86b7>] dev_queue_xmit+0x267/0x380 [96112.953448] [<c02e9cc0>] ip_forward_finish+0x0/0x40 [96112.953448] [<c02ec86f>] ip_finish_output+0x11f/0x280 [96112.953448] [<c02e9f8f>] ip_forward+0x28f/0x2d0 [96112.953448] [<c02e9ce5>] ip_forward_finish+0x25/0x40 [96112.953448] [<c02e8822>] ip_rcv_finish+0x122/0x360 
[96112.953448] [<c016d7e9>] add_partial+0x19/0x60 [96112.953448] [<c016d7e9>] add_partial+0x19/0x60 [96112.953448] [<c016e8d9>] __slab_free+0x169/0x290 [96112.953448] [<c011c700>] find_busiest_group+0x180/0x740 [96112.953448] [<c02e8ca0>] ip_rcv+0x0/0x290 [96112.953448] [<c02c54fb>] netif_receive_skb+0x26b/0x470 [96112.953448] [<f886774d>] e1000_receive_skb+0x4d/0x1b0 [e1000e] [96112.953448] [<f886a9bc>] e1000_clean_rx_irq+0x23c/0x300 [e1000e] [96112.953448] [<f8866f59>] e1000_clean+0x49/0x1f0 [e1000e] [96112.953448] [<c02c790b>] net_rx_action+0xfb/0x200 [96112.953448] [<c012a062>] __do_softirq+0x82/0x100 [96112.953448] [<c012a117>] do_softirq+0x37/0x40 [96112.953448] [<c0107120>] do_IRQ+0x40/0x80 [96112.953448] [<c01055a3>] common_interrupt+0x23/0x28 [96112.953448] [<c010a5e2>] mwait_idle+0x32/0x40 [96112.953448] [<c010a5b0>] mwait_idle+0x0/0x40 [96112.953448] [<c01036e8>] cpu_idle+0x48/0xc0 [96112.953448] ======================= [96112.953448] Code: 03 09 d8 89 01 8b 02 89 5a 04 83 e0 03 09 f0 85 f6 89 02 74 0a 3b 5e 04 74 1f 89 56 08 eb 02 89 17 8b 03 8b 74 24 04 8b 7c 24 08 <83> e0 03 09 d0 89 03 8b 1c 24 83 c4 0c c3 89 56 04 eb e3 8d 76 Server 3: [ 8518.194288] BUG: NMI Watchdog detected LOCKUP on CPU3, ip c01faf57, registers: [ 8518.194288] Modules linked in: netconsole i2c_i801 e1000e i2c_core e1000 [ 8518.194288] [ 8518.194288] Pid: 0, comm: swapper Not tainted (2.6.26.5-fw #1) [ 8518.194288] EIP: 0060:[<c01faf57>] EFLAGS: 00000092 CPU: 3 [ 8518.194288] EIP is at __rb_rotate_right+0x17/0x70 [ 8518.194288] EAX: f52b14a4 EBX: f52b14a4 ECX: f52b14a4 EDX: f52b14a4 [ 8518.194288] ESI: f52b14a4 EDI: c202d0d4 EBP: c202d0d4 ESP: f7c5fc68 [ 8518.194288] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 [ 8518.194288] Process swapper (pid: 0, ti=f7c5e000 task=f7c32940 task.ti=f7c5e000) [ 8518.194288] Stack: f52b14a4 f52b14a4 f52b14a4 c01fb041 f52b14a4 00000000 c202d0cc c202d0d4 [ 8518.194288] c013a8ff f52b14a4 c202d0cc c200f0cc c044b0a0 c013adea 00000000 f7c5fcc8 [ 
8518.194288] 4bca9000 000007bf 00000001 00000286 f52b1000 ffffffff 00000000 00000000 [ 8518.194288] Call Trace: [ 8518.194288] [<c01fb041>] rb_insert_color+0x91/0xc0 [ 8518.194288] [<c013a8ff>] enqueue_hrtimer+0x5f/0x80 [ 8518.194288] [<c013adea>] hrtimer_start+0xaa/0x130 [ 8518.194288] [<c02d585e>] qdisc_watchdog_schedule+0x1e/0x30 [ 8518.194288] [<c02db006>] htb_dequeue+0x6a6/0x810 [ 8518.194288] [<c02d46dc>] __qdisc_run+0x19c/0x1d0 [ 8518.194288] [<c02db420>] htb_enqueue+0x0/0x1e0 [ 8518.194288] [<c02c8c97>] dev_queue_xmit+0x267/0x380 [ 8518.194288] [<c02d3c9b>] eth_header+0x2b/0xc0 [ 8518.194288] [<c02ce0ab>] neigh_resolve_output+0xdb/0x280 [ 8518.194288] [<c02ea2a0>] ip_forward_finish+0x0/0x40 [ 8518.194288] [<c02ece4f>] ip_finish_output+0x11f/0x280 [ 8518.194288] [<c02ea56f>] ip_forward+0x28f/0x2d0 [ 8518.194288] [<c02ea2c5>] ip_forward_finish+0x25/0x40 [ 8518.194288] [<c02e8e02>] ip_rcv_finish+0x122/0x360 [ 8518.194288] [<c02c0d57>] __alloc_skb+0x57/0x120 [ 8518.194288] [<c0109c6a>] nommu_map_single+0x2a/0x60 [ 8518.194288] [<c02e9280>] ip_rcv+0x0/0x290 [ 8518.194288] [<c02c5adb>] netif_receive_skb+0x26b/0x470 [ 8518.194288] [<f886c74d>] e1000_receive_skb+0x4d/0x1b0 [e1000e] [ 8518.194288] [<f886f9bc>] e1000_clean_rx_irq+0x23c/0x300 [e1000e] [ 8518.194288] [<f886bf59>] e1000_clean+0x49/0x1f0 [e1000e] [ 8518.194288] [<c02c7eeb>] net_rx_action+0xfb/0x200 [ 8518.194288] [<c012a062>] __do_softirq+0x82/0x100 [ 8518.194288] [<c012a117>] do_softirq+0x37/0x40 [ 8518.194288] [<c0107120>] do_IRQ+0x40/0x80 [ 8518.194288] [<c01055a3>] common_interrupt+0x23/0x28 [ 8518.194288] [<c010a5e2>] mwait_idle+0x32/0x40 [ 8518.194288] [<c010a5b0>] mwait_idle+0x0/0x40 [ 8518.194288] [<c01036e8>] cpu_idle+0x48/0xc0 [ 8518.194288] ======================= [ 8518.194288] Code: 24 83 c4 0c c3 89 56 08 eb e3 8d 76 00 8d bc 27 00 00 00 00 83 ec 0c 89 1c 24 89 c3 89 7c 24 08 89 d7 89 74 24 04 8b 50 08 8b 30 <8b> 4a 04 83 e6 fc 85 c9 89 48 08 74 09 8b 01 83 e0 03 09 d8 89
2.6.26.6 Bug still here. [ 5280.696710] BUG: NMI Watchdog detected LOCKUP<3>e1000: eth0: e1000_clean_tx_irq: Detected Tx Unit Hang [ 5280.696710] Tx Queue <0> [ 5280.696710] TDH <18> [ 5280.696710] TDT <18> [ 5280.696710] next_to_use <18> [ 5280.696710] next_to_clean <6d> [ 5280.696710] buffer_info[next_to_clean] [ 5280.696710] time_stamp <4bf406> [ 5280.696710] next_to_watch <6d> [ 5280.696710] jiffies <4c02a9> [ 5280.696710] next_to_watch.status <1> [ 5280.696710] on CPU3, ip c01fafb0, registers: [ 5280.696710] Modules linked in: netconsole i2c_i801 e1000e e1000 i2c_core [ 5280.696710] [ 5280.696710] Pid: 0, comm: swapper Not tainted (2.6.26.6-fw #1) [ 5280.696710] EIP: 0060:[<c01fafb0>] EFLAGS: 00000096 CPU: 3 [ 5280.696710] EIP is at rb_insert_color+0x10/0xc0 [ 5280.696710] EAX: f55554a4 EBX: f55554a4 ECX: 00000000 EDX: f55554a4 [ 5280.696710] ESI: f55554a4 EDI: f55554a4 EBP: c202d0d4 ESP: f7c5fe04 [ 5280.696710] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 [ 5280.696710] Process swapper (pid: 0, ti=f7c5e000 task=f7c32940 task.ti=f7c5e000) [ 5280.696710] Stack: f55554a4 00000000 c202d0cc c202d0d4 c013aa4f f55554a4 c202d0cc c202d0cc [ 5280.696710] c044b0a0 c013af3a c013d1bd f7c5fe54 d948bc00 000004cc 00000000 00000286 [ 5280.696710] f5555000 ffffffff 00000000 00000000 c02d521e 00000000 f5555000 c02da9d6 [ 5280.696710] Call Trace: [ 5280.696710] [<c013aa4f>] enqueue_hrtimer+0x5f/0x80 [ 5280.696710] [<c013af3a>] hrtimer_start+0xaa/0x130 [ 5280.696710] [<c013d1bd>] getnstimeofday+0x3d/0xe0 [ 5280.696710] [<c02d521e>] qdisc_watchdog_schedule+0x1e/0x30 [ 5280.696710] [<c02da9d6>] htb_dequeue+0x6a6/0x810 [ 5280.696710] [<c02d409c>] __qdisc_run+0x19c/0x1d0 [ 5280.696710] [<c013b19d>] hrtimer_run_pending+0x1d/0x90 [ 5280.696710] [<c02c7a6e>] net_tx_action+0xbe/0xf0 [ 5280.696710] [<c012a1c2>] __do_softirq+0x82/0x100 [ 5280.696710] [<c012a277>] do_softirq+0x37/0x40 [ 5280.696710] [<c0107120>] do_IRQ+0x40/0x80 [ 5280.696710] [<c01055a3>] common_interrupt+0x23/0x28 [ 
5280.696710] [<c010a602>] mwait_idle+0x32/0x40 [ 5280.696710] [<c010a5d0>] mwait_idle+0x0/0x40 [ 5280.696710] [<c01036e8>] cpu_idle+0x48/0xc0 [ 5280.696710] ======================= [ 5280.696710] Code: 03 09 d0 89 03 8b 1c 24 83 c4 0c c3 89 56 04 eb e3 8d 76 00 8d bc 27 00 00 00 00 55 89 d5 57 89 c7 56 53 90 8d b4 26 00 00 00 00 <8b> 1f 83 e3 fc 74 32 8b 03 89 d9 a8 01 75 2a 89 c6 83 e6 fc 8b
[ 6951.841662] BUG: NMI Watchdog detected LOCKUP on CPU3, ip c01fde4c, registers: [ 6951.841662] Modules linked in: sch_sfq sch_htb netconsole e1000 i2c_i801 e1000e i2c_core [ 6951.841662] [ 6951.841662] Pid: 0, comm: swapper Not tainted (2.6.27-fw #1) [ 6951.841662] EIP: 0060:[<c01fde4c>] EFLAGS: 00000092 CPU: 3 [ 6951.841662] EIP is at __rb_rotate_right+0xc/0x70 [ 6951.841662] EAX: f70c3c68 EBX: f70c3c68 ECX: f70c3c68 EDX: c202c134 [ 6951.841662] ESI: f70c3c68 EDI: f70c3c68 EBP: c202c134 ESP: f785fc2c [ 6951.841662] DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 [ 6951.841662] Process swapper (pid: 0, ti=f785e000 task=f7832940 task.ti=f785e000) [ 6951.841662] Stack: f70c3c68 f70c3c68 f70c3c68 c01fdf41 f70c3c68 00000000 c202c12c c202c134 [ 6951.841662] c013a91f f70c3c68 c202c12c c202212c c045b100 c013ae0a 00000000 c013d63d [ 6951.841662] 9a011800 00000652 00000001 00000282 00000652 f70c3c68 00000000 00000000 [ 6951.841662] Call Trace: [ 6951.841662] [<c01fdf41>] rb_insert_color+0x91/0xc0 [ 6951.841662] [<c013a91f>] enqueue_hrtimer+0x5f/0x80 [ 6951.841662] [<c013ae0a>] hrtimer_start+0xaa/0x130 [ 6951.841662] [<c013d63d>] getnstimeofday+0x3d/0xe0 [ 6951.841662] [<c02de83d>] qdisc_watchdog_schedule+0x3d/0x50 [ 6951.841662] [<f88ac343>] htb_dequeue+0x683/0x7b0 [sch_htb] [ 6951.841662] [<c02ce692>] dev_hard_start_xmit+0x1d2/0x2c0 [ 6951.841662] [<c02dc87a>] __qdisc_run+0x13a/0x1d0 [ 6951.841662] [<c02d0ed7>] dev_queue_xmit+0x227/0x4f0 [ 6951.841662] [<c02f29ff>] ip_finish_output+0x11f/0x280 [ 6951.841662] [<c02f00e0>] ip_forward+0x290/0x310 [ 6951.841662] [<c02efe35>] ip_forward_finish+0x25/0x40 [ 6951.841662] [<c02ee9a2>] ip_rcv_finish+0x122/0x360 [ 6951.841662] [<c02c8cc6>] __alloc_skb+0x36/0x120 [ 6951.841662] [<c02c9d02>] __netdev_alloc_skb+0x22/0x50 [ 6951.841662] [<c02eee20>] ip_rcv+0x0/0x290 [ 6951.841662] [<c02ce064>] netif_receive_skb+0x274/0x4d0 [ 6951.841662] [<c0108b1a>] nommu_map_single+0x2a/0x60 [ 6951.841662] [<f883be39>] e1000_receive_skb+0x49/0x80 
[e1000e] [ 6951.841662] [<f883e84c>] e1000_clean_rx_irq+0x23c/0x300 [e1000e] [ 6951.841662] [<f883b3ad>] e1000_clean+0x1bd/0x570 [e1000e] [ 6951.841662] [<c02d03bc>] net_rx_action+0x13c/0x200 [ 6951.841662] [<c0129b72>] __do_softirq+0x82/0x100 [ 6951.841662] [<c0129c27>] do_softirq+0x37/0x40 [ 6951.841662] [<c0106060>] do_IRQ+0x40/0x80 [ 6951.841662] [<c01134c7>] smp_apic_timer_interrupt+0x57/0x90 [ 6951.841662] [<c010457f>] common_interrupt+0x23/0x28 [ 6951.841662] [<c0109aa2>] mwait_idle+0x32/0x40 [ 6951.841662] [<c01026c8>] cpu_idle+0x48/0xe0 [ 6951.841662] ======================= [ 6951.841662] Code: 24 08 83 e0 03 09 d0 89 03 8b 1c 24 83 c4 0c c3 89 56 08 eb e3 8d 76 00 8d bc 27 00 00 00 00 83 ec 0c 89 1c 24 89 c3 89 7c 24 08 <89> d7 89 74 24 04 8b 50 08 8b 30 8b 4a 04 83 e6 fc 85 c9 89 48
Now reproduced on 2.6.27 as well!
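Every lockup report in this thread stops inside the red-black tree insertion helpers reached from enqueue_hrtimer. As a rough aid for anyone collecting more netconsole output, here is a minimal Python sketch that tallies the "EIP is at" function names from a flattened log. The helper name `eip_functions` and the `sample` string are illustrative only; the four sample entries are copied from the Server 1 traces above.

```python
import re
from collections import Counter

# Matches e.g. "EIP is at rb_insert_color+0x17/0xc0" and captures the symbol.
EIP_RE = re.compile(r"EIP is at (\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")

def eip_functions(log_text):
    """Count which function each lockup report says EIP was in."""
    return Counter(EIP_RE.findall(log_text))

# Hypothetical input: a few "EIP is at" entries taken from the reports above.
sample = (
    "[321478.840858] EIP is at rb_insert_color+0x17/0xc0 "
    "[ 2251.728719] EIP is at rb_insert_color+0x24/0xc0 "
    "[196496.545559] EIP is at __rb_rotate_right+0xc/0x70 "
    "[23749.920305] EIP is at __rb_rotate_left+0x0/0x70"
)

counts = eip_functions(sample)
for name, hits in counts.most_common():
    print(name, hits)
```

Feeding the whole netconsole capture to `eip_functions` instead of `sample` gives a quick check on whether a new hang matches this bug or is something else.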
INFO: This bug is being tracked on netdev under the subject "deadlocks if use htb".
Summary of tests. Jarek's answer:

> Here is my current opinion on this bug:
>
> 1) I'm almost sure it's not a htb, but hrtimers bug (some race),
>
> 2) the htb patches you've tested are not "the proper" way of fixing
> it; I see substantial changes in hrtimers code in the "-tip" tree
> (probably for 2.6.29), which, probably, you'll be advised by
> hrtimers maintainers to try, and I guess, it's not easy on a
> production system,
>
> So, it's up to you:
>
> 1) since these patches work for you, you can stop with testing and
> wait with these patched kernels until 2.6.29 (I can propose this
> #2 patch as a temporary fix then),
>
> 2) for curiosity you could try this patch #4 alone on one box first
> (after reverting at least patch #2), but again: if it works, it
> could be only treated as a temporary hack, and alternative of #2.
>
> Thanks,
> Jarek P.

The problem is temporarily fixed for me (the system has not crashed for 1 week), and I can wait a long time for new kernels, but I can test hrtimer fixes if anyone is interested.
On Thu, Dec 18, 2008 at 03:42:52AM -0800, bugme-daemon@bugzilla.kernel.org wrote:
...
> Problem temporary fixed for me (system not crashed for 1 week) and i can wait
> for new kernels long time, but i can test hrtimer fixes if anyone intersted
> for this.

Sure we are. Here is a link to the patches in the -tip tree:

http://git.kernel.org/?p=linux/kernel/git/mingo/linux-2.6-sched-devel.git;a=history;f=kernel/hrtimer.c;h=b741f850426e5ba8841feca4c730f3da1c65f7b8;hb=HEAD

I mean the top three of Peter Zijlstra's "hrtimer: removing all ur callback modes" patches. They should apply to the current -linus or -net tree, but I didn't try to compile them.

Jarek P.
Per Jarek's suggestion, I ran 2.6.28 plus Peter Zijlstra's "hrtimer: removing all ur callback modes" patches dated 2008-11-25, 2008-12-04 and 2008-12-08. Uptime was 2 days 22 hours before I hit what appears to be an unrelated bug in the IPv6 FIB. (Reported on dev lists with subject 'panic with 2.6.28 while doing "ip -6 route"'.) I will continue testing with Zijlstra's patches...
I should add that with 2.6.28, without the Zijlstra patches, the system would hang after about an hour.
For the record: this bug is expected to be fixed now:

1) in the 2.6.29 tree by the above-mentioned Peter Zijlstra's changes to hrtimers,

2) in 2.6.28.2 and 2.6.27.13 by a temporary patch to sch_htb:
http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.28.y.git;a=commit;h=e46032840eae03a502638049468edc1167345c9c
http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.27.y.git;a=commit;h=9befaf375925471a49159d775b38d42c04e218a1

so this bug report could be closed.

Jarek P.
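For anyone auditing a fleet of shapers, the version thresholds in the comment above can be turned into a quick check. This is only a sketch under stated assumptions: the function names are made up for illustration, the release string is assumed to look like the output of `uname -r` (for example "2.6.26.5-fw"), and distribution kernels with backported patches or -rc builds are not accounted for.

```python
def parse_version(release):
    """Turn '2.6.27.13-fw' into (2, 6, 27, 13); any suffix after '-' is ignored."""
    numeric = release.split("-", 1)[0]
    return tuple(int(part) for part in numeric.split("."))

def expected_to_have_fix(release):
    """Apply the thresholds from the comment: 2.6.27.13, 2.6.28.2, or 2.6.29 and later."""
    v = parse_version(release)
    v = v + (0,) * (4 - len(v))  # pad '2.6.27' to (2, 6, 27, 0)
    if v[:3] == (2, 6, 27):
        return v[3] >= 13
    if v[:3] == (2, 6, 28):
        return v[3] >= 2
    return v[:3] >= (2, 6, 29)

# Releases that appear in this report, plus the fixed ones.
for release in ("2.6.26.5-fw", "2.6.27-fw", "2.6.27.13", "2.6.28", "2.6.28.2", "2.6.29"):
    print(release, expected_to_have_fix(release))
```

A result of False only means the release predates the versions named in the comment; it does not prove that a given box will hang.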
Fixed by
http://git.kernel.org/?p=linux/kernel/git/stable/linux-2.6.28.y.git;a=commit;h=e46032840eae03a502638049468edc1167345c9c

Thanks, all!