Latest working kernel version: 2.6.24 Distribution: gentoo Hardware Environment: lspci 00:00.0 Host bridge: Intel Corporation Server DRAM Controller 00:01.0 PCI bridge: Intel Corporation Server Host-Primary PCI Express Bridge 00:19.0 Ethernet controller: Intel Corporation 82566DM-2 Gigabit Network Connection (rev 02) 00:1a.0 USB Controller: Intel Corporation USB UHCI Controller #4 (rev 02) 00:1a.1 USB Controller: Intel Corporation USB UHCI Controller #5 (rev 02) 00:1a.2 USB Controller: Intel Corporation USB UHCI Controller #6 (rev 02) 00:1a.7 USB Controller: Intel Corporation USB2 EHCI Controller #2 (rev 02) 00:1c.0 PCI bridge: Intel Corporation PCI Express Port 1 (rev 02) 00:1c.4 PCI bridge: Intel Corporation PCI Express Port 5 (rev 02) 00:1d.0 USB Controller: Intel Corporation USB UHCI Controller #1 (rev 02) 00:1d.1 USB Controller: Intel Corporation USB UHCI Controller #2 (rev 02) 00:1d.2 USB Controller: Intel Corporation USB UHCI Controller #3 (rev 02) 00:1d.7 USB Controller: Intel Corporation USB2 EHCI Controller #1 (rev 02) 00:1e.0 PCI bridge: Intel Corporation 82801 PCI Bridge (rev 92) 00:1f.0 ISA bridge: Intel Corporation LPC Interface Controller (rev 02) 00:1f.2 SATA controller: Intel Corporation 6 port SATA AHCI Controller (rev 02) 00:1f.3 SMBus: Intel Corporation SMBus Controller (rev 02) 01:00.0 Ethernet controller: Intel Corporation 82572EI Gigabit Ethernet Controller (Copper) (rev 06) 03:00.0 VGA compatible controller: Matrox Graphics, Inc. MGA G200e [Pilot] ServerEngines (SEP1) (rev 02) 04:02.0 Ethernet controller: Intel Corporation 82541GI Gigabit Ethernet Controller (rev 05) cat /proc/cpuinfo processor : 0 vendor_id : GenuineIntel cpu family : 6 model : 15 model name : Intel(R) Xeon(R) CPU 3075 @ 2.66GHz stepping : 11 cpu MHz : 2660.095 cache size : 4096 KB physical id : 0 siblings : 2 core id : 0 cpu cores : 2 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 10 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr lahf_lm bogomips : 5324.11 clflush size : 64 processor : 1 vendor_id : GenuineIntel cpu family : 6 model : 15 model name : Intel(R) Xeon(R) CPU 3075 @ 2.66GHz stepping : 11 cpu MHz : 2660.095 cache size : 4096 KB physical id : 0 siblings : 2 core id : 1 cpu cores : 2 fdiv_bug : no hlt_bug : no f00f_bug : no coma_bug : no fpu : yes fpu_exception : yes cpuid level : 10 wp : yes flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe nx lm constant_tsc arch_perfmon pebs bts pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr lahf_lm bogomips : 5320.37 clflush size : 64 cat /proc/meminfo MemTotal: 2068148 kB MemFree: 1432804 kB Buffers: 203020 kB Cached: 253120 kB SwapCached: 0 kB Active: 204576 kB Inactive: 263872 kB HighTotal: 1177008 kB HighFree: 908920 kB LowTotal: 891140 kB LowFree: 523884 kB SwapTotal: 995896 kB SwapFree: 995896 kB Dirty: 1724 kB Writeback: 0 kB AnonPages: 12308 kB Mapped: 5072 kB Slab: 158532 kB SReclaimable: 74692 kB SUnreclaim: 83840 kB PageTables: 260 kB NFS_Unstable: 0 kB Bounce: 0 kB CommitLimit: 2029968 kB Committed_AS: 20436 kB VmallocTotal: 110584 kB VmallocUsed: 1876 kB VmallocChunk: 108612 kB HugePages_Total: 0 HugePages_Free: 0 HugePages_Rsvd: 0 HugePages_Surp: 0 Hugepagesize: 4096 kB Software Environment: bgp/quagga Problem Description: Pid: 0, comm: swapper Not tainted 2.6.25-rc6-git7 #1 [<c021a0bf>] warn_on_slowpath+0x40/0x4f [<c043c372>] fn_trie_lookup+0xe3/0x288 [<c043d61a>] fib4_rule_action+0x3d/0x4d [<c03f6417>] fib_rules_lookup+0x71/0xb6 [<c043d652>] fib_lookup+0x28/0x36 [<c023f126>] __rmqueue_smallest+0x83/0xe1 [<c023f197>] __rmqueue+0x13/0x172 [<c0211806>] kmap_atomic_prot+0x87/0x184 [<c023fe7c>] get_page_from_freelist+0x2c5/0x358 [<c023ff92>] __alloc_pages+0x71/0x2cf [<c0240229>] __get_free_pages+0x39/0x47 [<c03f10d0>] neigh_create+0x2d8/0x40e [<c045b85b>] _read_unlock_bh+0x5/0xd [<c03f0539>] neigh_lookup+0x92/0x9b [<c03f1241>] neigh_event_ns+0x3b/0x70 [<c0432523>] arp_process+0x1e5/0x534 [<c03edd5a>] dev_queue_xmit+0x279/0x29f [<c0419415>] ip_finish_output+0x1c6/0x1fc [<c03f81d5>] tc_classify+0x14/0x6b [<c03eb578>] netif_receive_skb+0x29f/0x30e [<c0357e63>] e1000_receive_skb+0x132/0x14c [<c0359ecf>] e1000_clean_rx_irq+0x1fa/0x29c [<c0356f82>] e1000_clean+0x29f/0x427 [<c03ed3ee>] net_rx_action+0x5c/0x14a [<c021e25e>] __do_softirq+0x5d/0xc1 [<c021e2f4>] do_softirq+0x32/0x36 [<c021e585>] irq_exit+0x35/0x67 [<c0204f79>] do_IRQ+0x73/0x82 [<c020343b>] common_interrupt+0x23/0x28 [<c0201377>] mwait_idle_with_hints+0x36/0x39 [<c020137a>] mwait_idle+0x0/0xa [<c0201817>] cpu_idle+0xa8/0xc8 ======================= ---[ end trace 6a93a9703f6a626e ]--- ------------[ cut here ]------------ WARNING: at arch/x86/mm/highmem_32.c:43 kmap_atomic_prot+0x87/0x184() Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.25-rc6-git7 #1 [<c021a0bf>] warn_on_slowpath+0x40/0x4f [<c043c372>] fn_trie_lookup+0xe3/0x288 [<c043d61a>] fib4_rule_action+0x3d/0x4d [<c03f6417>] fib_rules_lookup+0x71/0xb6 [<c043d652>] fib_lookup+0x28/0x36 [<c023f126>] __rmqueue_smallest+0x83/0xe1 [<c023f197>] __rmqueue+0x13/0x172 [<c0211806>] kmap_atomic_prot+0x87/0x184 [<c023fe7c>] get_page_from_freelist+0x2c5/0x358 [<c023ff92>] __alloc_pages+0x71/0x2cf [<c0240229>] __get_free_pages+0x39/0x47 [<c03f10d0>] neigh_create+0x2d8/0x40e [<c045b85b>] _read_unlock_bh+0x5/0xd [<c03f0539>] neigh_lookup+0x92/0x9b [<c03f1241>] neigh_event_ns+0x3b/0x70 [<c0432523>] arp_process+0x1e5/0x534 [<c03edd5a>] dev_queue_xmit+0x279/0x29f [<c0419415>] ip_finish_output+0x1c6/0x1fc [<c03f81d5>] tc_classify+0x14/0x6b [<c03eb578>] netif_receive_skb+0x29f/0x30e [<c0357e63>] e1000_receive_skb+0x132/0x14c [<c0359ecf>] e1000_clean_rx_irq+0x1fa/0x29c [<c0356f82>] e1000_clean+0x29f/0x427 [<c03ed3ee>] net_rx_action+0x5c/0x14a [<c021e25e>] __do_softirq+0x5d/0xc1 [<c021e2f4>] do_softirq+0x32/0x36 [<c021e585>] irq_exit+0x35/0x67 [<c0204f79>] do_IRQ+0x73/0x82 [<c020343b>] common_interrupt+0x23/0x28 [<c0201377>] mwait_idle_with_hints+0x36/0x39 [<c020137a>] mwait_idle+0x0/0xa [<c0201817>] cpu_idle+0xa8/0xc8 ======================= ---[ end trace 6a93a9703f6a626e ]--- ------------[ cut here ]------------ WARNING: at arch/x86/mm/highmem_32.c:43 kmap_atomic_prot+0x87/0x184() Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.25-rc6-git7 #1 [<c021a0bf>] warn_on_slowpath+0x40/0x4f [<c022ff00>] clockevents_program_event+0x2e/0xd2 [<c022ee6d>] clocksource_get_next+0x39/0x3f [<c022df23>] update_wall_time+0x552/0x6c7 [<c022df23>] update_wall_time+0x552/0x6c7 [<c022ee6d>] clocksource_get_next+0x39/0x3f [<c022df23>] update_wall_time+0x552/0x6c7 [<c022d8b8>] getnstimeofday+0x2b/0xb2 [<c022ff96>] clockevents_program_event+0xc4/0xd2 [<c023f126>] __rmqueue_smallest+0x83/0xe1 [<c023f197>] __rmqueue+0x13/0x172 [<c0211806>] kmap_atomic_prot+0x87/0x184 [<c023fe7c>] get_page_from_freelist+0x2c5/0x358 [<c023ff92>] __alloc_pages+0x71/0x2cf [<c043d61a>] fib4_rule_action+0x3d/0x4d [<c0240229>] __get_free_pages+0x39/0x47 [<c03f10d0>] neigh_create+0x2d8/0x40e [<c045b85b>] _read_unlock_bh+0x5/0xd [<c03f0539>] neigh_lookup+0x92/0x9b [<c04319f3>] arp_bind_neighbour+0x53/0x64 [<c0413343>] rt_intern_hash+0x257/0x32d [<c0414406>] ip_route_input+0x812/0xbc6 [<c0415c16>] ip_rcv_finish+0x2a/0x293 [<c03eb578>] netif_receive_skb+0x29f/0x30e [<c0357e63>] e1000_receive_skb+0x132/0x14c [<c0359ecf>] e1000_clean_rx_irq+0x1fa/0x29c [<c0356f82>] e1000_clean+0x29f/0x427 [<c022c395>] ktime_get_ts+0x11/0x3a [<c03ed3ee>] net_rx_action+0x5c/0x14a [<c021e25e>] __do_softirq+0x5d/0xc1 [<c021e2f4>] do_softirq+0x32/0x36 [<c021e585>] irq_exit+0x35/0x67 [<c0204f79>] do_IRQ+0x73/0x82 [<c020343b>] common_interrupt+0x23/0x28 [<c0201377>] mwait_idle_with_hints+0x36/0x39 [<c020137a>] mwait_idle+0x0/0xa [<c0201817>] cpu_idle+0xa8/0xc8 ======================= ---[ end trace 6a93a9703f6a626e ]--- ------------[ cut here ]------------ WARNING: at arch/x86/mm/highmem_32.c:43 kmap_atomic_prot+0x87/0x184() Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.25-rc6-git7 #1 [<c021a0bf>] warn_on_slowpath+0x40/0x4f [<c022ff00>] clockevents_program_event+0x2e/0xd2 [<c022ee6d>] clocksource_get_next+0x39/0x3f [<c022df23>] update_wall_time+0x552/0x6c7 [<c022df23>] update_wall_time+0x552/0x6c7 [<c022ee6d>] clocksource_get_next+0x39/0x3f [<c022df23>] update_wall_time+0x552/0x6c7 [<c022d8b8>] getnstimeofday+0x2b/0xb2 [<c022ff96>] clockevents_program_event+0xc4/0xd2 [<c023f126>] __rmqueue_smallest+0x83/0xe1 [<c023f197>] __rmqueue+0x13/0x172 [<c0211806>] kmap_atomic_prot+0x87/0x184 [<c023fe7c>] get_page_from_freelist+0x2c5/0x358 [<c023ff92>] __alloc_pages+0x71/0x2cf [<c043d61a>] fib4_rule_action+0x3d/0x4d [<c0240229>] __get_free_pages+0x39/0x47 [<c03f10d0>] neigh_create+0x2d8/0x40e [<c045b85b>] _read_unlock_bh+0x5/0xd [<c03f0539>] neigh_lookup+0x92/0x9b [<c04319f3>] arp_bind_neighbour+0x53/0x64 [<c0413343>] rt_intern_hash+0x257/0x32d [<c0414406>] ip_route_input+0x812/0xbc6 [<c0415c16>] ip_rcv_finish+0x2a/0x293 [<c03eb578>] netif_receive_skb+0x29f/0x30e [<c0357e63>] e1000_receive_skb+0x132/0x14c [<c0359ecf>] e1000_clean_rx_irq+0x1fa/0x29c [<c0356f82>] e1000_clean+0x29f/0x427 [<c022c395>] ktime_get_ts+0x11/0x3a [<c03ed3ee>] net_rx_action+0x5c/0x14a [<c021e25e>] __do_softirq+0x5d/0xc1 [<c021e2f4>] do_softirq+0x32/0x36 [<c021e585>] irq_exit+0x35/0x67 [<c0204f79>] do_IRQ+0x73/0x82 [<c020343b>] common_interrupt+0x23/0x28 [<c0201377>] mwait_idle_with_hints+0x36/0x39 [<c020137a>] mwait_idle+0x0/0xa [<c0201817>] cpu_idle+0xa8/0xc8 ======================= ---[ end trace 6a93a9703f6a626e ]--- ------------[ cut here ]------------ WARNING: at arch/x86/mm/highmem_32.c:43 kmap_atomic_prot+0x87/0x184() Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.25-rc6-git7 #1 [<c021a0bf>] warn_on_slowpath+0x40/0x4f [<c022ff00>] clockevents_program_event+0x2e/0xd2 [<c022ee6d>] clocksource_get_next+0x39/0x3f [<c022df23>] update_wall_time+0x552/0x6c7 [<c022df23>] update_wall_time+0x552/0x6c7 [<c022ee6d>] clocksource_get_next+0x39/0x3f [<c022df23>] update_wall_time+0x552/0x6c7 [<c022d8b8>] getnstimeofday+0x2b/0xb2 [<c022ff96>] clockevents_program_event+0xc4/0xd2 [<c023f126>] __rmqueue_smallest+0x83/0xe1 [<c023f197>] __rmqueue+0x13/0x172 [<c0211806>] kmap_atomic_prot+0x87/0x184 [<c023fe7c>] get_page_from_freelist+0x2c5/0x358 [<c023ff92>] __alloc_pages+0x71/0x2cf [<c043d61a>] fib4_rule_action+0x3d/0x4d [<c0240229>] __get_free_pages+0x39/0x47 [<c03f10d0>] neigh_create+0x2d8/0x40e [<c045b85b>] _read_unlock_bh+0x5/0xd [<c03f0539>] neigh_lookup+0x92/0x9b [<c04319f3>] arp_bind_neighbour+0x53/0x64 [<c0413343>] rt_intern_hash+0x257/0x32d [<c0414406>] ip_route_input+0x812/0xbc6 [<c0415c16>] ip_rcv_finish+0x2a/0x293 [<c03eb578>] netif_receive_skb+0x29f/0x30e [<c0357e63>] e1000_receive_skb+0x132/0x14c [<c0359ecf>] e1000_clean_rx_irq+0x1fa/0x29c [<c0356f82>] e1000_clean+0x29f/0x427 [<c022c395>] ktime_get_ts+0x11/0x3a [<c03ed3ee>] net_rx_action+0x5c/0x14a [<c021e25e>] __do_softirq+0x5d/0xc1 [<c021e2f4>] do_softirq+0x32/0x36 [<c021e585>] irq_exit+0x35/0x67 [<c0204f79>] do_IRQ+0x73/0x82 [<c020343b>] common_interrupt+0x23/0x28 [<c0201377>] mwait_idle_with_hints+0x36/0x39 [<c020137a>] mwait_idle+0x0/0xa [<c0201817>] cpu_idle+0xa8/0xc8 ======================= ---[ end trace 6a93a9703f6a626e ]--- ------------[ cut here ]------------ WARNING: at arch/x86/mm/highmem_32.c:43 kmap_atomic_prot+0x87/0x184() Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.25-rc6-git7 #1 [<c021a0bf>] warn_on_slowpath+0x40/0x4f [<c022ff00>] clockevents_program_event+0x2e/0xd2 [<c022ee6d>] clocksource_get_next+0x39/0x3f [<c022df23>] update_wall_time+0x552/0x6c7 [<c022df23>] update_wall_time+0x552/0x6c7 [<c022ee6d>] clocksource_get_next+0x39/0x3f [<c022df23>] update_wall_time+0x552/0x6c7 [<c022d8b8>] getnstimeofday+0x2b/0xb2 [<c022ff96>] clockevents_program_event+0xc4/0xd2 [<c023f126>] __rmqueue_smallest+0x83/0xe1 [<c023f197>] __rmqueue+0x13/0x172 [<c0211806>] kmap_atomic_prot+0x87/0x184 [<c023fe7c>] get_page_from_freelist+0x2c5/0x358 [<c023ff92>] __alloc_pages+0x71/0x2cf [<c043d61a>] fib4_rule_action+0x3d/0x4d [<c0240229>] __get_free_pages+0x39/0x47 [<c03f10d0>] neigh_create+0x2d8/0x40e [<c045b85b>] _read_unlock_bh+0x5/0xd [<c03f0539>] neigh_lookup+0x92/0x9b [<c04319f3>] arp_bind_neighbour+0x53/0x64 [<c0413343>] rt_intern_hash+0x257/0x32d [<c0414406>] ip_route_input+0x812/0xbc6 [<c0415c16>] ip_rcv_finish+0x2a/0x293 [<c03eb578>] netif_receive_skb+0x29f/0x30e [<c0357e63>] e1000_receive_skb+0x132/0x14c [<c0359ecf>] e1000_clean_rx_irq+0x1fa/0x29c [<c0356f82>] e1000_clean+0x29f/0x427 [<c022c395>] ktime_get_ts+0x11/0x3a [<c03ed3ee>] net_rx_action+0x5c/0x14a [<c021e25e>] __do_softirq+0x5d/0xc1 [<c021e2f4>] do_softirq+0x32/0x36 [<c021e585>] irq_exit+0x35/0x67 [<c0204f79>] do_IRQ+0x73/0x82 [<c020343b>] common_interrupt+0x23/0x28 [<c0201377>] mwait_idle_with_hints+0x36/0x39 [<c020137a>] mwait_idle+0x0/0xa [<c0201817>] cpu_idle+0xa8/0xc8 ======================= ---[ end trace 6a93a9703f6a626e ]---
Reply-To: akpm@linux-foundation.org On Tue, 25 Mar 2008 02:50:54 -0700 (PDT) bugme-daemon@bugzilla.kernel.org wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=10318 > > Summary: WARNING: at arch/x86/mm/highmem_32.c:43 > kmap_atomic_prot+0x87/0x184() > Product: Networking > Version: 2.5 > KernelVersion: 2.6.25-rc6-git7 > Platform: All > OS/Version: Linux > Tree: Mainline > Status: NEW > Severity: normal > Priority: P1 > Component: IPV4 > AssignedTo: shemminger@linux-foundation.org > ReportedBy: pstaszewski@artcom.pl > > > Latest working kernel version: 2.6.24 This is a post-2.6.24 regression. > Software Environment: bgp/quagga The app does a lot of route management stuff. > Problem Description: > Pid: 0, comm: swapper Not tainted 2.6.25-rc6-git7 #1 > [<c021a0bf>] warn_on_slowpath+0x40/0x4f > [<c043c372>] fn_trie_lookup+0xe3/0x288 > [<c043d61a>] fib4_rule_action+0x3d/0x4d > [<c03f6417>] fib_rules_lookup+0x71/0xb6 > [<c043d652>] fib_lookup+0x28/0x36 > [<c023f126>] __rmqueue_smallest+0x83/0xe1 > [<c023f197>] __rmqueue+0x13/0x172 > [<c0211806>] kmap_atomic_prot+0x87/0x184 > [<c023fe7c>] get_page_from_freelist+0x2c5/0x358 > [<c023ff92>] __alloc_pages+0x71/0x2cf > [<c0240229>] __get_free_pages+0x39/0x47 > [<c03f10d0>] neigh_create+0x2d8/0x40e > [<c045b85b>] _read_unlock_bh+0x5/0xd > [<c03f0539>] neigh_lookup+0x92/0x9b > [<c03f1241>] neigh_event_ns+0x3b/0x70 > [<c0432523>] arp_process+0x1e5/0x534 > [<c03edd5a>] dev_queue_xmit+0x279/0x29f > [<c0419415>] ip_finish_output+0x1c6/0x1fc > [<c03f81d5>] tc_classify+0x14/0x6b > [<c03eb578>] netif_receive_skb+0x29f/0x30e > [<c0357e63>] e1000_receive_skb+0x132/0x14c > [<c0359ecf>] e1000_clean_rx_irq+0x1fa/0x29c > [<c0356f82>] e1000_clean+0x29f/0x427 > [<c03ed3ee>] net_rx_action+0x5c/0x14a > [<c021e25e>] __do_softirq+0x5d/0xc1 > [<c021e2f4>] do_softirq+0x32/0x36 > [<c021e585>] irq_exit+0x35/0x67 > [<c0204f79>] do_IRQ+0x73/0x82 > [<c020343b>] common_interrupt+0x23/0x28 > [<c0201377>] mwait_idle_with_hints+0x36/0x39 > [<c020137a>] mwait_idle+0x0/0xa > [<c0201817>] cpu_idle+0xa8/0xc8 > ======================= > ---[ end trace 6a93a9703f6a626e ]--- > ------------[ cut here ]------------ This backtrace is a mess. > WARNING: at arch/x86/mm/highmem_32.c:43 kmap_atomic_prot+0x87/0x184() > Modules linked in: > Pid: 0, comm: swapper Not tainted 2.6.25-rc6-git7 #1 > [<c021a0bf>] warn_on_slowpath+0x40/0x4f > [<c043c372>] fn_trie_lookup+0xe3/0x288 > [<c043d61a>] fib4_rule_action+0x3d/0x4d > [<c03f6417>] fib_rules_lookup+0x71/0xb6 > [<c043d652>] fib_lookup+0x28/0x36 > [<c023f126>] __rmqueue_smallest+0x83/0xe1 > [<c023f197>] __rmqueue+0x13/0x172 > [<c0211806>] kmap_atomic_prot+0x87/0x184 > [<c023fe7c>] get_page_from_freelist+0x2c5/0x358 > [<c023ff92>] __alloc_pages+0x71/0x2cf > [<c0240229>] __get_free_pages+0x39/0x47 > [<c03f10d0>] neigh_create+0x2d8/0x40e > [<c045b85b>] _read_unlock_bh+0x5/0xd > [<c03f0539>] neigh_lookup+0x92/0x9b > [<c03f1241>] neigh_event_ns+0x3b/0x70 > [<c0432523>] arp_process+0x1e5/0x534 > [<c03edd5a>] dev_queue_xmit+0x279/0x29f > [<c0419415>] ip_finish_output+0x1c6/0x1fc > [<c03f81d5>] tc_classify+0x14/0x6b > [<c03eb578>] netif_receive_skb+0x29f/0x30e > [<c0357e63>] e1000_receive_skb+0x132/0x14c > [<c0359ecf>] e1000_clean_rx_irq+0x1fa/0x29c > [<c0356f82>] e1000_clean+0x29f/0x427 > [<c03ed3ee>] net_rx_action+0x5c/0x14a > [<c021e25e>] __do_softirq+0x5d/0xc1 > [<c021e2f4>] do_softirq+0x32/0x36 > [<c021e585>] irq_exit+0x35/0x67 > [<c0204f79>] do_IRQ+0x73/0x82 > [<c020343b>] common_interrupt+0x23/0x28 > [<c0201377>] mwait_idle_with_hints+0x36/0x39 > [<c020137a>] mwait_idle+0x0/0xa > [<c0201817>] cpu_idle+0xa8/0xc8 > ======================= They all are. afacit what's happened is that someone is running __alloc_pages(..., __GFP_ZERO) from softirq context. But the __GFP_ZERO implementation uses KM_USER0 which cannot be used from softirq context because non-interrupt code on this CPU might be using the same kmap slot. Can anyone thing of anything which recently changed in either networking core or e1000e which would have triggered this? I think the core MM code is being doubly dumb here. a) We should be able to use __GFP_ZERO from all copntexts. b) it's not a highmem page anyway, so we won't be using that kmap slot. Pawel, can you please confirm that this: --- a/arch/x86/mm/highmem_32.c~a +++ a/arch/x86/mm/highmem_32.c @@ -73,15 +73,15 @@ void *kmap_atomic_prot(struct page *page { enum fixed_addresses idx; unsigned long vaddr; - /* even !CONFIG_PREEMPT needs this, for in_atomic in do_page_fault */ - - debug_kmap_atomic_prot(type); + /* even !CONFIG_PREEMPT needs this, for in_atomic in do_page_fault */ pagefault_disable(); if (!PageHighMem(page)) return page_address(page); + debug_kmap_atomic_prot(type); + idx = type + KM_TYPE_NR*smp_processor_id(); vaddr = __fix_to_virt(FIX_KMAP_BEGIN + idx); BUG_ON(!pte_none(*(kmap_pte-idx))); _ fixes it? Thanks.
Reply-To: dada1@cosmosbay.com Andrew Morton a
On Tue, 25 Mar 2008, Eric Dumazet wrote: > If kzalloc() or __get_free_pages(__GFP_ZERO) is not allowed from softirq then > many usages are illegal. There is clearly no problem for GFP_KERNEL. The only issue could arise for GFP_HIGHMEM|__GFP_ZERO. I thought we had dealt with these false positives a while back?
* Andrew Morton <akpm@linux-foundation.org> wrote: > afacit what's happened is that someone is running __alloc_pages(..., > __GFP_ZERO) from softirq context. But the __GFP_ZERO implementation > uses KM_USER0 which cannot be used from softirq context because > non-interrupt code on this CPU might be using the same kmap slot. > > Can anyone thing of anything which recently changed in either > networking core or e1000e which would have triggered this? > > I think the core MM code is being doubly dumb here. > > a) We should be able to use __GFP_ZERO from all copntexts. > > b) it's not a highmem page anyway, so we won't be using that kmap > slot. i think this came up before (with kzalloc()) and the MM code should have been fixed to not even attempt a kmap_atomic(), instead of working it around in the callsite or in the kmap_atomic() code. Ingo
and more: ------------[ cut here ]------------ WARNING: at arch/x86/mm/highmem_32.c:43 kmap_atomic_prot+0x87/0x184() Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.25-rc6-git7 #1 [<c021a0bf>] warn_on_slowpath+0x40/0x4f [<c043c372>] fn_trie_lookup+0xe3/0x288 [<c043d61a>] fib4_rule_action+0x3d/0x4d [<c03f6417>] fib_rules_lookup+0x71/0xb6 [<c043d652>] fib_lookup+0x28/0x36 [<c023f126>] __rmqueue_smallest+0x83/0xe1 [<c023f197>] __rmqueue+0x13/0x172 [<c0211806>] kmap_atomic_prot+0x87/0x184 [<c023fe7c>] get_page_from_freelist+0x2c5/0x358 [<c023ff92>] __alloc_pages+0x71/0x2cf [<c0240229>] __get_free_pages+0x39/0x47 [<c03f10d0>] neigh_create+0x2d8/0x40e [<c045b85b>] _read_unlock_bh+0x5/0xd [<c03f0539>] neigh_lookup+0x92/0x9b [<c03f1241>] neigh_event_ns+0x3b/0x70 [<c0432523>] arp_process+0x1e5/0x534 [<c03edd5a>] dev_queue_xmit+0x279/0x29f [<c0419415>] ip_finish_output+0x1c6/0x1fc [<c03f81d5>] tc_classify+0x14/0x6b [<c03eb578>] netif_receive_skb+0x29f/0x30e [<c0357e63>] e1000_receive_skb+0x132/0x14c [<c0359ecf>] e1000_clean_rx_irq+0x1fa/0x29c [<c0356f82>] e1000_clean+0x29f/0x427 [<c03ed3ee>] net_rx_action+0x5c/0x14a [<c021e25e>] __do_softirq+0x5d/0xc1 [<c021e2f4>] do_softirq+0x32/0x36 [<c021e585>] irq_exit+0x35/0x67 [<c0204f79>] do_IRQ+0x73/0x82 [<c020343b>] common_interrupt+0x23/0x28 [<c0201377>] mwait_idle_with_hints+0x36/0x39 [<c020137a>] mwait_idle+0x0/0xa [<c0201817>] cpu_idle+0xa8/0xc8 ======================= ---[ end trace 6a93a9703f6a626e ]--- ------------[ cut here ]------------ WARNING: at arch/x86/mm/highmem_32.c:43 kmap_atomic_prot+0x87/0x184() Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.25-rc6-git7 #1 [<c021a0bf>] warn_on_slowpath+0x40/0x4f [<c043c372>] fn_trie_lookup+0xe3/0x288 [<c043d61a>] fib4_rule_action+0x3d/0x4d [<c03f6417>] fib_rules_lookup+0x71/0xb6 [<c043d652>] fib_lookup+0x28/0x36 [<c023f126>] __rmqueue_smallest+0x83/0xe1 [<c023f197>] __rmqueue+0x13/0x172 [<c0211806>] kmap_atomic_prot+0x87/0x184 [<c023fe7c>] get_page_from_freelist+0x2c5/0x358 [<c023ff92>] __alloc_pages+0x71/0x2cf [<c0240229>] __get_free_pages+0x39/0x47 [<c03f10d0>] neigh_create+0x2d8/0x40e [<c045b85b>] _read_unlock_bh+0x5/0xd [<c03f0539>] neigh_lookup+0x92/0x9b [<c03f1241>] neigh_event_ns+0x3b/0x70 [<c0432523>] arp_process+0x1e5/0x534 [<c03edd5a>] dev_queue_xmit+0x279/0x29f [<c0419415>] ip_finish_output+0x1c6/0x1fc [<c03f81d5>] tc_classify+0x14/0x6b [<c03eb578>] netif_receive_skb+0x29f/0x30e [<c0357e63>] e1000_receive_skb+0x132/0x14c [<c0359ecf>] e1000_clean_rx_irq+0x1fa/0x29c [<c0356f82>] e1000_clean+0x29f/0x427 [<c03ed3ee>] net_rx_action+0x5c/0x14a [<c021e25e>] __do_softirq+0x5d/0xc1 [<c021e2f4>] do_softirq+0x32/0x36 [<c021e585>] irq_exit+0x35/0x67 [<c0204f79>] do_IRQ+0x73/0x82 [<c020343b>] common_interrupt+0x23/0x28 [<c0201377>] mwait_idle_with_hints+0x36/0x39 [<c020137a>] mwait_idle+0x0/0xa [<c0201817>] cpu_idle+0xa8/0xc8 ======================= ---[ end trace 6a93a9703f6a626e ]--- ------------[ cut here ]------------ WARNING: at arch/x86/mm/highmem_32.c:43 kmap_atomic_prot+0x87/0x184() Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.25-rc6-git7 #1 [<c021a0bf>] warn_on_slowpath+0x40/0x4f [<c022ff00>] clockevents_program_event+0x2e/0xd2 [<c022ee6d>] clocksource_get_next+0x39/0x3f [<c022df23>] update_wall_time+0x552/0x6c7 [<c022df23>] update_wall_time+0x552/0x6c7 [<c022ee6d>] clocksource_get_next+0x39/0x3f [<c022df23>] update_wall_time+0x552/0x6c7 [<c022d8b8>] getnstimeofday+0x2b/0xb2 [<c022ff96>] clockevents_program_event+0xc4/0xd2 [<c023f126>] __rmqueue_smallest+0x83/0xe1 [<c023f197>] __rmqueue+0x13/0x172 [<c0211806>] kmap_atomic_prot+0x87/0x184 [<c023fe7c>] get_page_from_freelist+0x2c5/0x358 [<c023ff92>] __alloc_pages+0x71/0x2cf [<c043d61a>] fib4_rule_action+0x3d/0x4d [<c0240229>] __get_free_pages+0x39/0x47 [<c03f10d0>] neigh_create+0x2d8/0x40e [<c045b85b>] _read_unlock_bh+0x5/0xd [<c03f0539>] neigh_lookup+0x92/0x9b [<c04319f3>] arp_bind_neighbour+0x53/0x64 [<c0413343>] rt_intern_hash+0x257/0x32d [<c0414406>] ip_route_input+0x812/0xbc6 [<c0415c16>] ip_rcv_finish+0x2a/0x293 [<c03eb578>] netif_receive_skb+0x29f/0x30e [<c0357e63>] e1000_receive_skb+0x132/0x14c [<c0359ecf>] e1000_clean_rx_irq+0x1fa/0x29c [<c0356f82>] e1000_clean+0x29f/0x427 [<c022c395>] ktime_get_ts+0x11/0x3a [<c03ed3ee>] net_rx_action+0x5c/0x14a [<c021e25e>] __do_softirq+0x5d/0xc1 [<c021e2f4>] do_softirq+0x32/0x36 [<c021e585>] irq_exit+0x35/0x67 [<c0204f79>] do_IRQ+0x73/0x82 [<c020343b>] common_interrupt+0x23/0x28 [<c0201377>] mwait_idle_with_hints+0x36/0x39 [<c020137a>] mwait_idle+0x0/0xa [<c0201817>] cpu_idle+0xa8/0xc8 ======================= ---[ end trace 6a93a9703f6a626e ]--- ------------[ cut here ]------------ WARNING: at arch/x86/mm/highmem_32.c:43 kmap_atomic_prot+0x87/0x184() Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.25-rc6-git7 #1 [<c021a0bf>] warn_on_slowpath+0x40/0x4f [<c022ff00>] clockevents_program_event+0x2e/0xd2 [<c022ee6d>] clocksource_get_next+0x39/0x3f [<c022df23>] update_wall_time+0x552/0x6c7 [<c022df23>] update_wall_time+0x552/0x6c7 [<c022ee6d>] clocksource_get_next+0x39/0x3f [<c022df23>] update_wall_time+0x552/0x6c7 [<c022d8b8>] getnstimeofday+0x2b/0xb2 [<c022ff96>] clockevents_program_event+0xc4/0xd2 [<c023f126>] __rmqueue_smallest+0x83/0xe1 [<c023f197>] __rmqueue+0x13/0x172 [<c0211806>] kmap_atomic_prot+0x87/0x184 [<c023fe7c>] get_page_from_freelist+0x2c5/0x358 [<c023ff92>] __alloc_pages+0x71/0x2cf [<c043d61a>] fib4_rule_action+0x3d/0x4d [<c0240229>] __get_free_pages+0x39/0x47 [<c03f10d0>] neigh_create+0x2d8/0x40e [<c045b85b>] _read_unlock_bh+0x5/0xd [<c03f0539>] neigh_lookup+0x92/0x9b [<c04319f3>] arp_bind_neighbour+0x53/0x64 [<c0413343>] rt_intern_hash+0x257/0x32d [<c0414406>] ip_route_input+0x812/0xbc6 [<c0415c16>] ip_rcv_finish+0x2a/0x293 [<c03eb578>] netif_receive_skb+0x29f/0x30e [<c0357e63>] e1000_receive_skb+0x132/0x14c [<c0359ecf>] e1000_clean_rx_irq+0x1fa/0x29c [<c0356f82>] e1000_clean+0x29f/0x427 [<c022c395>] ktime_get_ts+0x11/0x3a [<c03ed3ee>] net_rx_action+0x5c/0x14a [<c021e25e>] __do_softirq+0x5d/0xc1 [<c021e2f4>] do_softirq+0x32/0x36 [<c021e585>] irq_exit+0x35/0x67 [<c0204f79>] do_IRQ+0x73/0x82 [<c020343b>] common_interrupt+0x23/0x28 [<c0201377>] mwait_idle_with_hints+0x36/0x39 [<c020137a>] mwait_idle+0x0/0xa [<c0201817>] cpu_idle+0xa8/0xc8 ======================= ---[ end trace 6a93a9703f6a626e ]--- ------------[ cut here ]------------ WARNING: at arch/x86/mm/highmem_32.c:43 kmap_atomic_prot+0x87/0x184() Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.25-rc6-git7 #1 [<c021a0bf>] warn_on_slowpath+0x40/0x4f [<c022ff00>] clockevents_program_event+0x2e/0xd2 [<c022ee6d>] clocksource_get_next+0x39/0x3f [<c022df23>] update_wall_time+0x552/0x6c7 [<c022df23>] update_wall_time+0x552/0x6c7 [<c022ee6d>] clocksource_get_next+0x39/0x3f [<c022df23>] update_wall_time+0x552/0x6c7 [<c022d8b8>] getnstimeofday+0x2b/0xb2 [<c022ff96>] clockevents_program_event+0xc4/0xd2 [<c023f126>] __rmqueue_smallest+0x83/0xe1 [<c023f197>] __rmqueue+0x13/0x172 [<c0211806>] kmap_atomic_prot+0x87/0x184 [<c023fe7c>] get_page_from_freelist+0x2c5/0x358 [<c023ff92>] __alloc_pages+0x71/0x2cf [<c043d61a>] fib4_rule_action+0x3d/0x4d [<c0240229>] __get_free_pages+0x39/0x47 [<c03f10d0>] neigh_create+0x2d8/0x40e [<c045b85b>] _read_unlock_bh+0x5/0xd [<c03f0539>] neigh_lookup+0x92/0x9b [<c04319f3>] arp_bind_neighbour+0x53/0x64 [<c0413343>] rt_intern_hash+0x257/0x32d [<c0414406>] ip_route_input+0x812/0xbc6 [<c0415c16>] ip_rcv_finish+0x2a/0x293 [<c03eb578>] netif_receive_skb+0x29f/0x30e [<c0357e63>] e1000_receive_skb+0x132/0x14c [<c0359ecf>] e1000_clean_rx_irq+0x1fa/0x29c [<c0356f82>] e1000_clean+0x29f/0x427 [<c022c395>] ktime_get_ts+0x11/0x3a [<c03ed3ee>] net_rx_action+0x5c/0x14a [<c021e25e>] __do_softirq+0x5d/0xc1 [<c021e2f4>] do_softirq+0x32/0x36 [<c021e585>] irq_exit+0x35/0x67 [<c0204f79>] do_IRQ+0x73/0x82 [<c020343b>] common_interrupt+0x23/0x28 [<c0201377>] mwait_idle_with_hints+0x36/0x39 [<c020137a>] mwait_idle+0x0/0xa [<c0201817>] cpu_idle+0xa8/0xc8 ======================= ---[ end trace 6a93a9703f6a626e ]--- ------------[ cut here ]------------ WARNING: at arch/x86/mm/highmem_32.c:43 kmap_atomic_prot+0x87/0x184() Modules linked in: Pid: 0, comm: swapper Not tainted 2.6.25-rc6-git7 #1 [<c021a0bf>] warn_on_slowpath+0x40/0x4f [<c022ff00>] clockevents_program_event+0x2e/0xd2 [<c022ee6d>] clocksource_get_next+0x39/0x3f [<c022df23>] update_wall_time+0x552/0x6c7 [<c022df23>] update_wall_time+0x552/0x6c7 [<c022ee6d>] clocksource_get_next+0x39/0x3f [<c022df23>] update_wall_time+0x552/0x6c7 [<c022d8b8>] getnstimeofday+0x2b/0xb2 [<c022ff96>] clockevents_program_event+0xc4/0xd2 [<c023f126>] __rmqueue_smallest+0x83/0xe1 [<c023f197>] __rmqueue+0x13/0x172 [<c0211806>] kmap_atomic_prot+0x87/0x184 [<c023fe7c>] get_page_from_freelist+0x2c5/0x358 [<c023ff92>] __alloc_pages+0x71/0x2cf [<c043d61a>] fib4_rule_action+0x3d/0x4d [<c0240229>] __get_free_pages+0x39/0x47 [<c03f10d0>] neigh_create+0x2d8/0x40e [<c045b85b>] _read_unlock_bh+0x5/0xd [<c03f0539>] neigh_lookup+0x92/0x9b [<c04319f3>] arp_bind_neighbour+0x53/0x64 [<c0413343>] rt_intern_hash+0x257/0x32d [<c0414406>] ip_route_input+0x812/0xbc6 [<c0415c16>] ip_rcv_finish+0x2a/0x293 [<c03eb578>] netif_receive_skb+0x29f/0x30e [<c0357e63>] e1000_receive_skb+0x132/0x14c [<c0359ecf>] e1000_clean_rx_irq+0x1fa/0x29c [<c0356f82>] e1000_clean+0x29f/0x427 [<c022c395>] ktime_get_ts+0x11/0x3a [<c03ed3ee>] net_rx_action+0x5c/0x14a [<c021e25e>] __do_softirq+0x5d/0xc1 [<c021e2f4>] do_softirq+0x32/0x36 [<c021e585>] irq_exit+0x35/0x67 [<c0204f79>] do_IRQ+0x73/0x82 [<c020343b>] common_interrupt+0x23/0x28 [<c0201377>] mwait_idle_with_hints+0x36/0x39 [<c020137a>] mwait_idle+0x0/0xa [<c0201817>] cpu_idle+0xa8/0xc8 ======================= ---[ end trace 6a93a9703f6a626e ]--- eth1: TSO is Disabled eth0: TSO is Disabled eth0: Detected Tx Unit Hang: TDH <76> TDT <76> next_to_use <76> next_to_clean <cb> buffer_info[next_to_clean]: time_stamp <1a432ea> next_to_watch <cb> jiffies <1a43554> next_to_watch.status <1> eth0: Detected Tx Unit Hang: TDH <b0> TDT <b0> next_to_use <b0> next_to_clean <5> buffer_info[next_to_clean]: time_stamp <1a4e551> next_to_watch <5> jiffies <1a4e81e> next_to_watch.status <1> eth0: Detected Tx Unit Hang: TDH <93> TDT <93> next_to_use <93> next_to_clean <e8> buffer_info[next_to_clean]: time_stamp <1a5ea83> next_to_watch <e8> jiffies <1a5ece0> next_to_watch.status <1> eth0: Detected Tx Unit Hang: TDH <1d> TDT <1d> next_to_use <1d> next_to_clean <72> buffer_info[next_to_clean]: time_stamp <1a5f2c4> next_to_watch <72> jiffies <1a5f522> next_to_watch.status <1> eth0: Detected Tx Unit Hang: TDH <31> TDT <31> next_to_use <31> next_to_clean <86> buffer_info[next_to_clean]: time_stamp <1a60384> next_to_watch <86> jiffies <1a605ec> next_to_watch.status <1> eth0: Detected Tx Unit Hang: TDH <1d> TDT <1d> next_to_use <1d> next_to_clean <72> buffer_info[next_to_clean]: time_stamp <1a688a2> next_to_watch <72> jiffies <1a68b0e> next_to_watch.status <1> eth0: Detected Tx Unit Hang: TDH <fb> TDT <fb> next_to_use <fb> next_to_clean <50> buffer_info[next_to_clean]: time_stamp <1a72364> next_to_watch <50> jiffies <1a725dd> next_to_watch.status <1> eth0: Detected Tx Unit Hang: TDH <2a> TDT <2a> next_to_use <2a> next_to_clean <7f> buffer_info[next_to_clean]: time_stamp <1a880c7> next_to_watch <7f> jiffies <1a88363> next_to_watch.status <1> eth0: Detected Tx Unit Hang: TDH <8b> TDT <8b> next_to_use <8b> next_to_clean <e0> buffer_info[next_to_clean]: time_stamp <1a8dee1> next_to_watch <e0> jiffies <1a8e16c> next_to_watch.status <1> eth0: Detected Tx Unit Hang: TDH <47> TDT <47> next_to_use <47> next_to_clean <9c> buffer_info[next_to_clean]: time_stamp <1a92c54> next_to_watch <9c> jiffies <1a92f58> next_to_watch.status <1> eth0: Detected Tx Unit Hang: TDH <5b> TDT <5b> next_to_use <5b> next_to_clean <b0> buffer_info[next_to_clean]: time_stamp <1a93f7d> next_to_watch <b0> jiffies <1a9425a> next_to_watch.status <1> eth0: Detected Tx Unit Hang: TDH <e5> TDT <e5> next_to_use <e5> next_to_clean <3a> buffer_info[next_to_clean]: time_stamp <1a94950> next_to_watch <3a> jiffies <1a94c22> next_to_watch.status <1> BUG: soft lockup - CPU#0 stuck for 61s! [events/0:9] Pid: 9, comm: events/0 Not tainted (2.6.25-rc6-git7 #1) EIP: 0060:[<c02ec183>] EFLAGS: 00000202 CPU: 0 EIP is at _raw_spin_lock+0x30/0xed EAX: f7c6a070 EBX: f7d43130 ECX: 00000000 EDX: f7d80000 ESI: 00003313 EDI: 00020000 EBP: 00003130 ESP: f7c77c90 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 CR0: 8005003b CR2: b7dee031 CR3: 36084000 CR4: 000006d0 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 DR6: ffff0ff0 DR7: 00000400 [<c0412665>] ? rt_garbage_collect+0x131/0x27e [<c03ef171>] ? dst_alloc+0x16/0x5f [<c041429e>] ? ip_route_input+0x6aa/0xbc6 [<c0415c16>] ? ip_rcv_finish+0x2a/0x293 [<c03eb578>] ? netif_receive_skb+0x29f/0x30e [<c0357e63>] ? e1000_receive_skb+0x132/0x14c [<c0359ecf>] ? e1000_clean_rx_irq+0x1fa/0x29c [<c0356f82>] ? e1000_clean+0x29f/0x427 [<c03f2a49>] ? neigh_timer_handler+0x248/0x27f [<c03ed3ee>] ? net_rx_action+0x5c/0x14a [<c021e25e>] ? __do_softirq+0x5d/0xc1 [<c021e2f4>] ? do_softirq+0x32/0x36 [<c021e585>] ? irq_exit+0x35/0x67 [<c020d084>] ? smp_apic_timer_interrupt+0x6e/0x7a [<c02034f8>] ? apic_timer_interrupt+0x28/0x30 [<c03ef022>] ? dst_gc_task+0x2f/0x168 [<c02e75d0>] ? rb_insert_color+0x4c/0xad [<c022bd00>] ? enqueue_hrtimer+0xe0/0xeb [<c045b917>] ? _spin_lock_irqsave+0x9/0xd [<c0221895>] ? lock_timer_base+0x19/0x35 [<c02219ac>] ? __mod_timer+0x9d/0xa7 [<c022745f>] ? queue_delayed_work_on+0x8b/0x97 [<c0244bbd>] ? vmstat_update+0x0/0x24 [<c03eeff3>] ? dst_gc_task+0x0/0x168 [<c0226dbd>] ? run_workqueue+0x6f/0xe7 [<c02274d4>] ? worker_thread+0x0/0x83 [<c022754d>] ? worker_thread+0x79/0x83 [<c0229754>] ? autoremove_wake_function+0x0/0x2d [<c02296a1>] ? kthread+0x38/0x5f [<c0229669>] ? kthread+0x0/0x5f [<c020363b>] ? kernel_thread_helper+0x7/0x10 ======================= eth0: Detected Tx Unit Hang: TDH <83> TDT <83> next_to_use <83> next_to_clean <d8> buffer_info[next_to_clean]: time_stamp <1a96565> next_to_watch <d8> jiffies <1a9682f> next_to_watch.status <1> eth1: Detected Tx Unit Hang: TDH <ba> TDT <ba> next_to_use <ba> next_to_clean <ae> buffer_info[next_to_clean]: time_stamp <1a96762> next_to_watch <ae> jiffies <1a968c5> next_to_watch.status <1> eth0: Detected Tx Unit Hang: TDH <21> TDT <21> next_to_use <21> next_to_clean <76> buffer_info[next_to_clean]: time_stamp <1a980d7> next_to_watch <76> jiffies <1a983bc> next_to_watch.status <1> eth1: Detected Tx Unit Hang: TDH <d0> TDT <d0> next_to_use <d0> next_to_clean <25> buffer_info[next_to_clean]: time_stamp <1a98a14> next_to_watch <25> jiffies <1a98d80> next_to_watch.status <1> eth0: Detected Tx Unit Hang: TDH <bf> TDT <bf> next_to_use <bf> next_to_clean <14> buffer_info[next_to_clean]: time_stamp <1a99c6d> next_to_watch <14> jiffies <1a99f2e> next_to_watch.status <1> eth1: Detected Tx Unit Hang: TDH <e0> TDT <e0> next_to_use <e0> next_to_clean <35> buffer_info[next_to_clean]: time_stamp <1a99c54> next_to_watch <35> jiffies <1a99f93> next_to_watch.status <1> BUG: soft lockup - CPU#0 stuck for 61s! [events/0:9] Pid: 9, comm: events/0 Not tainted (2.6.25-rc6-git7 #1) EIP: 0060:[<c02ec160>] EFLAGS: 00000246 CPU: 0 EIP is at _raw_spin_lock+0xd/0xed EAX: f7d4f8e0 EBX: f7d4f8e0 ECX: 00000000 EDX: f7d80000 ESI: 0000af8e EDI: 00020000 EBP: 0000f8e0 ESP: f7c77c90 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 CR0: 8005003b CR2: b7dee031 CR3: 36084000 CR4: 000006d0 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 DR6: ffff0ff0 DR7: 00000400 [<c0412665>] ? rt_garbage_collect+0x131/0x27e [<c03ef171>] ? dst_alloc+0x16/0x5f [<c041429e>] ? ip_route_input+0x6aa/0xbc6 [<c0415c16>] ? ip_rcv_finish+0x2a/0x293 [<c03eb578>] ? netif_receive_skb+0x29f/0x30e [<c0357e63>] ? e1000_receive_skb+0x132/0x14c [<c0359ecf>] ? e1000_clean_rx_irq+0x1fa/0x29c [<c0356f82>] ? e1000_clean+0x29f/0x427 [<c03ed3ee>] ? net_rx_action+0x5c/0x14a [<c021e25e>] ? __do_softirq+0x5d/0xc1 [<c021e2f4>] ? do_softirq+0x32/0x36 [<c021e585>] ? irq_exit+0x35/0x67 [<c020d084>] ? smp_apic_timer_interrupt+0x6e/0x7a [<c02034f8>] ? apic_timer_interrupt+0x28/0x30 [<c03ef022>] ? dst_gc_task+0x2f/0x168 [<c02e75d0>] ? rb_insert_color+0x4c/0xad [<c022bd00>] ? enqueue_hrtimer+0xe0/0xeb [<c045b917>] ? _spin_lock_irqsave+0x9/0xd [<c0221895>] ? lock_timer_base+0x19/0x35 [<c02219ac>] ? __mod_timer+0x9d/0xa7 [<c022745f>] ? queue_delayed_work_on+0x8b/0x97 [<c0244bbd>] ? vmstat_update+0x0/0x24 [<c03eeff3>] ? dst_gc_task+0x0/0x168 [<c0226dbd>] ? run_workqueue+0x6f/0xe7 [<c02274d4>] ? worker_thread+0x0/0x83 [<c022754d>] ? worker_thread+0x79/0x83 [<c0229754>] ? autoremove_wake_function+0x0/0x2d [<c02296a1>] ? kthread+0x38/0x5f [<c0229669>] ? kthread+0x0/0x5f [<c020363b>] ? kernel_thread_helper+0x7/0x10 ======================= BUG: soft lockup - CPU#1 stuck for 61s! [snmpd:5408] Pid: 5408, comm: snmpd Not tainted (2.6.25-rc6-git7 #1) EIP: 0060:[<c0412665>] EFLAGS: 00200202 CPU: 1 EIP is at rt_garbage_collect+0x131/0x27e EAX: f767edf0 EBX: 00000000 ECX: 00000000 EDX: f7d86d01 ESI: 0000ef84 EDI: 00020000 EBP: 0000f840 ESP: f6d09a1c DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 CR0: 8005003b CR2: b7fd8000 CR3: 370b5000 CR4: 00000690 DR0: 00000000 DR1: 00000000 DR2: 00000000 DR3: 00000000 DR6: ffff0ff0 DR7: 00000400 [<c03ef171>] ? dst_alloc+0x16/0x5f [<c041429e>] ? ip_route_input+0x6aa/0xbc6 [<c0415c16>] ? ip_rcv_finish+0x2a/0x293 [<c03eb578>] ? netif_receive_skb+0x29f/0x30e [<c0357e63>] ? e1000_receive_skb+0x132/0x14c [<c0359ecf>] ? e1000_clean_rx_irq+0x1fa/0x29c [<c0356f82>] ? e1000_clean+0x29f/0x427 [<c03f2a55>] ? neigh_timer_handler+0x254/0x27f [<c03ed3ee>] ? net_rx_action+0x5c/0x14a [<c021e25e>] ? __do_softirq+0x5d/0xc1 [<c021e2f4>] ? do_softirq+0x32/0x36 [<c021e545>] ? local_bh_enable_ip+0x35/0x40 [<c04126c7>] ? rt_garbage_collect+0x193/0x27e [<c03ef171>] ? dst_alloc+0x16/0x5f [<c04138b7>] ? __ip_route_output_key+0x49e/0x6db [<c02916aa>] ? inode2sd+0x123/0x138 [<c0413aff>] ? ip_route_output_flow+0xb/0x3e [<c0430c22>] ? udp_sendmsg+0x296/0x538 [<c0435af6>] ? inet_sendmsg+0x35/0x3f [<c03e24a4>] ? sock_sendmsg+0xb8/0xd1 [<c0229754>] ? autoremove_wake_function+0x0/0x2d [<c02987da>] ? reiserfs_dirty_inode+0x70/0x78 [<c02e97a1>] ? copy_from_user+0x2d/0x59 [<c02e97a1>] ? copy_from_user+0x2d/0x59 [<c03e264c>] ? sys_sendmsg+0x18f/0x1f2 [<c023cfe5>] ? generic_file_aio_write+0x5e/0xac [<c0257a84>] ? do_sync_write+0xbf/0x100 [<c0229754>] ? autoremove_wake_function+0x0/0x2d [<c03e3458>] ? sys_socketcall+0x168/0x181 [<c0202a2a>] ? sysenter_past_esp+0x5f/0x85 =======================
This server is running BGP/quagga with ~5000 /29 net prefixes ip a sh | grep inet | wc -l 5342 ip l sh |grep vlan | wc -l 121 average connected clients: ip neigh show | wc -l 2355 ethtool -S eth0 NIC statistics: rx_packets: 1819803357 tx_packets: 1714843474 rx_bytes: 1502903186279 tx_bytes: 1382777787478 rx_broadcast: 1730081 tx_broadcast: 1028199 rx_multicast: 4006 tx_multicast: 0 rx_errors: 0 tx_errors: 0 tx_dropped: 0 multicast: 4006 collisions: 0 rx_length_errors: 0 rx_over_errors: 0 rx_crc_errors: 0 rx_frame_errors: 0 rx_no_buffer_count: 72797 rx_missed_errors: 5745439 tx_aborted_errors: 0 tx_carrier_errors: 0 tx_fifo_errors: 0 tx_heartbeat_errors: 0 tx_window_errors: 0 tx_abort_late_coll: 0 tx_deferred_ok: 0 tx_single_coll_ok: 0 tx_multi_coll_ok: 0 tx_timeout_count: 0 tx_restart_queue: 2698 rx_long_length_errors: 0 rx_short_length_errors: 0 rx_align_errors: 0 tx_tcp_seg_good: 0 tx_tcp_seg_failed: 0 rx_flow_control_xon: 0 rx_flow_control_xoff: 0 tx_flow_control_xon: 0 tx_flow_control_xoff: 0 rx_long_byte_count: 1502903186279 rx_csum_offload_good: 1808325083 rx_csum_offload_errors: 3263 rx_header_split: 0 alloc_rx_buff_failed: 0 tx_smbus: 0 rx_smbus: 0 dropped_smbus: 0 rx_dma_failed: 0 tx_dma_failed: 0 ethtool -S eth1 NIC statistics: rx_packets: 1717858068 tx_packets: 1815955165 rx_bytes: 1387719125983 tx_bytes: 1502643097276 rx_broadcast: 4560 tx_broadcast: 114 rx_multicast: 362 tx_multicast: 0 rx_errors: 0 tx_errors: 0 tx_dropped: 0 multicast: 362 collisions: 0 rx_length_errors: 0 rx_over_errors: 0 rx_crc_errors: 0 rx_frame_errors: 0 rx_no_buffer_count: 111097 rx_missed_errors: 3987953 tx_aborted_errors: 0 tx_carrier_errors: 0 tx_fifo_errors: 0 tx_heartbeat_errors: 0 tx_window_errors: 0 tx_abort_late_coll: 0 tx_deferred_ok: 0 tx_single_coll_ok: 0 tx_multi_coll_ok: 0 tx_timeout_count: 0 tx_restart_queue: 3078 rx_long_length_errors: 0 rx_short_length_errors: 0 rx_align_errors: 0 tx_tcp_seg_good: 0 tx_tcp_seg_failed: 0 rx_flow_control_xon: 0 rx_flow_control_xoff: 0 tx_flow_control_xon: 0 tx_flow_control_xoff: 0 rx_long_byte_count: 1387719125983 rx_csum_offload_good: 1711660724 rx_csum_offload_errors: 436 rx_header_split: 0 alloc_rx_buff_failed: 0 tx_smbus: 0 rx_smbus: 0 dropped_smbus: 0 rx_dma_failed: 0 tx_dma_failed: 0 cat /proc/interrupts CPU0 CPU1 0: 175 76 IO-APIC-edge timer 1: 1 1 IO-APIC-edge i8042 6: 1 1 IO-APIC-edge floppy 8: 11 15 IO-APIC-edge rtc 9: 0 0 IO-APIC-fasteoi acpi 12: 2 2 IO-APIC-edge i8042 18: 19 15 IO-APIC-fasteoi 218: 771508779 182 PCI-MSI-edge eth1 219: 12251 748704785 PCI-MSI-edge eth0 220: 65164 62870 PCI-MSI-edge ahci NMI: 0 0 Non-maskable interrupts LOC: 9761747 8884458 Local timer interrupts RES: 3077 4847 Rescheduling interrupts CAL: 713 375 function call interrupts TLB: 2896 4796 TLB shootdowns TRM: 0 0 Thermal event interrupts SPU: 0 0 Spurious interrupts ERR: 0 MIS: 0
> 218: 771508779 182 PCI-MSI-edge eth1 > 219: 12251 748704785 PCI-MSI-edge eth0 From curiosity: did you try with affinity set for these irqs? Regards, Jarek P.
hmm. yes i use affinity CPU0 eth0 / CPU1 eth1 Because one CPU can't handle gigabit traffic... always 100% CPU (with traffic management use "iproute2/hfsc") those irq's that are on cpu0:219: 12251 /cpu1:218 182 are from mistake because i try to change cpu's irq bind for test.
Fixed by: http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=e72e9c23ee025a4c063ca112ba0a6059f9ecc9b7