Bug 13084 - page allocation failure. order:0, mode:0x20
Summary: page allocation failure. order:0, mode:0x20
Status: CLOSED OBSOLETE
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Page Allocator
Hardware: All Linux
Importance: P1 high
Assignee: Andrew Morton

Reported: 2009-04-13 19:27 UTC by reeve.yang
Modified: 2012-05-30 16:11 UTC

Kernel Version: 2.6.17.4
Regression: No

Attachments
kernel config file. (29.01 KB, application/octet-stream)
2009-04-13 19:27 UTC, reeve.yang

Description reeve.yang 2009-04-13 19:27:26 UTC
Created attachment 20964
kernel config file.

The system is an Intel Xeon quad-core with 8 GB of physical RAM. Under UDP load, e.g. DNS queries, the box hangs: it cannot be pinged or logged in to. Checking syslog, I see the following tracebacks from various daemons/processes. The network controller is an E1000 82571 with NAPI enabled in the kernel.

page allocation failure. order:0, mode:0x20
Apr  1 20:53:55 (none) kernel: <c013eb43> __alloc_pages+0x219/0x309  <c0155b8c> kmem_getpages+0x34/0x98
Apr  1 20:53:55 (none) kernel: <c04205f5> fib_lookup+0xb0/0xfd  <c01568bf> cache_grow+0xd9/0x1a4
Apr  1 20:53:55 (none) kernel: <c0156ae0> cache_alloc_refill+0x156/0x1e1  <c0156d63> kmem_cache_alloc+0x4f/0x51
Apr  1 20:53:55 (none) kernel: <c03d77e6> dst_alloc+0x39/0xab  <c03eb8b1> ip_route_input_slow+0x216/0x8b5
Apr  1 20:53:55 (none) kernel: <c03ee49d> ip_rcv+0x408/0x54a  <c03ee7f5> ip_rcv_finish+0x0/0x2b3
Apr  1 20:53:55 (none) kernel: <c03d3432> __net_timestamp+0x17/0x2d  <c03d3ff2> netif_receive_skb+0x208/0x2ab
Apr  1 20:53:55 (none) kernel: <f885c7a9> e1000_clean_rx_irq_ps+0x26d/0x51f [e1000]  <c0156ce7> cache_flusharray+0x9f/0xcc
Apr  1 20:53:55 (none) kernel: <f885bc25> e1000_clean+0xaf/0x175 [e1000]  <c03d4210> net_rx_action+0x72/0xf5
Apr  1 20:53:55 (none) kernel: <c0120eb2> __do_softirq+0xc2/0xd4  <c0120ef6> do_softirq+0x32/0x34
Apr  1 20:53:55 (none) kernel: <c0105256> do_IRQ+0x1e/0x24  <c01036fa> common_interrupt+0x1a/0x20
Apr  1 20:53:55 (none) kernel: Mem-info:
Apr  1 20:53:55 (none) kernel: DMA per-cpu:
Apr  1 20:53:55 (none) kernel: cpu 0 hot: high 0, batch 1 used:0
Apr  1 20:53:55 (none) kernel: cpu 0 cold: high 0, batch 1 used:0
Apr  1 20:53:55 (none) kernel: cpu 1 hot: high 0, batch 1 used:0
Apr  1 20:53:55 (none) kernel: cpu 1 cold: high 0, batch 1 used:0
Apr  1 20:53:55 (none) kernel: cpu 2 hot: high 0, batch 1 used:0
Apr  1 20:53:55 (none) kernel: cpu 2 cold: high 0, batch 1 used:0
Apr  1 20:53:55 (none) kernel: cpu 3 hot: high 0, batch 1 used:0
Apr  1 20:53:55 (none) kernel: cpu 3 cold: high 0, batch 1 used:0
Apr  1 20:53:55 (none) kernel: DMA32 per-cpu: empty
Apr  1 20:53:55 (none) kernel: Normal per-cpu:
Apr  1 20:53:55 (none) kernel: cpu 0 hot: high 186, batch 31 used:13
Apr  1 20:53:55 (none) kernel: cpu 0 cold: high 62, batch 15 used:4
Apr  1 20:53:55 (none) kernel: cpu 1 hot: high 186, batch 31 used:8
Apr  1 20:53:55 (none) kernel: cpu 1 cold: high 62, batch 15 used:0
Apr  1 20:53:55 (none) kernel: cpu 2 hot: high 186, batch 31 used:24
Apr  1 20:53:55 (none) kernel: cpu 2 cold: high 62, batch 15 used:14
Apr  1 20:53:55 (none) kernel: cpu 3 hot: high 186, batch 31 used:30
Apr  1 20:53:55 (none) kernel: cpu 3 cold: high 62, batch 15 used:48
Apr  1 20:53:55 (none) kernel: HighMem per-cpu:
Apr  1 20:53:55 (none) kernel: cpu 0 hot: high 186, batch 31 used:28
Apr  1 20:53:55 (none) kernel: cpu 0 cold: high 62, batch 15 used:9
Apr  1 20:53:55 (none) kernel: cpu 1 hot: high 186, batch 31 used:182
Apr  1 20:53:55 (none) kernel: cpu 1 cold: high 62, batch 15 used:7
Apr  1 20:53:55 (none) kernel: cpu 2 hot: high 186, batch 31 used:85
Apr  1 20:53:55 (none) kernel: cpu 2 cold: high 62, batch 15 used:12
Apr  1 20:53:55 (none) kernel: cpu 3 hot: high 186, batch 31 used:155
Apr  1 20:53:55 (none) kernel: cpu 3 cold: high 62, batch 15 used:9
Apr  1 20:53:55 (none) kernel: Free pages:     3967912kB (3962984kB HighMem)
Apr  1 20:53:55 (none) kernel: Active:893435 inactive:20412 dirty:178 writeback:0 unstable:0 free:991978 slab:166955 mapped:65893 pagetables:854
Apr  1 20:53:55 (none) kernel: DMA free:3548kB min:68kB low:84kB high:100kB active:0kB inactive:0kB present:16384kB pages_scanned:0 all_unreclaimable? yes
Apr  1 20:53:55 (none) kernel: lowmem_reserve[]: 0 0 880 9200
Apr  1 20:53:55 (none) kernel: DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Apr  1 20:53:55 (none) kernel: lowmem_reserve[]: 0 0 880 9200
Apr  1 20:53:55 (none) kernel: Normal free:1380kB min:3756kB low:4692kB high:5632kB active:116612kB inactive:38256kB present:901120kB pages_scanned:0 all_unreclaimable? no
Apr  1 20:53:55 (none) kernel: lowmem_reserve[]: 0 0 0 66560
Apr  1 20:53:55 (none) kernel: HighMem free:3962984kB min:512kB low:9396kB high:18284kB active:3457128kB inactive:43392kB present:8519680kB pages_scanned:0 all_unreclaimable? no
Apr  1 20:53:55 (none) kernel: lowmem_reserve[]: 0 0 0 0
Apr  1 20:53:55 (none) kernel: DMA: 1*4kB 1*8kB 1*16kB 0*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3548kB
Apr  1 20:53:55 (none) kernel: DMA32: empty
Apr  1 20:53:55 (none) kernel: Normal: 1*4kB 6*8kB 1*16kB 1*32kB 0*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 1380kB
Apr  1 20:53:55 (none) kernel: HighMem: 996*4kB 665*8kB 431*16kB 205*32kB 120*64kB 71*128kB 36*256kB 13*512kB 0*1024kB 8*2048kB 950*4096kB = 3962984kB
Apr  1 20:53:55 (none) kernel: Swap cache: add 0, delete 0, find 0/0, race 0+0
Apr  1 20:53:55 (none) kernel: Free swap  = 2047992kB
Apr  1 20:53:55 (none) kernel: Total swap = 2047992kB
Apr  1 20:53:55 (none) kernel: Free swap:       2047992kB
Apr  1 20:53:55 (none) kernel: 2359296 pages of RAM
Apr  1 20:53:55 (none) kernel: 2129920 pages of HIGHMEM
Apr  1 20:53:55 (none) kernel: 282325 reserved pages
Apr  1 20:53:55 (none) kernel: 864189 pages shared
Apr  1 20:53:55 (none) kernel: 0 pages swap cached
Apr  1 20:53:55 (none) kernel: 180 pages dirty
Apr  1 20:53:55 (none) kernel: 0 pages writeback
Apr  1 20:53:55 (none) kernel: 65893 pages mapped
Apr  1 20:53:55 (none) kernel: 166955 pages slab
Apr  1 20:53:55 (none) kernel: 854 pages pagetables
Apr  1 20:53:55 (none) kernel: make_wd_conf: page allocation failure. order:0, mode:0x20
Apr  1 20:53:55 (none) kernel: <c013eb43> __alloc_pages+0x219/0x309  <c0155b8c> kmem_getpages+0x34/0x98
Apr  1 20:53:55 (none) kernel: <c04205f5> fib_lookup+0xb0/0xfd  <c01568bf> cache_grow+0xd9/0x1a4
Apr  1 20:53:55 (none) kernel: <c0156ae0> cache_alloc_refill+0x156/0x1e1  <c0156d63> kmem_cache_alloc+0x4f/0x51
Apr  1 20:53:55 (none) kernel: <c03d77e6> dst_alloc+0x39/0xab  <c03eb8b1> ip_route_input_slow+0x216/0x8b5
Apr  1 20:53:55 (none) kernel: <c03ee49d> ip_rcv+0x408/0x54a  <c03ee7f5> ip_rcv_finish+0x0/0x2b3
Apr  1 20:53:55 (none) kernel: <c03d3432> __net_timestamp+0x17/0x2d  <c03d3ff2> netif_receive_skb+0x208/0x2ab
Apr  1 20:53:55 (none) kernel: <f885c7a9> e1000_clean_rx_irq_ps+0x26d/0x51f [e1000]  <c0156ce7> cache_flusharray+0x9f/0xcc
Apr  1 20:53:55 (none) kernel: <f885bc25> e1000_clean+0xaf/0x175 [e1000]  <c03d4210> net_rx_action+0x72/0xf5
Apr  1 20:53:55 (none) kernel: <c0120eb2> __do_softirq+0xc2/0xd4  <c0120ef6> do_softirq+0x32/0x34
Apr  1 20:53:55 (none) kernel: <c0105256> do_IRQ+0x1e/0x24  <c01036fa> common_interrupt+0x1a/0x20
Apr  1 20:53:55 (none) kernel: Mem-info:
Apr  1 20:53:55 (none) kernel: DMA per-cpu:
Apr  1 20:53:55 (none) kernel: cpu 0 hot: high 0, batch 1 used:0
Apr  1 20:53:55 (none) kernel: cpu 0 cold: high 0, batch 1 used:0
Apr  1 20:53:55 (none) kernel: cpu 1 hot: high 0, batch 1 used:0
Apr  1 20:53:55 (none) kernel: cpu 1 cold: high 0, batch 1 used:0
Apr  1 20:53:55 (none) kernel: cpu 2 hot: high 0, batch 1 used:0
Apr  1 20:53:55 (none) kernel: cpu 2 cold: high 0, batch 1 used:0
Apr  1 20:53:55 (none) kernel: cpu 3 hot: high 0, batch 1 used:0
Apr  1 20:53:55 (none) kernel: cpu 3 cold: high 0, batch 1 used:0
Apr  1 20:53:55 (none) kernel: DMA32 per-cpu: empty
Apr  1 20:53:55 (none) kernel: Normal per-cpu:
Apr  1 20:53:55 (none) kernel: cpu 0 hot: high 186, batch 31 used:13
Apr  1 20:53:55 (none) kernel: cpu 0 cold: high 62, batch 15 used:4
Apr  1 20:53:55 (none) kernel: cpu 1 hot: high 186, batch 31 used:8
Apr  1 20:53:55 (none) kernel: cpu 1 cold: high 62, batch 15 used:0
Apr  1 20:53:55 (none) kernel: cpu 2 hot: high 186, batch 31 used:24
Apr  1 20:53:55 (none) kernel: cpu 2 cold: high 62, batch 15 used:14
Apr  1 20:53:55 (none) kernel: cpu 3 hot: high 186, batch 31 used:30
Apr  1 20:53:55 (none) kernel: cpu 3 cold: high 62, batch 15 used:48
Apr  1 20:53:55 (none) kernel: HighMem per-cpu:
Apr  1 20:53:55 (none) kernel: cpu 0 hot: high 186, batch 31 used:28
Apr  1 20:53:55 (none) kernel: cpu 0 cold: high 62, batch 15 used:9
Apr  1 20:53:55 (none) kernel: cpu 1 hot: high 186, batch 31 used:182
Apr  1 20:53:55 (none) kernel: cpu 1 cold: high 62, batch 15 used:7
Apr  1 20:53:55 (none) kernel: cpu 2 hot: high 186, batch 31 used:85
Apr  1 20:53:55 (none) kernel: cpu 2 cold: high 62, batch 15 used:12
Apr  1 20:53:55 (none) kernel: cpu 3 hot: high 186, batch 31 used:154
Apr  1 20:53:55 (none) kernel: cpu 3 cold: high 62, batch 15 used:9
Apr  1 20:53:55 (none) kernel: Free pages:     3967912kB (3962984kB HighMem)
Apr  1 20:53:55 (none) kernel: Active:893438 inactive:20410 dirty:180 writeback:0 unstable:0 free:991978 slab:166955 mapped:65893 pagetables:854
Apr  1 20:53:55 (none) kernel: DMA free:3548kB min:68kB low:84kB high:100kB active:0kB inactive:0kB present:16384kB pages_scanned:0 all_unreclaimable? yes
Apr  1 20:53:55 (none) kernel: lowmem_reserve[]: 0 0 880 9200
Apr  1 20:53:55 (none) kernel: DMA32 free:0kB min:0kB low:0kB high:0kB active:0kB inactive:0kB present:0kB pages_scanned:0 all_unreclaimable? no
Apr  1 20:53:55 (none) kernel: lowmem_reserve[]: 0 0 880 9200
Apr  1 20:53:55 (none) kernel: Normal free:1380kB min:3756kB low:4692kB high:5632kB active:116620kB inactive:38248kB present:901120kB pages_scanned:0 all_unreclaimable? no
Apr  1 20:53:55 (none) kernel: lowmem_reserve[]: 0 0 0 66560
Apr  1 20:53:55 (none) kernel: HighMem free:3962984kB min:512kB low:9396kB high:18284kB active:3457132kB inactive:43392kB present:8519680kB pages_scanned:0 all_unreclaimable? no
Apr  1 20:53:55 (none) kernel: lowmem_reserve[]: 0 0 0 0
Apr  1 20:53:55 (none) kernel: DMA: 1*4kB 1*8kB 1*16kB 0*32kB 1*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 0*4096kB = 3548kB
Apr  1 20:53:55 (none) kernel: DMA32: empty
Apr  1 20:53:55 (none) kernel: Normal: 1*4kB 6*8kB 1*16kB 1*32kB 0*64kB 0*128kB 1*256kB 0*512kB 1*1024kB 0*2048kB 0*4096kB = 1380kB
Apr  1 20:53:55 (none) kernel: HighMem: 996*4kB 665*8kB 431*16kB 205*32kB 120*64kB 71*128kB 36*256kB 13*512kB 0*1024kB 8*2048kB 950*4096kB = 3962984kB
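
The per-zone lines in the dumps above carry the key numbers: each one lists a zone's free memory next to its min/low/high watermarks. A minimal Python sketch for pulling those out of syslog text follows. The regular expression and the helper name `zones_below_min` are illustrative assumptions, not part of any kernel tooling, and the sample lines are trimmed from the report.

```python
import re

# Matches the per-zone summary lines printed in the "Mem-info" dump, e.g.
#   "Normal free:1380kB min:3756kB low:4692kB high:5632kB ..."
ZONE_RE = re.compile(
    r"(?P<zone>\w+) free:(?P<free>\d+)kB min:(?P<min>\d+)kB "
    r"low:(?P<low>\d+)kB high:(?P<high>\d+)kB"
)


def zones_below_min(log_text):
    """Return {zone: (free_kB, min_kB)} for zones whose free memory is under 'min'."""
    result = {}
    for match in ZONE_RE.finditer(log_text):
        free_kb = int(match["free"])
        min_kb = int(match["min"])
        if free_kb < min_kb:
            result[match["zone"]] = (free_kb, min_kb)
    return result


# Lines trimmed from the report (timestamps and trailing fields dropped).
SAMPLE = """\
kernel: DMA free:3548kB min:68kB low:84kB high:100kB active:0kB
kernel: Normal free:1380kB min:3756kB low:4692kB high:5632kB active:116612kB
kernel: HighMem free:3962984kB min:512kB low:9396kB high:18284kB active:3457128kB
"""

if __name__ == "__main__":
    print(zones_below_min(SAMPLE))
```

On the sample, only the Normal zone is reported: 1380 kB free against a 3756 kB minimum, while HighMem still has almost 4 GB free. That fits the trace, since the failing request (mode:0x20) does not ask for high memory and so must be served from the low zones.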
Comment 1 Andrew Morton 2009-04-13 19:47:11 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Mon, 13 Apr 2009 19:27:27 GMT
bugzilla-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=13084
> 
>            Summary: page allocation failure. order:0, mode:0x20
>            Product: Memory Management
>            Version: 2.5
>     Kernel Version: 2.6.17.4
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: high
>           Priority: P1
>          Component: Page Allocator
>         AssignedTo: akpm@linux-foundation.org
>         ReportedBy: reeve.yang@gmail.com
>         Regression: No
> 
> 
> Created an attachment (id=20964)
>  --> (http://bugzilla.kernel.org/attachment.cgi?id=20964)
> kernel config file.
> 
> The system is Intel Xeon Quad core with 8G physical RAM. When it's under UDP
> loads, e.g., DNS queries, the box is stuck in terms it cannot be pinged or
> login. By checking syslog, I'm seeing following trace back from various
> daemon/processes. The network controller is E1000 82571 with NAPI enabled in
> kernel.
>
> page allocation failure. order:0, mode:0x20

This is very common.  e1000 attempts to do large memory allocations
from within interrupt context and the page allocator cannot satisfy the
allocation and is not allowed to do the necessary work to make the
allocation attempt succeed.  It's the same with all net drivers, but
e1000 is especially prone, apparently because of hardware suckiness.
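
The `mode:0x20` in the failure message is the allocation's GFP flag mask, and it shows why the allocator is not allowed to do the work needed to free memory. The Python sketch below decodes it. The bit values are an assumption recalled from the 2.6-era `include/linux/gfp.h` (where `GFP_ATOMIC` was simply `__GFP_HIGH`) and should be checked against the headers of the kernel actually in use; the function names are illustrative.

```python
# Assumed flag bits from a 2.6-era include/linux/gfp.h; verify for your kernel.
GFP_BITS = {
    0x10: "__GFP_WAIT",  # caller may sleep, so reclaim is allowed
    0x20: "__GFP_HIGH",  # high priority: may dip into emergency reserves
    0x40: "__GFP_IO",    # may start physical I/O
    0x80: "__GFP_FS",    # may call into the filesystem
}


def decode_gfp(mode):
    """Return the names of the known flags set in 'mode', plus any unknown bits as hex."""
    names = [name for bit, name in sorted(GFP_BITS.items()) if mode & bit]
    known = sum(bit for bit in GFP_BITS if mode & bit)
    if mode & ~known:
        names.append(hex(mode & ~known))
    return names


def may_sleep(mode):
    """True if the allocation is allowed to block and reclaim memory."""
    return bool(mode & 0x10)


if __name__ == "__main__":
    print(decode_gfp(0x20), may_sleep(0x20))
```

Under these assumed values, 0x20 decodes to `__GFP_HIGH` alone: an atomic request with no `__GFP_WAIT`, so it cannot sleep, reclaim, or swap, and it fails as soon as the zone has no page to hand out.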

However the networking stack should just drop the packet and the system
will recover.

Your report is unclear.  Yes, the machine wedges up under the UDP load.
But does it recover when the other machine stops spraying UDP packets
at this machine?  It _should_ recover.  If it does not, we have a bug
somewhere.

The usual workaround for these problems is to increase the value in
/proc/sys/vm/min_free_kbytes.
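
For reference, the setting can be inspected from a script. The following Python sketch only reads the current value; the helper names are hypothetical, and raising the value is left as a comment because a suitable figure depends on the machine.

```python
from pathlib import Path

MIN_FREE_KBYTES = Path("/proc/sys/vm/min_free_kbytes")


def parse_sysctl_int(text):
    """Parse the single integer that a numeric /proc/sys file contains."""
    return int(text.strip())


def read_min_free_kbytes(path=MIN_FREE_KBYTES):
    """Return the current min_free_kbytes value in kB."""
    return parse_sysctl_int(Path(path).read_text())


if __name__ == "__main__":
    if MIN_FREE_KBYTES.exists():  # only present on Linux
        print("vm.min_free_kbytes =", read_min_free_kbytes(), "kB")
    # To raise it (as root), write a larger value back to the same file, e.g.
    #   MIN_FREE_KBYTES.write_text("16384\n")   # 16384 is an arbitrary example
```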

2.6.17 is fairly old.  If we need to do additional work on this report
then we'll be asking you to test something more recent - ideally
2.6.29.

Thanks.
Comment 2 reeve.yang 2009-04-13 19:59:26 UTC
I reported this because the system does not recover after I stop the traffic. I kept monitoring LowFree: it drops at roughly 30 MB/s, and the system freezes once it reaches roughly 30 MB.

I tried increasing min_free_kbytes, but it did not help at all.

I also tested 2.6.22.15, and the problem does not occur with exactly the same test.
Comment 3 Jesse Brandeburg 2009-04-13 20:06:07 UTC
On Mon, 13 Apr 2009, Andrew Morton wrote:
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> On Mon, 13 Apr 2009 19:27:27 GMT
> bugzilla-daemon@bugzilla.kernel.org wrote:
> 
> > http://bugzilla.kernel.org/show_bug.cgi?id=13084
> > 
> >            Summary: page allocation failure. order:0, mode:0x20
> >            Product: Memory Management
> >            Version: 2.5
> >     Kernel Version: 2.6.17.4
> >           Platform: All
> >         OS/Version: Linux
> >               Tree: Mainline
> >             Status: NEW
> >           Severity: high
> >           Priority: P1
> >          Component: Page Allocator
> >         AssignedTo: akpm@linux-foundation.org
> >         ReportedBy: reeve.yang@gmail.com
> >         Regression: No
> > 
> > 
> > Created an attachment (id=20964)
> >  --> (http://bugzilla.kernel.org/attachment.cgi?id=20964)
> > kernel config file.
> > 
> > The system is Intel Xeon Quad core with 8G physical RAM. When it's under
> > UDP loads, e.g., DNS queries, the box is stuck in terms it cannot be pinged
> > or login. By checking syslog, I'm seeing following trace back from various
> > daemon/processes. The network controller is E1000 82571 with NAPI enabled
> > in kernel.
> >
> > page allocation failure. order:0, mode:0x20
> 
> This is very common.  e1000 attempts to do large memory allocations
> from within interrupt context and the page allocator cannot satisfy the
> allocation and is not allowed to do the necessary work to make the
> allocation attempt succeed.  It's the same with all net drivers, but
> e1000 is especially prone, apparently because of hardware suckiness.

In jumbo mode, Andrew's statement is true, but with order:0 
allocation failures it is just normal networking goo that causes the 
memory allocator to run out of free pages. This seems much less frequent 
in newer kernels.
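
To make the distinction concrete: the `order` in the failure message is the base-2 logarithm of the number of contiguous pages requested. A small Python sketch, assuming the 4 kB page size of 32-bit x86 (the function name is illustrative):

```python
PAGE_SIZE_KB = 4  # assumed: 4 kB pages, as on 32-bit x86


def order_to_kb(order, page_size_kb=PAGE_SIZE_KB):
    """Size in kB of a buddy-allocator request of the given order (2**order pages)."""
    return (1 << order) * page_size_kb


if __name__ == "__main__":
    for order in range(4):
        print(f"order:{order} -> {order_to_kb(order)} kB")
```

An order:0 request is a single 4 kB page, so it cannot fail because of fragmentation; the zone has simply run out of pages it may hand out. By contrast, a single contiguous buffer for a 9000-byte jumbo frame would need at least order 2 (16 kB).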
 
> However the networking stack should just drop the packet and the system
> will recover.

I think at that point the kernel gets quite busy printing warnings about 
how much it is out of memory.

> Your report is unclear.  Yes, the machine wedges up under the UDP load.
> But does it recover when the other machine stops spraying UDP packets
> at this machine?  It _should_ recover.  If it does not, we have a bug
> somewhere.

In this case kmem_cache_alloc is failing to get memory, being called by 
the route_dst code, maybe someone on netdev can comment if this has been 
fixed along the way.
 
> The usual workaround for these problems is to increase the value in
> /proc/sys/vm/min_free_kbytes.

This should help a lot, in my experience.

> 2.6.17 is fairly old.  If we need to do additional work on this report
> then we'll be asking you to test something more recent - ideally
> 2.6.29.

If you must run 2.6.17, then you might want to try the e1000e driver (*not 
e1000*) from sourceforge for your 82571.

Otherwise I will also soon be asking you to try a newer kernel.
Comment 4 Andrew Morton 2009-04-13 20:07:33 UTC
Please don't update this bug via the bugzilla interface.  Please use emailed
reply-to-all, as I asked.

Because nobody who read that email will think to click on the link to see
if some of the conversation got diverted back into bugzilla.
Comment 5 reeve.yang 2009-04-13 20:11:49 UTC
Here is the memory snapshot taken while the problem is happening:

MemTotal:      8307844 kB
MemFree:       6091208 kB
Buffers:          6524 kB
Cached:        1121528 kB
SwapCached:          0 kB
Active:        1361052 kB
Inactive:        25784 kB
HighTotal:     7470464 kB
HighFree:      6083688 kB
LowTotal:       837380 kB
LowFree:          7520 kB
SwapTotal:     2047992 kB
SwapFree:      2047992 kB
Dirty:          744488 kB
Writeback:           0 kB
Mapped:         285532 kB
Slab:           797500 kB
CommitLimit:   6201912 kB
Committed_AS:   459788 kB
PageTables:       3532 kB
VmallocTotal:   118776 kB
VmallocUsed:      2432 kB
VmallocChunk:   116084 kB

You can see that plenty of physical RAM is available. LowFree is
dropping at about 10 MB/s.
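
The snapshot above can be checked mechanically. The Python sketch below parses /proc/meminfo-style text and reports how much of low memory is free and how Slab compares with LowTotal. The function names are illustrative, and treating slab as resident in low memory is an assumption that applies to a 32-bit highmem kernel like this one.

```python
def parse_meminfo(text):
    """Parse 'Name:   value kB' lines into a {name: value_in_kB} dict."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def lowmem_summary(info):
    """Express LowFree and Slab as percentages of LowTotal."""
    low_total = info["LowTotal"]
    return {
        "low_free_pct": 100.0 * info["LowFree"] / low_total,
        "slab_pct_of_low": 100.0 * info["Slab"] / low_total,
    }


# Values copied from the snapshot in this comment.
SNAPSHOT = """\
HighTotal:     7470464 kB
HighFree:      6083688 kB
LowTotal:       837380 kB
LowFree:          7520 kB
Slab:           797500 kB
"""

if __name__ == "__main__":
    summary = lowmem_summary(parse_meminfo(SNAPSHOT))
    print({key: round(value, 1) for key, value in summary.items()})
```

On the reported numbers, LowFree is under 1% of LowTotal while Slab is about 95% of it, which suggests low memory is being consumed by kernel slab objects rather than by user pages. That reading is an inference from the snapshot, not something confirmed in this thread.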

Comment 6 David S. Miller 2009-04-13 20:41:44 UTC
From: "Brandeburg, Jesse" <jesse.brandeburg@intel.com>
Date: Mon, 13 Apr 2009 13:06:04 -0700 (Pacific Daylight Time)

> On Mon, 13 Apr 2009, Andrew Morton wrote:
>> Your report is unclear.  Yes, the machine wedges up under the UDP load.
>> But does it recover when the other machine stops spraying UDP packets
>> at this machine?  It _should_ recover.  If it does not, we have a bug
>> somewhere.
> 
> In this case kmem_cache_alloc is failing to get memory, being called by 
> the route_dst code, maybe someone on netdev can comment if this has been 
> fixed along the way.

Although I have some level of tolerance, there is zero way I'm
going to analyze anything on 2.6.17 kernels nor am I going to
encourage other core networking developers to waste their time
on this either.

It is easily the case that we've fixed tens of thousands of
VM and networking bugs since then, if not more.
