Bug 13561
Summary: | swapper: page allocation failure. order:0, mode:0x20 | ||
---|---|---|---|
Product: | Memory Management | Reporter: | Nigel Kukard (nkukard) |
Component: | Other | Assignee: | Andrew Morton (akpm) |
Status: | RESOLVED OBSOLETE | ||
Severity: | normal | CC: | alan, james, seb.kernel |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.39.3 | Subsystem: | |
Regression: | No | Bisected commit-id: |
Description
Nigel Kukard
2009-06-17 18:41:22 UTC
(switched to email. Please respond via emailed reply-to-all, not via the bugzilla web interface). On Wed, 17 Jun 2009 18:41:24 GMT bugzilla-daemon@bugzilla.kernel.org wrote: > http://bugzilla.kernel.org/show_bug.cgi?id=13561 > > Summary: swapper: page allocation failure. order:0, mode:0x20 > Product: Memory Management > > ... > > [ 1884.639134] swapper: page allocation failure. order:0, mode:0x20 > [ 1884.639136] Pid: 0, comm: swapper Tainted: P 2.6.29.4-1.0 #1 > [ 1884.639137] Call Trace: > [ 1884.639139] <IRQ> [<ffffffff8028828b>] > __alloc_pages_internal+0x41e/0x43f > [ 1884.639146] [<ffffffff802ac183>] alloc_pages_current+0xb9/0xc2 > [ 1884.639149] [<ffffffff802b08eb>] new_slab+0xcf/0x28c > [ 1884.639151] [<ffffffff802b0d14>] __slab_alloc+0x200/0x3e2 > [ 1884.639154] [<ffffffff803d9e3b>] ? __alloc_skb+0x42/0x131 > [ 1884.639157] [<ffffffff804974cc>] ? _spin_lock_irqsave+0x28/0x31 > [ 1884.639160] [<ffffffff803d9e3b>] ? __alloc_skb+0x42/0x131 > [ 1884.639162] [<ffffffff802b12cf>] kmem_cache_alloc_node+0x7c/0xc6 > [ 1884.639165] [<ffffffff8024b8a4>] ? __mod_timer+0xb3/0xc5 > [ 1884.639168] [<ffffffff803d9e3b>] __alloc_skb+0x42/0x131 > [ 1884.639171] [<ffffffff8041892e>] tcp_send_ack+0x2b/0x112 > [ 1884.639173] [<ffffffff80415c53>] __tcp_ack_snd_check+0x65/0x7d > [ 1884.639176] [<ffffffff80416906>] tcp_rcv_established+0x7d7/0x926 > [ 1884.639179] [<ffffffff8041dfb3>] tcp_v4_do_rcv+0x1b1/0x35e > [ 1884.639181] [<ffffffff8041e606>] tcp_v4_rcv+0x4a6/0x785 > [ 1884.639185] [<ffffffff80402a2d>] ip_local_deliver_finish+0x177/0x25a > [ 1884.639187] [<ffffffff80402b82>] ip_local_deliver+0x72/0x7a > [ 1884.639190] [<ffffffff804025e3>] ip_rcv_finish+0x32b/0x345 > [ 1884.639192] [<ffffffff80402879>] ip_rcv+0x27c/0x2b9 > [ 1884.639195] [<ffffffff803e05b0>] netif_receive_skb+0x471/0x496 > [ 1884.639201] [<ffffffffa0ca07c0>] rtl8169_rx_interrupt+0x362/0x43d [r8169] > [ 1884.639205] [<ffffffffa0ca38b3>] rtl8169_poll+0x3f/0x1fe [r8169] > [ 1884.639208] [<ffffffff803de82c>] net_rx_action+0xae/0x19c > [ 1884.639212] [<ffffffff8024745d>] __do_softirq+0x8a/0x139 > [ 1884.639214] [<ffffffff8021259c>] call_softirq+0x1c/0x30 > [ 1884.639217] [<ffffffff80213d78>] do_softirq+0x44/0x8f > [ 1884.639219] [<ffffffff80247153>] irq_exit+0x3f/0x7e > [ 1884.639222] [<ffffffff80213ff3>] do_IRQ+0xc3/0xe4 > [ 1884.639224] [<ffffffff80211d13>] ret_from_intr+0x0/0x29 > [ 1884.639225] <EOI> [<ffffffff8022704e>] ? native_safe_halt+0x6/0x8 > [ 1884.639230] [<ffffffff802180c2>] ? default_idle+0x2e/0x4b > [ 1884.639233] [<ffffffff8021830c>] ? c1e_idle+0x109/0x110 > [ 1884.639235] [<ffffffff802590c9>] ? atomic_notifier_call_chain+0x13/0x15 > [ 1884.639239] [<ffffffff8021022e>] ? cpu_idle+0x59/0x9a > [ 1884.639242] [<ffffffff804925d6>] ? start_secondary+0x254/0x25b yep, that's OK. The kernel was excessively low on memory and the network driver was unable to allocate a page for a received packet. The packet will just be dropped and everything should recover. If you get a lot of these warnings (one per minute?) then increasing the value in /proc/sys/vm/min_free_kbytes should help. Dave, we get quite a few reports of this nature, especially from e1000 (grr). Do you think we could/should suppress the warning, by sprinkling a few __GFP_NOWARNs in the right places? It doesn't seem like it's being very useful? > > On Wed, 17 Jun 2009 18:41:24 GMT > bugzilla-daemon@bugzilla.kernel.org wrote: > > >> http://bugzilla.kernel.org/show_bug.cgi?id=13561 >> >> Summary: swapper: page allocation failure. order:0, mode:0x20 >> Product: Memory Management >> >> ... >> >> [ 1884.639134] swapper: page allocation failure. order:0, mode:0x20 >> [ 1884.639136] Pid: 0, comm: swapper Tainted: P 2.6.29.4-1.0 #1 >> [ 1884.639137] Call Trace: >> [ 1884.639139] <IRQ> [<ffffffff8028828b>] >> __alloc_pages_internal+0x41e/0x43f >> [ 1884.639146] [<ffffffff802ac183>] alloc_pages_current+0xb9/0xc2 >> [ 1884.639149] [<ffffffff802b08eb>] new_slab+0xcf/0x28c >> [ 1884.639151] [<ffffffff802b0d14>] __slab_alloc+0x200/0x3e2 >> [ 1884.639154] [<ffffffff803d9e3b>] ? __alloc_skb+0x42/0x131 >> [ 1884.639157] [<ffffffff804974cc>] ? _spin_lock_irqsave+0x28/0x31 >> [ 1884.639160] [<ffffffff803d9e3b>] ? __alloc_skb+0x42/0x131 >> [ 1884.639162] [<ffffffff802b12cf>] kmem_cache_alloc_node+0x7c/0xc6 >> [ 1884.639165] [<ffffffff8024b8a4>] ? __mod_timer+0xb3/0xc5 >> [ 1884.639168] [<ffffffff803d9e3b>] __alloc_skb+0x42/0x131 >> [ 1884.639171] [<ffffffff8041892e>] tcp_send_ack+0x2b/0x112 >> [ 1884.639173] [<ffffffff80415c53>] __tcp_ack_snd_check+0x65/0x7d >> [ 1884.639176] [<ffffffff80416906>] tcp_rcv_established+0x7d7/0x926 >> [ 1884.639179] [<ffffffff8041dfb3>] tcp_v4_do_rcv+0x1b1/0x35e >> [ 1884.639181] [<ffffffff8041e606>] tcp_v4_rcv+0x4a6/0x785 >> [ 1884.639185] [<ffffffff80402a2d>] ip_local_deliver_finish+0x177/0x25a >> [ 1884.639187] [<ffffffff80402b82>] ip_local_deliver+0x72/0x7a >> [ 1884.639190] [<ffffffff804025e3>] ip_rcv_finish+0x32b/0x345 >> [ 1884.639192] [<ffffffff80402879>] ip_rcv+0x27c/0x2b9 >> [ 1884.639195] [<ffffffff803e05b0>] netif_receive_skb+0x471/0x496 >> [ 1884.639201] [<ffffffffa0ca07c0>] rtl8169_rx_interrupt+0x362/0x43d >> [r8169] >> [ 1884.639205] [<ffffffffa0ca38b3>] rtl8169_poll+0x3f/0x1fe [r8169] >> [ 1884.639208] [<ffffffff803de82c>] net_rx_action+0xae/0x19c >> [ 1884.639212] [<ffffffff8024745d>] __do_softirq+0x8a/0x139 >> [ 1884.639214] [<ffffffff8021259c>] call_softirq+0x1c/0x30 >> [ 1884.639217] [<ffffffff80213d78>] do_softirq+0x44/0x8f >> [ 1884.639219] [<ffffffff80247153>] irq_exit+0x3f/0x7e >> [ 1884.639222] [<ffffffff80213ff3>] do_IRQ+0xc3/0xe4 >> [ 1884.639224] [<ffffffff80211d13>] ret_from_intr+0x0/0x29 >> [ 1884.639225] <EOI> [<ffffffff8022704e>] ? native_safe_halt+0x6/0x8 >> [ 1884.639230] [<ffffffff802180c2>] ? default_idle+0x2e/0x4b >> [ 1884.639233] [<ffffffff8021830c>] ? c1e_idle+0x109/0x110 >> [ 1884.639235] [<ffffffff802590c9>] ? atomic_notifier_call_chain+0x13/0x15 >> [ 1884.639239] [<ffffffff8021022e>] ? cpu_idle+0x59/0x9a >> [ 1884.639242] [<ffffffff804925d6>] ? start_secondary+0x254/0x25b >> > > yep, that's OK. The kernel was excessively low on memory and the > network driver was unable to allocate a page for a received packet. > Even though the box had 4Gbyte RAM with nothing running but console? maybe I"m mistaking the amount of RAM for the amount of memory in a buffer used for network IO? -N > The packet will just be dropped and everything should recover. If you > get a lot of these warnings (one per minute?) then increasing the value > in /proc/sys/vm/min_free_kbytes should help. > > > Dave, we get quite a few reports of this nature, especially from e1000 > (grr). Do you think we could/should suppress the warning, by > sprinkling a few __GFP_NOWARNs in the right places? It doesn't seem > like it's being very useful? > > > On Wed, 17 Jun 2009 19:29:25 +0000 Nigel Kukard <nkukard@lbsd.net> wrote: > > yep, that's OK. The kernel was excessively low on memory and the > > network driver was unable to allocate a page for a received packet. > > > > Even though the box had 4Gbyte RAM with nothing running but console? yep. > maybe I"m mistaking the amount of RAM for the amount of memory in a > buffer used for network IO? [ 1884.639264] Node 0 DMA free:11692kB min:28kB low:32kB high:40kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB present:15284kB pages_scanned:0 all_unreclaimable? yes [ 1884.639271] Node 0 DMA32 free:5412kB min:6560kB low:8200kB high:9840kB active_anon:60476kB inactive_anon:12224kB active_file:28720kB inactive_file:2987624kB unevictable:0kB present:3333216kB pages_scanned:0 all_unreclaimable? no [ 1884.639279] Node 0 Normal free:444kB min:1524kB low:1904kB high:2284kB active_anon:12436kB inactive_anon:13984kB active_file:16356kB inactive_file:666568kB unevictable:480kB present:775680kB pages_scanned:0 all_unreclaimable? no Pretty much all your memory was used by pagecache. The DMA32 zone is below the minimum threshold. The VM has decided that there's no memory left for networking. It has retained a little bit of free memory so the block device drivers can still perform writeout, to free up additional memory. What is _supposed_ to happen (and usually does) it that kswapd will see that the machine is getting short on memory and will work to free some up, thus making it available to networking receive. But sometimes kswapd can't keep up. > >> maybe I"m mistaking the amount of RAM for the amount of memory in a >> buffer used for network IO? >> > > [ 1884.639264] Node 0 DMA free:11692kB min:28kB low:32kB high:40kB > active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB > unevictable:0kB present:15284kB pages_scanned:0 all_unreclaimable? yes > > [ 1884.639271] Node 0 DMA32 free:5412kB min:6560kB low:8200kB high:9840kB > active_anon:60476kB inactive_anon:12224kB active_file:28720kB > inactive_file:2987624kB unevictable:0kB present:3333216kB pages_scanned:0 > all_unreclaimable? no > > [ 1884.639279] Node 0 Normal free:444kB min:1524kB low:1904kB high:2284kB > active_anon:12436kB inactive_anon:13984kB active_file:16356kB > inactive_file:666568kB unevictable:480kB present:775680kB pages_scanned:0 > all_unreclaimable? no > > Pretty much all your memory was used by pagecache. The DMA32 zone is > below the minimum threshold. The VM has decided that there's no memory > left for networking. It has retained a little bit of free memory so > the block device drivers can still perform writeout, to free up > additional memory. > > What is _supposed_ to happen (and usually does) it that kswapd will see > that the machine is getting short on memory and will work to free some > up, thus making it available to networking receive. But sometimes > kswapd can't keep up. > Understood .... I did copy about 2.5Tb of files from 1 SATA drive to another and was busy trasferring 50Gb to the box from Gbit ethernet. Cool, thanks again guys. Just thought I'd report to make sure its something trivial and not a serious issue. :) -N Probably same problem here with a 2.6.39.3 kernel and a rtl8169 NIC (and order=1). Got this stack trace in the logs on an Atom with 4GB of memory, no swap activity but high LVM+Raid1 disk activity (got an error on ext4 later). Network activity was about 5KB/s according to MRTG (5 minutes granularity). The RX count error of the network card was not increased. eth0 Link encap:Ethernet HWaddr 1c:6f:65:5d:01:c3 inet adr:192.168.0.2 Bcast:192.168.0.255 Masque:255.255.255.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:6192264 errors:0 dropped:0 overruns:0 frame:0 TX packets:8650686 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 lg file transmission:1000 RX bytes:3144978240 (2.9 GiB) TX bytes:9084870934 (8.4 GiB) Interruption:41 [640793.018739] EXT4-fs (dm-13): mounted filesystem with ordered data mode. Opts: (null) [647233.722234] kworker/0:0: page allocation failure. order:1, mode:0x20 [647233.722337] Pid: 0, comm: kworker/0:0 Not tainted 2.6.39.3.skc2 #23 [647233.722429] Call Trace: [647233.722509] <IRQ> [<ffffffff81070dff>] __alloc_pages_nodemask+0x6e7/0x721 [647233.722626] [<ffffffff81096821>] cache_alloc_refill+0x264/0x4a6 [647233.722723] [<ffffffff81435392>] ? __nf_ct_refresh_acct+0x5d/0xc6 [647233.722819] [<ffffffff81096593>] kmem_cache_alloc+0x4c/0x76 [647233.722915] [<ffffffff813f1fab>] sk_prot_alloc+0x2a/0x9a [647233.723011] [<ffffffff813f20d3>] sk_clone+0x1d/0x266 [647233.723108] [<ffffffff8146b78b>] inet_csk_clone+0x13/0x84 [647233.723205] [<ffffffff8147f5b9>] tcp_create_openreq_child+0x24/0x43a [647233.723305] [<ffffffff8147dfd8>] tcp_v4_syn_recv_sock+0x57/0x1e2 [647233.723403] [<ffffffff8147f444>] tcp_check_req+0x241/0x392 [647233.723499] [<ffffffff8149ed26>] ? nf_nat_fn+0x12e/0x144 [647233.723595] [<ffffffff8147d646>] tcp_v4_do_rcv+0xfc/0x236 [647233.723693] [<ffffffff8147dc7b>] tcp_v4_rcv+0x4fb/0x801 [647233.723788] [<ffffffff8146343e>] ip_local_deliver_finish+0x125/0x1d0 [647233.723887] [<ffffffff8146355b>] ip_local_deliver+0x72/0x79 [647233.723984] [<ffffffff8146308f>] ip_rcv_finish+0x27f/0x2a9 [647233.724080] [<ffffffff814632f3>] ip_rcv+0x23a/0x260 [647233.724174] [<ffffffff813fca52>] __netif_receive_skb+0x441/0x473 [647233.724270] [<ffffffff813fcc94>] netif_receive_skb+0x67/0x6e [647233.724363] [<ffffffff813fcd79>] napi_skb_finish+0x24/0x3b [647233.724457] [<ffffffff813fd22b>] napi_gro_receive+0xa8/0xad [647233.724550] [<ffffffff812a3c96>] rtl8169_rx_interrupt+0x2e0/0x354 [647233.724645] [<ffffffff812a5dcc>] rtl8169_poll+0x33/0x181 [647233.724748] [<ffffffff812a565b>] ? rtl8169_interrupt+0x28c/0x2f1 [647233.724858] [<ffffffff813fd327>] net_rx_action+0x6a/0x16f [647233.724967] [<ffffffff810368d0>] __do_softirq+0x83/0x114 [647233.725068] [<ffffffff814e46cc>] call_softirq+0x1c/0x30 [647233.725172] [<ffffffff810033fb>] do_softirq+0x33/0x68 [647233.725276] [<ffffffff8103676d>] irq_exit+0x3f/0x88 [647233.725368] [<ffffffff81002ce2>] do_IRQ+0x98/0xaf [647233.725459] [<ffffffff814e3253>] common_interrupt+0x13/0x13 [647233.725551] <EOI> [<ffffffff811e1511>] ? intel_idle+0xc3/0xe9 [647233.725655] [<ffffffff811e14f4>] ? intel_idle+0xa6/0xe9 [647233.725747] [<ffffffff8136eec7>] cpuidle_idle_call+0x94/0xcd [647233.725842] [<ffffffff81001314>] cpu_idle+0xa7/0xdd [647233.725934] [<ffffffff81b54ac7>] start_secondary+0x1e5/0x1ec [647233.726026] Mem-Info: [647233.726110] DMA per-cpu: [647233.726191] CPU 0: hi: 0, btch: 1 usd: 0 [647233.726279] CPU 1: hi: 0, btch: 1 usd: 0 [647233.726366] CPU 2: hi: 0, btch: 1 usd: 0 [647233.726453] CPU 3: hi: 0, btch: 1 usd: 0 [647233.726539] DMA32 per-cpu: [647233.726620] CPU 0: hi: 186, btch: 31 usd: 39 [647233.726708] CPU 1: hi: 186, btch: 31 usd: 172 [647233.726795] CPU 2: hi: 186, btch: 31 usd: 5 [647233.726884] CPU 3: hi: 186, btch: 31 usd: 40 [647233.726970] Normal per-cpu: [647233.727051] CPU 0: hi: 186, btch: 31 usd: 61 [647233.727143] CPU 1: hi: 186, btch: 31 usd: 71 [647233.727231] CPU 2: hi: 186, btch: 31 usd: 26 [647233.727319] CPU 3: hi: 186, btch: 31 usd: 20 [647233.727414] active_anon:177616 inactive_anon:73202 isolated_anon:0 [647233.727418] active_file:217661 inactive_file:241205 isolated_file:0 [647233.727422] unevictable:0 dirty:452 writeback:0 unstable:0 [647233.727426] free:89576 slab_reclaimable:198275 slab_unreclaimable:8385 [647233.727430] mapped:11597 shmem:8406 pagetables:1207 bounce:0 [647233.727874] DMA free:15924kB min:256kB low:320kB high:384kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15700kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:0kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes [647233.728400] lowmem_reserve[]: 0 2990 4000 4000 [647233.728513] DMA32 free:321236kB min:50324kB low:62904kB high:75484kB active_anon:585268kB inactive_anon:162624kB active_file:631132kB inactive_file:713800kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3061856kB mlocked:0kB dirty:1332kB writeback:0kB mapped:26016kB shmem:28272kB slab_reclaimable:616544kB slab_unreclaimable:10116kB kernel_stack:272kB pagetables:1608kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [647233.729049] lowmem_reserve[]: 0 0 1010 1010 [647233.729158] Normal free:21144kB min:16996kB low:21244kB high:25492kB active_anon:125196kB inactive_anon:130184kB active_file:239512kB inactive_file:251020kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1034240kB mlocked:0kB dirty:476kB writeback:0kB mapped:20372kB shmem:5352kB slab_reclaimable:176556kB slab_unreclaimable:23424kB kernel_stack:1296kB pagetables:3220kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no [647233.729720] lowmem_reserve[]: 0 0 0 0 [647233.729815] DMA: 1*4kB 0*8kB 1*16kB 1*32kB 2*64kB 1*128kB 1*256kB 0*512kB 1*1024kB 1*2048kB 3*4096kB = 15924kB [647233.730025] DMA32: 80216*4kB 0*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 320864kB [647233.730229] Normal: 4492*4kB 141*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 0*4096kB = 21144kB [647233.730432] 471748 total pagecache pages [647233.730516] 4414 pages in swap cache [647233.730603] Swap cache stats: add 8778, delete 4364, find 3043263/3043783 [647233.730696] Free swap = 503124kB [647233.730777] Total swap = 524280kB [647233.732143] 1048560 pages RAM [647233.732143] 37520 pages reserved [647233.732143] 310276 pages shared [647233.732143] 637616 pages non-shared Closing as obsolete, if this is still seen with modern kernels please re-open and update |