Hi, We always experience web server hanging problem. xfs keeps printing XFS: possible memory allocation deadlock in kmem_alloc (mode:0x8250) We finally find out that it's the memory fragmentation problem. We are experiencing this problem even if we have 50% free memory (+ buffer/cache) . echo 1 > /proc/sys/vm/drop_caches solve the problem instantly. I was wondering if you could enhance the message, for example, print the amount of pages being requested, and print messages like Node 0 Normal: 12544*4kB 68*8kB 0*16kB 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB = 54816kB to help debugging, and try to allocate smaller number of pages. Thanks.
I agree that it'd be good to print the size being requested. These large allocations are often the result of extent maps for very badly fragmented files; you might consider looking to see if that is the problem in your case. "allocating a smaller number of pages" isn't necessarily possible, though. -Eric
Hi Eric, I was able to find that on my system it's in most case an allocation of 128KB contiguous memory space. The only things I could do without change in any code are changing the vm.swappiness and vm.vfs_cache_pressure and hope I could have more free slabs of larger space. But I still see those memory allocation related warnings from XFS. Would you like to consider use vmalloc() as a fallback after a few failed tries to kmalloc()?
I just saw a related discussion and I realized that I was using 64K dir block and it seems that commit b3f03bac8132207a20286d5602eda64500c19724 fixed the problem. I'll try if it solves my problem.
I'm batch-closing all xfs bugs which are more than 1 year old, sorry about that. If you still have this issue on a current kernel, please retest and re-open with this information. Thanks, -Eric
how to reopen this bug? I could repro this issue very successfully ( application: fio ) on SLES 12.
Hi, Let me start with little bit background, however just a brief. Intermittently, however more frequently some of the random write test cases with block size split were failing due to fio hung issue. I have been observing this since last few days, but couldn’t repro it frequently. FIO hung issue was bit annoying to me where test would fail due to fio hung state. why fio was hung (in deadlock state)? FIO would look for contiguous memory allocation, however I don’t have logs anywhere to say that what size of pages have been requested. This could be the potential reason being extreme memory fragmentation and/or XFS memory pressure issue led FIO into deadlock state. Some specific test cases (FIO workloads) induces extremely fragmented filesystem by involving tailored FIO workloads, and read/write on the same would induce high memory pressure as well. One of these reason might have cause memory extreme fragmentation and failed SLAB memory allocation. My tests would run the test with defined runtime (runtime=1200) and kill the if jobs exceeding 1200s runtime. I have been observing frequently since couple of months on xfs filesystem, where FIO was not honoring defined runtime b’css of hung/deadlock state, and/or sometime killing (pkill -9) dishonored FIO procs in this state resulting defunct process (/corrupted state). Here is the substantial proof for the same, where memory allocation was hung... <snip> [Mon Oct 10 21:08:53 2016] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250) [Mon Oct 10 21:08:54 2016] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250) [Mon Oct 10 21:08:55 2016] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250) [Mon Oct 10 21:08:55 2016] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250) [Mon Oct 10 21:08:55 2016] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250) [Mon Oct 10 21:08:56 2016] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250) [Mon Oct 10 21:08:56 2016] XFS: possible memory allocation deadlock in kmem_alloc (mode:0x250) ATF sends ‘pkill -9’ to dishonored FIO procs, in this case deadlock procs which led proc into defunct state… <snip> r3a4-28:~ # ps -aef | grep 9163 root 9163 1 0 15:57 ? 00:00:05 /usr/local/staf/bin/STAFProc root 18111 9163 47 20:48 ? 00:21:17 [fio] <defunct> root 18125 9163 47 20:48 ? 00:21:12 [fio] <defunct> root 18171 9163 47 20:48 ? 00:21:15 [fio] <defunct> root 18186 9163 47 20:48 ? 00:21:11 [fio] <defunct> FIO process traces. <snip> 2016-10-10T20:14:39.134180-07:00 r3a4-28 kernel: [15561.432108] fio: page allocation failure: order:1, mode:0x200020 2016-10-10T20:14:39.134202-07:00 r3a4-28 kernel: [15561.432114] CPU: 2 PID: 16735 Comm: fio Tainted: G X 3.12.28-4-default #1 2016-10-10T20:14:39.134203-07:00 r3a4-28 kernel: [15561.432115] Hardware name: Supermicro X9SRW-F/X9SRW-F, BIOS 3.2 01/15/2015 2016-10-10T20:14:39.134203-07:00 r3a4-28 kernel: [15561.432117] ffff88047fc438c0 ffffffff8150b1db 0000000000200020 ffffffff8113af90 2016-10-10T20:14:39.134204-07:00 r3a4-28 kernel: [15561.432121] ffffffff8107793e 0000000000000000 ffff88047ff99b00 000000027ff98000 2016-10-10T20:14:39.134205-07:00 r3a4-28 kernel: [15561.432123] ffff88047ffbdf68 0000000000000046 0000000000000000 0000000000000000 2016-10-10T20:14:39.164051-07:00 r3a4-28 kernel: [15561.432126] Call Trace: 2016-10-10T20:14:39.164066-07:00 r3a4-28 kernel: [15561.432139] [<ffffffff8100467d>] dump_trace+0x7d/0x2d0 2016-10-10T20:14:39.164068-07:00 r3a4-28 kernel: [15561.432143] [<ffffffff81004964>] show_stack_log_lvl+0x94/0x170 2016-10-10T20:14:39.164069-07:00 r3a4-28 kernel: [15561.432147] [<ffffffff81005d91>] show_stack+0x21/0x50 2016-10-10T20:14:39.164070-07:00 r3a4-28 kernel: [15561.432152] [<ffffffff8150b1db>] dump_stack+0x41/0x51 2016-10-10T20:14:39.164071-07:00 r3a4-28 kernel: [15561.432158] [<ffffffff8113af90>] warn_alloc_failed+0xf0/0x160 2016-10-10T20:14:39.164072-07:00 r3a4-28 kernel: [15561.432163] [<ffffffff8150763a>] __alloc_pages_slowpath+0x72f/0x796 2016-10-10T20:14:39.164197-07:00 r3a4-28 kernel: [15561.432166] [<ffffffff8113ee7a>] __alloc_pages_nodemask+0x1ea/0x210 2016-10-10T20:14:39.164202-07:00 r3a4-28 kernel: [15561.432172] [<ffffffff81183f48>] kmem_getpages+0x58/0x1a0 2016-10-10T20:14:39.164203-07:00 r3a4-28 kernel: [15561.432176] [<ffffffff811859df>] fallback_alloc+0x20f/0x250 2016-10-10T20:14:39.164204-07:00 r3a4-28 kernel: [15561.432180] [<ffffffff8118641c>] kmem_cache_alloc+0x11c/0x440 2016-10-10T20:14:39.164205-07:00 r3a4-28 kernel: [15561.432185] [<ffffffff814120f4>] sk_prot_alloc+0x34/0x180 2016-10-10T20:14:39.164206-07:00 r3a4-28 kernel: [15561.432189] [<ffffffff814126db>] sk_clone_lock+0x1b/0x2e0 2016-10-10T20:14:39.164207-07:00 r3a4-28 kernel: [15561.432194] [<ffffffff81465561>] inet_csk_clone_lock+0x11/0x140 2016-10-10T20:14:39.164344-07:00 r3a4-28 kernel: [15561.432199] [<ffffffff8147d45e>] tcp_create_openreq_child+0x1e/0x430 2016-10-10T20:14:39.164348-07:00 r3a4-28 kernel: [15561.432202] [<ffffffff81479708>] tcp_v4_syn_recv_sock+0x38/0x250 2016-10-10T20:14:39.164350-07:00 r3a4-28 kernel: [15561.432206] [<ffffffff8147e0c9>] tcp_check_req+0x3b9/0x530 2016-10-10T20:14:39.164351-07:00 r3a4-28 kernel: [15561.432212] [<ffffffff8147b21c>] tcp_v4_do_rcv+0x1dc/0x300 2016-10-10T20:14:39.164352-07:00 r3a4-28 kernel: [15561.432217] [<ffffffff8147d3ef>] tcp_v4_rcv+0x75f/0x780 2016-10-10T20:14:39.164353-07:00 r3a4-28 kernel: [15561.432221] [<ffffffff8145a05e>] ip_local_deliver_finish+0x9e/0x200 2016-10-10T20:14:39.164356-07:00 r3a4-28 kernel: [15561.432226] [<ffffffff81425d07>] __netif_receive_skb_core+0x627/0x800 2016-10-10T20:14:39.164357-07:00 r3a4-28 kernel: [15561.432230] [<ffffffff81425f5f>] netif_receive_skb+0x1f/0x90 2016-10-10T20:14:39.164358-07:00 r3a4-28 kernel: [15561.432234] [<ffffffff814268e8>] napi_gro_receive+0x68/0x90 2016-10-10T20:14:39.164359-07:00 r3a4-28 kernel: [15561.432244] [<ffffffffa049e4be>] igb_clean_rx_irq+0x3be/0x8a0 [igb] 2016-10-10T20:14:39.164361-07:00 r3a4-28 kernel: [15561.432261] [<ffffffffa049ec82>] igb_poll+0x2e2/0x760 [igb] 2016-10-10T20:14:39.164361-07:00 r3a4-28 kernel: [15561.432270] [<ffffffff814262a0>] net_rx_action+0x140/0x240 2016-10-10T20:14:39.164362-07:00 r3a4-28 kernel: [15561.432275] [<ffffffff8105a215>] __do_softirq+0xe5/0x230 2016-10-10T20:14:39.164495-07:00 r3a4-28 kernel: [15561.432280] [<ffffffff8151ac9c>] call_softirq+0x1c/0x30 2016-10-10T20:14:39.164500-07:00 r3a4-28 kernel: [15561.432283] [<ffffffff810045c5>] do_softirq+0x55/0x90 2016-10-10T20:14:39.164501-07:00 r3a4-28 kernel: [15561.432286] [<ffffffff8105a4b5>] irq_exit+0x95/0xa0 2016-10-10T20:14:39.164502-07:00 r3a4-28 kernel: [15561.432290] [<ffffffff8151b47e>] do_IRQ+0x4e/0xb0 2016-10-10T20:14:39.164503-07:00 r3a4-28 kernel: [15561.432295] [<ffffffff8151162d>] common_interrupt+0x6d/0x6d 2016-10-10T20:14:39.164504-07:00 r3a4-28 kernel: [15561.432308] [<0000000000431b38>] 0x431b37 2016-10-10T20:14:39.164505-07:00 r3a4-28 kernel: [15561.432309] SLAB: Unable to allocate memory on node 0 (gfp=0x20) 2016-10-10T20:14:39.164509-07:00 r3a4-28 kernel: [15561.432310] cache: TCP, object size: 1728, order: 1 2016-10-10T20:14:39.164510-07:00 r3a4-28 kernel: [15561.432313] node 0: slabs: 5/5, objs: 20/20, free: 0 2016-10-10T20:14:39.164511-07:00 r3a4-28 kernel: [15561.432624] fio: page allocation failure: order:1, mode:0x200020 2016-10-10T20:14:39.164512-07:00 r3a4-28 kernel: [15561.432626] CPU: 2 PID: 16735 Comm: fio Tainted: G X 3.12.28-4-default #1 2016-10-10T20:14:39.164513-07:00 r3a4-28 kernel: [15561.432627] Hardware name: Supermicro X9SRW-F/X9SRW-F, BIOS 3.2 01/15/2015 2016-10-10T20:14:39.164514-07:00 r3a4-28 kernel: [15561.432628] ffff88047fc438c0 ffffffff8150b1db 0000000000200020 ffffffff8113af90 2016-10-10T20:14:39.164505-07:00 r3a4-28 kernel: [15561.432309] SLAB: Unable to allocate memory on node 0 (gfp=0x20) 2016-10-10T20:14:39.164509-07:00 r3a4-28 kernel: [15561.432310] cache: TCP, object size: 1728, order: 1 2016-10-10T20:14:39.164510-07:00 r3a4-28 kernel: [15561.432313] node 0: slabs: 5/5, objs: 20/20, free: 0 2016-10-10T20:14:39.164511-07:00 r3a4-28 kernel: [15561.432624] fio: page allocation failure: order:1, mode:0x200020 2016-10-10T20:14:39.164512-07:00 r3a4-28 kernel: [15561.432626] CPU: 2 PID: 16735 Comm: fio Tainted: G X 3.12.28-4-default #1 2016-10-10T20:14:39.164513-07:00 r3a4-28 kernel: [15561.432627] Hardware name: Supermicro X9SRW-F/X9SRW-F, BIOS 3.2 01/15/2015 2016-10-10T20:14:39.164514-07:00 r3a4-28 kernel: [15561.432628] ffff88047fc438c0 ffffffff8150b1db 0000000000200020 ffffffff8113af90 2016-10-10T20:14:39.164669-07:00 r3a4-28 kernel: [15561.432631] 0000000000000000 0000000000000000 ffff88047ff99b00 0000000200000000 2016-10-10T20:14:39.164673-07:00 r3a4-28 kernel: [15561.432633] 0000000000000002 0000000000000000 00ff880400000000 0000000000000000 I was thinking that following procedure would resolve the issue, however it did not help to prevent the issue: I observed this with more than 60% free memory as well(+ buffer/cache) 1. Clearing aches: Drop the caches "echo 3 > /proc/sys/vm/drop_caches" or “echo 1 > /proc/sys/vm/drop_caches”, executing this for every 2 minutes 2. ”echo 1 > /proc/sys/vm/compact_memory", (with CONFIG_COMPACTION=y) 3. Reduce huge page size to 128K Aforementioned procedure did not help to mitigate/prevent the issue Here is <snip> Before reboot (even after compact) r3a4-28:~ # cat /proc/buddyinfo │·· Node 0, zone DMA 1 0 0 1 0 1 1 0 1 1 3 │·· Node 0, zone DMA32 26713 23984 16125 7850 2725 916 526 244 93 78 4 │·· Node 0, zone Normal 326417 256905 167300 76595 21201 4638 1814 778 207 371 3 after reboot: r3a4-28:~ # cat /proc/buddyinfo │·· Node 0, zone DMA 1 0 0 1 0 1 1 0 1 1 3 │·· Node 0, zone DMA32 197 117 39 9 1 2 2 4 2 2 426 │·· Node 0, zone Normal 30 487 133 51 8 3 12 11 6 4 3347
If you have a SLES12 problem, please report that to SuSE. This is the upstream bugtracker, but you're running a distro kernel based on 3.12...
oh! ok. BTW, this is not SLES 12 issue. This is XFS issue. The issue persist on RHEL 7.2 ( kernel 3.10.0-327.10.1.el7.x86_64), OL, etc.
If you have a RHEL bug, please report it to Red Hat. If you have an Oracle bug, please report it to Oracle. If you can reproduce it on an upstream kernel, please report it here.
Sure. let me set it ip and rerun the tests, and will keep you posted.