When I defragment my system using the command below, the system starts to react slow and I get warnings in syslog. After some time the OOM-Killer kicks in and kills unrelated processes. find /usr/ -type d -atime -2 -o -type f | while read file; do /usr/sbin/btrfs filesystem defrag -f -v "${file}"; done [ 90.357859] ------------[ cut here ]------------ [ 90.357891] WARNING: CPU: 3 PID: 673 at fs/btrfs/inode.c:4741 inode_tree_add+0x154/0x160 [btrfs]() [ 90.357893] Modules linked in: vfat fat rfcomm nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_nat nf_nat_ipv6 ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle bnep bluetooth nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables mmc_block snd_hda_codec_hdmi snd_hda_codec_idt cdc_mbim dell_wmi iTCO_wdt sparse_keymap uvcvideo iTCO_vendor_support videobuf2_vmalloc videobuf2_memops videobuf2_core ppdev videodev cdc_ncm usbnet media mii cdc_acm cdc_wdm snd_hda_intel snd_hda_codec snd_hwdep dell_laptop snd_seq dcdbas snd_seq_device snd_pcm i2c_i801 sdhci_pci sdhci iwlwifi e1000e mmc_core snd_page_alloc lpc_ich snd_timer cfg80211 mfd_core snd soundcore ptp rfkill pps_core [ 90.357926] wmi parport_pc parport mperf uinput binfmt_misc btrfs libcrc32c xor zlib_deflate raid6_pq dm_crypt i915 i2c_algo_bit drm_kms_helper drm i2c_core video [ 90.357939] CPU: 3 PID: 673 Comm: btrfs-endio-wri Not tainted 3.11.0-0.rc1.git1.2.fc20.x86_64 #1 [ 90.357941] Hardware name: Dell Inc. Latitude E6320/050F4T, BIOS A08 10/18/2011 [ 90.357943] 0000000000000009 ffff880215aa1b70 ffffffff8164db74 0000000000000000 [ 90.357946] ffff880215aa1ba8 ffffffff810670bd ffff8800699c9560 ffff88020b33a800 [ 90.357948] ffff88020b33aca8 ffff8800699c15f8 00000000003ac817 ffff880215aa1bb8 [ 90.357951] Call Trace: [ 90.357958] [<ffffffff8164db74>] dump_stack+0x45/0x56 [ 90.357962] [<ffffffff810670bd>] warn_slowpath_common+0x7d/0xa0 [ 90.357965] [<ffffffff8106719a>] warn_slowpath_null+0x1a/0x20 [ 90.357976] [<ffffffffa0173a84>] inode_tree_add+0x154/0x160 [btrfs] [ 90.357989] [<ffffffffa017cb7f>] btrfs_iget+0x11f/0x540 [btrfs] [ 90.358000] [<ffffffffa017d50c>] relink_extent_backref+0xcc/0x790 [btrfs] [ 90.358004] [<ffffffff8118d45e>] ? kfree+0x13e/0x180 [ 90.358014] [<ffffffffa017ddc9>] btrfs_finish_ordered_io+0x1f9/0x960 [btrfs] [ 90.358025] [<ffffffffa017e715>] finish_ordered_fn+0x15/0x20 [btrfs] [ 90.358037] [<ffffffffa019e4eb>] worker_loop+0x12b/0x500 [btrfs] [ 90.358049] [<ffffffffa019e3c0>] ? btrfs_queue_worker+0x300/0x300 [btrfs] [ 90.358052] [<ffffffff8108bd70>] kthread+0xc0/0xd0 [ 90.358056] [<ffffffff8108bcb0>] ? insert_kthread_work+0x40/0x40 [ 90.358060] [<ffffffff8165cdac>] ret_from_fork+0x7c/0xb0 [ 90.358062] [<ffffffff8108bcb0>] ? insert_kthread_work+0x40/0x40 [ 90.358064] ---[ end trace 491ee6fab6a924f1 ]---
What mount options, how much memory do you have in the system, and do you have snapshots on this fs? I'll try and reproduce locally.
mount options are: rw,noatime,compress-force=lzo,ssd,discard,space_cache I am running a script which snapshots every 60min, so there there are quite a few. I could try to prepare an filesystem-image with all personal data deleted if that would help?
Can you try btrfs-next, I have some backref walking patches that will likely fix this.
Created attachment 106997 [details] syslog of OOMs
while btrfs-next fixed the warnings, I still get OOMs while executing the defrag script which bring down the whole machine. If you whish, I could send you a private copy of the partition where this happens.
Actually a couple of snapshots of cat /proc/slabinfo during the the problem would be the most helpful.
Created attachment 107000 [details] content of /proc/slabinfo sampled periodically
were the slabinfos helpful?
So it looks like you may be leaking btrfs_extent_state's, which isn't super helpful since we use that for locking and such. Is there a way you could unmount and rmmod the btrfs module after you hit these issues and see if slab complains about leaked entries? In the meantime I'll dig around and see if I can spot anything obvious.
Also do you have any files set with NODATACOW?
The unounting and rmmod will take some time as I am on vacation with very poor internet connection. As far as I know I don't have any files with nodatacow...
Ok so I *think* we are leaking EXTENT_DEFRAG states, so I posted a couple of patches to fix this and pushed them to btrfs-next, could you please give them a whirl when you get back from vacation :).
Created attachment 107091 [details] BUG_ONs output with btrfs-next (2013-08-03)
With btrfs-next defragging works far better, however I still get the BUG_ONs listed in attachment 107091 [details]
Created attachment 107098 [details] debug patch Ok I think we're just making too large of an allocation when we don't need to, can you give this patch a whirl and let me know how it goes? If it still blows up then there will be some debug info printed out so make sure to capture it. Thanks.
Created attachment 107099 [details] BUG ONs with btrfs-next + debug-patch
Created attachment 107138 [details] debug patch Ugh sorry I'm such a bozo I attached the wrong damned debug patch, can you remove the previous patch and apply this one instead and reproduce? Sorry about that.
Created attachment 107139 [details] syslog output with latest patch unfortunately, still no luck with the new patch
Actually it did spit what I needed out so that was perfect, it does seem like we are just wanting to make that large of an allocation, so I need to check and see if we really should be keeping track of this many things or if we need to make our allocations differently. Can you run btrfsck on this fs (no options) just to verify theres not some sort of weird extent tree corruption that's causing the backref walking stuff to blow up? In the meantime I will be looking at how to reduce the allocations and hopefully have a patch for you in a day or so.
Created attachment 107158 [details] btrfsck log seems there is some corruption of the checksums
Created attachment 107169 [details] possible fix Ok this changes how we allocate nodes for the ulist and hopefully will fix your problem. Now I've tested this as best I can, but you are already an extreme case so this may panic, and if it does just capture the output so I can fix my patch.
Hold on, zach has found a fatal flaw in this plan, I need to fix it up a bit.
Created attachment 107408 [details] possible fix Sorry it took me so long to get back to this, please give this a shot and let me know if you still have issues.
On a quite full fs I got the following stack trace. I'll trying to increase free space and report back... Sep 6 18:57:41 localhost kernel: [ 2453.818749] btrfs-endio-wri: page allocation failure: order:4, mode:0x104050 Sep 6 18:57:41 localhost kernel: [ 2453.818753] CPU: 0 PID: 5149 Comm: btrfs-endio-wri Tainted: G W 3.10.0+ #1 Sep 6 18:57:41 localhost kernel: [ 2453.818754] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 Sep 6 18:57:41 localhost kernel: [ 2453.818755] 0000000000000000 ffff880100e937a0 ffffffff83746003 ffff880100e93828 Sep 6 18:57:41 localhost kernel: [ 2453.818757] ffffffff811cff80 ffff88011ffe9b00 0000000000000004 0000000000104050 Sep 6 18:57:41 localhost kernel: [ 2453.818759] ffff880100e93828 ffffffff8373e68b 0010405000000010 ffff88011ffe8000 Sep 6 18:57:41 localhost kernel: [ 2453.818761] Call Trace: Sep 6 18:57:41 localhost kernel: [ 2453.818777] [<ffffffff83746003>] dump_stack+0x19/0x1b Sep 6 18:57:41 localhost kernel: [ 2453.818781] [<ffffffff811cff80>] warn_alloc_failed+0xf0/0x160 Sep 6 18:57:41 localhost kernel: [ 2453.818783] [<ffffffff8373e68b>] ? __alloc_pages_direct_compact+0xac/0x196 Sep 6 18:57:41 localhost kernel: [ 2453.818785] [<ffffffff811d3fc1>] __alloc_pages_nodemask+0x811/0xa10 Sep 6 18:57:41 localhost kernel: [ 2453.818788] [<ffffffff81210859>] alloc_pages_current+0xa9/0x170 Sep 6 18:57:41 localhost kernel: [ 2453.818790] [<ffffffff811d00f4>] __get_free_pages+0x14/0x50 Sep 6 18:57:41 localhost kernel: [ 2453.818792] [<ffffffff81218f6e>] kmalloc_order_trace+0x2e/0xa0 Sep 6 18:57:41 localhost kernel: [ 2453.818795] [<ffffffff816d61a1>] ? ulist_free+0x31/0x50 Sep 6 18:57:41 localhost kernel: [ 2453.818797] [<ffffffff8121bc61>] __kmalloc_track_caller+0x221/0x240 Sep 6 18:57:41 localhost kernel: [ 2453.818799] [<ffffffff816d638c>] ? ulist_add_merge+0x1cc/0x290 Sep 6 18:57:41 localhost kernel: [ 2453.818801] [<ffffffff811e52df>] krealloc+0x4f/0xa0 Sep 6 18:57:41 localhost kernel: [ 2453.818803] [<ffffffff816d638c>] ulist_add_merge+0x1cc/0x290 Sep 6 18:57:41 localhost kernel: [ 2453.818805] [<ffffffff816d4a14>] find_parent_nodes+0x8d4/0xd50 Sep 6 18:57:41 localhost kernel: [ 2453.818807] [<ffffffff816d4f49>] btrfs_find_all_roots+0x99/0x100 Sep 6 18:57:41 localhost kernel: [ 2453.818810] [<ffffffff816800a0>] ? btrfs_submit_bio_hook+0x1e0/0x1e0 Sep 6 18:57:41 localhost kernel: [ 2453.818812] [<ffffffff816d58cf>] iterate_extent_inodes+0x14f/0x320 Sep 6 18:57:41 localhost kernel: [ 2453.818814] [<ffffffff816800a0>] ? btrfs_submit_bio_hook+0x1e0/0x1e0 Sep 6 18:57:41 localhost kernel: [ 2453.818816] [<ffffffff816d5b27>] iterate_inodes_from_logical+0x87/0xa0 Sep 6 18:57:41 localhost kernel: [ 2453.818818] [<ffffffff8167e5ef>] record_extent_backrefs+0x7f/0xe0 Sep 6 18:57:41 localhost kernel: [ 2453.818820] [<ffffffff816889ff>] btrfs_finish_ordered_io+0x1df/0xab0 Sep 6 18:57:41 localhost kernel: [ 2453.818823] [<ffffffff8376b500>] ? schedule_timeout+0xf0/0x2c0 Sep 6 18:57:41 localhost kernel: [ 2453.818825] [<ffffffff816894e5>] finish_ordered_fn+0x15/0x20 Sep 6 18:57:41 localhost kernel: [ 2453.818828] [<ffffffff816a9dbb>] worker_loop+0x12b/0x500 Sep 6 18:57:41 localhost kernel: [ 2453.818836] [<ffffffff816a9c90>] ? btrfs_queue_worker+0x300/0x300 Sep 6 18:57:41 localhost kernel: [ 2453.818840] [<ffffffff81126110>] kthread+0xc0/0xd0 Sep 6 18:57:41 localhost kernel: [ 2453.818842] [<ffffffff81126050>] ? insert_kthread_work+0x40/0x40 Sep 6 18:57:41 localhost kernel: [ 2453.818844] [<ffffffff8376f6ac>] ret_from_fork+0x7c/0xb0 Sep 6 18:57:41 localhost kernel: [ 2453.818846] [<ffffffff81126050>] ? insert_kthread_work+0x40/0x40 Sep 6 18:57:41 localhost kernel: [ 2453.818847] Mem-Info: Sep 6 18:57:41 localhost kernel: [ 2453.818848] Node 0 DMA per-cpu: Sep 6 18:57:41 localhost kernel: [ 2453.818850] CPU 0: hi: 0, btch: 1 usd: 0 Sep 6 18:57:41 localhost kernel: [ 2453.818850] Node 0 DMA32 per-cpu: Sep 6 18:57:41 localhost kernel: [ 2453.818852] CPU 0: hi: 186, btch: 31 usd: 0 Sep 6 18:57:41 localhost kernel: [ 2453.818852] Node 0 Normal per-cpu: Sep 6 18:57:41 localhost kernel: [ 2453.818854] CPU 0: hi: 186, btch: 31 usd: 0 Sep 6 18:57:41 localhost kernel: [ 2453.818856] active_anon:60702 inactive_anon:1654 isolated_anon:0 Sep 6 18:57:41 localhost kernel: [ 2453.818856] active_file:80430 inactive_file:103409 isolated_file:0 Sep 6 18:57:41 localhost kernel: [ 2453.818856] unevictable:8 dirty:2248 writeback:0 unstable:0 Sep 6 18:57:41 localhost kernel: [ 2453.818856] free:41518 slab_reclaimable:22460 slab_unreclaimable:12928 Sep 6 18:57:41 localhost kernel: [ 2453.818856] mapped:4617 shmem:2294 pagetables:4299 bounce:0 Sep 6 18:57:41 localhost kernel: [ 2453.818856] free_cma:0 Sep 6 18:57:41 localhost kernel: [ 2453.818859] Node 0 DMA free:15808kB min:268kB low:332kB high:400kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:4kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes Sep 6 18:57:41 localhost kernel: [ 2453.818862] lowmem_reserve[]: 0 3430 3877 3877 Sep 6 18:57:41 localhost kernel: [ 2453.818864] Node 0 DMA32 free:138196kB min:59548kB low:74432kB high:89320kB active_anon:128596kB inactive_anon:5912kB active_file:316372kB inactive_file:408068kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3653568kB managed:3512740kB mlocked:0kB dirty:892kB writeback:0kB mapped:15672kB shmem:5944kB slab_reclaimable:66028kB slab_unreclaimable:7912kB kernel_stack:656kB pagetables:6756kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no Sep 6 18:57:41 localhost kernel: [ 2453.818867] lowmem_reserve[]: 0 0 447 447 Sep 6 18:57:41 localhost kernel: [ 2453.818869] Node 0 Normal free:12068kB min:7764kB low:9704kB high:11644kB active_anon:114212kB inactive_anon:704kB active_file:5348kB inactive_file:5568kB unevictable:32kB isolated(anon):0kB isolated(file):0kB present:524288kB managed:458000kB mlocked:32kB dirty:8100kB writeback:0kB mapped:2796kB shmem:3232kB slab_reclaimable:23812kB slab_unreclaimable:43796kB kernel_stack:2000kB pagetables:10440kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no Sep 6 18:57:41 localhost kernel: [ 2453.818872] lowmem_reserve[]: 0 0 0 0 Sep 6 18:57:41 localhost kernel: [ 2453.818874] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB (M) = 15808kB Sep 6 18:57:41 localhost kernel: [ 2453.818880] Node 0 DMA32: 16537*4kB (UEM) 6001*8kB (UEM) 1216*16kB (UEM) 16*32kB (UEM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB (R) = 138220kB Sep 6 18:57:41 localhost kernel: [ 2453.818885] Node 0 Normal: 2327*4kB (UEM) 87*8kB (M) 2*16kB (M) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB (R) 0*4096kB = 12084kB Sep 6 18:57:41 localhost kernel: [ 2453.818891] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB Sep 6 18:57:41 localhost kernel: [ 2453.818892] 186155 total pagecache pages Sep 6 18:57:41 localhost kernel: [ 2453.818893] 0 pages in swap cache Sep 6 18:57:41 localhost kernel: [ 2453.818894] Swap cache stats: add 0, delete 0, find 0/0 Sep 6 18:57:41 localhost kernel: [ 2453.818895] Free swap = 0kB Sep 6 18:57:41 localhost kernel: [ 2453.818895] Total swap = 0kB Sep 6 18:57:41 localhost kernel: [ 2453.830851] 1048575 pages RAM Sep 6 18:57:41 localhost kernel: [ 2453.830853] 48865 pages reserved Sep 6 18:57:41 localhost kernel: [ 2453.830854] 483457 pages shared Sep 6 18:57:41 localhost kernel: [ 2453.830855] 808991 pages non-shared Sep 6 18:57:41 localhost kernel: [ 2453.832908] ------------[ cut here ]------------ Sep 6 18:57:41 localhost kernel: [ 2453.832911] kernel BUG at fs/btrfs/inode.c:2136! Sep 6 18:57:41 localhost kernel: [ 2453.832913] invalid opcode: 0000 [#1] SMP Sep 6 18:57:41 localhost kernel: [ 2453.832916] CPU: 0 PID: 5149 Comm: btrfs-endio-wri Tainted: G W 3.10.0+ #1 Sep 6 18:57:41 localhost kernel: [ 2453.832917] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 Sep 6 18:57:41 localhost kernel: [ 2453.832918] task: ffff8800d61c4f40 ti: ffff880100e92000 task.ti: ffff880100e92000 Sep 6 18:57:41 localhost kernel: [ 2453.832919] RIP: 0010:[<ffffffff8167e640>] [<ffffffff8167e640>] record_extent_backrefs+0xd0/0xe0 Sep 6 18:57:41 localhost kernel: [ 2453.832925] RSP: 0018:ffff880100e93d00 EFLAGS: 00010202 Sep 6 18:57:41 localhost kernel: [ 2453.832926] RAX: 0000000000000001 RBX: ffff88011744c000 RCX: 00000000003566c8 Sep 6 18:57:41 localhost kernel: [ 2453.832927] RDX: 00000000003566c7 RSI: 000000000001afb0 RDI: ffff880116c33f00 Sep 6 18:57:41 localhost kernel: [ 2453.832928] RBP: ffff880100e93d38 R08: 000000000001afb0 R09: ffff88011fc1afb0 Sep 6 18:57:41 localhost kernel: [ 2453.832929] R10: ffffffff8167da4c R11: ffffea00040b5180 R12: ffff8800c8011128 Sep 6 18:57:41 localhost kernel: [ 2453.832930] R13: ffff880102ce5240 R14: ffff8800c8011128 R15: ffff88011744c000 Sep 6 18:57:41 localhost kernel: [ 2453.832931] FS: 0000000000000000(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000 Sep 6 18:57:41 localhost kernel: [ 2453.832932] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b Sep 6 18:57:41 localhost kernel: [ 2453.832933] CR2: 00007fc1682ea3a0 CR3: 0000000117451000 CR4: 00000000000006f0 Sep 6 18:57:41 localhost kernel: [ 2453.832939] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 Sep 6 18:57:41 localhost kernel: [ 2453.832940] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 Sep 6 18:57:41 localhost kernel: [ 2453.832941] Stack: Sep 6 18:57:41 localhost kernel: [ 2453.832941] ffff8800c8011120 ffff88010f7a4000 ffff8800aea85400 ffff880102d46140 Sep 6 18:57:41 localhost kernel: [ 2453.832943] ffff880102ce5240 ffff8800c8011120 ffff8800aa53bd60 ffff880100e93e18 Sep 6 18:57:41 localhost kernel: [ 2453.832945] ffffffff816889ff 0000000000008000 0000000000000002 ffff880100000001 Sep 6 18:57:41 localhost kernel: [ 2453.832946] Call Trace: Sep 6 18:57:41 localhost kernel: [ 2453.832950] [<ffffffff816889ff>] btrfs_finish_ordered_io+0x1df/0xab0 Sep 6 18:57:41 localhost kernel: [ 2453.832955] [<ffffffff8376b500>] ? schedule_timeout+0xf0/0x2c0 Sep 6 18:57:41 localhost kernel: [ 2453.832957] [<ffffffff816894e5>] finish_ordered_fn+0x15/0x20 Sep 6 18:57:41 localhost kernel: [ 2453.832960] [<ffffffff816a9dbb>] worker_loop+0x12b/0x500 Sep 6 18:57:41 localhost kernel: [ 2453.832961] [<ffffffff816a9c90>] ? btrfs_queue_worker+0x300/0x300 Sep 6 18:57:41 localhost kernel: [ 2453.832974] [<ffffffff81126110>] kthread+0xc0/0xd0 Sep 6 18:57:41 localhost kernel: [ 2453.833225] [<ffffffff81126050>] ? insert_kthread_work+0x40/0x40 Sep 6 18:57:41 localhost kernel: [ 2453.833228] [<ffffffff8376f6ac>] ret_from_fork+0x7c/0xb0 Sep 6 18:57:41 localhost kernel: [ 2453.833230] [<ffffffff81126050>] ? insert_kthread_work+0x40/0x40 Sep 6 18:57:41 localhost kernel: [ 2453.833231] Code: 75 b4 48 8b 45 c8 48 8b 40 08 49 39 c4 0f 95 c0 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 4c 89 e0 eb e6 c1 e8 1f 84 c0 74 b4 <0f> 0b 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 Sep 6 18:57:41 localhost kernel: [ 2453.833246] RIP [<ffffffff8167e640>] record_extent_backrefs+0xd0/0xe0 Sep 6 18:57:41 localhost kernel: [ 2453.833248] RSP <ffff880100e93d00> Sep 6 18:57:41 localhost kernel: [ 2453.833250] ---[ end trace ef2a8b050fb82f98 ]---
Those warnings are coming from krealloc() which I killed with my most recent patch. Please make sure you have applied my patch and rebuilt and installed the new module and re-test, it looks like you are still using the old one.
Created attachment 110071 [details] Syslog Errors with autodefrag I think I'm also hitting this issue on a 3.10.12 kernel that has the current btrfs patches back-ported through the latest for-linus branch (for the 3.12_rc kernel). I've applied the patch in attachment 107408 [details], and this has cleared the errors and warnings, but I am still getting Oom errors. I was able to get one series of Alt-SysReq-W before the Oom-killer began. I've attached the output of the Alt-SysReq-W as well as the top few Oom messages. I am using autodefrag, and compress-force=lzo in conjunction with a snapshot test.
This is a semi-automated bugzilla cleanup, report is against an old kernel version. If the problem still happens, please open a new bug. Thanks.