Bug 60579 - Warning and OOMs in btrfs_iget when defragmenting files
Summary: Warning and OOMs in btrfs_iget when defragmenting files
Status: RESOLVED OBSOLETE
Alias: None
Product: File System
Classification: Unclassified
Component: btrfs (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: Josef Bacik
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-07-18 17:36 UTC by Clemens Eisserer
Modified: 2022-09-30 15:01 UTC (History)
2 users (show)

See Also:
Kernel Version: 3.11rc1
Subsystem:
Regression: No
Bisected commit-id:


Attachments
syslog of OOMs (417.04 KB, text/plain)
2013-07-23 13:08 UTC, Clemens Eisserer
Details
content of /proc/slabinfo sampled periodically (28.67 KB, application/octet-stream)
2013-07-23 19:48 UTC, Clemens Eisserer
Details
BUG_ONs output with btrfs-next (2013-08-03) (21.84 KB, text/plain)
2013-08-04 16:37 UTC, Clemens Eisserer
Details
debug patch (1.17 KB, patch)
2013-08-05 15:00 UTC, Josef Bacik
Details | Diff
BUG ONs with btrfs-next + debug-patch (186.38 KB, text/plain)
2013-08-05 15:43 UTC, Clemens Eisserer
Details
debug patch (1009 bytes, patch)
2013-08-07 15:18 UTC, Josef Bacik
Details | Diff
syslog output with latest patch (184.91 KB, text/plain)
2013-08-07 15:53 UTC, Clemens Eisserer
Details
btrfsck log (55.37 KB, text/plain)
2013-08-08 18:00 UTC, Clemens Eisserer
Details
possible fix (6.53 KB, patch)
2013-08-09 21:33 UTC, Josef Bacik
Details | Diff
possible fix (6.97 KB, patch)
2013-09-03 20:24 UTC, Josef Bacik
Details | Diff
Syslog Errors with autodefrag (68.16 KB, text/plain)
2013-09-30 17:30 UTC, Mitch Harder
Details

Description Clemens Eisserer 2013-07-18 17:36:37 UTC
When I defragment my system using the command below, the system starts to react slow and I get warnings in syslog. After some time the OOM-Killer kicks in and kills unrelated processes.

find /usr/ -type d -atime -2 -o -type f | while read file; do /usr/sbin/btrfs filesystem  defrag -f -v "${file}"; done

[   90.357859] ------------[ cut here ]------------
[   90.357891] WARNING: CPU: 3 PID: 673 at fs/btrfs/inode.c:4741 inode_tree_add+0x154/0x160 [btrfs]()
[   90.357893] Modules linked in: vfat fat rfcomm nf_conntrack_netbios_ns nf_conntrack_broadcast ipt_MASQUERADE ip6table_nat nf_nat_ipv6 ip6table_mangle ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 iptable_nat nf_nat_ipv4 nf_nat iptable_mangle bnep bluetooth nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ebtable_filter ebtables ip6table_filter ip6_tables mmc_block snd_hda_codec_hdmi snd_hda_codec_idt cdc_mbim dell_wmi iTCO_wdt sparse_keymap uvcvideo iTCO_vendor_support videobuf2_vmalloc videobuf2_memops videobuf2_core ppdev videodev cdc_ncm usbnet media mii cdc_acm cdc_wdm snd_hda_intel snd_hda_codec snd_hwdep dell_laptop snd_seq dcdbas snd_seq_device snd_pcm i2c_i801 sdhci_pci sdhci iwlwifi e1000e mmc_core snd_page_alloc lpc_ich snd_timer cfg80211 mfd_core snd soundcore ptp rfkill pps_core
[   90.357926]  wmi parport_pc parport mperf uinput binfmt_misc btrfs libcrc32c xor zlib_deflate raid6_pq dm_crypt i915 i2c_algo_bit drm_kms_helper drm i2c_core video
[   90.357939] CPU: 3 PID: 673 Comm: btrfs-endio-wri Not tainted 3.11.0-0.rc1.git1.2.fc20.x86_64 #1
[   90.357941] Hardware name: Dell Inc. Latitude E6320/050F4T, BIOS A08 10/18/2011
[   90.357943]  0000000000000009 ffff880215aa1b70 ffffffff8164db74 0000000000000000
[   90.357946]  ffff880215aa1ba8 ffffffff810670bd ffff8800699c9560 ffff88020b33a800
[   90.357948]  ffff88020b33aca8 ffff8800699c15f8 00000000003ac817 ffff880215aa1bb8
[   90.357951] Call Trace:
[   90.357958]  [<ffffffff8164db74>] dump_stack+0x45/0x56
[   90.357962]  [<ffffffff810670bd>] warn_slowpath_common+0x7d/0xa0
[   90.357965]  [<ffffffff8106719a>] warn_slowpath_null+0x1a/0x20
[   90.357976]  [<ffffffffa0173a84>] inode_tree_add+0x154/0x160 [btrfs]
[   90.357989]  [<ffffffffa017cb7f>] btrfs_iget+0x11f/0x540 [btrfs]
[   90.358000]  [<ffffffffa017d50c>] relink_extent_backref+0xcc/0x790 [btrfs]
[   90.358004]  [<ffffffff8118d45e>] ? kfree+0x13e/0x180
[   90.358014]  [<ffffffffa017ddc9>] btrfs_finish_ordered_io+0x1f9/0x960 [btrfs]
[   90.358025]  [<ffffffffa017e715>] finish_ordered_fn+0x15/0x20 [btrfs]
[   90.358037]  [<ffffffffa019e4eb>] worker_loop+0x12b/0x500 [btrfs]
[   90.358049]  [<ffffffffa019e3c0>] ? btrfs_queue_worker+0x300/0x300 [btrfs]
[   90.358052]  [<ffffffff8108bd70>] kthread+0xc0/0xd0
[   90.358056]  [<ffffffff8108bcb0>] ? insert_kthread_work+0x40/0x40
[   90.358060]  [<ffffffff8165cdac>] ret_from_fork+0x7c/0xb0
[   90.358062]  [<ffffffff8108bcb0>] ? insert_kthread_work+0x40/0x40
[   90.358064] ---[ end trace 491ee6fab6a924f1 ]---
Comment 1 Josef Bacik 2013-07-19 20:17:45 UTC
What mount options, how much memory do you have in the system, and do you have snapshots on this fs?  I'll try and reproduce locally.
Comment 2 Clemens Eisserer 2013-07-19 20:22:15 UTC
mount options are: rw,noatime,compress-force=lzo,ssd,discard,space_cache

I am running a script which snapshots every 60min, so there there are quite a few.

I could try to prepare an filesystem-image with all personal data deleted if that would help?
Comment 3 Josef Bacik 2013-07-22 17:09:49 UTC
Can you try btrfs-next, I have some backref walking patches that will likely fix this.
Comment 4 Clemens Eisserer 2013-07-23 13:08:14 UTC
Created attachment 106997 [details]
syslog of OOMs
Comment 5 Clemens Eisserer 2013-07-23 13:08:24 UTC
while btrfs-next fixed the warnings, I still get OOMs while executing the defrag script which bring down the whole machine.

If you whish, I could send you a private copy of the partition where this happens.
Comment 6 Josef Bacik 2013-07-23 16:51:14 UTC
Actually a couple of snapshots of cat /proc/slabinfo during the the problem would be the most helpful.
Comment 7 Clemens Eisserer 2013-07-23 19:48:39 UTC
Created attachment 107000 [details]
content of /proc/slabinfo sampled periodically
Comment 8 Clemens Eisserer 2013-07-26 18:27:09 UTC
were the slabinfos helpful?
Comment 9 Josef Bacik 2013-07-29 13:32:30 UTC
So it looks like you may be leaking btrfs_extent_state's, which isn't super helpful since we use that for locking and such.  Is there a way you could unmount and rmmod the btrfs module after you hit these issues and see if slab complains about leaked entries?  In the meantime I'll dig around and see if I can spot anything obvious.
Comment 10 Josef Bacik 2013-07-29 13:35:18 UTC
Also do you have any files set with NODATACOW?
Comment 11 Clemens Eisserer 2013-07-29 16:17:06 UTC
The unounting and rmmod will take some time as I am on vacation with very poor internet connection.

As far as I know I don't have any files with nodatacow...
Comment 12 Josef Bacik 2013-07-29 17:25:27 UTC
Ok so I *think* we are leaking EXTENT_DEFRAG states, so I posted a couple of patches to fix this and pushed them to btrfs-next, could you please give them a whirl when you get back from vacation :).
Comment 13 Clemens Eisserer 2013-08-04 16:37:52 UTC
Created attachment 107091 [details]
BUG_ONs output with btrfs-next (2013-08-03)
Comment 14 Clemens Eisserer 2013-08-04 16:38:58 UTC
With btrfs-next defragging works far better, however I still get the BUG_ONs listed in attachment 107091 [details]
Comment 15 Josef Bacik 2013-08-05 15:00:18 UTC
Created attachment 107098 [details]
debug patch

Ok I think we're just making too large of an allocation when we don't need to, can you give this patch a whirl and let me know how it goes?  If it still blows up then there will be some debug info printed out so make sure to capture it.  Thanks.
Comment 16 Clemens Eisserer 2013-08-05 15:43:23 UTC
Created attachment 107099 [details]
BUG ONs with btrfs-next + debug-patch
Comment 17 Josef Bacik 2013-08-07 15:18:44 UTC
Created attachment 107138 [details]
debug patch

Ugh sorry I'm such a bozo I attached the wrong damned debug patch, can you remove the previous patch and apply this one instead and reproduce?  Sorry about that.
Comment 18 Clemens Eisserer 2013-08-07 15:53:24 UTC
Created attachment 107139 [details]
syslog output with latest patch

unfortunately, still no luck with the new patch
Comment 19 Josef Bacik 2013-08-08 17:27:09 UTC
Actually it did spit what I needed out so that was perfect, it does seem like we are just wanting to make that large of an allocation, so I need to check and see if we really should be keeping track of this many things or if we need to make our allocations differently.  Can you run btrfsck on this fs (no options) just to verify theres not some sort of weird extent tree corruption that's causing the backref walking stuff to blow up?  In the meantime I will be looking at how to reduce the allocations and hopefully have a patch for you in a day or so.
Comment 20 Clemens Eisserer 2013-08-08 18:00:57 UTC
Created attachment 107158 [details]
btrfsck log

seems there is some corruption of the checksums
Comment 21 Josef Bacik 2013-08-09 21:33:40 UTC
Created attachment 107169 [details]
possible fix

Ok this changes how we allocate nodes for the ulist and hopefully will fix your problem.  Now I've tested this as best I can, but you are already an extreme case so this may panic, and if it does just capture the output so I can fix my patch.
Comment 22 Josef Bacik 2013-08-12 19:49:58 UTC
Hold on, zach has found a fatal flaw in this plan, I need to fix it up a bit.
Comment 23 Josef Bacik 2013-09-03 20:24:58 UTC
Created attachment 107408 [details]
possible fix

Sorry it took me so long to get back to this, please give this a shot and let me know if you still have issues.
Comment 24 Clemens Eisserer 2013-09-06 17:12:54 UTC
On a quite full fs I got the following stack trace.
I'll trying to increase free space and report back...


Sep  6 18:57:41 localhost kernel: [ 2453.818749] btrfs-endio-wri: page allocation failure: order:4, mode:0x104050
Sep  6 18:57:41 localhost kernel: [ 2453.818753] CPU: 0 PID: 5149 Comm: btrfs-endio-wri Tainted: G        W    3.10.0+ #1
Sep  6 18:57:41 localhost kernel: [ 2453.818754] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
Sep  6 18:57:41 localhost kernel: [ 2453.818755]  0000000000000000 ffff880100e937a0 ffffffff83746003 ffff880100e93828
Sep  6 18:57:41 localhost kernel: [ 2453.818757]  ffffffff811cff80 ffff88011ffe9b00 0000000000000004 0000000000104050
Sep  6 18:57:41 localhost kernel: [ 2453.818759]  ffff880100e93828 ffffffff8373e68b 0010405000000010 ffff88011ffe8000
Sep  6 18:57:41 localhost kernel: [ 2453.818761] Call Trace:
Sep  6 18:57:41 localhost kernel: [ 2453.818777]  [<ffffffff83746003>] dump_stack+0x19/0x1b
Sep  6 18:57:41 localhost kernel: [ 2453.818781]  [<ffffffff811cff80>] warn_alloc_failed+0xf0/0x160
Sep  6 18:57:41 localhost kernel: [ 2453.818783]  [<ffffffff8373e68b>] ? __alloc_pages_direct_compact+0xac/0x196
Sep  6 18:57:41 localhost kernel: [ 2453.818785]  [<ffffffff811d3fc1>] __alloc_pages_nodemask+0x811/0xa10
Sep  6 18:57:41 localhost kernel: [ 2453.818788]  [<ffffffff81210859>] alloc_pages_current+0xa9/0x170
Sep  6 18:57:41 localhost kernel: [ 2453.818790]  [<ffffffff811d00f4>] __get_free_pages+0x14/0x50
Sep  6 18:57:41 localhost kernel: [ 2453.818792]  [<ffffffff81218f6e>] kmalloc_order_trace+0x2e/0xa0
Sep  6 18:57:41 localhost kernel: [ 2453.818795]  [<ffffffff816d61a1>] ? ulist_free+0x31/0x50
Sep  6 18:57:41 localhost kernel: [ 2453.818797]  [<ffffffff8121bc61>] __kmalloc_track_caller+0x221/0x240
Sep  6 18:57:41 localhost kernel: [ 2453.818799]  [<ffffffff816d638c>] ? ulist_add_merge+0x1cc/0x290
Sep  6 18:57:41 localhost kernel: [ 2453.818801]  [<ffffffff811e52df>] krealloc+0x4f/0xa0
Sep  6 18:57:41 localhost kernel: [ 2453.818803]  [<ffffffff816d638c>] ulist_add_merge+0x1cc/0x290
Sep  6 18:57:41 localhost kernel: [ 2453.818805]  [<ffffffff816d4a14>] find_parent_nodes+0x8d4/0xd50
Sep  6 18:57:41 localhost kernel: [ 2453.818807]  [<ffffffff816d4f49>] btrfs_find_all_roots+0x99/0x100
Sep  6 18:57:41 localhost kernel: [ 2453.818810]  [<ffffffff816800a0>] ? btrfs_submit_bio_hook+0x1e0/0x1e0
Sep  6 18:57:41 localhost kernel: [ 2453.818812]  [<ffffffff816d58cf>] iterate_extent_inodes+0x14f/0x320
Sep  6 18:57:41 localhost kernel: [ 2453.818814]  [<ffffffff816800a0>] ? btrfs_submit_bio_hook+0x1e0/0x1e0
Sep  6 18:57:41 localhost kernel: [ 2453.818816]  [<ffffffff816d5b27>] iterate_inodes_from_logical+0x87/0xa0
Sep  6 18:57:41 localhost kernel: [ 2453.818818]  [<ffffffff8167e5ef>] record_extent_backrefs+0x7f/0xe0
Sep  6 18:57:41 localhost kernel: [ 2453.818820]  [<ffffffff816889ff>] btrfs_finish_ordered_io+0x1df/0xab0
Sep  6 18:57:41 localhost kernel: [ 2453.818823]  [<ffffffff8376b500>] ? schedule_timeout+0xf0/0x2c0
Sep  6 18:57:41 localhost kernel: [ 2453.818825]  [<ffffffff816894e5>] finish_ordered_fn+0x15/0x20
Sep  6 18:57:41 localhost kernel: [ 2453.818828]  [<ffffffff816a9dbb>] worker_loop+0x12b/0x500
Sep  6 18:57:41 localhost kernel: [ 2453.818836]  [<ffffffff816a9c90>] ? btrfs_queue_worker+0x300/0x300
Sep  6 18:57:41 localhost kernel: [ 2453.818840]  [<ffffffff81126110>] kthread+0xc0/0xd0
Sep  6 18:57:41 localhost kernel: [ 2453.818842]  [<ffffffff81126050>] ? insert_kthread_work+0x40/0x40
Sep  6 18:57:41 localhost kernel: [ 2453.818844]  [<ffffffff8376f6ac>] ret_from_fork+0x7c/0xb0
Sep  6 18:57:41 localhost kernel: [ 2453.818846]  [<ffffffff81126050>] ? insert_kthread_work+0x40/0x40
Sep  6 18:57:41 localhost kernel: [ 2453.818847] Mem-Info:
Sep  6 18:57:41 localhost kernel: [ 2453.818848] Node 0 DMA per-cpu:
Sep  6 18:57:41 localhost kernel: [ 2453.818850] CPU    0: hi:    0, btch:   1 usd:   0
Sep  6 18:57:41 localhost kernel: [ 2453.818850] Node 0 DMA32 per-cpu:
Sep  6 18:57:41 localhost kernel: [ 2453.818852] CPU    0: hi:  186, btch:  31 usd:   0
Sep  6 18:57:41 localhost kernel: [ 2453.818852] Node 0 Normal per-cpu:
Sep  6 18:57:41 localhost kernel: [ 2453.818854] CPU    0: hi:  186, btch:  31 usd:   0
Sep  6 18:57:41 localhost kernel: [ 2453.818856] active_anon:60702 inactive_anon:1654 isolated_anon:0
Sep  6 18:57:41 localhost kernel: [ 2453.818856]  active_file:80430 inactive_file:103409 isolated_file:0
Sep  6 18:57:41 localhost kernel: [ 2453.818856]  unevictable:8 dirty:2248 writeback:0 unstable:0
Sep  6 18:57:41 localhost kernel: [ 2453.818856]  free:41518 slab_reclaimable:22460 slab_unreclaimable:12928
Sep  6 18:57:41 localhost kernel: [ 2453.818856]  mapped:4617 shmem:2294 pagetables:4299 bounce:0
Sep  6 18:57:41 localhost kernel: [ 2453.818856]  free_cma:0
Sep  6 18:57:41 localhost kernel: [ 2453.818859] Node 0 DMA free:15808kB min:268kB low:332kB high:400kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:0kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15992kB managed:15908kB mlocked:0kB dirty:0kB writeback:0kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:4kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? yes
Sep  6 18:57:41 localhost kernel: [ 2453.818862] lowmem_reserve[]: 0 3430 3877 3877
Sep  6 18:57:41 localhost kernel: [ 2453.818864] Node 0 DMA32 free:138196kB min:59548kB low:74432kB high:89320kB active_anon:128596kB inactive_anon:5912kB active_file:316372kB inactive_file:408068kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3653568kB managed:3512740kB mlocked:0kB dirty:892kB writeback:0kB mapped:15672kB shmem:5944kB slab_reclaimable:66028kB slab_unreclaimable:7912kB kernel_stack:656kB pagetables:6756kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Sep  6 18:57:41 localhost kernel: [ 2453.818867] lowmem_reserve[]: 0 0 447 447
Sep  6 18:57:41 localhost kernel: [ 2453.818869] Node 0 Normal free:12068kB min:7764kB low:9704kB high:11644kB active_anon:114212kB inactive_anon:704kB active_file:5348kB inactive_file:5568kB unevictable:32kB isolated(anon):0kB isolated(file):0kB present:524288kB managed:458000kB mlocked:32kB dirty:8100kB writeback:0kB mapped:2796kB shmem:3232kB slab_reclaimable:23812kB slab_unreclaimable:43796kB kernel_stack:2000kB pagetables:10440kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
Sep  6 18:57:41 localhost kernel: [ 2453.818872] lowmem_reserve[]: 0 0 0 0
Sep  6 18:57:41 localhost kernel: [ 2453.818874] Node 0 DMA: 0*4kB 0*8kB 0*16kB 0*32kB 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (R) 3*4096kB (M) = 15808kB
Sep  6 18:57:41 localhost kernel: [ 2453.818880] Node 0 DMA32: 16537*4kB (UEM) 6001*8kB (UEM) 1216*16kB (UEM) 16*32kB (UEM) 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 1*4096kB (R) = 138220kB
Sep  6 18:57:41 localhost kernel: [ 2453.818885] Node 0 Normal: 2327*4kB (UEM) 87*8kB (M) 2*16kB (M) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB (R) 0*4096kB = 12084kB
Sep  6 18:57:41 localhost kernel: [ 2453.818891] Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Sep  6 18:57:41 localhost kernel: [ 2453.818892] 186155 total pagecache pages
Sep  6 18:57:41 localhost kernel: [ 2453.818893] 0 pages in swap cache
Sep  6 18:57:41 localhost kernel: [ 2453.818894] Swap cache stats: add 0, delete 0, find 0/0
Sep  6 18:57:41 localhost kernel: [ 2453.818895] Free swap  = 0kB
Sep  6 18:57:41 localhost kernel: [ 2453.818895] Total swap = 0kB
Sep  6 18:57:41 localhost kernel: [ 2453.830851] 1048575 pages RAM
Sep  6 18:57:41 localhost kernel: [ 2453.830853] 48865 pages reserved
Sep  6 18:57:41 localhost kernel: [ 2453.830854] 483457 pages shared
Sep  6 18:57:41 localhost kernel: [ 2453.830855] 808991 pages non-shared
Sep  6 18:57:41 localhost kernel: [ 2453.832908] ------------[ cut here ]------------
Sep  6 18:57:41 localhost kernel: [ 2453.832911] kernel BUG at fs/btrfs/inode.c:2136!
Sep  6 18:57:41 localhost kernel: [ 2453.832913] invalid opcode: 0000 [#1] SMP 
Sep  6 18:57:41 localhost kernel: [ 2453.832916] CPU: 0 PID: 5149 Comm: btrfs-endio-wri Tainted: G        W    3.10.0+ #1
Sep  6 18:57:41 localhost kernel: [ 2453.832917] Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
Sep  6 18:57:41 localhost kernel: [ 2453.832918] task: ffff8800d61c4f40 ti: ffff880100e92000 task.ti: ffff880100e92000
Sep  6 18:57:41 localhost kernel: [ 2453.832919] RIP: 0010:[<ffffffff8167e640>]  [<ffffffff8167e640>] record_extent_backrefs+0xd0/0xe0
Sep  6 18:57:41 localhost kernel: [ 2453.832925] RSP: 0018:ffff880100e93d00  EFLAGS: 00010202
Sep  6 18:57:41 localhost kernel: [ 2453.832926] RAX: 0000000000000001 RBX: ffff88011744c000 RCX: 00000000003566c8
Sep  6 18:57:41 localhost kernel: [ 2453.832927] RDX: 00000000003566c7 RSI: 000000000001afb0 RDI: ffff880116c33f00
Sep  6 18:57:41 localhost kernel: [ 2453.832928] RBP: ffff880100e93d38 R08: 000000000001afb0 R09: ffff88011fc1afb0
Sep  6 18:57:41 localhost kernel: [ 2453.832929] R10: ffffffff8167da4c R11: ffffea00040b5180 R12: ffff8800c8011128
Sep  6 18:57:41 localhost kernel: [ 2453.832930] R13: ffff880102ce5240 R14: ffff8800c8011128 R15: ffff88011744c000
Sep  6 18:57:41 localhost kernel: [ 2453.832931] FS:  0000000000000000(0000) GS:ffff88011fc00000(0000) knlGS:0000000000000000
Sep  6 18:57:41 localhost kernel: [ 2453.832932] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
Sep  6 18:57:41 localhost kernel: [ 2453.832933] CR2: 00007fc1682ea3a0 CR3: 0000000117451000 CR4: 00000000000006f0
Sep  6 18:57:41 localhost kernel: [ 2453.832939] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Sep  6 18:57:41 localhost kernel: [ 2453.832940] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Sep  6 18:57:41 localhost kernel: [ 2453.832941] Stack:
Sep  6 18:57:41 localhost kernel: [ 2453.832941]  ffff8800c8011120 ffff88010f7a4000 ffff8800aea85400 ffff880102d46140
Sep  6 18:57:41 localhost kernel: [ 2453.832943]  ffff880102ce5240 ffff8800c8011120 ffff8800aa53bd60 ffff880100e93e18
Sep  6 18:57:41 localhost kernel: [ 2453.832945]  ffffffff816889ff 0000000000008000 0000000000000002 ffff880100000001
Sep  6 18:57:41 localhost kernel: [ 2453.832946] Call Trace:
Sep  6 18:57:41 localhost kernel: [ 2453.832950]  [<ffffffff816889ff>] btrfs_finish_ordered_io+0x1df/0xab0
Sep  6 18:57:41 localhost kernel: [ 2453.832955]  [<ffffffff8376b500>] ? schedule_timeout+0xf0/0x2c0
Sep  6 18:57:41 localhost kernel: [ 2453.832957]  [<ffffffff816894e5>] finish_ordered_fn+0x15/0x20
Sep  6 18:57:41 localhost kernel: [ 2453.832960]  [<ffffffff816a9dbb>] worker_loop+0x12b/0x500
Sep  6 18:57:41 localhost kernel: [ 2453.832961]  [<ffffffff816a9c90>] ? btrfs_queue_worker+0x300/0x300
Sep  6 18:57:41 localhost kernel: [ 2453.832974]  [<ffffffff81126110>] kthread+0xc0/0xd0
Sep  6 18:57:41 localhost kernel: [ 2453.833225]  [<ffffffff81126050>] ? insert_kthread_work+0x40/0x40
Sep  6 18:57:41 localhost kernel: [ 2453.833228]  [<ffffffff8376f6ac>] ret_from_fork+0x7c/0xb0
Sep  6 18:57:41 localhost kernel: [ 2453.833230]  [<ffffffff81126050>] ? insert_kthread_work+0x40/0x40
Sep  6 18:57:41 localhost kernel: [ 2453.833231] Code: 75 b4 48 8b 45 c8 48 8b 40 08 49 39 c4 0f 95 c0 48 83 c4 10 5b 41 5c 41 5d 41 5e 41 5f 5d c3 4c 89 e0 eb e6 c1 e8 1f 84 c0 74 b4 <0f> 0b 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 
Sep  6 18:57:41 localhost kernel: [ 2453.833246] RIP  [<ffffffff8167e640>] record_extent_backrefs+0xd0/0xe0
Sep  6 18:57:41 localhost kernel: [ 2453.833248]  RSP <ffff880100e93d00>
Sep  6 18:57:41 localhost kernel: [ 2453.833250] ---[ end trace ef2a8b050fb82f98 ]---
Comment 25 Josef Bacik 2013-09-09 18:16:09 UTC
Those warnings are coming from krealloc() which I killed with my most recent patch.  Please make sure you have applied my patch and rebuilt and installed the new module and re-test, it looks like you are still using the old one.
Comment 26 Mitch Harder 2013-09-30 17:30:22 UTC
Created attachment 110071 [details]
Syslog Errors with autodefrag

I think I'm also hitting this issue on a 3.10.12 kernel that has the current btrfs patches back-ported through the latest for-linus branch (for the 3.12_rc kernel).

I've applied the patch in attachment 107408 [details], and this has cleared the errors and warnings, but I am still getting Oom errors.

I was able to get one series of Alt-SysReq-W before the Oom-killer began.

I've attached the output of the Alt-SysReq-W as well as the top few Oom messages.

I am using autodefrag, and compress-force=lzo in conjunction with a snapshot test.
Comment 27 David Sterba 2022-09-30 15:01:29 UTC
This is a semi-automated bugzilla cleanup, report is against an old kernel version. If the problem still happens, please open a new bug. Thanks.

Note You need to log in before you can comment on or make changes to this bug.