Bug 155791

Summary: progs 4.7.x, numerous incorrect backrefs are not fixed with --repair
Product: File System Reporter: Chris Murphy (bugzilla)
Component: btrfsAssignee: Josef Bacik (josef)
Status: RESOLVED OBSOLETE    
Severity: normal CC: andreas.reis, calestyo, dsterba, mark, ojab, wqu
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 4.7.2-200.fc24.x86_64 Subsystem:
Regression: No Bisected commit-id:
Attachments: btrfs check, before repair
btrfs check --repair
btrfs check, after repair
btrfs show super, before repair
"Incorrect local backref" with 4.7.2 check --repair
btrfs check repair 4.7.2

Description Chris Murphy 2016-09-01 23:22:11 UTC
btrfs-progs-4.7.1-1.fc26.x86_64

Two Btrfs volumes have no problems mounting or scrubbing or with btrfs check with btrfs-progs 4.6.1. However both btrfs-progs 4.7 and 4.7.1 report numerous, various, incorrect backref problems, and --repair claims to fix them, but a subsequent check shows most are not fixed.
Comment 1 Chris Murphy 2016-09-01 23:26:11 UTC
Created attachment 231741 [details]
btrfs check, before repair

--repair not run on this volume
Comment 2 Chris Murphy 2016-09-01 23:27:00 UTC
Created attachment 231751 [details]
btrfs check --repair

This is the entire output from --repair.
Comment 3 Chris Murphy 2016-09-01 23:28:52 UTC
Created attachment 231761 [details]
btrfs check, after repair

This is actually after --repair, followed by a mount (which results in no kernel messages at mount time but the file system has not really been exercised since repair).
Comment 4 Chris Murphy 2016-09-01 23:29:37 UTC
Created attachment 231771 [details]
btrfs show super, before repair

btrfs-show-super -fa, before doing --repair
Comment 5 Chris Murphy 2016-09-01 23:30:44 UTC
btrfs-debug-tree output to a file then gzipped, is 157MiB.
https://drive.google.com/open?id=0B_2Asp8DGjJ9ekJDeGxoTE5sbFk
Comment 6 Chris Murphy 2016-09-01 23:35:25 UTC
OK disregard comment 5! Sorry, that's a totally different debug tree.

btrfs-debug tree output to a file, gzipped, is 157MiB
https://drive.google.com/open?id=0B_2Asp8DGjJ9S2JIQzB6c1d3R3c

btrfs-image -c9 -t4 -s, is 445MiB
https://drive.google.com/open?id=0B_2Asp8DGjJ9WFJFVzh3Nkx1c3M

Note that both of these files are taken before --repair was run.
Comment 7 Chris Murphy 2016-09-01 23:50:51 UTC
linux-btrfs@ thread
http://www.spinics.net/lists/linux-btrfs/msg58142.html

One user has reported after --repair trying to fix this problem, subsequent check crashes with assertion. Uncertain if the file system is broken.
Comment 8 Mark Fasheh 2016-09-02 15:57:09 UTC
Hi, this looks like a regression in btrfsck. There's a similar issue filed in the duperemove tracker:

https://github.com/markfasheh/duperemove/issues/141
Comment 9 David Sterba 2016-09-03 19:09:24 UTC
Mark, https://github.com/markfasheh/btrfs-progs-patches/commit/0a9093c17e26430d4ba1da0d131fa23d95d81f1f looks good, the old code did not compare found_ref. Can you please submit the patch if it's the final version? I'm going to release 4.7.2 afterwards.
Comment 10 Mark Fasheh 2016-09-04 03:17:20 UTC
(In reply to David Sterba from comment #9)
> Mark,
> https://github.com/markfasheh/btrfs-progs-patches/commit/
> 0a9093c17e26430d4ba1da0d131fa23d95d81f1f looks good, the old code did not
> compare found_ref. Can you please submit the patch if it's the final
> version? I'm going to release 4.7.2 afterwards.

This is a more complete fixup of that portion of the code:

https://github.com/markfasheh/btrfs-progs-patches/commit/6df7f4e005c984f06a0f44d4407cbab5d40999cb

That's not a complete fix though we can build upon that patch and I strongly feel that some of the changes are necessary - we shouldn't be ignoring errors from rb_insert() for example. So if you like it, I'll gladly submit but I wouldn't base 4.7.2 on only that change.


Most of my testing by the way is to just run the duperemove smoke test:

https://github.com/markfasheh/duperemove-tests/blob/master/tests/smoke_test.sh

That'll reproduce it every time (even with my fix, though with fewer errors)
Comment 11 Mark Fasheh 2016-09-04 03:22:31 UTC
(In reply to Mark Fasheh from comment #10)
> Most of my testing by the way is to just run the duperemove smoke test:
> 
> https://github.com/markfasheh/duperemove-tests/blob/master/tests/smoke_test.
> sh
> 
> That'll reproduce it every time (even with my fix, though with fewer errors)

To be clear, I mean you reproduce by running the test then unmounting the filesystem and btrfscking it. This is not an actual disk corruption however that test seems to get our extents nicely set up to make btrfsck unhappy.
Comment 12 Qu Wenruo 2016-09-05 07:53:50 UTC
(In reply to Mark Fasheh from comment #10)
> (In reply to David Sterba from comment #9)
> > Mark,
> > https://github.com/markfasheh/btrfs-progs-patches/commit/
> > 0a9093c17e26430d4ba1da0d131fa23d95d81f1f looks good, the old code did not
> > compare found_ref. Can you please submit the patch if it's the final
> > version? I'm going to release 4.7.2 afterwards.
> 
> This is a more complete fixup of that portion of the code:
> 
> https://github.com/markfasheh/btrfs-progs-patches/commit/
> 6df7f4e005c984f06a0f44d4407cbab5d40999cb

The BUG_ON seems quite dangerous for me.
Maybe some fuzzed images can trigger the BUG_ON about disk_bytenr comparison.

IMHO We should not introduce new BUG_ON, as fuzz test has already beaten us quite hard now.
Comment 13 David Sterba 2016-09-05 10:03:12 UTC
I can reproduce the false warnings on one of my testing images (6df7f4e005c984f06a0f44d4407cbab5d40999cb). And I do not like to see newly added BUG_ON. So, I'm going to revert the offending patches and release 4.7.2.
Comment 14 Andreas Reis 2016-09-06 10:34:37 UTC
Created attachment 232251 [details]
"Incorrect local backref" with 4.7.2 check --repair

Dunno if it's the same bug since my log's a bit different, but I also keep getting a myriad of "Incorrect local backref" no matter how often I run

btrfs check --repair

The attached log is from 4.7.2, after the same command was already run a few times.

It's a root fs, and both kernel 4.7.2 and 4.8-rc5 have over the past days started to throw an error after a while which turns the fs to read-only.
Comment 15 Andreas Reis 2016-09-06 11:03:36 UTC
Here's the dmesg part from 4.8-rc5 kernel raising the error:

[ 2088.230765] BTRFS error (device dm-0): trying to drop 3 refs but we only have 2 for bytenr 12865773568
[ 2088.230772] ------------[ cut here ]------------
[ 2088.230780] WARNING: CPU: 1 PID: 3258 at fs/btrfs/extent-tree.c:7012 __btrfs_free_extent.isra.33+0xb4d/0xca0
[ 2088.230781] BTRFS: Transaction aborted (error -22)
[ 2088.230782] Modules linked in:
[ 2088.230805] CPU: 1 PID: 3258 Comm: kworker/u8:9 Tainted: G     U    IO    4.8.0-1-ck #1
[ 2088.230807] Hardware name:
[ 2088.230813] Workqueue: btrfs-extent-refs btrfs_extent_refs_helper
[ 2088.230816]  0000000000000286 000000003f90215d ffff880173b4fa88 ffffffff8144c970
[ 2088.230821]  ffff880173b4fad8 0000000000000000 ffff880173b4fac8 ffffffff810fa6fb
[ 2088.230825]  00001b6473b4faf0 00000002fedc2000 0000000000000000 ffff880210950d20
[ 2088.230830] Call Trace:
[ 2088.230837]  [<ffffffff8144c970>] dump_stack+0x63/0x83
[ 2088.230842]  [<ffffffff810fa6fb>] __warn+0xcb/0xf0
[ 2088.230846]  [<ffffffff810fa77f>] warn_slowpath_fmt+0x5f/0x80
[ 2088.230850]  [<ffffffff8133856d>] __btrfs_free_extent.isra.33+0xb4d/0xca0
[ 2088.230856]  [<ffffffff8133c620>] __btrfs_run_delayed_refs+0x5b0/0x12e0
[ 2088.230861]  [<ffffffff813403d1>] btrfs_run_delayed_refs+0xa1/0x2a0
[ 2088.230865]  [<ffffffff81340664>] delayed_ref_async_start+0x94/0xb0
[ 2088.230868]  [<ffffffff8138b8fd>] btrfs_scrubparity_helper+0x7d/0x340
[ 2088.230872]  [<ffffffff8138bbfe>] btrfs_extent_refs_helper+0xe/0x10
[ 2088.230877]  [<ffffffff81113d58>] process_one_work+0x1e8/0x480
[ 2088.230880]  [<ffffffff81114038>] worker_thread+0x48/0x4e0
[ 2088.230883]  [<ffffffff81113ff0>] ? process_one_work+0x480/0x480
[ 2088.230886]  [<ffffffff81113ff0>] ? process_one_work+0x480/0x480
[ 2088.230888]  [<ffffffff811199b8>] kthread+0xd8/0xf0
[ 2088.230892]  [<ffffffff81087772>] ? __switch_to+0x2d2/0x630
[ 2088.230896]  [<ffffffff81a9fc7f>] ret_from_fork+0x1f/0x40
[ 2088.230899]  [<ffffffff811198e0>] ? kthread_worker_fn+0x170/0x170
[ 2088.230900] ---[ end trace dbcac85c9567c22f ]---
[ 2088.230903] BTRFS: error (device dm-0) in __btrfs_free_extent:7012: errno=-22 unknown
[ 2088.230908] BTRFS info (device dm-0): forced readonly
[ 2088.230910] BTRFS: error (device dm-0) in btrfs_run_delayed_refs:2960: errno=-22 unknown
[ 2088.232663] pending csums is 17444864
Comment 16 Andreas Reis 2016-09-13 15:44:35 UTC
So am I just lucky enough to have a irreparably broken partition or …?
Comment 17 Chris Murphy 2016-09-13 21:02:11 UTC
Still a problem with btrfs-progs-4.7.2-1.fc26.x86_64 although this may be an artifact of the (bad?) repair from 4.7.1?

[chris@f24s ~]$ sudo btrfs check /dev/mapper/brick1 
Checking filesystem on /dev/mapper/brick1
UUID: 30f4724a-9a9f-4a5d-a37d-3e53876bf2e1
checking extents
ref mismatch on [284858793984 134217728] extent item 8, found 9
Incorrect local backref count on 284858793984 root 659 owner 188824 offset 422068224 found 3 wanted 2 back 0x55927e869750
backpointer mismatch on [284858793984 134217728]
ref mismatch on [285446881280 134217728] extent item 8, found 9
Incorrect local backref count on 285446881280 root 659 owner 188824 offset 1010155520 found 3 wanted 2 back 0x55927e86a710
backpointer mismatch on [285446881280 134217728]
ref mismatch on [285913120768 134217728] extent item 9, found 11
Incorrect local backref count on 285913120768 root 659 owner 188824 offset 1462910976 found 4 wanted 2 back 0x55927e86c1c0
backpointer mismatch on [285913120768 134217728]
ref mismatch on [286852644864 134217728] extent item 8, found 9
Incorrect local backref count on 286852644864 root 659 owner 188824 offset 2357133312 found 3 wanted 2 back 0x55927f2263a0
backpointer mismatch on [286852644864 134217728]
ref mismatch on [287165235200 134217728] extent item 14, found 15
Incorrect local backref count on 287165235200 root 659 owner 188824 offset 2669723648 found 3 wanted 2 back 0x55927f226ff0
backpointer mismatch on [287165235200 134217728]
ref mismatch on [287433670656 134217728] extent item 9, found 11
Incorrect local backref count on 287433670656 root 659 owner 188824 offset 2938159104 found 4 wanted 2 back 0x55927f2275b0
backpointer mismatch on [287433670656 134217728]
ref mismatch on [287792168960 134217728] extent item 10, found 11
Incorrect local backref count on 287792168960 root 659 owner 188824 offset 3212296192 found 3 wanted 2 back 0x55927f2287f0
backpointer mismatch on [287792168960 134217728]
ref mismatch on [287926386688 134217728] extent item 7, found 8
Incorrect local backref count on 287926386688 root 659 owner 188824 offset 3346513920 found 3 wanted 2 back 0x55927f228bd0
backpointer mismatch on [287926386688 134217728]
ref mismatch on [454773628928 8388608] extent item 6, found 8
Incorrect local backref count on 454773628928 root 659 owner 204210 offset 0 found 4 wanted 2 back 0x559280de9ac0
backpointer mismatch on [454773628928 8388608]
ref mismatch on [455505809408 8388608] extent item 79, found 165
Incorrect local backref count on 455505809408 root 659 owner 204212 offset 0 found 88 wanted 2 back 0x559280e1a190
backpointer mismatch on [455505809408 8388608]
ref mismatch on [455564529664 8388608] extent item 6, found 8
Incorrect local backref count on 455564529664 root 659 owner 204222 offset 0 found 4 wanted 2 back 0x559280e1ace0
backpointer mismatch on [455564529664 8388608]
checking free space cache
checking fs roots
checking csums
checking root refs
found 445291180032 bytes used err is 0
total csum bytes: 434113644
total tree bytes: 649756672
total fs tree bytes: 161316864
total extent tree bytes: 18366464
btree space waste bytes: 69497799
file data blocks allocated: 1054615560192
 referenced 1029668896768


I haven't tried --repair.
Comment 18 Chris Murphy 2016-09-13 21:29:11 UTC
Created attachment 233341 [details]
btrfs check repair 4.7.2

I have a btrfs-image of the unmodified file system in comment 17, let me know if this is desired for the image stash. I did a repair, followed up with check which comes up clean (both attached), and a subsequent mount comes up with no errors.
Comment 19 Mark Fasheh 2016-09-13 23:20:40 UTC
This hardly matters at this point, but that bytenr BUG_ON is exactly the type of hard logic check that btrfsck is missing.

Without fixing the underlying cause, which is poor error handling, we are inviting situations where an otherwise well meaning patch might accidentally cause corruption of users file systems when it does things that 'shouldn't happen' but we never check for, like inserting identical items into a tree.


In particular, the bytenr check is in the rbtree compare function which doesn't even get the opportunity to report errors. So at that point we're pretty well boned if we just silently continue. I would venture to guess that most users would prefer btrfsck to crash than silently corrupt if it encounters a situation such as I have described.
Comment 20 Mark Fasheh 2016-09-14 00:07:30 UTC
Btw, I should add that I understated the capability of btrfsck's error handling, it's generally pretty good though the patch (and area of code) in question is lacking in that area. That said, my point still stands - we can't let the dogma of 'No BUG_ONs' get in the way of common-sense logic checks to avoid situations that would be very bad for the user.
Comment 21 Christoph Anton Mitterer 2016-09-14 00:11:15 UTC
Confirming, that the newly released version no longer triggers the false positives on my fs.
thx! :)
Comment 22 Andreas Reis 2016-09-15 10:00:09 UTC
Sorry to be that guy *again*, but… do I understand this right, my btrfs FS is trashed and I need to reinstall, rather than wait further for a fix to btrfsck?
Comment 23 Qu Wenruo 2016-09-16 00:05:51 UTC
(In reply to Andreas Reis from comment #22)
> Sorry to be that guy *again*, but… do I understand this right, my btrfs FS
> is trashed and I need to reinstall, rather than wait further for a fix to
> btrfsck?

Not yet.

Use '--init-extent-tree' option to execute btrfsck (of course with version 4.7.2) should fix it.
As only extent tree is transhed.

Thanks,
Qu
Comment 24 Andreas Reis 2016-09-16 16:59:56 UTC
Thanks, it finally stopped (and I just booted from it). After 10+ hours for a 128GB SSD (840 Evo) partition, with no indication that it would actually finish.

At the end I did figure that the read-only check log which I posted above – and which completed within a minute – contained a max 'Backref 143118163968'. The repair run's actual numbers (exponentially more so in total) flew by way too fast, but they did give some hint – without knowing if it was a genuine one – to the ETA (actually it was around 149.5b).

Still, given that the FS was still readable: Next time I'll just backup and reformat. Vastly more economical in every possible aspect.

Two more notes:

1. If the printed messages from btrfs check were aligned to some fixed line number count, one could at least read them.
2. I seem to get "cache and super generation don't match" no matter what I do, but never mind.
Comment 25 David Sterba 2016-09-27 12:14:42 UTC
(In reply to Mark Fasheh from comment #20)
> Btw, I should add that I understated the capability of btrfsck's error
> handling, it's generally pretty good though the patch (and area of code) in
> question is lacking in that area. That said, my point still stands - we
> can't let the dogma of 'No BUG_ONs' get in the way of common-sense logic
> checks to avoid situations that would be very bad for the user.

If you want to call it dogma, whatever. Throwing BUG_ONs everywhere is not the best way to handle errors, IMO. If the condition depends on external input, it must be handled in a better way, eg. print a message describing the condition, cleanup/rollback changes and then return with an error code that gets propagated upwards.