Bug 116331 - kernel BUG at fs/btrfs/inode.c:1828! RIP: btrfs_merge_bio_hook+0x8b/0xa0 [btrfs]
Status: NEW
Alias: None
Product: File System
Classification: Unclassified
Component: btrfs
Hardware: All Linux
Importance: P1 normal
Assignee: Josef Bacik
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2016-04-15 04:56 UTC by Chris Murphy
Modified: 2016-04-17 20:50 UTC

See Also:
Kernel Version: 4.5.0-302.fc24.x86_64
Tree: Fedora
Regression: No


Attachments
dmesg1.txt (44.72 KB, text/plain)
2016-04-15 04:56 UTC, Chris Murphy
btrfs-show-super -fa (6.18 KB, text/plain)
2016-04-15 05:08 UTC, Chris Murphy
guestOS_journal_filteredbtrfs.txt (8.69 KB, text/plain)
2016-04-15 06:19 UTC, Chris Murphy

Description Chris Murphy 2016-04-15 04:56:43 UTC
Created attachment 212771
dmesg1.txt

Summary:

The file system was created in a libvirt VM with btrfs-progs 4.4.1 and written with kernel 4.4.3. While YaST was deleting some unneeded packages, the system hung completely. I force-quit the VM and rebooted, and got a crash with a series of btrfs messages.

I then tried to mount it from a Fedora 24 live image in order to use kernel 4.5.0; the attached call trace (dmesg1.txt) is from that attempt.

# btrfs check /dev/vda2
Couldn't open file system

# btrfs --version
btrfs-progs v4.4.1
Comment 1 Chris Murphy 2016-04-15 04:59:29 UTC
# btrfs-image -c 9 -t4 /dev/vda2 116331.btrfs.image
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
bytenr mismatch, want=7778729984, have=0
Error going to next leaf -5
create failed (Success)

No file is created. -w has no effect.
Comment 2 Chris Murphy 2016-04-15 05:08:35 UTC
Created attachment 212781
btrfs-show-super -fa
Comment 3 Chris Murphy 2016-04-15 05:18:58 UTC
# btrfs-debug-tree /dev/vda2 > 116331.btrfsdebugtree.txt
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
bytenr mismatch, want=7778729984, have=0
failed to read 7778729984 in tree 2
checksum verify failed on 7769063424 found F44FC547 wanted D967BF27
checksum verify failed on 7769063424 found F44FC547 wanted D967BF27
checksum verify failed on 7769063424 found F44FC547 wanted D967BF27
checksum verify failed on 7769063424 found F44FC547 wanted D967BF27
Csum didn't match
failed to read 7769063424 in tree 349
Couldn't map the block 0
Invalid mapping for 0-16384, got 12582912-20971520
Couldn't map the block 0
fsid mismatch, want=aebc2805-f109-4c59-8de9-9008b3f6abc4, have=00000000-0000-0000-0000-000000000000
Couldn't map the block 0
Invalid mapping for 0-16384, got 12582912-20971520
Couldn't map the block 0
fsid mismatch, want=aebc2805-f109-4c59-8de9-9008b3f6abc4, have=00000000-0000-0000-0000-000000000000
Couldn't map the block 0
Invalid mapping for 0-16384, got 12582912-20971520
Couldn't map the block 0
fsid mismatch, want=aebc2805-f109-4c59-8de9-9008b3f6abc4, have=00000000-0000-0000-0000-000000000000
Couldn't map the block 0
Invalid mapping for 0-16384, got 12582912-20971520
Couldn't map the block 0
fsid mismatch, want=aebc2805-f109-4c59-8de9-9008b3f6abc4, have=00000000-0000-0000-0000-000000000000


This produces a 622MB file, 52MB when xz compressed.
https://drive.google.com/open?id=0B_2Asp8DGjJ9N1hVbjJNZUNYcGc
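
The tool output above repeats each failure four times, which makes it easy to miss that only two distinct tree blocks are involved. The following is a minimal sketch, not part of the original report, that groups the "checksum verify failed" lines by block address; the function name `summarize_csum_failures` and the regular expression are illustrative assumptions based only on the line format shown above.

```python
import re
from collections import Counter

# Matches the failure lines exactly as they appear in the output above.
CSUM_RE = re.compile(
    r"checksum verify failed on (\d+) found ([0-9A-Fa-f]{8}) wanted ([0-9A-Fa-f]{8})"
)


def summarize_csum_failures(text):
    """Count checksum failures per (bytenr, found, wanted) triple."""
    counts = Counter()
    for match in CSUM_RE.finditer(text):
        bytenr, found, wanted = match.groups()
        counts[(int(bytenr), found.upper(), wanted.upper())] += 1
    return counts


# Lines copied from the btrfs-debug-tree output in this comment.
sample = """\
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
bytenr mismatch, want=7778729984, have=0
failed to read 7778729984 in tree 2
checksum verify failed on 7769063424 found F44FC547 wanted D967BF27
checksum verify failed on 7769063424 found F44FC547 wanted D967BF27
checksum verify failed on 7769063424 found F44FC547 wanted D967BF27
checksum verify failed on 7769063424 found F44FC547 wanted D967BF27
Csum didn't match
"""

counts = summarize_csum_failures(sample)
for (bytenr, found, wanted), n in sorted(counts.items()):
    print(f"bytenr={bytenr} found={found} wanted={wanted} count={n}")
```

Grouped this way, the block at 7778729984 stands out: its stored checksum is all zeros and the "bytenr mismatch" line reports have=0, while the block at 7769063424 has a non-zero stored checksum that simply does not match.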
Comment 4 Chris Murphy 2016-04-15 05:20:53 UTC
After rebooting.

# btrfs check /dev/vda2
Checking filesystem on /dev/vda2
UUID: aebc2805-f109-4c59-8de9-9008b3f6abc4
checking extents
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
bytenr mismatch, want=7778729984, have=0
checksum verify failed on 7769063424 found F44FC547 wanted D967BF27
checksum verify failed on 7769063424 found F44FC547 wanted D967BF27
checksum verify failed on 7769063424 found F44FC547 wanted D967BF27
checksum verify failed on 7769063424 found F44FC547 wanted D967BF27
Csum didn't match
Couldn't map the block 0
Invalid mapping for 0-16384, got 12582912-20971520
Couldn't map the block 0
fsid mismatch, want=aebc2805-f109-4c59-8de9-9008b3f6abc4, have=00000000-0000-0000-0000-000000000000
Errors found in extent allocation tree or chunk allocation
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
bytenr mismatch, want=7778729984, have=0

# btrfs check --repair /dev/vda2
enabling repair mode
Checking filesystem on /dev/vda2
UUID: aebc2805-f109-4c59-8de9-9008b3f6abc4
checking extents
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
bytenr mismatch, want=7778729984, have=0
checksum verify failed on 7769063424 found F44FC547 wanted D967BF27
checksum verify failed on 7769063424 found F44FC547 wanted D967BF27
checksum verify failed on 7769063424 found F44FC547 wanted D967BF27
checksum verify failed on 7769063424 found F44FC547 wanted D967BF27
Csum didn't match
Couldn't map the block 0
Invalid mapping for 0-16384, got 12582912-20971520
Couldn't map the block 0
fsid mismatch, want=aebc2805-f109-4c59-8de9-9008b3f6abc4, have=00000000-0000-0000-0000-000000000000
Errors found in extent allocation tree or chunk allocation
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
bytenr mismatch, want=7778729984, have=0


# btrfs check --repair --init-extent-tree /dev/vda2
enabling repair mode
Checking filesystem on /dev/vda2
UUID: aebc2805-f109-4c59-8de9-9008b3f6abc4
Creating a new extent tree
Couldn't map the block 0
Invalid mapping for 0-16384, got 12582912-20971520
Couldn't map the block 0
fsid mismatch, want=aebc2805-f109-4c59-8de9-9008b3f6abc4, have=00000000-0000-0000-0000-000000000000
Error reading root block
error pinning down used bytes
transaction.h:42: btrfs_start_transaction: Assertion `fs_info->running_transaction` failed.
btrfs(+0x432ca)[0x562bd46bf2ca]
btrfs(+0x13a4c)[0x562bd468fa4c]
btrfs(close_ctree_fs_info+0x1f7)[0x562bd46c1557]
btrfs(cmd_check+0x3e0)[0x562bd46aade0]
btrfs(main+0x7d)[0x562bd468fc2d]
/lib64/libc.so.6(__libc_start_main+0xf1)[0x7fb8e06bd721]
btrfs(_start+0x29)[0x562bd468fd39]
Comment 5 Chris Murphy 2016-04-15 05:25:20 UTC
Upgraded:
  btrfs-progs.x86_64 4.5.1-1.fc25                                                                    
Complete!

[root@localhost ~]# btrfs check /dev/vda2
Checking filesystem on /dev/vda2
UUID: aebc2805-f109-4c59-8de9-9008b3f6abc4
checking extents
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
bytenr mismatch, want=7778729984, have=0
checksum verify failed on 7769063424 found F44FC547 wanted D967BF27
checksum verify failed on 7769063424 found F44FC547 wanted D967BF27
checksum verify failed on 7769063424 found F44FC547 wanted D967BF27
checksum verify failed on 7769063424 found F44FC547 wanted D967BF27
Csum didn't match
Couldn't map the block 0
Invalid mapping for 0-16384, got 12582912-20971520
Couldn't map the block 0
fsid mismatch, want=aebc2805-f109-4c59-8de9-9008b3f6abc4, have=00000000-0000-0000-0000-000000000000
Errors found in extent allocation tree or chunk allocation
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
bytenr mismatch, want=7778729984, have=0
[root@localhost ~]# btrfs check --repair /dev/vda2
enabling repair mode
Checking filesystem on /dev/vda2
UUID: aebc2805-f109-4c59-8de9-9008b3f6abc4
checking extents
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
bytenr mismatch, want=7778729984, have=0
checksum verify failed on 7769063424 found F44FC547 wanted D967BF27
checksum verify failed on 7769063424 found F44FC547 wanted D967BF27
checksum verify failed on 7769063424 found F44FC547 wanted D967BF27
checksum verify failed on 7769063424 found F44FC547 wanted D967BF27
Csum didn't match
Couldn't map the block 0
Invalid mapping for 0-16384, got 12582912-20971520
Couldn't map the block 0
fsid mismatch, want=aebc2805-f109-4c59-8de9-9008b3f6abc4, have=00000000-0000-0000-0000-000000000000
Errors found in extent allocation tree or chunk allocation
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
bytenr mismatch, want=7778729984, have=0
[root@localhost ~]# btrfs check --repair --init-extent-tree /dev/vda2
enabling repair mode
Checking filesystem on /dev/vda2
UUID: aebc2805-f109-4c59-8de9-9008b3f6abc4
Creating a new extent tree
Couldn't map the block 0
Invalid mapping for 0-16384, got 12582912-20971520
Couldn't map the block 0
fsid mismatch, want=aebc2805-f109-4c59-8de9-9008b3f6abc4, have=00000000-0000-0000-0000-000000000000
Error reading root block
error pinning down used bytes
transaction.h:42: btrfs_start_transaction: Assertion `fs_info->running_transaction` failed.
btrfs(+0x48ada)[0x56212f8edada]
btrfs(+0xfa75)[0x56212f8b4a75]
btrfs(close_ctree_fs_info+0x1f7)[0x56212f8efc67]
btrfs(cmd_check+0x3e5)[0x56212f8d65f5]
btrfs(main+0x7d)[0x56212f8b4c4d]
/lib64/libc.so.6(__libc_start_main+0xf1)[0x7ff1c4c97721]
btrfs(_start+0x29)[0x56212f8b4d59]


# btrfs-image -c9 -t4 /dev/vda2 116331.btrfs.image
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
checksum verify failed on 7778729984 found E4E3BDB6 wanted 00000000
bytenr mismatch, want=7778729984, have=0
Error going to next leaf -5
create failed (Success)

No file created.
Comment 6 Chris Murphy 2016-04-15 06:19:08 UTC
Created attachment 212811
guestOS_journal_filteredbtrfs.txt

Interestingly enough, the filesystem can be mounted with -o ro,recovery, so I extracted the systemd journal. This attachment is filtered for btrfs and covers April only. I can't exactly match up the csum errors with the YaST2 hang while deleting packages. I don't know whether Btrfs problems caused the YaST2 hang, or whether a bug in this pre-release Tumbleweed caused the Btrfs problems. *shrug*
Comment 7 Chris Murphy 2016-04-15 06:24:10 UTC
A read-only scrub reveals this. It aborts at 1.5GiB out of 4.25GiB.

[  158.606616] BTRFS info (device vda2): enabling auto recovery
[  158.606624] BTRFS info (device vda2): disk space caching is enabled
[  158.606627] BTRFS: has skinny extents
[ 2226.555331] BTRFS error (device vda2): bad tree block start 0 7778729984
[ 2226.633472] BTRFS error (device vda2): bad tree block start 0 7778729984
[ 2226.633725] BTRFS error (device vda2): bad tree block start 0 7778729984


# btrfs fi us /mnt
Overall:
    Device size:		  29.88GiB
    Device allocated:		   5.88GiB
    Device unallocated:		  24.00GiB
    Device missing:		     0.00B
    Used:			   4.25GiB
    Free (estimated):		  25.08GiB	(min: 13.08GiB)
    Data ratio:			      1.00
    Metadata ratio:		      2.00
    Global reserve:		  64.00MiB	(used: 0.00B)

Data,single: Size:5.01GiB, Used:3.92GiB
   /dev/vda2	   5.01GiB

Metadata,DUP: Size:384.00MiB, Used:164.77MiB
   /dev/vda2	 768.00MiB

System,DUP: Size:64.00MiB, Used:16.00KiB
   /dev/vda2	 128.00MiB

Unallocated:
   /dev/vda2	  24.00GiB


So both copies of metadata are bad? Weird.
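
For readers unfamiliar with the DUP profile, the figures above can be cross-checked with a little arithmetic: DUP stores two copies of each block on the same device, so the raw space consumed is twice the logical size. The following is a minimal sketch, not part of the original report; the sizes are copied from the output above, and the helper name `raw_allocated_mib` is an illustrative assumption.

```python
MIB_PER_GIB = 1024

# Logical sizes from the `btrfs fi us` output above, in MiB.
# The data figure (5.01GiB) is itself rounded in that output.
block_groups = [
    ("Data", "single", 5.01 * MIB_PER_GIB),
    ("Metadata", "DUP", 384.0),
    ("System", "DUP", 64.0),
]

# Number of on-disk copies per profile: DUP keeps two on one device.
COPIES = {"single": 1, "DUP": 2}


def raw_allocated_mib(groups):
    """Sum the raw device space consumed by all block groups."""
    return sum(size * COPIES[profile] for _, profile, size in groups)


metadata_raw = 384.0 * COPIES["DUP"]  # 768 MiB, as listed under /dev/vda2
total_gib = raw_allocated_mib(block_groups) / MIB_PER_GIB
print(f"metadata raw: {metadata_raw:.0f} MiB, total allocated: {total_gib:.3f} GiB")
```

The total comes out close to the 5.88GiB reported as "Device allocated", allowing for rounding in the displayed values. Since both metadata copies live on the same virtual disk, they share whatever happens to that disk's write path.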
Comment 8 Chris Murphy 2016-04-17 20:50:59 UTC
QEMU disk cache set to unsafe might be part of the reason for the failure.

    <emulator>/usr/bin/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='unsafe'/>

With this setting, losing some data, even up to the last minute of writes, is expected. But shouldn't the filesystem be able to roll itself back to a known good state?
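
As background, QEMU's cache=unsafe mode ignores flush requests from the guest, so write ordering that the guest filesystem relies on is not guaranteed if the VM is killed. The following is a minimal sketch, not part of the original report, for spotting such disks in a libvirt domain definition. The function name `disks_with_unsafe_cache`, the sample XML wrapper, and the image path are hypothetical; only the `<driver ... cache='unsafe'/>` element comes from the excerpt above.

```python
import xml.etree.ElementTree as ET


def disks_with_unsafe_cache(domain_xml):
    """Return the source file (or None) of each disk using cache='unsafe'."""
    root = ET.fromstring(domain_xml)
    flagged = []
    for disk in root.iter("disk"):
        driver = disk.find("driver")
        if driver is not None and driver.get("cache") == "unsafe":
            source = disk.find("source")
            flagged.append(source.get("file") if source is not None else None)
    return flagged


# Hypothetical, trimmed domain XML. The image path is made up for illustration.
sample = """\
<domain type='kvm'>
  <devices>
    <emulator>/usr/bin/qemu-kvm</emulator>
    <disk type='file' device='disk'>
      <driver name='qemu' type='qcow2' cache='unsafe'/>
      <source file='/var/lib/libvirt/images/example.qcow2'/>
    </disk>
  </devices>
</domain>
"""

print(disks_with_unsafe_cache(sample))
```

In practice the XML would come from the domain definition (for example the output of `virsh dumpxml` for the guest) rather than an inline string.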