Bug 14589
Summary: | btrfs-related crash dump | | |
---|---|---|---|
Product: | File System | Reporter: | Peter Teoh (htmldeveloper) |
Component: | btrfs | Assignee: | fs_btrfs (fs_btrfs) |
Status: | RESOLVED OBSOLETE | | |
Severity: | blocking | CC: | chris.mason, josef, kairo |
Priority: | P1 | | |
Hardware: | All | | |
OS: | Linux | | |
Kernel Version: | 2.6.32-rc5 / 2.6.32-rc6 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: | .config used for compilation; gzip image of partial output of dmesg | | |
Description
Peter Teoh
2009-11-12 03:27:07 UTC
Created attachment 23754
.config used for compilation
Created attachment 23755
gzip image of partial output of dmesg
This is taken partially from /var/log/messages (which is much larger than this), as dmesg has already overwritten itself, and is only a subset of the following attachment.
Thanks for this bug report. Could you please provide the steps to set up your config in script form? I'd like to make sure I understand exactly what commands you ran.

Extracting from my bash history, this is the approximate list of commands issued:

dd if=/dev/zero of=btrfs_disk.dat bs=1024 count=1024000
mkfs.btrfs btrfs_disk.dat
file btrfs_disk.dat
blkid btrfs_disk.dat
df
mkdir /mnt/btrfs
mount btrfs_disk.dat /mnt/btrfs/
mount -t btrfs btrfs_disk.dat /mnt/btrfs/
mount -t btrfs btrfs_disk.dat /mnt/btrfs/ -o loop
losetup /dev/loop0 btrfs_disk.dat
mount /dev/loop0 /mnt/btrfs/
mount -t btrfs /dev/loop0 /mnt/btrfs

(The error could arise from the second-to-last line: mounting an already mounted partition? The crash console messages seemed to be generated before the last line. The last line did execute, but after it the entire system hung.) This is specific to v2.6.32-rc6-346-gaa021ba (linus-git-tree):

git describe
v2.6.32-rc6-346-gaa021ba

CONFIG_BTRFS_FS=y
CONFIG_BTRFS_FS_POSIX_ACL=y

But now, after rebooting without the btrfs partition mounted, I immediately did a mount and got the following errors at the mount command (mount -t btrfs /dev/sda7 /sda7; only this single command was issued):

kernel BUG at fs/btrfs/extent-tree.c:3541!
invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
last sysfs file: /sys/devices/pci0000:00/0000:00:12.0/0000:04:0b.0/local_cpus
CPU 0
Modules linked in: scsi_wait_scan
Pid: 8282, comm: mount Not tainted 2.6.32-rc6-00346-gaa021ba #4 System Product Name
RIP: 0010:[<ffffffff81349f3a>] [<ffffffff81349f3a>] btrfs_pin_extent+0x37/0xce
RSP: 0018:ffff8800949579c8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8800b9a7aa78
RDX: 0000000000840083 RSI: 0000000000000507 RDI: ffff8800847540a8
RBP: ffff8800949579f8 R08: ffff8800847540a8 R09: 0000000000000000
R10: ffff880094957918 R11: 0000000000000001 R12: 0000000000001000
R13: 01ffffffffffffff R14: fa00000000000000 R15: 0000000000000000
FS: 00007f44aa1a87d0(0000) GS:ffff880008c00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000017db208 CR3: 00000000b840c000 CR4: 00000000000006b0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
Process mount (pid: 8282, threadinfo ffff880094956000, task ffff8800b9a7aa40)
Stack:
ffff880084754000 ffff880080449290 ffff880094957b08 01ffffffffffffff
<0> ffff8800893b80d8 ffff88008a3961b0 ffff880094957a28 ffffffff81380c2e
<0> ffff8800893b80d8 ffff880080449290 ffff880094957acc 01ffffffffffffff
Call Trace:
[<ffffffff81380c2e>] process_one_buffer+0x3a/0x73
[<ffffffff8137e34d>] walk_down_log_tree+0x14c/0x3c3
[<ffffffff8137e646>] walk_log_tree+0x82/0x18c
[<ffffffff8137f35a>] btrfs_recover_log_trees+0x178/0x27d
[<ffffffff81380bf4>] ? process_one_buffer+0x0/0x73
[<ffffffff81354428>] ? btree_read_extent_buffer_pages+0x6f/0xac
[<ffffffff81357ce3>] open_ctree+0x104f/0x12b2
[<ffffffff8140deb5>] ? vsnprintf+0x3fc/0x43b
[<ffffffff81131cb6>] ? set_anon_super+0x0/0xec
[<ffffffff81340460>] btrfs_get_sb+0x298/0x489
[<ffffffff81132bdd>] vfs_kern_mount+0xa2/0x15d
[<ffffffff81132cff>] do_kern_mount+0x4c/0xec
[<ffffffff8114bdfa>] do_mount+0x704/0x76a
[<ffffffff81108b72>] ? strndup_user+0x62/0x8a
[<ffffffff8114bee4>] sys_mount+0x84/0xc0
[<ffffffff810c9636>] ? audit_syscall_entry+0x119/0x145
[<ffffffff8103612b>] system_call_fastpath+0x16/0x1b
Code: 53 48 83 ec 08 e8 b7 bf ce ff 48 8b bf 60 01 00 00 49 89 f6 49 89 d4 41 89 cf 48 89 7d d0 e8 83 dd ff ff 48 85 c0 48 89 c3 75 04 <0f> 0b eb fe 48 8b b8 c8 00 00 00 4c 8d 6b 38 48 81 c7 28 01 00
RIP [<ffffffff81349f3a>] btrfs_pin_extent+0x37/0xce
RSP <ffff8800949579c8>
---[ end trace e7f075c327663f12 ]---

Yes, I think the partition was corrupted by the previous testing, although it could still be mounted successfully then. A second reboot gave the same error:

btrfs: sda7 checksum verify failed on 56213504 wanted 4E62BB6A found 3B389F1F level 1
btrfs: sda7 checksum verify failed on 56213504 wanted 4E62BB6A found 3B389F1F level 1
btrfs: sda7 checksum verify failed on 56213504 wanted 4E62BB6A found 3B389F1F level 1
parent transid verify failed on 56217600 wanted 72057594037927937 found 45
------------[ cut here ]------------
kernel BUG at fs/btrfs/extent-tree.c:3541!
invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC
last sysfs file: /sys/devices/pci0000:00/0000:00:12.0/0000:04:0b.0/local_cpus
CPU 1
Modules linked in: scsi_wait_scan
Pid: 8557, comm: mount Not tainted 2.6.32-rc6-00346-gaa021ba #4 System Product Name
RIP: 0010:[<ffffffff81349f3a>] [<ffffffff81349f3a>] btrfs_pin_extent+0x37/0xce
RSP: 0018:ffff880095db39c8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffff8800b9be1558
RDX: 0000000000840083 RSI: 000000000000050c RDI: ffff880095f080a8
RBP: ffff880095db39f8 R08: ffff880095f080a8 R09: 0000000000000000
R10: ffff880095db3918 R11: 0000000000000001 R12: 0000000000001000
R13: 01ffffffffffffff R14: fa00000000000000 R15: 0000000000000000
FS: 00007fdce4c047d0(0000) GS:ffff880008e00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fdcdef74170 CR3: 0000000095e4d000 CR4: 00000000000006a0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff4ff0 DR7: 0000000000000400
Process mount (pid: 8557, threadinfo ffff880095db2000, task ffff8800b9be1520)
Stack:
ffff880095f08000 ffff88008598b290 ffff880095db3b08 01ffffffffffffff
<0> ffff8800858c00d8 ffff8800b35ab0d8 ffff880095db3a28 ffffffff81380c2e
<0> ffff8800858c00d8 ffff88008598b290 ffff880095db3acc 01ffffffffffffff
Call Trace:
[<ffffffff81380c2e>] process_one_buffer+0x3a/0x73
[<ffffffff8137e34d>] walk_down_log_tree+0x14c/0x3c3
[<ffffffff8137e646>] walk_log_tree+0x82/0x18c
[<ffffffff8137f35a>] btrfs_recover_log_trees+0x178/0x27d
[<ffffffff81380bf4>] ? process_one_buffer+0x0/0x73
[<ffffffff81354428>] ? btree_read_extent_buffer_pages+0x6f/0xac
[<ffffffff81357ce3>] open_ctree+0x104f/0x12b2
[<ffffffff8140deb5>] ? vsnprintf+0x3fc/0x43b
[<ffffffff81131cb6>] ? set_anon_super+0x0/0xec
[<ffffffff81340460>] btrfs_get_sb+0x298/0x489
[<ffffffff81132bdd>] vfs_kern_mount+0xa2/0x15d
[<ffffffff81132cff>] do_kern_mount+0x4c/0xec
[<ffffffff8114bdfa>] do_mount+0x704/0x76a
[<ffffffff81108b72>] ? strndup_user+0x62/0x8a
[<ffffffff8114bee4>] sys_mount+0x84/0xc0
[<ffffffff810c9636>] ? audit_syscall_entry+0x119/0x145
[<ffffffff8103612b>] system_call_fastpath+0x16/0x1b
Code: 53 48 83 ec 08 0f 1f 44 00 00 48 8b bf 60 01 00 00 49 89 f6 49 89 d4 41 89 cf 48 89 7d d0 e8 83 dd ff ff 48 85 c0 48 89 c3 75 04 <0f> 0b eb fe 48 8b b8 c8 00 00 00 4c 8d 6b 38 48 81 c7 28 01 00
RIP [<ffffffff81349f3a>] btrfs_pin_extent+0x37/0xce
RSP <ffff880095db39c8>
---[ end trace 1dfe64a8dc2d8116 ]---

Doing a blkid:

/dev/sda7: UUID="136c469e-8931-4b20-8c43-0222eecae9d9" TYPE="btrfs"

Do you need me to dd /dev/sda7 for your post-mortem analysis? And how many bytes would be enough?

Disk /dev/sda: 300.0 GB, 300069052416 bytes
255 heads, 63 sectors/track, 36481 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Disk identifier: 0xe271e271

Device Boot Start End Blocks Id System
/dev/sda7 34443 36481 16378236 83 Linux

So in summary, I think we are looking at a few bugs here: one arising from remounting a loopback-mounted entry, and another from a single "mount" command, perhaps due to insufficient error checking on corrupted blocks. Thanks.

Yes and yes. We do try and carry on in the face of metadata errors but we don't always survive. I'd say you've hit two known problems (double mounting and error recovery). Thanks for tracking the oops down to a specific setup. We don't need to save the corrupted disk.

It looks to me like I've hit the same thing in the openSUSE Factory 2.6.34 kernel: http://pastebin.mozilla.org/740969 - I've been running btrfs for quite some time now on a partition I use for daily compiling of multiple Mozilla builds, and this happened during such a process, as "Progress gmake" is noted in that /var/log/messages excerpt. Should I paste this inline here (pastebin only keeps it for a month)? Is it redundant to the ones above, or is it a new bug to be reported?

Closing. If this is still affecting you on a newer kernel, please reopen.
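For anyone revisiting this report, the approximate command list from the reporter's bash history (quoted earlier in this thread) can be collected into the script form that was requested. The following is only a sketch under stated assumptions: the names DRY_RUN, IMG, MNT, LOOP, run and repro_steps are hypothetical and do not come from the report. By default it only prints the commands. Setting DRY_RUN=0 would actually execute them, which needs root and, per the report, may hang an affected 2.6.32-rc6 kernel.

```shell
#!/bin/sh
# Sketch of the reported steps; all variable and function names here are
# hypothetical. Dry-run is the default, so nothing is mounted or written.
DRY_RUN="${DRY_RUN:-1}"
IMG="${IMG:-btrfs_disk.dat}"
MNT="${MNT:-/mnt/btrfs}"
LOOP="${LOOP:-/dev/loop0}"

# Print the command in dry-run mode; otherwise execute it and keep going on
# failure, since some of the mount attempts in the history likely failed.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@" || printf 'command failed (continuing): %s\n' "$*" >&2
    fi
}

# Same commands, same order, as listed in the report.
repro_steps() {
    run dd if=/dev/zero of="$IMG" bs=1024 count=1024000
    run mkfs.btrfs "$IMG"
    run file "$IMG"
    run blkid "$IMG"
    run df
    run mkdir "$MNT"
    run mount "$IMG" "$MNT/"
    run mount -t btrfs "$IMG" "$MNT/"
    run mount -t btrfs "$IMG" "$MNT/" -o loop
    run losetup "$LOOP" "$IMG"
    run mount "$LOOP" "$MNT/"    # suspected trigger: image is already mounted
    run mount -t btrfs "$LOOP" "$MNT"
}

repro_steps
```

Running it as is prints each step prefixed with "+", which makes it easy to review the sequence before trying it on a disposable test machine.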