Bug 100911
Summary: | NULL pointer dereference during btrfs snapshot removal | ||
---|---|---|---|
Product: | File System | Reporter: | Christoph Biedl (bugzilla.kernel.bpeb) |
Component: | btrfs | Assignee: | Josef Bacik (josef) |
Status: | RESOLVED PATCH_ALREADY_AVAILABLE | ||
Severity: | high | CC: | dsterba, hch, jeffm |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 4.1 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Attachments: | btrfs: skip waiting on ordered range for special files |
Description
Christoph Biedl
2015-07-04 11:17:27 UTC
After some investigation I managed to create a reproducer that still isn't minimal but is at least worth sharing.

Short version:

    schroot --chroot jessie-amd64 -- apt-get install lvm2

Longer version:

- Debian jessie userland required, or anything else that runs schroot
- Create a type=btrfs-snapshot schroot named jessie-amd64, containing a minimal jessie amd64 userland
- Start a non-persistent session and install the lvm2 package, then end the session
- BUG is triggered when schroot cleans up the chroot, i.e. deletes the snapshot

Presumably the lvm2 postinst does things that later cause the problem. Just installing lvm2's dependencies works as expected.

Neither sleep nor sync before the snapshot deletion is a workaround.

An strace of that installation is some 180 MB in size; still searching for the root cause that triggers the bug. After that it should be possible to have a minimal reproducer of "btrfs subvolume snapshot", (some command), "btrfs subvolume delete".

The culprit is vgcfgbackup, called from the lvm2 postinst. So the following reproducer should do the trick:

* Have a btrfs at /mnt/schroot/
* Create a subvolume /mnt/schroot/jessie-amd64, and populate it with a userland, including the lvm2 package.
* Create a snapshot:
      btrfs subvolume snapshot /mnt/schroot/jessie-amd64/ /mnt/schroot/snap
* Bind-mount /proc:
      mount --bind /proc /mnt/schroot/snap/proc
* Exec vgcfgbackup in the snapshot:
      chroot /mnt/schroot/snap /sbin/vgcfgbackup
* Umount:
      umount /mnt/schroot/snap/proc
* Delete snapshot:
      btrfs subvolume delete /mnt/schroot/snap

Final step was to reduce vgcfgbackup to the actual ioctl (wild guessing) that causes the trouble. The strace output looks harmless.

Created attachment 187441
btrfs: skip waiting on ordered range for special files

In btrfs_evict_inode, we properly truncate the page cache for evicted
inodes but then we call btrfs_wait_ordered_range for every inode as well.
It's the right thing to do for regular files but results in incorrect
behavior for device inodes for block devices.

filemap_fdatawrite_range gets called with inode->i_mapping which gets
resolved to the block device inode before getting passed to
wbc_attach_fdatawrite_inode and ultimately to inode_to_bdi. What happens
next depends on whether there's an open file handle associated with the
inode. If there is, we write to the block device, which is unexpected
behavior. If there isn't, we fall through normally and inode->i_data is used.
We can also end up racing against open/close which can result in crashes
when i_mapping points to a block device inode that has been closed.

Since there can't be any page cache associated with special file inodes,
it's safe to skip the btrfs_wait_ordered_range call entirely and avoid
the problem.

[Still undergoing xfstests run but should be ok.]
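
For readers without the attachment at hand, the check described in the patch can be sketched in userspace C. This is only an illustration under stated assumptions, not the attached diff: is_special_file() is modelled on the kernel's special_file() macro, should_wait_ordered_range() is a hypothetical helper name, and the real change lives inside btrfs_evict_inode in the kernel.

```c
#define _DEFAULT_SOURCE
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/*
 * Userspace stand-in for the kernel's special_file() test: character
 * devices, block devices, FIFOs and sockets are "special" inodes that
 * have no page cache of their own on the filesystem holding the node.
 */
bool is_special_file(mode_t mode)
{
    return S_ISCHR(mode) || S_ISBLK(mode) || S_ISFIFO(mode) || S_ISSOCK(mode);
}

/*
 * Hypothetical helper (name invented for this sketch): models the
 * decision taken at eviction time. Waiting on ordered extents only
 * makes sense for inodes that can carry file data, so special files
 * are skipped.
 */
bool should_wait_ordered_range(mode_t mode)
{
    return !is_special_file(mode);
}
```

With this model, a regular file (S_IFREG) or directory (S_IFDIR) still goes through the wait, while a block device node (S_IFBLK) such as the ones vgcfgbackup opens inside the snapshot is skipped, so its i_mapping is never followed to the block device inode.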

Thanks, looks very good. Not a single crash in any of the scenarios where I had previously hit them every time. Consider this a Tested-by:

Thanks, closing.