Bug 202695 - Inconsistency in fsync'ed file size after crash
Summary: Inconsistency in fsync'ed file size after crash
Status: RESOLVED CODE_FIX
Alias: None
Product: File System
Classification: Unclassified
Component: btrfs
Hardware: All Linux
Importance: P1 normal
Assignee: BTRFS virtual assignee
 
Reported: 2019-02-27 23:41 UTC by Seulbae Kim
Modified: 2019-03-07 14:05 UTC

Kernel Version: v5.0.0-rc8
Tree: Mainline
Regression: No


Attachments
Proof of Concept (913 bytes, text/x-csrc)
2019-02-27 23:41 UTC, Seulbae Kim

Description Seulbae Kim 2019-02-27 23:41:06 UTC
Created attachment 281395
Proof of Concept

[Kernel version]
This bug can be reproduced on kernel 5.0.0-rc8.


[Reproduce]
* Use a VM: the PoC simulates a crash by triggering an immediate reboot through SysRq.
1. Download a base image (128 MB)
$ wget https://gts3.org/~seulbae/fsimg/btrfs-10.image

2. Mount the image
$ mkdir /tmp/btrfs
$ sudo mount -o loop btrfs-10.image /tmp/btrfs

3. Compile and run PoC
$ gcc poc.c -o poc
$ sudo ./poc /tmp/btrfs
(System reboots)


[Check]
1. Re-mount the crashed image
$ mkdir /tmp/btrfs
$ sudo mount -o loop btrfs-10.image /tmp/btrfs

2. Check inconsistency
$ stat /tmp/btrfs/yyy
-> Size: 8000
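
The size check can also be scripted so that the mismatch shows up as an exit status. The following is a minimal sketch, not part of the attached PoC; the helper name check_size and its return convention are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: compare the on-disk size of `path` with `expected`.
 * Returns 0 if the sizes match, 1 if they differ, -1 if stat() fails.
 */
int check_size(const char *path, off_t expected)
{
    struct stat st;

    assert(path != NULL);

    if (stat(path, &st) != 0) {
        perror("stat");
        return -1;
    }
    if (st.st_size != expected) {
        fprintf(stderr, "%s: size is %lld, expected %lld\n",
                path, (long long)st.st_size, (long long)expected);
        return 1;
    }
    return 0;
}
```

After re-mounting the crashed image, check_size("/tmp/btrfs/yyy", 3000) would return 1 on an affected kernel, since the reported size is 8000.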


[Description]
In the base image, 2 directories and 7 files exist.

0: 0755 (mount_point)
+--257: 0755 foo
   +--258: 0755 bar
      +--259: 0644 baz (12 bytes, offset: {})
      +--259: 0644 hln (12 bytes, offset: {})
      +--260: 0644 xattr (0 bytes, offset: {})
      +--261: 0644 acl (0 bytes, offset: {})
      +--262: 0644 æøå (4 bytes, offset: {})
      +--263: 0644 fifo
      +--264: 0777 sln -> mnt/foo/bar/baz

Below is the breakdown of the PoC:
1. create a file "foo/bar/xxx",
(line 28) fd = syscall(SYS_open, "foo/bar/xxx", O_CREAT | O_RDWR, 0666);

2. increase the size of "foo/bar/xxx" to 8000 bytes through pwrite64 (4000 bytes written at offset 4000),
(line 29) syscall(SYS_pwrite64, (long)fd, (long)buf, 4000, 4000);

3. flush the data,
(line 30) syscall(SYS_fdatasync, fd);

4. truncate the file to 3000 bytes,
(line 31) syscall(SYS_ftruncate, fd, 3000);

5. rename "foo/bar/xxx" to "yyy",
(line 32) syscall(SYS_rename, "foo/bar/xxx", "yyy");

6. flush the data and metadata of "yyy" through the still-open file descriptor, and
(line 33) syscall(SYS_fsync, fd);

7. simulate a crash by rebooting right away without un-mounting.
(line 35) system("echo b > /proc/sysrq-trigger");
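
For readers without the attachment, steps 1 to 6 can be sketched as a single function. This is not the attached PoC: it uses the libc wrappers instead of raw syscall() invocations, takes the mount point as an argument, fills the buffer with an arbitrary byte, and leaves out the SysRq reboot of step 7 so that it is safe to run outside a VM. The function name run_sequence is an assumption made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: run steps 1-6 under the mount point `mnt`.
 * The directory `mnt`/foo/bar must already exist.
 * Returns 0 on success, -1 on the first failing call.
 */
int run_sequence(const char *mnt)
{
    char src[4096], dst[4096];
    char buf[4000];
    int fd;

    assert(mnt != NULL);
    memset(buf, 'a', sizeof buf);              /* payload content is arbitrary */
    snprintf(src, sizeof src, "%s/foo/bar/xxx", mnt);
    snprintf(dst, sizeof dst, "%s/yyy", mnt);

    fd = open(src, O_CREAT | O_RDWR, 0666);    /* step 1: create the file */
    if (fd < 0)
        return -1;
    if (pwrite(fd, buf, 4000, 4000) != 4000)   /* step 2: size becomes 8000 */
        goto fail;
    if (fdatasync(fd) != 0)                    /* step 3: flush the data */
        goto fail;
    if (ftruncate(fd, 3000) != 0)              /* step 4: shrink to 3000 */
        goto fail;
    if (rename(src, dst) != 0)                 /* step 5: move to <mnt>/yyy */
        goto fail;
    if (fsync(fd) != 0)                        /* step 6: fsync after rename */
        goto fail;

    close(fd);
    return 0;
fail:
    close(fd);
    return -1;
}
```

Without a crash, the file ends up as "yyy" with size 3000 on any filesystem. The inconsistency only becomes visible when the machine is reset immediately after step 6 and the image is mounted again.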

Because fsync is called on the file descriptor of "yyy" after the file
has been truncated to 3000 bytes, we expect the new size to be persisted
to disk, so that "yyy" has size 3000 when the crashed image is re-mounted.
However, "yyy" still has its old size, 8000.


[Further Analysis]
I also tested several variations of the aforementioned test case
to find the potential root cause.
With any one of the changes below, the bug does not reproduce.
In other words, file "yyy" recovers to size 3000, as expected.
1) Creating file "xxx" not under "foo/bar/", but directly under the mount point ("./xxx"). (line 28)
2) Removing line 30 (fdatasync).
3) Swapping line 32 (rename) with line 33 (fsync), so that fsync runs before the rename.
4) Removing line 32 (rename).


Reported by Seulbae Kim (seulbae@gatech.edu) from SSLab, Gatech.
Comment 1 David Sterba 2019-03-07 14:05:06 UTC
Fixed by https://patchwork.kernel.org/patch/10837829/
