Bug 59051 - Reproducible file system corruption (fs tree ... refs ... not found)
Summary: Reproducible file system corruption (fs tree ... refs ... not found)
Status: RESOLVED CODE_FIX
Alias: None
Product: File System
Classification: Unclassified
Component: btrfs
Hardware: All Linux
Importance: P1 normal
Assignee: Josef Bacik
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-05-30 22:26 UTC by Clemens Eisserer
Modified: 2013-06-05 08:27 UTC
0 users

See Also:
Kernel Version: 3.10.0-0.rc2, 3.9.4
Subsystem:
Regression: No
Bisected commit-id:


Description Clemens Eisserer 2013-05-30 22:26:33 UTC
Recently btrfsck reported fs corruption in snapshots on a production system (running linux-3.9.1 to 3.9.4) where I used a self-written continuous snapshot utility: http://comments.gmane.org/gmane.comp.file-systems.btrfs/26030
This utility creates a new snapshot every 10min and deletes the oldest when free space drops below XY%.
Using an NFS replay tool written by a colleague of mine (it replays real FS load recorded by NFS servers), I was able to reproduce the corruption reliably on a fresh btrfs filesystem with both linux-3.9.4 and 3.10.0.rc2 when snapshotting every 20s.
I also verified the issue is not caused by an old btrfsck.

As the process writes about 160GB and syncs/seeks frequently, I really recommend burning an SSD instead of waiting ages for an HDD to finish.

The following steps are required to reproduce the issue:
1. Download the NFS traces from: https://mega.co.nz/#!o8dziKzQ!OjocHW3CMz5VmVvr2Z6mln66aH0fC27DBZeaZcmTByA
2. Download the NFS replay utility from: https://mega.co.nz/#!Zs0FXILL!V5jJqhNwtT1mDuipSW4cST4zYW8IJTj2c5IH6W4KFAA
3. Download the continuous snapshot scripts from: https://mega.co.nz/#!tlFRzJLY!IXRTZkAvl9urw7lePRzY-wO2kY8HmcJZ0jWt7DGWr8o
4. Unpack the archives (good idea to do that in sub-folders)

5. Create a 12GB btrfs somewhere; use mixed data/metadata block groups (-M) to reduce the chance of ENOSPC due to data/metadata imbalance (the production system had separate data and metadata sections, so this doesn't make a difference for the bug):
   mkfs.btrfs -f -M -b 12884901888 /dev/???  
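
The value passed to -b above is 12 GiB expressed in bytes. If a different size is needed, it can be derived with shell arithmetic; the variable names below are only illustrative and do not appear in the scripts:

```shell
# Size of the test filesystem in bytes: 12 GiB = 12 * 1024^3.
size_gib=12
size_bytes=$((size_gib * 1024 * 1024 * 1024))
echo "$size_bytes"   # prints 12884901888
```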

5. Compile the NFS replay utility (cnfsparse): 
    make
    
6. Compile the snapshot-clean daemon: 
    gcc -O2 snapshot_cleaner_replay.c -o snappy_replay -lm

The daemon is configured to clean up fast enough on my SSD-based system, but it offers knobs to adjust the clean-up.
Please adjust them if you run out of space while replaying the NFS logs.

   #define MIN_FREE_PERCENT 20  //Start cleaning when < XY% free
   #define BATCH_CLEAN_SIZE 5  //Number of snapshots cleaned per batch
   #define CLEAN_SLEEP_SECS 5  //Wait N seconds after cleaning BATCH_CLEAN_SIZE before checking free space again
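
The daemon's C source is not reproduced in this report, so the following is only a rough shell sketch of how the three knobs might interact. The function names (free_percent, oldest_batch, clean_once) and the DELETE_CMD variable are hypothetical. The sketch also assumes that snapshot names sort chronologically and that the percentage reported by df is good enough, which on btrfs may only be an approximation:

```shell
#!/bin/sh
# Sketch only: mirrors the three knobs above; this is not the actual daemon.
MIN_FREE_PERCENT=20   # start cleaning when less than this much is free
BATCH_CLEAN_SIZE=5    # snapshots to delete per cleaning pass
CLEAN_SLEEP_SECS=5    # pause after each pass before checking free space again

# Print the free space of the filesystem containing $1 as an integer percent.
free_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print 100 - $5 }'
}

# Print the BATCH_CLEAN_SIZE oldest entries of directory $1
# (assumes names of the form snapshot_<epoch>_..., so name order is age order).
oldest_batch() {
    ls -1 "$1" | sort | head -n "$BATCH_CLEAN_SIZE"
}

# One cleaning pass: delete one batch if free space is below the threshold.
# DELETE_CMD defaults to a dry run that only prints what would be deleted;
# set DELETE_CMD="btrfs subvolume delete" to actually remove snapshots.
clean_once() {
    mnt=$1; snapdir=$2
    [ "$(free_percent "$mnt")" -lt "$MIN_FREE_PERCENT" ] || return 0
    oldest_batch "$snapdir" | while read -r name; do
        ${DELETE_CMD:-echo would-delete} "$snapdir/$name"
    done
}

# Possible main loop (not started here):
#   while :; do clean_once /replay /replay/snapshots; sleep "$CLEAN_SLEEP_SECS"; done
```

With this structure, raising BATCH_CLEAN_SIZE or lowering CLEAN_SLEEP_SECS makes clean-up more aggressive, which is the adjustment suggested above for slower disks.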


7. Mount the FS at /replay (the scripts have that path hardcoded, sorry):
   mkdir /replay
   mount -o noatime /dev/sda1 /replay/   # on my system ssd-mode and space_cache are enabled automatically
   mkdir /replay/snapshots
   mkdir /replay/traces

8. Start the continuous snapshot creator script as well as the cleanup "daemon" and leave both running:
   sh create_snapshot.sh 
   ./snappy_replay
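
The contents of create_snapshot.sh are not shown in this report. As a rough idea of what such a script does, here is a minimal sketch. The variables SNAP_SRC, SNAP_DIR and INTERVAL_SECS and both function names are hypothetical, and the choice of /replay as the snapshot source is an assumption. The name format is inferred from the 39-character snapshot names that appear in the btrfsck output in step 10:

```shell
#!/bin/sh
# Sketch only: the real create_snapshot.sh is not reproduced in this report.
SNAP_SRC=/replay               # assumed: subvolume to snapshot
SNAP_DIR=/replay/snapshots     # directory created in step 7
INTERVAL_SECS=20               # snapshot interval used for the reproduction

# Build a name of the form snapshot_<epoch>_<YYYY-MM-DD>_<HH:MM:SS> (local time)
# for the epoch given as $1. Relies on GNU date's "-d @<epoch>" extension.
snapshot_name() {
    printf 'snapshot_%s_%s\n' "$1" "$(date -d "@$1" '+%Y-%m-%d_%H:%M:%S')"
}

# Take a snapshot every INTERVAL_SECS seconds, forever.
# Only call this on a mounted btrfs test filesystem.
snapshot_loop() {
    while :; do
        btrfs subvolume snapshot "$SNAP_SRC" "$SNAP_DIR/$(snapshot_name "$(date +%s)")"
        sleep "$INTERVAL_SECS"
    done
}
```

Putting the epoch first in the name means a plain lexical sort lists the snapshots oldest first, which is convenient for a cleaner that deletes the oldest ones.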

9. Start the replay tool (cnfsparse) to replay the nfs traces (lair62_small.txt.xz) in the traces-dir:
   cd /replay/traces
   xz -d -c lair62_small.txt.xz  | cnfsparse -s 1

10. Wait until it finishes with "unexpected end of input", stop all the scripts/daemons, unmount, and run btrfsck.
On my system this resulted in errors like the following 3 out of 3 times:
   checking root refs
   fs tree 565 refs 125 not found
	   unresolved ref root 807 dir 813347 index 277 namelen 39 name snapshot_1368273601_2013-05-11_14:00:01 error 600
	   unresolved ref root 808 dir 813347 index 277 namelen 39 name snapshot_1368273601_2013-05-11_14:00:01 error 600
       .....
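
When the output is long, it can help to summarize how many unresolved refs are reported per root. The helper below is a small sketch for that; count_unresolved_refs is a hypothetical name, and the field positions are taken from the sample lines above rather than from any documented output format:

```shell
# Count "unresolved ref" lines per root in saved btrfsck output.
# Reads the output on stdin; prints "root <id>: <count>", sorted by root id.
count_unresolved_refs() {
    awk '$1 == "unresolved" && $2 == "ref" && $3 == "root" { n[$4]++ }
         END { for (r in n) print "root " r ": " n[r] }' | sort -k2,2n
}
```

It could be used as `btrfsck <device> 2>&1 | count_unresolved_refs`, with <device> being the unmounted test partition.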
       
A corrupted FS is available for reference at: https://mega.co.nz/#!Zs0FXILL!V5jJqhNwtT1mDuipSW4cST4zYW8IJTj2c5IH6W4KFAA
Comment 1 Clemens Eisserer 2013-06-03 18:02:03 UTC
Would a preconfigured VM help to make reproducing this issue easier?
Comment 2 Josef Bacik 2013-06-03 20:40:22 UTC
No, this is perfect; I just need to get around to reproducing it.  I will try first thing in the morning.
Comment 3 Josef Bacik 2013-06-04 21:00:54 UTC
OK, it wasn't really corruption; fsck is just wrong.  I've posted a patch to fix this problem:

[PATCH] Btrfs-progs: fix incorrect root backref errors in fsck

I've tested this patch against my smaller reproducer and against a file system, created with your reproducer, that gave me the same errors.  Please re-open this if it still gives you a problem.
Comment 4 Clemens Eisserer 2013-06-05 08:27:51 UTC
Thanks a lot for taking a look and fixing this issue.