Bug 102901 - "btrfs device delete missing" doesn't delete all missing, doesn't fully balance
Status: NEW
Alias: None
Product: File System
Classification: Unclassified
Component: btrfs
Hardware: All Linux
Importance: P1 normal
Assignee: Josef Bacik
Reported: 2015-08-15 03:27 UTC by Timothy Miller
Modified: 2016-03-20 10:02 UTC
Kernel Version: 4.1.4
Tree: Mainline
Regression: No

Description Timothy Miller 2015-08-15 03:27:05 UTC
I have a RAID1 across four drives.  A replacement drive failed in the middle of a replacement operation, so I ended up with TWO missing drives and had to apply a patch to be able to mount the filesystem in degraded mode and add a new drive.

Normally, when you delete a missing device, btrfs rebalances the data from the existing drives to the new one.  However, since there were two missing drives, the "delete missing" operation only partially rebalanced.

Before the first "delete missing," I saw this:

compute0 ~ # btrfs fi show
Label: none  uuid: 49ac9ad2-b529-4e6e-aef9-1c5b9e8a72f8
        Total devices 1 FS bytes used 28.81GiB
        devid    1 size 79.69GiB used 41.03GiB path /dev/sda3

Label: none  uuid: ecdff84d-b4a2-4286-a1c1-cd7e5396901c
        Total devices 5 FS bytes used 1.46TiB
        devid    2 size 931.51GiB used 767.00GiB path /dev/sdd
        devid    3 size 931.51GiB used 745.03GiB path /dev/sdc
        devid    4 size 931.51GiB used 767.00GiB path /dev/sdb
        *** Some devices missing

btrfs-progs v4.1.2
compute0 ~ # btrfs device add /dev/sde /mnt/btrfs
compute0 ~ # btrfs device delete missing /mnt/btrfs

After finishing the first "delete missing", I saw this:

compute0 ~ # btrfs fi show
Label: none  uuid: 49ac9ad2-b529-4e6e-aef9-1c5b9e8a72f8
        Total devices 1 FS bytes used 28.81GiB
        devid    1 size 79.69GiB used 41.03GiB path /dev/sda3

Label: none  uuid: ecdff84d-b4a2-4286-a1c1-cd7e5396901c
        Total devices 5 FS bytes used 1.46TiB
        devid    2 size 931.51GiB used 767.00GiB path /dev/sdd
        devid    3 size 931.51GiB used 744.03GiB path /dev/sdc
        devid    4 size 931.51GiB used 767.00GiB path /dev/sdb
        devid    6 size 931.51GiB used 98.00GiB path /dev/sde
        *** Some devices missing

btrfs-progs v4.1.2


Notice that the new drive is only partially populated.  How in the world does btrfs know what NOT to copy to the new drive?  It should simply be aware that there aren't two copies of every block, right?  So every block that doesn't have a duplicate should be copied to the new drive.  Why it stopped partway through is baffling.
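
To put numbers on the partial rebalance, here is a rough sketch that summarizes the second "btrfs fi show" listing above.  It rests on a few assumptions that are not confirmed anywhere in this report: that everything on the filesystem is RAID1 (two copies), that "FS bytes used" counts each block once, and that the sizes are printed in TiB/GiB exactly as in the listing.  The variable names (fi_show, summary) are only for illustration.

```shell
#!/bin/sh
# Listing copied from this report (second filesystem, after the first
# "delete missing").  On a live system the text could instead come from
# `btrfs filesystem show /mnt/btrfs`.
fi_show='Label: none  uuid: ecdff84d-b4a2-4286-a1c1-cd7e5396901c
        Total devices 5 FS bytes used 1.46TiB
        devid    2 size 931.51GiB used 767.00GiB path /dev/sdd
        devid    3 size 931.51GiB used 744.03GiB path /dev/sdc
        devid    4 size 931.51GiB used 767.00GiB path /dev/sdb
        devid    6 size 931.51GiB used 98.00GiB path /dev/sde
        *** Some devices missing'

summary=$(printf '%s\n' "$fi_show" | awk '
  /Total devices/ {
    expected = $3                       # devices the filesystem expects
    fs_used = $7; sub(/TiB$/, "", fs_used)
    raid1_raw = fs_used * 1024 * 2      # assumption: two copies of everything
  }
  $1 == "devid" {
    present++                           # devices actually listed
    used = $6; sub(/GiB$/, "", used)
    allocated += used                   # GiB allocated on present devices
  }
  END {
    printf "expected=%d present=%d missing=%d allocated_gib=%.2f shortfall_gib=%.2f\n",
           expected, present, expected - present, allocated, raid1_raw - allocated
  }
')
echo "$summary"
```

Under those assumptions, the summary reports one device still missing, and the space allocated on the present drives falls roughly 614 GiB short of two full copies.  That is consistent with the first "delete missing" having removed only one of the two missing devices, though the "FS bytes used" figure is rounded, so the shortfall is only an estimate.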

Currently, I'm running a second "delete missing," and it's still going.  I'm nervous that something is going to go wrong here.  Once this is all over with, I'll try a scrub and run a balance again, just to be sure, but this is just weird.
