Bug 72811 - If a raid5 array gets into degraded mode, gets modified, and the missing drive re-added, the filesystem loses state
Status: NEW
Alias: None
Product: File System
Classification: Unclassified
Component: btrfs
Hardware: All Linux
Importance: P1 blocking
Assignee: Josef Bacik
Depends on:
Reported: 2014-03-23 19:17 UTC by Marc MERLIN
Modified: 2022-10-03 14:50 UTC
CC: 13 users

See Also:
Kernel Version: 3.14, 5.10
Regression: No
Bisected commit-id:


Description Marc MERLIN 2014-03-23 19:17:15 UTC
As per my own experimentation, and as reported by Tobias Holst here:

If you re-add a drive that was removed from a raid5 array, likely after the array has been rebalanced (but maybe just modifying it without a rebalance is enough), two bad things happen:

1) As soon as my drive re-appeared on the bus, btrfs noticed it and automatically re-added it. This is unwelcome and dangerous, since the drive is out of sync with the rest of the array.

2) In my experience, the filesystem regressed to an earlier state: a directory I had added while the drive was gone disappeared after I re-added the drive.

That said, fixing #1 should be enough: once a drive has been kicked out of an array and anything has been written to the array, the array will have a newer generation number, and the drive with the older generation number should not be accepted back unless it is re-initialized. The only exception would be a special recovery option for rebuilding an array with too many missing drives, where re-adding a drive that is older than desired is still better than nothing at all.
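The acceptance policy proposed above can be sketched in a few lines. This is purely illustrative: `should_accept_device`, its parameters, and the `force_recovery` flag are hypothetical names invented for this sketch, not part of btrfs or any real kernel interface.

```python
# Hypothetical sketch of the device-acceptance policy proposed above:
# a re-appearing device whose superblock generation lags the array's
# current generation is rejected, unless the caller explicitly asks
# for forced recovery (the "better than nothing" case).

def should_accept_device(device_generation: int,
                         array_generation: int,
                         force_recovery: bool = False) -> bool:
    """Return True if a re-appearing device may rejoin the array."""
    if device_generation == array_generation:
        return True   # device is in sync with the array, safe to re-add
    if device_generation > array_generation:
        return False  # device claims to be newer than the array: refuse
    # Device is stale. Only accept it under an explicit recovery
    # override, e.g. when too many drives are missing and stale data
    # beats no data at all.
    return force_recovery
```

With such a check, the automatic re-add described in #1 would refuse the stale drive by default, and the silent regression in #2 could not happen without an explicit operator decision.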
Comment 1 Mathieu Jobin 2015-11-16 12:14:00 UTC
any update on this bug ?
still current as of Linux 4.2 ?
Comment 2 Marc MERLIN 2015-11-16 15:25:43 UTC
On Mon, Nov 16, 2015 at 12:14:00PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> any update on this bug ?
> still current as of Linux 4.2 ?

I have no idea, I haven't used swraid5 on btrfs in a long time.
You can try it out and report back.

Comment 3 Don 2016-08-09 08:30:49 UTC
I tested this with RAID6 and can confirm that this bug is still present in kernel 4.4 (Ubuntu 16.04 LTS).

After a disk became unavailable, I took it out and reinserted it into the slot. BTRFS then automatically started using the disk again as if nothing had happened. After running a scrub, the entire filesystem became corrupted beyond repair.

On IRC it became clear that quite a few people have experienced this. Afterwards the wiki was updated to reflect the current state of RAID5/6 and the following text was added:
"The parity RAID code has multiple serious data-loss bugs in it. It should not be used for anything other than testing purposes."
Comment 4 liubo 2017-03-25 04:18:10 UTC

If it is scrub that screws up your btrfs, I think this patch[1] would help a bit.

Assuming we're using the default copy-on-write mode, unless the failed disk pretends to complete every write sent down to it (i.e. it never reports an error to the upper layer), the stale data would not be referenced: when an error shows up while writing data back to disk, the related metadata is never updated to point at it. On the other hand, if the filesystem's metadata hits errors while being written to disk, the filesystem would (or should) flip to read-only.

[1]: https://patchwork.kernel.org/patch/9642241/
Comment 5 Chris Tam 2022-01-12 11:22:37 UTC

I can confirm this bug is still present in kernel 5.10.84. In this version, btrfs will occasionally refuse to mount if it detects an older generation number, but there doesn't seem to be a way to add the disk back to the array without wiping the disk and using `btrfs device replace`.
