Bug 72811
Summary: | If a raid5 array gets into degraded mode, gets modified, and the missing drive re-added, the filesystem loses state | ||
---|---|---|---|
Product: | File System | Reporter: | Marc MERLIN (marc) |
Component: | btrfs | Assignee: | Josef Bacik (josef) |
Status: | NEW --- | ||
Severity: | blocking | CC: | anthony, bo.li.liu, chupaka, don, dsterba, jannik.winkel, kb9vqf, kernelbugzilla, marc, mspeder, somekool, szg00000, troy |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 3.14, 5.10 | Subsystem: | |
Regression: | No | Bisected commit-id: |
Description
Marc MERLIN
2014-03-23 19:17:15 UTC
any update on this bug ? still current as of Linux 4.2 ? On Mon, Nov 16, 2015 at 12:14:00PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote: > any update on this bug ? > still current as of Linux 4.2 ? I have no idea, I haven't used swraid5 on btrfs in a long time. You can try it out and report back. Marc I tested this with RAID6 and can confirm that this bug is still present in kernel 4.4 (Ubuntu 16.04 LTS). After a disk became unavailable, I took it out and reinserted it into the slot. BTRFS then automatically started using the disk again as if nothing had happened. After running a scrub the entire filesystem became corrupted beyond repair. On IRC it became clear that quite a few people have experienced this. Afterwards the wiki was updated to reflect the current state of RAID5/6 and the following text was added: "The parity RAID code has multiple serious data-loss bugs in it. It should not be used for anything other than testing purposes." https://btrfs.wiki.kernel.org/index.php/RAID56 Hi, If it is scrub that screws up your btrfs, I think this patch[1] would help a bit. Assuming we're using the default copy-on-write mode, unless the failed disk pretends to get every write down to it (i.e. it didn't report an error to the upper layer), the data would not be found since an error showed up during writing back data to disk so that the related metadata would not be updated. On the other hand, if filesystem's metadata got errors when being written to disk, the filesystem would(or should) flip to readonly. [1]: https://patchwork.kernel.org/patch/9642241/ Hi, I can confirm this bug is still present in kernel version 5.10.84. In this version, btrfs will occasionally refuse to mount if it detects a generation version that's older, but there doesn't seem to be a way to add the disk back to the array without wiping the disk and using `btrfs device replace`. |