Bug 98501
| Summary: | md raid0 w/ fstrim causing data loss | | |
|---|---|---|---|
| Product: | IO/Storage | Reporter: | Eric Work (work.eric) |
| Component: | MD | Assignee: | io_md |
| Status: | RESOLVED CODE_FIX | | |
| Severity: | blocking | CC: | evangelos, josh, jskier, mat.jonczyk, mike, neilb, odi, renatomefidf |
| Priority: | P1 | | |
| Hardware: | x86-64 | | |
| OS: | Linux | | |
| Kernel Version: | 3.19.7 | Subsystem: | |
| Regression: | Yes | Bisected commit-id: | |

Attachments:
- Move reassignment of "sector" in raid0_make_request
- md/raid0: fix restore to sector variable in raid0_make_request
Description
Eric Work
2015-05-17 19:41:19 UTC
Created attachment 177181
Move reassignment of "sector" in raid0_make_request
I reverted commit 7595f5425cad83e037639e228ee24d5052510139 and the problem went away. That commit created the problem by reassigning "sector" at the wrong location: at the point where it does the reassignment, "bio" has already been advanced. I have no idea what this code is really doing or what these variables contain, but I did a bit of research with cscope and found that "bio_split" calls a *_iter_advance function, and that call happens before the reassignment. I'll need to do some more testing tomorrow to see whether this patch fixes the problem while maintaining the goal of the original fix. The bug is still present in linux-stable.git and Linus' tree.
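For readers unfamiliar with this code path, here is a minimal, runnable userspace sketch of the ordering problem described above. Everything in it (`fake_bio`, `fake_split`, `fake_sector_div`, the example sector numbers) is a simplified stand-in of my own, not the kernel's `struct bio`, `bio_split()` or `sector_div()`; it only models the fact that the split advances the parent's iterator, so reading the start sector from the parent afterwards returns the wrong value.

```c
/* Simplified userspace stand-ins for struct bio, bio_split() and
 * sector_div(); not the kernel's actual code. */
#include <stdio.h>

typedef unsigned long long sector_t;

struct fake_bio {
	sector_t start;     /* models bio->bi_iter.bi_sector */
	unsigned nsectors;  /* models the remaining request length */
};

/* Models bio_split(): carve off 'n' sectors and advance the parent. */
static struct fake_bio fake_split(struct fake_bio *parent, unsigned n)
{
	struct fake_bio split = { parent->start, n };
	parent->start += n;       /* the *_iter_advance step mentioned above */
	parent->nsectors -= n;
	return split;
}

/* Models sector_div(): divides in place and returns the remainder,
 * which is why 'sector' needs restoring afterwards. */
static unsigned fake_sector_div(sector_t *s, unsigned base)
{
	unsigned rem = (unsigned)(*s % base);
	*s /= base;
	return rem;
}

int main(void)
{
	struct fake_bio bio = { 1000, 256 };  /* request starting at sector 1000 */
	unsigned chunk_sects = 384;           /* non-power-of-2 chunk size, in sectors */

	sector_t sector = bio.start;
	unsigned n = chunk_sects - fake_sector_div(&sector, chunk_sects);
	if (n > bio.nsectors)
		n = bio.nsectors;

	struct fake_bio split = fake_split(&bio, n);

	/* Buggy ordering from 7595f5425cad: restore AFTER the split, so the
	 * already-advanced parent iterator is picked up instead. */
	sector = bio.start;

	printf("restored after split : %llu (wrong)\n", sector);
	printf("split actually starts: %llu\n", split.start);
	return 0;
}
```

Compiled with a plain C compiler, this prints a restored sector of 1152 against a real split start of 1000; that kind of offset mismatch is what can turn a DISCARD of one region into a DISCARD of another.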
Created attachment 177291
md/raid0: fix restore to sector variable in raid0_make_request
I can confirm with reasonably strong confidence that the attached patch fixes the mentioned regression. After three rounds of the above procedure I see no differences; going back to the unpatched kernel, I see a difference after the first round.
The updated patch is now in "git am" format.
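Based on the attachment titles and the description above, the fix amounts to restoring "sector" before bio_split() advances the bio rather than after. Below is the same simplified userspace sketch with only that ordering changed; again a hedged stand-in, not the kernel's actual raid0_make_request.

```c
/* Same simplified stand-ins as the sketch above; the only change is
 * that 'sector' is restored BEFORE the split. Not kernel code. */
#include <stdio.h>

typedef unsigned long long sector_t;

struct fake_bio {
	sector_t start;
	unsigned nsectors;
};

static struct fake_bio fake_split(struct fake_bio *parent, unsigned n)
{
	struct fake_bio split = { parent->start, n };
	parent->start += n;
	parent->nsectors -= n;
	return split;
}

static unsigned fake_sector_div(sector_t *s, unsigned base)
{
	unsigned rem = (unsigned)(*s % base);
	*s /= base;
	return rem;
}

int main(void)
{
	struct fake_bio bio = { 1000, 256 };
	unsigned chunk_sects = 384;

	sector_t sector = bio.start;
	unsigned n = chunk_sects - fake_sector_div(&sector, chunk_sects);
	if (n > bio.nsectors)
		n = bio.nsectors;

	/* Restore the sector clobbered by the division while the parent
	 * iterator still points at the start of this split. */
	sector = bio.start;

	struct fake_bio split = fake_split(&bio, n);

	printf("mapped base sector: %llu (matches split start %llu)\n",
	       sector, split.start);
	return 0;
}
```

With the restore moved up, the printed base sector matches the split's start (1000), which preserves the intent of the original non-power-of-2 chunk-size fix.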
Thanks a lot, and sorry for letting that bug in. The patch will go to Linus shortly.

It was a bit of a pain to recover from, but it was an interesting challenge to find and test the fix. I got really lucky that I had been keeping my Fedora kernel up to date, so there were only a few versions to check :-)

I just lost an entire partition to this bug. I have fstrim.timer running, and unfortunately it ran right before kernel 3.19.7-200 was updated on my Fedora 21. After running e2fsck there are a lot of inode corruption messages. Any help on how I can rebuild this partition? Should I replace my journal or something like that? The first indication for those looking for this error is a huge number of errors from ldconfig, complaining about the libraries' headers. P.S.: Yes, I should be sleeping now, but I'm looking for a solution (I didn't lose my home folder, but everything else is lost). Thank you.

Does turning off "discard" and not running fstrim entirely avoid this bug, as a workaround until stable kernels get the fix?

Probably yes.

josh, keep away from all fstrim operations on raid0.

A 4K (or larger) IO request that is not 4K-aligned on the array can still be handled wrongly. Most filesystems do all their IO 4K-aligned, so problems are unlikely. However, if you partition an md/raid0 array with non-4K alignment, you could hit problems fairly easily.

Neil, do you have any suggestions for recovering the partition? Thank you!

I wish I did. "DISCARD" was told to discard something that shouldn't have been discarded. Unless there is some way to revert all recent DISCARDs, which I very much doubt, there is no way to get that discarded data back. Sorry :-(

OK Neil, I'm starting over! :) How do I follow this bug until the fix is released?

The fix for this bug has been merged into Linus' tree as commit a81157768a00e8cf8a7b43b5ea5cac931262374f.

Can you please clarify whether this issue is specific to ext4 file systems (as reported by some news sites) or affects any file system with discard support? (The latter seems more likely, since the bug was in the md/raid0 layer.)

The bug is not specific to ext4. Your analysis is correct.

From what I see, this bug is limited to RAID 0 only. Is RAID 1 safe even on affected kernels? Was there any separate issue concerning RAID 1 (or any other RAID levels)?

This bug affects systems running kernel 3.19.7+ or 4.0.2+ with any filesystem on top of MD RAID 0 that supports and enables TRIM. No other RAID levels are affected. I believe Intel fakeraid is also affected. If you don't use fstrim and don't have the 'discard' option enabled in fstab, you are not affected; removing these TRIM options is also the workaround. Fedora has already included the fix in its next kernel update for F21 and F22. As of my last check, Arch Linux has not yet included it.

Arch applied the fix yesterday: https://www.archlinux.org/news/data-corruption-on-software-raid-0-when-discard-is-used/

This bug now also affects "stable" 3.18.14, because the buggy commit went in as d2c861b700b0af90da2d60b1b256173628fa6785 ("md/raid0: fix bug with chunksize not a power of 2") but the fix did not. Waving goodbye to my raid-0 for the second time.