Bug 8144
Summary: | raid5 disk failure followed by xfs filesystem corruption | ||
---|---|---|---|
Product: | File System | Reporter: | lazx888 |
Component: | XFS | Assignee: | XFS Guru (xfs-masters) |
Status: | REJECTED INSUFFICIENT_DATA | ||
Severity: | high | CC: | neilb, protasnb, sandeen-xfs |
Priority: | P2 | ||
Hardware: | i386 | ||
OS: | Linux | ||
Kernel Version: | 2.6.18 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: | Error messages from /var/log/messages |
Description
lazx888
2007-03-07 17:48:13 UTC
Created attachment 10648
Error messages from /var/log/messages
There was a bug in raid5 in 2.6.19 and earlier whereby error returns weren't properly recognised by the filesystem (depending on the filesystem): we cleared the UPTODATE bit but passed a '0' error code. In this case it was probably a read-ahead request that failed due to lack of resources, as much of the stripe cache was tied up with retries on the failed drive.

I don't know if this analysis meshes with the reality of how XFS works; the code is a bit too complex for me to follow easily. I think this bug should possibly be assigned to someone with XFS knowledge who can comment on whether that is a possible explanation... I wonder how I do that... Maybe I do it like that... accept the bug first, then reassign...

Neil: I still have the machine off and in a "broken" state. I am planning on redoing the array soon and was wondering if you need any other info before I do this. Thanks.

Have you been able to bring up your RAID and maybe do more testing with newer kernels? Is the problem still there? Thanks.
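For readers following the raid5 analysis above (UPTODATE bit cleared while a '0' error code is passed), here is a minimal toy model of why a filesystem could miss such a failure. It is a sketch only and not kernel code: `Bio`, `raid5_fail_read`, `end_io_error_only`, and `end_io_checks_flag` are hypothetical names invented for illustration, and whether XFS completion handling in 2.6.18 actually behaved like the first handler is exactly the open question in this report.

```python
from dataclasses import dataclass


@dataclass
class Bio:
    """Toy stand-in for a block I/O request (hypothetical, not the kernel's struct bio)."""
    uptodate: bool = True  # models the UPTODATE bit mentioned in the report


def raid5_fail_read(bio: Bio) -> int:
    """Models the described behaviour: clear UPTODATE but still return error code 0."""
    bio.uptodate = False
    return 0


def end_io_error_only(bio: Bio, error: int) -> str:
    """Completion handler that trusts only the error code (assumed vulnerable pattern)."""
    return "ok" if error == 0 else "failed"


def end_io_checks_flag(bio: Bio, error: int) -> str:
    """Completion handler that also consults the UPTODATE bit."""
    return "ok" if error == 0 and bio.uptodate else "failed"


if __name__ == "__main__":
    bio = Bio()
    err = raid5_fail_read(bio)
    # The first handler accepts the failed read; the second rejects it.
    print(end_io_error_only(bio, err), end_io_checks_flag(bio, err))
```

In this model, a handler that looks only at the error code treats the failed read as valid data, which is one way stale or garbage buffers could end up being interpreted as filesystem metadata.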