Bug 6757
Summary: | repeated slight XFS corruption | ||
---|---|---|---|
Product: | File System | Reporter: | Martin Steigerwald (Martin) |
Component: | XFS | Assignee: | XFS Guru (xfs-masters) |
Status: | CLOSED CODE_FIX | ||
Severity: | high | ||
Priority: | P2 | ||
Hardware: | i386 | ||
OS: | Linux | ||
Kernel Version: | 2.6.17.1 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: | the corruption errors that XFS wrote to syslog; xfs_check and xfs_repair output; output of lspci and lspci -vvn; configuration of the kernel I used (2.6.17.1 with sws2-2.2.6); patch that might fix the issue |
Description
Martin Steigerwald
2006-06-27 15:12:48 UTC
Created attachment 8427 [details]
the corruption errors that XFS wrote to syslog
Created attachment 8428 [details]
xfs_check and xfs_repair output
Created attachment 8429 [details]
output of lspci and lspci -vvn
Created attachment 8430 [details]
configuration of the kernel I used (2.6.17.1 with sws2-2.2.6)
I will now reboot into a 2.6.17.1 without software suspend 2. I will not use software suspend 2 nor the new userspace software suspend for a while, to rule out that it's suspend-related. Actually I highly doubt that it's suspend-related, but it still makes sense to test it.

Make sure you're using Mandy's patch that I sent to the list earlier too... I'll send that to the -stable folks today.

cheers.

Created attachment 8452 [details]
patch that might fix the issue

I am currently testing a patch that Nathan Scott sent to the stable kernel team and apparently CC'd to me. From what I understand, this patch may fix the issue I am seeing. For now all seems fine, but it's too early to say anything definite.

Here is that patch, including the description of what it does:

Fix nused counter. It's currently getting set to -1 rather than getting decremented by 1. Since nused never reaches 0, the "if (!free->hdr.nused)" check in xfs_dir2_leafn_remove() fails every time and xfs_dir2_shrink_inode() doesn't get called when it should. This causes extra blocks to be left on an empty directory and the directory is unable to be converted back to inline extent mode.

Signed-off-by: Mandy Kirkconnell <alkirkco@sgi.com>
Signed-off-by: Nathan Scott <nathans@sgi.com>

--- a/fs/xfs/xfs_dir2_node.c	2006-06-28 08:20:56.000000000 +1000
+++ b/fs/xfs/xfs_dir2_node.c	2006-06-28 08:20:56.000000000 +1000
@@ -972,7 +972,7 @@ xfs_dir2_leafn_remove(
 	/*
 	 * One less used entry in the free table.
 	 */
-	free->hdr.nused = cpu_to_be32(-1);
+	be32_add(&free->hdr.nused, -1);
 	xfs_dir2_free_log_header(tp, fbp);
 	/*
 	 * If this was the last entry in the table, we can

Mandy's patch seems to fix this issue. I had three days of production use without any corruption, and an rsync backup with lots of file and directory deletion (SUSE 10.0 -> 10.1 update on one partition) also worked well.

Thanks Mandy and Nathan! This should really go into the next stable kernel patch.

Another 24 days without corruption. This patch really seems to be fine!
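For readers who want to see why the one-line change in the patch above matters, here is a minimal user-space sketch in C. It is not the kernel code: the `demo_*` helpers and the two `remove_entry_*` functions are hypothetical stand-ins written for this illustration, assumed to behave like the `cpu_to_be32()` and `be32_add()` calls named in the patch. Each function returns 1 when the counter reads as zero afterwards, which is the condition the "if (!free->hdr.nused)" check relies on.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Illustrative typedef: a 32-bit value held in big-endian (on-disk) byte order. */
typedef uint32_t demo_be32;

/* Hypothetical stand-in for cpu_to_be32(); works on any host endianness. */
static demo_be32 demo_cpu_to_be32(uint32_t v)
{
    unsigned char b[4] = {
        (unsigned char)(v >> 24), (unsigned char)(v >> 16),
        (unsigned char)(v >> 8),  (unsigned char)v
    };
    demo_be32 out;
    memcpy(&out, b, sizeof(out));
    return out;
}

/* Hypothetical stand-in for be32_to_cpu(). */
static uint32_t demo_be32_to_cpu(demo_be32 v)
{
    unsigned char b[4];
    memcpy(b, &v, sizeof(b));
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Hypothetical stand-in for be32_add(): add delta to a big-endian counter in place. */
static void demo_be32_add(demo_be32 *p, int32_t delta)
{
    assert(p != NULL);
    *p = demo_cpu_to_be32(demo_be32_to_cpu(*p) + (uint32_t)delta);
}

/* Pre-patch behaviour: the counter is overwritten with -1 instead of decremented. */
static int remove_entry_buggy(demo_be32 *nused)
{
    assert(nused != NULL);
    *nused = demo_cpu_to_be32((uint32_t)-1);
    return !*nused;   /* never true: the counter now reads 0xFFFFFFFF */
}

/* Post-patch behaviour: the counter is decremented by one. */
static int remove_entry_fixed(demo_be32 *nused)
{
    demo_be32_add(nused, -1);
    return !*nused;   /* true once the last used entry has been removed */
}
```

Starting from a counter of 1, `remove_entry_buggy()` leaves it at 0xFFFFFFFF and returns 0, so the shrink path would never be taken. `remove_entry_fixed()` leaves it at 0 and returns 1.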
2.6.17.7 contains the patch, so I am closing this. Kudos to the stable kernel team for finally including it!

Just some additional information for those who were hit by this bug:

Mandy Kirkconnell, XFS: corruption fix: http://marc.theaimsgroup.com/?t=115315520200004&r=1&w=2
XFS FAQ, What is the issue with directory corruption in Linux 2.6.17?: http://oss.sgi.com/projects/xfs/faq.html#dir2
Barry Naujok, Review: xfs_repair fixes for dir2 corruption, 28 July 2006: http://oss.sgi.com/archives/xfs/2006-07/msg00374.html