Bug 12469
| Summary: | XFS : Corruption of in-memory data | | |
|---|---|---|---|
| Product: | File System | Reporter: | Haphaeu (phreytaz) |
| Component: | XFS | Assignee: | Dave Chinner (david) |
| Status: | CLOSED CODE_FIX | | |
| Severity: | normal | CC: | rjw |
| Priority: | P1 | | |
| Hardware: | All | | |
| OS: | Linux | | |
| Kernel Version: | 2.6.29-rc1 (20090116 last tested) | Subsystem: | |
| Regression: | Yes | Bisected commit-id: | |
| Bug Depends on: | | | |
| Bug Blocks: | 12398 | | |

Description
Haphaeu
2009-01-17 10:57:58 UTC
So your reproducer steps hit it every time? That'd be awesome.

I'll go test it (though I only have x86_64 to do so...)

Could you attach the file /opt/firefox/profile/urlclassifier3.sqlite to the bug please?

Oh, of course we don't need that file itself :) But can you please provide info on exactly how large it is (ls -l)

One other thing, can you confirm that the xfs_info geometry shown is that which is failing? I'm a little confused because xfsprogs 2.10.1 with "mkfs.xfs -b size=1024" should not have those defaults (attr=0, logver=0, agcount=16)

and still one more thing :)

If the source sqlite file is on xfs, can you paste the results of xfs_bmap -v for it?

And also xfs_bmap -v of the file after you copy it to your test fs and sync it?

Thanks,
-Eric

(In reply to comment #1)
> So your reproducer steps hit it every time? That'd be awesome.
>
> I'll go test it (though I only have x86_64 to do so...)
>
> Could you attach the file /opt/firefox/profile/urlclassifier3.sqlite to the bug please?
>

hi Eric

on very first time, i was dumped the audio from a .mkv file after boot with the new kernel (heavily r/w)

occurs randomly, when i copy/move a "big" file (~100mb) to a XFS fs...
but, if i "zeroes" a file (1mb bigger) the XFS screw up more frequently... (9/10 times)

# > bzImage, crash (my 1.9mb kernel img)
# dd if=/dev/zero of=file bs=10M count=1 ; sync ; > file , crash
# > /etc/fstab , does not crash (~500 bytes)
# dd if=/dev/zero of=file bs=64K count=1 ; sync ; > file , does not crash

(In reply to comment #2)
> Oh, of course we don't need that file itself :) But can you please provide info on exactly how large it is (ls -l)
>

-rw-r--r-- 1 root root 15765504 2009-01-17 20:17 /opt/firefox/profile/urlclassifier3.sqlite

(In reply to comment #3)
> One other thing, can you confirm that the xfs_info geometry shown is that which is failing?
> I'm a little confused because xfsprogs 2.10.1 with "mkfs.xfs -b size=1024" should not have those defaults (attr=0, logver=0, agcount=16)
>

[root@0x29A]# xfs_info /dev/sdb1
meta-data=/dev/root              isize=256    agcount=16, agsize=2509162 blks
         =                       sectsz=512   attr=0
data     =                       bsize=1024   blocks=40146592, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=1024   blocks=19602, version=1
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0

i use only -b size=1024, no other arg

(In reply to comment #4)
> and still one more thing :)
>
> If the source sqlite file is on xfs, can you paste the results of xfs_bmap -v for it?
>
> And also xfs_bmap -v of the file after you copy it to your test fs and sync it?
>
> Thanks,
> -Eric
>

[root@0x29A]# xfs_bmap -v urlclassifier3.sqlite
urlclassifier3.sqlite:
 EXT: FILE-OFFSET      BLOCK-RANGE          AG AG-OFFSET          TOTAL
   0: [0..31039]:      24796346..24827385    4 (4723050..4754089) 31040

(In reply to comment #8)
> (In reply to comment #4)
> > and still one more thing :)
> >
> > If the source sqlite file is on xfs, can you paste the results of xfs_bmap -v for it?
> >
> > And also xfs_bmap -v of the file after you copy it to your test fs and sync it?
> >
> > Thanks,
> > -Eric
> >

before:

[root@0x29A]# xfs_bmap -v urlclassifier3.sqlite
urlclassifier3.sqlite:
 EXT: FILE-OFFSET      BLOCK-RANGE       AG AG-OFFSET          TOTAL
   0: [0..1727]:       2835266..2836993   0 (2835266..2836993)  1728
   1: [1728..4999]:    2597070..2600341   0 (2597070..2600341)  3272
   2: [5000..6455]:    2595614..2597069   0 (2595614..2597069)  1456
   3: [6456..8791]:    2593278..2595613   0 (2593278..2595613)  2336
   4: [8792..11343]:   2590726..2593277   0 (2590726..2593277)  2552
   5: [11344..12151]:  3459672..3460479   0 (3459672..3460479)   808
   6: [12152..14047]:  3461146..3463041   0 (3461146..3463041)  1896
   7: [14048..15167]:  3467366..3468485   0 (3467366..3468485)  1120
   8: [15168..16927]:  3469670..3471429   0 (3469670..3471429)  1760
   9: [16928..17839]:  3474238..3475149   0 (3474238..3475149)   912
  10: [17840..19231]:  3477534..3478925   0 (3477534..3478925)  1392
  11: [19232..19679]:  3480350..3480797   0 (3480350..3480797)   448
  12: [19680..20607]:  3486738..3487665   0 (3486738..3487665)   928
  13: [20608..21615]:  3708944..3709951   0 (3708944..3709951)  1008
  14: [21616..24367]:  4253798..4256549   0 (4253798..4256549)  2752
  15: [24368..27215]:  4256972..4259819   0 (4256972..4259819)  2848
  16: [27216..29535]:  4262584..4264903   0 (4262584..4264903)  2320
  17: [29536..29687]:  4266048..4266199   0 (4266048..4266199)   152
  18: [29688..30319]:  4272432..4273063   0 (4272432..4273063)   632
  19: [30320..30791]:  4273532..4274003   0 (4273532..4274003)   472
  20: [30792..31039]:  5717636..5717883   1 (699312..699559)     248

after:

[root@0x29A]# xfs_bmap -v urlclassifier3.sqlite
urlclassifier3.sqlite:
 EXT: FILE-OFFSET      BLOCK-RANGE          AG AG-OFFSET          TOTAL
   0: [0..31039]:      24796346..24827385    4 (4723050..4754089) 31040

Does this reproduce it for you, then? doesn't for me, unfortunately:

mkdir mnt
mkfs.xfs -b size=1024 -d file,name=fsfile,size=40146592b,agcount=16 -i attr=0 -l version=1
dd if=/dev/zero bs=1 count=1 seek=15765503 of=bigfile
mount -o loop,rw,noatime,nodiratime fsfile mnt/
cp --sparse=never bigfile mnt/
sync
cd mnt
sync
> bigfile
cd ..
dmesg | grep CORRUPTED

One more thing to try for a reproducer if the above doesn't work.

mkfs, populate, sync as shown in the original comment.
unmount
make an xfs_metadump image of the device
xfs_mdrestore back to the device
mount the device with your mount options
do the "> testfile" test

does that reproduce it? if so, please provide the xfs_metadump image.

if you need a hand with these steps, please let me know.

another thing, could you attach your mkfs binary? It may help us reproduce, since you seem to have something ... unique.

Eric, your test won't reproduce because it doesn't fragment sufficiently. can I suggest using xfs_io to write each extent in reverse order (i.e. write the file backwards) synchronously using direct IO to get the file written out into 20 extents before nulling it....

Dave, according to comment #9, the file on the filesystem in question has a single extent (although the source was fragmented). It doesn't seem that a multi-extent file is the only condition for the bug. I created a 100M file with 72K+ extents and no corruption.

cxfsopus16 6# l /xfstest/file1
-rw-r--r-- 1 root root 104857600 2009-01-17 16:56 /xfstest/file1
cxfsopus16 7# xfs_bmap -v /xfstest/file1 | wc
72802 436807 5086085
cxfsopus16 8#

Out of curiosity, so we can see what btree operations are actually happening on your machine, can you dump /proc/fs/xfs/stats and post it?

Eric, i can't reproduce it with the xfs_io trick - I've produced an identical block map and it doesn't corrupt when overwriting it. Also, "> to_a_file" leaves me with a zero length file with new extents, not a file full of zeroed blocks....

Bug has been found and fixed. See here:

http://oss.sgi.com/archives/xfs/2009-01/msg00645.html

Should be on its way up to the main tree now.
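
For anyone comparing the "before" and "after" xfs_bmap -v listings quoted earlier in this thread, a small script can do the counting. The following is only a sketch: the name `summarize` and the regular expression are illustrative and not part of xfsprogs, and the parser assumes rows shaped like the ones pasted above (rows that xfs_bmap prints for holes are simply skipped). xfs_bmap reports offsets and lengths in units of 512-byte blocks.

```python
import re

# One extent row of `xfs_bmap -v` output, for example:
#   0: [0..1727]: 2835266..2836993 0 (2835266..2836993) 1728
# Columns: EXT, FILE-OFFSET, BLOCK-RANGE, AG, AG-OFFSET, TOTAL.
EXTENT_ROW = re.compile(
    r"(\d+):\s+\[(\d+)\.\.(\d+)\]:\s+(\d+)\.\.(\d+)\s+(\d+)"
    r"\s+\((\d+)\.\.(\d+)\)\s+(\d+)"
)


def summarize(bmap_text):
    """Count extents and total 512-byte blocks in `xfs_bmap -v` output."""
    rows = [tuple(int(g) for g in m.groups())
            for m in EXTENT_ROW.finditer(bmap_text)]
    blocks = sum(row[8] for row in rows)        # TOTAL column
    return {
        "extents": len(rows),
        "ags": sorted({row[5] for row in rows}),  # AG column
        "blocks_512": blocks,
        "bytes": blocks * 512,
    }


# Listings copied from the thread above.
BEFORE = """
0: [0..1727]: 2835266..2836993 0 (2835266..2836993) 1728
1: [1728..4999]: 2597070..2600341 0 (2597070..2600341) 3272
2: [5000..6455]: 2595614..2597069 0 (2595614..2597069) 1456
3: [6456..8791]: 2593278..2595613 0 (2593278..2595613) 2336
4: [8792..11343]: 2590726..2593277 0 (2590726..2593277) 2552
5: [11344..12151]: 3459672..3460479 0 (3459672..3460479) 808
6: [12152..14047]: 3461146..3463041 0 (3461146..3463041) 1896
7: [14048..15167]: 3467366..3468485 0 (3467366..3468485) 1120
8: [15168..16927]: 3469670..3471429 0 (3469670..3471429) 1760
9: [16928..17839]: 3474238..3475149 0 (3474238..3475149) 912
10: [17840..19231]: 3477534..3478925 0 (3477534..3478925) 1392
11: [19232..19679]: 3480350..3480797 0 (3480350..3480797) 448
12: [19680..20607]: 3486738..3487665 0 (3486738..3487665) 928
13: [20608..21615]: 3708944..3709951 0 (3708944..3709951) 1008
14: [21616..24367]: 4253798..4256549 0 (4253798..4256549) 2752
15: [24368..27215]: 4256972..4259819 0 (4256972..4259819) 2848
16: [27216..29535]: 4262584..4264903 0 (4262584..4264903) 2320
17: [29536..29687]: 4266048..4266199 0 (4266048..4266199) 152
18: [29688..30319]: 4272432..4273063 0 (4272432..4273063) 632
19: [30320..30791]: 4273532..4274003 0 (4273532..4274003) 472
20: [30792..31039]: 5717636..5717883 1 (699312..699559) 248
"""

AFTER = """
0: [0..31039]: 24796346..24827385 4 (4723050..4754089) 31040
"""

if __name__ == "__main__":
    print("before:", summarize(BEFORE))
    print("after: ", summarize(AFTER))
```

Run against the two listings, both summaries come to 31040 blocks (15892480 bytes). The source file is spread over 21 extents in AGs 0 and 1, while the copy on the test filesystem is a single extent in AG 4, which is the point made above about the failing file not being fragmented.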
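
The suggestion above to write the file backwards, synchronously, so that it ends up in many extents, can be approximated without xfs_io. This is a rough sketch under stated assumptions, not the procedure actually used in this bug: `write_backwards` is a hypothetical helper, it opens the file with O_SYNC rather than using direct IO (O_DIRECT needs aligned buffers and is left out here), and whether the result is really fragmented depends on the filesystem's allocator and free space.

```python
import os


def write_backwards(path, total_bytes, chunk_bytes, fill=b"\xab"):
    """Write total_bytes to path in chunk_bytes pieces, highest offset first.

    The file is opened with O_SYNC, so each chunk is pushed to disk before
    the next, lower-offset chunk is written. Returns the number of chunks.
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC
    fd = os.open(path, flags, 0o644)
    try:
        offsets = list(range(0, total_bytes, chunk_bytes))
        for offset in reversed(offsets):
            length = min(chunk_bytes, total_bytes - offset)
            data = fill * length
            written = 0
            # pwrite may write fewer bytes than requested, so loop until done.
            while written < length:
                written += os.pwrite(fd, data[written:], offset + written)
    finally:
        os.close(fd)
    return len(offsets)
```

For a file the size of the one in this report, something like `write_backwards("bigfile", 15765504, 1024 * 1024)` on the test filesystem, followed by `xfs_bmap -v bigfile`, would show how many extents the file actually received before trying the "> bigfile" truncation.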