Bug 12469 - XFS : Corruption of in-memory data
Status: CLOSED CODE_FIX
Product: File System
Classification: Unclassified
Component: XFS
Hardware: All Linux
Importance: P1 normal
Assigned To: Dave Chinner
Depends on:
Blocks: 12398
Reported: 2009-01-17 10:57 UTC by Haphaeu
Modified: 2009-01-21 19:08 UTC (History)

See Also:
Kernel Version: 2.6.29-rc1 (20090116 last tested)
Tree: Mainline
Regression: Yes


Description Haphaeu 2009-01-17 10:57:58 UTC
Latest working kernel version: 2.6.28-rc8
Distribution: slackware current (20090116)
Hardware Environment: Intel(R) Pentium(R) 4 CPU 2.66GHz on a ECS p4m800pro-m 2gb
Software Environment:
Gnu C                  4.2.4
Gnu make               3.81
binutils               2.18.50.0.9.20080822
util-linux             2.12r
mount                  2.12r
module-init-tools      3.5
e2fsprogs              1.41.3
jfsutils               1.1.12
reiserfsprogs          3.6.20
xfsprogs               2.10.1
Linux C Library        2.7
Dynamic linker (ldd)   2.7
Linux C++ Library      6.0.9
Procps                 3.2.7
Net-tools              1.60
Kbd                    1.12
Sh-utils               6.12


XFS memory corruption

The problem occurs in kernels >= 2.6.28-rc8

last git kernel tested: 2.6.29-rc1 (20090116)

simple and clean test:

mkfs.xfs -b size=1024 /dev/sdb1

mount /dev/sdb1 /mnt/sdb1 -o rw,noatime,nodiratime

create/copy a file, then try to "clear" its data (truncate it with ">")

[root@0x29A]# cp /opt/firefox/profile/urlclassifier3.sqlite /mnt/sdb1 ; sync ; cd /mnt/sdb1 ; sync

[root@0x29A]# > urlclassifier3.sqlite

XFS internal error XFS_WANT_CORRUPTED_GOTO at line 3327 of file fs/xfs/xfs_btree.c. Caller 0xc023884c

Filesystem "sdb1": XFS internal error xfs_trans_cancel at line 1164 of file fs/xfs/xfs_trans.c. Caller 0xc02637c9

Filesystem "sdb1": Corruption of in-memory data detected. Shutting down filesystem: sdb1

Please umount the filesystem, and rectify the problem(s)

bash: urlclassifier3.sqlite : structure needs cleaning
----------------------------------------------------------

tested on 2 more hdds and on another computer (my old pMMX 233mhz/64mb, just as a hardware sanity test =)

on my lilo:
append = "printk.time=1 noisapnp acpi=off irqpool"

but without this append line the problem still occurs

[root@0x29A]# xfs_info /dev/sdb1
meta-data=/dev/root              isize=256    agcount=16, agsize=2509162 blks
         =                       sectsz=512   attr=0
data     =                       bsize=1024   blocks=40146592, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=1024   blocks=19602, version=1
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0

[root@0x29A]# cat /proc/cpuinfo    
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 15
model           : 4
model name      : Intel(R) Pentium(R) 4 CPU 2.66GHz
stepping        : 9
cpu MHz         : 2666.440
cache size      : 1024 KB
fdiv_bug        : no
hlt_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 5
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe lm constant_tsc pebs bts pni dtes64 monitor ds_cpl tm2 cid cx16 xtpr
bogomips        : 5332.95
clflush size    : 64
power management:


[root@0x29A]# cat /proc/modules
snd_pcm_oss 33056 1 - Live 0xf93d7000
snd_mixer_oss 12864 1 snd_pcm_oss, Live 0xf93c5000
ext2 44708 1 - Live 0xfa7a2000
apm 15092 1 - Live 0xfa766000
via 37728 2 - Live 0xfa743000
drm 74592 3 via, Live 0xfa701000
brd 5196 1 - Live 0xfa6ac000
snd_via82xx 21016 1 - Live 0xf864c000
gameport 9064 1 snd_via82xx, Live 0xf8637000
snd_ac97_codec 95908 1 snd_via82xx, Live 0xf8610000
ac97_bus 1312 1 snd_ac97_codec, Live 0xf85e5000
snd_pcm 54632 3 snd_pcm_oss,snd_via82xx,snd_ac97_codec, Live 0xf85cf000
snd_timer 17252 1 snd_pcm, Live 0xf85b3000
snd_page_alloc 7624 2 snd_via82xx,snd_pcm, Live 0xf85a4000
snd_mpu401_uart 5696 1 snd_via82xx, Live 0xf859b000
snd_rawmidi 17248 1 snd_mpu401_uart, Live 0xf8580000
snd_seq_device 5836 1 snd_rawmidi, Live 0xf8541000
via_agp 7744 1 - Live 0xf8594000
uhci_hcd 20364 0 - Live 0xf8569000
snd 43140 9 snd_pcm_oss,snd_mixer_oss,snd_via82xx,snd_ac97_codec,snd_pcm,snd_timer,snd_mpu401_uart,snd_rawmidi,snd_seq_device, Live 0xf854c000
agpgart 27504 2 drm,via_agp, Live 0xf8524000
ehci_hcd 31628 0 - Live 0xf8508000
via_rhine 19880 0 - Live 0xf8acd000
mii 4256 1 via_rhine, Live 0xf8ab8000
shpchp 29364 0 - Live 0xf8aa7000
i2c_viapro 7476 0 - Live 0xf8a94000
psmouse 39408 0 - Live 0xf8a58000
soundcore 5120 2 snd, Live 0xf8a43000
rtc_cmos 8460 0 - Live 0xf8a27000
pci_hotplug 11432 1 shpchp, Live 0xf8a1a000
usbcore 118128 3 uhci_hcd,ehci_hcd, Live 0xf895a000
rtc_core 13496 1 rtc_cmos, Live 0xf8920000
i2c_core 19408 1 i2c_viapro, Live 0xf890b000
rtc_lib 2240 1 rtc_core, Live 0xf88f4000
serio_raw 4612 0 - Live 0xf88ea000
evdev 8352 0 - Live 0xf88ce000
sg 22512 0 - Live 0xf88a8000
Comment 1 Eric Sandeen 2009-01-17 12:59:44 UTC
So your reproducer steps hit it every time?  That'd be awesome.

I'll go test it (though I only have x86_64 to do so...)

Could you attach the file /opt/firefox/profile/urlclassifier3.sqlite to the bug please?
Comment 2 Eric Sandeen 2009-01-17 13:02:13 UTC
Oh, of course we don't need that file itself :)  But can you please provide info on exactly how large it is (ls -l)
Comment 3 Eric Sandeen 2009-01-17 13:50:08 UTC
One other thing, can you confirm that the xfs_info geometry shown is that which is failing?  I'm a little confused because xfsprogs 2.10.1 with "mkfs.xfs -b size=1024" should not have those defaults (attr=0, logver=0, agcount=16)
Comment 4 Eric Sandeen 2009-01-17 14:01:55 UTC
and still one more thing :)

If the source sqlite file is on xfs, can you paste the results of xfs_bmap -v for it?

And also xfs_bmap -v of the file after you copy it to your test fs and sync it?

Thanks,
-Eric
Comment 5 Haphaeu 2009-01-17 14:13:49 UTC
(In reply to comment #1)
> So your reproducer steps hit it every time?  That'd be awesome.
> 
> I'll go test it (though I only have x86_64 to do so...)
> 
> Could you attach the file /opt/firefox/profile/urlclassifier3.sqlite to the bug
> please?
> 
hi Eric

the very first time, i was dumping the audio from a .mkv file after booting
with the new kernel (heavy r/w)

it occurs randomly when i copy/move a "big" file (~100mb) to an XFS fs...
but if i "zero" (truncate) a file bigger than about 1mb, XFS screws up more
frequently... (9/10 times)

# > bzImage, crash (my 1.9mb kernel img)
# dd if=/dev/zero of=file bs=10M count=1 ; sync ; > file , crash
# > /etc/fstab , does not crash (~500 bytes)
# dd if=/dev/zero of=file bs=64K count=1 ; sync ; > file , does not crash
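
The size dependence described above could be probed with a small script. This is only a minimal sketch, not the reporter's actual procedure: the helper name truncate_test, the TEST_DIR variable and the list of sizes are assumptions made for illustration. Only the dd / sync / ">" sequence comes from the comment.

```shell
# Hypothetical helper: write SIZE_KB KiB of zeroes into DIR, sync, then
# truncate the file with a ">" redirection, as in the commands above.
# Prints "<size>K ok" if the truncation worked, "<size>K FAILED" otherwise.
truncate_test() {
    dir=$1
    size_kb=$2
    f="$dir/trunc_test_${size_kb}k"
    dd if=/dev/zero of="$f" bs=1024 count="$size_kb" 2>/dev/null || {
        echo "${size_kb}K write failed"
        return 1
    }
    sync
    # "true" is used instead of ":" so that a failed redirection does not
    # abort a POSIX shell.
    if { true > "$f"; } 2>/dev/null && [ ! -s "$f" ]; then
        echo "${size_kb}K ok"
    else
        echo "${size_kb}K FAILED"
    fi
    rm -f "$f" 2>/dev/null
}

# TEST_DIR is an assumption: point it at the mounted XFS test filesystem.
TEST_DIR=${TEST_DIR:-/tmp}
for kb in 64 512 1024 2048 10240; do
    truncate_test "$TEST_DIR" "$kb"
done
```

Once XFS has shut the filesystem down, every later step fails as well, so the filesystem would have to be unmounted and recreated between runs.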

Comment 6 Haphaeu 2009-01-17 14:16:09 UTC
(In reply to comment #2)
> Oh, of course we don't need that file itself :)  But can you please provide
> info on exactly how large it is (ls -l)
> 

-rw-r--r-- 1 root root 15765504 2009-01-17 20:17 /opt/firefox/profile/urlclassifier3.sqlite

Comment 7 Haphaeu 2009-01-17 14:18:55 UTC
(In reply to comment #3)
> One other thing, can you confirm that the xfs_info geometry shown is that which
> is failing?  I'm a little confused because xfsprogs 2.10.1 with "mkfs.xfs -b
> size=1024" should not have those defaults (attr=0, logver=0, agcount=16)
> 

[root@0x29A]# xfs_info /dev/sdb1
meta-data=/dev/root              isize=256    agcount=16, agsize=2509162 blks
         =                       sectsz=512   attr=0
data     =                       bsize=1024   blocks=40146592, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=1024   blocks=19602, version=1
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0


i use only -b size=1024, no other arg
Comment 8 Haphaeu 2009-01-17 14:21:41 UTC
(In reply to comment #4)
> and still one more thing :)
> 
> If the source sqlite file is on xfs, can you paste the results of xfs_bmap -v
> for it?
> 
> And also xfs_bmap -v of the file after you copy it to your test fs and sync it?
> 
> Thanks,
> -Eric
> 

[root@0x29A]# xfs_bmap -v urlclassifier3.sqlite     
urlclassifier3.sqlite:
 EXT: FILE-OFFSET      BLOCK-RANGE        AG AG-OFFSET          TOTAL
   0: [0..31039]:      24796346..24827385  4 (4723050..4754089) 31040
Comment 9 Haphaeu 2009-01-17 14:29:57 UTC
(In reply to comment #8)
> (In reply to comment #4)
> > and still one more thing :)
> > 
> > If the source sqlite file is on xfs, can you paste the results of xfs_bmap -v
> > for it?
> > 
> > And also xfs_bmap -v of the file after you copy it to your test fs and sync it?
> > 
> > Thanks,
> > -Eric
> > 
> 

before:

[root@0x29A]# xfs_bmap -v urlclassifier3.sqlite                       
urlclassifier3.sqlite:
 EXT: FILE-OFFSET      BLOCK-RANGE      AG AG-OFFSET          TOTAL
   0: [0..1727]:       2835266..2836993  0 (2835266..2836993)  1728
   1: [1728..4999]:    2597070..2600341  0 (2597070..2600341)  3272
   2: [5000..6455]:    2595614..2597069  0 (2595614..2597069)  1456
   3: [6456..8791]:    2593278..2595613  0 (2593278..2595613)  2336
   4: [8792..11343]:   2590726..2593277  0 (2590726..2593277)  2552
   5: [11344..12151]:  3459672..3460479  0 (3459672..3460479)   808
   6: [12152..14047]:  3461146..3463041  0 (3461146..3463041)  1896
   7: [14048..15167]:  3467366..3468485  0 (3467366..3468485)  1120
   8: [15168..16927]:  3469670..3471429  0 (3469670..3471429)  1760
   9: [16928..17839]:  3474238..3475149  0 (3474238..3475149)   912
  10: [17840..19231]:  3477534..3478925  0 (3477534..3478925)  1392
  11: [19232..19679]:  3480350..3480797  0 (3480350..3480797)   448
  12: [19680..20607]:  3486738..3487665  0 (3486738..3487665)   928
  13: [20608..21615]:  3708944..3709951  0 (3708944..3709951)  1008
  14: [21616..24367]:  4253798..4256549  0 (4253798..4256549)  2752
  15: [24368..27215]:  4256972..4259819  0 (4256972..4259819)  2848
  16: [27216..29535]:  4262584..4264903  0 (4262584..4264903)  2320
  17: [29536..29687]:  4266048..4266199  0 (4266048..4266199)   152
  18: [29688..30319]:  4272432..4273063  0 (4272432..4273063)   632
  19: [30320..30791]:  4273532..4274003  0 (4273532..4274003)   472
  20: [30792..31039]:  5717636..5717883  1 (699312..699559)     248

after:

[root@0x29A]# xfs_bmap -v urlclassifier3.sqlite     
 urlclassifier3.sqlite:
  EXT: FILE-OFFSET      BLOCK-RANGE        AG AG-OFFSET          TOTAL
    0: [0..31039]:      24796346..24827385  4 (4723050..4754089) 31040
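
The "before" and "after" listings can also be compared mechanically. A minimal sketch, assuming the xfs_bmap -v column layout shown in this report (TOTAL is the sixth field, counted in 512-byte blocks); summarize_bmap is a hypothetical helper name.

```shell
# Count the extents and sum the TOTAL column of "xfs_bmap -v" output read
# from stdin. Header lines and holes are skipped.
summarize_bmap() {
    awk '
        /^[ \t]*[0-9]+: \[/ && $3 != "hole" { n++; total += $6 }
        END { printf "extents=%d blocks=%d\n", n, total }
    '
}

# Demo on the single-extent "after" listing from this comment.
summarize_bmap <<'EOF'
urlclassifier3.sqlite:
 EXT: FILE-OFFSET      BLOCK-RANGE        AG AG-OFFSET          TOTAL
   0: [0..31039]:      24796346..24827385  4 (4723050..4754089) 31040
EOF
# prints: extents=1 blocks=31040
```

On a live system the usage would be "xfs_bmap -v urlclassifier3.sqlite | summarize_bmap". Fed the "before" listing, the same function reports extents=21 blocks=31040, so both copies map the same number of blocks and differ only in fragmentation.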
 
Comment 10 Eric Sandeen 2009-01-17 14:31:17 UTC
Does this reproduce it for you, then?  doesn't for me, unfortunately:

	mkdir mnt
	mkfs.xfs -b size=1024 -d file,name=fsfile,size=40146592b,agcount=16 -i attr=0 -l version=1
	dd if=/dev/zero bs=1 count=1 seek=15765503 of=bigfile
	mount -o loop,rw,noatime,nodiratime fsfile mnt/
	cp --sparse=never bigfile mnt/
	sync
	cd mnt
	sync
	> bigfile
	cd ..
	dmesg | grep CORRUPTED

Comment 11 Eric Sandeen 2009-01-17 14:51:01 UTC
One more thing to try for a reproducer if the above doesn't work.

mkfs, populate, sync as shown in the original comment.
unmount
make an xfs_metadump image of the device
xfs_mdrestore back to the device
mount the device with your mount options
do the "> testfile" test
does that reproduce it?  if so, please provide the xfs_metadump image.

if you need a hand with these steps, please let me know.
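
The steps above might look roughly like the following. This is a sketch under assumptions, not a tested procedure: the device, mount point and file name are taken from the original report, while the image name sdb1.metadump, the run wrapper and the DRY_RUN switch are made up for illustration. By default the script only prints the commands. xfs_mdrestore overwrites the target device, so it should only be pointed at a scratch filesystem.

```shell
# Sketch of the metadump round trip described above.
DEV=${DEV:-/dev/sdb1}
MNT=${MNT:-/mnt/sdb1}
IMG=${IMG:-sdb1.metadump}                  # assumed image file name
TESTFILE=${TESTFILE:-urlclassifier3.sqlite}
DRY_RUN=${DRY_RUN:-1}                      # 1 = only print the commands

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

metadump_roundtrip() {
    run umount "$MNT"                      # xfs_metadump wants the fs unmounted
    run xfs_metadump "$DEV" "$IMG"         # dump the metadata to an image file
    run xfs_mdrestore "$IMG" "$DEV"        # write the metadata back to the device
    run mount "$DEV" "$MNT" -o rw,noatime,nodiratime
    run sh -c "true > \"$MNT/$TESTFILE\""  # the "> testfile" step
    run sh -c "dmesg | grep -i corrupt"
}

metadump_roundtrip
```

Setting DRY_RUN=0 (as root) would execute the commands instead of printing them.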
Comment 12 Eric Sandeen 2009-01-18 13:32:32 UTC
Another thing: could you attach your mkfs binary?  It may help us reproduce, since you seem to have something ... unique.
Comment 13 Dave Chinner 2009-01-18 19:35:23 UTC
Eric, your test won't reproduce because it doesn't fragment sufficiently.
can I suggest using xfs_io to write each extent in reverse order (i.e.
write the file backwards) synchronously using direct IO to get the file
written out into 20 extents before nulling it....
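
One way to read the suggestion above is sketched here. It is an illustration, not the exact commands anyone in this thread ran: gen_backwards_writes is a hypothetical helper, the target path is assumed, and the chunk size of 786432 bytes (768 KiB, twenty of which come to roughly the size of the reported file) is chosen only because it is aligned for direct I/O. The function prints one xfs_io invocation per chunk, last chunk first, using O_DIRECT (-d) and O_SYNC (-s). Whether each write really ends up in its own extent depends on the allocator.

```shell
# Print xfs_io commands that write FILE in CHUNKS pieces of CHUNK_BYTES
# bytes each, starting with the last piece and ending at offset 0.
gen_backwards_writes() {
    file=$1
    chunks=$2
    chunk_bytes=$3
    i=$((chunks - 1))
    while [ "$i" -ge 0 ]; do
        off=$((i * chunk_bytes))
        echo "xfs_io -f -d -s -c 'pwrite -b $chunk_bytes $off $chunk_bytes' $file"
        i=$((i - 1))
    done
}

# 20 chunks of 768 KiB, highest offset first (path is an assumption).
gen_backwards_writes /mnt/sdb1/bigfile 20 786432
```

The printed commands can be reviewed and then piped to sh on the test machine, followed by "xfs_bmap -v" on the file to check how many extents were actually created.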
Comment 14 Eric Sandeen 2009-01-18 19:37:12 UTC
Dave, according to comment #9, the file on the filesystem in question has a single extent (although the source was fragmented).
Comment 15 Felix Blyakher 2009-01-18 19:52:42 UTC
It doesn't seem that a multi-extent file is the only condition for the
bug. I created a 100M file with 72K+ extents and saw no corruption.

cxfsopus16 6# l /xfstest/file1
-rw-r--r-- 1 root root 104857600 2009-01-17 16:56 /xfstest/file1
cxfsopus16 7# xfs_bmap -v /xfstest/file1 | wc
  72802  436807 5086085
cxfsopus16 8# 
Comment 16 Dave Chinner 2009-01-18 20:11:27 UTC
Out of curiosity, so we can see what btree operations are actually
happening on your machine, can you dump /proc/fs/xfs/stat and post it?

Eric, i can't reproduce it with the xfs_io trick - I've produced
an identical block map and it doesn't corrupt when overwriting it.
Also, "> to_a_file" leaves me with a zero length file with new
extents, not file full of zeroed blocks....
Comment 17 Dave Chinner 2009-01-21 19:06:39 UTC
Bug has been found and fixed. See here:

http://oss.sgi.com/archives/xfs/2009-01/msg00645.html

Should be on its way up to the main tree now.
