Bug 8414
| Summary: | soft lockup and filesystem corruption on XFS write (by nfsd) | | |
|---|---|---|---|
| Product: | File System | Reporter: | Pallai Roland (dap) |
| Component: | XFS | Assignee: | XFS Guru (xfs-masters) |
| Status: | REJECTED INSUFFICIENT_DATA | | |
| Severity: | high | CC: | bunk, protasnb, zhseal0 |
| Priority: | P2 | | |
| Hardware: | i386 | | |
| OS: | Linux | | |
| Kernel Version: | 2.6.21.1, 2.6.20.3 | Subsystem: | |
| Regression: | Yes | Bisected commit-id: | |
| Attachments: | my .config; superblock; superblock by xfs_db | | |
Description
Pallai Roland
2007-05-01 04:21:48 UTC
Created attachment 11361 [details]
my .config
I must correct myself, and I have some more info on this issue. First, I did not upgrade from 2.6.20.3, I upgraded from 2.6.15.7. I have 3 NFS servers; 2 of them have been running 2.6.20.3 for a long time without problems. Yesterday I downgraded the third box (called Sulaco) from 2.6.21.1 to 2.6.20.3 and believed the reported problem was gone, but I was wrong. The same error occurred with 2.6.20.3 on Sulaco after 1 day! This time I got a kernel message that points to filesystem corruption (I think) just before the lockup:

May 1 23:10:45 Sulaco kernel: Filesystem "dm-6": Access to block zero in inode 2428580890 start_block: 0 start_off: 0 blkcnt: 0 extent-state: 0 lastx: f0
May 1 23:10:45 Sulaco kernel: Filesystem "dm-6": Access to block zero in inode 2428580890 start_block: 0 start_off: 0 blkcnt: 0 extent-state: 0 lastx: fd
May 1 23:10:45 Sulaco kernel: Filesystem "dm-6": Access to block zero in inode 2428580890 start_block: 0 start_off: 0 blkcnt: 0 extent-state: 0 lastx: fe
May 1 23:10:45 Sulaco kernel: Filesystem "dm-6": Access to block zero in inode 2428580890 start_block: 0 start_off: 0 blkcnt: 0 extent-state: 0 lastx: fe
May 1 23:10:45 Sulaco kernel: Filesystem "dm-6": Access to block zero in inode 2428580890 start_block: 0 start_off: 0 blkcnt: 0 extent-state: 0 lastx: fe

and then:

May 1 23:10:55 Sulaco kernel: Call Trace:
May 1 23:10:55 Sulaco kernel:  <IRQ> [softlockup_tick+233/272] softlockup_tick+0xe9/0x110
May 1 23:10:55 Sulaco kernel:  [update_process_times+80/144] update_process_times+0x50/0x90
May 1 23:10:55 Sulaco kernel:  [smp_local_timer_interrupt+52/96] smp_local_timer_interrupt+0x34/0x60
May 1 23:10:55 Sulaco kernel:  [smp_apic_timer_interrupt+89/128] smp_apic_timer_interrupt+0x59/0x80
May 1 23:10:55 Sulaco kernel:  [apic_timer_interrupt+102/112] apic_timer_interrupt+0x66/0x70
May 1 23:10:55 Sulaco kernel:  <EOI> [_end+127398072/2129470264] :xfs:xfs_iext_get_ext+0x0/0x70
May 1 23:10:55 Sulaco kernel:  [_end+127274467/2129470264] :xfs:xfs_bmapi+0xdeb/0x1030
May 1 23:10:55 Sulaco kernel:  [_end+127551052/2129470264] :xfs:xfs_zero_eof+0xe4/0x190
May 1 23:10:55 Sulaco kernel:  [_end+127552484/2129470264] :xfs:xfs_write+0x4ec/0x9e0
May 1 23:10:55 Sulaco kernel:  [wake_idle+26/176] wake_idle+0x1a/0xb0
May 1 23:10:55 Sulaco kernel:  [activate_task+145/240] activate_task+0x91/0xf0
May 1 23:10:55 Sulaco kernel:  [__up_read+117/176] __up_read+0x75/0xb0
May 1 23:10:55 Sulaco kernel:  [_end+127471524/2129470264] :xfs:xfs_trans_unlocked_item+0x2c/0x60
May 1 23:10:55 Sulaco kernel:  [_end+127534520/2129470264] :xfs:xfs_file_aio_write+0x0/0x70
May 1 23:10:55 Sulaco kernel:  [_end+127534617/2129470264] :xfs:xfs_file_aio_write+0x61/0x70
May 1 23:10:55 Sulaco kernel:  [do_sync_readv_writev+195/272] do_sync_readv_writev+0xc3/0x110
May 1 23:10:55 Sulaco kernel:  [_end+128189601/2129470264] :exportfs:find_exported_dentry+0x89/0x500
May 1 23:10:55 Sulaco kernel:  [dev_queue_xmit+585/608] dev_queue_xmit+0x249/0x260
May 1 23:10:55 Sulaco kernel:  [autoremove_wake_function+0/48] autoremove_wake_function+0x0/0x30
May 1 23:10:55 Sulaco kernel:  [sunrpc_cache_lookup+126/336] sunrpc_cache_lookup+0x7e/0x150
May 1 23:10:55 Sulaco kernel:  [rw_copy_check_uvector+112/256] rw_copy_check_uvector+0x70/0x100
May 1 23:10:55 Sulaco kernel:  [do_readv_writev+229/464] do_readv_writev+0xe5/0x1d0
May 1 23:10:55 Sulaco kernel:  [_end+127533722/2129470264] :xfs:xfs_fs_decode_fh+0xe2/0xf0
May 1 23:10:55 Sulaco kernel:  [_end+128239116/2129470264] :nfsd:nfsd_setuser+0x154/0x1c0
May 1 23:10:55 Sulaco kernel:  [_end+128213299/2129470264] :nfsd:fh_verify+0x50b/0x530
May 1 23:10:55 Sulaco kernel:  [_end+128220538/2129470264] :nfsd:nfsd_vfs_write+0xe2/0x2d0

When I rebooted the machine, I got a corrupted filesystem! :(

attempt to access beyond end of device
dm-6: rw=1, want=0, limit=2097152000
Buffer I/O error on device dm-6, logical block 18446744073709551615
lost page write due to I/O error on dm-6
XFS internal error XFS_WANT_CORRUPTED_GOTO at line 1590 of file fs/xfs/xfs_alloc.c.  Caller 0xffffffff8810b347
Call Trace:
 [<ffffffff8810a457>] :xfs:xfs_free_ag_extent+0x4b7/0x690
 [<ffffffff8810b347>] :xfs:xfs_free_extent+0xa7/0xd0
 [<ffffffff8811a4bb>] :xfs:xfs_bmap_finish+0xeb/0x170
 [<ffffffff88137e8d>] :xfs:xfs_itruncate_finish+0x1ad/0x2d0
 [<ffffffff8815245e>] :xfs:xfs_inactive+0x24e/0x480
 [<ffffffff88160695>] :xfs:xfs_fs_clear_inode+0xa5/0xf0
 [<ffffffff80291f9a>] clear_inode+0xca/0x130
 [<ffffffff80292f7f>] generic_delete_inode+0xdf/0x140
 [<ffffffff802931cd>] iput+0x6d/0x70
 [<ffffffff8028fc4f>] prune_one_dentry+0x6f/0xb0
 [<ffffffff8028fdd4>] prune_dcache+0x144/0x1b0
 [<ffffffff802903e9>] shrink_dcache_memory+0x19/0x50
 [<ffffffff8025f6a7>] shrink_slab+0x107/0x180
 [<ffffffff80260c72>] balance_pgdat+0x242/0x380
 [<ffffffff80243680>] keventd_create_kthread+0x0/0x90
 [<ffffffff80243680>] keventd_create_kthread+0x0/0x90
 [<ffffffff80260ecd>] kswapd+0x11d/0x120
 [<ffffffff80243c10>] autoremove_wake_function+0x0/0x30
 [<ffffffff80243c10>] autoremove_wake_function+0x0/0x30
 [<ffffffff80260db0>] kswapd+0x0/0x120
 [<ffffffff80260db0>] kswapd+0x0/0x120
 [<ffffffff80243642>] kthread+0x92/0xd0
 [<ffffffff8020a5b8>] child_rip+0xa/0x12
 [<ffffffff80243680>] keventd_create_kthread+0x0/0x90
 [<ffffffff802435b0>] kthread+0x0/0xd0
 [<ffffffff8020a5ae>] child_rip+0x0/0x12
xfs_force_shutdown(dm-6,0x8) called from line 4272 of file fs/xfs/xfs_bmap.c.  Return address = 0xffffffff88161368
Filesystem "dm-6": Corruption of in-memory data detected.  Shutting down filesystem: dm-6
Please umount the filesystem, and rectify the problem(s)

What is going on here? I consider 2.6.20.3 stable for this job, as I have used it on the other 2 NFS server boxes for a long time and their setups are very similar. Maybe a hardware problem? Memory failure?? But then why is the call trace so similar?

It's not my lucky day: I could not mount the corrupted fs:

Starting XFS recovery on filesystem: dm-6 (logdev: internal)
XFS: failed to read root inode

I ran xfs_repair and, in practice, I lost the partition with 1Tb of data. Bleh. Now I have switched every partition on this box to read-only, but I really want to know what is going on here. TIA.

The original problem - the soft lockup - is probably a result
of a massively fragmented file as we are searching the extent
list when the soft-lockup detector fired. The soft lockup
detector is not indicative of an actual bug being present, though.
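For example, to get a rough idea of how fragmented a suspect file or the whole filesystem is, something like this should do (the file path is only an example; both commands read the filesystem without modifying it):
# xfs_bmap -v /path/to/suspect/file | wc -l   # roughly the number of extents in the file
# xfs_db -r -c frag <dev>                     # overall fragmentation factor of the filesystem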
The second problem you report (access to block zero) indicates
something did go wrong and there is on-disk corruption of an
extent tree - there are zeros instead of real block numbers.
You need to run repair to fix this.
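If you want to see what repair would change before letting it loose, xfs_repair has a no-modify mode; for example, with the filesystem unmounted (<dev> as elsewhere in this report):
# xfs_repair -n <dev>   # -n: check only, report problems, modify nothing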
> attempt to access beyond end of device
> dm-6: rw=1, want=0, limit=2097152000
> Buffer I/O error on device dm-6, logical block 18446744073709551615
> lost page write due to I/O error on dm-6
Shows an attempt to write to a block marked as either a hole or delayed
allocate. You're not having memory errors, are you?
Now, the mount error:
        error = xfs_iget(mp, NULL, sbp->sb_rootino, 0, XFS_ILOCK_EXCL,
                         &rip, 0);
        if (error) {
                cmn_err(CE_WARN, "XFS: failed to read root inode");
                goto error3;
        }
Means the root inode in the superblock is wrong. Something went really
wrong here. Can you run:
# xfs_db -r -c "sb 0" -c p <dev>
and
# dd if=<dev> bs=512 count=1 | od -Ax
And attach the output so we can see how badly corrupted the superblock
is?
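If it is easier, the same xfs_db invocation can be narrowed down to print just the root inode field the mount path is choking on, for example:
# xfs_db -r -c "sb 0" -c "p rootino" <dev>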
And finally, repair. What problem did you encounter? Did you just end
up with everything in lost+found?
Hmmm - this reminds me of problems seen when the filesystem wraps the
device at 2TB. How big is the LVM volume this partition is on and where is
it placed? Along the same train of thought, I'm wondering
if the device wasn't reconstructed correctly with the new kernel and that
led to the problem. Were there any other error messages or warnings in the
syslog from the boot when the problem first happened?
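For example, something like this would show whether MD, DM or XFS complained around that boot (the log path is just an example and depends on your syslog setup):
# grep -iE 'md0|raid|dm-6|xfs' /var/log/syslog* /var/log/kern.log*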
Thank you for the reply!

> The original problem - the soft lockup - is probably a result
> of a massively fragmented file as we are searching the extent
> list when the soft-lockup detector fired. The soft lockup
> detector is not indicative of an actual bug being present, though.

I waited about 10 minutes after the soft lockup, but one of the nfsd processes consumed 99% CPU for the whole time until I rebooted into 2.6.20.3. nfsd did not reply to any request after the soft lockup. I think it was a real infinite loop/deadlock/I dunno.

>> attempt to access beyond end of device
>> dm-6: rw=1, want=0, limit=2097152000
>> Buffer I/O error on device dm-6, logical block 18446744073709551615
>> lost page write due to I/O error on dm-6
> Shows an attempt to write to a block marked as either a hole or delayed
> allocate. You're not having memory errors are you?

I ran IO stress test scripts today, but there is no sign of memory/fs corruption or instability. I would like to run memtest86+, but as I said this is a production server and there are still 4Tb of data on the remaining partitions, served read-only, so I cannot easily shut this box down for a long memtest run. Anyway, I have no reason to suspect bad memory (except that I just lost 1Tb of data :)

> Can you run:
> # xfs_db -r -c "sb 0" -c p <dev>
> and
> # dd if=<dev> bs=512 count=1 | od -Ax
> And attach the output so we can see how badly corrupted the superblock
> is?

I know, I made a mistake when I ran 'xfs_repair -L' without thinking. A lot of valuable debug info has been lost. :( The superblock dump is attached, but it's the corrected one, not the corrupted one.

> And finally, repair. What problem did you encounter? Did you just end
> up with everything in lost+found?

Yes, and I also got empty directories that were not on the original partition, I'm sure. But 'du -hs lost+found' says everything is there.

> Hmmm - this reminds me of problems seen when the filesystem wraps the
> device at 2TB. How big is the LVM volume this partition is on and where is
> it placed?

Sulaco:~# vgdisplay -v vg1
[...]
  Metadata Areas        1
  Metadata Sequence No  5
[...]
  Total PE              89426
  Alloc PE / Size       89426 / 5.46 TB
  Free  PE / Size       0 / 0

Sulaco:~# lvdisplay -m /dev/vg1/wlv
[...]
  LV Size               1000.00 GB
  Current LE            16000
  Segments              1
[...]
  --- Segments ---
  Logical extent 0 to 15999:
    Type                linear
    Physical volume     /dev/md0
    Physical extents    32000 to 47999

Sulaco:~# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid6] [raid5] [raid4]
md0 : active raid5 sda1[0] sdg1[15] sdh1[14] sdf1[13] sde1[12] sdp1[11] sdo1[10] sdn1[9] sdm1[8] sdl1[7] sdk1[6] sdj1[5] sdi1[4] sdd1[3] sdc1[2] sdb1[1]
      5860631040 blocks level 5, 128k chunk, algorithm 2 [16/16] [UUUUUUUUUUUUUUUU]
      [=============>.......]  resync = 66.2% (258980992/390708736) finish=1451.1min speed=1511K/sec

It's placed in the middle of the VG and is only 1Tb. There's a 2.5Tb partition on this box too and there's no problem with it; I checked all volumes with xfs_check today.

> Along the same train of thought, I'm wondering if the device wasn't
> reconstructed correctly with the new kernel and that led to the problem.
> Were there any other error messages or warnings in the syslog from the
> boot when the problem first happened?

No, nothing. The MD was complete and worked for 1 day in the first case, and there was no rebuild after the reboot.

I have been running a stress test like this over NFS since this morning:

while :; do
  for i in `seq 1 100`; do
    rm $i
    dd if=/dev/zero of=$i bs=32k count=800 &
  done
  for i in `seq 1 100`; do
    if ! md5sum $i | grep -q <md5>; then   # has not happened yet
      echo FATAL; exit 1
    fi
  done
done

No errors yet. It's not "that" partition, but it has the same size and file count. After I have mined lost+found for valuable data, I'll start this test on that partition too.

I don't know what the problem is, but the similar call traces on soft lockup are very suspicious to me. I have never seen such a message on my servers before, and generally there is no mass fragmentation (checked with xfs_bmap) and the load is not too high.

Created attachment 11374 [details]
superblock
Created attachment 11375 [details]
superblock by xfs_db
Any update on the problem? Thanks.

I have been using 2.6.20 since this issue; I won't use .21 on this box again. I'll jump to .22 this week.

Pallai, did you have a chance to try the latest kernel?

Since there has been no update on this problem for a while now, I'm closing it. Please reopen if you get back to it.