Distribution: SLES 9 SP1 ------------------------ Hardware Environment: SMP, 2 way Xeon TM 2.8GHz, 1.5G RAM Network Interface (Tigon3) Disk I/O SCSI storage controller: IBM server RAID ------------------------ Software Environment: 1.kernel linux-2.6.13-rc3-mm2 ------------------------- Problem Description: While Running File system Race test on file systems ext2,ext3,reiserfs,js and xfs encountered this oops message PREEMPT SMP DEBUG_PAGEALLOC last sysfs file: Modules linked in: CPU: 1 EIP: 0060:[<c024f6f5>] Not tainted VLI EFLAGS: 00010296 (2.6.13-rc3-mm2-I) EIP is at txLock+0x3b5/0x530 eax: d244d088 ebx: 00000000 ecx: d244d02c edx: d483742c esi: 000002d8 edi: f88ed600 ebp: ca40da20 esp: ca40d9b4 ds: 007b es: 007b ss: 0068 Process dir_create.sh (pid: 26440, threadinfo=ca40c000 task=c6926530) Stack: 00000286 000256b4 000256b8 ca40d983 c049fe41 ca40d9e0 00000000 d244d02c 00000001 00000000 00000000 ca40da00 c02416d8 00000000 ca40da20 c018c16d d244d258 d244d074 00000000 00000000 00000001 d244d02c 00000000 f657a800 Call Trace: [<c0103fbb>] show_stack+0xab/0xc0 [<c0104154>] show_registers+0x164/0x1d0 [<c0104392>] die+0x122/0x1c0 [<c04a089c>] do_page_fault+0x3dc/0x61d [<c0103c4b>] error_code+0x4f/0x54 [<c02358b5>] xtInsert+0x365/0x590 [<c024a27a>] extAlloc+0x26a/0x3f0 [<c0230b8a>] jfs_get_blocks+0x21a/0x2c0 [<c0230c66>] jfs_get_block+0x36/0x40 [<c016a658>] nobh_prepare_write+0xe8/0x420 [<c0230d48>] jfs_prepare_write+0x28/0x30 [<c0147119>] generic_file_buffered_write+0x1a9/0x680 [<c014790a>] __generic_file_aio_write_nolock+0x31a/0x580 [<c0147cb0>] __generic_file_write_nolock+0x90/0xb0 [<c0147ec2>] generic_file_write+0x52/0xd0 [<c0166168>] vfs_write+0xb8/0x190 [<c016630b>] sys_write+0x4b/0x80 [<c01030ff>] sysenter_past_esp+0x54/0x75<2>EXT2-fs error (device dm-0): ext2_readdir: bad page in #1014 Code: 89 72 50 b8 78 eb 51 <2>EXT2-fs error (device dm-0): ext2_readdir: bad page in #1015844 c0 e8 99 01 25 00 8b 45 b0 8b 15 70 0a 5f c0 83 c0 5c a3 70 0a 5f c0 8b 4d b0 89 50 04 c7 41 5c 6c 0a 54 05 25 00 e9 5a ff ff ff 8b 4d 10 66
Would it be possible to determine where in txLock the trap occurred? At the least, could you "make fs/jfs/jfs_txnmgr.s" so that I can compare the failing code to the assembler listing? I don't have a xeon system to compile it on myself.
OK, well there's one known bug in the BIO layer which could cause the dm layer to return bad data. That might explain the ext2 warning, although JFS shouldn't be oopsing. I'd suggest that you add ftp://ftp.kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.13-rc3/2.6.13-rc3-mm3/broken-out/bio_clone-fix.patch. That's already in 2.6.13-rc4.
I had captured the dump and here is the preliminary analysis over the dump. Let me know If you need more information Meanwhile I will test on the rc4-mm1 and let you know, if this bug is fixed ----------------------------------- GNU gdb Red Hat Linux (6.3.0.0-0.31rh) Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. This GDB was configured as "i386-redhat-linux-gnu"...Using host libthread_db library "/lib/tls/libthread_db.so.1". ----------------------------- #0 0xc01e3655 in journal_unmap_buffer (journal=0xc5f92200, bh=0xdfd95c9c) at include/asm/bitops.h:143 143 __asm__ __volatile__( LOCK_PREFIX ----------------------------- (gdb) info thread 4 process 1122 submit_bh (rw=1, bh=0xe3bc8754) at fs/buffer.c:2752 3 process 0 0xc01e3655 in journal_unmap_buffer (journal=0xc5f92200, bh=0xdfd95c9c) at include/asm/bitops.h:143 2 process 0 0xc01e3655 in journal_unmap_buffer (journal=0xc5f92200, bh=0xdfd95c9c) at include/asm/bitops.h:143 * 1 process 2284 0xc01e3655 in journal_unmap_buffer (journal=0xc5f92200, bh=0xdfd95c9c) at include/asm/bitops.h:143 ----------------------------- (gdb) bt #0 0xc01e3655 in journal_unmap_buffer (journal=0xc5f92200, bh=0xdfd95c9c) at include/asm/bitops.h:143 #1 0xc01e3909 in journal_invalidatepage (journal=0xc5f92200, page=0xc52385a0, offset=0) at fs/jbd/transaction.c:1906 #2 0xc0155b16 in do_invalidatepage (page=0xc5e98030, offset=3755564188) at mm/truncate.c:25 #3 0xc0155b60 in truncate_complete_page (mapping=0xc5e98030, page=0xc52385a0) at mm/truncate.c:52 #4 0xc0155d25 in truncate_inode_pages_range (mapping=0xf5ed7850, lstart=warning: Unhandled dwarf expresion opcode DW_OP_piece 3755564188, lend=warning: Unhandled dwarf expresion opcode DW_OP_piece -987527776) at mm/truncate.c:162 #5 0xc0155ee7 in truncate_inode_pages (mapping=0xc5e98030, lstart=warning: Unhandled dwarf expresion opcode DW_OP_piece 3755564188) at mm/truncate.c:219 #6 0xc015b405 in vmtruncate (inode=0xf5ed7780, offset=warning: Unhandled dwarf expresion opcode DW_OP_piece 444096111) at mm/memory.c:1587 #7 0xc0186da8 in inode_setattr (inode=0xf5ed7780, attr=0xee70cf4c) at fs/attr.c:74 #8 0xc01d5cf9 in ext3_setattr (dentry=0xc5e98030, attr=0xee70cf4c) at fs/ext3/inode.c:2807 #9 0xc0187038 in notify_change (dentry=0xf6e88414, attr=0xee70cf4c) at fs/attr.c:160 #10 0xc0168bd3 in do_truncate (dentry=0xf6e88414, length=warning: Unhandled dwarf expresion opcode DW_OP_piece 3755564188) at fs/open.c:210 #11 0xc016920b in sys_ftruncate64 (fd=3320414256, length=warning: Unhandled dwarf expresion opcode DW_OP_piece 444096111) at fs/open.c:317 #12 0xc01040bf in sysenter_past_esp () at include/asm/current.h:9 #13 0x00000005 in ?? () #14 0x1a785e6f in ?? () #15 0x00000000 in ?? () ----------------------------- (gdb) list journal_unmap_buffer 1734 * 1735 * We're outside-transaction here. Either or both of j_running_transaction 1736 * and j_committing_transaction may be NULL. 1737 */ 1738 static int journal_unmap_buffer(journal_t *journal, struct buffer_head *bh) 1739 { 1740 transaction_t *transaction; 1741 struct journal_head *jh; 1742 int may_free = 1; 1743 int ret; -----------------------------
The first oops reported looks a lot like bug #4987. Both appear to be trapping on a store of a 16-bit value. Both show the current instruction without the 66 prefix, making it look like a 32-bit store. (Does the processor increment the IP past the prefix?) I don't why we're seeing an oops. In both cases the register being dereferenced has already been dereferenced. In neither case should the structure be freed by another thread. Both bug reports are missing the beginning part of the oops text. The second oops is in ext3, and I haven't looked too closely at it.
I am still not able to replicate the bug in 2.6.13-rc4 kernel, will update you on encountering the problem
Sooo... where do we stand on this bug now?
This is still under investigation. The problem doesn't appear in 2.6.13-rc4, but a similar oops happens in 2.6.13-rc4-mm1 (bug #4987).
So, I took a quick look at this, compiled 2.6.13-rc3-mm2 on a P4 Xeon box -- though I don't know exactly what options were in the config so my version could be slightly different from the version Sharyathi compiled. I haven't reproduced the problem myself so hopefully this isn't all totally bogus analysis ;-) It looks like we could be oopsing in this piece of code: /* initialize type dependent area for linelock */ linelock = (struct linelock *) & tlck->lock; linelock->next = 0; linelock->flag = tlckLINELOCK; linelock->maxcnt = TLOCKSHORT; linelock->index = 0; from jfs_txnmgr.c: lines 765-770 Here's output of objdump -ld on jfs_txnmgr.o looking at roughly the area where the oops occurs: fs/jfs/jfs_txnmgr.c:747 1180: 8b 5c 24 10 mov 0x10(%esp),%ebx 1184: 66 89 73 1a mov %si,0x1a(%ebx) fs/jfs/jfs_txnmgr.c:766 1188: 8d 55 10 lea 0x10(%ebp),%edx fs/jfs/jfs_txnmgr.c:767 118b: 66 c7 45 10 00 00 movw $0x0,0x10(%ebp) fs/jfs/jfs_txnmgr.c:768 1191: 66 c7 42 04 00 20 movw $0x2000,0x4(%edx) fs/jfs/jfs_txnmgr.c:769 1197: c6 42 02 14 movb $0x14,0x2(%edx) fs/jfs/jfs_txnmgr.c:770 119b: c6 42 03 00 movb $0x0,0x3(%edx) To me it looked like the oopsing instruction was at 0x1191 where we first dereference %edx. I'm confused why the instruction before that dereferences %ebp (instead of %edx) for line 767. Basically, this would mean that the linelock pointer is either somehow invalid, which seems to be unlikely, or possibly incorrectly aligned? You're playing some interesting games with the whole overlay thing, maybe a union would be more appropriate there? I'll try and reproduce on a my xeon machine...
I think the code looks closer to lines 757-761: 757 jfs_ip->atltail = lid; 758 TXN_LOCK(); 759 list_add_tail(&jfs_ip->anon_inode_list, 760 &TxAnchor.anon_list); 761 TXN_UNLOCK(); fs/jfs/jfs_txnmgr.c:757 12eb: 8b 45 a0 mov 0xffffffa0(%ebp),%eax 12ee: 66 89 43 50 mov %ax,0x50(%ebx) fs/jfs/jfs_txnmgr.c:758 12f2: b8 1c 00 00 00 mov $0x1c,%eax 12f7: e8 fc ff ff ff call 12f8 <txLock+0x5f8> include/linux/list.h:54 12fc: c7 43 5c 6c 00 00 00 movl $0x6c,0x5c(%ebx) include/linux/list.h:52 1303: 8b 15 70 00 00 00 mov 0x70,%edx include/linux/list.h:81 1309: 89 d8 mov %ebx,%eax 130b: 83 c0 5c add $0x5c,%eax include/linux/list.h:53 130e: a3 70 00 00 00 mov %eax,0x70 include/linux/list.h:56 1313: 89 02 mov %eax,(%edx) include/linux/list.h:55 1315: 89 50 04 mov %edx,0x4(%eax) fs/jfs/jfs_txnmgr.c:761 1318: b8 1c 00 00 00 mov $0x1c,%eax 131d: e8 fc ff ff ff call 131e <txLock+0x61e> 1322: e9 d2 fe ff ff jmp 11f9 <txLock+0x4f9> It's hard to tell for sure, since it compiled differently for each of us. The actual instruction it trapped on is: 89 72 50 mov %esi,0x50(%edx) but I believe the real instruction is 66 89 72 50 mov %si,0x50(%edx) This would be consistent with what I observed in bug #4987, although I have no idea how we would have jumped into the middle of an instruction.
Okay, I completely misunderstood the end of the oops. I thought the Code dump started at EIP, but it starts 43 bytes before EIP. ksymoops figures it out, but since the oops output was mangled, I didn't catch it. I finally figured out what was happening and this is the same as bug #4987. A patch can be found there. *** This bug has been marked as a duplicate of 4987 ***