Bug 14156 - Running fsstress "fsx" on ext4 filesystem gives dirty metadata buffer warning message
Status: RESOLVED CODE_FIX
Alias: None
Product: File System
Classification: Unclassified
Component: ext4
Hardware: All Linux
Importance: P1 normal
Assignee: Jan Kara
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-09-11 04:59 UTC by muni
Modified: 2010-09-15 13:22 UTC
CC List: 6 users

See Also:
Kernel Version: 2.6.31-rc8
Subsystem:
Regression: No
Bisected commit-id:


Attachments
fsx tarball (100.10 KB, application/x-bzip)
2009-09-16 12:06 UTC, muni
Fix buffer dirty warning messages in data=journal mode (2.77 KB, patch)
2010-06-17 10:12 UTC, Jan Kara

Description muni 2009-09-11 04:59:42 UTC
Here is the warning message from 'dmesg':

Sep 7 12:49:57 mx3755a kernel: JBD: Spotted dirty metadata buffer (dev = sdb5,
blocknr = 41444). There's a risk of filesystem corruption in case of system
crash.
Sep  7 12:49:57 mx3755a kernel: JBD: Spotted dirty metadata buffer (dev = sdb5,
blocknr = 41445). There's a risk of filesystem corruption in case of system
crash.


More information can be found in IBM Bugzilla #55868.
Comment 1 muni 2009-09-16 12:03:48 UTC
The warning mentioned above occurred while running the fsx tests on a 2.6.31-rc8 kernel.

Steps for running fsx test:
1. Create an ext4 partition with blocksize 1024 and mount it with 
"-o errors=panic,data=journal" options
2. Install the autotest framework
3. Run fsx test attached to this bug
4. Check the dmesg.
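
Steps 1 and 4 can be sketched as a small dry-run helper that only builds and prints the commands. The device `/dev/sdX1` and mount point `/mnt/test` are placeholders that do not come from this report, and the helper name is hypothetical. Steps 2 and 3 are left out because the report does not give the exact autotest invocation.

```python
import shlex


def build_repro_commands(device, mountpoint, block_size=1024):
    """Return the commands for steps 1 and 4 as argument lists.

    Nothing is executed here: mkfs.ext4 destroys the data on the target
    device, so the caller decides whether and how to run the commands.
    """
    return [
        # Step 1a: create an ext4 filesystem with a 1024-byte block size.
        ["mkfs.ext4", "-b", str(block_size), device],
        # Step 1b: mount it with the options given in the report.
        ["mount", "-t", "ext4", "-o", "errors=panic,data=journal",
         device, mountpoint],
        # Step 4: dump the kernel log to look for the JBD warning.
        ["dmesg"],
    ]


if __name__ == "__main__":
    # Placeholder device and mount point; replace with a scratch partition.
    for argv in build_repro_commands("/dev/sdX1", "/mnt/test"):
        print(shlex.join(argv))
```

Printing rather than executing keeps the sketch safe to run on any machine.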

You can see several messages like the one below:

kernel: JBD: Spotted dirty metadata buffer (dev = sdb5,
blocknr = 41444). There's a risk of filesystem corruption in case of system
crash.
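
For checking a long dmesg capture, a minimal sketch along the following lines could pull the device and block number out of each warning. The message format is taken from the lines quoted in this report; the function name is hypothetical, and the pattern tolerates the line wrapping seen above.

```python
import re

# Matches the warning as quoted in this report. Whitespace (including a line
# break) is allowed between "dev = ...," and "blocknr = ...", because the
# syslog output above is wrapped at that point.
JBD_WARNING = re.compile(
    r"JBD: Spotted dirty metadata buffer\s*"
    r"\(dev = (?P<dev>\S+),\s*blocknr = (?P<blocknr>\d+)\)"
)


def find_dirty_buffer_warnings(log_text):
    """Return a list of (device, block number) pairs, in log order."""
    return [
        (match.group("dev"), int(match.group("blocknr")))
        for match in JBD_WARNING.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = (
        "kernel: JBD: Spotted dirty metadata buffer (dev = sdb5,\n"
        "blocknr = 41444). There's a risk of filesystem corruption in case "
        "of system\ncrash.\n"
    )
    print(find_dirty_buffer_warnings(sample))
```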

Thanks
Muni
Comment 2 muni 2009-09-16 12:06:00 UTC
Created attachment 23101
fsx tarball
Comment 3 Jan Kara 2009-12-23 16:45:10 UTC
Guys, are you still able to reproduce the problem with 2.6.32 kernel?
Comment 4 Jan Kara 2010-02-23 13:17:53 UTC
I will close this one as NORESPONSE. Please reopen in case you are still able to reproduce...
Comment 5 Eric Sandeen 2010-02-23 14:59:09 UTC
Jan, for what it's worth, running xfstests 076 faithfully reproduces this warning for me (hm, with quotas turned on; I'd need to double-check whether it does so without).

test 076 runs fsstress while doing concurrent reads directly from the block device.

-Eric
Comment 6 Jan Kara 2010-02-24 11:34:49 UTC
OK, reopening. I originally missed the data=journal part of the report. When I tried running the fsx-linux test on my machine, it just hung, so obviously there really are some problems. I'll have a look into that.
Comment 7 Kamalesh Babulal 2010-06-09 09:55:51 UTC
With 2.6.34, after a couple of JBD warnings, the kernel panics:


kernel: EXT4-fs (sdd7): mounted filesystem with journalled data mode
kernel: JBD: Spotted dirty metadata buffer (dev = sdd7, blocknr = 24982). There's a risk of filesystem corruption in case of system crash.
kernel: JBD: Spotted dirty metadata buffer (dev = sdd7, blocknr = 49153). There's a risk of filesystem corruption in case of system crash.
kernel: JBD: Spotted dirty metadata buffer (dev = sdd7, blocknr = 24894). There's a risk of filesystem corruption in case of system crash.
kernel: JBD: Spotted dirty metadata buffer (dev = sdd7, blocknr = 24851). There's a risk of filesystem corruption in case of system crash.
kernel: JBD: Spotted dirty metadata buffer (dev = sdd7, blocknr = 24892). There's a risk of filesystem corruption in case of system crash.
------------[ cut here ]------------
kernel BUG at fs/jbd2/commit.c:951!
invalid opcode: 0000 [#1] SMP 
last sysfs file: /sys/devices/pci0000:00/0000:00:06.0/0000:03:04.0/host3/port-3:1/end_device-3:1/target3:0:1/3:0:1:0/block/sdd/sdd7/alignment
Modules linked in: ipt_MASQUERADE iptable_nat nf_nat bridge stp llc autofs4 sunrpc ip6t_REJECT nf_conntrack_ipv6 ip6table_filter ip6_tables i

Pid: 2907, comm: jbd2/sdd7-8 Not tainted 2.6.34 #1 Server Blade/BladeCenter LS41 -[797252A]-
EIP: 0060:[<fa3d1817>] EFLAGS: 00010202 CPU: 0 
EIP is at jbd2_journal_commit_transaction+0x10c7/0x16c0 [jbd2]
EAX: 00510023 EBX: df42dc1c ECX: 00000000 EDX: df425820
ESI: e881e470 EDI: ec7ed3c0 EBP: ec09a1fc ESP: e35a9ec4
 DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
Process jbd2/sdd7-8 (pid: 2907, ti=e35a8000 task=ec330030 task.ti=e35a8000)
Stack:
 ffffffff 00000000 0b36a000 00000000 00000001 00000000 e35a9ee4 ec7ed3c0
<0> ec09a000 ec7ed3f4 00000000 ec09a014 00000000 ec7ed3f4 ffffffff 0000028c
<0> 00000000 00000008 c16f3a00 ec09a214 e9d73d6c 83a204b2 00000360 00000018
Call Trace:
 [<c04597b7>] ? lock_timer_base+0x27/0x50
 [<fa3d6ab3>] ? kjournald2+0xa3/0x360 [jbd2]
 [<c0466fc0>] ? autoremove_wake_function+0x0/0x40
 [<fa3d6a10>] ? kjournald2+0x0/0x360 [jbd2]
 [<c0466c14>] ? kthread+0x74/0x80
 [<c0466ba0>] ? kthread+0x0/0x80
 [<c040997e>] ? kernel_thread_helper+0x6/0x10
Code: f6 06 20 0f 84 c5 f4 ff ff 8b 86 ac 01 00 00 31 d2 e8 de 26 1d c6 e9 b3 f4 ff ff 8b 44 24 20 89 fa e8 0e 4c 00 00 e9 75 f8 ff ff <0f> 0
EIP: [<fa3d1817>] jbd2_journal_commit_transaction+0x10c7/0x16c0 [jbd2] SS:ESP 0068:e35a9ec4
---[ end trace 3af338f3dcb2d193 ]---
Kernel panic - not syncing: Fatal exception
Pid: 2907, comm: jbd2/sdd7-8 Tainted: G      D    2.6.34 #1
Call Trace:
 [<c07f16b4>] ? panic+0x41/0xb9
 [<c07f51fc>] ? oops_end+0xbc/0xd0
 [<c040a150>] ? do_invalid_op+0x0/0x90
 [<c040a1cf>] ? do_invalid_op+0x7f/0x90
 [<fa3d1817>] ? jbd2_journal_commit_transaction+0x10c7/0x16c0 [jbd2]
 [<c04a4c71>] ? delayacct_end+0xa1/0xc0
 [<c04a4cea>] ? __delayacct_blkio_end+0x2a/0x50
 [<c07f2819>] ? __wait_on_bit+0x59/0x70
 [<c0527000>] ? sync_buffer+0x0/0x40
 [<c07f4617>] ? error_code+0x73/0x78
 [<fa3d1817>] ? jbd2_journal_commit_transaction+0x10c7/0x16c0 [jbd2]
 [<c04597b7>] ? lock_timer_base+0x27/0x50
 [<fa3d6ab3>] ? kjournald2+0xa3/0x360 [jbd2]
 [<c0466fc0>] ? autoremove_wake_function+0x0/0x40
 [<fa3d6a10>] ? kjournald2+0x0/0x360 [jbd2]
 [<c0466c14>] ? kthread+0x74/0x80
 [<c0466ba0>] ? kthread+0x0/0x80
 [<c040997e>] ? kernel_thread_helper+0x6/0x10
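
When comparing oops reports like the one above, it can help to extract the BUG location and the faulting symbol mechanically. The following is a rough sketch that assumes the 32-bit x86 log format shown in this comment (an `EIP is at` line); the function name and dictionary keys are illustrative only.

```python
import re

# "kernel BUG at <file>:<line>!" as printed in the trace above.
BUG_AT = re.compile(r"kernel BUG at (?P<file>\S+):(?P<line>\d+)!")

# "EIP is at <symbol>+0x<offset>/0x<size> [<module>]"; the module part is
# optional because built-in symbols are printed without it.
EIP_AT = re.compile(
    r"EIP is at (?P<symbol>\w+)\+0x(?P<offset>[0-9a-fA-F]+)"
    r"/0x(?P<size>[0-9a-fA-F]+)(?: \[(?P<module>\w+)\])?"
)


def summarize_oops(log_text):
    """Return a dict with whatever BUG/EIP details are found in the log."""
    summary = {}
    bug = BUG_AT.search(log_text)
    if bug:
        summary["bug_file"] = bug.group("file")
        summary["bug_line"] = int(bug.group("line"))
    eip = EIP_AT.search(log_text)
    if eip:
        summary["symbol"] = eip.group("symbol")
        summary["offset"] = int(eip.group("offset"), 16)
        summary["size"] = int(eip.group("size"), 16)
        summary["module"] = eip.group("module")  # None if not a module
    return summary


if __name__ == "__main__":
    log = (
        "kernel BUG at fs/jbd2/commit.c:951!\n"
        "EIP is at jbd2_journal_commit_transaction+0x10c7/0x16c0 [jbd2]\n"
    )
    print(summarize_oops(log))
```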
Comment 8 Jan Kara 2010-06-16 10:44:00 UTC
This is probably the same problem as it seems the BUG comes from:
J_ASSERT_BH(bh, !buffer_dirty(bh));

I have some time now, so I'm looking into why this happens...
Comment 9 Jan Kara 2010-06-16 18:48:11 UTC
Hmm, I think I have found the problem. It's in both ext3 & ext4: block_prepare_write dirties the buffer. So far I have written an ext3 fix; tomorrow I'll port it to ext4 and attach it here for testing.
Comment 10 Jan Kara 2010-06-17 10:12:57 UTC
Created attachment 26823
Fix buffer dirty warning messages in data=journal mode

Kamalesh, this patch fixes the issue for me. Could you please test it? Thanks.
Comment 11 Kamalesh Babulal 2010-06-18 05:50:53 UTC
(In reply to comment #10)
> Created an attachment (id=26823)
> Fix buffer dirty warning messages in data=journal mode
> 
> Kamalesh, this patch fixes the issue for me. Could you please test it?
> Thanks.

Thanks Jan, the patch fixes the issue for me.
Comment 12 Jan Kara 2010-09-15 13:22:57 UTC
Patches were merged upstream. Closing the bug.
