Distribution: Debian sid
Hardware Environment: qemu x86
Software Environment: Minimal Debian sid (unstable)
Problem Description: On mounting the attached (intentionally corrupted) filesystem, I get the following BUG:

----------
XFS mounting filesystem hdb
Starting XFS recovery on filesystem: hdb (logdev: internal)
BUG: unable to handle kernel NULL pointer dereference at 00000000
IP: [<c04109d1>] xlog_recover_reorder_trans+0x18/0x7d
*pde = 00000000
Oops: 0000 [#1]

Pid: 642, comm: mount Not tainted (2.6.25.4 #3)
EIP: 0060:[<c04109d1>] EFLAGS: 00000282 CPU: 0
EIP is at xlog_recover_reorder_trans+0x18/0x7d
EAX: c4c7a240 EBX: 00000000 ECX: 00000001 EDX: 00000000
ESI: 00000000 EDI: c4c7a264 EBP: c4cc1b78 ESP: c4cc1b64
 DS: 007b ES: 007b FS: 0000 GS: 0033 SS: 0068
Process mount (pid: 642, ti=c4cc0000 task=c78d8ea0 task.ti=c4cc0000)
Stack: 00000000 000002d0 00000000 c4c7a240 00000001 c4cc1ba4 c0412c05 c0424c59
       00000000 01410f56 c4c7a240 c7aad600 c4cc1ca0 00000000 c4c7a240 c7aad600
       c4cc1bb8 c0412d44 00000000 c4c7a240 c4cc1ca0 c4cc1bf0 c0412ea3 00000001
Call Trace:
 [<c0412c05>] ? xlog_recover_do_trans+0x18/0x12b
 [<c0424c59>] ? kmem_free+0x37/0x47
 [<c0412d44>] ? xlog_recover_commit_trans+0x2c/0x40
 [<c0412ea3>] ? xlog_recover_process_data+0x14b/0x21d
 [<c0414052>] ? xlog_do_recovery_pass+0x955/0x9f8
 [<c0228baa>] ? autoremove_wake_function+0x17/0x3a
 [<c021186f>] ? __wake_up+0x3a/0x42
 [<c041413c>] ? xlog_do_log_recovery+0x47/0x94
 [<c04141a6>] ? xlog_do_recover+0x1d/0x109
 [<c04156a5>] ? xlog_recover+0x8c/0xa7
 [<c040f83b>] ? xfs_log_mount+0x9a/0x14b
 [<c0417726>] ? xfs_mountfs+0x2b4/0x68d
 [<c044f34c>] ? _atomic_dec_and_lock+0x10/0x34
 [<c045895d>] ? __spin_lock_init+0x2c/0x4f
 [<c041e83f>] ? xfs_mount+0x2f9/0x33a
 [<c02805dc>] ? set_blocksize+0x62/0xb2
 [<c042e0ed>] ? xfs_fs_fill_super+0xb3/0x1f2
 [<c025fa7f>] ? get_sb_bdev+0x108/0x139
 [<c02727dd>] ? mntput_no_expire+0x16/0x67
 [<c053d089>] ? _spin_lock+0x32/0x38
 [<c042cf82>] ? xfs_fs_get_sb+0x21/0x27
 [<c042e03a>] ? xfs_fs_fill_super+0x0/0x1f2
 [<c025eab7>] ? vfs_kern_mount+0x3a/0x8b
 [<c025eb52>] ? do_kern_mount+0x33/0xbd
 [<c027296e>] ? do_new_mount+0x59/0x77
 [<c02737c7>] ? do_mount+0x185/0x1b0
 [<c0244727>] ? __get_free_pages+0x29/0x62
 [<c0271f95>] ? copy_mount_options+0x2e/0x11e
 [<c027386d>] ? sys_mount+0x7b/0xae
 [<c0202cd2>] ? syscall_call+0x7/0xb
 =======================
Code: 00 83 c4 08 5b 5d c3 8b 01 89 03 31 c0 83 c4 08 5b 5d c3 55 89 e5 57 56 53 83 ec 08 8b 70 24 c7 40 24 00 00 00 00 8d 78 24 89 f2 <8b> 1a 8b 42 14 8b 08 0f b7 01 66 2d 36 12 66 83 f8 08 76 21 c7
EIP: [<c04109d1>] xlog_recover_reorder_trans+0x18/0x7d SS:ESP 0068:c4cc1b64
---[ end trace 48466195ed42b2a7 ]---
----------

Steps to reproduce:
1. gunzip the attached filesystem image
2. mount it as XFS
Created attachment 16416 [details] Test case, corrupted XFS filesystem, gzip compressed
Verified
Created attachment 17981 [details] Validate the transaction header in new items Can you check that this patch fixes the reported problem? It catches the corrupt transaction when it is initially decoded and returns an error at that point. The filesystem will refuse to mount with an EIO error and log a warning explaining the reason.
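For readers without the attachment at hand, here is a rough user-space sketch of the kind of check described above. It is not the attached patch. All names (`sketch_trans`, `sketch_item`, `sketch_add_first_region`) are hypothetical, and the magic value is an assumption standing in for the on-disk transaction header magic. The idea is only that the first region of a recovered transaction is validated before anything is queued, so later code never walks an empty or bogus item list.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed magic ("TRAN"); a stand-in, not taken from the patch. */
#define SKETCH_TRANS_HEADER_MAGIC 0x5452414eU

/* Hypothetical, simplified recovery structures. */
struct sketch_item {
    struct sketch_item *next;
    const void *data;
    size_t len;
};

struct sketch_trans {
    struct sketch_item *itemq;   /* NULL until the first region is queued */
};

/*
 * Queue a decoded region on the transaction. If this is the first
 * region, it must start with the transaction header magic; otherwise
 * reject it with -EIO so the caller can abort recovery (and the mount)
 * instead of dereferencing a NULL item list later.
 */
static int sketch_add_first_region(struct sketch_trans *trans,
                                   struct sketch_item *item,
                                   const void *dp, size_t len)
{
    if (trans->itemq == NULL) {
        uint32_t magic;

        if (len < sizeof(magic)) {
            fprintf(stderr, "sketch: transaction header too short\n");
            return -EIO;
        }
        memcpy(&magic, dp, sizeof(magic));
        if (magic != SKETCH_TRANS_HEADER_MAGIC) {
            fprintf(stderr, "sketch: bad transaction header magic\n");
            return -EIO;
        }
    }

    item->data = dp;
    item->len = len;
    item->next = trans->itemq;
    trans->itemq = item;
    return 0;
}
```

With a check of this shape, a corrupt first region leaves `itemq` untouched and the error propagates up the recovery path, which matches the EIO-on-mount behaviour described in the comment.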
I have the same problem. Dave, your patch does not fix the problem.
I've verified that Dave's check does indeed catch the problem in the attached image. Vadim, if you have a problem that is not fixed by that check, it's probably similar but not the same; please open a different bug report for it.
Christoph, almost a year has passed since I hit the issue. I've stopped using XFS and moved to another filesystem (and different hardware as well). There is no way for me to reproduce the problem and file a bug now, sorry.