Bug 20162 - [LogFS][2.6.36.rc7+] Kernel BUG at readwrite.c:1193
Status: CLOSED CODE_FIX
Product: File System
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 normal
Assigned To: fs_other
Depends on:
Blocks: 16444
Reported: 2010-10-12 17:00 UTC by Maciej Rutecki
Modified: 2010-12-24 13:43 UTC (History)

See Also:
Kernel Version: 2.6.36-rc7
Tree: Mainline
Regression: Yes


Description Maciej Rutecki 2010-10-12 17:00:45 UTC
Subject    : [LogFS][2.6.36.rc7+] Kernel BUG at readwrite.c:1193
Submitter  : Prasad Joshi <prasadjoshi124@gmail.com>
Date       : 2010-10-10 17:44
Message-ID : AANLkTi=JkcuWBPo+X-i+9o-BJFVqjea1J3e=Mr=HvAWF@mail.gmail.com
References : http://marc.info/?l=linux-kernel&m=128673196203340&w=2

This entry is being used for tracking a regression from 2.6.35.  Please don't
close it until the problem is fixed in the mainline.
Comment 1 Prasad Gajanan Joshi 2010-10-12 17:54:46 UTC
Pasting the stack dump, as it was not included in the bug report:

<4>[ 4452.145753]  [<e099e5b1>] ? logfs_write_i0+0x91/0x170 [logfs]
<4>[ 4452.145753]  [<e099fb66>] ? __logfs_write_rec+0x176/0x2b0 [logfs]
<4>[ 4452.145753]  [<e099fad8>] ? __logfs_write_rec+0xe8/0x2b0 [logfs]
<4>[ 4452.145753]  [<e099fd0b>] ? logfs_write_rec+0x6b/0xd0 [logfs]
<4>[ 4452.145753]  [<e09a0811>] ? __logfs_delete+0x41/0x70 [logfs]
<4>[ 4452.145753]  [<e09a0920>] ? logfs_delete_inode+0xe0/0x140 [logfs]
<4>[ 4452.145753]  [<c0203a0f>] ? generic_delete_inode+0x6f/0x100
<4>[ 4452.145753]  [<c020427d>] ? generic_drop_inode+0x3d/0x60
<4>[ 4452.145753]  [<e099afab>] ? logfs_drop_inode+0x6b/0x80 [logfs]
<4>[ 4452.145753]  [<c0203091>] ? iput+0x41/0x60
<4>[ 4452.145753]  [<e0999685>] ? __logfs_create+0xd5/0x1b0 [logfs]
<4>[ 4452.145753]  [<c0203b11>] ? __insert_inode_hash+0x31/0x60
<4>[ 4452.145753]  [<e0999885>] ? logfs_create+0x55/0x60 [logfs]
<4>[ 4452.145753]  [<c01f9bfa>] ? vfs_create+0x9a/0xc0
<4>[ 4452.145753]  [<c01fac07>] ? do_last+0x557/0x5d0
<4>[ 4452.145753]  [<c01fc742>] ? do_filp_open+0x182/0x4d0
<4>[ 4452.145753]  [<c01f044c>] ? do_sync_write+0x9c/0xd0
<4>[ 4452.145753]  [<c01f9335>] ? getname+0x25/0xf0
<4>[ 4452.145753]  [<c01ee9af>] ? do_sys_open+0x4f/0x110
<4>[ 4452.145753]  [<c01f2005>] ? fput+0x15/0x20
<4>[ 4452.145753]  [<c01eead9>] ? sys_open+0x29/0x40
<4>[ 4452.145753]  [<c0102fd7>] ? sysenter_do_call+0x12/0x22

Summary
=======
This happens when __logfs_create() tries to write a new inode to disk while the disk is full, so write_inode() returns an error. On error, __logfs_create() aborts, frees the transaction pointer, and calls iput() on the allocated inode. That results in a call to delete_inode(), but the transaction pointer used during delete_inode() is no longer valid.

Details
=======
__logfs_create() associates the transaction pointer with the inode. During the logfs_write_inode() call chain, this transaction pointer is moved from the inode to page->private by move_inode_to_page() (do_write_inode() -> inode_to_page() -> move_inode_to_page()).

As noted above, when write_inode() returns an error, the transaction is aborted and the transaction pointer is freed. page->private is not updated, however, so it is left holding a pointer to the freed transaction.

During delete_inode(), that same transaction pointer, still associated with the page, gets used.

I modified do_write_inode() to check for an error. If the disk write returns an error, the pointer held in page->private is moved back to the logfs_inode. This way page->private would not be left as a dangling pointer.

[prasad@localhost logfs]$ git diff *
diff --git a/fs/logfs/readwrite.c b/fs/logfs/readwrite.c
index 6127baf..ee99a9f 100644
--- a/fs/logfs/readwrite.c
+++ b/fs/logfs/readwrite.c
@@ -1994,6 +1994,9 @@ static int do_write_inode(struct inode *inode)
 
        /* FIXME: transaction is part of logfs_block now.  Is that enough? */
        err = logfs_write_buf(master_inode, page, 0);
+       if (err)
+               move_page_to_inode(inode, page);
+
        logfs_put_write_page(page);
        return err;
 }

This worked for me; I no longer see the bug.

This bug is reproducible in 2.6.35 as well as in the latest kernel; the changes above were made against the latest kernel.
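
The failure and the fix described in this comment can be modeled in userspace. The following is only a sketch under stated assumptions: every struct and function name prefixed with model_ is hypothetical and is not the LogFS code, the disk write is replaced by a flag, and the abort path is reduced to freeing the transaction. It shows only the pointer handoff: without the error check, the page keeps a pointer to a transaction that the caller frees.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures; not the real LogFS types. */
struct model_transaction { int id; };

struct model_inode {
	struct model_transaction *ta;         /* holds the transaction before a write */
};

struct model_page {
	struct model_transaction *private_ta; /* plays the role of page->private */
};

/* Hand the transaction from the inode to the page (role of move_inode_to_page()). */
static void model_move_inode_to_page(struct model_inode *inode, struct model_page *page)
{
	page->private_ta = inode->ta;
	inode->ta = NULL;
}

/* Hand it back on failure (role of move_page_to_inode() in the patch). */
static void model_move_page_to_inode(struct model_inode *inode, struct model_page *page)
{
	inode->ta = page->private_ta;
	page->private_ta = NULL;
}

/* Model of do_write_inode(); write_fails simulates the failing disk write. */
static int model_do_write_inode(struct model_inode *inode, struct model_page *page,
				int write_fails, int apply_fix)
{
	int err;

	model_move_inode_to_page(inode, page);
	err = write_fails ? -1 : 0;  /* stand-in for logfs_write_buf() */
	if (err && apply_fix)
		model_move_page_to_inode(inode, page);
	return err;
}

/*
 * Model of the create path. Returns 1 if the page still references the
 * transaction at the moment the abort path frees it (a dangling pointer
 * afterwards), and 0 otherwise.
 */
int model_create_leaves_dangling(int write_fails, int apply_fix)
{
	struct model_inode inode = { NULL };
	struct model_page page = { NULL };
	struct model_transaction *ta = malloc(sizeof(*ta));
	int dangling = 0;

	assert(ta != NULL);
	ta->id = 1;
	inode.ta = ta;

	if (model_do_write_inode(&inode, &page, write_fails, apply_fix) != 0) {
		/* Abort path: check who still references ta, then free it. */
		dangling = (page.private_ta == ta);
		inode.ta = NULL;
	}

	page.private_ta = NULL;  /* cleanup for the model only */
	free(ta);
	return dangling;
}
```

Calling model_create_leaves_dangling(1, 0) models the reported case (failed write, no error check), while model_create_leaves_dangling(1, 1) models the patched do_write_inode(). The model never dereferences the freed pointer; it only records whether the page would have been left holding it.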
Comment 2 Florian Mickler 2010-11-16 15:20:46 UTC
Can you submit that patch to lkml and cc the logfs maintainer and the logfs list?
(Joern Engel <joern@logfs.org>, logfs@logfs.org, linux-kernel@vger.kernel.org)

See Documentation/SubmittingPatches

Patch: https://bugzilla.kernel.org/show_bug.cgi?id=20162#c1
Comment 3 Rafael J. Wysocki 2010-11-18 23:42:30 UTC
Handled-By :  Prasad Gajanan Joshi <prasadjoshi124@gmail.com>
