Bug 9738

Summary: regression: 100% io-wait with 2.6.24-rcX
Product: File System
Reporter: Rafael J. Wysocki (rjw)
Component: Other
Assignee: fs_other
Status: CLOSED CODE_FIX
Severity: normal
CC: bunk, wfg
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.24-rc
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 9243

Description Rafael J. Wysocki 2008-01-12 12:19:17 UTC
Subject         : regression: 100% io-wait with 2.6.24-rcX
Submitter       : Joerg Platte <lists@naasa.net>
Date            : 2008-01-07 11:51
References      : http://lkml.org/lkml/2008/1/7/70
Handled-By      : Fengguang Wu <wfg@mail.ustc.edu.cn>
Comment 1 Anonymous Emailer 2008-01-14 16:56:17 UTC
Reply-To: wfg@mail.ustc.edu.cn

On Sat, Jan 12, 2008 at 12:20:20PM -0800, bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=9738

Joerg and I have confirmed that this patch fixes the bug:


clear PAGECACHE_TAG_DIRTY for truncated page in block_write_full_page()

The `truncated' page in block_write_full_page() may stick for a long time.
E.g. ext2_rmdir() will set i_size to 0, and then the dir inode may hang around
because of being referenced by someone.

So clear PAGECACHE_TAG_DIRTY to prevent pdflush from retrying and iowaiting on
it.

Tested-by: Joerg Platte <jplatte@naasa.net>
Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
---
fs/buffer.c |    2 ++
 1 files changed, 2 insertions(+)

Index: linux/fs/buffer.c
===================================================================
--- linux.orig/fs/buffer.c
+++ linux/fs/buffer.c
@@ -2820,7 +2820,9 @@ int block_write_full_page(struct page *p
 		 * freeable here, so the page does not leak.
 		 */
 		do_invalidatepage(page, 0);
+		set_page_writeback(page);
 		unlock_page(page);
+		end_page_writeback(page);
 		return 0; /* don't care */
 	}
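
The mechanism described in the patch can be sketched in userspace as a toy model. This is only an illustration, not kernel code: `struct toy_page` and every `toy_*` function are hypothetical names. It assumes, as the patch description implies, that the flusher finds pages through the dirty tag, that the page's own dirty flag is cleared before the write path runs, and that setting writeback on a page that is no longer dirty drops the dirty tag.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a page-cache page; all names here are hypothetical. */
struct toy_page {
    bool pg_dirty;      /* models the per-page dirty flag */
    bool tag_dirty;     /* models PAGECACHE_TAG_DIRTY in the mapping */
    bool tag_writeback; /* models the writeback tag in the mapping */
};

/* Models set_page_writeback(): a page that is no longer dirty
 * loses its dirty tag when writeback starts (assumption). */
static void toy_set_writeback(struct toy_page *p)
{
    p->tag_writeback = true;
    if (!p->pg_dirty)
        p->tag_dirty = false;
}

/* Models end_page_writeback(): writeback must have been started. */
static void toy_end_writeback(struct toy_page *p)
{
    assert(p->tag_writeback);
    p->tag_writeback = false;
}

/* Models the truncated-page branch of block_write_full_page():
 * nothing is submitted for I/O; only the patched variant touches
 * the writeback state. */
static void toy_write_truncated_page(struct toy_page *p, bool patched)
{
    if (patched)
        toy_set_writeback(p);
    /* unlock_page() would happen here in the real function */
    if (patched)
        toy_end_writeback(p);
}

/* Models a flusher that revisits the page for as long as it is tagged
 * dirty. Returns how many passes found the page, capped at max_passes. */
static int toy_flush_passes(struct toy_page *p, bool patched, int max_passes)
{
    int visits = 0;

    for (int i = 0; i < max_passes; i++) {
        if (!p->tag_dirty)
            break;
        p->pg_dirty = false; /* dirty flag is cleared before the write path */
        toy_write_truncated_page(p, patched);
        visits++;
    }
    return visits;
}
```

In this model the unpatched branch leaves `tag_dirty` set, so every flusher pass finds the page again, while the patched branch clears the tag on the first pass.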
 
Comment 2 Anonymous Emailer 2008-01-14 17:06:51 UTC
Reply-To: akpm@linux-foundation.org

On Mon, 14 Jan 2008 21:00:06 +0800
Fengguang Wu <wfg@mail.ustc.edu.cn> wrote:

> On Sat, Jan 12, 2008 at 12:20:20PM -0800, bugme-daemon@bugzilla.kernel.org wrote:
> > http://bugzilla.kernel.org/show_bug.cgi?id=9738
> 
> Joerg and I have confirmed that this patch fixes the bug:
> 
> 
> clear PAGECACHE_TAG_DIRTY for truncated page in block_write_full_page()
> 
> The `truncated' page in block_write_full_page() may stick for a long time.
> E.g. ext2_rmdir() will set i_size to 0, and then the dir inode may hang around
> because of being referenced by someone.
> 
> So clear PAGECACHE_TAG_DIRTY to prevent pdflush from retrying and iowaiting on
> it.
> 
> Tested-by: Joerg Platte <jplatte@naasa.net>
> Signed-off-by: Fengguang Wu <wfg@mail.ustc.edu.cn>
> ---
> fs/buffer.c |    2 ++
>  1 files changed, 2 insertions(+)
> 
> Index: linux/fs/buffer.c
> ===================================================================
> --- linux.orig/fs/buffer.c
> +++ linux/fs/buffer.c
> @@ -2820,7 +2820,9 @@ int block_write_full_page(struct page *p
>                * freeable here, so the page does not leak.
>                */
>               do_invalidatepage(page, 0);
> +             set_page_writeback(page);
>               unlock_page(page);
> +             end_page_writeback(page);
>               return 0; /* don't care */
>       }
>  

But seven minutes earlier you recommended that we revert 
2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b.  Please clarify?
Comment 3 Anonymous Emailer 2008-01-14 17:20:24 UTC
Reply-To: wfg@mail.ustc.edu.cn

On Mon, Jan 14, 2008 at 05:06:47PM -0800, Andrew Morton wrote:
> But seven minutes earlier you recommended that we revert 
> 2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b.  Please clarify?

OK. Please revert 2e6883bdf49abd0e7f0d9b6297fc3be7ebb2250b and ignore
this patch. I'll resend them both for -mm later on, in a more complete
patchset.

Fengguang
Comment 4 Adrian Bunk 2008-01-15 12:35:03 UTC
Problem-causing commit was reverted in commit c23f72cae9523d29ff94eec8f30ccbdaf234b20e.