Bug 13726

Summary: fio sync read 4k block size 35% regression
Product: IO/Storage
Reporter: Rafael J. Wysocki (rjw)
Component: Other
Assignee: io_other
Status: CLOSED CODE_FIX
Severity: normal
CC: yanmin_zhang
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.31-rc1
Subsystem:
Regression: Yes
Bisected commit-id:
Bug Depends on:
Bug Blocks: 13615

Description Rafael J. Wysocki 2009-07-06 21:59:09 UTC
Subject    : fio sync read 4k block size 35% regression
Submitter  : "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Date       : 2009-07-01 11:25
References : http://lkml.org/lkml/2009/6/30/679
Handled-By : Wu Fengguang <fengguang.wu@intel.com>

This entry is being used for tracking a regression from 2.6.30.  Please don't
close it until the problem is fixed in the mainline.

Caused by:

commit 51daa88ebd8e0d437289f589af29d4b39379ea76
Author: Wu Fengguang <fengguang.wu@intel.com>
Date:   Tue Jun 16 15:31:24 2009 -0700

    readahead: remove sync/async readahead call dependency

First-Bad-Commit : 51daa88ebd8e0d437289f589af29d4b39379ea76

Comment 1 Rafael J. Wysocki 2009-07-07 11:15:17 UTC
On Tuesday 07 July 2009, Zhang, Yanmin wrote:
> On Tue, 2009-07-07 at 02:01 +0200, Rafael J. Wysocki wrote:
> > This message has been generated automatically as a part of a report
> > of recent regressions.
> > 
> > The following bug entry is on the current list of known regressions
> > from 2.6.30.  Please verify if it still should be listed and let me know
> > (either way).
> > 
> > 
> > Bug-Entry   : http://bugzilla.kernel.org/show_bug.cgi?id=13726
> > Subject             : fio sync read 4k block size 35% regression
> > Submitter   : Zhang, Yanmin <yanmin_zhang@linux.intel.com>
> > Date                : 2009-07-01 11:25 (6 days old)
> 
> I'm still working on it. New testing against 2.6.31-rc2 is ongoing.
> fio sync/mmap read shows new behavior. I did collect some data, but suddenly,
> with newly created data, the fio_sync_read_4k regression disappeared, while
> the fio_mmap_read one is still there. Originally, the testing and bisect were stable.
> Let me first check what is happening.

Comment 2 Yanmin Zhang 2009-07-27 01:23:17 UTC
I found something new with the workload.
1) The fio parameters I used cause fio to read the files round-robin, with every I/O consisting of a single block; originally, I had assumed it read one complete file and then switched to the next file.
2) fio is a typical disk I/O benchmark, so CPU utilization is very low. I found that CPU C-states have a big impact on the results of such a workload, especially on Nehalem machines. On my Nehalem machine, the CPU enters C2/C3 frequently, and returning to C0 from those states takes a long time, which hurts the fio result. I have now added idle=poll to the kernel boot command line.
3) Fengguang thinks round-robin is a good test scenario, so I added new sub-cases covering both sequential read and round-robin read.
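
For illustration only, the two read orderings from item 1 can be sketched in Python. This is not taken from fio or from the job file used in this report; the function names and the (file, block) tuples are hypothetical, and the sketch only models the order in which blocks are requested, not real I/O or readahead.

```python
def sequential_order(num_files, blocks_per_file):
    """Read every block of one file before moving on to the next file."""
    return [(f, b) for f in range(num_files) for b in range(blocks_per_file)]


def round_robin_order(num_files, blocks_per_file):
    """Read one block from each file in turn, then advance to the next block."""
    return [(f, b) for b in range(blocks_per_file) for f in range(num_files)]


def file_switches(order):
    """Count how often two consecutive reads land on different files."""
    return sum(1 for prev, cur in zip(order, order[1:]) if prev[0] != cur[0])


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to keep the printed lists short.
    for name, fn in (("sequential", sequential_order),
                     ("round-robin", round_robin_order)):
        order = fn(2, 3)
        print(name, order, "file switches:", file_switches(order))
```

In the round-robin ordering, consecutive reads almost always land on different files, so each file sees its next block only after all the other files have been visited. That is the pattern the original test parameters produced, as opposed to the whole-file sequential pattern that had been assumed.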

I would like to close the report now. If I find new issues with the revised test cases, I will open new reports.

Thanks,
Yanmin