Bug 9978

Summary: 2.6.25-rc1: volanoMark regression
Product: Process Management Reporter: Rafael J. Wysocki (rjw)
Component: Scheduler Assignee: Ingo Molnar (mingo)
Status: CLOSED CODE_FIX    
Severity: normal CC: bsingharora, bunk, dhaval, mingo
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.25-rc1 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 9832    

Description Rafael J. Wysocki 2008-02-13 16:08:36 UTC
Subject         : 2.6.25-rc1: volanoMark 45% regression
Submitter       : "Zhang, Yanmin" <yanmin_zhang@linux.intel.com>
Date            : 2008-02-13 10:30
References      : http://lkml.org/lkml/2008/2/13/128
Handled-By      : Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>

This entry is being used for tracking a regression from 2.6.24.  Please don't
close it until the problem is fixed in the mainline.
Comment 1 Rafael J. Wysocki 2008-02-13 16:09:02 UTC
Caused by:

commit 58e2d4ca581167c2a079f4ee02be2f0bc52e8729
Author: Srivatsa Vaddagiri <vatsa@linux.vnet.ibm.com>
Date:   Fri Jan 25 21:08:00 2008 +0100

    sched: group scheduling, change how cpu load is calculated
Comment 2 Rafael J. Wysocki 2008-02-24 16:05:02 UTC
Handled-By : Balbir Singh <balbir@linux.vnet.ibm.com>
Comment 3 Rafael J. Wysocki 2008-03-12 15:43:13 UTC
References : http://lkml.org/lkml/2008/3/12/52

Yanmin said:

Peter reverted the load balance patch, and the revert was accepted into 2.6.25-rc4.

With kernel 2.6.25-rc5, volanoMark shows about a 6% regression on my 16-core Tigerton. If I apply the patch at http://lkml.org/lkml/2008/2/20/83, which fixes the tbench regression, the volanoMark regression drops to about 4%.

I tried to bisect to find which patch caused the last 4%, but found it very hard. One problem is that many patches depend on the reverted patches. The other is that the test results have not been stable since 2.6.25-rc1: the variation between runs is sometimes more than 15%. I ran the test against the same kernel many times and took the best result.

I also tried to tune some of the sched_XXX parameters under /proc/sys/kernel, but did not get a better result than with the default configuration.
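
For anyone repeating the tuning experiment, it helps to record the scheduler tunables alongside each benchmark result. The following is a minimal sketch, not taken from the report: the function name read_sched_tunables is hypothetical, and the set of sched_* files present depends on the kernel version and configuration, so the returned mapping may be empty.

```python
from pathlib import Path


def read_sched_tunables(base="/proc/sys/kernel"):
    """Return {name: value} for every readable sched_* entry under `base`.

    Hypothetical helper: which sched_* files exist varies between kernels,
    so callers should not assume any particular key is present.
    """
    tunables = {}
    for path in sorted(Path(base).glob("sched_*")):
        try:
            tunables[path.name] = path.read_text().strip()
        except OSError:
            # Directories and unreadable entries are skipped.
            continue
    return tunables


if __name__ == "__main__":
    for name, value in read_sched_tunables().items():
        print(f"{name} = {value}")
```

Saving this output with every run makes it possible to tell later whether two results were measured under the same settings.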

The regression above is on the 2.93GHz 16-core Tigerton. On the less powerful 2.40GHz 16-core Tigerton, the regression is less than 1%, but the result is not stable there either: across many runs the results vary by about 15%.

On the 8-core Stoakley, the regression is about 1%.
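
The figures above combine two quantities: a regression percentage based on the best of many runs, and the variation between runs of the same kernel. The following is a minimal sketch of how both might be computed. It assumes that a higher volanoMark score is better and that variation is measured as (best - worst) / best; the report does not state its exact formula, so both are assumptions. The function names and the scores in the usage section are placeholders, not measured data.

```python
def regression_pct(baseline_runs, candidate_runs):
    """Percentage drop of the candidate's best score relative to the baseline's best.

    Assumes higher scores are better. A positive result means the
    candidate kernel is slower than the baseline.
    """
    base = max(baseline_runs)
    cand = max(candidate_runs)
    return (base - cand) / base * 100.0


def spread_pct(runs):
    """Run-to-run variation of one kernel, as (best - worst) / best in percent.

    This is one possible definition of "variation", chosen for illustration.
    """
    best = max(runs)
    worst = min(runs)
    return (best - worst) / best * 100.0


if __name__ == "__main__":
    # Placeholder scores for illustration only, not real benchmark results.
    baseline = [1000.0, 950.0, 900.0]
    candidate = [960.0, 900.0, 820.0]
    print(f"regression: {regression_pct(baseline, candidate):.1f}%")
    print(f"candidate spread: {spread_pct(candidate):.1f}%")
```

When the spread is several times larger than the regression being chased, as it is here, a best-of-N comparison can move by a few percent from noise alone, which is consistent with the difficulty of bisecting the remaining 4%.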
Comment 4 Rafael J. Wysocki 2008-03-18 17:40:21 UTC
References : http://lkml.org/lkml/2008/3/18/81

Caused by:

commit e22ecef1d2658ba54ed7d3fdb5d60829fb434c23
Author: Ingo Molnar <mingo@elte.hu>
Date:   Fri Mar 14 22:16:08 2008 +0100

    sched: fix fair sleepers
Comment 6 Adrian Bunk 2008-04-06 13:53:55 UTC
According to comment #3, this commit was not enough to completely fix the regression.