Bug 11836

Summary: Scheduler on C2D CPU and latest 2.6.27 kernel
Product: Process Management Reporter: Rafael J. Wysocki (rjw)
Component: Scheduler Assignee: Ingo Molnar (mingo)
Status: CLOSED INSUFFICIENT_DATA    
Severity: normal CC: csnook, zdenek.kabelac
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.27 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 11167    

Description Rafael J. Wysocki 2008-10-25 06:45:57 UTC
Subject    : Scheduler on C2D CPU and latest 2.6.27 kernel
Submitter  : "Zdenek Kabelac" <zdenek.kabelac@gmail.com>
Date       : 2008-10-21 9:59
References : http://marc.info/?l=linux-kernel&m=122458320502371&w=4
Handled-By : Chris Snook <csnook@redhat.com>

This entry is being used for tracking a regression from 2.6.26.  Please don't
close it until the problem is fixed in the mainline.
Comment 1 Chris Snook 2008-10-27 06:50:13 UTC
As I understand it, this regression has only been seen in 2.6.27-git kernels, not the stable 2.6.27 series.  Please test on a stable 2.6.27.y kernel so we can be sure we're looking in the right place for the regression.
Comment 2 Zdenek Kabelac 2008-10-27 07:24:57 UTC
Yes - plain 2.6.27 is fine and I do not see this regression.

If nobody has any idea where this problem could be hidden, maybe I should do a bisect?
Comment 3 Zdenek Kabelac 2008-10-27 07:25:02 UTC
Yes - plain 2.6.27 is fine and I do not see this regression.

If nobody has any idea where this problem could be hidden, maybe I should do a bisect?
Comment 4 Chris Snook 2008-10-27 07:36:48 UTC
Please do bisect.  You can probably find it quickly by testing scheduler merges.
Comment 5 Zdenek Kabelac 2008-11-05 05:02:32 UTC
OK - this problem is still present with 2.6.28-rc3, and it is getting even weirder. I assume that some of the patches now flowing into the 2.6.28 tree are not correct. The CPU now scales its clock even with the performance governor (which might be a separate bug/regression). The scheduler also seems to slow down these CPU-bound tasks as if there were just 1 CPU: it appears to fit all jobs on one CPU and keep the second CPU free. For example, when a second busy task runs in parallel with gears, gears slowly renders fewer and fewer FPS.
Comment 6 Peter Zijlstra 2008-11-06 00:11:48 UTC
Did you do that bisect you proposed?

Can you post your full dmesg and .config somewhere?

I've never seen anything like this...
Comment 7 Zdenek Kabelac 2008-11-10 04:06:29 UTC
So I've spent quite a few hours on this, and it's getting even weirder. One part of the problem is that this is happening with the Fedora Rawhide X server - I've not found a kernel that does not show the strange behavior. So I'll try to describe the problem more closely:

There are some ongoing changes in Xorg - some of them are bringing back the glxgears speed that was achievable with the older 1.4.2 X server.

Using x86_64 2.6.28-rc3 & rawhide Xorg intel with GEM gives nice 860FPS
(My older debian with 1.4.2 & i386 kernel scores 890FPS)

Now when I run this:

LIBGL_ALWAYS_SOFTWARE=1 glxgears
I get something like 250FPS with the currently compiled kernel.
When I run a CPU busy loop, the FPS starts to drop - not in one step but gradually - and there is no other significant CPU task running.

1252 frames in 5.0 seconds = 250.225 FPS
1246 frames in 5.0 seconds = 249.087 FPS
1138 frames in 5.0 seconds = 227.510 FPS
1159 frames in 5.0 seconds = 231.704 FPS
1160 frames in 5.0 seconds = 231.827 FPS
1079 frames in 5.0 seconds = 215.736 FPS
1121 frames in 5.0 seconds = 224.167 FPS
989 frames in 5.0 seconds = 197.375 FPS
993 frames in 5.0 seconds = 198.469 FPS
976 frames in 5.0 seconds = 195.141 FPS
969 frames in 5.0 seconds = 193.783 FPS
957 frames in 5.0 seconds = 191.379 FPS
946 frames in 5.0 seconds = 189.060 FPS
949 frames in 5.0 seconds = 189.549 FPS
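
The readings above can be summarized with a short script. This is only a sketch, assuming the output format is exactly the "N frames in S seconds = F FPS" lines shown here; the helper names `parse_fps` and `relative_drop` are made up for illustration and are not part of glxgears or any other tool.

```python
import re

# Matches lines such as "1252 frames in 5.0 seconds = 250.225 FPS".
FPS_LINE = re.compile(r"^\s*(\d+) frames in ([\d.]+) seconds = ([\d.]+) FPS\s*$")


def parse_fps(text):
    """Return the FPS values reported in the text, in order of appearance."""
    values = []
    for line in text.splitlines():
        match = FPS_LINE.match(line)
        if match:
            values.append(float(match.group(3)))
    return values


def relative_drop(values):
    """Percentage drop from the first sample to the last one."""
    if len(values) < 2 or values[0] == 0:
        raise ValueError("need at least two samples and a non-zero first one")
    return 100.0 * (values[0] - values[-1]) / values[0]


# The samples posted in this comment, copied verbatim.
SAMPLE = """\
1252 frames in 5.0 seconds = 250.225 FPS
1246 frames in 5.0 seconds = 249.087 FPS
1138 frames in 5.0 seconds = 227.510 FPS
1159 frames in 5.0 seconds = 231.704 FPS
1160 frames in 5.0 seconds = 231.827 FPS
1079 frames in 5.0 seconds = 215.736 FPS
1121 frames in 5.0 seconds = 224.167 FPS
989 frames in 5.0 seconds = 197.375 FPS
993 frames in 5.0 seconds = 198.469 FPS
976 frames in 5.0 seconds = 195.141 FPS
969 frames in 5.0 seconds = 193.783 FPS
957 frames in 5.0 seconds = 191.379 FPS
946 frames in 5.0 seconds = 189.060 FPS
949 frames in 5.0 seconds = 189.549 FPS
"""

if __name__ == "__main__":
    fps = parse_fps(SAMPLE)
    print(f"{len(fps)} samples, drop from first to last: {relative_drop(fps):.1f}%")
```

On these numbers the drop from the first to the last sample is roughly a quarter, spread over about 70 seconds, which matches the "gradual, not one step" description.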

So it looks like this could be some 'timing' bug in the Xorg server instead of the kernel. I've not been able to see a similar thing on Debian's older X server.

I'll do some other tests - if I knew which ones....
But currently bisecting doesn't help, as I've not found a good kernel.

Also, any idea why the performance governor is changing the CPU frequency?
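
One way to narrow down the governor question might be to log what the kernel itself reports while the test runs. The sketch below assumes the standard cpufreq sysfs files (`scaling_governor` and `scaling_cur_freq`, the latter in kHz) are present under `/sys/devices/system/cpu`; the function name `read_cpufreq` is hypothetical, and whether these files are populated depends on the cpufreq driver in use.

```python
import glob
import os


def read_cpufreq(base="/sys/devices/system/cpu"):
    """Map each CPU directory name to (governor, current frequency in kHz)."""
    result = {}
    # "cpu[0-9]*" matches cpu0, cpu1, ... but not cpufreq/ or cpuidle/.
    for cpu_dir in sorted(glob.glob(os.path.join(base, "cpu[0-9]*"))):
        freq_dir = os.path.join(cpu_dir, "cpufreq")
        try:
            with open(os.path.join(freq_dir, "scaling_governor")) as f:
                governor = f.read().strip()
            with open(os.path.join(freq_dir, "scaling_cur_freq")) as f:
                cur_khz = int(f.read().strip())
        except (OSError, ValueError):
            continue  # cpufreq is not available or not readable for this CPU
        result[os.path.basename(cpu_dir)] = (governor, cur_khz)
    return result


if __name__ == "__main__":
    for cpu, (governor, khz) in read_cpufreq().items():
        print(f"{cpu}: governor={governor} cur_freq={khz / 1000:.0f} MHz")
```

Running this a few times during the glxgears test would show whether the governor is still reported as performance on both cores while the frequency changes.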