Most recent kernel where this bug did not occur: ubuntu kernel 2.6.10-5-686
Distribution: ubuntu
Hardware Environment: CPU Intel P4 1.7GHz - 1.256 GB RAM - Asus P4B MB
Software Environment:
Problem Description:
I encounter serious performance problems with all kernels newer than ubuntu kernel 2.6.10-5-686. Even the latest kernel.org 2.6.13 has this problem.

The first symptom I noticed was that the KDM login screen took more than 30sec to appear, when it took 3sec on average before. Then all applications were noticeably slower than on 2.6.10.

I wrote a little bash shell loop program (testit, attached in the zzxxx files) to compare performance on the same basis. This script, which sets a variable 100000 times, takes 8sec on the 2.6.10 kernel and 28sec on 2.6.12 or 2.6.13! So I think there is a serious problem in this kernel.

Then I compiled my own 2.6.12 kernel with very reduced options adapted to my particular system. I'll attach the .config for reference. But even with this kernel the performance problem is EXACTLY THE SAME! time ./testit still shows 28sec real time on 2.6.12 instead of 8sec on 2.6.10. Booting this kernel with acpi=off noacpi noapic nolapic does not solve the problem either.

Then I felt relieved when I saw this:

$ time ./testit
real 0m6.474s
user 0m6.211s
sys 0m0.126s
$ uname -a
Linux cha92-7-82-230-174-61 2.6.12-5-686 #1 Thu Jul 28 09:25:12 UTC 2005 i686 GNU/Linux

So the problem seemed to be solved by the latest kernel 2.6.12-5-686. But it was not. Now the problem does not occur immediately after boot. Performance degrades quickly only if some processes put the CPU under high load, like kmail + spamd on lots of incoming messages, or compiling kde etc... After such CPU load occurs, every process tends to take 5 to 10% CPU! So you can easily understand that the system quickly becomes so slow that it is absolutely unusable.
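For readers without the attachment, here is a minimal sketch of the kind of loop described above. The real testit script is not shown in this report, so the function body, the default count and the printed value are assumptions; only the idea (set a variable 100000 times in pure shell and time it) comes from the description.

```shell
# Hypothetical stand-in for the attached "testit" script.
# It assigns a variable N times in a pure-shell loop, so the run time
# is dominated by user CPU time rather than I/O.
testit() {
    n=${1:-100000}   # number of assignments (100000 in the report)
    i=0
    v=
    while [ "$i" -lt "$n" ]; do
        v=$i          # the variable assignment being measured
        i=$((i + 1))
    done
    echo "$v"         # print the last value so the loop has a visible result
}
```

Running `time testit 100000` on a good and on a bad kernel should give figures comparable in spirit to the 8sec and 28sec quoted above, though absolute numbers will differ from the original script.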
breezy kernel
Linux version 2.6.12-7-686 (buildd@terranova) (gcc version 3.4.5 20050809 (prerelease) (Debian 3.4.4-6ubuntu4)) #1 Fri Aug 19 13:08:28 UTC 2005

After several weeks of running the different new kernels I still have this problem. All I can say about it is the following:
- after boot everything is fine now (before, kdm was horribly slow to start; now it starts in 4 sec instead of 2 minutes)
- residual CPU when not doing anything is around 1 to 2%
- after some time of use OR if I use 100% CPU for a couple of minutes, residual CPU stays near 30% all the time. In fact, top shows each running process taking around 10% CPU. After that the machine becomes unusable, of course.
- a reboot fixes the problem

I thought of a cpu temperature problem (cpu is between 56-61
Created attachment 5889 [details] 2.6.10 data
Some data with kernel 2.6.10, which works well. The xorg data are there because I thought it was an xorg problem at first.
Created attachment 5890 [details] 2.6.12 data perf problems when using 2.6.12
First of all, distro kernels are not going to be too helpful to reference. Does kernel.org 2.6.10 display the same problem? What was the last *kernel.org* kernel which did not display the performance issues? Does anything show up in the logs? What's your .config? Any binary drivers? Thanks, Nish
Created attachment 5891 [details] 2.6.13 data perf problem with vanilla 2.6.13 more data to come
Odd. Initial blame always falls to mtrr settings. Can you please generate copies of /proc/mtrr for both good and bad kernels? I wonder if this could be caused by cpufreq going bananas?
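One way to collect the requested /proc/mtrr copies is sketched below. The function name, the output directory and the file naming scheme are illustrative assumptions; only /proc/mtrr itself comes from the request above.

```shell
# Save the MTRR table under a name tagged with the running kernel release,
# so dumps taken on a good and on a bad kernel can be diffed later.
# $1: source file (defaults to /proc/mtrr), $2: output directory (defaults to .)
snapshot_mtrr() {
    src=${1:-/proc/mtrr}
    outdir=${2:-.}
    out="$outdir/mtrr-$(uname -r).txt"
    # Copy the table, then print the path of the file that was written.
    cat "$src" > "$out" && echo "$out"
}
```

After booting each kernel and calling `snapshot_mtrr`, the two resulting files can be compared with `diff -u`.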
Created attachment 5922 [details] some infos including mtrr for ubuntu kernel 2610-5 which works well
I am currently compiling and testing kernel.org kernels 2.6.10, 2.6.11 ... to find where the problem begins. Please be patient.
Created attachment 5923 [details] kernel.org 2.6.11.1 just after boot
2.6.11.1 just after boot, when everything idles (X11 is started and the kdm login screen too): perf is correct.
Now I start KDE ...
Created attachment 5924 [details] kernel.org 2.6.11.1 after trying to work on kde
Now kde tries to startttt to staaaaarrt to staaaaaaaarrrrrtttttttt. After 10 minutes it has started but is entirely unusable: you can watch the windows being redrawn...
I log out of kde, wait for everything to be idle like in the preceding comment (kdm login screen) and take a report. You can see that perf has dropped by nearly 500%: nothing runs and testit takes 28sec instead of 6sec after boot.
So for the moment something weird happens between kernel.org 2.6.10 and 2.6.11-1.
Thanks for narrowing it down! I presume that 2.6.11 is also buggy?
Created attachment 5925 [details] kernel.org 2.6.11.1 just after boot well this is worse with 2.6.11 as the perf problems appear immediately after boot
Created attachment 5926 [details] 2.6.11 .config
Created attachment 5927 [details] 2.6.10 .config
Created attachment 5928 [details] kernel log of different boots
Summary as of today:
k2.6.10: perf stable
k2.6.11: bad perf starts at boot
k2.6.11-1 - 2.6.13: perf ok at boot but degrades after soliciting high CPU
- no binary driver
- no significant msg in logs afaik
- noacpi acpi=off noapic nolapic boot options do not solve the problem
- how can cpufreq affect me, since I have no process setting it? (the packages have been removed from my system since 2.6.11; that was the first thing I thought of)
- am I the only one to see this? It cannot go unnoticed! Is it MB arch dependent?
** I will try 2.6.11rc1
Please suggest what I can do to narrow the search down more. Thanks.
kernel.org 2.6.11rc1 works will try rc3
rc3 works, rc5 has the perf problem, trying rc4...
OK, the culprit has been hiding since 2.6.11-rc4! I will attach my 2 reports for a final comparison: zz2.6.11-rc3 and zz2.6.11-rc4.
Another example: in rc3 top takes 1% cpu, in rc4 top takes 5% cpu!
Created attachment 5945 [details] 2.6.11-rc3 good perf
Created attachment 5946 [details] 2.6.11-rc4 BAD PERFs
Created attachment 6039 [details] 2.6.13.1 bad perf
Created attachment 6063 [details] 2.6.13.2 GOOD PERF at boot - BAD after running kde Don't know what was wrong but this kernel is perfect, same good perfs for me as 2.6.10
no no no forget my joy on 2.6.13.2 - after using kde on it performance is badly impacted also even after stopping kde and xorg So i'm still stuck with 2.6.11-rc3 !
Could you please try disabling CONFIG_DRM_RADEON?
I tried this suggestion with kernels 2.6.13.1 and 2.6.13.2. No drm module is compiled and no drm module is loaded. The result is unchanged: very bad perf, with testit going from 6 sec immediately after boot to 13min20 (yes, 13 minutes!) with kde loaded. Stopping kde and xorg brings this test down to 30secs. I can get 20secs if I go to single user mode with minimal processes, but I cannot get back to the excellent 6secs perf. With 2.6.11-rc3 perf is always around 9secs regardless of the system load. Very weird...
I meant no radeon drm module... This is the timing under kde; notice the great difference between user and real time:
$ time ./testit
real 13m20.337s
user 0m54.559s
sys 0m1.684s

$ time ./testit
real 1m41.005s
user 0m44.523s
sys 0m1.192s
This is under kde, but with no load from other kde applications.
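Created attachment 6262 [details] diffs between 2.6.11 rc3 and rc4
This is the output of the following shell script, which runs diff -u on every .c and .h file between 2.6.11-rc3 and 2.6.11-rc4. It might suggest ideas...

#!/bin/bash
# Walk every .c/.h file of rc3 and diff it against the same path in rc4.
find linux-2.6.11-rc3 -name '*.[ch]' | while read -r a
do
    # Strip the leading "linux-2.6.11-rc3/" component to get the relative path.
    f=$(echo "$a" | cut -d'/' -f2-)
    diff -u "$a" "linux-2.6.11-rc4/$f"
    # diff exits 0 when the files are identical: nothing to separate.
    if [ $? -eq 0 ]
    then
        continue
    fi
    echo '_______________________________________________________________________________'
    echo
done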
Could you please try turning off CONFIG_FB as well as Radeon DRM.
Created attachment 6309 [details] 2.6.11-rc3 to rc4 diffstat
Created attachment 6314 [details] 2.6.11-rc4 without FB nor RADEON DRM
No more luck with this: still 30 sec immediately after boot. The slowness was visible during boot, where the boot steps are really slower than with rc3. I don't feel this has to do with graphics or video...
Created attachment 6322 [details] Zwane's 2.6 megaconfig
The diffstat isn't exactly revealing and DRM/FB were two suspects, that's why i singled them out. Please test the config called "Zwane's 2.6 megaconfig".
Created attachment 6349 [details] 2.6.11-rc4 megaconfig + corrections make xconfig corrected some options in your megaconfig I added bttv and usbmouse too, after verifying the initial corrected config was working right. Result is perfect.
Created attachment 6350 [details] results with corrected megaconfig after boot perf is OK 6 seconds for my testit program usage under KDE is perfect.
Created attachment 6351 [details] diffs between 2.6.11 rc3 and rc4 configs now where is the culprit ? pentiumII instead of pentiumIV ?
Hmm, can you try turn off CONFIG_AUDIT?
Created attachment 6382 [details] tests in single user mode
Beginning from the rc3 config, I tried removing the following options in order, one by one, with no success:
- pentiumII
- no drm
- no fb
- no audit
- no up_apic
At this point I only see an improvement in perf after boot, as demonstrated in the attached file. After boot in single user mode (to avoid graphics and the kde environment) my test loop takes 4 secs. Just compiling a kernel for 2 minutes makes this same loop take FIVE times longer, i.e. 20 secs!
Maybe we can take your config as a starting point, and you suggest options for me to add one by one to see at what point it breaks perf?
Can you try without CONFIG_HPET_TIMER?
not better with : - pentiumII - no drm - no fb - no audit - no up_apic - no hpet timer
Without CONFIG_X86_PM_TIMER?
not better with : - pentiumII - no drm - no fb - no audit - no up_apic - no hpet_timer - no hpet
not better with : - pentiumII - no drm - no fb - no audit - no up_apic - no hpet_timer - no hpet - no x86_pm_timer
How about disabling CONFIG_HIGHMEM
not better with : - pentiumII - no drm - no fb - no audit - no up_apic - no hpet_timer - no hpet - no x86_pm_timer - no highmem I also gave a try to 2.6.14 with my initial config : perf is bad after boot (29 sec).
Created attachment 6424 [details] 2.6.14 config
Created attachment 6425 [details] data and bad perf with 2.6.14
Could you take my mega config, make your necessary changes to boot/use your system (only select the options you really require here please) and then diff the changes between my mega config and your new config.
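A sketch of how the requested comparison could be produced without the noise of header comments and line ordering. The function name and the temporary-file handling are assumptions for illustration; the two .config paths are whatever files are being compared.

```shell
# Compare two kernel .config files on their option lines only.
# Keeps "CONFIG_FOO=..." and "# CONFIG_FOO is not set" lines, sorts them,
# and diffs the result, so comment headers and ordering are ignored.
# $1: first config (e.g. the megaconfig), $2: second config (e.g. the adapted one)
config_diff() {
    pat='^(CONFIG_[A-Za-z0-9_]+=|# CONFIG_[A-Za-z0-9_]+ is not set)'
    a=$(mktemp)
    b=$(mktemp)
    grep -E "$pat" "$1" | sort > "$a"
    grep -E "$pat" "$2" | sort > "$b"
    diff "$a" "$b"
    rc=$?             # 0: identical option sets, 1: differences found
    rm -f "$a" "$b"
    return $rc
}
```

Lines prefixed with `<` exist only in the first file and lines prefixed with `>` only in the second, which makes it easier to spot a small set of suspect options.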
Created attachment 6429 [details] current diffs between megaconfig and mine you will notice that I also tried with SMP + SMT and MAXCPU=2 as in your megaconfig, with no improvement. anyway this is the current diff between our 2 configs.
Created attachment 6473 [details] small diff causing perf loss this is my latest smallest diff of 2.6.11-rc4 configs that is causing perf loss on my machine. So it seems related to either I2C or SENSORS modules. I have configured the same on kernel 2.6.14 and I can obtain a viable performant kernel when removing all i2c and sensors modules.
Created attachment 6474 [details] bad perf with this config on 2.6.11-rc4
Created attachment 6475 [details] good perf on 2.6.11-rc4 with this config
Created attachment 6476 [details] good perf on 2.6.14 with this config - all other options are re enabled (drm framebuffer p4 etc)
can my problem be related to this ? http://www2.lm-sensors.nu/~lm78/cvs/lm_sensors2/prog/hotplug/README.p4b
Or is this patch harmful to me? (I use this chip as a sensor on my MB)
--- linux-2.6.11-rc3/drivers/i2c/chips/w83781d.c 2005-02-03 02:54:37.000000000 +0100
+++ linux-2.6.11-rc4/drivers/i2c/chips/w83781d.c 2005-02-13 04:04:47.000000000 +0100
Does backing out that patch make a difference?
The latest tests I have done show that the kernel slowness on my motherboard is triggered ONLY if w83781d is loaded DURING the BOOT phase (from /etc/modules). Modprobing it manually afterwards does not produce the bad effects. This is very weird.
Concerning the patch, I just looked at my diff listing from comment #27 for w83781d and found it there, but I don't know where the corresponding patch is or how I can keep it from being compiled in. It is beyond my skills ...
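To check whether a module is listed for loading at boot, something like the sketch below could be used. The function name is made up for illustration; /etc/modules is the file named in the comment above, and the format assumed here is one module name per line with # starting a comment.

```shell
# Return success if the given module is listed (uncommented) in the
# boot-time module list.
# $1: module name, $2: list file (defaults to /etc/modules)
loaded_at_boot() {
    # Match the name at the start of a line, followed by whitespace or end of line.
    grep -Eq "^[[:space:]]*$1([[:space:]]|\$)" "${2:-/etc/modules}"
}
```

For example, `loaded_at_boot w83781d && echo "loaded from /etc/modules"` distinguishes the boot-time case from a later manual modprobe.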
Jean Delvare thinks this may be duplicate of a bug (http://bugzilla.kernel.org/show_bug.cgi?id=4332) reported in the 2.6.11 cycle, with similar h/w and symptoms. I'm going to go ahead and make him the owner, so he can narrow down if it's an i2c issue. Thanks, Nish
*** This bug has been marked as a duplicate of 4332 ***
Pascal:
The performance drop is most likely caused by CPU throttling, itself due to your hardware monitoring chip (Asus AS99127F) erroneously asserting a CPU overheating alert condition. This condition is in turn triggered by "sensors -s", which must be run by one of your initialization scripts right after loading the w83781d driver. "sensors -s" programs the temperature limits according to data it finds in /etc/sensors.conf. This file requires an update due to a fix which was made to the w83781d driver in 2.6.11-rc4. Pick a fresh copy of that file from the lm_sensors project and your performance problem should be a thing of the past.
As a quick test, you can simply move /etc/sensors.conf away and reboot your system with any kernel. If my guess is correct, your system should run just fine.
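The quick test can be done reversibly, for instance as sketched here. The function name and the ".disabled" suffix are arbitrary choices for illustration, and writing to /etc needs root.

```shell
# Move the sensors configuration out of the way without deleting it.
# $1: config file (defaults to /etc/sensors.conf)
park_sensors_conf() {
    conf=${1:-/etc/sensors.conf}
    if [ -f "$conf" ]; then
        # Rename rather than remove, so the file can be moved back later.
        mv "$conf" "$conf.disabled" && echo "$conf.disabled"
    else
        echo "no such file: $conf" >&2
        return 1
    fi
}
```

After rebooting and rerunning the benchmark, the file can be restored by moving the ".disabled" copy back to its original name.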
hummm damn it, 6 months searching for a problem already in the database :(
Yes, it is true. Now I remember looking in sensors.conf because my cpu temp was bad, and choosing this line under as99127f-* to correct it:
compute temp2 (@*30/43)+25, (@-25)*43/30
Then this line stayed as is during ubuntu lm-sensors upgrades... Commenting out this line fixed the problem :)
Thanks all for your time. I updated the pending bug I opened on ubuntu: https://bugzilla.ubuntu.com/show_bug.cgi?id=12641
Rather than commenting out the line, you can use this one instead for 2.6.11+ kernels:
compute temp2 (@*15/43)+25, (@-25)*43/15
Basically the same, with "15" instead of "30". This should give you correct temperature readings again.
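To see what the two expressions in that compute line do, here is a small arithmetic sketch. It assumes the usual lm_sensors convention that the first expression converts the raw chip value to the displayed one and the second converts a displayed value (such as a limit) back to a raw one; the function names and the sample value 43 are made up for illustration.

```shell
# Forward expression: raw chip value -> displayed temperature.
# $1: raw value, $2: scale factor (15 for 2.6.11+ kernels, 30 in the old line)
to_display() {
    awk -v x="$1" -v k="${2:-15}" 'BEGIN { printf "%.1f\n", x * k / 43 + 25 }'
}

# Inverse expression: displayed temperature (e.g. a limit) -> raw chip value.
to_raw() {
    awk -v t="$1" -v k="${2:-15}" 'BEGIN { printf "%.1f\n", (t - 25) * 43 / k }'
}

to_display 43      # prints 40.0 with the new factor 15
to_display 43 30   # prints 55.0 with the old factor 30
to_raw 40          # prints 43.0, undoing the first conversion
```

The same raw value maps to a noticeably different temperature depending on the factor, and the inverse expression scales limits the other way. That is consistent with a stale line producing wrong limits when "sensors -s" writes them to the chip.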