Bug 5186
Description
Pascal Cavy
2005-09-04 16:57:39 UTC
Created attachment 5889 [details]
2.6.10 data
some data with this kernel 2.6.10 that works well
xorg data are there because I thought it was an xorg problem at first.
Created attachment 5890 [details]
2.6.12 data
perf problems when using 2.6.12
First of all, distro kernels are not going to be too helpful to reference. Does kernel.org 2.6.10 display the same problem? What was the last *kernel.org* kernel which did not display the performance issues? Does anything show up in the logs? What's your .config? Any binary drivers?
Thanks,
Nish
Created attachment 5891 [details]
2.6.13 data
perf problem with vanilla 2.6.13
more data to come
Odd. Initial blame always falls to mtrr settings. Can you please generate copies of /proc/mtrr for both good and bad kernels?
I wonder if this could be caused by cpufreq going bananas?
Created attachment 5922 [details]
some info including mtrr for ubuntu kernel 2610-5 which works well
I am currently compiling and testing kernel.org kernels 2.6.10 2.6.11 ... to
find where the problem begins. please be patient
Created attachment 5923 [details]
kernel.org 2.6.11.1 just after boot
2.6.11.1 just after boot when everything idles (X11 is started and kdm login
screen too)
perf is correct
now I start KDE ...
Created attachment 5924 [details]
kernel.org 2.6.11.1 after trying to work on kde
now kde tries to startttt to staaaaarrt to staaaaaaaarrrrrtttttttt
after 10 minutes it has started but is entirely unusable
you can see the windows refresh draw...
I log out kde, wait for everything to be idle like in the preceding comment
(kdm login screen) and take a report
you can see that perf has dropped by nearly a factor of 5:
nothing runs and testit takes 28sec instead of 6sec after boot.
so for the moment something weird happens between kernel.org 2.6.10 and
2.6.11.1.
Thanks for narrowing it down! I presume that 2.6.11 is also buggy?
Created attachment 5925 [details]
kernel.org 2.6.11.1 just after boot
well this is worse with 2.6.11 as the perf problems appear immediately after
boot
Created attachment 5926 [details]
2.6.11 .config
Created attachment 5927 [details]
2.6.10 .config
Created attachment 5928 [details]
kernel log of different boots
Summary as of today:
- k2.6.10: perf stable
- k2.6.11: bad perf starts at boot
- k2.6.11.1 - 2.6.13: perf ok at boot but degrades after sustained high CPU load
- no binary driver
- no significant msg in logs afaik
- noacpi acpi=off noapic nolapic boot options do not solve the problem
- how can cpufreq affect me since I have no process setting it (packages have been removed from my system since 2.6.11, that was the first thing I thought of)
- am I the only one to see that ! it cannot go unnoticed ! is it MB arch dependent ?
** I will try 2.6.11rc1
please suggest what I should do to narrow the search down more. thanks.
kernel.org 2.6.11rc1 works, will try rc3
rc3 works
rc5 has the perf problem, trying rc4...
OK the culprit has been hiding since 2.6.11-rc4 !
I will attach my 2 reports for a final comparison : zz2.6.11-rc3 zz2.6.11-rc4
another example : in rc3 top takes 1% cpu, in rc4 top takes 5% cpu !
Created attachment 5945 [details]
2.6.11-rc3 good perf
Created attachment 5946 [details]
2.6.11-rc4 BAD PERFs
Created attachment 6039 [details]
2.6.13.1 bad perf
Created attachment 6063 [details]
2.6.13.2 GOOD PERF at boot - BAD after running kde
Don't know what was wrong but this kernel is perfect, same good perfs for me
as 2.6.10
no no no forget my joy on 2.6.13.2 - after using kde on it, performance is badly impacted as well, even after stopping kde and xorg.
So I'm still stuck with 2.6.11-rc3 !
Could you please try disabling CONFIG_DRM_RADEON?
I tried this suggestion with kernel 2.6.13.1 and 2.6.13.2. No drm module is compiled and no drm module is loaded. The result is unchanged: very bad perf, going from testit taking 6 sec immediately after boot to 13mn20 (yes 13 minutes !) with kde loaded. Stopping kde and xorg takes this test down to 30 secs. I can get 20 secs if I go to single user mode with minimal processes, but I cannot get back to the excellent 6 secs perf.
With 2.6.11-rc3 perf is always around 9 secs regardless of the system load. very weird...
I meant no radeon drm module...
This is the timing under kde, notice the great difference between user and real time:
$ time ./testit
real 13m20.337s
user 0m54.559s
sys 0m1.684s
$ time ./testit
real 1m41.005s
user 0m44.523s
sys 0m1.192s
(the second run is under kde but with no load from other kde applications)
Created attachment 6262 [details]
diffs between 2.6.11 rc3 and rc4
This is the output of the following shell script, which runs diff -u on the .c
and .h files of 2.6.11-rc3 and 2.6.11-rc4
it might suggest ideas...
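#!/bin/bash
# Print a unified diff for every .c/.h file that changed between rc3 and rc4.
find linux-2.6.11-rc3 -name '*.[ch]' | while IFS= read -r a
do
    # Strip the leading tree directory to get the path relative to the tree root.
    f=$(echo "$a" | cut -d'/' -f2-)
    # diff exits with 0 when the files are identical: nothing to report.
    if diff -u "$a" "linux-2.6.11-rc4/$f"
    then
        continue
    fi
    echo
    echo '_______________________________________________________________________________'
    echo
done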
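The full diff produced by the script above is large. As a rough sketch only (the function name changed_sources is made up for illustration and is not part of the original report), the same walk over the tree can be reduced to a list of the file names that differ, which is easier to scan for suspects:

```shell
#!/bin/bash
# Hypothetical helper: list the .c/.h files that differ between two trees.
# Paths are printed relative to the old tree. A file that is missing from
# the new tree is reported as differing.
changed_sources() {
    local old=$1 new=$2
    (cd "$old" && find . -name '*.[ch]' -type f) | sort | while IFS= read -r f
    do
        f=${f#./}
        # cmp -s is silent and exits non-zero when the files differ.
        cmp -s "$old/$f" "$new/$f" 2>/dev/null || printf '%s\n' "$f"
    done
}

# Example (assumes both trees are unpacked in the current directory):
# changed_sources linux-2.6.11-rc3 linux-2.6.11-rc4
```

Files that only exist in the new tree are not listed, because the walk starts from the old tree, as in the original script.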
Could you please try turning off CONFIG_FB as well as Radeon DRM.
Created attachment 6309 [details]
2.6.11-rc3 to rc4 diffstat
Created attachment 6314 [details]
2.6.11-rc4 without FB nor RADEON DRM
No more luck with this - still 30 sec immediately after boot. The slowness was
visible during boot, where boot steps are really slower than with rc3. I don't
feel this has to do with graphics or video...
Created attachment 6322 [details]
Zwane's 2.6 megaconfig
The diffstat isn't exactly revealing and DRM/FB were two suspects, that's why I singled them out. Please test the config called "Zwane's 2.6 megaconfig".
Created attachment 6349 [details]
2.6.11-rc4 megaconfig + corrections
make xconfig corrected some options in your megaconfig
I added bttv and usbmouse too, after verifying the initial corrected config was
working right.
Result is perfect.
Created attachment 6350 [details]
results with corrected megaconfig
after boot perf is OK 6 seconds for my testit program
usage under KDE is perfect.
Created attachment 6351 [details]
diffs between 2.6.11 rc3 and rc4 configs
now where is the culprit ? pentiumII instead of pentiumIV ?
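When comparing two .config files like this, a plain diff can be noisy because of comments and option ordering. A minimal sketch, assuming the usual "CONFIG_X=value" and "# CONFIG_X is not set" line formats (the function name config_diff is hypothetical), that prints only the options whose setting differs:

```shell
#!/bin/bash
# Hypothetical helper: show options that are set differently in two .config files.
# Lines starting with "<" come from the first file, ">" from the second.
config_diff() {
    local a b
    a=$(mktemp) && b=$(mktemp) || return 1
    # Keep only option lines (set or explicitly not set) and sort them
    # so that option ordering does not show up as a difference.
    grep -E '^(CONFIG_[A-Za-z0-9_]+=|# CONFIG_[A-Za-z0-9_]+ is not set)' "$1" | LC_ALL=C sort > "$a"
    grep -E '^(CONFIG_[A-Za-z0-9_]+=|# CONFIG_[A-Za-z0-9_]+ is not set)' "$2" | LC_ALL=C sort > "$b"
    # Drop diff's hunk markers, keep only the changed lines.
    diff "$a" "$b" | grep '^[<>]' || true
    rm -f "$a" "$b"
}

# Example (file names are placeholders):
# config_diff config-2.6.11-rc3 config-2.6.11-rc4
```

Each remaining line is a candidate to flip on its own in the next test build.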
Hmm, can you try turning off CONFIG_AUDIT?
Created attachment 6382 [details]
tests in single user mode
I tried removing the following options in order, one by one with no success
beginning from the rc3 config
- pentiumII
- no drm
- no fb
- no audit
- no up_apic
at this point I only see an improvement in perf after boot
as is demonstrated in the attached file
after boot in single user mode, to avoid graphics and the kde environment, my test
loop takes 4 secs
just compiling a kernel for 2 minutes makes this same loop take FIVE times longer
ie 20 secs !
Maybe we can take your config as a starting point and you can suggest options for me
to add one by one, to see at what point it breaks perf ?
Can you try without CONFIG_HPET_TIMER?
not better with :
- pentiumII
- no drm
- no fb
- no audit
- no up_apic
- no hpet timer
Without CONFIG_X86_PM_TIMER?
not better with :
- pentiumII
- no drm
- no fb
- no audit
- no up_apic
- no hpet_timer
- no hpet
not better with :
- pentiumII
- no drm
- no fb
- no audit
- no up_apic
- no hpet_timer
- no hpet
- no x86_pm_timer
How about disabling CONFIG_HIGHMEM
not better with :
- pentiumII
- no drm
- no fb
- no audit
- no up_apic
- no hpet_timer
- no hpet
- no x86_pm_timer
- no highmem
I also gave a try to 2.6.14 with my initial config : perf is bad after boot (29 sec).
Created attachment 6424 [details]
2.6.14 config
Created attachment 6425 [details]
data and bad perf with 2.6.14
Could you take my mega config, make the changes necessary to boot/use your system (only select the options you really require here please) and then diff the changes between my mega config and your new config.
Created attachment 6429 [details]
current diffs between megaconfig and mine
you will notice that I also tried with SMP + SMT and MAXCPU=2 as in your
megaconfig, with no improvement.
anyway this is the current diff between our 2 configs.
Created attachment 6473 [details]
small diff causing perf loss
this is my latest smallest diff of 2.6.11-rc4 configs that is causing perf loss
on my machine.
So it seems related to either I2C or SENSORS modules.
I have configured the same on kernel 2.6.14 and I can obtain a viable
performant kernel when removing all i2c and sensors modules.
Created attachment 6474 [details]
bad perf with this config on 2.6.11-rc4
Created attachment 6475 [details]
good perf on 2.6.11-rc4 with this config
Created attachment 6476 [details]
good perf on 2.6.14 with this config - all other options are re enabled (drm framebuffer p4 etc)
can my problem be related to this ?
http://www2.lm-sensors.nu/~lm78/cvs/lm_sensors2/prog/hotplug/README.p4b
Or is this patch harmful to me ? (I use this chip as a sensor on my MB)
--- linux-2.6.11-rc3/drivers/i2c/chips/w83781d.c 2005-02-03 02:54:37.000000000 +0100
+++ linux-2.6.11-rc4/drivers/i2c/chips/w83781d.c 2005-02-13 04:04:47.000000000 +0100
Does backing out that patch make a difference?
The latest tests I have done show that kernel slowness on my motherboard is triggered ONLY if w83781d is activated DURING the BOOT phase (from /etc/modules). Modprobing it manually afterwards does not produce the bad effects. This is very weird.
Concerning the patch, I just looked at my diff listing from comment #27 for w83781d and found it there, but I don't know where the corresponding patch is or how I can keep it from being compiled. It is beyond my skills ...
Jean Delvare thinks this may be a duplicate of a bug (http://bugzilla.kernel.org/show_bug.cgi?id=4332) reported in the 2.6.11 cycle, with similar h/w and symptoms. I'm going to go ahead and make him the owner, so he can narrow down if it's an i2c issue.
Thanks,
Nish
*** This bug has been marked as a duplicate of 4332 ***
Pascal: The performance drop is most likely caused by CPU throttling, itself due to your hardware monitoring chip (Asus AS99127F) erroneously asserting a CPU overheating alert condition. This condition is itself triggered by "sensors -s", which must be run by one of your initialization scripts right after loading the w83781d driver. "sensors -s" programs the temperature limits according to data it finds in /etc/sensors.conf. This file requires an update due to a fix which was done to the w83781d driver in 2.6.11-rc4. Pick a fresh copy of that file from the lm_sensors project and your performance problem should belong to the past.
As a quick test, you can simply move /etc/sensors.conf away and reboot your system with any kernel. If my guess is correct, your system should run just fine.
hummm damn it, 6 months searching for a problem already in the database :(
yes it is true, now I remember looking in sensors.conf because my cpu temp was bad and choosing this line to correct it under as99127f-* :
compute temp2 (@*30/43)+25, (@-25)*43/30
then this line stayed as is during ubuntu lm-sensors upgrades...
commenting out this line fixed the problem :)
thanks all for your time.
I updated the pending bug I opened on ubuntu https://bugzilla.ubuntu.com/show_bug.cgi?id=12641
Rather than commenting out the line, you can use this one instead for 2.6.11+ kernels:
compute temp2 (@*15/43)+25, (@-25)*43/15
Basically the same, with "15" instead of "30". This should give you correct temperature readings again.
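To see what the two halves of such a compute line do, here is a small arithmetic sketch. It assumes the usual sensors.conf reading of "compute": the first expression turns the raw chip value (@) into the displayed temperature, and the second turns a displayed value back into a raw one, as needed when limits are written to the chip. The function names and the sample raw value 43 are made up for illustration.

```shell
#!/bin/bash
# Hypothetical helpers mirroring: compute temp2 (@*K/43)+25, (@-25)*43/K
# $1 is the value to convert, $2 is the factor K (30 in the old line, 15 in the new one).
raw_to_temp() {
    LC_ALL=C awk -v raw="$1" -v k="$2" 'BEGIN { printf "%.1f\n", raw * k / 43 + 25 }'
}
temp_to_raw() {
    LC_ALL=C awk -v t="$1" -v k="$2" 'BEGIN { printf "%.1f\n", (t - 25) * 43 / k }'
}

# Example with an arbitrary raw value of 43:
# raw_to_temp 43 15    # prints 40.0
# temp_to_raw 40 15    # prints 43.0, i.e. the two expressions are inverses
# raw_to_temp 43 30    # prints 55.0, the old factor reads the same raw value much hotter
```

Since the second expression is also what converts the limits from sensors.conf, a stale factor there would skew the limits that "sensors -s" programs as well as the readings.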