Most recent kernel where this bug did *NOT* occur: unknown
Distribution: openSuSE 10.2 (probably irrelevant)
A previous discussion of this problem is here: https://bugzilla.novell.com/show_bug.cgi?id=246525
Hardware Environment: hp compaq nx6325 (AMD Turion64 X2)
Software Environment:

Problem Description:
After taking CPU1 offline and online again, the /sys/devices/system/cpu/cpu1/cpufreq link is missing. This also happens after suspend2disk and resume. While the CPU will work normally, kpowersave will report it as disabled.

Steps to reproduce (as root user):

    # echo 0 >/sys/devices/system/cpu/cpu1/online
    # echo 1 >/sys/devices/system/cpu/cpu1/online
    # l /sys/devices/system/cpu/*
    -rw-r--r-- 1 root root 4096 2007-02-16 23:12 /sys/devices/system/cpu/sched_mc_power_savings

    /sys/devices/system/cpu/cpu0:
    total 0
    drwxr-xr-x 5 root root    0 2007-02-16 23:12 ./
    drwxr-xr-x 4 root root    0 2007-02-16 23:12 ../
    drwxr-xr-x 5 root root    0 2007-02-17 00:08 cache/
    drwxr-xr-x 3 root root    0 2007-02-16 23:12 cpufreq/
    -r-------- 1 root root 4096 2007-02-16 23:12 crash_notes
    drwxr-xr-x 2 root root    0 2007-02-17 00:08 topology/

    /sys/devices/system/cpu/cpu1:
    total 0
    drwxr-xr-x 4 root root    0 2007-02-16 23:13 ./
    drwxr-xr-x 4 root root    0 2007-02-16 23:12 ../
    drwxr-xr-x 5 root root    0 2007-02-16 23:13 cache/
    -r-------- 1 root root 4096 2007-02-16 23:12 crash_notes
    -rw------- 1 root root    0 2007-02-16 23:13 online
    drwxr-xr-x 2 root root    0 2007-02-16 23:13 topology/
    #
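The symptom described above (an online CPU whose cpufreq entry is missing) can be checked without reading the listing by eye. The following is only a minimal sketch: the function names are hypothetical, the sysfs path is the one from the report, and it assumes that a CPU without an `online` file (as cpu0 in the listing) counts as online.

```python
import os
import re


def is_online(cpu_dir):
    """Treat a CPU as online if its 'online' file reads '1'.

    Assumption: a CPU directory without an 'online' file (cpu0 in the
    listing from the report) is always online.
    """
    path = os.path.join(cpu_dir, "online")
    if not os.path.exists(path):
        return True
    with open(path) as f:
        return f.read().strip() == "1"


def online_cpus_missing_cpufreq(base="/sys/devices/system/cpu"):
    """Return the names of online cpuN directories that lack a cpufreq entry."""
    missing = []
    for name in os.listdir(base):
        # Only look at cpu0, cpu1, ...; skip other entries in the directory.
        if not re.fullmatch(r"cpu\d+", name):
            continue
        cpu_dir = os.path.join(base, name)
        if not is_online(cpu_dir):
            continue
        # lexists() is true for a real directory and for a symlink alike.
        if not os.path.lexists(os.path.join(cpu_dir, "cpufreq")):
            missing.append(name)
    return sorted(missing, key=lambda n: int(n[3:]))
```

Calling `online_cpus_missing_cpufreq()` after the offline/online cycle should list cpu1 on an affected system and return an empty list on a system where the entry is recreated.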
AFAICT, this only happens if the kernel is 32-bit (i386). At least I'm unable to reproduce the problem on nx6325 with x86_64 kernels (2.6.20 or later).
This has indeed been bugging me for a long time (Athlon 64 X2 3800+, Asus A8N-E, 32-bit distro).
The problem persists with kernel version 2.6.20.4.
This does not look like a cpufreq problem... So far I have found out that the vendor string gets overridden when the CPU gets offlined:

    Apr 26 19:10:47 adalid kernel: Checking CPU 0
    Apr 26 19:10:47 adalid kernel: k
    Apr 26 19:10:47 adalid kernel: u
    Apr 26 19:10:47 adalid kernel: vendor: 2 - We have an AMD CPU ...
    Apr 26 19:10:47 adalid kernel: a
    Apr 26 19:10:47 adalid kernel: b
    Apr 26 19:10:47 adalid kernel: c
    Apr 26 19:10:47 adalid kernel: d
    Apr 26 19:10:47 adalid kernel: e
    Apr 26 19:10:47 adalid kernel: CPU 0 supported
    Apr 26 19:10:47 adalid kernel: Checking CPU 1
    Apr 26 19:10:47 adalid kernel: k
    Apr 26 19:10:47 adalid kernel: u
    Apr 26 19:10:47 adalid kernel: vendor: 255 - NOT AN AMD CPU ?!?
    Apr 26 19:10:47 adalid kernel: Checking CPU 2
    Apr 26 19:10:47 adalid kernel: k
    Apr 26 19:10:47 adalid kernel: u
    Apr 26 19:10:47 adalid kernel: vendor: 255 - NOT AN AMD CPU ?!?
    Apr 26 19:10:47 adalid kernel: Checking CPU 3
    Apr 26 19:10:47 adalid kernel: k
    Apr 26 19:10:47 adalid kernel: u
    Apr 26 19:10:47 adalid kernel: vendor: 255 - NOT AN AMD CPU ?!?
    Apr 26 19:10:47 adalid kernel: Num online: 4

The vendor messages come from debug printks I added to powernow-k8.c:check_supported_cpu(unsigned int cpu), here:

    if (current_cpu_data.x86_vendor != X86_VENDOR_AMD) {
            printk("vendor: %d - NOT AN AMD CPU ?!?\n", current_cpu_data.x86_vendor);
            goto out;
    }
    printk("vendor: %d - We have an AMD CPU ...\n", current_cpu_data.x86_vendor);

Remember: this is a 32-bit kernel problem! I will try some more tomorrow...
Created attachment 11290
Don't delete cpu_devs data to identify different x86 types in late_initcall

This one should fix it. Dave, Andi: will one of you pick this one up, or do I have to explicitly post it to lkml or somewhere? I tested this on i386 (works) and test-compiled it with and without CONFIG_HOTPLUG_CPU.

---------------------------------
*Unrelated*: a change compared to older or x86_64 kernels that I noticed. When doing:

    l -d cpu*/cpufreq
    drwxr-xr-x cpu0/cpufreq/
    lrwxrwxrwx cpu1/cpufreq -> ../../../../devices/system/cpu/cpu0/cpufreq/
    drwxr-xr-x cpu2/cpufreq/
    lrwxrwxrwx cpu3/cpufreq -> ../../../../devices/system/cpu/cpu2/cpufreq/

then doing:

    adalid:/sys/devices/system/cpu # echo 0 >cpu2/online
    adalid:/sys/devices/system/cpu # l -d cpu*/cpufreq
    drwxr-xr-x cpu0/cpufreq/
    lrwxrwxrwx cpu1/cpufreq -> ../../../../devices/system/cpu/cpu0/cpufreq/

The cpufreq dir vanishes for both cores, cpu2 and cpu3. This was on a recent i386 kernel. I am pretty sure I once saw cpu3/cpufreq become a real directory when offlining cpu2, and when adding cpu2 again, cpu2/cpufreq got linked to cpu3/cpufreq. This should be the right behaviour, so the kernel is broken here, right?

-> You might want to reply to the list this one is forwarded to; that should be easiest. I don't want to write a separate mail now...
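The directory-versus-symlink observation above can be captured before and after offlining a core, so the two states can be compared directly. This is a rough sketch only: `classify_cpufreq` is a hypothetical helper name, and the default path is the sysfs directory used in the listings.

```python
import os
import re


def classify_cpufreq(base="/sys/devices/system/cpu"):
    """Map each cpuN to ('dir', None), ('link', target) or ('missing', None)."""
    result = {}
    for name in os.listdir(base):
        if not re.fullmatch(r"cpu\d+", name):
            continue
        path = os.path.join(base, name, "cpufreq")
        if os.path.islink(path):
            # Sibling core sharing another core's cpufreq directory.
            result[name] = ("link", os.readlink(path))
        elif os.path.isdir(path):
            # Core that owns the real cpufreq directory.
            result[name] = ("dir", None)
        else:
            result[name] = ("missing", None)
    return result
```

Taking one snapshot before `echo 0 >cpu2/online` and one after, then comparing the two dictionaries, would show whether cpu3 loses its entry or is promoted to a real directory.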
Alas, this patch doesn't work for me - with it applied, modprobe powernow-k8 spits out "cannot load module: no such device".
Actually, it works - I first tried it on a kernel with some other patches, so the failure must have been their fault. Mea culpa. Thanks a lot for this - now I can use suspend and not worry about processes on cpu1 not triggering throttling.
Thanks for the verification. Hopefully Andi or Dave can just pick the patch up. If there is no message within the next few days, I'll send a little pointer to lkml for review and ask for integration...
Arg, I shouldn't have assigned this one to myself... I have set it to the i386 component. Can someone please review and pick up the patch from comment #5?
Yes, the patch looks correct. I just merged it. The owner should close the bug now.
Accept bug for closing.