Bug 73781

Summary: acpi-cpufreq cannot be loaded.
Product: ACPI
Component: Config-Processors
Status: CLOSED PATCH_ALREADY_AVAILABLE
Severity: normal
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 3.14
Subsystem:
Regression: Yes
Bisected commit-id:
Reporter: KATO Hiroshi (katoh)
Assignee: Lan Tianyu (tianyu.lan)
CC: celeonar, l.majewski, smf-linux, tianyu.lan, viresh.kumar
Attachments:
  inserted modules (3.13.8)
  inserted modules (3.14)
  dmesg trace from last git bisect
  dmesg_3.13.8_and_3.14.1.tar.gz
  debug.patch

Description KATO Hiroshi 2014-04-10 14:23:47 UTC
After I upgraded the kernel to 3.14, the acpi-cpufreq module can no longer be loaded.
As a result, CPU frequency scaling no longer works on my laptop.
Up to kernel 3.13.8, acpi-cpufreq was loaded automatically as expected.

If I try to load acpi-cpufreq manually on 3.14, it fails:

# modprobe acpi-cpufreq
modprobe: ERROR: could not insert 'acpi_cpufreq': No such device

Strangely, if p4_clockmod is loaded first, acpi-cpufreq can be
loaded successfully too, even though my laptop's CPU is a Pentium M.

# modprobe p4-clockmod
# modprobe acpi-cpufreq

In this case, the available frequencies are wrong.

# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies
212500 425000 637500 850000 1062500 1275000 1487500 1700000

On 3.13.8 (this is correct):
# cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_frequencies
1700000 1400000 1200000 1000000 800000 600000 
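
The evenly spaced values seen after loading p4-clockmod are consistent with
that driver exposing clock-modulation duty-cycle steps (eighths of the maximum
frequency) instead of the P-states reported by ACPI. A minimal sketch of that
arithmetic, assuming eight steps and a 1700000 kHz maximum; the function name
clockmod_steps is hypothetical and not part of any tool:

```shell
#!/bin/sh
# Print max_khz * i / 8 for i = 1..8 on one line.
# Assumption: eight equal duty-cycle steps, as the reported list suggests.
clockmod_steps() {
    max_khz=$1
    out=""
    i=1
    while [ "$i" -le 8 ]; do
        # Append a separating space only after the first value.
        out="$out${out:+ }$((max_khz * i / 8))"
        i=$((i + 1))
    done
    printf '%s\n' "$out"
}

clockmod_steps 1700000
```

For 1700000 this yields the same eight values as the list reported on 3.14,
which suggests the table came from p4-clockmod rather than from acpi-cpufreq.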


/proc/cpuinfo on Linux version 3.13.8
processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 9
model name      : Intel(R) Pentium(R) M processor 1700MHz
stepping        : 5
microcode       : 0x7
cpu MHz         : 600.000
cache size      : 1024 KB
physical id     : 0
siblings        : 1
core id         : 0
cpu cores       : 1
apicid          : 0
initial apicid  : 0
fdiv_bug        : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr mce cx8 sep mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 tm pbe bts est tm2
bogomips        : 1199.92
clflush size    : 64
cache_alignment : 64
address sizes   : 32 bits physical, 32 bits virtual
power management:
Comment 1 Viresh Kumar 2014-04-10 14:52:53 UTC
Try reverting these patches on top of 3.14 in the order mentioned:

commit eb8c68ef558e6cba241e7ada54f6b3427cb2bf68
Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date:   Mon Jan 27 22:50:35 2014 -0500

    acpi-cpufreq: De-register CPU notifier and free struct msr on error.

commit cfc9c8ed03e4d908f2388af8815f44c87b503aaf
Author: Lukasz Majewski <l.majewski@samsung.com>
Date:   Fri Dec 20 15:24:50 2013 +0100

    acpi-cpufreq: Adjust the code to use the common boost attribute


commit 6f19efc0a1ca08bc61841b971d8b85ab505d95c8
Author: Lukasz Majewski <l.majewski@samsung.com>
Date:   Fri Dec 20 15:24:49 2013 +0100

    cpufreq: Add boost frequency support in core
    


And let me know if it fixes things for you.
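
The three reverts are meant to be applied together, in the order listed. A
rough sketch of how that could be scripted, assuming a git clone of the kernel
with v3.14 checked out; the helper name revert_in_order and its -n (dry-run)
flag are made up for illustration:

```shell
#!/bin/sh
# Revert the given commits one by one, in the order given, stopping at the
# first one that does not revert cleanly. With -n as the first argument,
# only print the git commands instead of running them.
revert_in_order() {
    dry_run=0
    if [ "$1" = "-n" ]; then
        dry_run=1
        shift
    fi
    for commit in "$@"; do
        if [ "$dry_run" -eq 1 ]; then
            echo "git revert --no-edit $commit"
        elif ! git revert --no-edit "$commit"; then
            echo "revert of $commit failed; fix the conflicts or run: git revert --abort" >&2
            return 1
        fi
    done
}

# Dry run with the commit ids from this comment, in the stated order.
revert_in_order -n \
    eb8c68ef558e6cba241e7ada54f6b3427cb2bf68 \
    cfc9c8ed03e4d908f2388af8815f44c87b503aaf \
    6f19efc0a1ca08bc61841b971d8b85ab505d95c8
```

Dropping -n performs the reverts; the modules then need to be rebuilt and
tested only once all three reverts are in place.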
Comment 2 KATO Hiroshi 2014-04-10 15:44:01 UTC
commit eb8c68ef558e6cba241e7ada54f6b3427cb2bf68
Author: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Date:   Mon Jan 27 22:50:35 2014 -0500

I reverted this patch and rebuilt the module, but it doesn't seem to make any difference.


commit cfc9c8ed03e4d908f2388af8815f44c87b503aaf
Author: Lukasz Majewski <l.majewski@samsung.com>
Date:   Fri Dec 20 15:24:50 2013 +0100

I couldn't revert this patch because some hunks were rejected.


commit 6f19efc0a1ca08bc61841b971d8b85ab505d95c8
Author: Lukasz Majewski <l.majewski@samsung.com>
Date:   Fri Dec 20 15:24:49 2013 +0100

While building the modules, many errors occurred, so I didn't test this.


I apologize for bothering you.
Comment 3 Viresh Kumar 2014-04-11 03:40:27 UTC
I wanted you to revert all three and then test it :)

Okay, I have reverted them for you now, let me know if it works.
Otherwise you have to do a git bisect, we aren't left with many options :)

git://git.linaro.org/people/viresh.kumar/linux.git for-kato

--
viresh
Comment 4 Lukasz Majewski 2014-04-11 08:05:08 UTC
(In reply to KATO Hiroshi from comment #0)
> After I upgraded kernel to 3.14, acpi-cpufreq module cannot be loaded.
> Accordingly my laptop's cpufreq doesn't run anymore.
> Until kernel 3.13.8, acpi-cpufreq has been loaded automatically as expected.
> 
> If I try to load acpi-cpufreq manually on 3.14, 
> 
> # modprobe acpi-cpufreq
> modprobe: ERROR: could not insert 'acpi_cpufreq': No such device

Why does the "No such device" error show up? This is strange.

I suppose that the acpi_cpufreq.ko module was built correctly.

Could you share the list of modules to be installed/inserted?

Also, I've noticed that p4-clockmod is also inserted. Could you
compile it in and leave acpi-cpufreq to be inserted as a standalone module?

Thanks in advance.


Comment 5 KATO Hiroshi 2014-04-11 12:44:23 UTC
Created attachment 131921 [details]
inserted modules (3.13.8)
Comment 6 KATO Hiroshi 2014-04-11 12:44:50 UTC
Created attachment 131931 [details]
inserted modules (3.14)
Comment 7 KATO Hiroshi 2014-04-11 12:52:20 UTC
(In reply to Viresh Kumar from comment #3)

Thank you for your kindness. I built drivers/cpufreq with the reverts applied,
and unfortunately the problem is still not solved.

I also overwrote drivers/cpufreq with the one from 3.13.8, and the same problem happens.
So maybe cpufreq is not related to this.


(In reply to Lukasz Majewski from comment #4)

> Why the "No such device" error show up? This is strange.
> I suppose, that the acpi_cpufreq.ko module was built correctly.

I use an Arch Linux packaged kernel and have never heard of a similar problem
on the Arch Linux bug tracker, so I think the acpi_cpufreq.ko module was built correctly.

> Also, I've noticed, that the p4-clockmod is also inserted. Could you
> compile it in and left the acpi-cpufreq to be inserted as a standalone
> module?

While searching for a solution, I found a tweet from somebody who seems to have the same problem.
Although, unlike me, he uses a Core 2 Duo, he wrote that only p4-clockmod could be used, so I gave it a try.

inserted modules 3.13.8
https://bugzilla.kernel.org/attachment.cgi?id=131921
inserted modules 3.14
https://bugzilla.kernel.org/attachment.cgi?id=131931
Comment 8 Viresh Kumar 2014-04-11 14:07:32 UTC
Do git bisect please and let us know the offending commit.
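
Because every bisect step here needs a kernel build and a reboot, the good/bad
decision has to be made by hand after each boot. A minimal sketch of a helper
for that decision, assuming "good" means acpi-cpufreq is the active driver for
cpu0; the function name bisect_verdict and the v3.13/v3.14 endpoints are
assumptions, not something stated in this report:

```shell
#!/bin/sh
# Print "good" if acpi-cpufreq is the active cpufreq driver, "bad" otherwise.
# The optional argument overrides the sysfs CPU directory (useful for testing).
# If the module is not autoloaded, run "modprobe acpi-cpufreq" first.
bisect_verdict() {
    cpu_dir=${1:-/sys/devices/system/cpu/cpu0}
    driver_file="$cpu_dir/cpufreq/scaling_driver"
    if [ -r "$driver_file" ] && grep -q '^acpi-cpufreq$' "$driver_file"; then
        echo good
    else
        echo bad
    fi
}

# One-time setup in the kernel tree (endpoints are assumptions):
#   git bisect start
#   git bisect bad v3.14
#   git bisect good v3.13
# For each step: build, install, reboot into the test kernel, then run:
#   git bisect "$(bisect_verdict)"
# Once git names the first bad commit, finish with:
#   git bisect reset
bisect_verdict
```

The number of steps grows roughly with the base-2 logarithm of the number of
commits between the two endpoints, so a full release cycle takes on the order
of a dozen or more builds.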
Comment 9 KATO Hiroshi 2014-04-11 15:17:04 UTC
(In reply to Viresh Kumar from comment #8)
> Do git bisect please and let us know the offending commit.

I'll do what I can.


By the way, while glancing through dmesg, I found one important difference.

3.13.8

[    0.087346] acpiphp: Slot [1-1] registered
[    0.090033] Found 1 acpi root devices
[    0.140788] Switched to clocksource acpi_pm
[    1.856854] ACPI: acpi_idle registered with cpuidle
[    2.101747] thinkpad_acpi: ThinkPad ACPI Extras v0.25

3.14
[    0.083948] acpiphp: Slot [1-1] registered
[    0.139115] Switched to clocksource acpi_pm
[    2.266132] thinkpad_acpi: ThinkPad ACPI Extras v0.25

The 3.14 kernel does not find an ACPI root device.
I think this, rather than cpufreq, may be the cause of the problem.
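
A comparison like the one above can be done mechanically. The following is
only a sketch: it assumes both logs were saved to files with "dmesg > file",
and the function names acpi_lines and missing_acpi_lines as well as the file
names in the usage comment are hypothetical:

```shell
#!/bin/sh
# Print the ACPI-related messages of one dmesg log, without the leading
# "[   seconds]" timestamp, sorted and de-duplicated.
acpi_lines() {
    sed -E 's/^\[ *[0-9]+\.[0-9]+\] //' "$1" | grep -i 'acpi' | LC_ALL=C sort -u
}

# Print ACPI-related messages found in the first log but not in the second.
missing_acpi_lines() {
    old_tmp=$(mktemp) || return 1
    new_tmp=$(mktemp) || { rm -f "$old_tmp"; return 1; }
    acpi_lines "$1" > "$old_tmp"
    acpi_lines "$2" > "$new_tmp"
    # comm -23 keeps lines unique to the first (sorted) input.
    LC_ALL=C comm -23 "$old_tmp" "$new_tmp"
    rm -f "$old_tmp" "$new_tmp"
}

# Usage (hypothetical file names):
#   missing_acpi_lines dmesg-3.13.8.txt dmesg-3.14.txt
```

Stripping the timestamps first matters because the same message is logged at
slightly different times on each boot and would otherwise never compare equal.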
Comment 10 Stuart Foster 2014-04-16 08:57:28 UTC
Just found the same problem with a Thinkpad R51 (3.14.1 kernel). From my investigations, the problem was introduced prior to 3.14.0-rc1. How is the bisect going?
Comment 11 KATO Hiroshi 2014-04-16 12:39:30 UTC
It is making very slow progress. I didn't know how hard bisecting would be.
Comment 12 Stuart Foster 2014-04-17 08:13:41 UTC
My bisect attempt reports this as the source of the issue:

root@Mars:/work/linux-git# git bisect bad
b981513f806d26b2cc971eb65ced14bede67558b is the first bad commit
commit b981513f806d26b2cc971eb65ced14bede67558b
Author: Jiang Liu <jiang.liu@linux.intel.com>
Date:   Thu Jan 9 16:15:19 2014 +0800

    ACPI / scan: bail out early if failed to parse APIC ID for CPU
    
    Enhance ACPI CPU hotplug driver to print clear error message and
    bail out early if BIOS returns wrong value in ACPI MADT table or
    _MAT method. Otherwise it will add the CPU device even if failed
    to get APIC ID and fails any operations against sysfs interface:
    /sys/devices/system/cpu/cpux/online
    
    Signed-off-by: Jiang Liu <jiang.liu@linux.intel.com>
    Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

:040000 040000 beed2874da58eff342807a9bed397a9b3f23accb 07f57bb8eb502216ae49cce21293a191b594095d M	drivers
Comment 13 KATO Hiroshi 2014-04-17 15:07:30 UTC
Great job!

Now should I change the product category to ACPI, or report the cause of the regression to the author of this patch? As this is the first time I have submitted a bug report, I have no idea what to do next.
Comment 14 Stuart Foster 2014-04-17 15:24:21 UTC
I would let Viresh make that call as he asked for the bisect.
Comment 15 Lan Tianyu 2014-04-18 02:03:15 UTC
Could you provide the output of dmesg?
Comment 16 Stuart Foster 2014-04-18 08:14:46 UTC
Created attachment 132911 [details]
dmesg trace from last git bisect
Comment 17 KATO Hiroshi 2014-04-18 11:44:37 UTC
Created attachment 132931 [details]
dmesg_3.13.8_and_3.14.1.tar.gz
Comment 18 Lan Tianyu 2014-04-19 13:18:49 UTC
Created attachment 133031 [details]
debug.patch

Please try this patch.
Comment 19 Stuart Foster 2014-04-19 16:46:28 UTC
(In reply to Lan Tianyu from comment #18)
> Created attachment 133031 [details]
> debug.patch
> 
> Please try this patch.

On my Thinkpad R51 the patch fixes the problem (tested against 3.14.1).
Comment 20 KATO Hiroshi 2014-04-19 21:09:53 UTC
(In reply to Lan Tianyu from comment #18)
> Created attachment 133031 [details]
> debug.patch
> 
> Please try this patch.

Works fine now. Thank you for fixing the bug.
Comment 21 Cesare Leonardi 2014-04-30 06:30:13 UTC
What is the conclusion of this bug? It looks like the patch has not been applied in any released kernel, has it?
In the latest Debian kernel, based on 3.14.2, the bug is still there.
I'm affected, and my system is, like the bug reporter's, an old Pentium M: when I use this kernel my CPU fan is always on.

cat /proc/cpuinfo
processor	: 0
vendor_id	: GenuineIntel
cpu family	: 6
model		: 13
model name	: Intel(R) Pentium(R) M processor 1.60GHz
stepping	: 6
microcode	: 0x18
cpu MHz		: 600.000
cache size	: 2048 KB
fdiv_bug	: no
f00f_bug	: no
coma_bug	: no
fpu		: yes
fpu_exception	: yes
cpuid level	: 2
wp		: yes
flags		: fpu vme de pse tsc msr mce cx8 apic sep mtrr pge mca cmov clflush dts acpi mmx fxsr sse sse2 ss tm pbe bts est tm2
bogomips	: 1190.90
clflush size	: 64
cache_alignment	: 64
address sizes	: 32 bits physical, 32 bits virtual
power management:
Comment 22 Lan Tianyu 2014-04-30 06:33:44 UTC
Hi, I have just sent the fix patch upstream.
https://patchwork.kernel.org/patch/4090951/