Bug 10952
Summary: | "scaling_max_freq" constantly shifting between minimum and maximum frequency | ||
---|---|---|---|
Product: | ACPI | Reporter: | Viktor Kojouharov (vkojouharov) |
Component: | Power-Processor | Assignee: | ykzhao (yakui.zhao) |
Status: | REJECTED DOCUMENTED | ||
Severity: | normal | CC: | acpi-bugzilla |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.25.7 | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: |
The ver_linux script output. I've removed the nvidia module before reproducing, just in case
cpufreq.debug=7 and CONFIG_CPU_FREQ_DEBUG are set. governors are not set yet
more complete dump of the syslog. showing complete cycle from 2201MHz to 800 and back. Still no change from the default ondemand governor
cpu[01][ic]st files
dmesg output, after switching from ondemand to performance
dumps of thermal_zone during various system states |
Description
Viktor Kojouharov
2008-06-22 10:31:11 UTC
Created attachment 16578 [details]
The ver_linux script output. I've removed the nvidia module before reproducing, just in case
hmm, I think you can add a dump_stack in cpu_update_policy to see who is keeping on changing the scaling_max_freq? something like:

    ---
     drivers/cpufreq/cpufreq.c |    1 +
     1 file changed, 1 insertion(+)

    Index: linux-2.6/drivers/cpufreq/cpufreq.c
    ===================================================================
    --- linux-2.6.orig/drivers/cpufreq/cpufreq.c	2008-06-17 06:35:42.000000000 +0800
    +++ linux-2.6/drivers/cpufreq/cpufreq.c	2008-06-23 10:25:13.000000000 +0800
    @@ -1722,6 +1722,7 @@
     	if (unlikely(lock_policy_rwsem_write(cpu)))
     		return -EINVAL;
     
    +	dump_stack();
     	dprintk("updating policy for CPU %u\n", cpu);
     	memcpy(&policy, data, sizeof(struct cpufreq_policy));
     	policy.min = data->user_policy.min;

Will you please attach the output of acpidump?
Will you please enable the CONFIG_CPU_FREQ_DEBUG in kernel configuration and boot the system with the option of "cpufreq.debug=7"? After the system is booted, please change the cpufreq governor several times and attach the output of dmesg.
Thanks.

Does this make the changes to scaling_max_freq stop:

    # echo 1 > /sys/module/processor/parameters/ignore_ppc

(or boot with processor.ignore_ppc=1)

In addition to the acpidump requested above, please build with CONFIG_ACPI_DEBUG=y and attach the complete output from dmesg -s64000

Created attachment 16584 [details]
cpufreq.debug=7 and CONFIG_CPU_FREQ_DEBUG are set. governors are not set yet
Created attachment 16585 [details]
more complete dump of the syslog. showing complete cycle from 2201MHz to 800 and back. Still no change from the default ondemand governor
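
For anyone who wants to capture the same 2200 MHz to 800 MHz cycle without reading the syslog, the limit can also be sampled from sysfs. The following is only a minimal sketch: it assumes the standard cpufreq sysfs path for cpu0, and the helper names `find_transitions` and `poll_max_freq`, as well as the polling interval and sample count, are made up for illustration and are not part of any tool mentioned in this report.

```python
import time
from pathlib import Path

# Standard cpufreq sysfs location; cpu0 is chosen only for illustration.
MAX_FREQ_PATH = Path("/sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq")


def find_transitions(samples):
    """Return (index, old, new) for every sample that differs from the previous one."""
    return [
        (i, prev, cur)
        for i, (prev, cur) in enumerate(zip(samples, samples[1:]), start=1)
        if cur != prev
    ]


def poll_max_freq(path=MAX_FREQ_PATH, interval=1.0, count=60):
    """Read scaling_max_freq (reported in kHz) `count` times, `interval` seconds apart."""
    samples = []
    for _ in range(count):
        samples.append(int(path.read_text().strip()))
        time.sleep(interval)
    return samples
```

Calling `find_transitions(poll_max_freq())` on the affected machine should list each point at which the limit moved, which can then be lined up with the timestamps in the dmesg output.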
echo 1 > /sys/module/processor/parameters/ignore_ppc stops the scaling_max_freq jumping, though the system tends to totally freeze sometimes now

More info: setting ignore_ppc to 1 does not stop the problem from occurring. Under heavy load, the system decided to bring the cpu cores down from 2200 to 1200. I don't know whether the actual scaling_max_freq was changed, or whether just the scaling frequency was changed by the ondemand governor. Before I could see whether the scaling_max_freq was changed, the system froze :\
The core temp for both was around 60-65C, which is pretty normal for these cores in this laptop. I don't think it froze because of overheating.
Also, according to the string in 'processor_perflib.c', this option is used if the BIOS is the culprit. However, the cores were scaled correctly with 2.6.22, and no BIOS updates have been installed.

Will you please attach the following outputs?

    acpidump --addr 0x7fe6e4f2 --length 0x286 -o cpu0ist
    acpidump --addr 0x7fe6de88 --length 0x5e5 -o cpu0cst
    acpidump --addr 0x7fe6e778 --length 0xc4 -o cpu1ist
    acpidump --addr 0x7fe6e46d --length 0x85 -o cpu1cst

Will you please change the cpufreq governor from ondemand to performance and see whether the problem still exists? It would be best to boot the system with the option "cpufreq.debug=7" and attach the output of dmesg.
Thanks.

Created attachment 16677 [details]
cpu[01][ic]st files
Created attachment 16678 [details]
dmesg output, after switching from ondemand to performance
Changing the governor to performance doesn't help. 'scaling_max_freq' is still being changed, and performance just uses that value (from what I gather).
The dmesg output includes changing the governor to performance.
Hi, Viktor,
Thanks for the info. It seems that this issue is related to the BIOS. While the system is running, the BIOS often sends the notification event (0x80), which causes the OS to evaluate the _PPC object and get the new performance limit. The OS then uses the new limit to update the cpufreq policy, so scaling_max_freq changes whenever the performance limit changes. If the boot option "processor.ignore_ppc" is added, the OS won't update the cpufreq policy according to the change of the _PPC object, and scaling_max_freq stays normal.

From the acpidump it seems that there is no cooling device for when the temperature reaches certain thresholds. Only the following objects exist under the scope of the thermal zone: _CRT, _TMP. If the temperature returned by the _TMP object is greater than _CRT, the system will be shut down. Maybe this is related to the system freezes.

Anyway, will you please attach the following output?

    cat /proc/acpi/thermal_zone/THM/*

Thanks.

Created attachment 16722 [details]
dumps of thermal_zone during various system states
This is the output of the files in thermal_zone/THM/, without processor.ignore_ppc. Please let me know if I need to make another batch with that option turned on.
The starting_thermal file contains the state when the system first booted up. starting_minor_load_2200 is the state with little CPU load and the max freq set to 2200. The other two files are with max CPU load and with scaling_max_freq set to the given frequency.
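
The _PPC explanation given earlier in this thread (the BIOS sends notification 0x80, the OS re-evaluates _PPC, and the result caps the cpufreq policy) can be illustrated with a small model. This is a simplified sketch and not the kernel's actual code: the function name `effective_max_khz` and the P-state table are hypothetical, chosen only so that 2200, 1200 and 800 MHz appear as they do in this report. It relies on the ACPI convention that _PPC returns the index of the highest-performance state currently allowed, with 0 meaning no restriction.

```python
# Hypothetical P-state table in kHz, fastest first (index 0 = highest performance).
# The real table comes from the _PSS object of the affected machine.
PSTATES_KHZ = [2200000, 2000000, 1800000, 1200000, 800000]


def effective_max_khz(user_max_khz, pstates_khz, ppc, ignore_ppc=False):
    """Model the upper frequency limit after a 0x80 performance notification.

    ppc is the value returned by _PPC: the index of the fastest P-state the
    platform currently allows. With ignore_ppc set, the platform limit is not
    applied and only the user's own limit counts.
    """
    if ignore_ppc:
        return user_max_khz
    return min(user_max_khz, pstates_khz[ppc])


# The BIOS alternating between _PPC = 0 and _PPC = 4 makes the limit jump
# between the two extremes, as seen in scaling_max_freq.
limits = [effective_max_khz(2200000, PSTATES_KHZ, ppc) for ppc in (0, 4, 0)]
```

In this model `limits` alternates between the highest and lowest entries of the table, while passing `ignore_ppc=True` keeps the limit at the user's value regardless of what the BIOS reports.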
1. The "scaling_max_freq" change is caused by BIOS/hardware. IMO, it's not a Linux kernel bug, so I'm afraid we cannot change this behaviour.
2. no_ppc freezes the system. This is weird. Would you please try to boot with "idle=poll" and then "echo 1 > /sys/module/processor/parameters/ignore_ppc", and see if the system still freezes?

Turns out the unstable system was due to faulty hardware. So I will keep using ignore_ppc in the future. I guess this bug can be closed now. Are there any side effects to ignore_ppc?

Hi, Viktor,
Thanks for the quick response. Using "ignore_ppc" is harmless in theory. But you had better confirm that the temperature stays below the critical threshold under heavy load when the system is at max cpufreq (2200). If the temperature is still below the critical threshold, it is harmless.
As the issue is related to the BIOS and can be suppressed by ignore_ppc, the bug will be rejected and marked as "Documented".
Thanks.
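
The check recommended above (temperature versus critical threshold under heavy load) could be scripted roughly as follows. This is a sketch under assumptions: it presumes that the `temperature` and `trip_points` files under /proc/acpi/thermal_zone/THM/ contain lines of the form `temperature: 62 C` and `critical (S5): 105 C`, and both the helper names and the sample values in the usage lines are illustrative, not taken from the attachments.

```python
import re


def parse_celsius(text, label):
    """Return the integer Celsius value from the line that starts with `label`."""
    pattern = rf"^{re.escape(label)}[^:\n]*:\s*(\d+)\s*C\b"
    match = re.search(pattern, text, re.MULTILINE)
    if match is None:
        raise ValueError(f"no '{label}' line found")
    return int(match.group(1))


def headroom_celsius(temperature_text, trip_points_text):
    """Degrees remaining before the critical trip point; <= 0 means at or past it."""
    current = parse_celsius(temperature_text, "temperature")
    critical = parse_celsius(trip_points_text, "critical")
    return critical - current


# Illustrative input only; on a real system read the two proc files instead.
sample_temperature = "temperature:             62 C\n"
sample_trip_points = "critical (S5):           105 C\n"
margin = headroom_celsius(sample_temperature, sample_trip_points)
```

Running this under full load with ignore_ppc enabled and the CPU at 2200 MHz would show how much margin remains. Since this thermal zone exposes only _CRT and _TMP, a margin near zero means a critical shutdown is close and no intermediate cooling step will intervene first.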