Bug 8255
Summary: | Frequency Scaling not working properly using powernow-k7 | ||
---|---|---|---|
Product: | Power Management | Reporter: | Dustin Surawicz (bugzilla.20.dsurawicz) |
Component: | cpufreq | Assignee: | cpufreq (cpufreq) |
Status: | CLOSED CODE_FIX | ||
Severity: | normal | CC: | cimmo, kernel |
Priority: | P2 | ||
Hardware: | i386 | ||
OS: | Linux | ||
Kernel Version: | 2.6.19 and newer (up to 2.6.21-rc) | Subsystem: | |
Regression: | --- | Bisected commit-id: | |
Attachments: |
dmesg.log with debug output
kernel configuration I used for 2.26.21-rc3 to reproduce the problem
diagnostic patch
dmesg output after applying Daniel's patch
Output of acpidump
dmesg output after applying Daniel's patch
new patch
dmesg output after applying 2nd patch
modified DSDT
DSDT ASL diff
dmesg output after using fixed DSDT
new patch
hopefully final patch
dmesg output
dmesg from ubuntu gutsy daily 9 june 2007 |
Description
Dustin Surawicz
2007-03-24 11:11:58 UTC
Created attachment 10932 [details]
dmesg.log with debug output
Created attachment 10933 [details]
kernel configuration I used for 2.26.21-rc3 to reproduce the problem
This is a 2.6.19 regression (2.6.18 was OK) which has been reproduced as of 2.6.21-rc3.

Okay. I tracked down the problem with git. Here is the commit that causes the problem:

    solaris linux-git # git bisect good
    0916bd3ebb7cefdd0f432e8491abe24f4b5a101e is first bad commit
    commit 0916bd3ebb7cefdd0f432e8491abe24f4b5a101e
    Author: Dave Jones <davej@redhat.com>
    Date: Wed Nov 22 20:42:01 2006 -0500

        [PATCH] Correct bound checking from the value returned from _PPC method.

        processor_perflib.c::acpi_processor_ppc_notifier() check if the value
        returned by the processor's _PPC method is 0 and return failed if so.
        This is wrong since 0 indicate that the bios think the processor can go
        to the highest frequency.

        This patch for example fix the HP NX 6125 to allow its highest
        frequency to be available.

        Signed-off-by: Bruno Ducrot <ducrot@poupinou.org>
        Cc: "Pallipadi, Venkatesh" <venkatesh.pallipadi@intel.com>
        Signed-off-by: Dave Jones <davej@redhat.com>
        Signed-off-by: Linus Torvalds <torvalds@osdl.org>

    :040000 040000 d6696bf57e1a08b39e051bcf22c838723ccdf0bf a67b7741e5163c598cb3663c7997dc663516695e M drivers

Maybe someone can look into it.

I googled a bit and found that this regression is already known and a patch has been proposed:
http://www.mail-archive.com/linux-acpi@vger.kernel.org/msg04484.html
I don't know whether this is just a workaround or a fix. I do not know anything about kernel hacking...

I already suspected that exact issue, which is why I asked you to test gentoo-sources-2.6.19-r6 on the downstream bug, as that patch was merged into 2.6.19.3. So although it appears to be the same commit at fault, it must be a different issue.

Dave,

bugme-daemon@bugzilla.kernel.org wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=8255
> Okay. I tracked down the problem with git. Here is the commit that
> causes the problem:
>
> solaris linux-git # git bisect good
> 0916bd3ebb7cefdd0f432e8491abe24f4b5a101e is first bad commit
> commit 0916bd3ebb7cefdd0f432e8491abe24f4b5a101e
> Author: Dave Jones <davej@redhat.com>
> Date: Wed Nov 22 20:42:01 2006 -0500
>
> [PATCH] Correct bound checking from the value returned from _PPC method.

The above commit seems to be the culprit behind this 2.6.19 regression, which is still present as of 2.6.21-rc3 (so it's not the earlier issue found by Ingo). Any ideas?

Thanks,
Daniel

After analyzing the log, my guess is that there is a bug in the ACPI part of the powernow-k7 source. Here is what I found out from dmesg.log:

-------------snip--------------
powernow-k7: acpi: P0: 950 MHz 24000 mW 125 uS control 009c418d SGTC 10000
powernow-k7: FID: 0xd (9.5x [1266MHz]) VID: 0xc (1.400V)
powernow-k7: acpi: P1: 750 MHz 16337 mW 125 uS control 009c41c9 SGTC 10000
powernow-k7: FID: 0x9 (7.5x [1000MHz]) VID: 0xe (1.300V)
powernow-k7: acpi: P2: 700 MHz 15248 mW 125 uS control 009c41c8 SGTC 10000
powernow-k7: FID: 0x8 (7.0x [933MHz]) VID: 0xe (1.300V)
powernow-k7: acpi: P3: 600 MHz 12084 mW 125 uS control 009c4226 SGTC 10000
powernow-k7: FID: 0x6 (6.0x [800MHz]) VID: 0x11 (1.250V)
powernow-k7: acpi: P4: 500 MHz 9280 mW 125 uS control 009c4264 SGTC 10000
powernow-k7: FID: 0x4 (5.0x [666MHz]) VID: 0x13 (1.200V)
-------------snap--------------

As you can see, ACPI reports the available states of my CPU as 500, 600, 700, 750, and 950MHz. But the correct values are 666, 800, 933, 1000, and 1266MHz, as shown in the lines stating the FID and VID values. If one divides the frequencies on the FID lines by the corresponding ones detected by ACPI, e.g. 666/500, this always yields 1.33. Could this be somehow related to the FSB frequency?

powernow-k7: FSB: 133MHz

Furthermore, '# cat /proc/acpi/processor/CPU0/performance' gives the following output, regardless of whether I use 2.6.18 (before the above-mentioned patch was applied) or later:

state count: 5
active state: P0
states:
*P0: 950 MHz, 24000 mW, 125 uS
P1: 750 MHz, 16337 mW, 125 uS
P2: 700 MHz, 15248 mW, 125 uS
P3: 600 MHz, 12084 mW, 125 uS
P4: 500 MHz, 9280 mW, 125 uS

If I change the active state to P4 and back to P0, the CPU never speeds up to 1267MHz any more but stays at around 1000MHz. The exact speed depends on the kernel version:

2.6.18: 1000MHz
2.6.21: 950MHz

1000MHz is a valid speed according to the FID table; it is set because it is close to the invalid 950MHz. After applying the patch, the cpufreq_verify_within_limits call in processor_perflib.c is executed and, thus, I suspect that the frequency is compared to the table obtained by ACPI, where this is a valid frequency. Any comments?

This bug seems not to be of much interest... I would love to look myself but I am really a noob. Could someone at least give me a hint as to which piece of code is responsible for setting these:

state count: 5
active state: P0
states:
*P0: 950 MHz, 24000 mW, 125 uS
P1: 750 MHz, 16337 mW, 125 uS
P2: 700 MHz, 15248 mW, 125 uS
P3: 600 MHz, 12084 mW, 125 uS
P4: 500 MHz, 9280 mW, 125 uS

Thanks.

Created attachment 11279 [details]
diagnostic patch
I'm not knowledgeable about any of this stuff, but here's an attempt to further
clarify what's going on: Please apply this patch and attach new dmesg output.
I think one important part of the issue is that powernow-k7 can't look up the
frequency tables on its own (possibly broken BIOS), so it falls back on ACPI.
Also, sometimes the ACPI developers find acpidump output useful. You can emerge
pmtools to get this utility - I suggest you attach the output here.
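The 1.33 ratio noted in the description can be checked with a few lines of Python. This is only a sketch: the two lists are copied by hand from the dmesg excerpt in this report, and the names `acpi_mhz`, `fid_mhz` and `ratios` are made up for illustration.

```python
# P-state frequencies in MHz, copied from the dmesg excerpt in this report.
acpi_mhz = [950, 750, 700, 600, 500]    # from the "acpi: Pn: ..." lines
fid_mhz = [1266, 1000, 933, 800, 666]   # from the "FID: ... [nnnMHz]" lines


def ratios(fid, acpi):
    """Return the FID-derived / ACPI-reported frequency ratio per P-state."""
    return [f / a for f, a in zip(fid, acpi)]


if __name__ == "__main__":
    for state, ratio in enumerate(ratios(fid_mhz, acpi_mhz)):
        print(f"P{state}: {ratio:.3f}")
```

Every ratio comes out close to 133/100, which is what one would expect if the ACPI frequencies had been computed for a 100MHz FSB while the machine actually runs a 133MHz FSB.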
Okay, here comes the new dmesg output after applying the patch to the 2.6.21-gentoo kernel source. I enabled acpi debugging and added cpufreq.debug=7 to my kernel parameters. Output of acpidump to be attached as well.
Created attachment 11284 [details]
dmesg output after applying Daniel's patch
Created attachment 11285 [details]
Output of acpidump
Comment on attachment 11284 [details]
dmesg output after applying Daniel's patch
truncated output
Created attachment 11286 [details]
dmesg output after applying Daniel's patch
Created attachment 11287 [details]
new patch
It does look like your BIOS is broken. Before we go further, please check if
updates are available.
If not, try applying this patch. It may get things back on their feet, by
ignoring the broken field in the BIOS.
I checked the availability of a BIOS update. Unfortunately, I am running the newest revision, and since my laptop is not brand new any more, there won't be any. The BIOS is from Jan 2003... So I will try your patch. Should I apply this one in addition to the previous one or instead of it? Thanks for your efforts so far. Hope that this is fixed soon.

Apply it instead of the first one.

Created attachment 11357 [details]
dmesg output after applying 2nd patch
The patch did not solve the problem; it even got worse. The CPU is now running
at just 800MHz. The available frequencies in the log look quite weird: the
slowest frequency is 633MHz, and the other entries state 800MHz.
OK, your BIOS is broken so we shouldn't go further down that route. I'll hopefully find some time to dig into why acpi_perflib isn't obeying the 133MHz FSB value. I suspect this may be another BIOS bug on your system...

Created attachment 11364 [details]
modified DSDT
Created attachment 11365 [details]
DSDT ASL diff
for reference
(the above diff is accidentally reversed)

Dustin,

First reverse the above patches so that you're working from clean kernel sources. I have generated a custom ACPI BIOS for you. The one stored in the system has those incorrect frequencies. Download the modified DSDT and compile it with:

    # iasl -tc DSDT.dsl

You'll then get a "DSDT.hex" file in the same directory. Now, in the kernel config, enable CONFIG_ACPI_CUSTOM_DSDT and provide the path to the .hex file. Then recompile the kernel, boot into the new one, and see if things have improved.

Managed to compile the DSDT.dsl file. But I can't find any CONFIG_ACPI_CUSTOM_DSDT in menuconfig (2.6.21-gentoo). Grepping for it in the .config gives no match either... Can I just add it to the .config file and, if yes, where?

Never mind. Found how to get the option :)

Created attachment 11366 [details]
dmesg output after using fixed DSDT
Bingo! This seems to have fixed it. I attach dmesg output in case there is
still something suspicious.
Thank you so much, Daniel, for your effort!!!
Created attachment 11367 [details]
new patch
Please now apply this patch to a clean kernel (no custom DSDT, no previous
patches applied) and see what happens...
Patch works. CPU runs at highest frequency now. Thanks.

Can I see dmesg output from using it?

The modified DSDT I gave you didn't quite work correctly. If you look in the logs:

    freq-table: table entry 0: 1266768 kHz, 3085 index
    [...]
    cpufreq-core: setting new policy for CPU 0: 666720 - 1266768 kHz
    freq-table: request for verification of policy (666720 - 1266768 kHz) for cpu 0
    freq-table: verification lead to (666720 - 1266768 kHz) for cpu 0
    freq-table: request for verification of policy (666720 - 1266000 kHz) for cpu 0
    freq-table: verification lead to (666720 - 1266000 kHz) for cpu 0
    cpufreq-core: new min and max freqs are 666720 - 1266000 kHz
    [...]
    performance: setting to 1266000 kHz because of event 1
    cpufreq-core: target for CPU 0: 1266000 kHz, relation 1
    freq-table: request for target 1266000 kHz (relation: 1) for cpu 0
    freq-table: target is 1 (1000080 kHz, 3593)

In other words, it's not actually reaching the maximum frequency due to rounding between MHz/kHz values. I'm pretty sure the last patch I posted has the same bug. If so, please apply the updated version I'll post in a few mins.

Also, there's something else going on here. From the logs:

    Detected 1333.446 MHz processor.

and:

    userspace: managing cpu 0 started (666720 - 1266000 kHz, currently 1333440 kHz)

And I suspect /proc/cpuinfo will reflect that if you don't build cpufreq support. That aside, I don't think that's a new issue (I think 2.6.18 also behaved the same way for you), so if we can get it going at 1266MHz again I think this bug is fixed (and you're welcome to file a new bug for the 1333MHz issue).

Created attachment 11373 [details]
hopefully final patch
This should fix the rounding problem (but not the 1333MHz one). It replaces all
previous patches/DSDTs. Please post dmesg output from using this patch.
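To illustrate the rounding problem discussed in this thread, here is a small Python model of what the quoted log appears to show. It is a simplification under stated assumptions, not the kernel's lookup code: the helper names are hypothetical, the table contains only the three frequencies visible in the quoted log, and "relation 1" is taken to mean "highest table frequency at or below the target".

```python
# Frequencies in kHz that appear in the debug log quoted in this thread.
TABLE_KHZ = [1266768, 1000080, 666720]


def mhz_truncated_limit(khz):
    """Model a limit carried in whole MHz and then scaled back to kHz."""
    return (khz // 1000) * 1000


def highest_at_or_below(table_khz, target_khz):
    """Pick the highest table frequency that does not exceed the target.

    Assumed behaviour of "relation 1" in the log; falls back to the lowest
    table entry if nothing fits.
    """
    candidates = [f for f in table_khz if f <= target_khz]
    return max(candidates) if candidates else min(table_khz)


if __name__ == "__main__":
    limit = mhz_truncated_limit(1266768)
    print("policy max:", limit, "kHz")
    print("selected:  ", highest_at_or_below(TABLE_KHZ, limit), "kHz")
```

With the limit truncated to 1266000 kHz, the top table entry (1266768 kHz) no longer fits under it, so the next entry down (1000080 kHz) is chosen, which matches the "target is 1 (1000080 kHz, 3593)" line in the log.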
Created attachment 11376 [details]
dmesg output
You are correct, Daniel. I have now applied your latest patch and the frequency
table should now be correct.
Concerning the discrepancy between 'Detected 1333.446 MHz processor' and the max
frequency of 1266768kHz: Could this be related to my broken BIOS, and do I
then need to correct the DSDT, or is this a more general problem? In the latter
case, I would file another bug.
BR,
Dustin
I recompiled the patched kernel without frequency scaling and, of course, the CPU is then running at 1333MHz. Should I file a separate bug?

Here's how it works for you at the moment: powernow-k7 can't find any powernow performance tables that match your CPU. Instead, it falls back on the performance tables in the ACPI BIOS (DSDT) to figure out the available frequencies. It turns out that those tables aren't much good either. Taking the first entry:

    Package (0x06) { 0x03B6, 0x5DC0, 0x7D, 0x7D, 0x009C418D, 0x018D },

The first value (0x03B6) is 950 in decimal, and this represents the CPU frequency for this performance state. The FID and VID of the performance state are encoded in the 5th value (0x009C418D). The FID can be looked up in a table in the powernow-k7 driver to deduce a multiplier, 9.5 in this case. Multiply 9.5 by the FSB and you get the frequency.

So, it looks like your ACPI tables were written for a system with a 100MHz FSB (9.5 * 100 = 950, which is consistent with the first value). However, your FSB is 133MHz.

We have been working on the earlier assumption that the multiplier encoded by the FID code is correct, whereas the frequency (in the first field) is incorrect, e.g. we decided that "9.5 * 133 = 1266MHz" is the right calculation. This isn't an unfair assumption to make, and this is how the powernow driver works anyway (it uses ACPI-supplied-multiplier * FSB, not ACPI-supplied-frequency), but the reality is that it looks like both the FID (multiplier) *and* frequency values in the ACPI performance state tables are wrong. To reach maximum frequency there would have to be a performance state with a FID that encodes a multiplier of 10.0 (10 * 133 = 1333MHz).

At this point we're beyond the level where this could be fixed in the kernel (since all of the data sources are wrong!). If you know what the allowed frequencies were *supposed* to be, I could help provide a new DSDT. Or, we could even just add another performance state with a 10.0 multiplier. However, I'm not sure how safe it is to mess with stuff like this...

I understand what you mean. I am using an AMD1500+ processor, not sure whether it is a mobile one. Is it possible to get the needed information for it so that I can use a corrected DSDT locally? I could open my laptop to see the exact label on it if this helps.

One comment: If I write a value smaller than f_max to /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq and want to revert afterwards to f_max, the CPU frequency will not be set to the correct value due to the kHz/MHz rounding issue.

I'm not sure what you mean. Can you expand on that, perhaps with annotated debug logs?

I am quite busy right now and on vacation the next three weeks. After I am back, I will give you the desired information.

Hi, I have the same problem here with an Asus L3D notebook that previously had an Athlon XP-M 2000+ (1600MHz), which was recognized perfectly. Now it has an Athlon XP-M 2400+ (1800MHz); when the kernel starts it is recognized correctly, but afterwards powernow reports min and max = 1473MHz. I've tried the latest Ubuntu gutsy snapshot, which has kernel 2.6.22-6 (in theory based on 2.6.22-rc2, but I'm not quite sure), so I don't know whether it has this patch included or not. Could my problem be related? I attach dmesg. Thanks

Created attachment 11719 [details]
dmesg from ubuntu gutsy daily 9 june 2007
Please reproduce this on an unpatched 2.6.22-rc4 kernel. Compile it with cpufreq debugging support enabled and boot with cpufreq.debug=3, then attach new logs if the bug still exists.

Marking fixed as this patch is upstream. If there are still issues, please open a new bug with the requested info and put me on CC. |
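For reference, the decoding of a performance-state entry described in this thread can be sketched in Python. This is a hedged illustration, not the driver's code. The bit layout of the control value (FID in bits 0-4, VID in bits 5-9, SGTC in the 20 bits above that) is an assumption inferred from the dmesg lines in this report, the multiplier table holds only the FID codes visible in that log, and the FSB value of 133344 kHz is derived from the logged 1266768 kHz at a 9.5x multiplier. All names are hypothetical.

```python
def decode_control(control):
    """Split an ACPI control value into (fid, vid, sgtc).

    Assumed layout, inferred from the dmesg output in this report.
    """
    fid = control & 0x1F
    vid = (control >> 5) & 0x1F
    sgtc = (control >> 10) & 0xFFFFF
    return fid, vid, sgtc


# FID code -> multiplier times ten, only for codes shown in the log
# (0xd = 9.5x, 0x9 = 7.5x, 0x8 = 7.0x, 0x6 = 6.0x, 0x4 = 5.0x).
FID_MULT_X10 = {0x0D: 95, 0x09: 75, 0x08: 70, 0x06: 60, 0x04: 50}

# Implied by the log: 1266768 kHz / 9.5 (an assumption, not a measured value).
FSB_KHZ = 133344


def state_khz(control, fsb_khz=FSB_KHZ):
    """Frequency of a P-state as multiplier * FSB, in kHz."""
    fid, _vid, _sgtc = decode_control(control)
    return FID_MULT_X10[fid] * fsb_khz // 10


if __name__ == "__main__":
    print(decode_control(0x009C418D))
    print(state_khz(0x009C418D), "kHz at the real FSB")
    print(state_khz(0x009C418D, fsb_khz=100000), "kHz at a 100 MHz FSB")
```

Under these assumptions the first entry decodes to FID 0xd, VID 0xc and SGTC 10000, as in the dmesg output. With a 100 MHz FSB the 9.5x multiplier gives the 950 MHz stored in the table (0x03B6), while the same FSB value with a 10.0x multiplier gives 1333440 kHz, the "currently 1333440 kHz" figure seen in the log.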