Bug 14771
Summary: | "ondemand" never raises frequency if smaller power supply is used (60w vs. 90w) -- Dell E6500 | ||
---|---|---|---|
Product: | Power Management | Reporter: | Mike Frysinger (vapier) |
Component: | cpufreq | Assignee: | cpufreq |
Status: | CLOSED INVALID | ||
Severity: | normal | CC: | lenb, michael, pasky, pasky, robert.bradbury, rui.zhang, trenn, udknight, venki, vyncere |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 2.6.32 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
annotated dmesg output with cpufreq debugging enabled
linux 2.6.32 config
output of `acpidump`
tar of each SSDT region |
Description
Mike Frysinger
2009-12-09 01:21:24 UTC
Created attachment 24104 [details]
annotated dmesg output with cpufreq debugging enabled
Created attachment 24105 [details]
linux 2.6.32 config
these are the settings i use by default
Is the default up_threshold 95? What does top show you for CPU utilization when you run your workload?

yes, the default is the default (95)

my summary mentioned the top output:
(and top does show 4 burnP6's with each using 50% of the system)

Can you get output of
# cat /proc/timer_list | grep sleeptime; sleep 10; cat /proc/timer_list | grep sleeptime
while you are running the workload. This is the idle info that ondemand uses. This should give an idea whether the problem is with stats or ondemand itself.

i ran the burnP6's and then did:
# while sleep 1; do grep sleeptime /proc/timer_list ; done
the output never changed:
.idle_sleeptime : 4247762995088 nsecs
.idle_sleeptime : 4251748680094 nsecs

Maybe you're facing the same issue as I did about two years ago? Look here:
http://bugzilla.kernel.org/show_bug.cgi?id=9488
The first thing you can try is the "ignore_PPC" module parameter. In my case it was a buggy BIOS (some wrong ACPI tables), and the final solution for me was to add the following boot option:
acpi_osi="!Windows 2006"
Hope that helps,
Michael

not sure how i missed that in my original bug search for the issue, but i'll give it a try. the bug was closed indicating that there should have been a workaround in the kernel now ? at any rate, i'll go through it and see if the things in there make a difference; thanks.

when i said "A18" bios in the summary, i meant it to mean ive tried the latest bios that Dell currently offers; it was needed to fix other throttling problems that existed in earlier versions of the ACPI tables -- see "throttlegate" as mentioned on slashdot for lots-o-details.

i'll attach my acpi dumps since i've never screwed around with these things before, so i dont really know how to read and/or fix them myself

Created attachment 24115 [details]
output of `acpidump`
Created attachment 24116 [details]
tar of each SSDT region
seems that `acpidump` skips duplicated named regions ? my dmesg lists these SSDT regions:
ACPI: SSDT 00000000df4502eb 0066C (v01 PmRef CpuPm 00003000 INTL 20050624)
ACPI: SSDT 00000000df450957 002C3 (v01 PmRef BspIst 00003000 INTL 20050624)
ACPI: SSDT 00000000df450df1 005C6 (v01 PmRef BspCst 00003001 INTL 20050624)
ACPI: SSDT 00000000df450c1a 001D7 (v01 PmRef ApIst 00003000 INTL 20050624)
ACPI: SSDT 00000000df4513b7 0008D (v01 PmRef ApCst 00003000 INTL 20050624)
but `acpidump` only included the first one. so here's a tar of manual runs of `acpidump` of each SSDT region.
going by that report and my dmesg, it seems i have "weird" P states too:
acpi-cpufreq: *P0: 3068 MHz, 35000 mW, 10 uS
acpi-cpufreq: P1: 3067 MHz, 35000 mW, 10 uS
acpi-cpufreq: P2: 2134 MHz, 16314 mW, 10 uS
acpi-cpufreq: P3: 1600 MHz, 15000 mW, 10 uS
acpi-cpufreq: P4: 800 MHz, 12000 mW, 10 uS

unfortunately, rebooting with "processor.ignore_ppc=1" added to my cmdline didnt make a difference. after switching to the ondemand governor, the freq did not go above 800 MHz (same behavior is observed as documented in the summary).
# cat /sys/module/processor/parameters/ignore_ppc
1
# cat /proc/cmdline
root=/dev/sda3 panic=3 quiet processor.ignore_ppc=1

seems to have gotten worse between 2.6.32.2 and 2.6.32.7. with my system, i have it switch to "powersave" when the power is unplugged (switching to battery) and then back to ondemand or performance when plugged back in. but now even when switched back to "performance", the cpu never goes above 800 MHz. i have to reboot in order to get 3.07GHz back.

I have the same problem with i7 920; currently using 2.6.32.8, but 2.6.27 is already having trouble; all the symptoms from bug description are the same. For me, even on 2.6.32.8, switching cpufreq governor from ondemand to performance seems to fix the issue fine, but I haven't had to do it many times yet. Also, I don't have this problem on Core2 Duo Q9300 machines. HyperThreading turned on/off does not make a difference for the i7 machine.

I have now also found one i7 machine (supposedly identical) where I seem *NOT* to have this problem.
diff -u broken_dmesg working_dmesg shows:
@@ -117,6 +71,14 @@
 PM: Adding info for acpi:LNXCPU:05
 PM: Adding info for acpi:LNXCPU:06
 PM: Adding info for acpi:LNXCPU:07
+PM: Adding info for acpi:LNXCPU:08
+PM: Adding info for acpi:LNXCPU:09
+PM: Adding info for acpi:LNXCPU:0a
+PM: Adding info for acpi:LNXCPU:0b
+PM: Adding info for acpi:LNXCPU:0c
+PM: Adding info for acpi:LNXCPU:0d
+PM: Adding info for acpi:LNXCPU:0e
+PM: Adding info for acpi:LNXCPU:0f
 PM: Adding info for acpi:LNXSYBUS:00
 PM: Adding info for acpi:PNP0C0E:00
 PM: Adding info for acpi:PNP0A08:00
@@ -177,6 +139,7 @@
 PM: Adding info for acpi:device:25
 PM: Adding info for acpi:device:26
 PM: Adding info for acpi:device:27
+PM: Adding info for acpi:PNP0103:00
 PM: Adding info for acpi:pnp0c14:00
 PM: Adding info for acpi:LNXTHERM:00
 PM: Adding info for acpi:LNXPWRBN:00
@@ -216,7 +179,6 @@
 pci 0000:00:1d.7: reg 10 32bit mmio: [0xd0321000-0xd03213ff]
 pci 0000:00:1d.7: PME# supported from D0 D3hot D3cold
 pci 0000:00:1d.7: PME# disabled
-pci 0000:00:1f.0: Force enabled HPET at 0xfed00000
 pci 0000:00:1f.0: quirk: region 0400-047f claimed by ICH6 ACPI/GPIO/TCO
 pci 0000:00:1f.0: quirk: region 0500-053f claimed by ICH6 GPIO
 pci 0000:00:1f.0: ICH7 LPC Generic IO decode 1 PIO at 0680 (mask 007f)
@@ -320,8 +282,6 @@
 libata version 3.00 loaded.
 PCI: Using ACPI for IRQ routing
 PM: Adding info for No Bus:lo
-hpet clockevent registered
-HPET: 4 timers in total, 0 timers will be used for per-cpu timer
 hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0, 0
 hpet0: 4 comparators, 64-bit 14.318180 MHz counter
 Switching to clocksource tsc

Finally, some hwinfo excerpts:

broken machine #1
BIOS Info: #5
Vendor: "Intel Corp."
Version: "SOX5810J.86A.2127.2008.0914.1638"
Date: "09/14/2008"
Board Info: #7
Manufacturer: "Intel Corporation"
Product: "DX58SO"
Version: "AAE29331-501"

broken machine #2
BIOS Info: #5
Vendor: "Intel Corp."
Version: "SOX5810J.86A.3435.2009.0210.2311"
Date: "02/10/2009"
Board Info: #7
Manufacturer: "Intel Corporation"
Product: "DX58SO"
Version: "AAE29331-503"

working machine #1
BIOS Info: #5
Vendor: "Intel Corp."
Version: "SOX5810J.86A.4405.2009.1020.1419"
Date: "10/20/2009"
Board Info: #7
Manufacturer: "Intel Corporation"
Product: "DX58SO"
Version: "AAE29331-701"

I will try to rebuild with cpufreq debugging enabled now, and try to upgrade BIOS.

my last comment may have been inaccurate. it's hard to correlate what changes the behavior such that the scaling gov goes low and stays there. sometimes it recovers during runtime (battery charges or something?), sometimes it doesnt and i get impatient with a dual core 800MHz POS that i reboot to force the issue.

i did notice that the policy seems to lower itself and never increase:
# grep . cpu?/cpufreq/scaling_max_freq
cpu0/cpufreq/scaling_max_freq:800000
cpu1/cpufreq/scaling_max_freq:800000

writing to this manually or using `cpufreq --max` doesnt make a difference
# grep . cpu?/cpufreq/scaling_available_frequencies
cpu0/cpufreq/scaling_available_frequencies:3068000 3067000 2134000 1600000 800000
cpu1/cpufreq/scaling_available_frequencies:3068000 3067000 2134000 1600000 800000
# echo 3068000 > cpu0/cpufreq/scaling_max_freq
# echo 3068000 > cpu1/cpufreq/scaling_max_freq
# grep . cpu?/cpufreq/scaling_max_freq
cpu0/cpufreq/scaling_max_freq:800000
cpu1/cpufreq/scaling_max_freq:800000
# cpufreq-info | grep policy
current policy: frequency should be within 800 MHz and 800 MHz.
current policy: frequency should be within 800 MHz and 800 MHz.

i noticed that it seems to happen a lot at my office and not so much at home. then i realized i have a larger power supply at home (~90W) than at work (~60W). so i brought them both in and when i plug in my smaller one, cpufreq stays stuck at 800MHz max. as soon as i plug in the other one, it goes back to the normal 3.07GHz. swapping them shows consistent behavior.
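
The symptom described above (a policy maximum that sits below the highest advertised frequency and ignores writes) can be spotted with a short script. This is only a sketch: the function name `report_clamped_cpus` and its optional root-directory argument are made up for illustration, and only the sysfs file names are taken from the transcript above. It reads the files and changes nothing.

```shell
#!/bin/sh
# Hypothetical helper: list CPUs whose scaling_max_freq is below the highest
# entry in scaling_available_frequencies. Values in these files are in kHz.
report_clamped_cpus() {
    # Optional argument: sysfs CPU root (defaults to the usual location).
    root=${1:-/sys/devices/system/cpu}
    for dir in "$root"/cpu[0-9]*/cpufreq; do
        # Skip CPUs without a cpufreq policy (also covers an unmatched glob).
        [ -r "$dir/scaling_max_freq" ] || continue
        [ -r "$dir/scaling_available_frequencies" ] || continue
        cpu=${dir%/cpufreq}
        cpu=${cpu##*/}
        max=$(cat "$dir/scaling_max_freq")
        # Split the space-separated list into lines and take the largest value.
        top=$(tr -s ' ' '\n' < "$dir/scaling_available_frequencies" \
              | grep -v '^$' | sort -n | tail -n 1)
        if [ "$max" -lt "$top" ]; then
            echo "$cpu: clamped to $max kHz (highest available: $top kHz)"
        else
            echo "$cpu: ok ($max kHz)"
        fi
    done
}
```

Calling `report_clamped_cpus` with no argument inspects the running system. A clamped result only says that something limits the policy; it does not say whether the limit comes from the BIOS or from userspace.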
Without re-reading the whole thread: There are generally two possibilities how BIOS can tell OS to limit freq: ...
Ok, I've better written it down for further referencing and placed it here, eventually it's going to help you (oh dear, I have to wrap up the lines...):
ftp://ftp.suse.com/pub/people/trenn/ACPI_BIOS_on_Linux_guide/Frequency_is_limited_how_to.txt

Reading your power supply experiences, I could imagine you see two phenomena:
- BIOS tells the OS to limit frequency because power supply requirements are not met -> max_frequency gets limited
- If processor.ignore_ppc=1 is passed OS frequency switching might get ignored by HW and HW still limits frequency behind OS back -> max_frequency should still be high then, but you might see "out of sync" messages with cpufreq_debug and cpuinfo_cur_freq won't raise.

Anyway, this looks like a HW and not a kernel issue and should get closed, right?
Hm, I cannot modify the bug's status...

i did try processor.ignore_ppc=1 before and it didnt work. just tried again with latest version (2.6.33.x) and it seems to work now. i can switch governors and the max frequency is no longer forced to 800mhz.

whether the issue is in my ACPI tables, i did post full dumps of them, but i dont know enough about ACPI to say that's the problem absolutely.

if Petr finds the command line option works for him too, i'll close the issue ...

Mike, I looked at the output of your acpidump. It looks as if there is no "_PCT" section in your ACPI Bios which means that the acpi-cpufreq module may not be able to function to scale CPU speeds. In that case one needs to use and control speeds using the ondemand governor. The problem is that the "ondemand" governor has effectively been broken since circa Linux 2.6.30. See Gentoo Bug #287463 esp. comments #20 & #21. Either your core 2 CPU doesn't support "enhanced" Intel SpeedStep or Dell stuck you with an older ACPI BIOS which was not designed to work with "enhanced" CPUs.
This can be dealt with by "reverting" p4-clockmod.c to the pre-2.6.30 state (p4-clockmod.changes attachment to Bug #287463) or by upgrading your ACPI BIOS DSDT table to a more modern variant which includes the _PCT section compatible with your CPU. Linux allows one to configure a user defined DSDT file but I don't have any examples of what one should look like for "Enhanced" SpeedStep CPUs to see if it would work with my old Pentium IV Prescott CPU. If you are going to attempt to use acpi-cpufreq then you really want to enable ACPI debugging to see in greater detail what is going on with the ACPI BIOS.

> Mike, I looked at the output of your acpidump. It looks as if there is no
> "_PCT" section in your ACPI Bios
Robert: I expect you are on the wrong track.
Mike's acpidump does not include any cpufreq ACPI info because the SSDT tables are most likely loaded at runtime, which acpidump cannot collect (e.g. you do not see a _PSS func either, but this info probably got retrieved (compare with the output in the description); also, acpi-cpufreq would not load at all in this case).
You can retrieve the missing tables by:
acpidump --addr 0xDF450957 --len 0x2C3 >IST0.dat
acpidump --addr 0xDF450C1A --len 0x1D7 >IST1.dat
acpidump --addr 0xDF450DF1 --len 0x5C6 >CST0.dat
acpidump --addr 0xDF4513B7 --len 0x8D >CST1.dat
iasl -d [IC]ST[01].dat
will then disassemble them.
You find the addresses and lengths by:
grep -i load *
SSDT.dsl: Load (IST0, HI0)
SSDT.dsl: Load (CST0, HC0)
SSDT.dsl: Load (CST1, HC1)
SSDT.dsl: Load (IST1, HI1)
and looking up the address/length in the SSDT object in the SSDT, which the Load parameters above point to:
Name (SSDT, Package (0x0C)
{
    "BspIst ",
    0xDF450957,
    0x000002C3,
    "ApIst ",
    0xDF450C1A,
    0x000001D7,
    "BspCst ",
    0xDF450DF1,
    0x000005C6,
    "ApCst ",
    0xDF4513B7,
    0x0000008D
})
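
The same address/length pairs also appear in the "ACPI: SSDT" lines of the dmesg output quoted earlier, so the acpidump command lines can be generated rather than typed by hand. A minimal sketch, assuming that dmesg line format; the function name `ssdt_dump_cmds` and the `<table-id>.dat` output names are illustrative choices, and the `--addr`/`--len` options are simply the ones used in the commands above. It only prints the commands and does not run acpidump.

```shell
#!/bin/sh
# Hypothetical helper: read dmesg text on stdin and print one acpidump
# command per "ACPI: SSDT <address> <length> (vNN <oem> <table-id> ...)" line.
ssdt_dump_cmds() {
    # Drop everything up to and including "ACPI: SSDT" (e.g. a timestamp),
    # then split the remainder into fields.
    sed -n 's/.*ACPI: SSDT *//p' |
    while read -r addr len _rev _oem name _rest; do
        # Re-print address and length as hex without leading zeros.
        printf 'acpidump --addr 0x%X --len 0x%X > %s.dat\n' \
            "$((0x$addr))" "$((0x$len))" "$name"
    done
}
```

Typical use would be `dmesg | ssdt_dump_cmds`, reviewing the printed lines before running any of them as root.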
Checking shared_type might reveal something. Intel processors can show quite nondeterministic behavior if it's HW_ALL (Jean Delvare showed me some quite weird behavior where frequency followed cpufreq settings in general (not like here), but individual cores were randomly set up or down).
A quick check would be whether these differ:
/sys/devices/system/cpu/cpu0/cpufreq/{affected_cpus,related_cpus}
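
That comparison can be run for every CPU at once. A small sketch; the function name `check_shared_cpus` and its optional root-directory argument are assumptions for illustration, and only the two sysfs file names come from the comment:

```shell
#!/bin/sh
# Hypothetical helper: for each CPU with a cpufreq policy, report whether
# affected_cpus and related_cpus contain the same list.
check_shared_cpus() {
    # Optional argument: sysfs CPU root (defaults to the usual location).
    root=${1:-/sys/devices/system/cpu}
    for dir in "$root"/cpu[0-9]*/cpufreq; do
        # Skip CPUs that lack either file (also covers an unmatched glob).
        [ -r "$dir/affected_cpus" ] && [ -r "$dir/related_cpus" ] || continue
        cpu=${dir%/cpufreq}
        cpu=${cpu##*/}
        if [ "$(cat "$dir/affected_cpus")" = "$(cat "$dir/related_cpus")" ]; then
            echo "$cpu: same"
        else
            echo "$cpu: differ"
        fi
    done
}
```

A "differ" line is only a hint that cores are coordinated by hardware; it does not show the shared type itself.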
A more intrusive one is to check in the _PSD.
IMO it would be best to export the shared type readable to userspace, because it's hard to retrieve, and in the HW_ALL case the current CPU frequency of cores could be rather confusing; I expect it will cause further grief and bug reports in the future. Possibly this could be exported by creating /sys/../cpu0/cpufreq/shared_type, which would only be filled by acpi-cpufreq.
You could also get msr-tools package/tool and try to set frequencies yourself (best without acpi-cpufreq loaded):
rdmsr 0x198 shows the current freq.
rdmsr 0x199 shows the current ctl register.
Be careful, only change the lower 0-15 bits and write your wanted frequency in there. There is one bit (32) that modifies Turbo Mode behavior.
This should do it (untested):
wrmsr 0x199 $((`rdmsr 0x199` + $DESIRED_FREQ))
DESIRED_FREQ can be a value from scaling_available_frequencies.

i did manually dump those regions. that is the tarball in my 4th attachment.

note that when i'm plugged in to my 90W power supply, the ondemand gov works fine. it also seems to work fine when running on battery.

Thomas, I would agree that I could very easily be on the wrong track as Mike is working with a Core 2 processor/BIOS (which I strongly suspect should support "Enhanced" SpeedStep and the associated ACPI BIOS presuming Dell provides that). That machine is probably of the order of ~2 years old. I on the other hand am working with a 4-5 year old Pentium IV Prescott which does not support "Enhanced" SpeedStep in a HP Pavilion with an ACPI BIOS which does not support processor scaling (_PCT) (presumably because the Desktop processor itself does not allow it). This is evidenced by the fact that when the proper Linux CONFIG & diagnostic flags are set, I do get: "ACPI-based processor performance control unavailable" and the associated ENODEV.

But the *default* Linux messages (i.e. those available without jumping through hoops) do not make it clear when (a) one's processor WILL or WILL NOT support Enhanced SpeedStep (which could in theory be controlled by the ACPI BIOS); and (b) one's ACPI BIOS does or does not have the capability of controlling the processor in that way (i.e. one can effectively use acpi-cpufreq vs. p4-clockmod). These are two different entities.
Just because one has the processor doesn't mean you have the BIOS and one could envision situations where one has the BIOS features without a processor that can use them.

All of my messages to various forums (and the lack of response) seem to suggest that one cannot get effective "awareness" of one's clock speed (and power consumption) when using acpi-cpufreq unless one has an ACPI 4.0 BIOS and there doesn't appear to be any user level facility (like the Gnome CPU Frequency Scaling Monitor) which could monitor these statistics and report them to the user. The key point is whether or not the user knows when and why his computer may be running amok (from a program/CPU use perspective). Sure I can run "top" but that doesn't give me the focus that the CFSM provides. A 4 panel-window System Monitor (CPU+MEM+Network+Disk) and the CFSM provides a very good real time monitor/diagnostic for what one's system is doing and the source of any problems.

> note that when i'm plugged in to my 90W power supply, the ondemand gov works
> fine.
Ah yes..., this really isn't a kernel bug.
Can someone with enough privileges (the reporter?) close it.
You should also modify the title, so that others get a quick idea that this is about power supply and whether their problem is related...

> I on the other hand am working with a 4-5 year old Pentium IV Prescott
I expect it doesn't support SpeedStep at all (without Enhanced you only got 2 steps), but I am not entirely sure. Theoretically you could read up the Intel docs whether it's supported at all. If so, you could also try some rudimentary things in userspace; in this case it's IO and not MSR driven. But instead of wasting the time to finally find out that Pentium IV wasted power like hell and there is not much you can do about that, you better invest some bucks into a new cheap board and a new processor...

Hi all,
My personal experience may help too. I had the same problem with my Thinkpad T410 (Core i5 520 M).
With kernel 2.6.36, I had to set the "Performance" Profile in my BIOS, for AC/DC power and Battery mode, instead of "Power-saving / Balanced / etc.". After that, the bios_limit managed to reach the max value, 2.40 GHz instead of 1.20GHz, and cpufreq managed to do its job.

But it's not the only thing. This observation was made with my battery plugged on my laptop, with AC/DC power. Another day, without battery, I was very surprised when I saw that the bios_limit was still stuck to 1.20GHz (!). I think it's a BIOS related problem, very similar to the previous power supply case. But fortunately, "processor.ignore_ppc=1" boot parameter does perfectly the job. (Phew !!!)

BTW, for me, the problems with ondemand governor boiled down to software issues (modulo kernel upgrades throughout 2.6.32.x, this seemed to resolve some most basic issues). First, a lot of my load was niced and I was not aware of /sys/devices/system/cpu/cpu$i/cpufreq/ondemand/ignore_nice_load. Setting this resolved my problems in part. This setting is being reset time by time anyway, but that is likely to be some evil desktop thing trying to meddle with stuff.

Bug closed according to comment #23. |