Most recent kernel where this bug did not occur: 2.6.12
Distribution: vanilla (Gentoo)
Hardware Environment: Dell Inspiron 9100, Intel Pentium 4 3.2GHz w/HT, stepping 9
Software Environment: gcc version 3.3.5-20050130 (Gentoo 3.3.5.20050130-r1, ssp-3.3.5.20050130-1, pie-8.7.7.1)

Problem Description:
Processes take on the order of 10 times longer to run when both cpus are online and max_cstate is greater than 1.

Steps to reproduce:
Boot with hyperthreading enabled (and see how long it takes to run init scripts).

Loading the processor module with max_cstate=1 or taking one cpu offline results in the processes running as expected. (Unloading the processor module and reloading it with a different max_cstate resulted in

dswload-0304: *** Error looking up _CST in namespace: AE_ALREADY_EXISTS

and a system hang such that MagicSysRq was ineffective, but

echo 1 >| /sys/module/processor/parameters/max_cstate

works fine.)

When max_cstate >= 2, processes still report 100% cpu.

% cat /proc/acpi/processor/CPU0/power /proc/acpi/processor/CPU1/power
active state: C1
max_cstate: C8
bus master activity: ffffffe9
states:
   *C1: type[C1] promotion[C2] demotion[--] latency[001] usage[00245832]
    C2: type[C2] promotion[C3] demotion[C1] latency[050] usage[00181279]
    C3: type[C3] promotion[--] demotion[C2] latency[050] usage[00000000]
active state: C1
max_cstate: C8
bus master activity: 0200801d
states:
   *C1: type[C1] promotion[C2] demotion[--] latency[001] usage[00115864]
    C2: type[C2] promotion[C3] demotion[C1] latency[050] usage[00469656]
    C3: type[C3] promotion[--] demotion[C2] latency[050] usage[00000000]

Using PREEMPT_VOLUNTARY or PREEMPT_NONE instead of PREEMPT makes the behaviour more erratic: sometimes jobs seem to run as expected, but sometimes they take twice as long, so the average time is similar.

Using HZ=250 instead of 1000 seems to make processes take even longer.

There is a thin continuous sound of about 2kHz (I guess) during inactivity with HZ=1000.
With HZ=250 the sound is either gone or blends in with the fans. This sound is the same one that occurs with maxcpus=1 and hyperthreading enabled since at least Linux 2.6.11. (With hyperthreading disabled in the BIOS or max_cstate=1 the sound is not present.)

With maxcpus=1 and HT enabled:

% cat /proc/acpi/processor/CPU0/power
active state: C2
max_cstate: C8
bus master activity: ffdffffd
states:
    C1: type[C1] promotion[C2] demotion[--] latency[001] usage[00055200]
   *C2: type[C2] promotion[C3] demotion[C1] latency[050] usage[00338121]
    C3: type[C3] promotion[--] demotion[C2] latency[050] usage[00000000]

With HT disabled:

active state: C2
max_cstate: C8
bus master activity: 00000000
states:
    C1: type[C1] promotion[C2] demotion[--] latency[001] usage[00000010]
   *C2: type[C2] promotion[--] demotion[C1] latency[001] usage[00229907]

The other notable difference in the logs is a new message that occurs several times:

acpi_bus-0212 [-12] acpi_bus_set_power : Device is not power manageable
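To compare the per-CPU figures in the report above more easily, the /proc/acpi/processor/CPUn/power text can be parsed into structured records. The following is only a minimal sketch: the function names `parse_power` and `usage_share` are made up for illustration, and the regular expression assumes the line layout shown in this report rather than any documented format.

```python
import re

# Matches one C-state line, e.g.
#   *C1: type[C1] promotion[C2] demotion[--] latency[001] usage[00245832]
# The layout is assumed from the output pasted in this report.
STATE_RE = re.compile(
    r"(\*?)(C\d+):\s*type\[(C\d+)\]\s*promotion\[([^\]]+)\]\s*"
    r"demotion\[([^\]]+)\]\s*latency\[(\d+)\]\s*usage\[(\d+)\]"
)


def parse_power(text):
    """Return one dict per C-state line found in `text` (hypothetical helper)."""
    return [
        {
            "active": star == "*",   # '*' marks the currently active state
            "name": name,
            "type": ctype,
            "promotion": promo,
            "demotion": demo,
            "latency": int(latency),
            "usage": int(usage),
        }
        for star, name, ctype, promo, demo, latency, usage in STATE_RE.findall(text)
    ]


def usage_share(states):
    """Fraction of the summed usage counters attributed to each state."""
    total = sum(s["usage"] for s in states)
    return {s["name"]: (s["usage"] / total if total else 0.0) for s in states}


# CPU1 figures copied from the report (HT enabled, both cpus online).
CPU1_SAMPLE = """\
active state: C1
max_cstate: C8
bus master activity: 0200801d
states:
*C1: type[C1] promotion[C2] demotion[--] latency[001] usage[00115864]
C2: type[C2] promotion[C3] demotion[C1] latency[050] usage[00469656]
C3: type[C3] promotion[--] demotion[C2] latency[050] usage[00000000]
"""

if __name__ == "__main__":
    for name, share in usage_share(parse_power(CPU1_SAMPLE)).items():
        print(f"{name}: {share:.1%}")
```

Run against the CPU1 sample, this shows that most of the recorded usage on the second logical cpu falls on C2, which is the state later comments identify as the suspect.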
I don't trust the C-state latencies advertised by this BIOS. Please verify that you're running the latest advertised BIOS revision and then attach the output from acpidump, available in pmtools here: http://ftp.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/
Created attachment 5838 [details] output from acpidump

The BIOS revision I have is A06 (03/08/2005), which is the latest available from Dell. (http://support.ap.dell.com/apjsite/Downloads/format.aspx?releaseid=R95766)
This particular issue (unloading the processor module and reloading it with a different max_cstate resulted in "dswload-0304: *** Error looking up _CST in namespace: AE_ALREADY_EXISTS") is a BIOS issue where it tries to load the same module twice. It should be fixed in the recent acpi patchset with the duplicate SSDT load fix.

But the main issue with the C2 and C3 states is still unclear to me, and I will need some more info. Can you please provide the output of

# acpidump --addr 0x3ffefe54 --length 0x1db

and

# acpidump --addr 0x3ffefdf8 --length 0x5c
Created attachment 5929 [details] acpidump --addr 0x3ffefe54 --length 0x1db
Created attachment 5930 [details] acpidump --addr 0x3ffefdf8 --length 0x5c

Thanks for the info on dswload-0304 and for looking into the C-states. Hope this is helpful.
There may still be more than one bug here. Looking at the ACPI disassembly, the BIOS is doing some tricky things with CST here.
1) The CSTs for CPU 0 and CPU 1 are different. CPU 1 always has only C1.
2) CPU 0 has either only C1; or C1, C2; or C1, C2, C3; or C1, C2, C3, C4, depending on some configuration (things like whether HT is on or off).

The disassembly explains some things:
1) Why you are seeing a different number of states when HT is enabled/disabled.

But there are a lot of unexplained things here:
1) Why both CPUs have 3 C-states when the BIOS has only one entry in the CPU 1 CST. This seems like an OS bug.
2) Why there is a C2 latency of 1 when HT is disabled. Probably a BIOS bug. Bad CST.
3) I am still not able to map latency 50 to any entry in the BIOS's CST. Probably another bug somewhere.

So, when you boot with HT enabled and maxcpus=1, do you still see the slowdown? When you boot with HT disabled in the BIOS, do you still see the slowdown?
Created attachment 5936 [details] CPU 0 CST disassembly

Attaching the CST disassembly for reference.
Created attachment 5937 [details] CPU 1 CST disassembly
I am not seeing a slowdown either with HT enabled and maxcpus=1 (when the sound is present) or with HT disabled in the BIOS (when there is no sound). In both cases max_cstate was not specified, and so was 8 according to /proc/acpi/processor/CPU0/power.
Created attachment 5956 [details] _CST debug patch

OK. That means the slowness is mostly coming from the second CPU trying to go to the C2 state, while the BIOS says it can only go to C1. When you have only one CPU enabled, either with HT disabled in the BIOS or with maxcpus=1, everything seems OK (though the latencies advertised by the BIOS are suspect there).

Can you please try the patch on a 2.6.13 kernel and boot with both CPUs? That should print a lot of messages in dmesg (you may need to increase the dmesg log_buf size in order to capture the whole output). Then send in the complete dmesg output. Thanks.
Created attachment 5963 [details] _CST scan debug output (bzip2 compressed)

The whole dmesg output is included for completeness, but processor was loaded last, so its output is at the end. The kernel was vanilla 2.6.13 + acpi-20050815-2.6.13.diff + _CST debug patch. HT was enabled and the only kernel parameter was log_buf_len, so both cpus and all c-states were available.

(With acpi-20050815-2.6.13.diff the message "dswload-0304: *** Error looking up _CST in namespace: AE_ALREADY_EXISTS" is no longer present, but the kernel still hangs after unloading and reloading processor, although MagicSysRq enabled a reboot.)
Created attachment 5964 [details] kernel .config

Just in case this is useful.
I don't know how c-states work, but I wonder whether it is even possible to change the c-state on only one of a pair of sibling logical cpus. If there is a process running on CPU 1 (which seems to only have one c-state), wouldn't the process be affected by sending CPU 0 to C2?
I think I have root-caused this one. This is what is happening:
- We try to get the supported C-states from _CST.
- We only find 1 C-state in there.
- We fall back to the P_LVL/P_BLK way of determining the number of C-states.
- That says C2 and C3 are supported.
- And we go ahead and use C2 and C3 on both processors.

The only problem with all of the above is a single bit, P_LVL2_UP, in the Fixed Feature Flags of the FADT. That bit says whether C2 is supported only on UP systems. On this system that bit is set, and the Linux kernel today happily ignores it. As a result, when one processor requests C2 or C3, it looks like both of them go idle, which hurts performance.

This problem was unmasked only in 2.6.13, because before that we used to disable C2 and C3 on all SMP systems. Now we enable them, as some SMP systems do support C2 and C3.

I will provide a patch for this one soon (tomorrow). Once the patch is there, can you please test it on your system and make sure that it works correctly? Thanks.
After talking with Len, we have identified 2 bugs here:
1) Linux should not use C-states based on P_LVL when _CST is present.
2) Linux should look at the P_LVL2_UP flag in the FADT when using P_LVL-based C-states on an SMP system.

Below are the patches (1 for each bug above). Please apply both of them, rebuild the kernel, and verify that the problem is solved. The expected result: when HT is enabled and 2 CPUs are running, only C1 should be used. Thanks.
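The intended policy behind the two fixes can be sketched roughly as follows. This is an illustration in Python, not the kernel code from the attached patches: the function `usable_cstates` and its arguments are hypothetical, C-states are modelled as plain strings, and the bit position of P_LVL2_UP (bit 3 of the FADT fixed feature flags) is taken from the ACPI specification.

```python
# P_LVL2_UP: FADT fixed feature flag, bit 3. When set, C2 is only
# supported on uniprocessor systems.
P_LVL2_UP = 1 << 3


def usable_cstates(cst_states, p_lvl_states, fadt_flags, num_cpus):
    """Pick the C-states a CPU may use (hypothetical sketch of the policy).

    cst_states   -- states listed by the CPU's _CST, or None/empty if there
                    is no valid _CST
    p_lvl_states -- states derived from the P_BLK/P_LVL registers
    fadt_flags   -- FADT fixed feature flags as an integer
    num_cpus     -- number of online (logical) CPUs
    """
    # Fix 1: a valid _CST is authoritative; do not add P_LVL states to it.
    if cst_states:
        return list(cst_states)

    # Fix 2: when falling back to P_LVL on an SMP system, honour P_LVL2_UP
    # and stay in C1 if the flag says C2 is uniprocessor-only.
    if num_cpus > 1 and (fadt_flags & P_LVL2_UP):
        return [s for s in p_lvl_states if s == "C1"]

    return list(p_lvl_states)


if __name__ == "__main__":
    # _CST with a single entry, two logical CPUs: only C1 should be used.
    print(usable_cstates(["C1"], ["C1", "C2", "C3"], 0, 2))
```

Under these assumptions, a CPU whose _CST lists only C1 never picks up C2 or C3 from the P_LVL fallback, which matches the expected result stated above.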
Created attachment 6034 [details] Don't use P_LVL when there is a valid _CST
Created attachment 6035 [details] Watchout for P_LVL2_UP flag in fadt, before using C2 and beyond on SMP systems
The "Don't use P_LVL when there is a valid _CST" patch seems to work as intended. With HT (and both cpus) enabled only one c-state is available (with 0 latency?), and there was no slowdown.

active state: C1
max_cstate: C8
bus master activity: 00000000
states:
   *C1: type[C1] promotion[--] demotion[--] latency[000] usage[00134855]

With HT disabled 2 c-states were available as before.
To test the "Watchout for P_LVL2_UP..." patch, I used the nocst=1 processor module parameter. The behaviour reverted to the same as without the patches: the same 3 c-states were available (with HT) with the same latencies and similar usages, and the slowdown was back.

The patches (both of them) were applied to Linux 2.6.13.1. (acpi-20050815-2.6.13.diff was not used as some hunks were rejected on application.) CONFIG_HOTPLUG_CPU was enabled.

With kernel parameters acpi_dbg_level=0x1f acpi_dbg_layer=0x01000002, the output from processor was:

acpi_processor-0476 [06] acpi_processor_get_inf: Bus mastering arbitration control present
acpi_processor-0527 [06] acpi_processor_get_inf: Processor [0:0]
acpi_processor-0192 [07] acpi_processor_get_thr: pblk_address[0x000010e0] duty_offset[1] duty_width[3]
acpi_processor-0241 [07] acpi_processor_get_thr: Found 8 throttling states
acpi_processor-0098 [08] acpi_processor_get_thr: Throttling state is T0 (0% throttling applied)
acpi_processor-0560 [08] acpi_processor_get_pow: lvl2[0x000010e4] lvl3 [0x000010e5]
ACPI: CPU0 (power states: C1[C1] C2[C2] C3[C3])
ACPI: Processor [CPU0] (supports 8 throttling states)
acpi_processor-0476 [06] acpi_processor_get_inf: Bus mastering arbitration control present
acpi_processor-0527 [06] acpi_processor_get_inf: Processor [1:1]
acpi_processor-0192 [07] acpi_processor_get_thr: pblk_address[0x000010e0] duty_offset[1] duty_width[3]
acpi_processor-0241 [07] acpi_processor_get_thr: Found 8 throttling states
acpi_processor-0098 [08] acpi_processor_get_thr: Throttling state is T0 (0% throttling applied)
acpi_processor-0560 [12] acpi_processor_get_pow: lvl2[0x000010e4] lvl3 [0x000010e5]
ACPI: CPU1 (power states: C1[C1] C2[C2] C3[C3])
ACPI: Processor [CPU1] (supports 8 throttling states)

I don't know what the BIOS's P_LVL2_UP value is. Is it contained in the acpidump output here or are there other arguments to extract it?
Despite the _CST reporting that there is only one C-state, it seems that C2 does have an effect but would need to be entered only when both processors were idle. Whether or not the power savings are worth the effort I don't know.
Looks like I read the FADT wrongly before. The BIOS LVL2_UP value is '0' here. This is what I see in FADT/FACP:

Flags: 0x32203235

LVL2_UP is bit 3.

Anyway, both of these patches are required for Linux, and patch 1 solves the problem here. So I will send both patches to Len.
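The reading of the Flags value quoted above can be checked with a one-line bit test. This is just a small sketch; the helper name `flag_bit` is made up for illustration, and the only inputs are the Flags value and the bit position given in this comment.

```python
P_LVL2_UP_BIT = 3          # bit position of LVL2_UP, as stated above
FADT_FLAGS = 0x32203235    # Flags value quoted from FADT/FACP above


def flag_bit(flags, bit):
    """Return the value (0 or 1) of bit number `bit` in `flags`."""
    return (flags >> bit) & 1


if __name__ == "__main__":
    # The low byte is 0x35 = 0b00110101, so bit 3 is clear.
    print("LVL2_UP =", flag_bit(FADT_FLAGS, P_LVL2_UP_BIT))
```

With the flag clear, the P_LVL2_UP check alone would not restrict C2 on this machine, which is consistent with the nocst=1 test result reported earlier in the thread.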
applied patches in comment #16 and comment #17 to acpi test tree
The patches in comments 16 & 17 fix the "slow-down" on my Dell Inspiron 9100 laptop (P4 3 GHz HT). However, it seems to screw up suspend and resume. I'm talking about the normal kernel suspend / resume, no extra patches or utilities. Should I open a new bug or add more info here? PS. I suspend, as root, by doing: # echo shutdown > /sys/power/disk; echo disk > /sys/power/state
suspend2 (2.2-rc7) is working well for my I9100. I'm using the patch in Comment #16 which is the one that is required. I can't remember but I'm probably not using the patch in Comment #17 as it didn't do anything for the I9100. I think there may have been an issue when suspending with processor module loaded and trying to resume from a kernel without the module. So build in processor and thermal because processor at least can't be unloaded successfully anyway. You'll probably want hibernate-script to unload some other problem modules (ndiswrapper, button, battery) and don't include any cpufreq stuff in the kernel config (even as modules). http://www.suspend2.net/downloads/
*** Bug 5432 has been marked as a duplicate of this bug. ***
For the record, I just tried (vanilla) 2.6.14_rc4, and I still get this "slowdown".

> I'm using the patch in Comment #16 which is the one that is required.
> I can't remember but I'm probably not using the patch in Comment #17 as it
> didn't do anything for the I9100.

I tried gentoo-2.6.13-gentoo-r5 with only the comment #16 patch, and I got kernel oopses.

> I think there may have been an issue when suspending with processor module
> loaded and trying to resume from a kernel without the module. So build in
> processor and thermal because processor at least can't be unloaded
> successfully anyway.

I have processor and thermal built-in. I used almost exactly the same options as with my working suspend on 2.6.12.x, so I would assume it's 2.6.13 (or the patch) that's breaking the suspend... for me!

> You'll probably want hibernate-script to unload some other problem modules
> (ndiswrapper, button, battery) and don't include any cpufreq stuff in the
> kernel config (even as modules).

I made my own script to remove some modules - all I had to do was unload ndiswrapper and b44 (wired network), and 2.6.12 suspended ok, but not 2.6.13...

So, what now?
Downstream bug report: http://bugs.gentoo.org/110661
The slowness issue and the suspend-resume issue seem to be unrelated at this point. I don't see how the changes for the slowness issue affect suspend-resume in any way. The patches for the slowness issue are in Len's acpi test tree now, so they should get into mm and the base soon.

For the suspend-resume issue, it would be great if you could open another bug. Was it working fine on, say, 2.6.12? Was it working fine before this patch (even though the whole system was slow)? How exactly does it fail? Does it fail during suspend or during resume? Does it hang, oops, or panic?
Just installed vanilla-sources 2.6.15_rc1 (on gentoo) and the "slowness" is still there. The good news is the two patches still fix the issue. (Also, I can now cleanly compile and load ndiswrapper, which I haven't been able to do for all of 2.6.14 :) Now I just have to wait for suspend2 patches for 2.6.15 and I'll be a happy chappy again ;)
Created attachment 6707 [details] Watchout for P_LVL2_UP flag in fadt, before using C2 and beyond on SMP systems

Venki's patch with some typos fixed. Venki, please look at it.
Created attachment 6742 [details] p_LVL2_UP flag: increament against 2.6.15-rc3-mm1
Created attachment 6774 [details] incremental patch (#3) from David Shaohua Li vs 2.6.15-rc5

patches in comment #16 and comment #17 shipped in linux-2.6.15-rc5

the refreshed 3rd patch attached from Shaohua & Venki applied to acpi-test tree
Hi, I didn't understand: is it fixed in 2.6.15-rc5? Because I have exactly the same problem (P4 HT), but upgrading to 2.6.15-rc5 didn't solve it. I didn't apply any patches; should I? Thanks
OK, it works if I apply the patch from comment #31. I should have tried first and asked after. Sorry! Marco
Shipped in 2.6.16-rc1-git6 -- closing.