Bug 20722
Summary: intel_idle boot hang if C-states disabled in BIOS, NO_HZ=n CONFIG_HIGH_RES_TIMERS=n - Xeon X5550 - IBM System x3650 M2
Product: Power Management
Reporter: Marc Aurele La France (tsi)
Component: intel_idle
Assignee: power-management_intel_idle (power-management_intel_idle)
Status: CLOSED PATCH_ALREADY_AVAILABLE
Severity: normal
CC: lenb, maciej.rutecki, rui.zhang, tsi, yakui.zhao
Priority: P1
Hardware: x86-64
OS: Linux
Kernel Version: 2.6.35
Subsystem:
Regression: No
Bisected commit-id:
Attachments:
  acpidump output (identical in all cases)
  /proc/cpuinfo (identical in all cases modulo speed & bogomips)
  lspci output (identical in all cases)
  grep output (2.6.36-rc8 INTEL_IDLE=n)
  grep output (2.6.36-rc8 INTEL_IDLE=y max_cstate=1)
  dmesg (2.6.36-rc8 INTEL_IDLE=y max_cstat=1)
  dmesg (2.6.36-rc8 INTEL_IDLE=y max_cstate=1)
  .config for 2.6.36-rc8 INTEL_IDLE=n
  requested dmesg (processor.max_cstate=7)
  .config for comment #15
  C-state disabled MSRs
  C-state enabled MSRs
  3.6.37 .config diff
Description
Marc Aurele La France
2010-10-18 20:07:27 UTC
On Sat, 16 Oct 2010, Len Brown wrote:
> > > On the Xeon's, 2.6.35 hangs early on, upon the first test of trace events
> > > (in kernel/trace/trace_events.c:event_trace_self_tests()). When
> > > disabling all tracing, debugging, etc., it still hangs but slightly
> > > later. The megaraid_sas module is loaded, detects the adapter, but never
> > > gets around to registering it with the SCSI layer.
> > This is due to "CONFIG_INTEL_IDLE=y".
> Please file a bug report at bugzilla.kernel.org and assign it to me.
> Please reproduce using an upstream 2.6.36-rc8 kernel.
> Boot a CONFIG_INTEL_IDLE=n kernel and to the bug report...
> attach the output from acpidump
> 'cat /proc/cpuinfo'
> 'grep . /sys/devices/system/cpu/cpu*/cpuidle/*/*'
> 'lspci'
> Then boot a CONFIG_INTEL_IDLE=y kernel and see what is the highest N that
> boots when you boot with "intel_idle.max_cstate=N" (0 will disable
> the driver completely) and if any of them boot, for the highest N,
> attach to the bug report the complete dmesg and the output from
> 'grep . /sys/devices/system/cpu/cpu*/cpuidle/*/*'
Created attachment 33992 [details]
acpidump output (identical in all cases)
Created attachment 34002 [details]
/proc/cpuinfo (identical in all cases modulo speed & bogomips)
Created attachment 34012 [details]
lspci output (identical in all cases)
Created attachment 34022 [details]
grep output (2.6.36-rc8 INTEL_IDLE=n)
Created attachment 34032 [details]
grep output (2.6.36-rc8 INTEL_IDLE=y max_cstate=1)
Created attachment 34042 [details]
dmesg (2.6.36-rc8 INTEL_IDLE=y max_cstat=1)
2.6.36-rc8 INTEL_IDLE=y max_cstate=2 hangs as described above
Created attachment 34052 [details]
dmesg (2.6.36-rc8 INTEL_IDLE=y max_cstate=1)
*** Bug 20002 has been marked as a duplicate of this bug. ***

re: comment #5
the ACPI baseline case with CONFIG_INTEL_IDLE=n ...
> grep: /sys/devices/system/cpu/cpu*/cpuidle/*/*: No such file or directory
please grep CONFIG_ACPI_PROCESSOR .config
if it is =m, try 'modprobe processor' or try =y.
what do you see with:
cat /proc/acpi/processor/*/power

Re: comment #9 -- dmesg
> cpuidle: using governor ladder
Hmmm, haven't used that since we went tickless a few years ago.
please attach the .config
in particular, what is CONFIG_NO_HZ?
If it is =n, please try =y and be sure that CPU_IDLE_GOV_MENU is enabled

Created attachment 34172 [details]
.config for 2.6.36-rc8 INTEL_IDLE=n
This config originally hails from Red Hat's 2.6.9 modified kernel. It has been incrementally `make oldconfig`'ed over the years. So I'm not surprised it isn't tickless.
Anyway, a 2.6.36-rc8 kernel with INTEL_IDLE, ACPI_PROCESSOR, NO_HZ and IDLE_GOV_MENU all set to "y" runs fine regardless of max_cstates [0-7]. So, it seems INTEL_IDLE simply needs another Kconfig dependency.
As for /proc/acpi/processor/*/power, a commit you signed off on removes them.
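The option combinations discussed in the comments above (INTEL_IDLE, ACPI_PROCESSOR, NO_HZ, CPU_IDLE_GOV_MENU, plus HIGH_RES_TIMERS from the bug summary) can be read back from a kernel .config before each test boot. The following is only a minimal sketch: the function name show_idle_config is hypothetical, and the path to the .config is whatever your build tree uses.

```shell
#!/bin/sh
# show_idle_config is a hypothetical helper name, not part of any kernel tool.
# Usage: show_idle_config /path/to/.config
show_idle_config() {
    config_file="$1"
    for opt in CONFIG_INTEL_IDLE CONFIG_ACPI_PROCESSOR CONFIG_NO_HZ \
               CONFIG_HIGH_RES_TIMERS CONFIG_CPU_IDLE_GOV_MENU; do
        # Enabled options appear as "CONFIG_FOO=y" (or "=m"); disabled ones
        # appear as "# CONFIG_FOO is not set" or are absent, reported as "n".
        value=$(sed -n "s/^${opt}=//p" "$config_file")
        printf '%s=%s\n' "$opt" "${value:-n}"
    done
}
```

Recording this output next to each dmesg makes it easier to see which option actually changed between a kernel that boots and one that hangs.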
intel_idle does not depend on ACPI_PROCESSOR, NO_HZ, or IDLE_GOV_MENU. When I delete them from my working system, it still boots.

Unfortunately, I've not been able to reproduce your boot hang using your .config, as I've failed to convince it to mount root on my Fedora 13 test box.

Please try ACPI mode, like so:
CONFIG_ACPI_PROCESSOR=y
CONFIG_NO_HZ=n
CONFIG_INTEL_IDLE=n
Attach the full dmesg and output from
'grep . /sys/devices/system/cpu/cpu*/cpuidle/*/*'
If acpi_idle fails to boot, try "processor.max_cstate=1" and increase until it fails.

Created attachment 34222 [details]
requested dmesg (processor.max_cstate=7)

This config (re: comment #14) runs fine regardless of processor.max_cstate [0-7]. There are no /sys/devices/system/cpu/cpu*/cpuidle/*/*.

please attach the .config run in comment #15 and show the output from 'turbostat -v sleep 20'
turbostat is available here:
http://www.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/pmtools-latest/turbostat/turbostat.c

Created attachment 34282 [details]
.config for comment #15

After `modprobe msr`, turbostat gives...
CPUID GenuineIntel 11 levels family:model:stepping 0x6:1a:5 (6:26:5)
12 * 133 = 1600 MHz max efficiency
20 * 133 = 2667 MHz TSC frequency
22 * 133 = 2933 MHz max turbo 4 active cores
22 * 133 = 2933 MHz max turbo 3 active cores
23 * 133 = 3067 MHz max turbo 2 active cores
23 * 133 = 3067 MHz max turbo 1 active cores
pkg core CPU    %c0    GHz    TSC    %c1    %c3    %c6   %pc3   %pc6
               0.23   2.05   2.67  99.77   0.00   0.00   0.00   0.00
  0    0   0   0.37   2.25   2.67  99.63   0.00   0.00   0.00   0.00
  0    1   1   0.21   2.00   2.67  99.79   0.00   0.00   0.00   0.00
  0    2   2   0.13   1.63   2.67  99.87   0.00   0.00   0.00   0.00
  0    3   3   0.41   2.20   2.67  99.59   0.00   0.00   0.00   0.00
  1    0   4   0.27   2.19   2.67  99.73   0.00   0.00   0.00   0.00
  1    1   5   0.21   1.94   2.67  99.79   0.00   0.00   0.00   0.00
  1    2   6   0.13   1.69   2.67  99.87   0.00   0.00   0.00   0.00
  1    3   7   0.10   1.62   2.67  99.90   0.00   0.00   0.00   0.00
20.010713 sec

Hmmm, the 2.6.36 kernel has CONFIG_ACPI_PROCESSOR=y, yet turbostat shows that it fails to enter any C-states deeper than C1. How about if you boot it with "processor.nocst=1"?

please confirm that acpi_idle probed by showing
'grep . /sys/devices/system/cpu/cpuidle/*'
Assuming it did, then for some reason it didn't actually register any C-states with cpuidle, which would explain the missing states in /sys/devices/system/cpu/cpu*/cpuidle/*/*

What if you boot a 2.6.35 CONFIG_ACPI_PROCESSOR=y kernel -- what do you see in /proc/acpi/processor/*/power? (if processor.nocst=1 worked above, try it with and without here also)

Please show the turbostat output for the INTEL_IDLE=y, NO_HZ=y, IDLE_GOV_MENU=y kernel to see if it is getting into deep C-states.

That kernel will fail to boot if you boot with "nohz=off", right? Does it boot when "nolapic_timer" is added to the cmdline?

Finally, please identify the motherboard. Are the BIOS SETUP options at their defaults WRT power management options?

(In reply to comment #18)
> Hmmm, the 2.6.36 kernel has CONFIG_ACPI_PROCESSOR=y,
> yet turbostat shows that it fails to enter any C-states deeper than C1.
> How about if you boot it with "processor.nocst=1"?
Just about the same as before
CPUID GenuineIntel 11 levels family:model:stepping 0x6:1a:5 (6:26:5)
12 * 133 = 1600 MHz max efficiency
20 * 133 = 2667 MHz TSC frequency
22 * 133 = 2933 MHz max turbo 4 active cores
22 * 133 = 2933 MHz max turbo 3 active cores
23 * 133 = 3067 MHz max turbo 2 active cores
23 * 133 = 3067 MHz max turbo 1 active cores
pkg core CPU    %c0    GHz    TSC    %c1    %c3    %c6   %pc3   %pc6
               0.25   2.07   2.67  99.75   0.00   0.00   0.00   0.00
  0    0   0   0.23   1.94   2.67  99.77   0.00   0.00   0.00   0.00
  0    1   1   0.19   1.91   2.67  99.81   0.00   0.00   0.00   0.00
  0    2   2   0.13   1.69   2.67  99.87   0.00   0.00   0.00   0.00
  0    3   3   0.11   1.62   2.67  99.89   0.00   0.00   0.00   0.00
  1    0   4   0.28   2.19   2.67  99.72   0.00   0.00   0.00   0.00
  1    1   5   0.20   2.00   2.67  99.80   0.00   0.00   0.00   0.00
  1    2   6   0.13   1.69   2.67  99.87   0.00   0.00   0.00   0.00
  1    3   7   0.69   2.35   2.67  99.31   0.00   0.00   0.00   0.00
20.007365 sec

> please confirm that acpi_idle probed by showing
> 'grep . /sys/devices/system/cpu/cpuidle/*'

grep -r . `find /sys/devices/system/cpu -name cpuidle` shows
/sys/devices/system/cpu/cpuidle/current_driver:acpi_idle
/sys/devices/system/cpu/cpuidle/current_governor_ro:ladder

> Assuming it did, then for some reason it didn't actually
> register any C-states with cpuidle, which would explain
> the missing states in /sys/devices/system/cpu/cpu*/cpuidle/*/*
> What if you boot a 2.6.35 CONFIG_ACPI_PROCESSOR=y kernel --
> what do you see in /proc/acpi/processor/*/power?
> (if processor.nocst=1 worked above, try it with and without
> here also)

This hangs, regardless of nocst.

> Please show the turbostat output for the
> INTEL_IDLE=y, NO_HZ=y, IDLE_GOV_MENU=y kernel
> to see if it is getting into deep C-states.

It does ...
CPUID GenuineIntel 11 levels family:model:stepping 0x6:1a:5 (6:26:5)
12 * 133 = 1600 MHz max efficiency
20 * 133 = 2667 MHz TSC frequency
22 * 133 = 2933 MHz max turbo 4 active cores
22 * 133 = 2933 MHz max turbo 3 active cores
23 * 133 = 3067 MHz max turbo 2 active cores
23 * 133 = 3067 MHz max turbo 1 active cores
pkg core CPU    %c0    GHz    TSC    %c1    %c3    %c6   %pc3   %pc6
               0.59   1.66   2.67   0.60   0.14  98.67   0.76  79.46
  0    0   0   0.82   1.90   2.67   0.59   0.10  98.50   0.76  79.45
  0    1   1   0.57   1.60   2.67   0.60   0.00  98.83   0.76  79.45
  0    2   2   0.56   1.61   2.67   0.60   0.52  98.31   0.76  79.45
  0    3   3   0.61   1.60   2.67   0.62   0.18  98.59   0.76  79.45
  1    0   4   0.55   1.62   2.67   0.59   0.03  98.83   0.76  79.47
  1    1   5   0.53   1.61   2.67   0.60   0.06  98.81   0.76  79.47
  1    2   6   0.54   1.61   2.67   0.61   0.15  98.70   0.76  79.47
  1    3   7   0.52   1.60   2.67   0.61   0.11  98.76   0.76  79.47
20.005420 sec

> That kernel will fail to boot if you boot with "nohz=off", right?
> Does it boot when "nolapic_timer" is added to the cmdline?

No. It boots fine with all four combinations of these options.

> Finally, please identify the motherboard.

The system is an IBM System x3650 M2. God knows what IBM calls its board. I can privately send you the docs, if you want. Not all that detailed, really.

> Are the BIOS SETUP options at their defaults WRT
> power management options?

No, because the defaults are not optimal for my purposes. Relevant settings appear to be:

1) Package ACPI C-State Limit: Set to ACPI C2; help says ...
Package ACPI C-State Limit selects the processor's lowest idle power state. Choosing a higher C-State allows lower processor idle power. ACPI C2 equals Intel C3. ACPI C3 equals Intel C6.

2) CPU C-States: Set to Disabled; help says ...
Enable/Disable ACPI Processor Power states C2 & C3.

> CPU C-States: Set to Disabled

Yay, this explains the unsolved mystery of the 1st 19 comments in this bug report. Linux acpi_idle sees no C-states besides C1 because they have been manually disabled.

BTW, this is a very unusual way to run the machine.
Note that if C-state latency avoidance is your goal, you can do that at run-time via linux pm_qos, or at boot time via the acpi_idle and intel_idle max_cstate boot parameters. Or if even C1 latency is too high, you can disable all C-states by booting with "idle=poll"...

Note also that disabling deep C-states impacts the ability of the machine to reach high-frequency turbo modes. You can observe this in the GHz column in turbostat when the system is under load.

Please try the BIOS SETUP options at the defaults to enable all the ACPI C-states and run acpi_idle, does turbostat show that you are then getting into the deep C-states?

intel_idle ignores the ACPI BIOS settings, of course, so what we are trying to do here is get an apples/apples comparison of acpi_idle and intel_idle using the same states. In this case, it would be interesting if acpi_idle is able to boot and access the deep C-states when intel_idle can not.

>> What if you boot a 2.6.35 CONFIG_ACPI_PROCESSOR=y kernel --
>> what do you see in /proc/acpi/processor/*/power?
>> (if processor.nocst=1 worked above, try it with and without
>> here also)
>
>This hangs, regardless of nocst.

The idea there was to run in ACPI mode, rather than in intel_idle mode. So you would have to build with CONFIG_INTEL_IDLE=n or disable it at boot time with intel_idle.max_cstate=0.

But as you've identified that the ACPI C-states are disabled in the BIOS, that test is no longer necessary to explain why ACPI saw no C-states.

>> Please show the turbostat output for the
>> INTEL_IDLE=y, NO_HZ=y, IDLE_GOV_MENU=y kernel
>> to see if it is getting into deep C-states.
>
> It does ...

>> That kernel will fail to boot if you boot with "nohz=off", right?
>> Does it boot when "nolapic_timer" is added to the cmdline?

> No. It boots fine with all four combinations of these options.

Hmm, this kernel works with "nohz=off", yet if you change it to CONFIG_NO_HZ=n it fails?
What governor was it using?
(grep . /sys/devices/system/cpu/cpuidle/*)

(In reply to comment #20)
> > CPU C-States: Set to Disabled
> Yay, this explains the unsolved mystery of the 1st 19 comments in this bug
> report. Linux acpi_idle sees no C-states besides C1 because they have been
> manually disabled.
> BTW, this is a very unusual way to run the machine. Note that if C-state
> latency avoidance is your goal, you can do that at run-time via linux pm_qos,
> or at boot time via the acpi_idle and intel_idle max_cstate boot parameters.
> Or if even C1 latency is too high, you can disable all C-states by booting
> with "idle=poll"...

... or, to be more in line with the KISS principle, not configure anything at all that depends on CPU_IDLE. That's probably what I'll end up doing shortly, when I move the entire cluster to 2.6.36-release.

> Note also that disabling deep C-states impacts the ability of the machine to
> reach high-frequency turbo modes. You can observe this in the GHz column in
> turbostat when the system is under load.

This makes no sense. You get more performance if you allow the CPU(s) to use less power when idle? Performance-per-watt maybe, but I don't care too much about that.

> Please try the BIOS SETUP options at the defaults to enable all the ACPI
> C-states and run acpi_idle, does turbostat show that you are then getting
> into the deep C-states?

It turns out the settings I mention in comment #19 were the only ones I needed to change. Anyway, turbostat says ...
CPUID GenuineIntel 11 levels family:model:stepping 0x6:1a:5 (6:26:5)
12 * 133 = 1600 MHz max efficiency
20 * 133 = 2667 MHz TSC frequency
22 * 133 = 2933 MHz max turbo 4 active cores
22 * 133 = 2933 MHz max turbo 3 active cores
23 * 133 = 3067 MHz max turbo 2 active cores
23 * 133 = 3067 MHz max turbo 1 active cores
pkg core CPU    %c0    GHz    TSC    %c1    %c3    %c6   %pc3   %pc6
               2.25   1.61   2.67   2.19   0.64  94.92   1.45  76.55
  0    0   0   2.36   1.60   2.67   2.03   0.27  95.34   1.45  76.50
  0    1   1   2.25   1.60   2.67   2.16   0.31  95.29   1.45  76.50
  0    2   2   2.20   1.60   2.67   2.21   0.31  95.28   1.45  76.50
  0    3   3   2.22   1.60   2.67   2.25   0.25  95.29   1.45  76.50
  1    0   4   2.35   1.63   2.67   2.13   1.08  94.43   1.45  76.60
  1    1   5   2.24   1.61   2.67   2.21   1.06  94.50   1.45  76.60
  1    2   6   2.23   1.60   2.67   2.29   0.92  94.57   1.45  76.60
  1    3   7   2.20   1.61   2.67   2.26   0.90  94.64   1.45  76.60
20.006242 sec

> intel_idle ignores the ACPI BIOS settings, of course, so what
> we are trying to do here is get an apples/apples comparison
> of acpi_idle and intel_idle using the same states. In this
> case, it would be interesting if acpi_idle is able to boot
> and access the deep C-states when intel_idle can not.

It would also be good if it didn't hang.

> >> What if you boot a 2.6.35 CONFIG_ACPI_PROCESSOR=y kernel --
> >> what do you see in /proc/acpi/processor/*/power?
> >> (if processor.nocst=1 worked above, try it with and without
> >> here also)
> >This hangs, regardless of nocst.
> The idea there was to run in ACPI mode, rather than in intel_idle mode. So
> you would have to build with CONFIG_INTEL_IDLE=n or disable it at boot time
> with intel_idle.max_cstate=0.
> But as you've identified that the ACPI C-states are disabled in the BIOS,
> that test is no longer necessary to explain why ACPI saw no C-states.

I did so anyway. grep .
/proc/acpi/processor/*/power gives /proc/acpi/processor/CPU0/power:active state: C0 /proc/acpi/processor/CPU0/power:max_cstate: C8 /proc/acpi/processor/CPU0/power:maximum allowed latency: 2000000000 usec /proc/acpi/processor/CPU0/power:states: /proc/acpi/processor/CPU1/power:active state: C0 /proc/acpi/processor/CPU1/power:max_cstate: C8 /proc/acpi/processor/CPU1/power:maximum allowed latency: 2000000000 usec /proc/acpi/processor/CPU1/power:states: /proc/acpi/processor/CPU2/power:active state: C0 /proc/acpi/processor/CPU2/power:max_cstate: C8 /proc/acpi/processor/CPU2/power:maximum allowed latency: 2000000000 usec /proc/acpi/processor/CPU2/power:states: /proc/acpi/processor/CPU3/power:active state: C0 /proc/acpi/processor/CPU3/power:max_cstate: C8 /proc/acpi/processor/CPU3/power:maximum allowed latency: 2000000000 usec /proc/acpi/processor/CPU3/power:states: /proc/acpi/processor/CPU4/power:active state: C0 /proc/acpi/processor/CPU4/power:max_cstate: C8 /proc/acpi/processor/CPU4/power:maximum allowed latency: 2000000000 usec /proc/acpi/processor/CPU4/power:states: /proc/acpi/processor/CPU5/power:active state: C0 /proc/acpi/processor/CPU5/power:max_cstate: C8 /proc/acpi/processor/CPU5/power:maximum allowed latency: 2000000000 usec /proc/acpi/processor/CPU5/power:states: /proc/acpi/processor/CPU6/power:active state: C0 /proc/acpi/processor/CPU6/power:max_cstate: C8 /proc/acpi/processor/CPU6/power:maximum allowed latency: 2000000000 usec /proc/acpi/processor/CPU6/power:states: /proc/acpi/processor/CPU7/power:active state: C0 /proc/acpi/processor/CPU7/power:max_cstate: C8 /proc/acpi/processor/CPU7/power:maximum allowed latency: 2000000000 usec /proc/acpi/processor/CPU7/power:states: > >> Please show the turbostat output for the INTEL_IDLE=y, NO_HZ=y, > >> IDLE_GOV_MENU=y kernel to see if it is getting into deep C-states. > >> That kernel will fail to boot if you boot with "nohz=off", right? > >> Does it boot when "nolapic_timer" is added to the cmdline? > > No. 
It boots fine with all four combinations of these options.
> Hmm, this kernel works with "nohz=off", yet if you change it to
> CONFIG_NO_HZ=n it fails?

That is so, yes.

> What governor was it using?
> (grep . /sys/devices/system/cpu/cpuidle/*)

/sys/devices/system/cpu/cpuidle/current_driver:intel_idle
/sys/devices/system/cpu/cpuidle/current_governor_ro:menu

(In reply to comment #21)
> (In reply to comment #20)
> > >> Please show the turbostat output for the INTEL_IDLE=y, NO_HZ=y,
> > >> IDLE_GOV_MENU=y kernel to see if it is getting into deep C-states.
> > >> That kernel will fail to boot if you boot with "nohz=off", right?
> > >> Does it boot when "nolapic_timer" is added to the cmdline?
> > > No. It boots fine with all four combinations of these options.
> > Hmm, this kernel works with "nohz=off", yet if you change it to
> > CONFIG_NO_HZ=n it fails?
> That is so, yes.
> > What governor was it using?
> > (grep . /sys/devices/system/cpu/cpuidle/*)
> /sys/devices/system/cpu/cpuidle/current_driver:intel_idle
> /sys/devices/system/cpu/cpuidle/current_governor_ro:menu

Is there anything more you want me to try on this? Thanks.

>> Note also that disabling deep C-states impacts the ability of the machine to
>> reach high-frequency turbo modes. You can observe this in the GHz column in
>> turbostat when the system is under load.
> This makes no sense. You get more performance if you allow the CPU(s) to use
> less power when idle? Performance-per-watt maybe, but I don't care too much
> about that.

Each processor package has a fixed power and thermal budget. When some cores are idle, the busy cores have available power and thermal budget that they can use for "opportunistic frequency upside", AKA "turbo mode".
Turbostat spells this out:
12 * 133 = 1600 MHz max efficiency
20 * 133 = 2667 MHz TSC frequency
22 * 133 = 2933 MHz max turbo 4 active cores
22 * 133 = 2933 MHz max turbo 3 active cores
23 * 133 = 3067 MHz max turbo 2 active cores
23 * 133 = 3067 MHz max turbo 1 active cores

So under nominal electrical and cooling conditions, this part will run all cores continuously at up to 2667 MHz. Turbo allows all 4 cores to run at up to 2933 MHz until the part gets hot. If 2 or 3 cores are idle, then the maximum frequency is 3067 MHz, and that will be sustained as long as the part stays within its thermal limits.

So the answer is yes, the part can deliver more performance when the cores are permitted to use less power when idle.

With C-states re-enabled in BIOS SETUP, booting 2.6.36 intel_idle.max_cstate=0 to enable acpi_idle, you were able to get into deep C-states, as shown by the turbostat output in comment #21.

For that scenario, please show the output from
grep . /sys/devices/system/cpu/cpu*/cpuidle/*/*

The same kernel booted with:
intel_idle.max_cstate=1 works fine.
But
intel_idle.max_cstate=2 hangs someplace during boot.

You are using a CONFIG_NO_HZ=n kernel.
If you change that to CONFIG_NO_HZ=y then everything works fine.
Unexpectedly, a CONFIG_NO_HZ=y booted with "nohz=off" also works fine.

Is that summary accurate?

(In reply to comment #23)
>>> Note also that disabling deep C-states impacts the ability of the machine to
>>> reach high-frequency turbo modes. You can observe this in the GHz column in
>>> turbostat when the system is under load.
>> This makes no sense. You get more performance if you allow the CPU(s) to use
>> less power when idle? Performance-per-watt maybe, but I don't care too much
>> about that.
> Each processor package has a fixed power and thermal budget.
> When some cores are idle, the busy cores have available power
> and thermal budget that they can use for "opportunistic
> frequency upside", AKA "turbo mode".
> Turbostat spells this out:
> 12 * 133 = 1600 MHz max efficiency
> 20 * 133 = 2667 MHz TSC frequency
> 22 * 133 = 2933 MHz max turbo 4 active cores
> 22 * 133 = 2933 MHz max turbo 3 active cores
> 23 * 133 = 3067 MHz max turbo 2 active cores
> 23 * 133 = 3067 MHz max turbo 1 active cores
> So under nominal electrical and cooling conditions, this part
> will run all cores continuously at up to 2667 MHz. Turbo allows
> all 4 cores to run at up to 2933 MHz until the part gets hot.
> If 2 or 3 cores are idle, then the maximum frequency is 3067 MHz,
> and that will be sustained as long as the part stays within
> its thermal limits.
> So the answer is yes, the part can deliver more performance
> when the cores are permitted to use less power when idle.

OK. You've got me convinced of the error in my ways. Silly question to ask of you perhaps, but I'm wondering if the AMDs have a similar scheme. Probably not. Thanks for the correction.

(In reply to comment #24)
> With C-states re-enabled in BIOS SETUP,
> booting 2.6.36 intel_idle.max_cstate=0 to enable acpi_idle,
> you were able to get into deep C-states, as shown by
> the turbostat output in comment #21.

Yes.

> For that scenario, please show the output from
> grep .
/sys/devices/system/cpu/cpu*/cpuidle/*/* /sys/devices/system/cpu/cpu0/cpuidle/state0/name:C0 /sys/devices/system/cpu/cpu0/cpuidle/state0/desc:CPUIDLE CORE POLL IDLE /sys/devices/system/cpu/cpu0/cpuidle/state0/latency:0 /sys/devices/system/cpu/cpu0/cpuidle/state0/power:4294967295 /sys/devices/system/cpu/cpu0/cpuidle/state0/usage:0 /sys/devices/system/cpu/cpu0/cpuidle/state0/time:0 /sys/devices/system/cpu/cpu0/cpuidle/state1/name:C1 /sys/devices/system/cpu/cpu0/cpuidle/state1/desc:ACPI FFH INTEL MWAIT 0x0 /sys/devices/system/cpu/cpu0/cpuidle/state1/latency:3 /sys/devices/system/cpu/cpu0/cpuidle/state1/power:4294967294 /sys/devices/system/cpu/cpu0/cpuidle/state1/usage:2979 /sys/devices/system/cpu/cpu0/cpuidle/state1/time:3466983 /sys/devices/system/cpu/cpu0/cpuidle/state2/name:C2 /sys/devices/system/cpu/cpu0/cpuidle/state2/desc:ACPI FFH INTEL MWAIT 0x10 /sys/devices/system/cpu/cpu0/cpuidle/state2/latency:205 /sys/devices/system/cpu/cpu0/cpuidle/state2/power:4294967293 /sys/devices/system/cpu/cpu0/cpuidle/state2/usage:6104 /sys/devices/system/cpu/cpu0/cpuidle/state2/time:8843703 /sys/devices/system/cpu/cpu0/cpuidle/state3/name:C3 /sys/devices/system/cpu/cpu0/cpuidle/state3/desc:ACPI FFH INTEL MWAIT 0x20 /sys/devices/system/cpu/cpu0/cpuidle/state3/latency:245 /sys/devices/system/cpu/cpu0/cpuidle/state3/power:4294967292 /sys/devices/system/cpu/cpu0/cpuidle/state3/usage:56814 /sys/devices/system/cpu/cpu0/cpuidle/state3/time:151472271 /sys/devices/system/cpu/cpu1/cpuidle/state0/name:C0 /sys/devices/system/cpu/cpu1/cpuidle/state0/desc:CPUIDLE CORE POLL IDLE /sys/devices/system/cpu/cpu1/cpuidle/state0/latency:0 /sys/devices/system/cpu/cpu1/cpuidle/state0/power:4294967295 /sys/devices/system/cpu/cpu1/cpuidle/state0/usage:0 /sys/devices/system/cpu/cpu1/cpuidle/state0/time:0 /sys/devices/system/cpu/cpu1/cpuidle/state1/name:C1 /sys/devices/system/cpu/cpu1/cpuidle/state1/desc:ACPI FFH INTEL MWAIT 0x0 /sys/devices/system/cpu/cpu1/cpuidle/state1/latency:3 
/sys/devices/system/cpu/cpu1/cpuidle/state1/power:4294967294 /sys/devices/system/cpu/cpu1/cpuidle/state1/usage:4585 /sys/devices/system/cpu/cpu1/cpuidle/state1/time:3363710 /sys/devices/system/cpu/cpu1/cpuidle/state2/name:C2 /sys/devices/system/cpu/cpu1/cpuidle/state2/desc:ACPI FFH INTEL MWAIT 0x10 /sys/devices/system/cpu/cpu1/cpuidle/state2/latency:205 /sys/devices/system/cpu/cpu1/cpuidle/state2/power:4294967293 /sys/devices/system/cpu/cpu1/cpuidle/state2/usage:6896 /sys/devices/system/cpu/cpu1/cpuidle/state2/time:9265035 /sys/devices/system/cpu/cpu1/cpuidle/state3/name:C3 /sys/devices/system/cpu/cpu1/cpuidle/state3/desc:ACPI FFH INTEL MWAIT 0x20 /sys/devices/system/cpu/cpu1/cpuidle/state3/latency:245 /sys/devices/system/cpu/cpu1/cpuidle/state3/power:4294967292 /sys/devices/system/cpu/cpu1/cpuidle/state3/usage:57609 /sys/devices/system/cpu/cpu1/cpuidle/state3/time:1423589867 /sys/devices/system/cpu/cpu2/cpuidle/state0/name:C0 /sys/devices/system/cpu/cpu2/cpuidle/state0/desc:CPUIDLE CORE POLL IDLE /sys/devices/system/cpu/cpu2/cpuidle/state0/latency:0 /sys/devices/system/cpu/cpu2/cpuidle/state0/power:4294967295 /sys/devices/system/cpu/cpu2/cpuidle/state0/usage:0 /sys/devices/system/cpu/cpu2/cpuidle/state0/time:0 /sys/devices/system/cpu/cpu2/cpuidle/state1/name:C1 /sys/devices/system/cpu/cpu2/cpuidle/state1/desc:ACPI FFH INTEL MWAIT 0x0 /sys/devices/system/cpu/cpu2/cpuidle/state1/latency:3 /sys/devices/system/cpu/cpu2/cpuidle/state1/power:4294967294 /sys/devices/system/cpu/cpu2/cpuidle/state1/usage:2403 /sys/devices/system/cpu/cpu2/cpuidle/state1/time:3142363 /sys/devices/system/cpu/cpu2/cpuidle/state2/name:C2 /sys/devices/system/cpu/cpu2/cpuidle/state2/desc:ACPI FFH INTEL MWAIT 0x10 /sys/devices/system/cpu/cpu2/cpuidle/state2/latency:205 /sys/devices/system/cpu/cpu2/cpuidle/state2/power:4294967293 /sys/devices/system/cpu/cpu2/cpuidle/state2/usage:5152 /sys/devices/system/cpu/cpu2/cpuidle/state2/time:7944935 /sys/devices/system/cpu/cpu2/cpuidle/state3/name:C3 
/sys/devices/system/cpu/cpu2/cpuidle/state3/desc:ACPI FFH INTEL MWAIT 0x20 /sys/devices/system/cpu/cpu2/cpuidle/state3/latency:245 /sys/devices/system/cpu/cpu2/cpuidle/state3/power:4294967292 /sys/devices/system/cpu/cpu2/cpuidle/state3/usage:59271 /sys/devices/system/cpu/cpu2/cpuidle/state3/time:1427461159 /sys/devices/system/cpu/cpu3/cpuidle/state0/name:C0 /sys/devices/system/cpu/cpu3/cpuidle/state0/desc:CPUIDLE CORE POLL IDLE /sys/devices/system/cpu/cpu3/cpuidle/state0/latency:0 /sys/devices/system/cpu/cpu3/cpuidle/state0/power:4294967295 /sys/devices/system/cpu/cpu3/cpuidle/state0/usage:0 /sys/devices/system/cpu/cpu3/cpuidle/state0/time:0 /sys/devices/system/cpu/cpu3/cpuidle/state1/name:C1 /sys/devices/system/cpu/cpu3/cpuidle/state1/desc:ACPI FFH INTEL MWAIT 0x0 /sys/devices/system/cpu/cpu3/cpuidle/state1/latency:3 /sys/devices/system/cpu/cpu3/cpuidle/state1/power:4294967294 /sys/devices/system/cpu/cpu3/cpuidle/state1/usage:2167 /sys/devices/system/cpu/cpu3/cpuidle/state1/time:3059448 /sys/devices/system/cpu/cpu3/cpuidle/state2/name:C2 /sys/devices/system/cpu/cpu3/cpuidle/state2/desc:ACPI FFH INTEL MWAIT 0x10 /sys/devices/system/cpu/cpu3/cpuidle/state2/latency:205 /sys/devices/system/cpu/cpu3/cpuidle/state2/power:4294967293 /sys/devices/system/cpu/cpu3/cpuidle/state2/usage:4950 /sys/devices/system/cpu/cpu3/cpuidle/state2/time:7897376 /sys/devices/system/cpu/cpu3/cpuidle/state3/name:C3 /sys/devices/system/cpu/cpu3/cpuidle/state3/desc:ACPI FFH INTEL MWAIT 0x20 /sys/devices/system/cpu/cpu3/cpuidle/state3/latency:245 /sys/devices/system/cpu/cpu3/cpuidle/state3/power:4294967292 /sys/devices/system/cpu/cpu3/cpuidle/state3/usage:60101 /sys/devices/system/cpu/cpu3/cpuidle/state3/time:1425792279 /sys/devices/system/cpu/cpu4/cpuidle/state0/name:C0 /sys/devices/system/cpu/cpu4/cpuidle/state0/desc:CPUIDLE CORE POLL IDLE /sys/devices/system/cpu/cpu4/cpuidle/state0/latency:0 /sys/devices/system/cpu/cpu4/cpuidle/state0/power:4294967295 
/sys/devices/system/cpu/cpu4/cpuidle/state0/usage:0 /sys/devices/system/cpu/cpu4/cpuidle/state0/time:0 /sys/devices/system/cpu/cpu4/cpuidle/state1/name:C1 /sys/devices/system/cpu/cpu4/cpuidle/state1/desc:ACPI FFH INTEL MWAIT 0x0 /sys/devices/system/cpu/cpu4/cpuidle/state1/latency:3 /sys/devices/system/cpu/cpu4/cpuidle/state1/power:4294967294 /sys/devices/system/cpu/cpu4/cpuidle/state1/usage:2798 /sys/devices/system/cpu/cpu4/cpuidle/state1/time:2781128 /sys/devices/system/cpu/cpu4/cpuidle/state2/name:C2 /sys/devices/system/cpu/cpu4/cpuidle/state2/desc:ACPI FFH INTEL MWAIT 0x10 /sys/devices/system/cpu/cpu4/cpuidle/state2/latency:205 /sys/devices/system/cpu/cpu4/cpuidle/state2/power:4294967293 /sys/devices/system/cpu/cpu4/cpuidle/state2/usage:6026 /sys/devices/system/cpu/cpu4/cpuidle/state2/time:10147072 /sys/devices/system/cpu/cpu4/cpuidle/state3/name:C3 /sys/devices/system/cpu/cpu4/cpuidle/state3/desc:ACPI FFH INTEL MWAIT 0x20 /sys/devices/system/cpu/cpu4/cpuidle/state3/latency:245 /sys/devices/system/cpu/cpu4/cpuidle/state3/power:4294967292 /sys/devices/system/cpu/cpu4/cpuidle/state3/usage:58763 /sys/devices/system/cpu/cpu4/cpuidle/state3/time:1422955488 /sys/devices/system/cpu/cpu5/cpuidle/state0/name:C0 /sys/devices/system/cpu/cpu5/cpuidle/state0/desc:CPUIDLE CORE POLL IDLE /sys/devices/system/cpu/cpu5/cpuidle/state0/latency:0 /sys/devices/system/cpu/cpu5/cpuidle/state0/power:4294967295 /sys/devices/system/cpu/cpu5/cpuidle/state0/usage:0 /sys/devices/system/cpu/cpu5/cpuidle/state0/time:0 /sys/devices/system/cpu/cpu5/cpuidle/state1/name:C1 /sys/devices/system/cpu/cpu5/cpuidle/state1/desc:ACPI FFH INTEL MWAIT 0x0 /sys/devices/system/cpu/cpu5/cpuidle/state1/latency:3 /sys/devices/system/cpu/cpu5/cpuidle/state1/power:4294967294 /sys/devices/system/cpu/cpu5/cpuidle/state1/usage:4747 /sys/devices/system/cpu/cpu5/cpuidle/state1/time:4148144 /sys/devices/system/cpu/cpu5/cpuidle/state2/name:C2 /sys/devices/system/cpu/cpu5/cpuidle/state2/desc:ACPI FFH INTEL MWAIT 0x10 
/sys/devices/system/cpu/cpu5/cpuidle/state2/latency:205 /sys/devices/system/cpu/cpu5/cpuidle/state2/power:4294967293 /sys/devices/system/cpu/cpu5/cpuidle/state2/usage:7056 /sys/devices/system/cpu/cpu5/cpuidle/state2/time:9609591 /sys/devices/system/cpu/cpu5/cpuidle/state3/name:C3 /sys/devices/system/cpu/cpu5/cpuidle/state3/desc:ACPI FFH INTEL MWAIT 0x20 /sys/devices/system/cpu/cpu5/cpuidle/state3/latency:245 /sys/devices/system/cpu/cpu5/cpuidle/state3/power:4294967292 /sys/devices/system/cpu/cpu5/cpuidle/state3/usage:60452 /sys/devices/system/cpu/cpu5/cpuidle/state3/time:1422184588 /sys/devices/system/cpu/cpu6/cpuidle/state0/name:C0 /sys/devices/system/cpu/cpu6/cpuidle/state0/desc:CPUIDLE CORE POLL IDLE /sys/devices/system/cpu/cpu6/cpuidle/state0/latency:0 /sys/devices/system/cpu/cpu6/cpuidle/state0/power:4294967295 /sys/devices/system/cpu/cpu6/cpuidle/state0/usage:0 /sys/devices/system/cpu/cpu6/cpuidle/state0/time:0 /sys/devices/system/cpu/cpu6/cpuidle/state1/name:C1 /sys/devices/system/cpu/cpu6/cpuidle/state1/desc:ACPI FFH INTEL MWAIT 0x0 /sys/devices/system/cpu/cpu6/cpuidle/state1/latency:3 /sys/devices/system/cpu/cpu6/cpuidle/state1/power:4294967294 /sys/devices/system/cpu/cpu6/cpuidle/state1/usage:2454 /sys/devices/system/cpu/cpu6/cpuidle/state1/time:3193182 /sys/devices/system/cpu/cpu6/cpuidle/state2/name:C2 /sys/devices/system/cpu/cpu6/cpuidle/state2/desc:ACPI FFH INTEL MWAIT 0x10 /sys/devices/system/cpu/cpu6/cpuidle/state2/latency:205 /sys/devices/system/cpu/cpu6/cpuidle/state2/power:4294967293 /sys/devices/system/cpu/cpu6/cpuidle/state2/usage:5287 /sys/devices/system/cpu/cpu6/cpuidle/state2/time:8147649 /sys/devices/system/cpu/cpu6/cpuidle/state3/name:C3 /sys/devices/system/cpu/cpu6/cpuidle/state3/desc:ACPI FFH INTEL MWAIT 0x20 /sys/devices/system/cpu/cpu6/cpuidle/state3/latency:245 /sys/devices/system/cpu/cpu6/cpuidle/state3/power:4294967292 /sys/devices/system/cpu/cpu6/cpuidle/state3/usage:59626 /sys/devices/system/cpu/cpu6/cpuidle/state3/time:1427344192 
/sys/devices/system/cpu/cpu7/cpuidle/state0/name:C0
/sys/devices/system/cpu/cpu7/cpuidle/state0/desc:CPUIDLE CORE POLL IDLE
/sys/devices/system/cpu/cpu7/cpuidle/state0/latency:0
/sys/devices/system/cpu/cpu7/cpuidle/state0/power:4294967295
/sys/devices/system/cpu/cpu7/cpuidle/state0/usage:0
/sys/devices/system/cpu/cpu7/cpuidle/state0/time:0
/sys/devices/system/cpu/cpu7/cpuidle/state1/name:C1
/sys/devices/system/cpu/cpu7/cpuidle/state1/desc:ACPI FFH INTEL MWAIT 0x0
/sys/devices/system/cpu/cpu7/cpuidle/state1/latency:3
/sys/devices/system/cpu/cpu7/cpuidle/state1/power:4294967294
/sys/devices/system/cpu/cpu7/cpuidle/state1/usage:2209
/sys/devices/system/cpu/cpu7/cpuidle/state1/time:3352888
/sys/devices/system/cpu/cpu7/cpuidle/state2/name:C2
/sys/devices/system/cpu/cpu7/cpuidle/state2/desc:ACPI FFH INTEL MWAIT 0x10
/sys/devices/system/cpu/cpu7/cpuidle/state2/latency:205
/sys/devices/system/cpu/cpu7/cpuidle/state2/power:4294967293
/sys/devices/system/cpu/cpu7/cpuidle/state2/usage:5085
/sys/devices/system/cpu/cpu7/cpuidle/state2/time:8344587
/sys/devices/system/cpu/cpu7/cpuidle/state3/name:C3
/sys/devices/system/cpu/cpu7/cpuidle/state3/desc:ACPI FFH INTEL MWAIT 0x20
/sys/devices/system/cpu/cpu7/cpuidle/state3/latency:245
/sys/devices/system/cpu/cpu7/cpuidle/state3/power:4294967292
/sys/devices/system/cpu/cpu7/cpuidle/state3/usage:58917
/sys/devices/system/cpu/cpu7/cpuidle/state3/time:1427177694

> The same kernel booted with:
> intel_idle.max_cstate=1 works fine.
> But
> intel_idle.max_cstate=2 hangs someplace during boot.

If C-states are disabled in the firmware, yes.

> You are using a CONFIG_NO_HZ=n kernel.
> If you change that to CONFIG_NO_HZ=y then everything works fine.
> Unexpectedly, a CONFIG_NO_HZ=y booted with "nohz=off"
> also works fine.

Both true, with C-states disabled, but they both use the menu governor.

I've just resolved some trepidation I had in trying to duplicate this problem with the 2.6.36 release.
So far, for the hang to occur, all of the following must hold:

C-states disabled in firmware, as above;
INTEL_IDLE=y;
NO_HZ=n (implies ladder governor);
HIGH_RES_TIMERS=n;
intel_idle.max_cstate>1.

Negate any one, or more, of these conditions and the hang doesn't occur.
HIGH_RES_TIMERS controls SCHED_HRTICK, so that might be involved as well.

Hope this helps.

> C-states disabled in firmware, as above;
> INTEL_IDLE=y;
> NO_HZ=n (implies ladder governor);
> HIGH_RES_TIMERS=n;
> intel_idle.max_cstate>1.
>
> Negate any one, or more, of these conditions and the hang doesn't occur.
> HIGH_RES_TIMERS controls SCHED_HRTICK, so that might be involved as well.

If C-states are enabled in BIOS SETUP (the default)
then intel_idle works properly with no special cmdline params?

(In reply to comment #27)
> > C-states disabled in firmware, as above;
> > INTEL_IDLE=y;
> > NO_HZ=n (implies ladder governor);
> > HIGH_RES_TIMERS=n;
> > intel_idle.max_cstate>1.
> >
> > Negate any one, or more, of these conditions and the hang doesn't occur.
> > HIGH_RES_TIMERS controls SCHED_HRTICK, so that might be involved as well.
> If C-states are enabled in BIOS SETUP (the default)
> then intel_idle works properly with no special cmdline params?

Yes. But that requirement is not documented.

>> If C-states are enabled in BIOS SETUP (the default)
>> then intel_idle works properly with no special cmdline params?
>
> Yes. But that requirement is not documented.

I'd be surprised to see that there is any documentation for what that BIOS SETUP option really does. But we can endeavor to find out, say by dumping the MSRs with default BIOS SETUP and with C-state disabled BIOS setup. My guess at this point is that this is a BIOS bug, or at least a BIOS quirk.

Please fetch the msr-tools from here:

git clone git://git.kernel.org/pub/scm/utils/cpu/msr-tools/msr-tools.git

and build rdmsr. We can use it to dump out the MSRs with the BIOS defaults and compare them with the same MSRs for the BIOS C-state disabled setting.
With ./rdmsr present, please run this script and save msr.out for the default BIOS setting, and also for the C-state disabled BIOS setting, and attach them to this bug report.

    #!/bin/bash
    OUTPUT_FILE=msr.out
    echo output to $OUTPUT_FILE
    typeset -i msr
    msr=0
    while [ $msr -lt 1600 ] ; do
        ./rdmsr -a $msr
        if [ $? == 0 ] ; then
            printf "MSR 0x%x\n" $msr
        fi
        msr=$msr+1
    done > $OUTPUT_FILE 2> /dev/null

> CONFIG_HIGH_RES_TIMERS=n
So if you set CONFIG_HIGH_RES_TIMERS=y then everything works fine?
If you build with CONFIG_HIGH_RES_TIMERS=y and then boot
with "highres=off" then we see the failure?
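The conditions collected so far in this thread are partly visible from software, so a quick cross-check of a given kernel build and boot line is possible. The following is only a sketch: `check_hang_conditions` is a hypothetical helper name, not an existing tool, and it assumes that an absent `intel_idle.max_cstate` parameter leaves the driver free to use states deeper than C1. The firmware C-state setting cannot be read this way and is not checked.

```sh
#!/bin/sh
# Sketch only: count how many of the software-visible hang conditions from this
# thread hold for a given kernel .config and kernel command line.
# check_hang_conditions is a hypothetical helper name, not an existing tool.

check_hang_conditions() {
    config_file=$1   # path to a kernel .config
    cmdline=$2       # kernel command line, e.g. the contents of /proc/cmdline
    hits=0

    # Condition: intel_idle is built in.
    if grep -q '^CONFIG_INTEL_IDLE=y$' "$config_file"; then
        echo "INTEL_IDLE=y"
        hits=$((hits + 1))
    fi

    # Condition: NO_HZ is off at build time. Per the thread, a NO_HZ=y kernel
    # booted with nohz=off did not hang, so only the .config is consulted.
    if ! grep -q '^CONFIG_NO_HZ=y$' "$config_file"; then
        echo "NO_HZ=n"
        hits=$((hits + 1))
    fi

    # Condition: high-resolution timers are off, either at build time or
    # through highres=off on the command line.
    highres_off=no
    grep -q '^CONFIG_HIGH_RES_TIMERS=y$' "$config_file" || highres_off=yes
    case " $cmdline " in
        *" highres=off "*) highres_off=yes ;;
    esac
    if [ "$highres_off" = yes ]; then
        echo "high-resolution timers off"
        hits=$((hits + 1))
    fi

    # Condition: intel_idle.max_cstate > 1. Assumption: when the parameter is
    # absent, the driver is not limited to C1, so the condition holds.
    max_cstate=$(printf '%s\n' "$cmdline" | tr ' ' '\n' |
        sed -n 's/^intel_idle\.max_cstate=//p' | tail -n 1)
    if [ -z "$max_cstate" ] || [ "$max_cstate" -gt 1 ]; then
        echo "intel_idle.max_cstate>1"
        hits=$((hits + 1))
    fi

    echo "$hits of 4 software-visible conditions hold"
}
```

A possible invocation on a running system would be `check_hang_conditions /path/to/.config "$(cat /proc/cmdline)"`, where the .config path is whatever the kernel in question was built from.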
Created attachment 38532 [details]
C-state disabled MSRs
Created attachment 38542 [details]
C-state enabled MSRs
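For anyone wanting to compare two such dumps, here is a minimal sketch. It assumes the msr.out layout produced by the script earlier in this thread, where the per-CPU values of an MSR come first and a trailing "MSR 0x..." line labels them. `msr_diff` and the file names in the usage note are hypothetical.

```sh
#!/bin/sh
# Sketch only: list the MSRs whose per-CPU values differ between two dumps.
# msr_diff is a hypothetical helper name. MSRs that appear only in the first
# dump are not reported; MSRs that appear only in the second one are.

msr_diff() {
    awk '
        # Start each file with an empty block of collected values.
        FNR == 1 { block = "" }

        # A line starting with "MSR " closes the block of values before it.
        /^MSR / {
            if (FILENAME == ARGV[1])
                first[$2] = block          # remember the first dump
            else if (first[$2] != block)
                print $2                   # values differ in the second dump
            block = ""
            next
        }

        # Any other line is one per-CPU value of the MSR labelled below it.
        { block = block $0 " " }
    ' "$1" "$2"
}
```

Usage would be along the lines of `msr_diff msr.enabled.out msr.disabled.out`, which prints one MSR address per line for every register whose values changed between the two BIOS settings.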
(In reply to comment #31)
> > CONFIG_HIGH_RES_TIMERS=n
> So if you set CONFIG_HIGH_RES_TIMERS=y then everything works fine?

Yes, with the default highres setting.

> If you build with CONFIG_HIGH_RES_TIMERS=y and then boot
> with "highres=off" then we see the failure?

Yes.

Does the failing config still fail if CONFIG_HZ_100=y is used?

Please clarify if "nolapic_timer" has an effect on the failing configuration. Also, it would be interesting to know if CONFIG_HPET=y has any effect.

Neither HZ_100 nor HPET has any effect. Setting "nolapic_timer" does, however, prevent the hang.

About the failure mode itself... Is it a hard hang, or if you hold down a key to give the system a stream of interrupts does the system make forward progress?

Oh, you can also test for a hard hang by pressing the CAPS-LOCK key and seeing if that lights up, or by pinging over the network.

There is no keyboard nor network at the point of the hang. Neither USB nor NIC drivers are compiled into the kernel, and an IP address that I could ping has not yet been assigned at that point.

(In reply to comment #40)
> There is no keyboard nor network at the point of the hang. Neither USB nor NIC
> drivers are compiled into the kernel, and an IP address that I could ping has
> not yet been assigned at that point.

I just now had a chance to confirm this. The keyboard is completely unresponsive during the hang. The various LOCK keys don't turn on LEDs, magic SysRq sequences do nothing, carriage returns don't work, etc. But, as I allude to in comment #40, that doesn't necessarily mean the CPUs are uninterruptible, nor that we have a hardware lockup here.

On the other hand, I do have an Infiniband driver and the entire IPoIB infrastructure compiled into this particular kernel, and there definitely is traffic on the internal network the adapter is connected to. But I have no idea whether the adapter would be generating interrupts before being assigned an IP address and ifup'ed.

Is there anything more on this?
I've gone through the MSR differences and it appears that 0xe2 & 0xe4 are the relevant ones, although they are documented for the Sandy Bridge parts, not the Nehalems.

(In reply to comment #25)
> (In reply to comment #23)
> >>> Note also that disabling deep C-states impacts the ability of the machine
> >>> to reach high-frequency turbo modes. You can observe this in the GHz
> >>> column in turbostat when the system is under load.
> >> This makes no sense. You get more performance if you allow the CPU(s) to
> >> use less power when idle? Performance-per-watt maybe, but I don't care too
> >> much about that.
> > Each processor package has a fixed power and thermal budget.
> > When some cores are idle, the busy cores have available power
> > and thermal budget that they can use for "opportunistic
> > frequency upside", AKA "turbo mode".
> > Turbostat spells this out:
> > 12 * 133 = 1600 MHz max efficiency
> > 20 * 133 = 2667 MHz TSC frequency
> > 22 * 133 = 2933 MHz max turbo 4 active cores
> > 22 * 133 = 2933 MHz max turbo 3 active cores
> > 23 * 133 = 3067 MHz max turbo 2 active cores
> > 23 * 133 = 3067 MHz max turbo 1 active cores
> > So under nominal electrical and cooling conditions, this part
> > will run all cores continuously at up to 2667. Turbo allows
> > all 4 cores to run at up to 2933 until the part gets hot.
> > If 2 or 3 cores are idle, then the maximum frequency is 3067,
> > and that will be sustained as long as the part stays within
> > its thermal limits.
> > So the answer is yes, the part can deliver more performance
> > when the cores are permitted to use less power when idle.
> OK. You've got me convinced of the error in my ways. Silly question to ask
> of you perhaps, but I'm wondering if the AMDs have a similar scheme.
> Probably not.
> Thanks for the correction.

BTW, would this explain why threads are at times reported as using more than 100% of an HT thread?

Thanks.

Hi, Marc

Does this issue still exist if the latest kernel is used?
Thanks.

(In reply to comment #44)
> Does this issue still exist if the latest kernel is used?

Yes, it still occurs with 2.6.37.

Created attachment 48162 [details]
3.6.37 .config diff

... between a working kernel (2.6.37-smp) and a non-working one (called 2.6.37-1-smp)

This includes the settings indicated in comment #26, and the USB stuff I need for module-less support of keyboard and mouse. This looks like a hard hang (rather than a loop) as there is no keyboard response (no LED action, etc.) despite the keyboard & mouse having been earlier recognised.

Thanks.

Thanks.

I've had an opportunity to test both 2.6.37.6 and 2.6.38.2. 2.6.37.6 hangs as before. But at the point where previous kernels hang, 2.6.38.2 simply slows down to a crawl, returning to normal speed when userland is started (in the initrd in this case). This results in grub-to-login-prompt boot times of nearly half an hour. Also, vesafb's cursor blink rate on the console is much slower than normal. The system is otherwise as responsive as usual.

I inadvertently ended up testing 2.6.39.2 configured as above, and it behaves the same as 2.6.38.2 in comment #47, but this time despite having C-states enabled in the firmware.

The "slow down to a crawl" is certainly due to lack of clock interrupts. If you pelt the system with interrupts from a device, like the network or the keyboard, then it will crawl faster ;-)

It's great that kernel bugzilla is back. Can you please verify if the problem still exists in the latest upstream kernel?

(In reply to comment #50)
> can you please verify if the problem still exists in the latest upstream
> kernel?

I have a major outage scheduled for Jan 30th. I will see if 3.2.1 still exhibits the problem then.

Thanks.

(In reply to comment #50)
> can you please verify if the problem still exists in the latest upstream
> kernel?

I've had an opportunity to test this with 3.2.2, and there is no hang nor slowdown anymore. So I consider this issue resolved.

Thanks to all for your time.