Previously I reported here (https://bugzilla.kernel.org/show_bug.cgi?id=19762) that intel_idle didn't support my Atom C4 and C6 idle states. Now that's fixed. But now my computer has the problem that usb-audio playback is very corrupted and thus unusable. It is directly related not only to intel_idle, but exactly the C4 state:
1. I diagnosed the problem
2. I started with max_cstate=2 -> no problems
3. I started with max_cstate=4 -> extremely bad problems with sound
4. I patched intel_idle and disabled (commented out) the C4 state -> no problems.
Any idea what a prettier solution to the problem might be? Thanks! :)
does the problem go away if you boot with "nolapic_timer"?
When you modify intel_idle.c's atom_cstates[] to disable ATM-C4, yet keep ATM-C6 you do not see the problem? Please share the output from grep . /sys/devices/system/cpu/cpu*/cpuidle/*/* for that working scenario. When you disable ATM-C6 yet keep ATM-C4 enabled then you have sound problems? Please share the output from grep . /sys/devices/system/cpu/cpu*/cpuidle/*/* for that failing scenario.
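The counters printed by that grep can also be collected per CPU and per state, which makes the working and failing scenarios easier to compare. A minimal sketch in Python, assuming the usual cpuidle sysfs layout (cpuN/cpuidle/stateM/ containing name, latency, usage and time files); the function name summarize_cpuidle and its root parameter are hypothetical and not part of any kernel tool:

```python
import glob
import os


def _read(state_dir, fname):
    """Read one sysfs attribute file and strip the trailing newline."""
    with open(os.path.join(state_dir, fname)) as f:
        return f.read().strip()


def summarize_cpuidle(root="/sys/devices/system/cpu"):
    """Return {(cpu, state_name): {"latency_us", "usage", "time_us"}}.

    Hypothetical helper: it reads the same files the grep command prints.
    """
    summary = {}
    # cpu[0-9]* skips non-CPU directories such as cpuidle/ and cpufreq/.
    pattern = os.path.join(root, "cpu[0-9]*", "cpuidle", "state[0-9]*")
    for state_dir in sorted(glob.glob(pattern)):
        cpu = state_dir.split(os.sep)[-3]
        name = _read(state_dir, "name")
        summary[(cpu, name)] = {
            "latency_us": int(_read(state_dir, "latency")),  # exit latency
            "usage": int(_read(state_dir, "usage")),         # times entered
            "time_us": int(_read(state_dir, "time")),        # total residency
        }
    return summary
```

Calling summarize_cpuidle() once before and once after a playback test and comparing the time_us values per state shows where the residency went during the test.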
Created attachment 34922 [details] C4 Disabled grep . /sys/devices/system/cpu/cpu*/cpuidle/*/* yes, when I disable C4 and it's just in C6 the problem does not appear. As it's *much* worse with only C4 it seems that it's just the skipping through C4 that causes the problem in the first place. I have added the grep for C4 disabled now. I will add the grep with C4 enabled and try nolapic_timer in a few hours.
Created attachment 34992 [details] C4 enabled grep . /sys/devices/system/cpu/cpu*/cpuidle/*/* I've tried nolapic_timer. The cpu is in 100% polling mode and the problem does not appear either. I've also attached the grep with C4 enabled. That should be everything you've mentioned then.
Created attachment 35002 [details] side by side diff of both; c4 active on the left side
I think I forgot to make this clear: I think the bug exists with processor.ko as well. The difference is just that there C4 gets disabled when I plug in AC so I usually won't notice it. Maybe the assumption that during bus master there is a no op does not apply here? Just random guessing... Don't think it would make sense then that it's *only* in C4.
Any ideas?
Ok, I know this does sound crazy, but it's totally reproducible, hence there has to be a cause somewhere. Could it be that the CPU does not stay in C4 long enough or something?
I guess we can change the status to wontfix, right?
not ready to give up on addressing this -- just haven't immediately had any good ideas or time:-( So the problem is with MWAIT 0x30, even though MWAIT 0x52 works, yes? Please verify that the same failure occurs when using acpi_idle. (eg boot with intel_idle.max_cstate=0), show the output from grep . /sys/devices/system/cpu/cpu*/cpuidle/*/* to verify that ACPI is indeed using MWAIT 0x30, and verify that the problem happens there too.
Created attachment 38082 [details] AC: grep . /sys/devices/system/cpu/cpu*/cpuidle/*/* On AC, as before, there is no problem, as it only goes up to C2.
Created attachment 38092 [details] Battery: grep . /sys/devices/system/cpu/cpu*/cpuidle/*/* On battery, as before, it's pretty bad every time it passes by C3.
I actually had the problem with ACPI first, but then only after unplugging AC, which activates the other modes. Remember the other bug, where someone wrote a patch because the ACPI changes didn't show up correctly? ;) Thanks for checking this out! Maybe we should just test increasing the latency of C3 and see if the problem persists?
Thanks for confirming that the issue is present with ACPI on DC -- when it exposes ACPI C1/C2/C3/C4. In that scenario, residency is much higher in the ACPI C4 MWAIT 0x52 state than it is in the problematic ACPI C3 MWAIT 0x30 state. Please confirm that in ACPI mode, if you boot with processor.max_cstate=3 to get rid of this C4 state, that residency in ACPI C3 increases and with it the problem is worse.
Sorry, in #12 I made a wrong statement, I meant C4, as there is no C3 on my system. The list is C1 C2 C4 C6. The problem always occurs in C4.
I just booted with intel_idle.max_cstate=0 processor.max_cstate=4 and if anything the problem was worse than with intel_idle. It was very, very bad. I heard more noise than sound.
We have now tested:
= without intel_idle =
* up to C2, resulting in no problems (default without intel_idle on AC)
* up to C6, resulting in some residency in C4 and some problems (default without intel_idle on DC)
* up to C4, resulting in very serious problems and high residency in C4.
And we tested the same with intel_idle, with the same results. This means we can definitely conclude that with or without intel_idle, the C4 state is to blame.
Yes, the terminology can be confusing, here is a decoder:
ACPI C0 = Atom C0
ACPI C1 = Atom C1 = MWAIT 0x0
ACPI C2 = Atom C2 = MWAIT 0x10
ACPI C3 = Atom C4 = MWAIT 0x30
ACPI C4 = Atom C6 = MWAIT 0x52
Thanks for the confirmation that the MWAIT 0x30 state causes the issue for both acpi_idle and intel_idle, and that the MWAIT 0x52 state is not an issue.
---
I don't understand comment #3. When you boot with "nolapic_timer" the system doesn't enter deep C-states, but instead cpuidle chooses polling?
---
It is mysterious that MWAIT 0x30 should cause a problem while MWAIT 0x52 does not, because 0x52 is expected to be a higher latency C-state.
What do you see if you edit intel_idle.c atom_cstates[] MWAIT C4 to have exit_latency = 140 and target_residency = 560 and then boot with intel_idle.max_cstate=4 to disable MWAIT 0x52? Do you still see a lot of MWAIT 0x30 residency and bad sound?
What can you tell me about the sound device? What does lsusb show? When it is active, what does powertop show about its interrupt rate? What do you see if you "watch -d cat /proc/interrupts" when it is active?
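The Atom column of that decoder follows directly from the MWAIT hint encoding: the upper nibble of the hint selects the C-state (stored as the state number minus one) and the lower nibble selects the sub-state. A small sketch in Python for illustration; the function name decode_mwait_hint is made up here and does not exist in the kernel:

```python
def decode_mwait_hint(hint):
    """Split an MWAIT hint into (C-state number, sub-state).

    Bits 7:4 hold the C-state number minus one (0xF wraps around to C0),
    bits 3:0 hold the sub-state within that C-state.
    """
    cstate = (((hint >> 4) & 0xF) + 1) & 0xF
    substate = hint & 0xF
    return cstate, substate


# The four hints from the decoder in this thread.
for hint in (0x00, 0x10, 0x30, 0x52):
    cstate, sub = decode_mwait_hint(hint)
    print(f"MWAIT 0x{hint:02x} -> Atom C{cstate}, sub-state {sub}")
```

With this encoding 0x30 decodes to Atom C4 and 0x52 to Atom C6 sub-state 2, matching the table.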
You probably mean comment #4. Yes, that's exactly what happened. It went into polling mode. I will try it again now and report if and only if I get different results this time.
----
I know, it's very weird, but it's what happens. I can make a video if you like :)
I tried editing it like you suggested (C4 mode with C6 values). It didn't change anything. C4 still caused the same problem. I would also have thought that this is where the problem would be. Maybe, just maybe, this is a broken CPU just in my case?
The sound device is a "Creative SoundBlaster X-Fi Surround 5.1 USB":
1 [S51 ]: USB-Audio - SB X-Fi Surround 5.1
Creative Technology SB X-Fi Surround 5.1 at usb-0000:00:1d.0-1, full speed
powertop says it's 130-330.
/proc/interrupts shows about
* 300 per interval (2s) in (IO-APIC-fasteoi uhci_hcd:usb1)
* Timer interrupts go from 80-120 to about 300 per 2 seconds (LOC: 441745 505490 Local timer interrupts)
everything else is unchanged.
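Reading rates off "watch -d cat /proc/interrupts" by eye is error-prone, so the per-interval counts can also be computed from two snapshots of the file. A rough sketch in Python, assuming the common /proc/interrupts layout (a header row naming the CPUs, then one row per interrupt with one counter per CPU); both function names are hypothetical:

```python
def parse_interrupts(text):
    """Return {irq_label: total count summed over all CPUs}."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())  # header row: CPU0 CPU1 ...
    totals = {}
    for line in lines[1:]:
        if ":" not in line:
            continue
        label, _, rest = line.partition(":")
        total = 0
        # Rows such as ERR carry fewer counters than there are CPUs,
        # so stop at the first non-numeric field.
        for field in rest.split()[:ncpu]:
            if not field.isdigit():
                break
            total += int(field)
        totals[label.strip()] = total
    return totals


def interrupt_rates(before, after, seconds):
    """Interrupts per second for each label present in both snapshots."""
    first, second = parse_interrupts(before), parse_interrupts(after)
    return {k: (second[k] - first[k]) / seconds for k in second if k in first}
```

Taking one snapshot, sleeping for two seconds, and taking a second one reproduces the 2 s interval used above; the result is already divided down to interrupts per second.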
Ok, with nolapic_timer in 2.6.36:
C0 (cpu running) 1,6 %
polling 5.5ms 98,4 %
C1 mwait 0.0ms 0,0 %
C2 mwait 0.0ms 0,0 %
C6 mwait 0.0ms 0,0 %
wakes per second: 180
idle sources: ca 800-1000 (94%) [extra timer interrupts].
Hi Dennis, does the issue still exist on the latest Linux kernel (for example, 2.6.38-rc4)? It would be great if you could attach the output of acpidump on your box. Please use the latest acpidump tool, PMtools-20101221, which can be downloaded from:
http://www.kernel.org/pub/linux/kernel/people/lenb/acpi/utils/
Thanks.
Created attachment 47902 [details] acpidump by pmtools-20101221
And the issue still exists in 2.6.38-rc4.
changing category to "cpuidle" from "intel_idle", since acpi_idle and intel_idle both see the same problem with MWAIT 0x30 on this box. Dennis, just for grins... with MWAIT 0x30 disabled, what do you see if you change MWAIT 0x52 to 0x50 or 0x51? Do those states work fine?
It's great that kernel bugzilla is back. Can you please verify whether the problem still exists in the latest upstream kernel?
Yes, still exists in 3.3.0. Len, let me know if this made you smile... 0x50 caused the problems to reappear. I'll try 0x51 if you want. And for some reason it told me to contact a certain lenb@kernel.org...
417:[ 0.417673] intel_idle: MWAIT substates: 0x3020220
418:[ 0.417678] intel_idle: v0.4 model 0x1C
419:[ 0.417682] intel_idle: lapic_timer_reliable_states 0x2
420:[ 0.417696] intel_idle: unaware of model 0x1c MWAIT 4 please contact lenb@kernel.org
422:[ 0.417760] intel_idle: unaware of model 0x1c MWAIT 4 please contact lenb@kernel.org
594:[ 9.194608] ACPI: acpi_idle yielding to intel_idle
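The "MWAIT substates: 0x3020220" value in that log is the sub-state bitmap from CPUID leaf 5 (EDX), which packs one 4-bit count per C-state: bits 3:0 for C0, bits 7:4 for C1, and so on. A short sketch in Python that unpacks it, assuming that layout; the function name is hypothetical:

```python
def mwait_substates(edx):
    """Map C-state number -> number of MWAIT sub-states it supports."""
    return {cstate: (edx >> (4 * cstate)) & 0xF for cstate in range(8)}


counts = mwait_substates(0x3020220)
for cstate, n in counts.items():
    if n:
        print(f"C{cstate}: {n} sub-state(s)")
```

Under this reading the box reports two sub-states each for C1, C2 and C4, and three for C6, which is consistent with 0x50, 0x51 and 0x52 all being accepted as hints.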
Something interesting again. The 0x51 version was pure static, while the 0x50 version was only intermittent static. On the other hand I tested the 0x50 version in console, the 0x51 in X. So that's more likely the difference.
Any reason for not just blacklisting C4 on that platform?
3 choices to resolve this bug:
1. Blacklist deep C-states on this platform in the kernel. Please attach the dmidecode output, and we will create a patch to do this.
2. Use PM-QOS to fix this by having the sound driver, or user-space, tell Linux not to use C-states with a latency of 100usec and above (which is the listed exit_latency for MWAIT 0x30).
3. Wait a month and if Dennis doesn't reply, close as Documented -- since there are cmdline workarounds available for fellow travelers.
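For the user-space variant of option 2, the PM QoS interface is the /dev/cpu_dma_latency device: a process writes the maximum tolerable wakeup latency in microseconds as a 32-bit integer, and the request stays in force only while the file descriptor remains open. A minimal sketch in Python, not a tested fix for this bug; the function name is hypothetical, and the value 99 is an assumption chosen to sit just below the 100usec exit_latency listed for MWAIT 0x30:

```python
import os
import struct


def hold_cpu_latency(max_latency_us, path="/dev/cpu_dma_latency"):
    """Request a CPU wakeup-latency ceiling and return the open fd.

    The request is dropped as soon as the fd is closed, so the caller
    must keep it open for the whole playback and os.close() it afterwards.
    """
    fd = os.open(path, os.O_WRONLY)
    try:
        # The device accepts a native-endian signed 32-bit value.
        os.write(fd, struct.pack("=i", max_latency_us))
    except OSError:
        os.close(fd)
        raise
    return fd
```

A player wrapper would call fd = hold_cpu_latency(99) before starting playback and os.close(fd) when it ends. Writing to the device normally requires root, and whether this actually keeps the CPU out of MWAIT 0x30 on this box would still need to be confirmed with the cpuidle residency counters.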
If this is still an issue, please re-open and supply the output from grep . /sys/class/dmi/id/*