Bug 12788 - ~40,000 wakeups/sec after suspend/resume with Tickless kernel; idle=nomwait or max_cstate=1 helps
Summary: ~40,000 wakeups/sec after suspend/resume with Tickless kernel; idle=nomwait o...
Status: CLOSED OBSOLETE
Alias: None
Product: ACPI
Classification: Unclassified
Component: Other (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: ykzhao
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-02-27 12:47 UTC by James Ettle
Modified: 2012-05-30 14:19 UTC (History)
8 users (show)

See Also:
Kernel Version: 2.6.29-0.43.rc6.fc10.x86_64
Tree: Fedora
Regression: No


Attachments
dmidecode for the machine (10.32 KB, text/plain)
2009-02-27 12:47 UTC, James Ettle
Details
acpidump for the notebook (138.28 KB, text/plain)
2009-02-27 12:48 UTC, James Ettle
Details
dmesg output with cpufreq.debug=7 cmdline option before suspend/resume (107.40 KB, text/plain)
2009-02-28 01:16 UTC, James Ettle
Details
Extra dmesg output with cpufreq.debug=7 cmdline option after suspend/resume (31.40 KB, text/plain)
2009-02-28 01:17 UTC, James Ettle
Details
grep . /sys/firmware/acpi/interrupts/* before suspend/resume (2.08 KB, text/plain)
2009-03-09 00:35 UTC, James Ettle
Details
grep . /sys/firmware/acpi/interrupts/* after suspend/resume (2.08 KB, text/plain)
2009-03-09 00:35 UTC, James Ettle
Details
cpu0cst (1.17 KB, text/plain)
2009-03-09 00:35 UTC, James Ettle
Details
cpu0ist (680 bytes, text/plain)
2009-03-09 00:36 UTC, James Ettle
Details
cpu1cst (133 bytes, text/plain)
2009-03-09 00:36 UTC, James Ettle
Details
cpu1ist (177 bytes, text/plain)
2009-03-09 00:36 UTC, James Ettle
Details
debug patch (783 bytes, patch)
2009-03-10 22:54 UTC, Zhang Rui
Details | Diff
Debug patch: Cleanup ups and enhancements for apic suspend routines (10.88 KB, patch)
2009-12-07 12:00 UTC, Thomas Renninger
Details | Diff
dmesg on 2.6.36 before suspend (61.50 KB, text/plain)
2010-11-05 18:02 UTC, James Ettle
Details
dmesg on 2.6.36 after resume (78.34 KB, text/plain)
2010-11-05 18:03 UTC, James Ettle
Details
lspci -vxxx on 2.6.36 before suspend (34.92 KB, text/plain)
2010-11-05 18:03 UTC, James Ettle
Details
lspci -vxxx on 2.6.36 after resume (34.92 KB, text/plain)
2010-11-05 18:04 UTC, James Ettle
Details
dmesg when booted with acpi_skip_timer_override (102.78 KB, text/plain)
2010-11-10 08:19 UTC, James Ettle
Details

Description James Ettle 2009-02-27 12:47:00 UTC
Latest working kernel version: Unknown

Earliest failing kernel version: 2.6.27.5-32.fc9.x86_64 (and earlier)

Distribution: Fedora 10

Hardware Environment: Clevo M720R notebook w/ Intel T8100 CPU

Software Environment: x86-64

Problem Description:
Before a suspend/resume cycle, with an idle desktop, powertop reports the processor switching between C0 and C3 states, with respective residencies 3% and 97%. After suspending and resuming, it doesn't use C3 as much but uses C2, with residencies of around 55%, 43% and 2% for C0, C2 and C3, respectively.

This results in increased power consumption, and so an increased operating temperature (~5C).

Steps to reproduce:
Suspend and resume.
Comment 1 James Ettle 2009-02-27 12:47:42 UTC
Created attachment 20384 [details]
dmidecode for the machine
Comment 2 James Ettle 2009-02-27 12:48:03 UTC
Created attachment 20385 [details]
acpidump for the notebook
Comment 3 James Ettle 2009-02-27 12:53:25 UTC
PowerTOP screenshot BEFORE suspend/resume at desktop idle:


    PowerTOP version 1.10      (C) 2007 Intel Corporation

Cn                Avg residency       P-states (frequencies)
C0 (cpu running)        ( 3.8%)         2.11 Ghz     3.2%
C0                0.0ms ( 0.0%)         2.10 Ghz     0.1%
C1                0.0ms ( 0.0%)         1200 Mhz     0.2%
C2                0.0ms ( 0.0%)          800 Mhz    96.5%
C3                5.3ms (96.2%)

Wakeups-from-idle per second : 183.8    interval: 10.0s
no ACPI power usage estimate available

Top causes for wakeups:
  22.7% ( 40.7)           firefox : futex_wait (hrtimer_wakeup) 
  17.1% ( 30.7)      <kernel IPI> : Rescheduling interrupts 
  14.8% ( 26.6)       <interrupt> : acpi 
   7.7% ( 13.8)       <interrupt> : extra timer interrupt 
   6.7% ( 12.0)     work_on_cpu/0 : mod_timer (rh_timer_func) 
   6.5% ( 11.7)       <interrupt> : uhci_hcd:usb3, mmc0, mmc1, iwlagn, i915@pci:
0000:00:02.0 
Suggestion: Disable 'hal' from polling your cdrom with:
hal-disable-polling --device /dev/scd0 'hal' is the component that auto-opens a
window if you plug in a CD but disables SATA power saving from kicking in.
 Q - Quit   R - Refresh   K - kill hald-addon-storage 


... and AFTER suspend/resume, under the same conditions:


     PowerTOP version 1.10      (C) 2007 Intel Corporation

Cn                Avg residency       P-states (frequencies)
C0 (cpu running)        (53.6%)         2.11 Ghz     3.4%
C0                0.0ms ( 0.0%)         2.10 Ghz     0.3%
C1                0.0ms ( 0.0%)         1.60 Ghz     0.4%
C2                0.0ms (40.3%)          800 Mhz    95.9%
C3                0.1ms ( 6.1%)

Wakeups-from-idle per second : 34595.0  interval: 10.0s
no ACPI power usage estimate available

Top causes for wakeups:
  16.9% ( 61.2)   USB device  6-1 : USB Optical Mouse (Logitech) 
  16.8% ( 61.1)       <interrupt> : uhci_hcd:usb6, firewire_ohci, eth0 
  15.9% ( 57.7)      <kernel IPI> : Rescheduling interrupts 
  10.4% ( 37.8)       <interrupt> : acpi 
   8.9% ( 32.2)           firefox : futex_wait (hrtimer_wakeup) 
   5.9% ( 21.5)       <interrupt> : extra timer interrupt 

Suggestion: Enable USB autosuspend by pressing the U key or adding
usbcore.autosuspend=1 to the kernel command line in the grub config

 Q - Quit   R - Refresh   U - Enable USB suspend 
Comment 4 Andrew Morton 2009-02-27 12:55:01 UTC
> Earliest failing kernel version: 2.6.27.5-32.fc9.x86_64 (and earlier)

So.. you're saying that no kernel version has ever worked correctly
for you?
Comment 5 James Ettle 2009-02-27 12:58:56 UTC
I'm not sure. I've noticed it on *at least* those kernels, since those were the ones which coincided with other fixes in Fedora that addressed suspend/resume issues outside the kernel, and I'd therefore started using it.
Comment 6 James Ettle 2009-02-28 01:16:47 UTC
Created attachment 20386 [details]
dmesg output with cpufreq.debug=7 cmdline option before suspend/resume
Comment 7 James Ettle 2009-02-28 01:17:09 UTC
Created attachment 20387 [details]
Extra dmesg output with cpufreq.debug=7 cmdline option after suspend/resume
Comment 8 Thomas Renninger 2009-03-05 12:01:02 UTC
Looks like it's your mouse (C-states do not have to do anything with cpufreq, the P-states are controlled by cpufreq):
Top causes for wakeups:
  16.9% ( 61.2)   USB device  6-1 : USB Optical Mouse (Logitech) 
  16.8% ( 61.1)       <interrupt> : uhci_hcd:usb6, firewire_ohci, eth0
I don't know much about USB, but you could already try to unplug the mouse, rmmod uhci_hcd, modprobe it again, replug the mouse, does it help?

Is it better without mouse?

It's the interrupt activity of uhci/mouse interrupts:
watch -n1 cat /proc/interrupts
and look out to calme down the uhci_hcd interrupts.
AFAIK uhci always produces a lot interrupt activity and there exists USB autosuspend to suspend the USB device after some time of inactivity.

Is it really only after suspend or always if you plug in and use the USB mouse?
Using USB can be really bad for CPUs supporting deep C-states.

Hmm, I cannot change the component, this has nothing to do with cpufreq, more with suspend or better USB.
Comment 9 James Ettle 2009-03-05 14:50:44 UTC
I see a steady flow of interrupt 16 before and after suspend/resume, about 15/sec in both cases. I can't see anything else producing interrupts en-masse.

 16:         40       3707   IO-APIC-fasteoi   uhci_hcd:usb3, mmc0, mmc1, iwlagn, i915@pci:0000:00:02.0

Unplugging/disabling all USB devices does not appear to solve the problem. (Haven't tried reloading uhci driver, it's built into the kernel.)

Here's another screenshot following suspend/resume:

     PowerTOP version 1.10      (C) 2007 Intel Corporation

Cn                Avg residency       P-states (frequencies)
C0 (cpu running)        (55.1%)         2.11 Ghz     0.2%
C0                0.0ms ( 0.0%)         2.10 Ghz     0.0%
C1                0.0ms ( 0.0%)         1.60 Ghz     0.5%
C2                0.0ms (44.1%)          800 Mhz    99.3%
C3                0.0ms ( 0.7%)

Wakeups-from-idle per second : 39692.4  interval: 10.0s
no ACPI power usage estimate available

Top causes for wakeups:
  30.3% ( 38.0)       <interrupt> : acpi 
  12.6% ( 15.8)      <kernel IPI> : Rescheduling interrupts 
  12.3% ( 15.4)       <interrupt> : uhci_hcd:usb3, mmc0, mmc1, i915@pci:0000:00:
  11.0% ( 13.8)       <interrupt> : extra timer interrupt 
   8.0% ( 10.0)           gkrellm : schedule_hrtimeout_range (hrtimer_wakeup) 
   6.7% (  8.4)       <interrupt> : ata_piix

I don't understand this at all --- it would appear to show no significant difference in the sources of wakeups than from the first screenshot I posed in Comment #3 --- yet claims around 200 times more wakeups in total. ...So where are they coming from? I can't see any process hogging CPU...
Comment 10 Thomas Renninger 2009-03-05 15:18:05 UTC
This one looks rather different.
In this case it should be acpi producing that much interrupts.
Hmm, I remember an endless loop for Dells (with docking?).
But you really should see an interrupt source heavily increasing?

Could you change the component, please. Best to ACPI, there are several people looking at ACPI bugs.

It very much looks like an interrupt source goes wild waking up the processor all the time.
If it's acpi and you have ACPI_DEBUG compiled in you could shortly increase debug level:
echo 0x21F > /sys/module/acpi/parameters/debug_level
echo 0xFFFFFFFF > /sys/module/acpi/parameters/debug_layer
sleep 1
echo 0xF > /sys/module/acpi/parameters/debug_level

and have a look which ACPI parts are processed.
Hmm, there are also GPEs exported now somewhere in /sys.
On each acpi interrupt a GPE should increase. Is there one going up very quickly after suspend? You can then guess from the DSDT table to what device (battery/thermal/...) the GPE belongs and at what driver to look at.
Comment 11 ykzhao 2009-03-08 20:25:02 UTC
From the comment #3 and #9 it seems that there exist so many wakes after suspend/resume.
   >Wakeups-from-idle per second : 34595.0  interval: 10.0s
   >Wakeups-from-idle per second : 39692.4
   It is abnormal.
    Will you please confirm whether the acpi interrupt is normal after suspend/resume?
    It will be great if you can attach the output of "grep . /sys/firmware/acpi/interrupts/* " before and after suspend/resume.
    Thanks.
Comment 12 ykzhao 2009-03-08 20:28:03 UTC
Will you please attach the following output?
   acpidump --addr  0x7f6d4e23 --length 0x2A8 -o cpu0ist
   acpidump --addr 0x7f6d50cb --length 0xb1 -o cpu1ist 
   acpidump --addr 0x7f6d48f4 --length 0x4AA -o cpu0cst
   acpidump --addr 0x7f6d4d9e --length 0x85 -o cpu1cst


   Thanks.
           
Comment 13 James Ettle 2009-03-09 00:35:24 UTC
Created attachment 20456 [details]
grep . /sys/firmware/acpi/interrupts/* before suspend/resume
Comment 14 James Ettle 2009-03-09 00:35:36 UTC
Created attachment 20457 [details]
grep . /sys/firmware/acpi/interrupts/* after suspend/resume
Comment 15 James Ettle 2009-03-09 00:35:48 UTC
Created attachment 20458 [details]
cpu0cst
Comment 16 James Ettle 2009-03-09 00:36:01 UTC
Created attachment 20459 [details]
cpu0ist
Comment 17 James Ettle 2009-03-09 00:36:13 UTC
Created attachment 20460 [details]
cpu1cst
Comment 18 James Ettle 2009-03-09 00:36:26 UTC
Created attachment 20461 [details]
cpu1ist
Comment 19 James Ettle 2009-03-09 00:37:40 UTC
The ACPI interrupt seems normal (in /proc/interrupts) after suspend/resume, I can't see any evidence of it machine-gunning.
Comment 20 Thomas Renninger 2009-03-09 05:01:09 UTC
As this seem to be acpi related, offlining the CPUs might not reproduce it:
echo 0 >/sys/devices/system/cpu/cpu1/online
echo 0 >/sys/devices/system/cpu/cpu2/online
...
Still it's an easy test.., can you trigger the misbehaviour offlining the CPUs?
Comment 21 Thomas Renninger 2009-03-09 05:02:29 UTC
Ehh and re-enable them again of course, this is what suspend also does with your CPUs.
Comment 22 James Ettle 2009-03-09 14:42:48 UTC
(In reply to comment #20)
> As this seem to be acpi related, offlining the CPUs might not reproduce it:
> echo 0 >/sys/devices/system/cpu/cpu1/online
> echo 0 >/sys/devices/system/cpu/cpu2/online

I disabled and re-enabled cpu1 (cpu0 has no "online" entry). There was no noticeable increase in the rate of wakeups.
Comment 23 Zhang Rui 2009-03-09 17:56:29 UTC
make sure CONFIG_PM_DEBUG is set,
run "echo core > /sys/power/pm_test" and "echo mem > /sys/power/state".
is the problem reproducible in this way?

get the output of /proc/interrupts both before and after S3, make sure none of these interrupts are increasing rapidly(compared with the behaviour before S3).
Comment 24 James Ettle 2009-03-10 02:02:47 UTC
(In reply to comment #23)
> make sure CONFIG_PM_DEBUG is set,
> run "echo core > /sys/power/pm_test" and "echo mem > /sys/power/state".
> is the problem reproducible in this way?
> 
> get the output of /proc/interrupts both before and after S3, make sure none
> of
> these interrupts are increasing rapidly(compared with the behaviour before
> S3).


OK, I tried this and didn't see the problem. (On the second command the screen changed to a text console, and the hard disc powered down; the machine came back when I touched the touchpad.) No interrupts are increasing rapidly (however, I don't see that even with the problem); wakeups is around 200--300 (as it should be, not the 40,000), and the processor seems to be mostly in C3.
Comment 25 Zhang Rui 2009-03-10 22:54:56 UTC
Created attachment 20487 [details]
debug patch

please try this debug patch and see if the problem still exist.
Comment 26 James Ettle 2009-03-11 15:35:34 UTC
(In reply to comment #25)
> Created an attachment (id=20487) [details]
> debug patch
> 
> please try this debug patch and see if the problem still exist.

Tried the patch, applied to Fedora 2.6.29-rc7-git3 kernel. The problem still exists, basically the same powertop stats as before.
Comment 27 Zhang Rui 2009-03-17 01:21:14 UTC
please unload uhci-hcd/ehci-hcd/ohci-hcd before suspend,
does the problem still exist after resume?
does the problem still exist with uhci-hcd/ehci-hcd/ohci-hcd reloaded after resume?
Comment 28 James Ettle 2009-03-20 01:19:24 UTC
(In reply to comment #27)
> please unload uhci-hcd/ehci-hcd/ohci-hcd before suspend,
> does the problem still exist after resume?

Yes.

> does the problem still exist with uhci-hcd/ehci-hcd/ohci-hcd reloaded after
> resume?

Yes.
Comment 29 ykzhao 2009-03-23 01:30:03 UTC
Hi, James
    Will you please add the boot option of "idle=nomwait" on the latest kernel and see whether this issue still exists after resume?
    Thanks.
Comment 30 Shaohua 2009-03-25 06:29:55 UTC
did you try windows in the system?
Comment 31 James Ettle 2009-03-25 07:59:20 UTC
(In reply to comment #30)
> did you try windows in the system?

No, I don't have a Windows license for this system.
Comment 32 James Ettle 2009-03-26 09:22:20 UTC
(In reply to comment #29)
> Hi, James
>     Will you please add the boot option of "idle=nomwait" on the latest
>     kernel
> and see whether this issue still exists after resume?
>     Thanks.

I tried that option on kernel-2.6.29-3.fc10.x86_64. In this case, the system took quite a bit longer to resume after pressing the power button and was also *very* sluggish. Here's a powertop screenshot:


     PowerTOP version 1.10      (C) 2007 Intel Corporation

Cn                Avg residency       P-states (frequencies)
C0 (cpu running)        (52.9%)         2.11 Ghz    40.4%
C0                0.0ms ( 0.0%)         2.10 Ghz     2.2%
C1 halt           0.0ms ( 0.0%)         1200 Mhz     1.4%
C2                0.1ms ( 0.0%)          800 Mhz    55.1%
C3                0.6ms (47.1%)

Wakeups-from-idle per second : 847.9    interval: 10.0s
no ACPI power usage estimate available

Top causes for wakeups:
  27.3% ( 81.6)      <kernel IPI> : Rescheduling interrupts 
  16.6% ( 49.6)       <interrupt> : acpi 
  14.0% ( 41.9)   USB device  6-1 : USB Optical Mouse (Logitech) 
  13.2% ( 39.4)       <interrupt> : uhci_hcd:usb6, firewire_ohci, eth0 
   3.9% ( 11.7)       <interrupt> : ata_piix 
   3.9% ( 11.6)       <interrupt> : uhci_hcd:usb3, mmc0, mmc1, i915@pci:0000:00:
02.0, iwlagn 
Suggestion: Disable the unused bluetooth interface with the following command:
  hciconfig hci0 down ; rmmod hci_usb
Bluetooth is a radio and consumes quite some power, and keeps USB busy as well.
 Q - Quit   R - Refresh   B - Turn Bluetooth off 


So it looks like the number of wakeups is reduced (I'll run it again later to confirm this), but the C0 residence is still too high for idle and C1 is marked as halt for some reason (is this the reason for the slowness?).

Without idle=nomwait, things are as before.
Comment 33 ykzhao 2009-03-27 02:57:57 UTC
Hi, James
    Thanks for the test. It seems that the wakeups per second become normal after resuming if the boot option of "idle=nomwait" is used.
    But the powertop reports that C0 usage is very high. And the CPU will stay in P0-state for about 40.4%.
    Will you please confirm whether the CPU work load is high after resuming?

    Thanks.
Comment 34 James Ettle 2009-03-29 15:17:08 UTC
(In reply to comment #33)
> Hi, James
>     Thanks for the test. It seems that the wakeups per second become normal
> after resuming if the boot option of "idle=nomwait" is used.
>     But the powertop reports that C0 usage is very high. And the CPU will
>     stay
> in P0-state for about 40.4%.
>     Will you please confirm whether the CPU work load is high after resuming?

Yes, in all cases I've closes all unnecessary applications to make sure of this. Powertop readings were taken at an idle desktop, CPU usage at the 2-3% mark. (In some cases, there were more applications open *before* suspend/resume, with a lower C0 occupancy.)
Comment 35 ykzhao 2009-03-30 05:45:07 UTC
Hi, James
    How about this issue if only one cpu is used? (by adding the boot option of "maxcpus=1")
    Thanks.
Comment 36 James Ettle 2009-03-30 16:24:33 UTC
(In reply to comment #35)
> Hi, James
>     How about this issue if only one cpu is used? (by adding the boot option
>     of
> "maxcpus=1")
>     Thanks.

The excessive wakeups still exist. I've also tried disabling hrtimer, which had no effect.
Comment 37 James Ettle 2009-03-31 17:39:02 UTC
Additional info: this seems to be only a sleep-to-RAM problem, things seem OK when I resume from hibernation.
Comment 38 James Ettle 2009-04-03 10:34:51 UTC
This seems to be a Tickless issue. With nohz=off, I don't seen an excessive change in wakeups (roughly 1000 both before and after suspend/resume).
Comment 39 Zhang Rui 2009-05-25 08:22:49 UTC
hmmm, does the problem still exist in the latest git kernel, say 2.6.30-rc7?
Comment 40 James Ettle 2009-05-26 20:00:31 UTC
No improvement with kernel-2.6.30-0.91.rc7.git1.fc12.x86_64.
Comment 41 James Ettle 2009-08-13 21:38:41 UTC
No improvement with kernel 2.6.30.4, still ~40,000 wups after resume. Still none the wiser as to where they are coming from...
Comment 42 ykzhao 2009-11-18 07:20:39 UTC
Hi, James
   Will you please try the latest kernel(2.6.32-rc7) and see whether the issue still exists?
   Thanks.
Comment 43 James Ettle 2009-12-04 19:18:55 UTC
No improvement with 2.6.32 (kernel-2.6.32-1.fc13.x86_64).
Comment 44 Thomas Renninger 2009-12-07 11:58:55 UTC
Quick summary:
If you use idle=nomwait
the irq activity drops from 40000 to 800 which you say is high, but this looks pretty normal if you still have some things running. So the root cause for the IRQ storm seem to be C-states triggered through mwait.

You should also look out for BIOS updates from time to time...
Can you add "idle=nomwait helps" to the title (I searched but couldn't find how to do that -> no permission?).

As this only happens after a suspend/resume, I expect we or the BIOS forgot to re-program a chip properly or do something wrongly (or this is a bug in the CPU itself).

Possibly this is APIC related and I had a look at the the suspend/resume code there.
What I wonder is: Why is the code there only executed on one CPU while there is one local APIC per CPU, it even seem to be luck on which CPU the code gets executed (or it's ensured to be run on CPU0, because other CPUs are offlined at that time, not sure).

Could you try out attached patch, please.
Please give it two test runs, always with apic=verbose boot param added (and without passing idle=nomwait):
  - with default/normal params
  - additionally with disable_apic_suspend_hooks boot param
Is the IRQ storm gone after resume with one of above tries?
Thanks.
Comment 45 Thomas Renninger 2009-12-07 12:00:42 UTC
Created attachment 24074 [details]
Debug patch: Cleanup ups and enhancements for apic suspend routines
Comment 46 Thomas Renninger 2009-12-07 12:04:31 UTC
And please attach /proc/interrupts if it still does not work (with some thousand irqs logged in there).
Comment 47 James Ettle 2009-12-07 12:21:15 UTC
(In reply to comment #44)
> Can you add "idle=nomwait helps" to the title (I searched but couldn't find
> how
> to do that -> no permission?).

Done. (Although I wouldn't say idle=nomwait "helps"; it has undesirable side-effects.)

> As this only happens after a suspend/resume, I expect we or the BIOS forgot
> to
> re-program a chip properly or do something wrongly (or this is a bug in the
> CPU
> itself).

(Just to re-iterate: the bug doesn't happen with nohz=off.) Thanks for the debug patch; I'll try it as soon as I can.
Comment 48 James Ettle 2009-12-09 21:36:21 UTC
OK, I tried the patch with apic=verbose. Without disable_apic_suspend_hooks, it didn't suspend properly (hard disc turned off, then it froze); with disable_apic_suspend_hooks, it suspended but didn't resume (blank screen). I'll try with again with different (vanilla) kernel later.
Comment 49 James Ettle 2009-12-10 22:42:09 UTC
(Updated summary to reflect that passing processor.max_cstate=1 seems to help.)
Comment 50 Thomas Renninger 2009-12-11 10:16:03 UTC
Thanks for the test. I don't know enough to go on further, having a lapic suspend function for one CPU only looked bogus and I hoped the machine behaved better without any or with executing this on each core.
Could you still attach /proc/interrupts if things go wrong if you have time for an extra boot. The idea is that we might see that the irqs are coming from the thermal vector which gets some extra handling in the suspend functions or some similar hint.
Comment 51 ykzhao 2010-08-31 00:40:49 UTC
HI, James
    Will you please try the latest kernel and see whether the issue still an be reproduced?

Thanks.
Comment 52 Zhang Rui 2010-09-03 02:12:18 UTC
ping James
Comment 53 James Ettle 2010-09-03 05:55:18 UTC
(In reply to comment #52)
> ping James

Don't worry, I've not forgotten, just been rather busy of late --- I'll get on to testing with the latest kernel as soon as I can.
Comment 54 Zhang Rui 2010-09-27 00:23:57 UTC
any update? :)
Comment 55 James Ettle 2010-09-27 09:30:09 UTC
Sorry about the delay. I've just tried with a build of kernel 2.6.35.4 (OK, not the absolute latest...) and the problem still remains, albeit with a reduced number of stealth wakeups. The cmdline for this kernel is:

 ro root=/dev/mapper/vg_rhapsody-lv_root rd_LVM_LV=vg_rhapsody/lv_root rd_LVM_LV=vg_rhapsody/lv_swap rd_NO_LUKS rd_NO_MD rd_NO_DM LANG=en_GB.UTF-8 KEYBOARDTYPE=pc KEYTABLE=uk rhgb quiet processor.ignore_tpc=1

Note that without the max_cstate=1, this kernel also suffers from the "pausing" problem whereby I have to keep pressing keys/moving the mouse/otherwise generate interrupts otherwise the machine just stops.

powertop before suspend/resume:


Cn                Avg residency       P-states (frequencies)
C0 (cpu running)        ( 5.7%)         2.11 Ghz     3.6%
C0                0.0ms ( 0.0%)         2.10 Ghz     0.1%
C1 mwait          0.3ms ( 0.0%)         1.60 Ghz     0.5%
C2 mwait          0.2ms ( 0.5%)         1200 Mhz     0.1%
C3 mwait          3.7ms (93.8%)          800 Mhz    95.7%

Wakeups-from-idle per second : 284.8    interval: 5.0s
no ACPI power usage estimate available

Top causes for wakeups:
  51.5% (193.6)     <kernel core> : hrtimer_start_range_ns (tick_sched_timer) 
  11.0% ( 41.4)       <interrupt> : extra timer interrupt 
   5.7% ( 21.6)       packagekitd : hrtimer_start_range_ns (hrtimer_wakeup) 
   4.2% ( 15.8)       <interrupt> : acpi 
   3.9% ( 14.6)       <interrupt> : ata_piix 
   3.6% ( 13.4)     <kernel core> : hrtimer_start (tick_sched_timer) 


After:


Cn                Avg residency       P-states (frequencies)
C0 (cpu running)        (20.4%)         2.11 Ghz     1.6%
C0                0.1ms ( 0.0%)         2.10 Ghz     0.4%
C1 mwait          6.3ms (32.0%)         1.60 Ghz     0.5%
C2 mwait          0.0ms (32.3%)         1200 Mhz     0.0%
C3 mwait          0.3ms (15.3%)          800 Mhz    97.5%

Wakeups-from-idle per second : 15059.6  interval: 10.0s
no ACPI power usage estimate available

Top causes for wakeups:
  30.2% ( 64.2)     <kernel core> : hrtimer_start_range_ns (tick_sched_timer) 
  15.1% ( 32.1)       <interrupt> : acpi 
  11.2% ( 23.9)       <interrupt> : iwlagn 
   7.1% ( 15.2)     <kernel core> : hrtimer_start (tick_sched_timer) 
   6.3% ( 13.5)       <interrupt> : ata_piix 
   5.0% ( 10.6)           gkrellm : hrtimer_start_range_ns (hrtimer_wakeup)
Comment 56 ykzhao 2010-11-02 07:33:34 UTC
HI, James
     Sorry for the late response. It seems that the issue still exists. In fact from the powertop data it seems that there is no change about interrupt number. But it is very weird that the wakeup event is increased very significantly.

Will you please get the output of lspci -vxxx before and after the suspend/resume?


thanks.
    Yakui
Comment 57 James Ettle 2010-11-05 18:02:09 UTC
Yakui, have the data you requested. Tried it on a 2.6.36 kernel with no cstate limit; core issue remains, around 300 wups before, 10000 after. Have attached dmesgs and lspcis below.(In reply to comment #56)

> But it is very weird that the wakeup event is increased very significantly.

<wildguess> Rogue SMIs? </wildguess>
Comment 58 James Ettle 2010-11-05 18:02:40 UTC
Created attachment 36222 [details]
dmesg on 2.6.36 before suspend
Comment 59 James Ettle 2010-11-05 18:03:04 UTC
Created attachment 36232 [details]
dmesg on 2.6.36 after resume
Comment 60 James Ettle 2010-11-05 18:03:36 UTC
Created attachment 36242 [details]
lspci -vxxx on 2.6.36 before suspend
Comment 61 James Ettle 2010-11-05 18:04:10 UTC
Created attachment 36252 [details]
lspci -vxxx on 2.6.36 after resume
Comment 62 Vitus Jensen 2010-11-07 11:54:02 UTC
I do have a similar problem with tickless kernel but in a totally different machine: single Pentium M, acpi_idle.  See https://bugzilla.kernel.org/show_bug.cgi?id=22332 if interested.
Comment 63 James Ettle 2010-11-10 08:19:03 UTC
Created attachment 37022 [details]
dmesg when booted with acpi_skip_timer_override

Some more info: If I boot with acpi_skip_timer_override (no cstate restriction), powertop shows ~2500 wups before suspend/resume, and around 6000 wups afterwards. One difference is the default clocksource gets set to hpet rather than tsc, and tsc is no longer available. dmesg attached.
Comment 64 James Ettle 2010-11-10 09:01:56 UTC
Strange... in the 40 minutes or so since I last posted, wakeups seem to have calmed to around 2500-4000 per second. No idea what's changed in that time...
Comment 65 James Ettle 2010-11-10 11:22:59 UTC
(In reply to comment #63)
> One difference is the default clocksource gets set to hpet
> rather than tsc, and tsc is no longer available. dmesg attached.

OK, I now understand this is to be expected since I unblocked C2 and C3, which halt the TSC...
Comment 66 ykzhao 2010-12-06 03:30:14 UTC
Hi, James
    Sorry for the late response about the issue. It is very strange that the number of wakeup is increased very significantly after suspend/resume when the mwait is used to enter the idle state.
    From the section 8.10.4 of Intel Software Developer Manual, it seems that it will exit the idle state under the following cases:
     a. write to the monitored address configured by Monitor instruction(In the code the current_thread_info->flags is the corresponding monitor address)
     b. External interrupts, including NMI, SMI, INIT, BINIT, MCERR, A20M#
     c. Faults, Aborts (including Machine Check)
     d. Architectural TLB invalidations including writes to CR0, CR3, CR4 and certain MSR writes

     But from the powertop info it seems that the number of interrupts/timer event is farther less than the number of wakeup. Not sure why the number of wakeup is affected so significantly by mwait after suspend/resume.

     Please use the boot option of "idle=nomwait" to workaround this issue. 

Thanks.
    Yakui
Comment 67 Zhang Rui 2012-01-18 01:41:36 UTC
It's great that kernel bugzilla is back.

James,
can you please verify if the problem still exists in the latest upstream kernel?
Comment 68 James Ettle 2012-01-19 12:09:16 UTC
(In reply to comment #67)
> can you please verify if the problem still exists in the latest upstream
> kernel?

Using kernel-3.2.1-4.jhe.fc16.x86_64 (built from Fedora sources), with the following command-line options I get:

  processor.max_cstate=1  [what I normally use]
                          - No odd pauses (as in Bug 12118)
                          - No excessive wakeups

  idle=mwait              as processor.max_cstate=1

  idle=nomwait            - Get the pause phenomenon, need to keep moving
                            touchpad in order to get anywhere
                          - Machine does not resume from suspend properly
                            (screen doesn't turn on, halts, no SysRq response)

  <none of the above>     - No excessive wakeups, nice low wups both before
                            and after
                          - Have the annoying pause problem

So (after a very brief test --- so beware!) I think this particular issue may have cleared up on my hardware.
Comment 69 Alan 2012-05-30 14:19:10 UTC
Thanks, closing

Note You need to log in before you can comment on or make changes to this bug.