Bug 12788
Description
James Ettle
2009-02-27 12:47:00 UTC
Created attachment 20384 [details]
dmidecode for the machine
Created attachment 20385 [details]
acpidump for the notebook
PowerTOP screenshot BEFORE suspend/resume at desktop idle: PowerTOP version 1.10 (C) 2007 Intel Corporation Cn Avg residency P-states (frequencies) C0 (cpu running) ( 3.8%) 2.11 Ghz 3.2% C0 0.0ms ( 0.0%) 2.10 Ghz 0.1% C1 0.0ms ( 0.0%) 1200 Mhz 0.2% C2 0.0ms ( 0.0%) 800 Mhz 96.5% C3 5.3ms (96.2%) Wakeups-from-idle per second : 183.8 interval: 10.0s no ACPI power usage estimate available Top causes for wakeups: 22.7% ( 40.7) firefox : futex_wait (hrtimer_wakeup) 17.1% ( 30.7) <kernel IPI> : Rescheduling interrupts 14.8% ( 26.6) <interrupt> : acpi 7.7% ( 13.8) <interrupt> : extra timer interrupt 6.7% ( 12.0) work_on_cpu/0 : mod_timer (rh_timer_func) 6.5% ( 11.7) <interrupt> : uhci_hcd:usb3, mmc0, mmc1, iwlagn, i915@pci: 0000:00:02.0 Suggestion: Disable 'hal' from polling your cdrom with: hal-disable-polling --device /dev/scd0 'hal' is the component that auto-opens a window if you plug in a CD but disables SATA power saving from kicking in. Q - Quit R - Refresh K - kill hald-addon-storage ... and AFTER suspend/resume, under the same conditions: PowerTOP version 1.10 (C) 2007 Intel Corporation Cn Avg residency P-states (frequencies) C0 (cpu running) (53.6%) 2.11 Ghz 3.4% C0 0.0ms ( 0.0%) 2.10 Ghz 0.3% C1 0.0ms ( 0.0%) 1.60 Ghz 0.4% C2 0.0ms (40.3%) 800 Mhz 95.9% C3 0.1ms ( 6.1%) Wakeups-from-idle per second : 34595.0 interval: 10.0s no ACPI power usage estimate available Top causes for wakeups: 16.9% ( 61.2) USB device 6-1 : USB Optical Mouse (Logitech) 16.8% ( 61.1) <interrupt> : uhci_hcd:usb6, firewire_ohci, eth0 15.9% ( 57.7) <kernel IPI> : Rescheduling interrupts 10.4% ( 37.8) <interrupt> : acpi 8.9% ( 32.2) firefox : futex_wait (hrtimer_wakeup) 5.9% ( 21.5) <interrupt> : extra timer interrupt Suggestion: Enable USB autosuspend by pressing the U key or adding usbcore.autosuspend=1 to the kernel command line in the grub config Q - Quit R - Refresh U - Enable USB suspend > Earliest failing kernel version: 2.6.27.5-32.fc9.x86_64 (and earlier)
So.. you're saying that no kernel version has ever worked correctly
for you?
I'm not sure. I've noticed it on *at least* those kernels, since those were the ones which coincided with other fixes in Fedora that addressed suspend/resume issues outside the kernel, and I'd therefore started using it. Created attachment 20386 [details]
dmesg output with cpufreq.debug=7 cmdline option before suspend/resume
Created attachment 20387 [details]
Extra dmesg output with cpufreq.debug=7 cmdline option after suspend/resume
Looks like it's your mouse (C-states do not have to do anything with cpufreq, the P-states are controlled by cpufreq): Top causes for wakeups: 16.9% ( 61.2) USB device 6-1 : USB Optical Mouse (Logitech) 16.8% ( 61.1) <interrupt> : uhci_hcd:usb6, firewire_ohci, eth0 I don't know much about USB, but you could already try to unplug the mouse, rmmod uhci_hcd, modprobe it again, replug the mouse, does it help? Is it better without mouse? It's the interrupt activity of uhci/mouse interrupts: watch -n1 cat /proc/interrupts and look out to calme down the uhci_hcd interrupts. AFAIK uhci always produces a lot interrupt activity and there exists USB autosuspend to suspend the USB device after some time of inactivity. Is it really only after suspend or always if you plug in and use the USB mouse? Using USB can be really bad for CPUs supporting deep C-states. Hmm, I cannot change the component, this has nothing to do with cpufreq, more with suspend or better USB. I see a steady flow of interrupt 16 before and after suspend/resume, about 15/sec in both cases. I can't see anything else producing interrupts en-masse. 16: 40 3707 IO-APIC-fasteoi uhci_hcd:usb3, mmc0, mmc1, iwlagn, i915@pci:0000:00:02.0 Unplugging/disabling all USB devices does not appear to solve the problem. (Haven't tried reloading uhci driver, it's built into the kernel.) Here's another screenshot following suspend/resume: PowerTOP version 1.10 (C) 2007 Intel Corporation Cn Avg residency P-states (frequencies) C0 (cpu running) (55.1%) 2.11 Ghz 0.2% C0 0.0ms ( 0.0%) 2.10 Ghz 0.0% C1 0.0ms ( 0.0%) 1.60 Ghz 0.5% C2 0.0ms (44.1%) 800 Mhz 99.3% C3 0.0ms ( 0.7%) Wakeups-from-idle per second : 39692.4 interval: 10.0s no ACPI power usage estimate available Top causes for wakeups: 30.3% ( 38.0) <interrupt> : acpi 12.6% ( 15.8) <kernel IPI> : Rescheduling interrupts 12.3% ( 15.4) <interrupt> : uhci_hcd:usb3, mmc0, mmc1, i915@pci:0000:00: 11.0% ( 13.8) <interrupt> : extra timer interrupt 8.0% ( 10.0) gkrellm : schedule_hrtimeout_range (hrtimer_wakeup) 6.7% ( 8.4) <interrupt> : ata_piix I don't understand this at all --- it would appear to show no significant difference in the sources of wakeups than from the first screenshot I posed in Comment #3 --- yet claims around 200 times more wakeups in total. ...So where are they coming from? I can't see any process hogging CPU... This one looks rather different. In this case it should be acpi producing that much interrupts. Hmm, I remember an endless loop for Dells (with docking?). But you really should see an interrupt source heavily increasing? Could you change the component, please. Best to ACPI, there are several people looking at ACPI bugs. It very much looks like an interrupt source goes wild waking up the processor all the time. If it's acpi and you have ACPI_DEBUG compiled in you could shortly increase debug level: echo 0x21F > /sys/module/acpi/parameters/debug_level echo 0xFFFFFFFF > /sys/module/acpi/parameters/debug_layer sleep 1 echo 0xF > /sys/module/acpi/parameters/debug_level and have a look which ACPI parts are processed. Hmm, there are also GPEs exported now somewhere in /sys. On each acpi interrupt a GPE should increase. Is there one going up very quickly after suspend? You can then guess from the DSDT table to what device (battery/thermal/...) the GPE belongs and at what driver to look at. From the comment #3 and #9 it seems that there exist so many wakes after suspend/resume. >Wakeups-from-idle per second : 34595.0 interval: 10.0s >Wakeups-from-idle per second : 39692.4 It is abnormal. Will you please confirm whether the acpi interrupt is normal after suspend/resume? It will be great if you can attach the output of "grep . /sys/firmware/acpi/interrupts/* " before and after suspend/resume. Thanks. Will you please attach the following output? acpidump --addr 0x7f6d4e23 --length 0x2A8 -o cpu0ist acpidump --addr 0x7f6d50cb --length 0xb1 -o cpu1ist acpidump --addr 0x7f6d48f4 --length 0x4AA -o cpu0cst acpidump --addr 0x7f6d4d9e --length 0x85 -o cpu1cst Thanks. Created attachment 20456 [details]
grep . /sys/firmware/acpi/interrupts/* before suspend/resume
Created attachment 20457 [details]
grep . /sys/firmware/acpi/interrupts/* after suspend/resume
Created attachment 20458 [details]
cpu0cst
Created attachment 20459 [details]
cpu0ist
Created attachment 20460 [details]
cpu1cst
Created attachment 20461 [details]
cpu1ist
The ACPI interrupt seems normal (in /proc/interrupts) after suspend/resume, I can't see any evidence of it machine-gunning. As this seem to be acpi related, offlining the CPUs might not reproduce it: echo 0 >/sys/devices/system/cpu/cpu1/online echo 0 >/sys/devices/system/cpu/cpu2/online ... Still it's an easy test.., can you trigger the misbehaviour offlining the CPUs? Ehh and re-enable them again of course, this is what suspend also does with your CPUs. (In reply to comment #20) > As this seem to be acpi related, offlining the CPUs might not reproduce it: > echo 0 >/sys/devices/system/cpu/cpu1/online > echo 0 >/sys/devices/system/cpu/cpu2/online I disabled and re-enabled cpu1 (cpu0 has no "online" entry). There was no noticeable increase in the rate of wakeups. make sure CONFIG_PM_DEBUG is set, run "echo core > /sys/power/pm_test" and "echo mem > /sys/power/state". is the problem reproducible in this way? get the output of /proc/interrupts both before and after S3, make sure none of these interrupts are increasing rapidly(compared with the behaviour before S3). (In reply to comment #23) > make sure CONFIG_PM_DEBUG is set, > run "echo core > /sys/power/pm_test" and "echo mem > /sys/power/state". > is the problem reproducible in this way? > > get the output of /proc/interrupts both before and after S3, make sure none > of > these interrupts are increasing rapidly(compared with the behaviour before > S3). OK, I tried this and didn't see the problem. (On the second command the screen changed to a text console, and the hard disc powered down; the machine came back when I touched the touchpad.) No interrupts are increasing rapidly (however, I don't see that even with the problem); wakeups is around 200--300 (as it should be, not the 40,000), and the processor seems to be mostly in C3. Created attachment 20487 [details]
debug patch
please try this debug patch and see if the problem still exist.
(In reply to comment #25) > Created an attachment (id=20487) [details] > debug patch > > please try this debug patch and see if the problem still exist. Tried the patch, applied to Fedora 2.6.29-rc7-git3 kernel. The problem still exists, basically the same powertop stats as before. please unload uhci-hcd/ehci-hcd/ohci-hcd before suspend, does the problem still exist after resume? does the problem still exist with uhci-hcd/ehci-hcd/ohci-hcd reloaded after resume? (In reply to comment #27) > please unload uhci-hcd/ehci-hcd/ohci-hcd before suspend, > does the problem still exist after resume? Yes. > does the problem still exist with uhci-hcd/ehci-hcd/ohci-hcd reloaded after > resume? Yes. Hi, James Will you please add the boot option of "idle=nomwait" on the latest kernel and see whether this issue still exists after resume? Thanks. did you try windows in the system? (In reply to comment #30) > did you try windows in the system? No, I don't have a Windows license for this system. (In reply to comment #29) > Hi, James > Will you please add the boot option of "idle=nomwait" on the latest > kernel > and see whether this issue still exists after resume? > Thanks. I tried that option on kernel-2.6.29-3.fc10.x86_64. In this case, the system took quite a bit longer to resume after pressing the power button and was also *very* sluggish. Here's a powertop screenshot: PowerTOP version 1.10 (C) 2007 Intel Corporation Cn Avg residency P-states (frequencies) C0 (cpu running) (52.9%) 2.11 Ghz 40.4% C0 0.0ms ( 0.0%) 2.10 Ghz 2.2% C1 halt 0.0ms ( 0.0%) 1200 Mhz 1.4% C2 0.1ms ( 0.0%) 800 Mhz 55.1% C3 0.6ms (47.1%) Wakeups-from-idle per second : 847.9 interval: 10.0s no ACPI power usage estimate available Top causes for wakeups: 27.3% ( 81.6) <kernel IPI> : Rescheduling interrupts 16.6% ( 49.6) <interrupt> : acpi 14.0% ( 41.9) USB device 6-1 : USB Optical Mouse (Logitech) 13.2% ( 39.4) <interrupt> : uhci_hcd:usb6, firewire_ohci, eth0 3.9% ( 11.7) <interrupt> : ata_piix 3.9% ( 11.6) <interrupt> : uhci_hcd:usb3, mmc0, mmc1, i915@pci:0000:00: 02.0, iwlagn Suggestion: Disable the unused bluetooth interface with the following command: hciconfig hci0 down ; rmmod hci_usb Bluetooth is a radio and consumes quite some power, and keeps USB busy as well. Q - Quit R - Refresh B - Turn Bluetooth off So it looks like the number of wakeups is reduced (I'll run it again later to confirm this), but the C0 residence is still too high for idle and C1 is marked as halt for some reason (is this the reason for the slowness?). Without idle=nomwait, things are as before. Hi, James Thanks for the test. It seems that the wakeups per second become normal after resuming if the boot option of "idle=nomwait" is used. But the powertop reports that C0 usage is very high. And the CPU will stay in P0-state for about 40.4%. Will you please confirm whether the CPU work load is high after resuming? Thanks. (In reply to comment #33) > Hi, James > Thanks for the test. It seems that the wakeups per second become normal > after resuming if the boot option of "idle=nomwait" is used. > But the powertop reports that C0 usage is very high. And the CPU will > stay > in P0-state for about 40.4%. > Will you please confirm whether the CPU work load is high after resuming? Yes, in all cases I've closes all unnecessary applications to make sure of this. Powertop readings were taken at an idle desktop, CPU usage at the 2-3% mark. (In some cases, there were more applications open *before* suspend/resume, with a lower C0 occupancy.) Hi, James How about this issue if only one cpu is used? (by adding the boot option of "maxcpus=1") Thanks. (In reply to comment #35) > Hi, James > How about this issue if only one cpu is used? (by adding the boot option > of > "maxcpus=1") > Thanks. The excessive wakeups still exist. I've also tried disabling hrtimer, which had no effect. Additional info: this seems to be only a sleep-to-RAM problem, things seem OK when I resume from hibernation. This seems to be a Tickless issue. With nohz=off, I don't seen an excessive change in wakeups (roughly 1000 both before and after suspend/resume). hmmm, does the problem still exist in the latest git kernel, say 2.6.30-rc7? No improvement with kernel-2.6.30-0.91.rc7.git1.fc12.x86_64. No improvement with kernel 2.6.30.4, still ~40,000 wups after resume. Still none the wiser as to where they are coming from... Hi, James Will you please try the latest kernel(2.6.32-rc7) and see whether the issue still exists? Thanks. No improvement with 2.6.32 (kernel-2.6.32-1.fc13.x86_64). Quick summary: If you use idle=nomwait the irq activity drops from 40000 to 800 which you say is high, but this looks pretty normal if you still have some things running. So the root cause for the IRQ storm seem to be C-states triggered through mwait. You should also look out for BIOS updates from time to time... Can you add "idle=nomwait helps" to the title (I searched but couldn't find how to do that -> no permission?). As this only happens after a suspend/resume, I expect we or the BIOS forgot to re-program a chip properly or do something wrongly (or this is a bug in the CPU itself). Possibly this is APIC related and I had a look at the the suspend/resume code there. What I wonder is: Why is the code there only executed on one CPU while there is one local APIC per CPU, it even seem to be luck on which CPU the code gets executed (or it's ensured to be run on CPU0, because other CPUs are offlined at that time, not sure). Could you try out attached patch, please. Please give it two test runs, always with apic=verbose boot param added (and without passing idle=nomwait): - with default/normal params - additionally with disable_apic_suspend_hooks boot param Is the IRQ storm gone after resume with one of above tries? Thanks. Created attachment 24074 [details]
Debug patch: Cleanup ups and enhancements for apic suspend routines
And please attach /proc/interrupts if it still does not work (with some thousand irqs logged in there). (In reply to comment #44) > Can you add "idle=nomwait helps" to the title (I searched but couldn't find > how > to do that -> no permission?). Done. (Although I wouldn't say idle=nomwait "helps"; it has undesirable side-effects.) > As this only happens after a suspend/resume, I expect we or the BIOS forgot > to > re-program a chip properly or do something wrongly (or this is a bug in the > CPU > itself). (Just to re-iterate: the bug doesn't happen with nohz=off.) Thanks for the debug patch; I'll try it as soon as I can. OK, I tried the patch with apic=verbose. Without disable_apic_suspend_hooks, it didn't suspend properly (hard disc turned off, then it froze); with disable_apic_suspend_hooks, it suspended but didn't resume (blank screen). I'll try with again with different (vanilla) kernel later. (Updated summary to reflect that passing processor.max_cstate=1 seems to help.) Thanks for the test. I don't know enough to go on further, having a lapic suspend function for one CPU only looked bogus and I hoped the machine behaved better without any or with executing this on each core. Could you still attach /proc/interrupts if things go wrong if you have time for an extra boot. The idea is that we might see that the irqs are coming from the thermal vector which gets some extra handling in the suspend functions or some similar hint. HI, James Will you please try the latest kernel and see whether the issue still an be reproduced? Thanks. ping James (In reply to comment #52) > ping James Don't worry, I've not forgotten, just been rather busy of late --- I'll get on to testing with the latest kernel as soon as I can. any update? :) Sorry about the delay. I've just tried with a build of kernel 2.6.35.4 (OK, not the absolute latest...) and the problem still remains, albeit with a reduced number of stealth wakeups. The cmdline for this kernel is: ro root=/dev/mapper/vg_rhapsody-lv_root rd_LVM_LV=vg_rhapsody/lv_root rd_LVM_LV=vg_rhapsody/lv_swap rd_NO_LUKS rd_NO_MD rd_NO_DM LANG=en_GB.UTF-8 KEYBOARDTYPE=pc KEYTABLE=uk rhgb quiet processor.ignore_tpc=1 Note that without the max_cstate=1, this kernel also suffers from the "pausing" problem whereby I have to keep pressing keys/moving the mouse/otherwise generate interrupts otherwise the machine just stops. powertop before suspend/resume: Cn Avg residency P-states (frequencies) C0 (cpu running) ( 5.7%) 2.11 Ghz 3.6% C0 0.0ms ( 0.0%) 2.10 Ghz 0.1% C1 mwait 0.3ms ( 0.0%) 1.60 Ghz 0.5% C2 mwait 0.2ms ( 0.5%) 1200 Mhz 0.1% C3 mwait 3.7ms (93.8%) 800 Mhz 95.7% Wakeups-from-idle per second : 284.8 interval: 5.0s no ACPI power usage estimate available Top causes for wakeups: 51.5% (193.6) <kernel core> : hrtimer_start_range_ns (tick_sched_timer) 11.0% ( 41.4) <interrupt> : extra timer interrupt 5.7% ( 21.6) packagekitd : hrtimer_start_range_ns (hrtimer_wakeup) 4.2% ( 15.8) <interrupt> : acpi 3.9% ( 14.6) <interrupt> : ata_piix 3.6% ( 13.4) <kernel core> : hrtimer_start (tick_sched_timer) After: Cn Avg residency P-states (frequencies) C0 (cpu running) (20.4%) 2.11 Ghz 1.6% C0 0.1ms ( 0.0%) 2.10 Ghz 0.4% C1 mwait 6.3ms (32.0%) 1.60 Ghz 0.5% C2 mwait 0.0ms (32.3%) 1200 Mhz 0.0% C3 mwait 0.3ms (15.3%) 800 Mhz 97.5% Wakeups-from-idle per second : 15059.6 interval: 10.0s no ACPI power usage estimate available Top causes for wakeups: 30.2% ( 64.2) <kernel core> : hrtimer_start_range_ns (tick_sched_timer) 15.1% ( 32.1) <interrupt> : acpi 11.2% ( 23.9) <interrupt> : iwlagn 7.1% ( 15.2) <kernel core> : hrtimer_start (tick_sched_timer) 6.3% ( 13.5) <interrupt> : ata_piix 5.0% ( 10.6) gkrellm : hrtimer_start_range_ns (hrtimer_wakeup) HI, James Sorry for the late response. It seems that the issue still exists. In fact from the powertop data it seems that there is no change about interrupt number. But it is very weird that the wakeup event is increased very significantly. Will you please get the output of lspci -vxxx before and after the suspend/resume? thanks. Yakui Yakui, have the data you requested. Tried it on a 2.6.36 kernel with no cstate limit; core issue remains, around 300 wups before, 10000 after. Have attached dmesgs and lspcis below.(In reply to comment #56) > But it is very weird that the wakeup event is increased very significantly. <wildguess> Rogue SMIs? </wildguess> Created attachment 36222 [details]
dmesg on 2.6.36 before suspend
Created attachment 36232 [details]
dmesg on 2.6.36 after resume
Created attachment 36242 [details]
lspci -vxxx on 2.6.36 before suspend
Created attachment 36252 [details]
lspci -vxxx on 2.6.36 after resume
I do have a similar problem with tickless kernel but in a totally different machine: single Pentium M, acpi_idle. See https://bugzilla.kernel.org/show_bug.cgi?id=22332 if interested. Created attachment 37022 [details]
dmesg when booted with acpi_skip_timer_override
Some more info: If I boot with acpi_skip_timer_override (no cstate restriction), powertop shows ~2500 wups before suspend/resume, and around 6000 wups afterwards. One difference is the default clocksource gets set to hpet rather than tsc, and tsc is no longer available. dmesg attached.
Strange... in the 40 minutes or so since I last posted, wakeups seem to have calmed to around 2500-4000 per second. No idea what's changed in that time... (In reply to comment #63) > One difference is the default clocksource gets set to hpet > rather than tsc, and tsc is no longer available. dmesg attached. OK, I now understand this is to be expected since I unblocked C2 and C3, which halt the TSC... Hi, James Sorry for the late response about the issue. It is very strange that the number of wakeup is increased very significantly after suspend/resume when the mwait is used to enter the idle state. From the section 8.10.4 of Intel Software Developer Manual, it seems that it will exit the idle state under the following cases: a. write to the monitored address configured by Monitor instruction(In the code the current_thread_info->flags is the corresponding monitor address) b. External interrupts, including NMI, SMI, INIT, BINIT, MCERR, A20M# c. Faults, Aborts (including Machine Check) d. Architectural TLB invalidations including writes to CR0, CR3, CR4 and certain MSR writes But from the powertop info it seems that the number of interrupts/timer event is farther less than the number of wakeup. Not sure why the number of wakeup is affected so significantly by mwait after suspend/resume. Please use the boot option of "idle=nomwait" to workaround this issue. Thanks. Yakui It's great that kernel bugzilla is back. James, can you please verify if the problem still exists in the latest upstream kernel? (In reply to comment #67) > can you please verify if the problem still exists in the latest upstream > kernel? Using kernel-3.2.1-4.jhe.fc16.x86_64 (built from Fedora sources), with the following command-line options I get: processor.max_cstate=1 [what I normally use] - No odd pauses (as in Bug 12118) - No excessive wakeups idle=mwait as processor.max_cstate=1 idle=nomwait - Get the pause phenomenon, need to keep moving touchpad in order to get anywhere - Machine does not resume from suspend properly (screen doesn't turn on, halts, no SysRq response) <none of the above> - No excessive wakeups, nice low wups both before and after - Have the annoying pause problem So (after a very brief test --- so beware!) I think this particular issue may have cleared up on my hardware. Thanks, closing |