Bug 8391 - Soft lockup on CPU0 when resuming from suspension to ram, related to acpi processor module
Status: CLOSED CODE_FIX
Alias: None
Product: ACPI
Classification: Unclassified
Component: Power-Processor
Hardware: i386 Linux
Importance: P2 high
Assignee: Thomas Gleixner
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-04-28 08:46 UTC by Giorgio Lando
Modified: 2007-11-14 00:01 UTC

See Also:
Kernel Version: 2.6.22
Subsystem:
Regression: ---
Bisected commit-id:


Attachments
My kernel config (42.58 KB, text/plain), 2007-04-28 08:51 UTC, Giorgio Lando
bootlog (21.09 KB, text/plain), 2007-05-02 14:29 UTC, Giorgio Lando
collection of pending fixups (13.61 KB, patch), 2007-05-02 14:33 UTC, Thomas Gleixner
patch series of the previous (4.95 KB, application/x-bzip), 2007-05-02 15:02 UTC, Thomas Gleixner

Description Giorgio Lando 2007-04-28 08:46:31 UTC
Most recent kernel where this bug did *NOT* occur: 2.6.20
Distribution: archlinux (but the problem occurs also with a vanilla kernel)
Hardware Environment: Acer 1644WLMi, with an Intel Centrino 2000 MHz

Problem Description: When I resume the laptop from suspend-to-RAM, I get a
soft lockup. If the 'processor' module is not loaded, the laptop resumes
normally. Compiling the ACPI processor module into the kernel instead of
loading it as a module does not solve the problem.
This happens with a vanilla kernel, and 2.6.21.1 does not fix it.
This is what I can see in the system log:
Apr 28 17:18:37 clarabella BUG: soft lockup detected on CPU#0!
Apr 28 17:18:37 clarabella [<c01524f8>] softlockup_tick+0xa8/0x110
Apr 28 17:18:37 clarabella [<c0132853>] update_process_times+0x33/0x80
Apr 28 17:18:37 clarabella [<c0144a1b>] tick_sched_timer+0x5b/0xc0
Apr 28 17:18:37 clarabella [<c0140a3d>] hrtimer_interrupt+0x13d/0x1d0
Apr 28 17:18:37 clarabella [<c01189f2>] smp_apic_timer_interrupt+0x52/0x90
Apr 28 17:18:37 clarabella [<c034387f>] preempt_schedule_irq+0x3f/0x60
Apr 28 17:18:37 clarabella [<c0104d30>] apic_timer_interrupt+0x28/0x30
Apr 28 17:18:37 clarabella [<c024f141>] cfb_imageblit+0x521/0x580
Apr 28 17:18:37 clarabella [<c0141f88>] clocksource_get_next+0x38/0x40
Apr 28 17:18:37 clarabella [<c01400b2>] ktime_get_ts+0x22/0x60
Apr 28 17:18:37 clarabella [<c0143432>] clockevents_program_event+0x92/0x110
Apr 28 17:18:37 clarabella [<c024cc06>] bit_putcs+0x576/0x600
Apr 28 17:18:37 clarabella [<c024dca0>] bitfill_aligned+0x0/0x100
Apr 28 17:18:37 clarabella [<c024b72b>] fbcon_switch+0x41b/0x5f0
Apr 28 17:18:37 clarabella [<c0246efd>] fbcon_putcs+0x19d/0x2e0
Apr 28 17:18:37 clarabella [<c024c690>] bit_putcs+0x0/0x600

Steps to reproduce: suspend to RAM with ACPI processor support either built
into the kernel or loaded as a module, then try to resume.
Comment 1 Giorgio Lando 2007-04-28 08:51:00 UTC
Created attachment 11311
My kernel config

I attach my kernel config
Comment 2 Thomas Gleixner 2007-04-28 10:09:55 UTC
Can you please try with the module loaded and each of the following additions
to the kernel command line:

A) highres=off

B) nohz=off

C) highres=off nohz=off

Thanks,

    tglx
Comment 3 Thomas Gleixner 2007-04-28 10:12:55 UTC
Can you please add a boot log (with the module loaded and no further
command-line options)?

Thanks,

    tglx
Comment 4 Giorgio Lando 2007-05-02 14:12:14 UTC
I am able to resume from suspension to ram in scenario C), while I get the
previous soft lockup in scenarios A) and B).
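
For anyone repeating this test, here is a minimal Python sketch for checking which of the scenarios from comment 2 a given boot actually used. The function name timer_scenario is made up for illustration. The only assumption about the system is that the running kernel's command line can be read from /proc/cmdline.

```python
def timer_scenario(cmdline):
    """Map a kernel command line to scenario A, B or C from comment 2.

    Returns None when neither highres=off nor nohz=off is present.
    (Hypothetical helper, not part of any kernel tooling.)
    """
    # Collect key=value tokens; bare flags such as "quiet" are ignored.
    opts = dict(tok.split("=", 1) for tok in cmdline.split() if "=" in tok)
    highres_off = opts.get("highres") == "off"
    nohz_off = opts.get("nohz") == "off"
    if highres_off and nohz_off:
        return "C"
    if highres_off:
        return "A"
    if nohz_off:
        return "B"
    return None
```

On the test machine this could be called as timer_scenario(open("/proc/cmdline").read()). A result of None means neither option was passed, which is the configuration from the original report.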
Comment 5 Giorgio Lando 2007-05-02 14:29:48 UTC
Created attachment 11378
bootlog

This is the bootlog, with the module loaded and no kernel boot options.
Comment 6 Thomas Gleixner 2007-05-02 14:33:00 UTC
Created attachment 11379
collection of pending fixups

Can you please apply the attached patch and retest?
Comment 7 Giorgio Lando 2007-05-02 14:53:48 UTC
The patch seems to solve the issue. With it applied, the same config, the
module loaded and no special kernel boot options, I am able to resume from
suspend-to-RAM. Thanks.
Comment 8 Thomas Gleixner 2007-05-02 15:02:58 UTC
Created attachment 11380
patch series of the previous

May I ask you a favour?

The attached tarball has the separate parts of the patch I attached before. The
tarball contains a quilt patch series. If you are not familiar with quilt, then
just apply the patches one after another in the order given in the file
"series". Please recompile and boot after each step and report which one
finally fixes the problem.

Thanks

    tglx
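
For readers not using quilt, the step-by-step procedure above can be sketched as follows. This is only an illustration in Python: read_series and cumulative_steps are hypothetical helper names, and the parsing assumes the usual series layout of one patch per line, with optional '#' comments and trailing options such as -p1.

```python
def read_series(text):
    """Return the patch names listed in a quilt 'series' file, in order."""
    patches = []
    for line in text.splitlines():
        # Drop anything after '#' (comment) and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        # The first token is the patch name; options like -p1 are ignored.
        patches.append(line.split()[0])
    return patches


def cumulative_steps(patches):
    """Yield (step, patches applied so far) for one-at-a-time testing."""
    for step in range(1, len(patches) + 1):
        yield step, patches[:step]
```

At each step, the last entry of the yielded list is the patch to apply (for example with patch -p1) before recompiling, booting and testing suspend and resume. The first step at which resume works identifies the patch that completes the fix.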
Comment 9 Giorgio Lando 2007-05-02 15:39:38 UTC
The problem is solved once the fourth patch is applied, that is, with
highres-dyntick-avoid-xtime-lock-contention.patch +
clocksource-fix-resume-logic.patch +
acpi-keep-tsc-stable-when-lapic-timer-c2-ok-is-set.patch +
clockevents-fix-resume-logic.patch.
By contrast, clockevents-fix-oneshot-suspend.patch does not seem to be required.
Comment 10 Thomas Gleixner 2007-05-02 15:51:28 UTC
Giorgio,

thanks a lot. The last patch is required for consistency on different hardware.

I have put this bug into PATCH_ALREADY_AVAILABLE status for now. I will close
it once the fixes hit mainline and the 2.6.21 stable series.

Thanks,

     tglx
Comment 11 Giorgio Lando 2007-07-09 01:38:08 UTC
I have these problems again in 2.6.22. They are again connected with the ACPI processor driver and can be worked around with 'nohz=off highres=off'. Does this mean that these patches (or their replacements) have not been included in 2.6.22?
Comment 12 Giorgio Lando 2007-07-09 01:43:22 UTC
I have looked, and it seems that their replacements have been included. I therefore think the bug should be reopened.
Comment 13 Zhang Rui 2007-08-05 09:42:11 UTC
Hi, Thomas,
I can reproduce the bug, and booting with "nohz=off" solves the problem.
Could you please give a detailed description of this bug?

Thanks,
Rui
Comment 14 Natalie Protasevich 2007-10-18 21:47:46 UTC
Any update on this bug, please? Can anyone confirm that the kernel works now, as noted in #10?
Thanks.
Comment 15 Thomas Gleixner 2007-11-13 06:54:29 UTC
Giorgio, Zhang,

is the problem still there with 2.6.23?

Thanks,
      tglx
Comment 16 Giorgio Lando 2007-11-13 07:01:21 UTC
No, it is not. 2.6.23 suspends and resumes fine with highres and nohz enabled. I think that the bug can be closed (actually, I had forgotten about this bug).
Comment 17 Zhang Rui 2007-11-13 17:31:08 UTC
No, everything is working well.
I think it's fixed in 2.6.23-rc3. :)
Comment 18 Thomas Gleixner 2007-11-14 00:01:21 UTC
Giorgio, Zhang,

Thanks for testing!

    tglx
