Bug 31562

Summary: Suspend to RAM (S3) sporadically hangs
Product: Power Management Reporter: Thilo-Alexander Ginkel (thilo)
Component: Hibernation/SuspendAssignee: power-management_other
Status: CLOSED PATCH_ALREADY_AVAILABLE    
Severity: normal CC: lenb, rjw, rui.zhang, tj
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.38 Subsystem:
Regression: Yes Bisected commit-id:
Bug Depends on:    
Bug Blocks: 7216    
Attachments: dmesg debugging output
lspci -vvv
kern.log from the successful (up to 'platform') and failed ('processor') attempts
acpidump
log produced by fwts (firmware test suite)
(Truncated) dmesg output (tail missing)
lsusb -v
.config (config-2.6.35-rc3-e22bee+)

Description Thilo-Alexander Ginkel 2011-03-21 12:33:44 UTC
Created attachment 51482 [details]
dmesg debugging output

Since upgrading my system's kernel from 2.6.35 to 2.6.38 Suspend to RAM (S3) sporadically hangs with the following symptoms:
The system goes through the shutdown sequence
 - Switches to tty1
 - Turns off the HDDs
...and then just stays there, i.e., with tty1 on screen, but completely unresponsive (even to Magic SysRq).

The issue occurs both with the current Ubuntu 11.04 as well as the vanilla 2.6.38 kernel. It did not occur with kernel 2.6.35.

I compiled a kernel with PM debugging turned on (the resulting dmesg output is attached to this ticket), but the fingerprinting did not yield any meaningful output:

andromeda kernel: [    2.187047]   Magic number: 0:400:740
andromeda kernel: [    2.187049]   hash matches /raid/home/tg/kernel-2.6.38/source/drivers/base/power/main.c:577

I'd appreciate any hints or experimental patches to further isolate the issue.
Comment 1 Thilo-Alexander Ginkel 2011-03-21 12:34:24 UTC
Created attachment 51492 [details]
lspci -vvv
Comment 2 Rafael J. Wysocki 2011-03-21 19:52:05 UTC
It looks like a problem with the graphics driver.  Do you use s2ram by chance?
Comment 3 Thilo-Alexander Ginkel 2011-03-21 20:25:05 UTC
AFAICS from /etc/pm/config.d/00sleep_module, the default (which according to a comment is 'kernel') is used and not 's2ram'.

While this may be a bug in the closed-source graphics driver, some change in the kernel since 2.6.35 seems to have triggered it.

I can try a bisect if that helps...
Comment 4 Len Brown 2011-03-22 01:38:00 UTC
> nvidia: module license 'NVIDIA' taints kernel.

please re-open if this this is still an issue
when the binary nvidia graphics driver is not present in the system.
Comment 5 Thilo-Alexander Ginkel 2011-03-26 14:06:27 UTC
I switched from the proprietary nvidia driver to nouveau and am still able to reproduce the issue. I also did a bisect, but the results are inconclusive (it identified some 2.6.35-rc5+ revisions are the culprit, but I told bisect that 2.6.35 works, but 2.6.36 does not, so I am confused why it even bothers to have a look at an older version (I had to skip a couple of revisions, which I could not get to boot):

git bisect start
# good: [9fe6206f400646a2322096b56c59891d530e8d51] Linux 2.6.35
git bisect good 9fe6206f400646a2322096b56c59891d530e8d51
# bad: [f6f94e2ab1b33f0082ac22d71f66385a60d8157f] Linux 2.6.36
git bisect bad f6f94e2ab1b33f0082ac22d71f66385a60d8157f
# skip: [78417334b5cb6e1f915b8fdcc4fce3f1a1b4420c] Merge branch 'bkl/core' of git://git.kernel.org/pub/scm/linux/kernel/git/frederic/random-tracing
git bisect skip 78417334b5cb6e1f915b8fdcc4fce3f1a1b4420c
# skip: [0a492896ac07336c98f37ad7fab4a6387b6ada78] sparc: Really fix "console=" for serial consoles.
git bisect skip 0a492896ac07336c98f37ad7fab4a6387b6ada78
# bad: [bbb7030fbb6469c46974c4736a5f90d12102f299] ACPI: expand Vista blacklist to include SP1 and SP2
git bisect bad bbb7030fbb6469c46974c4736a5f90d12102f299
# bad: [4a386c3e177ca2fbc70c9283d0b46537844763a0] Merge branch 'x86-xsave-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
git bisect bad 4a386c3e177ca2fbc70c9283d0b46537844763a0
# good: [6ba74014c1ab0e37af7de6f64b4eccbbae3cb9e7] Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net-next-2.6
git bisect good 6ba74014c1ab0e37af7de6f64b4eccbbae3cb9e7
# bad: [fc1caf6eafb30ea185720e29f7f5eccca61ecd60] Merge branch 'drm-core-next' of git://git.kernel.org/pub/scm/linux/kernel/git/airlied/drm-2.6
git bisect bad fc1caf6eafb30ea185720e29f7f5eccca61ecd60
# bad: [c3d1f1746b966907ba5ad2f75ddca24db8b21147] Merge branch 'upstream' of git://git.linux-mips.org/pub/scm/upstream-linus
git bisect bad c3d1f1746b966907ba5ad2f75ddca24db8b21147
# good: [ffd386a9a8273dcfa61705d0b349eebc7525ef87] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/percpu
git bisect good ffd386a9a8273dcfa61705d0b349eebc7525ef87
# good: [c853d945d3e0cadf60da03106f7d9bbf1346a518] MIPS: Unify the suffix of compressed vmlinux.bin
git bisect good c853d945d3e0cadf60da03106f7d9bbf1346a518
# bad: [5cf65713f87775c548e3eb48dbafa32e12f28000] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/hid
git bisect bad 5cf65713f87775c548e3eb48dbafa32e12f28000
# good: [0500e9b3f11ce84fc6ee48a3e29909145e58ba48] padata: Remove padata_get_cpumask
git bisect good 0500e9b3f11ce84fc6ee48a3e29909145e58ba48
# bad: [35d03a0d9b22fdf7cba75cd41a6f0a039ba35cdd] Merge branch 'acrux' into for-linus
git bisect bad 35d03a0d9b22fdf7cba75cd41a6f0a039ba35cdd
# skip: [a11b3fab94d4fb67297b76d0cb81612ebbff276e] HID: hid-ids.h: Whitespace fixup, align using TABs
git bisect skip a11b3fab94d4fb67297b76d0cb81612ebbff276e                                                    
# good: [64b386ea16112564e0b93473e2c347125effb6b2] HID: add proper support for Elecom BM084 bluetooth mouse 
git bisect good 64b386ea16112564e0b93473e2c347125effb6b2                                                    
# good: [8c8b01c38a70661d663175d355fdea85ca082272] HID: ignore digitizer usage Undefined (0x00)
git bisect good 8c8b01c38a70661d663175d355fdea85ca082272
# skip: [9929ffed889b4861b52ba648a44fa4add945c4d1] Merge branch 'upstream-fixes' into for-linus
git bisect skip 9929ffed889b4861b52ba648a44fa4add945c4d1
# skip: [7d3d42a79519df4cd62c3aa5d9ae2d77ebbf8fab] HID: fix up Kconfig entry for ACRUX driver
git bisect skip 7d3d42a79519df4cd62c3aa5d9ae2d77ebbf8fab
# skip: [540010812179a16d3d00fb8363bb06ee83af25b8] HID: Add support for Conceptronic CLLRCMCE
git bisect skip 540010812179a16d3d00fb8363bb06ee83af25b8
# skip: [d5e0a06f17a0ffb0eb08a5bd7b18f00af70d9a12] Revert "HID: add support for the Wacom Intuos 4 wireless"
git bisect skip d5e0a06f17a0ffb0eb08a5bd7b18f00af70d9a12
# skip: [c0dbcc33c652a0646542560de29a1c3f1ab7169f] HID: add ACRUX game controller force feedback support
git bisect skip c0dbcc33c652a0646542560de29a1c3f1ab7169f
# skip: [fc73697bd5bf4efe8b5b2b78ec09e00f967cd055] Merge branch 'upstream' into for-linus
git bisect skip fc73697bd5bf4efe8b5b2b78ec09e00f967cd055
Comment 6 Thilo-Alexander Ginkel 2011-03-28 13:03:42 UTC
I did some additional debugging using pm_test: freezer, devices and platform work reliably in a reproducible fashion. processors, however, fails (power LED stays lit instead of blinking and the system no longer wakes up as part of the test and requires a hard reset).

Unfortunately, tracing again only yields:

andromeda kernel: [    2.177994]   Magic number: 0:400:740
andromeda kernel: [    2.177995]   hash matches /raid/home/tg/kernel-2.6.38/source/drivers/base/power/main.c:577

However, I have captured the dmesg output for the failed session (from /var/log/kern.log) although it looks like the interesting part may be missing and only the previous attempts up to "platform" may have been fully recorded.
Comment 7 Thilo-Alexander Ginkel 2011-03-28 13:05:01 UTC
Created attachment 52332 [details]
kern.log from the successful (up to 'platform') and failed ('processor') attempts
Comment 8 Thilo-Alexander Ginkel 2011-03-28 13:06:20 UTC
Created attachment 52342 [details]
acpidump
Comment 9 Thilo-Alexander Ginkel 2011-03-28 14:34:40 UTC
Created attachment 52352 [details]
log produced by fwts (firmware test suite)
Comment 10 Thilo-Alexander Ginkel 2011-03-29 21:47:16 UTC
I applied some debug instrumentation to my kernel (using the TRACE_RESUME framework) and the last method call that is recorded is disable_nonboot_cpus() in suspend_enter (kernel/power/suspend.c). As there is another instrumentation point right after the method call I am quite certain that this method call never returns.

I'll add some further instrumentation code to figure out what is going on in that method...
Comment 11 Thilo-Alexander Ginkel 2011-03-29 21:50:07 UTC
Just an idea, but is it guaranteed that disable_nonboot_cpus is executed on CPU0?
Comment 12 Thilo-Alexander Ginkel 2011-03-30 11:43:30 UTC
Did some further instrumentation, which ended up in kernel/notifier.c. As soon as I compile the kernel with CONFIG_DEBUG_NOTIFIERS, the error probability decreases, but does not go down to zero. From the second consecutive suspend on, the power LED of the system no longer flashes, but instead stays dark (while suspended). Otherwise, the suspend works correctly. Eventually, the system will freeze again while suspending, HDDs powered down, power LED lit. When this happens the last statement recorded by the instrumentation is right before "return ret;" in notifier_call_chains (kernel/notifier.c) most probably called by __cpu_notify (kernel/cpu.c).

To me this has two implications:
1) There seem to be invalid notifiers in this call chain
2) Some notifier being called is corrupting memory and also the stack, which is why the return ends up nowhere

Now I could need some help figuring out which notifiers are called (which is not that obvious given their dynamic nature). ;-)
Comment 13 Thilo-Alexander Ginkel 2011-03-31 00:24:44 UTC
Just some additional info: I cannot reproduce this bug if I limit the number of used CPU cores to two (using maxcpus=2)
Comment 14 Thilo-Alexander Ginkel 2011-03-31 13:47:03 UTC
Please disregard my comment #12, which is just plain wrong: I instrumented notifier_call_chains, which is of course not exclusively called from __cpu_notify, so it is quite natural not to see any further instrumentation afterwards if the respective caller has not been instrumented.
Comment 15 Thilo-Alexander Ginkel 2011-04-03 02:02:02 UTC
I found an error in my bisect history, which I corrected. I now ended up at:

bd25f4dd6972755579d0ea50d1a5ace2e9b00d1a is the first bad commit
commit bd25f4dd6972755579d0ea50d1a5ace2e9b00d1a
Author: Arnd Bergmann <arnd@arndb.de>
Date:   Sun Jul 11 15:34:05 2010 +0200

    HID: hiddev: use usb_find_interface, get rid of BKL
    
    This removes the private hiddev_table in the usbhid
    driver and changes it to use usb_find_interface
    instead.
    
    The advantage is that we can avoid the race between
    usb_register_dev and usb_open and no longer need the
    big kernel lock.
    
    This doesn't introduce race condition -- the intf pointer could be
    invalidated only in hiddev_disconnect() through usb_deregister_dev(),
    but that will block on minor_rwsem and not actually remove the device
    until usb_open().
    
    Signed-off-by: Arnd Bergmann <arnd@arndb.de>
    Cc: Jiri Kosina <jkosina@suse.cz>
    Cc: "Greg Kroah-Hartman" <gregkh@suse.de>
    Signed-off-by: Jiri Kosina <jkosina@suse.cz>

:040000 040000 4ae14b3ba486373d7a354874e9ad334858f094e3 8041ffda20ca3020a6b60d64235ae179f8186bf0 M      drivers

Booting with no_console_suspend also brings up a BUG mentioning a soft lockup. As my serial console breaks during suspend, I had to take photos of the error messages:
  https://secure.tgbyte.de/dropbox/eaghoh9M.jpg
  https://secure.tgbyte.de/dropbox/ecae8ieR.jpg
  https://secure.tgbyte.de/dropbox/vah5ooR9.jpg

Unfortunately, I have so far been unable to get the complete error message in a single screen capture (as the output is either too verbose or too scarce).
Comment 16 Thilo-Alexander Ginkel 2011-04-03 02:11:48 UTC
Created attachment 53282 [details]
(Truncated) dmesg output (tail missing)
Comment 17 Thilo-Alexander Ginkel 2011-04-03 02:15:23 UTC
Created attachment 53292 [details]
lsusb -v
Comment 18 Thilo-Alexander Ginkel 2011-04-14 20:36:12 UTC
Some update:

The bug inroduced by the commit mentioned in comment #15 has similar symptoms to what I have been experiencing, but has already been fixed in the meantime. A new (pretty tedious) bisect brought up this change:

| e22bee782b3b00bd4534ae9b1c5fb2e8e6573c5c is the first bad commit
| commit e22bee782b3b00bd4534ae9b1c5fb2e8e6573c5c
| Author: Tejun Heo <tj@kernel.org>
| Date:   Tue Jun 29 10:07:14 2010 +0200
|
|     workqueue: implement concurrency managed dynamic worker pool

Reverting this commit reliably fixes the issue.

In the meantime, I have also been able to reproduce the issue in a KVM virtual machine, i.e., it is not hardware-specific, but must probably affects suspend on any SMP system (but the probability may depend on the number of CPUs).

Further details bout my system config (which is also mimicked by my VM):
- lvm running on top of
- dmcrypt (luks) running on top of
- md raid1
Comment 19 Thilo-Alexander Ginkel 2011-04-14 20:39:39 UTC
Some further details about the observed symptoms:

After issuing pm-suspend, it all starts with the suspend sequence hanging while shutting down non-boot CPUs (the CPU# varies from case to case):
 https://secure.tgbyte.de/dropbox/ueph3Ohm-1.jpg

Then, after some time, the following backtrace is written:
 https://secure.tgbyte.de/dropbox/ueph3Ohm-2.jpg

Transcript:
| RIP: _raw_spin_unlock_irqrestore
| Call Stack:
|   _set_cpus_allowed_ptr
|   worker_maybe_bind_and_lock
|   rescuer_thread
|   rescuer_thread
|   kernel_thread_helper
|   kthread
|   kernel_thread_helper
|   kthread

After some more time, the following backtrace is printed:
 https://secure.tgbyte.de/dropbox/ueph3Ohm-3.jpg

Transcript:
| RIP: worker_maybe_bind_and_lock
| Call Stack:
|   rescuer_thread
|   rescuer_thread
|   kernel_thread_helper
|   kthread
|   kernel_thread_helper
|   kthread

From then on these two backtraces are printed in an alternating fashion.
Comment 20 Thilo-Alexander Ginkel 2011-04-15 23:50:17 UTC
In another attempt to further isolate the system parameters that trigger the issue, I set up a couple of minimal systems using various combinations of dmcrypt and raid1. As it turned out, having raid1 enabled is sufficient for triggering the issue, so commit e22bee782b3b00bd4534ae9b1c5fb2e8e6573c5c apparently broke suspend for all raid(1?) multi-core systems.
Comment 21 Rafael J. Wysocki 2011-04-16 17:00:23 UTC
Not really, I cannot reproduce the issue on three different systems using
RAID-1 with 2.6.38 and 2.6.39-rc3.

Have you tried all that on the same or similar hardware?
Comment 22 Thilo-Alexander Ginkel 2011-04-16 17:52:58 UTC
That's weird. I assume your tests machines have more than two cores?

I can reproduce the issue on my physical PC (4 cores) as well as within a KVM (which is running on the same machine, but that should IMHO not make a difference considering the virtualization layer inbetween) with -smp 8.

I am currently uploading my test VM and will post an update including test instructions once it is available.
Comment 23 Rafael J. Wysocki 2011-04-16 17:59:46 UTC
One of my test boxes is six-core, but none of them supports hyperthreading.

Can you verify that the issue is still reproducible when hyperthreading
support is disabled in the kernel config?
Comment 24 Thilo-Alexander Ginkel 2011-04-16 18:23:24 UTC
My CPU (Intel Core 2 Quad Q8400) does not support HT either and I don't think that QEMU will provide a virtualized hyperthreaded CPU, but I can still test that, if you think it makes sense under these circumstances.
Comment 25 Rafael J. Wysocki 2011-04-16 18:35:20 UTC
Not really. :-)

Please attach your .config.
Comment 26 Thilo-Alexander Ginkel 2011-04-16 18:43:36 UTC
Created attachment 54502 [details]
.config (config-2.6.35-rc3-e22bee+)
Comment 27 Thilo-Alexander Ginkel 2011-04-16 20:29:21 UTC
BTW, I have run a couple of further tests. 2.6.39-rc3 no longer seems to be exhibiting the issue when testing using my VM (I have yet to run some tests with my 'real' PC). 2.6.38, however, reliably freezes in the VM (but it freezes so reliably that this may be a different issue or related to the Ubuntu kernel package - I'll build a vanilla kernel to rule out any distribution-specific bugs [any other kernels I tested were vanilla kernels]).
Comment 28 Thilo-Alexander Ginkel 2011-04-16 22:56:31 UTC
I just verified using a 2.6.38 vanilla kernel that the lockups also happen within my virtualized environment (although for 2.6.38 the lockup detector does not trigger [although it is compiled in], i.e. the suspend sequence just hangs after "CPU i is now offline"; i < n). The lockups happen with a higher probability when both mdraid and dmcrypt are deployed (compared to mdraid only).

BTW, this is my test script:

-- 8< --
#!/bin/bash
sysctl -w kernel.printk=8
echo core > /sys/power/pm_test
for i in {1..99}; do echo "Pass $i"; uname -a; sync; pm-suspend; done
-- 8< --
Comment 29 Thilo-Alexander Ginkel 2011-04-17 13:06:45 UTC
In case you'd like to get hold of my test environment:

https://secure.tgbyte.de/dropbox/lockup-test.tar.bz2 contains the disk images for my test VM. Boot using:
  kvm -hda LockupTestRaid-1.qcow2 -hdb LockupTestRaid-2.qcow2 -smp 8 -m 1024 -curses

To run the test, log in as root / test and run:
  /root/suspend-test

Currently, the image only contains kernels from my bisect test (i.e., no recent kernels, but those can be easily installed).
Comment 30 Zhang Rui 2012-01-18 03:22:54 UTC
It's great that the kernel bugzilla is back.

Can you please verify if the problem still exists in the latest upstream kernel?
Comment 31 Thilo-Alexander Ginkel 2012-01-18 17:58:02 UTC
This issue has already been fixed.