Bug 57381 - Radeon HD6950: Resuming from hibernation fails sometimes
Summary: Radeon HD6950: Resuming from hibernation fails sometimes
Status: CLOSED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel) (show other bugs)
Hardware: All Linux
: P1 normal
Assignee: drivers_video-dri
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2013-05-01 13:56 UTC by Harald Judt
Modified: 2013-08-21 19:49 UTC (History)
4 users (show)

See Also:
Kernel Version: 3.8.13
Subsystem:
Regression: No
Bisected commit-id:


Attachments

Description Harald Judt 2013-05-01 13:56:58 UTC
Mainboard: ASRock Z77 Extreme4
Graphics card: AMD Radeon HD6950 (r600/Cayman)

Resuming from hibernation fails sometimes, but resuming from suspend never fails.

Observable symptoms when it fails:
After reading the complete page1? snapshot, the screen goes black, turns off (no signal) and the system hangs. No ssh or sysreq keys work, everything is frozen dead, leaving the reset button as the only option to get the system back into a working state.

This is with the vanilla kernel, starting at least with 3.2-rc8 something, last version tested is 3.8.7. But in production, I'm using identical kernels with the tuxonice patches applied. Now one interesting feature of this is that it allows to retry a failed resume (reading the image again). And this - though it might sometimes need one or two tries - actually works! No filesystem corruption or any other problems.

I've spent some time to narrow this down to some module, and it appeared that the issue doesn't happen with nomodeset=1. What's more relevating, is that it is not reproducible with a Radeon HD3650 (RV635 chipset) using the same module.

According to the driver developers, for the driver there should be no difference between suspend/hibernate.

Any idea what could go wrong here? Any chance getting to work hibernate/resume reliably on that machine with that graphics card?

Unfortunately, this is really hard to debug. There seem to be no messages that would help in dmesg, netconsole is no help here, boot messages scroll over the screen extremely fast, too fast to read.
Comment 1 Harald Judt 2013-05-01 13:59:00 UTC
One additional note: The issue does not occur at certain cycles, sometimes three hibernation->resume cycles go well, then the fourth and fifth will not work, then the next four are ok and so on...
Comment 2 Lan Tianyu 2013-05-07 06:35:31 UTC
This seems a hardware unstable problem since you said the problem said unreproducible by using Radeon HD3650. Further more, this doesn't happen every time.
Comment 3 Harald Judt 2013-05-07 10:37:38 UTC
I would agree with you, but find some contradictions:

1) The system is otherwise *completely* stable, hibernate/resume being the only issue.
2) Hibernate/resume is no problem in Microsoft Windows 7. I know, these are different conditions, but still...
3) There are no problems on boot or with suspend/resume.

Radeon HD3650 uses r600, but is different hardware, so not really comparable. If I could test another Radeon HD6950, then it would easier to say, but unfortunately, I do not have another test sample and won't be able to get one.

Before simply declaring this an unstable hardware problem, are there no further methods to track down the issue? Isn't there a chance this could be a locking issue, or race condition, whatever?
Comment 4 Lan Tianyu 2013-05-07 12:31:58 UTC
Currently, have no idea. How about changing a higher power supply? or using a serial port to get debug info(the system maybe still survive).
Comment 5 Harald Judt 2013-05-07 12:43:12 UTC
I see. 750W should be plenty, and it's not bad quality. Besides, I don't think the graphics card draws that much power when booting. When playing 3d games, maybe.

Perhaps I can look at that serial port stuff. I'll also try putting the graphics card in another machine, although I had an older p67 mainboard before where the same issue existed.
Comment 6 Lan Tianyu 2013-05-09 07:45:07 UTC
Ok. I think 750W is enough. Do you have some update about serial port debug?
Comment 7 Lan Tianyu 2013-05-14 07:56:56 UTC
ping ...
Comment 8 Harald Judt 2013-05-14 08:16:59 UTC
I wonder... Is netconsole that much different from serial console? I mean, will I get output right from the start, or only after reading the pageset1 (atomic copy)? Because if there aren't any differences, I won't get serial debug output either, right?

I will need to find another computer with a serial port and a serial cable first. Unfortunately, I won't have access to the machine for the next two weeks, but I'll report back then.
Comment 9 Harald Judt 2013-05-25 14:41:47 UTC
While I haven't been able to get the equipment for serial console debugging, I think I have at least found a workaround:

echo 0 > /sys/power/pm_async

Since I've disabled async suspend/resume, the problem has not occurred anymore (16 attempts by now). The console reappears on each attempt, and there is a 3 seconds delay now before the computer starts reading pageset2. So it's indeed a software problem, not a hardware one.

Now why does this issue happen only on hibernate/resume, but never on suspend/resume? I believe this is because of a BIOS setting (enabled by default) which forces to wait for all devices being ready before resuming from S3. The description for this setting _only_ talks about S3, not S4/S5. I have not tested this, so this may only be speculation, and the BIOS setting may not have anything to do with it, but I'd find it very probable.

Setting pm_async to 0 is ok for me; I don't care if it takes 3-5 seconds longer, as long as it works. However, I guess it's still a bug and pm_async should work, so I'm going to try this serial console stuff anyway. Any recommendations from you concerning debugging steps now that I've narrowed this issue down?
Comment 10 Lan Tianyu 2013-05-28 06:06:04 UTC
I think you can try to find which device's async suspend cause this issue. 

I write a script which can run under /sys/devices to find which device's async is enabled. You also can run it under some specific device's sysfs node to narrow the scope or enhance the script to disable the async and test.

#/bin/sh
for i in `find . -name async`; do
        rlt=`cat $i`
        if [ "$rlt" = enabled ] ; then
                echo $i $rlt
        fi
done
Comment 11 Harald Judt 2013-06-01 08:45:30 UTC
There are no async files below /sys/devices, and "*async*" doesn't reveal anything useful either.
Comment 12 Harald Judt 2013-06-05 17:56:47 UTC
Unfortunately it seems the problem is not solved completely. While it happens _much_ more rarely now (say maybe 3%), it still does even with pm_async deactivated. At first I thought it would be a different issue, but the symptoms are the same.

While this makes reproducibility a bit harder, I know how to change that ;-) So I'll still have to get that serial console debugging stuff going.
Comment 13 Lan Tianyu 2013-06-08 02:35:23 UTC
Ok. This shows pm_async just alleviates the issue.

Does the following patch work for you to find devices with async enabled.
find /sys -name "async" | while read i ; do \
    DEV="$i" ; ENAB="`cat $i`" ; \
    if [ "$ENAB" = "enabled" ]; then \
        echo enabled dev $DEV ; \
    fi ; \
done
Comment 14 Harald Judt 2013-06-14 17:54:29 UTC
I've tried to get serial console debugging working today, but failed miserably because the USB<->RS232 converter refused to work as soon as I attached the null modem cable (device errors, not reconnecting on re-plugging with the cable). Maybe that null modem cable is faulty or too long and the cheap USB converter can't deal with it. I suspect that the cable is still good though and will see if I can find another old piece of hardware I can attach to the machine, without having to deal with that USB stuff. I've used that null modem cable ages ago...

Next, I'm not sure I was correct about the problem not being solved completely. It could also be another driver problem sharing the same symptoms. It's a bit buggy with newer kernel versions, there are currently a few bugs open on freedesktop.org which might cause this. The machine hang two times while running, so it wouldn't be surprising if this was not related to suspend/resume at all.

As for your script: Find did not find anything on my machine because I compiled the kernel without CONFIG_PM_ADVANCED_DEBUG enabled. I'll activate this, do some tests and see what they turn up.
Comment 15 Harald Judt 2013-06-14 18:06:12 UTC
Here is the output of your lines of code from comment #13:

enabled dev /sys/devices/pci0000:00/0000:00:00.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:01.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.1/power/async
enabled dev /sys/devices/pci0000:00/0000:00:01.0/0000:00:01.0:pcie08/power/async
enabled dev /sys/devices/pci0000:00/0000:00:14.0/usb1/ep_00/power/async
enabled dev /sys/devices/pci0000:00/0000:00:14.0/usb1/power/async
enabled dev /sys/devices/pci0000:00/0000:00:14.0/usb1/1-0:1.0/ep_81/power/async
enabled dev /sys/devices/pci0000:00/0000:00:14.0/usb1/1-0:1.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:14.0/usb2/ep_00/power/async
enabled dev /sys/devices/pci0000:00/0000:00:14.0/usb2/power/async
enabled dev /sys/devices/pci0000:00/0000:00:14.0/usb2/2-0:1.0/ep_81/power/async
enabled dev /sys/devices/pci0000:00/0000:00:14.0/usb2/2-0:1.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:14.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:16.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-1/3-1.4/ep_00/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-1/3-1.4/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-1/3-1.4/3-1.4:1.0/ep_01/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-1/3-1.4/3-1.4:1.0/ep_82/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-1/3-1.4/3-1.4:1.0/host8/scsi_host/host8/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-1/3-1.4/3-1.4:1.0/host8/target8:0:0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-1/3-1.4/3-1.4:1.0/host8/target8:0:0/8:0:0:0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-1/3-1.4/3-1.4:1.0/host8/target8:0:0/8:0:0:0/scsi_device/8:0:0:0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-1/3-1.4/3-1.4:1.0/host8/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-1/3-1.4/3-1.4:1.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-1/3-1:1.0/ep_81/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-1/3-1:1.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-1/ep_00/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-1/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-0:1.0/ep_81/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1a.0/usb3/3-0:1.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1a.0/usb3/ep_00/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1a.0/usb3/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1a.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1b.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.3/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.3/0000:03:00.0/ata7/host6/scsi_host/host6/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.3/0000:03:00.0/ata7/host6/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.3/0000:03:00.0/ata7/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.3/0000:03:00.0/ata8/host7/scsi_host/host7/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.3/0000:03:00.0/ata8/host7/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.3/0000:03:00.0/ata8/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.3/0000:03:00.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.4/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.4/0000:04:00.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.5/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.5/0000:05:00.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.5/0000:05:00.0/0000:06:00.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.5/0000:05:00.0/0000:06:00.1/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.5/0000:05:00.0/0000:06:01.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.5/0000:05:00.0/0000:06:01.1/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.5/0000:05:00.0/0000:06:01.2/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.7/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.7/0000:07:00.0/usb4/ep_00/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.7/0000:07:00.0/usb4/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.7/0000:07:00.0/usb4/4-0:1.0/ep_81/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.7/0000:07:00.0/usb4/4-0:1.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.7/0000:07:00.0/usb6/ep_00/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.7/0000:07:00.0/usb6/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.7/0000:07:00.0/usb6/6-0:1.0/ep_81/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.7/0000:07:00.0/usb6/6-0:1.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1c.7/0000:07:00.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1d.0/usb5/5-1/5-1.5/ep_00/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1d.0/usb5/5-1/5-1.5/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1d.0/usb5/5-1/5-1.5/5-1.5:1.0/ep_81/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1d.0/usb5/5-1/5-1.5/5-1.5:1.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1d.0/usb5/5-1/5-1.5/5-1.5:1.1/ep_82/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1d.0/usb5/5-1/ep_00/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1d.0/usb5/5-1/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1d.0/usb5/5-1/5-1:1.0/ep_81/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1d.0/usb5/5-1/5-1:1.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1d.0/usb5/ep_00/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1d.0/usb5/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1d.0/usb5/5-0:1.0/ep_81/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1d.0/usb5/5-0:1.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1d.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/scsi_host/host0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/target0:0:0/0:0:0:0/scsi_device/0:0:0:0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata1/host0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata1/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata2/host1/scsi_host/host1/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata2/host1/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata2/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata3/host2/scsi_host/host2/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata3/host2/target2:0:0/2:0:0:0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata3/host2/target2:0:0/2:0:0:0/scsi_device/2:0:0:0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata3/host2/target2:0:0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata3/host2/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata3/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata4/host3/scsi_host/host3/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata4/host3/target3:0:0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata4/host3/target3:0:0/3:0:0:0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata4/host3/target3:0:0/3:0:0:0/scsi_device/3:0:0:0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata4/host3/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata4/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata5/host4/scsi_host/host4/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata5/host4/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata5/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata6/host5/scsi_host/host5/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata6/host5/target5:0:0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata6/host5/target5:0:0/5:0:0:0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata6/host5/target5:0:0/5:0:0:0/scsi_device/5:0:0:0/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata6/host5/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/ata6/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.2/power/async
enabled dev /sys/devices/pci0000:00/0000:00:1f.3/power/async
enabled dev /sys/devices/pci0000:00/power/async

These are quite a lot of entries. Can you recommend any specific procedure to follow for which devices to disable async to reduce the amount of testing? E.g, deactivate async for all devices of a specific type first?

Here's lspci output:
00:00.0 Host bridge: Intel Corporation 2nd Generation Core Processor Family DRAM Controller (rev 09)
00:01.0 PCI bridge: Intel Corporation Xeon E3-1200/2nd Generation Core Processor Family PCI Express Root Port (rev 09)
00:14.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB xHCI Host Controller (rev 04)
00:16.0 Communication controller: Intel Corporation 7 Series/C210 Series Chipset Family MEI Controller #1 (rev 04)
00:1a.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #2 (rev 04)
00:1b.0 Audio device: Intel Corporation 7 Series/C210 Series Chipset Family High Definition Audio Controller (rev 04)
00:1c.0 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 1 (rev c4)
00:1c.3 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 4 (rev c4)
00:1c.4 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 5 (rev c4)
00:1c.5 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 6 (rev c4)
00:1c.7 PCI bridge: Intel Corporation 7 Series/C210 Series Chipset Family PCI Express Root Port 8 (rev c4)
00:1d.0 USB controller: Intel Corporation 7 Series/C210 Series Chipset Family USB Enhanced Host Controller #1 (rev 04)
00:1f.0 ISA bridge: Intel Corporation Z77 Express Chipset LPC Controller (rev 04)
00:1f.2 SATA controller: Intel Corporation 7 Series/C210 Series Chipset Family 6-port SATA Controller [AHCI mode] (rev 04)
00:1f.3 SMBus: Intel Corporation 7 Series/C210 Series Chipset Family SMBus Controller (rev 04)
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Cayman PRO [Radeon HD 6950]
01:00.1 Audio device: Advanced Micro Devices, Inc. [AMD/ATI] Cayman/Antilles HDMI Audio [Radeon HD 6900 Series]
03:00.0 SATA controller: ASMedia Technology Inc. ASM1062 Serial ATA Controller (rev 01)
04:00.0 Ethernet controller: Broadcom Corporation NetLink BCM57781 Gigabit Ethernet PCIe (rev 10)
05:00.0 PCI bridge: ASMedia Technology Inc. ASM1083/1085 PCIe to PCI Bridge (rev 01)
06:00.0 Multimedia video controller: Brooktree Corporation Bt878 Video Capture (rev 11)
06:00.1 Multimedia controller: Brooktree Corporation Bt878 Audio Capture (rev 11)
06:01.0 Multimedia audio controller: Creative Labs SB Audigy (rev 04)
06:01.1 Input device controller: Creative Labs SB Audigy Game Port (rev 04)
06:01.2 FireWire (IEEE 1394): Creative Labs SB Audigy FireWire Port (rev 04)
07:00.0 USB controller: ASMedia Technology Inc. ASM1042 SuperSpeed USB Host Controller
Comment 16 Harald Judt 2013-06-14 21:04:23 UTC
Ok, the following statements deactivate async for the graphics card (+ its HDMI audio):
echo disabled > "/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/power/async"
echo disabled > "/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.1/power/async"

This seems to have a similar effect as "echo 0 > /sys/power/pm_async".

With this, I ran a series of 30 hibernate/resume cycles. The 7th and the 29th attempts failed (the last one failed twice), with the usual symptoms. So it seems that disabling async for the graphics card indeed only alleviates the problem. I will perform the same tests with pm_async deactivated just to be sure.
Comment 17 Harald Judt 2013-06-14 21:32:29 UTC
On the other hand, 0 of 30 hibernate/resume cycles failed with "echo 0 > /sys/power/pm_async". Maybe coincidence, but probably not.

I will continue disabling async for additional single devices.
Comment 18 Lan Tianyu 2013-06-17 02:14:18 UTC
Thanks for test. I think you have done a very good job.

(In reply to comment #16)
> Ok, the following statements deactivate async for the graphics card (+ its
> HDMI
> audio):
> echo disabled >
> "/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/power/async"
> echo disabled >
> "/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.1/power/async"
> 
> This seems to have a similar effect as "echo 0 > /sys/power/pm_async".
> 
This shows the issue is still related with Graphics driver. Disable pm_async is just to make all the driver suspend callback to be executed serially.

So reassign this bug to Graphic category.


> With this, I ran a series of 30 hibernate/resume cycles. The 7th and the 29th
> attempts failed (the last one failed twice), with the usual symptoms. So it
> seems that disabling async for the graphics card indeed only alleviates the
> problem. I will perform the same tests with pm_async deactivated just to be
> sure.
Comment 19 Harald Judt 2013-06-25 19:10:07 UTC
> echo disabled >
> "/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/power/async"
> echo disabled >
> "/sys/devices/pci0000:00/0000:00:01.0/0000:01:00.1/power/async"
> 
> This seems to have a similar effect as "echo 0 > /sys/power/pm_async".

I believe I was wrong with this assumption.

However, I got serial console debugging working now. Unfortunately, it does not reveal much, but here is the output anyway.

Linux version 3.8.13+ (user@localhost) (gcc version 4.7.3 (Gentoo 4.7.3 p1.0, pie-0.5.5) ) #142 SMP PREEMPT Tue Jun 25 18:56:39 CEST 2013^M^J
Command line: root=/dev/sda6 rootfstype=ext4 video=DVI-I-1:1920x1080 radeon.pcie_gen2=1 radeon.audio=1 pci=nocrs console=ttyS0,115200n8 console=tty0 no_console_suspend=1^M^J
e820: BIOS-provided physical RAM map:^M^J
BIOS-e820: [mem 0x0000000000000000-0x000000000009d7ff] usable^M^J
BIOS-e820: [mem 0x000000000009d800-0x000000000009ffff] reserved^M^J
BIOS-e820: [mem 0x00000000000e0000-0x00000000000fffff] reserved^M^J
BIOS-e820: [mem 0x0000000000100000-0x00000000dd49dfff] usable^M^J
BIOS-e820: [mem 0x00000000dd49e000-0x00000000de048fff] reserved^M^J
BIOS-e820: [mem 0x00000000de049000-0x00000000de0d5fff] usable^M^J
BIOS-e820: [mem 0x00000000de0d6000-0x00000000de177fff] ACPI NVS^M^J
BIOS-e820: [mem 0x00000000de178000-0x00000000de976fff] reserved^M^J
BIOS-e820: [mem 0x00000000de977000-0x00000000de977fff] usable^M^J
BIOS-e820: [mem 0x00000000de978000-0x00000000de9bafff] ACPI NVS^M^J
BIOS-e820: [mem 0x00000000de9bb000-0x00000000df3f5fff] usable^M^J
BIOS-e820: [mem 0x00000000df3f6000-0x00000000df7f1fff] reserved^M^J
BIOS-e820: [mem 0x00000000df7f2000-0x00000000df7fffff] usable^M^J
BIOS-e820: [mem 0x00000000f8000000-0x00000000fbffffff] reserved^M^J
BIOS-e820: [mem 0x00000000fec00000-0x00000000fec00fff] reserved^M^J
BIOS-e820: [mem 0x00000000fed00000-0x00000000fed03fff] reserved^M^J
BIOS-e820: [mem 0x00000000fed1c000-0x00000000fed1ffff] reserved^M^J
BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved^M^J
BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved^M^J
BIOS-e820: [mem 0x0000000100000000-0x000000041effffff] usable^M^J
NX (Execute Disable) protection: active^M^J
SMBIOS 2.7 present.^M^J
No AGP bridge found^M^J
e820: last_pfn = 0x41f000 max_arch_pfn = 0x400000000^M^J
x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106^M^J
e820: last_pfn = 0xdf800 max_arch_pfn = 0x400000000^M^J
found SMP MP-table at [mem 0x000fd8e0-0x000fd8ef] mapped at [ffff8800000fd8e0]^M^J
init_memory_mapping: [mem 0x00000000-0xdf7fffff]^M^J
init_memory_mapping: [mem 0x100000000-0x41effffff]^M^J
ACPI: RSDP 00000000000f0490 00024 (v02 ALASKA)^M^J
ACPI: XSDT 00000000de169078 0006C (v01 ALASKA    A M I 01072009 AMI  00010013)^M^J
ACPI: FACP 00000000de172f58 0010C (v05 ALASKA    A M I 01072009 AMI  00010013)^M^J
ACPI: DSDT 00000000de169178 09DDD (v02 ALASKA    A M I 00000022 INTL 20051117)^M^J
ACPI: FACS 00000000de176080 00040^M^J
ACPI: APIC 00000000de173068 00092 (v03 ALASKA    A M I 01072009 AMI  00010013)^M^J
ACPI: FPDT 00000000de173100 00044 (v01 ALASKA    A M I 01072009 AMI  00010013)^M^J
ACPI: MCFG 00000000de173148 0003C (v01 ALASKA    A M I 01072009 MSFT 00000097)^M^J
ACPI: AAFT 00000000de173188 00112 (v01 ALASKA OEMAAFT  01072009 MSFT 00000097)^M^J
ACPI: HPET 00000000de1732a0 00038 (v01 ALASKA    A M I 01072009 AMI. 00000005)^M^J
ACPI: SSDT 00000000de1732d8 0036D (v01 SataRe SataTabl 00001000 INTL 20091112)^M^J
ACPI: SSDT 00000000de173648 00A92 (v01  PmRef    CpuPm 00003000 INTL 20051117)^M^J
ACPI: BGRT 00000000de1740e0 00038 (v00 ALASKA    A M I 01072009 AMI  00010013)^M^J
No NUMA configuration found^M^J
Faking a node at [mem 0x0000000000000000-0x000000041effffff]^M^J
Initmem setup node 0 [mem 0x00000000-0x41effffff]^M^J
  NODE_DATA [mem 0x41effc000-0x41effffff]^M^J
Zone ranges:^M^J
  DMA      [mem 0x00010000-0x00ffffff]^M^J
  DMA32    [mem 0x01000000-0xffffffff]^M^J
  Normal   [mem 0x100000000-0x41effffff]^M^J
Movable zone start for each node^M^J
Early memory node ranges^M^J
  node   0: [mem 0x00010000-0x0009cfff]^M^J
  node   0: [mem 0x00100000-0xdd49dfff]^M^J
  node   0: [mem 0xde049000-0xde0d5fff]^M^J
  node   0: [mem 0xde977000-0xde977fff]^M^J
  node   0: [mem 0xde9bb000-0xdf3f5fff]^M^J
  node   0: [mem 0xdf7f2000-0xdf7fffff]^M^J
  node   0: [mem 0x100000000-0x41effffff]^M^J
ACPI: PM-Timer IO Port: 0x408^M^J
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)^M^J
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] enabled)^M^J
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x04] enabled)^M^J
ACPI: LAPIC (acpi_id[0x04] lapic_id[0x06] enabled)^M^J
ACPI: LAPIC (acpi_id[0x05] lapic_id[0x01] enabled)^M^J
ACPI: LAPIC (acpi_id[0x06] lapic_id[0x03] enabled)^M^J
ACPI: LAPIC (acpi_id[0x07] lapic_id[0x05] enabled)^M^J
ACPI: LAPIC (acpi_id[0x08] lapic_id[0x07] enabled)^M^J
ACPI: LAPIC_NMI (acpi_id[0xff] high edge lint[0x1])^M^J
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])^M^J
IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23^M^J
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)^M^J
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)^M^J
Using ACPI (MADT) for SMP configuration information^M^J
ACPI: HPET id: 0x8086a701 base: 0xfed00000^M^J
smpboot: Allowing 8 CPUs, 0 hotplug CPUs^M^J
PM: Registered nosave memory: 000000000009d000 - 000000000009e000^M^J
PM: Registered nosave memory: 000000000009e000 - 00000000000a0000^M^J
PM: Registered nosave memory: 00000000000a0000 - 00000000000e0000^M^J
PM: Registered nosave memory: 00000000000e0000 - 0000000000100000^M^J
PM: Registered nosave memory: 00000000dd49e000 - 00000000de049000^M^J
PM: Registered nosave memory: 00000000de0d6000 - 00000000de178000^M^J
PM: Registered nosave memory: 00000000de178000 - 00000000de977000^M^J
PM: Registered nosave memory: 00000000de978000 - 00000000de9bb000^M^J
PM: Registered nosave memory: 00000000df3f6000 - 00000000df7f2000^M^J
PM: Registered nosave memory: 00000000df800000 - 00000000f8000000^M^J
PM: Registered nosave memory: 00000000f8000000 - 00000000fc000000^M^J
PM: Registered nosave memory: 00000000fc000000 - 00000000fec00000^M^J
PM: Registered nosave memory: 00000000fec00000 - 00000000fec01000^M^J
PM: Registered nosave memory: 00000000fec01000 - 00000000fed00000^M^J
PM: Registered nosave memory: 00000000fed00000 - 00000000fed04000^M^J
PM: Registered nosave memory: 00000000fed04000 - 00000000fed1c000^M^J
PM: Registered nosave memory: 00000000fed1c000 - 00000000fed20000^M^J
PM: Registered nosave memory: 00000000fed20000 - 00000000fee00000^M^J
PM: Registered nosave memory: 00000000fee00000 - 00000000fee01000^M^J
PM: Registered nosave memory: 00000000fee01000 - 00000000ff000000^M^J
PM: Registered nosave memory: 00000000ff000000 - 0000000100000000^M^J
e820: [mem 0xdf800000-0xf7ffffff] available for PCI devices^M^J
setup_percpu: NR_CPUS:8 nr_cpumask_bits:8 nr_cpu_ids:8 nr_node_ids:1^M^J
PERCPU: Embedded 25 pages/cpu @ffff88041ec00000 s70592 r8192 d23616 u262144^M^J
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 4116414^M^J
Policy zone: Normal^M^J
Kernel command line: root=/dev/sda6 rootfstype=ext4 video=DVI-I-1:1920x1080 radeon.pcie_gen2=1 radeon.audio=1 pci=nocrs console=ttyS0,115200n8 console=tty0 no_console_suspend=1^M^J
PID hash table entries: 4096 (order: 3, 32768 bytes)^M^J
__ex_table already sorted, skipping sort^M^J
xsave: enabled xstate_bv 0x7, cntxt size 0x340^M^J
Checking aperture...^M^J
No AGP bridge found^M^J
Memory: 16388024k/17285120k available (4498k kernel code, 558072k absent, 339024k reserved, 4128k data, 496k init)^M^J
SLUB: Genslabs=15, HWalign=64, Order=0-3, MinObjects=0, CPUs=8, Nodes=1^M^J
Preemptible hierarchical RCU implementation.^M^J
^ICONFIG_RCU_FANOUT set to non-default value of 32^M^J
NR_IRQS:4352 nr_irqs:744 16^M^J
Extended CMOS year: 2000^M^J
Console: colour VGA+ 80x25^M^J
console [tty0] enabled^M^J
console [ttyS0] enabled^M^J
tsc: Fast TSC calibration using PIT^M^J
tsc: Detected 3400.073 MHz processor^M^J
Calibrating delay loop (skipped), value calculated using timer frequency.. 6800.14 BogoMIPS (lpj=3400073)^M^J
pid_max: default: 32768 minimum: 301^M^J
Dentry cache hash table entries: 2097152 (order: 12, 16777216 bytes)^M^J
Inode-cache hash table entries: 1048576 (order: 11, 8388608 bytes)^M^J
Mount-cache hash table entries: 256^M^J
CPU: Physical Processor ID: 0^M^J
CPU: Processor Core ID: 0^M^J
ENERGY_PERF_BIAS: Set to 'normal', was 'performance'^M^J
ENERGY_PERF_BIAS: View and update with x86_energy_perf_policy(8)^M^J
mce: CPU supports 9 MCE banks^M^J
CPU0: Thermal monitoring enabled (TM1)^M^J
process: using mwait in idle threads^M^J
Last level iTLB entries: 4KB 512, 2MB 0, 4MB 0^M^J
Last level dTLB entries: 4KB 512, 2MB 32, 4MB 32^M^J
tlb_flushall_shift: 5^M^J
Freeing SMP alternatives: 20k freed^M^J
ACPI: Core revision 20121018^M^J
..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1^M^J
smpboot: CPU0: Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz (fam: 06, model: 2a, stepping: 07)^M^J
Performance Events: PEBS fmt1+, 16-deep LBR, SandyBridge events, Intel PMU driver.^M^J
... version:                3^M^J
... bit width:              48^M^J
... generic registers:      4^M^J
... value mask:             0000ffffffffffff^M^J
... max period:             000000007fffffff^M^J
... fixed-purpose events:   3^M^J
... event mask:             000000070000000f^M^J
smpboot: Booting Node   0, Processors  #1 #2 #3 #4 #5 #6 #7 OK^M^J
Brought up 8 CPUs^M^J
smpboot: Total of 8 processors activated (54401.16 BogoMIPS)^M^J
devtmpfs: initialized^M^J
PM: Registering ACPI NVS region [mem 0xde0d6000-0xde177fff] (663552 bytes)^M^J
PM: Registering ACPI NVS region [mem 0xde978000-0xde9bafff] (274432 bytes)^M^J
NET: Registered protocol family 16^M^J
ACPI FADT declares the system doesn't support PCIe ASPM, so disable it^M^J
ACPI: bus type pci registered^M^J
PCI: MMCONFIG for domain 0000 [bus 00-3f] at [mem 0xf8000000-0xfbffffff] (base 0xf8000000)^M^J
PCI: MMCONFIG at [mem 0xf8000000-0xfbffffff] reserved in E820^M^J
PCI: Using configuration type 1 for base access^M^J
bio: create slab <bio-0> at 0^M^J
ACPI: Added _OSI(Module Device)^M^J
ACPI: Added _OSI(Processor Device)^M^J
ACPI: Added _OSI(3.0 _SCP Extensions)^M^J
ACPI: Added _OSI(Processor Aggregator Device)^M^J
ACPI: Executed 1 blocks of module-level executable AML code^M^J
ACPI Error: [\_PR_.CPU0._PPC] Namespace lookup failure, AE_NOT_FOUND (20121018/psargs-359)^M^J
ACPI Error: Method parse/execution failed [\_PR_.CPU0._PDC] (Node ffff88040c4a1280), AE_NOT_FOUND (20121018/psparse-537)^M^J
ACPI: SSDT 00000000ddff8c18 00119 (v01  PmRef    ApCst 00003000 INTL 20051117)^M^J
ACPI: Dynamic OEM Table Load:^M^J
ACPI: SSDT           (null) 00119 (v01  PmRef    ApCst 00003000 INTL 20051117)^M^J
ACPI: Interpreter enabled^M^J
ACPI: (supports S0 S3 S4 S5)^M^J
ACPI: Using IOAPIC for interrupt routing^M^J
ACPI: No dock devices found.^M^J
PCI: Ignoring host bridge windows from ACPI; if necessary, use "pci=use_crs" and report a bug^M^J
ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-3e])^M^J
PCI host bridge to bus 0000:00^M^J
pci_bus 0000:00: root bus resource [bus 00-3e]^M^J
pci_bus 0000:00: root bus resource [io  0x0000-0xffff]^M^J
pci_bus 0000:00: root bus resource [mem 0x00000000-0xfffffffff]^M^J
pci 0000:00:01.0: PCI bridge to [bus 01]^M^J
pci 0000:00:1c.0: PCI bridge to [bus 02]^M^J
pci 0000:00:1c.3: PCI bridge to [bus 03]^M^J
pci 0000:00:1c.4: PCI bridge to [bus 04]^M^J
pci 0000:00:1c.5: PCI bridge to [bus 05-06]^M^J
pci 0000:05:00.0: PCI bridge to [bus 06] (subtractive decode)^M^J
pci 0000:00:1c.7: PCI bridge to [bus 07]^M^J
 pci0000:00: Requesting ACPI _OSC control (0x1d)^M^J
 pci0000:00: ACPI _OSC control (0x18) granted^M^J
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 10 *11 12 14 15)^M^J
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 *10 11 12 14 15)^M^J
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 10 *11 12 14 15)^M^J
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 *5 6 10 11 12 14 15)^M^J
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 10 11 12 14 15) *0, disabled.^M^J
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 10 11 12 14 15) *0, disabled.^M^J
ACPI: PCI Interrupt Link [LNKG] (IRQs *3 4 5 6 10 11 12 14 15)^M^J
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 *10 11 12 14 15)^M^J
vgaarb: device added: PCI:0000:01:00.0,decodes=io+mem,owns=io+mem,locks=none^M^J
vgaarb: loaded^M^J
vgaarb: bridge control possible 0000:01:00.0^M^J
SCSI subsystem initialized^M^J
ACPI: bus type scsi registered^M^J
pps_core: LinuxPPS API ver. 1 registered^M^J
pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>^M^J
PTP clock support registered^M^J
PCI: Using ACPI for IRQ routing^M^J
hpet0: at MMIO 0xfed00000, IRQs 2, 8, 0, 0, 0, 0, 0, 0^M^J
hpet0: 8 comparators, 64-bit 14.318180 MHz counter^M^J
Switching to clocksource hpet^M^J
pnp: PnP ACPI init^M^J
ACPI: bus type pnp registered^M^J
system 00:00: [mem 0xfed40000-0xfed44fff] has been reserved^M^J
system 00:04: [io  0x0680-0x069f] has been reserved^M^J
system 00:04: [io  0x1000-0x100f] has been reserved^M^J
system 00:04: [io  0xffff] has been reserved^M^J
system 00:04: [io  0xffff] has been reserved^M^J
system 00:04: [io  0x0400-0x0453] has been reserved^M^J
system 00:04: [io  0x0458-0x047f] has been reserved^M^J
system 00:04: [io  0x0500-0x057f] has been reserved^M^J
system 00:04: [io  0x164e-0x164f] has been reserved^M^J
system 00:06: [io  0x0454-0x0457] has been reserved^M^J
system 00:07: [io  0x0290-0x029f] has been reserved^M^J
system 00:08: [io  0x04d0-0x04d1] has been reserved^M^J
system 00:0c: [mem 0xfed1c000-0xfed1ffff] has been reserved^M^J
system 00:0c: [mem 0xfed10000-0xfed17fff] has been reserved^M^J
system 00:0c: [mem 0xfed18000-0xfed18fff] has been reserved^M^J
system 00:0c: [mem 0xfed19000-0xfed19fff] has been reserved^M^J
system 00:0c: [mem 0xf8000000-0xfbffffff] has been reserved^M^J
system 00:0c: [mem 0xfed20000-0xfed3ffff] has been reserved^M^J
system 00:0c: [mem 0xfed90000-0xfed93fff] has been reserved^M^J
system 00:0c: [mem 0xfed45000-0xfed8ffff] has been reserved^M^J
system 00:0c: [mem 0xff000000-0xffffffff] has been reserved^M^J
system 00:0c: [mem 0xfee00000-0xfeefffff] could not be reserved^M^J
system 00:0c: [mem 0xf0200000-0xf0200fff] has been reserved^M^J
pnp: PnP ACPI: found 13 devices^M^J
ACPI: ACPI bus type pnp unregistered^M^J
pci 0000:00:01.0: PCI bridge to [bus 01]^M^J
pci 0000:00:01.0:   bridge window [io  0xe000-0xefff]^M^J
pci 0000:00:01.0:   bridge window [mem 0xf7e00000-0xf7efffff]^M^J
pci 0000:00:01.0:   bridge window [mem 0xe0000000-0xefffffff 64bit pref]^M^J
pci 0000:00:1c.0: PCI bridge to [bus 02]^M^J
pci 0000:00:1c.3: PCI bridge to [bus 03]^M^J
pci 0000:00:1c.3:   bridge window [io  0xd000-0xdfff]^M^J
pci 0000:00:1c.3:   bridge window [mem 0xf7d00000-0xf7dfffff]^M^J
pci 0000:00:1c.4: PCI bridge to [bus 04]^M^J
pci 0000:00:1c.4:   bridge window [mem 0xf7c00000-0xf7cfffff]^M^J
pci 0000:00:1c.4:   bridge window [mem 0xf0100000-0xf01fffff 64bit pref]^M^J
pci 0000:05:00.0: PCI bridge to [bus 06]^M^J
pci 0000:05:00.0:   bridge window [io  0xc000-0xcfff]^M^J
pci 0000:05:00.0:   bridge window [mem 0xf7b00000-0xf7bfffff]^M^J
pci 0000:05:00.0:   bridge window [mem 0xf0000000-0xf00fffff 64bit pref]^M^J
pci 0000:00:1c.5: PCI bridge to [bus 05-06]^M^J
pci 0000:00:1c.5:   bridge window [io  0xc000-0xcfff]^M^J
pci 0000:00:1c.5:   bridge window [mem 0xf7b00000-0xf7bfffff]^M^J
pci 0000:00:1c.5:   bridge window [mem 0xf0000000-0xf00fffff 64bit pref]^M^J
pci 0000:00:1c.7: PCI bridge to [bus 07]^M^J
pci 0000:00:1c.7:   bridge window [mem 0xf7a00000-0xf7afffff]^M^J
NET: Registered protocol family 2^M^J
TCP established hash table entries: 131072 (order: 9, 2097152 bytes)^M^J
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)^M^J
TCP: Hash tables configured (established 131072 bind 65536)^M^J
TCP: reno registered^M^J
UDP hash table entries: 8192 (order: 6, 262144 bytes)^M^J
UDP-Lite hash table entries: 8192 (order: 6, 262144 bytes)^M^J
NET: Registered protocol family 1^M^J
PCI-DMA: Using software bounce buffering for IO (SWIOTLB)^M^J
software IO TLB [mem 0xd949e000-0xdd49e000] (64MB) mapped at [ffff8800d949e000-ffff8800dd49dfff]^M^J
audit: initializing netlink socket (disabled)^M^J
type=2000 audit(1372193434.673:1): initialized^M^J
HugeTLB registered 2 MB page size, pre-allocated 0 pages^M^J
NTFS driver 2.1.30 [Flags: R/O].^M^J
fuse init (API version 7.20)^M^J
Btrfs loaded^M^J
msgmni has been set to 32007^M^J
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 251)^M^J
io scheduler noop registered^M^J
io scheduler deadline registered^M^J
io scheduler cfq registered^M^J
io scheduler bfq registered (default)^M^J
Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled^M^J
00:0b: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A^M^J
[drm] Initialized drm 1.1.0 20060810^M^J
[drm] radeon defaulting to kernel modesetting.^M^J
[drm] radeon kernel modesetting enabled.^M^J
[drm] initializing kernel modesetting (CAYMAN 0x1002:0x6719 0x1002:0x0B00).^M^J
[drm] register mmio base: 0xF7E20000^M^J
[drm] register mmio size: 131072^M^J
ATOM BIOS: CAYMAN^M^J
radeon 0000:01:00.0: VRAM: 2048M 0x0000000000000000 - 0x000000007FFFFFFF (2048M used)^M^J
radeon 0000:01:00.0: GTT: 512M 0x0000000080000000 - 0x000000009FFFFFFF^M^J
mtrr: type mismatch for e0000000,10000000 old: write-back new: write-combining^M^J
[drm] Detected VRAM RAM=2048M, BAR=256M^M^J
[drm] RAM width 256bits DDR^M^J
[TTM] Zone  kernel: Available graphics memory: 8194022 kiB^M^J
[TTM] Zone   dma32: Available graphics memory: 2097152 kiB^M^J
[TTM] Initializing pool allocator^M^J
[TTM] Initializing DMA pool allocator^M^J
[drm] radeon: 2048M of VRAM memory ready^M^J
[drm] radeon: 512M of GTT memory ready.^M^J
[drm] Supports vblank timestamp caching Rev 1 (10.10.2010).^M^J
[drm] Driver supports precise vblank timestamp query.^M^J
radeon 0000:01:00.0: radeon: using MSI.^M^J
[drm] radeon: irq initialized.^M^J
[drm] GART: num cpu pages 131072, num gpu pages 131072^M^J
[drm] probing gen 2 caps for device 8086:101 = 2/0^M^J
[drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0^M^J
[drm] Loading CAYMAN Microcode^M^J
[drm] PCIE GART of 512M enabled (table at 0x0000000000040000).^M^J
radeon 0000:01:00.0: WB enabled^M^J
radeon 0000:01:00.0: fence driver on ring 0 use gpu addr 0x0000000080000c00 and cpu addr 0xffff88040bd6cc00^M^J
radeon 0000:01:00.0: fence driver on ring 1 use gpu addr 0x0000000080000c04 and cpu addr 0xffff88040bd6cc04^M^J
radeon 0000:01:00.0: fence driver on ring 2 use gpu addr 0x0000000080000c08 and cpu addr 0xffff88040bd6cc08^M^J
radeon 0000:01:00.0: fence driver on ring 3 use gpu addr 0x0000000080000c0c and cpu addr 0xffff88040bd6cc0c^M^J
radeon 0000:01:00.0: fence driver on ring 4 use gpu addr 0x0000000080000c10 and cpu addr 0xffff88040bd6cc10^M^J
[drm] ring test on 0 succeeded in 3 usecs^M^J
[drm] ring test on 3 succeeded in 2 usecs^M^J
[drm] ring test on 4 succeeded in 1 usecs^M^J
[drm] Enabling audio support^M^J
[drm] ib test on ring 0 succeeded in 0 usecs^M^J
[drm] ib test on ring 3 succeeded in 0 usecs^M^J
[drm] ib test on ring 4 succeeded in 1 usecs^M^J
[drm] Radeon Display Connectors^M^J
[drm] Connector 0:^M^J
[drm]   DP-1^M^J
[drm]   HPD5^M^J
[drm]   DDC: 0x6430 0x6430 0x6434 0x6434 0x6438 0x6438 0x643c 0x643c^M^J
[drm]   Encoders:^M^J
[drm]     DFP1: INTERNAL_UNIPHY2^M^J
[drm] Connector 1:^M^J
[drm]   DP-2^M^J
[drm]   HPD4^M^J
[drm]   DDC: 0x6440 0x6440 0x6444 0x6444 0x6448 0x6448 0x644c 0x644c^M^J
[drm]   Encoders:^M^J
[drm]     DFP2: INTERNAL_UNIPHY2^M^J
[drm] Connector 2:^M^J
[drm]   HDMI-A-1^M^J
[drm]   HPD6^M^J
[drm]   DDC: 0x6460 0x6460 0x6464 0x6464 0x6468 0x6468 0x646c 0x646c^M^J
[drm]   Encoders:^M^J
[drm]     DFP3: INTERNAL_UNIPHY1^M^J
[drm] Connector 3:^M^J
[drm]   DVI-D-1^M^J
[drm]   HPD1^M^J
[drm]   DDC: 0x6450 0x6450 0x6454 0x6454 0x6458 0x6458 0x645c 0x645c^M^J
[drm]   Encoders:^M^J
[drm]     DFP4: INTERNAL_UNIPHY1^M^J
[drm] Connector 4:^M^J
[drm]   DVI-I-1^M^J
[drm]   HPD3^M^J
[drm]   DDC: 0x6470 0x6470 0x6474 0x6474 0x6478 0x6478 0x647c 0x647c^M^J
[drm]   Encoders:^M^J
[drm]     DFP5: INTERNAL_UNIPHY^M^J
[drm]     CRT1: INTERNAL_KLDSCP_DAC1^M^J
[drm] Internal thermal controller with fan control^M^J
[drm] radeon: power management initialized^M^J
[drm] fb mappable at 0xE114A000^M^J
[drm] vram apper at 0xE0000000^M^J
[drm] size 8294400^M^J
[drm] fb depth is 24^M^J
[drm]    pitch is 7680^M^J
fbcon: radeondrmfb (fb0) is primary device^M^J
Console: switching to colour frame buffer device 240x67^M^J
radeon 0000:01:00.0: fb0: radeondrmfb frame buffer device^M^J
radeon 0000:01:00.0: registered panic notifier^M^J
[drm] Initialized radeon 2.29.0 20080528 for 0000:01:00.0 on minor 0^M^J
ahci 0000:00:1f.2: AHCI 0001.0300 32 slots 6 ports 6 Gbps 0x3f impl SATA mode^M^J
ahci 0000:00:1f.2: flags: 64bit ncq pm led clo pio slum part ems apst ^M^J
scsi0 : ahci^M^J
scsi1 : ahci^M^J
scsi2 : ahci^M^J
scsi3 : ahci^M^J
scsi4 : ahci^M^J
scsi5 : ahci^M^J
ata1: SATA max UDMA/133 abar m2048@0xf7f16000 port 0xf7f16100 irq 42^M^J
ata2: SATA max UDMA/133 abar m2048@0xf7f16000 port 0xf7f16180 irq 42^M^J
ata3: SATA max UDMA/133 abar m2048@0xf7f16000 port 0xf7f16200 irq 42^M^J
ata4: SATA max UDMA/133 abar m2048@0xf7f16000 port 0xf7f16280 irq 42^M^J
ata5: SATA max UDMA/133 abar m2048@0xf7f16000 port 0xf7f16300 irq 42^M^J
ata6: SATA max UDMA/133 abar m2048@0xf7f16000 port 0xf7f16380 irq 42^M^J
ahci: SSS flag set, parallel bus scan disabled^M^J
ahci 0000:03:00.0: AHCI 0001.0200 32 slots 2 ports 6 Gbps 0x3 impl SATA mode^M^J
ahci 0000:03:00.0: flags: 64bit ncq sntf stag led clo pmp pio slum part ccc sxs ^M^J
scsi6 : ahci^M^J
scsi7 : ahci^M^J
ata7: SATA max UDMA/133 abar m512@0xf7d00000 port 0xf7d00100 irq 43^M^J
ata8: SATA max UDMA/133 abar m512@0xf7d00000 port 0xf7d00180 irq 43^M^J
tg3.c:v3.128 (December 03, 2012)^M^J
tg3 0000:04:00.0 eth0: Tigon3 [partno(BCM57781) rev 57785100] (PCI Express) MAC address bc:5f:f4:43:92:27^M^J
tg3 0000:04:00.0 eth0: attached PHY is 57765 (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[1])^M^J
tg3 0000:04:00.0 eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] TSOcap[1]^M^J
tg3 0000:04:00.0 eth0: dma_rwctrl[00000001] dma_mask[64-bit]^M^J
i8042: PNP: PS/2 Controller [PNP0303:PS2K] at 0x60,0x64 irq 1^M^J
i8042: PNP: PS/2 appears to have AUX port disabled, if this is incorrect please boot with i8042.nopnp^M^J
serio: i8042 KBD port at 0x60,0x64 irq 1^M^J
mousedev: PS/2 mouse device common for all mice^M^J
rtc_cmos 00:05: RTC can wake from S4^M^J
rtc_cmos 00:05: rtc core: registered rtc_cmos as rtc0^M^J
rtc0: alarms up to one month, y3k, 242 bytes nvram, hpet irqs^M^J
cpuidle: using governor ladder^M^J
cpuidle: using governor menu^M^J
TCP: cubic registered^M^J
NET: Registered protocol family 17^M^J
TuxOnIce 3.3 (http://tuxonice.net)^M^J
input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input0^M^J
tsc: Refined TSC clocksource calibration: 3400.023 MHz^M^J
Switching to clocksource tsc^M^J
ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)^M^J
ata5: SATA link down (SStatus 0 SControl 300)^M^J
ata2: SATA link down (SStatus 0 SControl 300)^M^J
ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)^M^J
ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)^M^J
ata6: SATA link up 1.5 Gbps (SStatus 113 SControl 300)^M^J
ata1.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out^M^J
ata7: SATA link down (SStatus 0 SControl 300)^M^J
ata1.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out^M^J
ata3.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out^M^J
ata3.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out^M^J
ata4.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out^M^J
ata4.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out^M^J
ata6.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out^M^J
ata6.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out^M^J
ata1.00: ATA-9: M4-CT256M4SSD2, 040H, max UDMA/100^M^J
ata1.00: 500118192 sectors, multi 16: LBA48 NCQ (depth 31/32), AA^M^J
ata6.00: ATAPI: Optiarc DVD RW AD-7260S, 1.03, max UDMA/100^M^J
ata4.00: HPA detected: current 976771055, native 976773168^M^J
ata3.00: HPA detected: current 1953523055, native 1953525168^M^J
ata4.00: ATA-8: SAMSUNG HD501LJ, CR100-10, max UDMA7^M^J
ata4.00: 976771055 sectors, multi 16: LBA48 NCQ (depth 31/32), AA^M^J
ata1.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out^M^J
ata1.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out^M^J
ata3.00: ATA-8: WDC WD10EADS-00M2B0, 01.00A01, max UDMA/133^M^J
ata3.00: 1953523055 sectors, multi 16: LBA48 NCQ (depth 31/32), AA^M^J
ata6.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out^M^J
ata6.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out^M^J
ata4.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out^M^J
ata4.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out^M^J
ata1.00: configured for UDMA/100^M^J
ata3.00: ACPI cmd f5/00:00:00:00:00:00 (SECURITY FREEZE LOCK) filtered out^M^J
scsi 0:0:0:0: Direct-Access     ATA      M4-CT256M4SSD2   040H PQ: 0 ANSI: 5^M^J
sd 0:0:0:0: [sda] 500118192 512-byte logical blocks: (256 GB/238 GiB)^M^J
sd 0:0:0:0: [sda] Write Protect is off^M^J
sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA^M^J
ata3.00: ACPI cmd b1/c1:00:00:00:00:00 (DEVICE CONFIGURATION OVERLAY) filtered out^M^J
ata6.00: configured for UDMA/100^M^J
ata4.00: configured for UDMA/133^M^J
ata3.00: configured for UDMA/133^M^J
scsi 2:0:0:0: Direct-Access     ATA      WDC WD10EADS-00M 01.0 PQ: 0 ANSI: 5^M^J
sd 2:0:0:0: [sdb] 1953523055 512-byte logical blocks: (1.00 TB/931 GiB)^M^J
scsi 3:0:0:0: Direct-Access     ATA      SAMSUNG HD501LJ  CR10 PQ: 0 ANSI: 5^M^J
 sda: sda1 sda2 < sda5 sda6 sda7 sda8 sda9 sda10 sda11 >^M^J
sd 3:0:0:0: [sdc] 976771055 512-byte logical blocks: (500 GB/465 GiB)^M^J
sd 3:0:0:0: [sdc] Write Protect is off^M^J
sd 3:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA^M^J
sd 0:0:0:0: [sda] Attached SCSI disk^M^J
scsi 5:0:0:0: CD-ROM            Optiarc  DVD RW AD-7260S  1.03 PQ: 0 ANSI: 5^M^J
 sdc: sdc1 sdc3 < sdc5 sdc6 >^M^J
sd 3:0:0:0: [sdc] Attached SCSI disk^M^J
sd 2:0:0:0: [sdb] Write Protect is off^M^J
sd 2:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA^M^J
ata8: SATA link down (SStatus 0 SControl 300)^M^J
 sdb: sdb2 < sdb5 sdb6 sdb7 sdb8 sdb9 sdb10 sdb11 >^M^J
sd 2:0:0:0: [sdb] Attached SCSI dis^M^J
TuxOnIce: Image found.^M^J
Using dev_t 8:9, returned by blk_lookup_fs_info.^M^J
Failed to launch userspace program '/sbin/tuxoniceui': Error -2^M
Launch userspace program failed.zing user space processes ... (elapsed 0.00 seconds) done.^M^J
Reading kernel & process data...^M^J
...20%...40%...60%...80%^M^J
Atomic restore.^M^J
Doing atomic copy/restore.^M^J

And that's all, the computer hangs here. Any way to get more output?
Comment 20 Nigel Cunningham 2013-06-25 20:54:06 UTC
If you install the userui binary in the initramfs, you'll be able to single step through the process and modify the TuxOnIce debugging settings as you go - perhaps this will help. Press 'P' to pause between steps, 0-9 to modify the level of debugging info that's displayed (9 = max info, including all kernel messages).
Comment 21 Harald Judt 2013-07-10 19:44:01 UTC
It took me a while because it turned out the old cable was faulty after all, and I had to replace it. Now that I got a new one, I installed an initramfs with tuxonice userui and tried to reproduce. Unfortunately, I was not very successful in getting more information.

Here are the last lines of the log. Sometimes when it hangs, only the 20%....80% line gets printed and nothing more. Sometimes, it goes like this:

...
TuxOnIce: Image found.^M^J
Using dev_t 8:9, returned by blk_lookup_fs_info.^M^J
Console is 67x240.^M^J
Freezing user space processes ... (elapsed 0.00 seconds) done.^M^J
...20%...40%...60%...80%^M^J
usb usb3: root hub lost power or was reset^M^J
usb usb6: root hub lost power or was reset^M^J
bttv: 0: reset, reinitialize^M^J
bttv: 0: Setting PLL: 28636363 => 35468950 (needs up to 100ms)^M^J
ehci-pci 0000:00:1d.0: setting latency timer to 64^M^J
usb usb5: root hub lost power or was reset^M^J
snd_hda_intel 0000:00:1b.0: irq 45 for MSI/MSI-X^M^J
snd_hda_intel 0000:01:00.1: irq 46 for MSI/MSI-X^M^J
i8042 kbd 00:0a: System wakeup disabled by ACPI^M^J
tg3 0000:04:00.0 eth0: Link is down^M^J
serial 00:0b: activated^M^J
ehci-pci 0000:00:1a.0: cache line size of 64 is not supported^M^J
ehci-pci 0000:00:1d.0: cache line size of 64 is not supported^M^J
bttv: PLL set ok^M^J

This happens after tuxoniceui shows me "About to restore original kernel. Press SPACE to continue". I guess this output is not complete, and not everything gets printed to the serial console because the computer hard freezes before that. The lines are similar, but not always in that order. They also appear in dmesg when everything goes right, so they're nothing special. Any idea what else I could do?
Comment 22 Harald Judt 2013-08-21 19:49:37 UTC
This was a problem with the power management of the radeon graphics driver and should be fixed in 3.11, as long as one disables UVD with an unofficial radeon.no_uvd=1 patch. Since this is solved for me now, thanks for your help.

Note You need to log in before you can comment on or make changes to this bug.