Bug 34102

Summary: radeon drm/kms: please use suspend/hibernate notifiers for allocating memory in suspend routines
Product: Drivers
Reporter: Martin Steigerwald (Martin)
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri
Status: CLOSED CODE_FIX
Severity: normal
CC: alan, florian, rjw
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 2.6.38
Subsystem:
Regression: No
Bisected commit-id:
Bug Depends on:
Bug Blocks: 7216
Attachments: PM / Hibernate: Add new sysfs attribute for controlling reserved memory

Description Martin Steigerwald 2011-04-29 16:06:21 UTC
Since switching to Radeon KMS, hibernation aborts due to lack of free memory whenever I have more than 2-3 applications open, even though my ThinkPad T42 has 2 GB of RAM. This happens with in-kernel suspend as well as with TuxOnIce, which I currently do not use anymore. For further details please see:

Bug #30482 -  try harder to free enough memory / improve image size autotuning

According to Rafael 

"The problem is a consequence of bugs in device drivers that shouldn't allocate
memory in their suspend/resume routines _at_ _all_.

So, a more fundamental fix would be to modify drivers so that they use
suspend/hibernate notifiers for allocating memory.  IOW, you should complain
to the developers of the drivers that cause problems to happen."

(see https://bugzilla.kernel.org/show_bug.cgi?id=30482#c7)

While I am not completely sure that it is the Radeon DRM/KMS driver that allocates additional memory during the hibernation cycle, the fact that I have had this issue since switching to KMS points to it.

Currently I am using 2.6.38.4 with gallium userspace:

martin@shambhala:~> glxinfo | grep OpenGL
OpenGL vendor string: X.Org R300 Project
OpenGL renderer string: Gallium 0.4 on ATI RV350
OpenGL version string: 2.1 Mesa 7.10.2
OpenGL shading language version string: 1.20
OpenGL extensions:

martin@shambhala:~> xdpyinfo | grep version
version number:    11.0
X.Org version: 1.10.1

but this happened with earlier versions as well.
Comment 1 Rafael J. Wysocki 2011-05-03 20:45:22 UTC
Strictly speaking, this is not a regression, because the radeon driver with
KMS has never worked correctly for you in this respect, right?

Now, while the ultimate fix would be to rework device drivers so that they
don't allocate memory from their ->suspend() callbacks, it's not quite
realistic to expect that that's going to happen any time soon.

For this reason, I'll try to provide a workaround for you.
Comment 2 Martin Steigerwald 2011-05-03 21:33:06 UTC
It depends on the viewpoint. Compared with the pre-KMS radeon driver this has been a regression for me.

Thanks for trying to provide a workaround. I appreciate it.

For now I work around this problem by using KDE 4.6 activities, which make it easy to close a set of applications before hibernating and to restore them to their old state after resuming. Anyway, my next notebook will have 8 GB of RAM and be 64-bit, so this should not be much of a problem there. Just a hint that a fix is not urgent for me. But I will keep the T42 on which the problem happens at hand to test any workaround you provide.
Comment 3 Rafael J. Wysocki 2011-05-03 21:44:15 UTC
Created attachment 56382
PM / Hibernate: Add new sysfs attribute for controlling reserved memory

The attached patch adds a new knob, /sys/power/reserved_size, allowing you to
set the size of memory (in bytes) to be reserved for device drivers'
hibernation callbacks while counting the number of memory pages to allocate.

Please check if you can set it to such a number that hibernation will always
succeed on your machine.
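
For illustration, a minimal Python sketch of setting the knob. It assumes a kernel that carries the patch and exposes /sys/power/reserved_size, and that the script runs as root; the helper names mib_to_bytes and write_reserved_size are hypothetical and not part of any kernel tooling.

```python
MIB = 1024 * 1024


def mib_to_bytes(mib):
    """Convert MiB to bytes, the unit reserved_size expects."""
    return int(mib * MIB)


def write_reserved_size(nbytes, path="/sys/power/reserved_size"):
    """Write nbytes to the knob and return the value read back.

    The default path is the one named in the patch description; pass a
    scratch file instead to try the helper without root access.
    """
    with open(path, "w") as f:
        f.write(str(nbytes))
    with open(path) as f:
        return int(f.read().strip())
```

Calling write_reserved_size(mib_to_bytes(2)) would be the equivalent of `echo 2097152 > /sys/power/reserved_size`.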
Comment 4 Martin Steigerwald 2011-05-04 14:23:34 UTC
Sounds quite like extra pages allowance in TuxOnIce.

Okay, on 2.6.38.5-tp42-snapshot-resv-size-dirty with just your patch from this bug report I am starting with:

shambhala:/sys/power> ls
disk  image_size  pm_async  pm_test  reserved_size  resume  state  wakeup_count
shambhala:/sys/power> cat reserved_size 
1048576
shambhala:/sys/power> echo 2097152 > reserved_size
shambhala:/sys/power> cat reserved_size           
2097152

With that I suspended Amarok, KMail, Kontact, Akregator on 4 activities with different backgrounds and whatnot just fine.

I bet I should remove any manual tweaking of image_size then, should I? I currently have:

echo 710000000 > /sys/power/image_size

Hmmm, I will remove it and reboot to get back the auto estimation.

Thanks.
Comment 5 Rafael J. Wysocki 2011-05-04 16:31:03 UTC
The image size setting shouldn't matter and in fact I'd like to restore the
old autotuned value (2/5 of RAM), so it would be good if you could test this
one.

Do you need a patch for that?
Comment 6 Martin Steigerwald 2011-05-04 19:37:53 UTC
Hmmm, I think I now have 2/5 of RAM as image_size autotune value with 2.6.38.5:

shambhala:/sys/power> free 
             total       used       free     shared    buffers     cached
Mem:       2073608    2018792      54816          0      34568     886860
-/+ buffers/cache:    1097364     976244
Swap:      4000180     141140    3859040
shambhala:/sys/power> cat image_size
844292096
shambhala:/sys/power> irb
irb(main):001:0> 2073608 * 1024
=> 2123374592
irb(main):002:0> 844292096 / 2123374592.0
=> 0.397618064745121
irb(main):003:0>

I removed the autotune refinement patch you asked me to test in bug #30482.

According to https://bugzilla.kernel.org/show_bug.cgi?id=30482#c5 that patch has been merged, but that doesn't seem to be the case!?
Comment 7 Rafael J. Wysocki 2011-05-04 20:05:11 UTC
It has been merged into the mainline, but not necessarily into -stable.

Anyway, with the patch from bug #30482 removed, are you able to set
reserved_size to a number allowing you to successfully hibernate every
time?
Comment 8 Martin Steigerwald 2011-05-05 19:53:40 UTC
Those 2 MB seem to do just fine with one KDE 4.6 session and lots of open stuff. Next week I am holding a Linux training, where I will have a private and a job KDE session open at once. I will check whether the hibernation code copes with that ;). If it doesn't: this sometimes didn't work with TuxOnIce either, even before Radeon KMS. Maybe there simply is a limit somewhere?
Comment 9 Martin Steigerwald 2011-05-09 12:15:32 UTC
I now had a hang at preallocation (unfortunately before I got around to compiling with your patch from bug 30492). I raised reserved_size to 4 MiB for now. Let's see how that goes. If it works, then I might go back to 2 MiB or even less in order to trigger bug 30492 with the patch from there compiled in.
Comment 10 Martin Steigerwald 2011-05-09 15:15:49 UTC
This wasn't a hang I think, see there.

I tried to hibernate two running KDE 4 sessions with reserved_size up to 64 MiB:

shambhala:/sys/power> cat reserved_size 
67108864

Which failed with:

May  9 16:41:47 localhost kernel: PM: freeze of devices complete after 524.427 msecs
May  9 16:41:47 localhost kernel: PM: late freeze of devices complete after 0.509 msecs
May  9 16:41:47 localhost kernel: ACPI: Preparing to enter system sleep state S4
May  9 16:41:47 localhost kernel: PM: Saving platform NVS memory
May  9 16:41:47 localhost kernel: Extended CMOS year: 2000
May  9 16:41:47 localhost kernel: PM: Creating hibernation image:
May  9 16:41:47 localhost kernel: PM: Need to copy 218459 pages
May  9 16:41:47 localhost kernel: PM: Normal pages needed: 113686 + 1024, available pages: 113492
May  9 16:41:47 localhost kernel: PM: Not enough free memory
May  9 16:41:47 localhost kernel: PM: Error -12 creating hibernation image
May  9 16:41:47 localhost kernel: Extended CMOS year: 2000
May  9 16:41:47 localhost kernel: ACPI: Waking up from system sleep state S4
May  9 16:41:47 localhost kernel: PM: early recover of devices complete after 0.366 msecs

Does it make sense to go to even higher values? I am a bit puzzled, since for one KDE 4 session a value of 2 MiB has turned out to be enough for about 5-10 attempts - it didn't fail once.

Maybe here there is simply not enough memory available no matter what I reserve for driver allocation? If so, then so be it. That might be just a limit for a 2 GB RAM machine.

Is there a way to tell for sure whether reserved size is too low or there are general memory constraints? TuxOnIce logged how much of the reserved extra pages it used. Can in-kernel-hibernation do this too?
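
As a rough aid for reading the log above, the following Python sketch extracts the three page counts from the "Normal pages needed" line and computes the gap between them. It assumes 4 KiB pages (an assumption, though usual on 32-bit x86) and takes the "+ 1024" term at face value as an additional requirement; page_shortfall is a hypothetical helper name. It only makes the size of the gap visible and does not by itself tell whether a larger reserved_size would have helped.

```python
import errno
import re

PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on 32-bit x86

LINE_RE = re.compile(
    r"PM: Normal pages needed: (\d+) \+ (\d+), available pages: (\d+)"
)


def page_shortfall(log_line):
    """Return (needed + extra - available) in pages, or None if no match."""
    m = LINE_RE.search(log_line)
    if m is None:
        return None
    needed, extra, available = map(int, m.groups())
    return needed + extra - available


line = ("May  9 16:41:47 localhost kernel: PM: Normal pages needed: "
        "113686 + 1024, available pages: 113492")

pages = page_shortfall(line)
print("short by", pages, "pages =", pages * PAGE_SIZE, "bytes")

# "Error -12" in the log is the negated errno value for ENOMEM.
print(errno.errorcode[12])
```

For the log line quoted above this gives a shortfall of 1218 pages, which is 4988928 bytes (a little under 5 MiB) under the 4 KiB page assumption.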
Comment 11 Martin Steigerwald 2011-05-09 15:59:09 UTC
Ok, now it did it! First with 256 MiB reserved_size, then also with 128 MiB reserved_size. Two KDE 4 sessions with compositing enabled. It had some problems freezing tasks initially, because an rdiff-backup and two KDE 4 desktop search indexers were running, but after some tries it did it. And if the rdiff-backup and those desktop searches fill the laptop hard disk with I/O requests up to its limit, that's just what is to be expected. I am now testing 128 MiB to see whether it works reliably. Still I wonder why 2 MiB was enough for one KDE 4 session, but two need something around 128 MiB.
Comment 12 Rafael J. Wysocki 2011-05-09 19:06:31 UTC
It looks like the amount of memory the graphics driver allocates during
hibernation depends on how the GPU is loaded (the more 3D stuff you do before
hibernation, the more memory it needs to allocate).

So, I'd say reserved_size should match not only your hardware configuration,
but also your workload.

I'll try to post the patch from comment #3 for review, let's see if people
will like it. :-)
Comment 13 Florian Mickler 2011-05-30 07:56:27 UTC
A patch referencing this bug report has been merged in v3.0-rc1:

commit ddeb648708108091a641adad0a438ec4fd8bf190
Author: Rafael J. Wysocki <rjw@sisk.pl>
Date:   Sun May 15 11:38:48 2011 +0200

    PM / Hibernate: Add sysfs knob to control size of memory for drivers