Bug 47931

Summary: Hibernate fails 1st time if more than 4GB used on 8GB system, then succeeds - ThinkPad T420
Product: Power Management Reporter: Dan Carpenter (error27)
Component: Hibernation/Suspend Assignee: Aaron Lu (aaron.lu)
Status: CLOSED INSUFFICIENT_DATA    
Severity: normal CC: aaron.lu, alan, anath.amd, doa379, gfa, ht990332, jrf, lenb, narindam, rric, rui.zhang, tenoreamer
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 3.10 Subsystem:
Regression: No Bisected commit-id:
Attachments: dmesg
sudo tail -n 400 /var/log/messages >m
dmesg
syslog
Proposal for a s2disk-snapshot.c-patch
Proposal for a modified/simplified patch for snapshot.c
Add debug prints to the snapshot code
Add debug prints to the snapshot code
dmesg after 5 times hibernation in a row and one more after started vm and closed vm

Description Dan Carpenter 2012-09-25 11:23:42 UTC
Subject    : w/ current kernel (3.5.3) I need 2 attempts for s2disk in a row
Submitter  : Toralf Förster <toralf.foerster@gmx.de>
Date       : Sat, 15 Sep 2012 11:30:26 +0200
Message-ID : <50544AB2.302@gmx.de>
References : https://lkml.org/lkml/2012/9/15/36

With a ThinkPad T420, latest BIOS and current kernel (8 GB RAM and 10 GB swap space), I often need 2 attempts in a row to s2disk.
The first fails while the second (immediately after it) works :

2012-09-15T05:17:54.114+02:00 n22 kernel: PM: Need to copy 719868 pages
2012-09-15T05:17:54.114+02:00 n22 kernel: PM: Normal pages needed: 146128 + 1024, available pages: 81456
2012-09-15T05:17:54.114+02:00 n22 kernel: PM: Not enough free memory
2012-09-15T05:17:54.114+02:00 n22 kernel: PM: Error -12 creating hibernation image


...

2012-09-15T11:02:43.192+02:00 n22 kernel: PM: Creating hibernation image:
2012-09-15T11:02:43.192+02:00 n22 kernel: PM: Need to copy 384852 pages
2012-09-15T11:02:43.202+02:00 n22 kernel: PM: Normal pages needed: 65642 + 1024, available pages: 161930
2012-09-15T11:02:43.202+02:00 n22 kernel: PM: Restoring platform NVS memory


/me wonders whether it is a known issue or special to my system.

-- 
MfG/Sincerely
Toralf Förster
Comment 1 Toralf Förster 2012-09-25 16:12:32 UTC
Created attachment 81021 [details]
dmesg
Comment 2 Toralf Förster 2012-09-25 16:13:10 UTC
Created attachment 81031 [details]
sudo tail -n 400 /var/log/messages >m

last known good kernel 3.4.x I'd say
Comment 3 Toralf Förster 2012-09-25 16:18:21 UTC
Probably it starts with 3.5.3 (3.5.2 was still good) - I found no such error message for kernel versions earlier than 3.5.3
Comment 4 Arindam Nath 2012-10-05 05:28:32 UTC
Hello,

Is this issue still reproducible with the latest upstream kernel?

If yes, please provide a single complete log containing both an unsuccessful and a successful s2disk attempt.

Also, please let me know the exact command you used to enter hibernation.

It would also help if you could let me know the distro you are using and upload the kernel config file used for the kernel build.

BTW, I do not have the ThinkPad model you mentioned, but I can still try to reproduce the issue locally once you provide the info.

Thanks.
Comment 5 Toralf Förster 2012-10-05 17:47:19 UTC
Created attachment 82391 [details]
dmesg

Hi, 3.5.5 shows the same behaviour. I'm running stable Gentoo Linux with a self-compiled vanilla kernel; dmesg is attached (booted, worked for a while to fill the RAM with pages, then ran s2disk) via these commands in /etc/acpi/default.sh:

...
        #       suspend now
        #
        sync; sync
        echo $1 > /sys/power/state
        RC=$?

        if [[ $RC -ne 0 ]]; then
                TEXT="PM: 2nd attempt to suspend to $1 in 3 sec ..."
                echo -e "\n$TEXT\n"
                logger $TEXT
                sleep 3
                echo $1 > /sys/power/state
        fi
...
Comment 6 Toralf Förster 2012-10-10 20:13:41 UTC
FWIW, the system is booted from an external USB drive (30 MB/sec write speed),
/proc/sys/vm/dirty_writeback_centisecs was set to 3000
and I'm running 4 BOINC jobs (each occupying about 100 MB of RAM).

Now I set /proc/sys/vm/dirty_writeback_centisecs back to 500, stopped Firefox, Thunderbird and BOINC beforehand, and added a "sleep 2" between the two sync commands.

With these settings it worked better.
Comment 7 Arindam Nath 2012-10-17 09:09:04 UTC
Can you please not use any scripts for the first level of debugging. Use the below command to enter S4 and press power button to resume. Let me know how many continuous cycles the system can enter S4 successfully.
Comment 8 Arindam Nath 2012-10-17 09:10:36 UTC
Oh, and in case it still fails, please attach the complete log. Sorry but I am not able to reproduce the issue at my end, so I will need your help with the outputs.
Comment 9 Toralf Förster 2012-10-17 17:23:43 UTC
(In reply to comment #7)
> Can you please not use any scripts for the first level of debugging. Use the
> below command to enter S4 and press power button to resume. Let me know how
> many continuous cycles the system can enter S4 successfully.

Erm, what do you mean with "below command" ?

BTW, the script itself is just a wrapper for 

echo disk > /sys/power/state

and is called via an ACPI event, but of course I can run the echo command on the command line directly.
Comment 10 Toralf Förster 2012-10-17 20:01:36 UTC
Created attachment 83741 [details]
syslog

Attached is the whole syslog (2:10 h) from boot until I ran into the issue.
It seems that at least 4 GB of the 8 GB need to be occupied before s2disk fails. To get there I recompiled Wireshark, ran BOINC with RAM-intensive tasks, synced Gentoo's portage tree and updated user-mode-linux ISO images.
Then it happened :


n22 /etc/init.d # free
             total       used       free     shared    buffers     cached
Mem:       8192556    3803352    4389204          0     294116    2163776
-/+ buffers/cache:    1345460    6847096
Swap:      9767516        312    9767204

n22 /etc/init.d # echo disk > /sys/power/state
-su: echo: write error: Cannot allocate memory


The immediately following attempt succeeded fine :

n22 /etc/init.d # echo disk > /sys/power/state

After that I restarted the network to fetch a few emails, then stopped it again and ran s2disk once more (that succeeded too) :

n22 /etc/init.d # /etc/init.d/net.ppp0 restart
 * Bringing down interface ppp0
 *   Running predown ...
 *   Stopping pppd on ppp0                                                                                [ ok ]
 * Bringing up interface ppp0
 *   Starting pppd in ppp0 ...                                                                            [ ok ]
 *   received address 78.54.54.13
 * WARNING: net.ppp0 has started, but is inactive

n22 /etc/init.d # free
             total       used       free     shared    buffers     cached
Mem:       8192556    2725240    5467316          0     108608    1321388
-/+ buffers/cache:    1295244    6897312
Swap:      9767516        452    9767064

n22 /etc/init.d # /etc/init.d/net.ppp0 stop
 * Bringing down interface ppp0
 *   Running predown ...
 *   Stopping pppd on ppp0                                                                                [ ok ]

n22 /etc/init.d # date
Wed Oct 17 21:49:31 CEST 2012

n22 /etc/init.d # echo disk > /sys/power/state

n22 /etc/init.d #
Comment 11 Arindam Nath 2012-10-24 06:48:41 UTC
Hmm, can you please try:

echo 0 > /sys/power/image_size
echo disk > /sys/power/state

The first command will try to create a minimal image during hibernation.
Comment 12 Toralf Förster 2012-10-25 20:30:23 UTC
(In reply to comment #11)
> Hmm, can you please try:
> 
> echo 0 > /sys/power/image_size

That seems to work.
Comment 13 Toralf Förster 2012-11-29 15:21:23 UTC
/me wonders whether a wakeup from s2disk doesn't work if the image is > 3 GB
(I run a 32 bit Gentoo Linux with 8 GB RAM) ?
Comment 14 Rafael J. Wysocki 2013-06-04 01:24:22 UTC
First off, this is not a regression.

Second, the reason is that the image size automatically computed by the kernel is too large to fit into memory, which is related to the way highmem is handled by the hibernate core.

Somebody sent me a patch to address this problem; I'll try to dig it up. Please ping me in a few days.
Comment 15 Aaron Lu 2013-06-08 01:05:24 UTC
Hi Toralf,

Are you still there? Will you try out the patches once Rafael digs them up? Thanks.
Comment 16 Toralf Förster 2013-06-08 17:37:49 UTC
(In reply to comment #15)
> Hi Toralf,
> 
> Are you still there? Will you try out patches once Rafael dig them up?
> Thanks.

1. Sorry - back from Dresden now (flooding).
2. yes.
Comment 17 Aaron Lu 2013-06-14 07:48:10 UTC
Rafael,

Got time to dig that patch up?
Comment 18 Eric Lan 2013-11-22 20:51:15 UTC
I see the same issue with 3.10-2-amd64 on Linux Mint Debian Edition.
Comment 19 Eric Lan 2013-11-22 20:52:31 UTC
$ dmesg
[142528.245031] PM: Creating hibernation image:
[142528.248008] PM: Need to copy 530249 pages
[142528.248008] PM: Normal pages needed: 530249 + 1024, available pages: 518040
[142528.248008] PM: Not enough free memory
[142528.248008] PM: Error -12 creating hibernation image
[142528.248008] CPU0: Thermal monitoring handled by SMI
[142528.248008] Enabling non-boot CPUs ...
[142528.248008] smpboot: Booting Node 0 Processor 1 APIC 0x1
[142528.144009] CPU1: Thermal monitoring handled by SMI
[142528.256966] CPU1 is up
[142528.261997] ACPI: Waking up from system sleep state S4
Comment 20 Zhang Rui 2014-06-03 02:45:25 UTC
ping Aaron.
Comment 21 Zhang Rui 2014-06-03 05:58:41 UTC
*** Bug 66041 has been marked as a duplicate of this bug. ***
Comment 22 Aaron Lu 2014-06-03 06:00:37 UTC
I think Rafael already has the patch, as he said in comment #14; it just needs to be dug out so the reporters can test it.
Comment 23 Zhang Rui 2014-06-03 06:09:58 UTC
so ping Rafael...
Comment 24 Rainer Fiebig 2014-08-02 13:26:15 UTC
Good day!

Will this issue be solved in kernel 3.16?

Kind regards

Jay
Comment 25 Rainer Fiebig 2014-09-15 19:26:30 UTC
Hi!

I have encountered this long-lived bug many times: s2disk/hibernate works great until you start one program or VM too many.
In that case it takes 2 or more attempts until s2disk finally succeeds.

Comment 14 suggests that this happens if free memory becomes too tight for the image to fit in.

Taking this as a starting point I found out that "buffers" and "cache" can take up so much memory that indeed there's not enough left for the image. 

For example:

> free
             total       used       free     shared    buffers     cached
Mem:      11973164   10071388    1901776          0    2416572    3160312
-/+ buffers/cache:    4494504    7478660
Swap:     10714108          0   10714108

Had I started s2disk (s2both) at this point, it clearly would have failed.

However, after freeing the memory held by buffers/cache by issuing "sync" followed by "echo 1 > /proc/sys/vm/drop_caches", s2disk worked at the first attempt.

I've automated this and s2disk now always works at the first attempt if not more than 50% of memory is used by programs and data. In my case this means it now works almost every time, which is a big difference from the situation before. The price to pay is a slightly less responsive desktop right after resume.

So I found a workaround for me. But shouldn't the s2disk/hibernate routine take care of a situation like the one shown above - by making use of swap space before creating the image, for example?

A smaller image would be nice too, of course.

Jay
Comment 26 Rainer Fiebig 2014-09-25 20:26:31 UTC
Probably just talking to myself here. 

But today is the second anniversary of Bug 47931! Reason enough for a few lines. So, only for reference...

After having found a quick-and-dirty-workaround for me, I wanted to find out what might be the crux of this matter. 

So I took a look at snapshot.c and "int hibernate_preallocate_memory(void)". 
I added some printk statements to get an idea of how it works, what the variables stand for, what values they take (see examples below) and where s2disk takes the emergency exit in case of failure (line 1418 in my case).
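For illustration, prints of this kind - just a sketch of the idea; the exact set I used produced the "JRF:" lines visible in the dmesg excerpts further below, and the variable names are those of 3.16's hibernate_preallocate_memory():

	/* Sketch: debug prints dropped into hibernate_preallocate_memory(),
	   each placed after the corresponding variable has been computed. */
	printk(KERN_DEBUG "JRF: saveable (= count_data_pages()) %lu\n", saveable);
	printk(KERN_DEBUG "JRF: count (-= totalreserve_pages) %lu\n", count);
	printk(KERN_DEBUG "JRF: max_size %lu\n", max_size);
	printk(KERN_DEBUG "JRF: size (= DIV_ROUND_UP(image_size, PAGE_SIZE)) %lu\n", size);
	printk(KERN_DEBUG "JRF: pages (= minimum_image_size(saveable)) %lu\n", pages);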


I now think the problem is as follows:

1) s2disk does not differentiate between program-data and buffers/cached but takes the sum total of both as the image to be saved.
    - If this image is > 50% of total available memory (RAM), s2disk fails.
    
2) The mechanism applied by s2disk to free memory ("shrink_all_memory(saveable - size)") seems not effective enough.
    - Example: shrink_all_memory(2,594,668 - 1,510,485) should free 1,084,183 pages. Actually achieved: 139,143 (13%).
    - It apparently does not reduce buffers/cached, or not enough.
    
3) As buffers/cached can become very big very quickly, s2disk often fails although not much memory seems to be used by programs.
    - This could also explain Bug 83821 and why "drop_caches=1" helps there, too.
    
4) During or after a failed attempt, buffers/cached are somehow significantly reduced. This seems to be the major reason why the immediately following attempt often works - because there is now enough free memory for the image.

Adding more memory might help to a certain extent, but a good deal of it would probably be eaten up by buffers/cached as well. So - in s2disk's current state, the only way I see to make s2disk succeed more reliably at the first attempt is applying "sync/drop_caches" before initiating s2disk.

The dark side of it: the closer "used minus buffers/cached" gets to 50% of total memory, the longer the resume takes and the longer and more severe the initial sluggishness of the system. If "used minus buffers/cached" becomes >= 50% of memory, the computer will probably hang and a hard reset may be necessary.


Suggestions to resolve Bug 47931 and improve s2disk:

- Short term:
1) "shrink_all_memory" should be replaced or supplemented by a more effective way to free memory.

2) This should be accompanied by a mechanism which aborts an s2disk attempt early if the image size comes too close to 50% of total memory.

- Long term:
1) s2disk should be so designed that even an image of more than 50% of memory can be written to disk.

2) The tunable "image_size" should be dropped. This would declutter things without losing much value. s2disk can find the right size anyway.


Jay


------------------------


First attempt: failure				Pages		MB

save_highmem (= count_highmem_pages())		0		0
saveable (= count_data_pages())			2618027		10227
highmem						3061349		11958
count (+= highmem;)				3061349		11958
count (-= totalreserve_pages;)			3023172		11809
size (nr of pages required for page keys)	216		1
max_size (max nr of sav. pages to leave in mem)	1510454		5900
size (= DIV_ROUND_UP(image_size, PAGE_SIZE))	1346981		5262
pages (= minimum_image_size(saveable)		1867029		7293
avail_normal					3061349		11958
avail_normal					1194320		4665
size						1510454		5900
shrink_all_memory(saveable – size)		147415		576
alloc (-= pages)				318398		1244
pages (+= pages_highmem)			1194320		4665
pages_highmem (=preallocate_image_highmem(alloc))0		0


Second attempt, immediately after first: success

save_highmem (= count_highmem_pages())		0		0
saveable (= count_data_pages())			1452938		5676
highmem						3061416		11959
count (+= highmem;)				3061416		11959
count (-= totalreserve_pages;)			3023239		11810
size (nr of pages required for page keys)	216		1
max_size (max nr of sav. pages to leave in mem)	1510487		5900
size (= DIV_ROUND_UP(image_size, PAGE_SIZE))	1346981		5262
pages (= minimum_image_size(saveable)		0		0
avail_normal					3061416		11959
avail_normal					3061416		11959
size						1346981		5262
shrink_all_memory(saveable – size)		46846		183
done (allocated...pages)			1676258		6548
alloc (-= pages)				0
pages (+= pages_highmem)			0
pages_highmem (=preallocate_image_highmem(alloc))0
Comment 27 Alan 2014-09-26 09:41:32 UTC
Point 4 is because shrink_all_memory() pushed some of it to disk.

Why shrink_all_memory() isn't reclaiming enough in your case I don't know. It might be interesting to call it, sleep a second, and call it again if you didn't get enough (or similar), just to see what is going on.
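Roughly something like this - an untested sketch in place of the single shrink_all_memory(saveable - size) call in hibernate_preallocate_memory(); the variable names to_reclaim/reclaimed are only illustrative, and ssleep() comes from <linux/delay.h>:

	unsigned long to_reclaim = saveable - size;
	unsigned long reclaimed;

	/* First reclaim pass. */
	reclaimed = shrink_all_memory(to_reclaim);

	/* If that didn't free enough, give writeback a second to make
	   progress and try once more - just to see what is going on. */
	if (reclaimed < to_reclaim) {
		ssleep(1);
		reclaimed += shrink_all_memory(to_reclaim - reclaimed);
	}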

For all these things - patches welcomed.
Comment 28 Rainer Fiebig 2014-09-26 15:51:49 UTC
(In reply to Alan from comment #27)
> Point 4 is because shrink_all_memory() pushed some of it to disk.
> 

OK - but wouldn't this imply "s2disk > fail > shrink_all_memory()" instead of "s2disk > shrink_all_memory() > succeed"?


For example:

Before failed first attempt
>free
             total       used       free     shared    buffers     cached
Mem:      11997572    8343040    3654532          0    1623016    2030956
-/+ buffers/cache:    4689068    7308504
Swap:     10714108          0   10714108


After failed first attempt
>free
             total       used       free     shared    buffers     cached
Mem:      11997572    5222784    6774788          0      24428     844300
-/+ buffers/cache:    4354056    7643516
Swap:     10714108     419896   10294212


The following attempt succeeds.

The magnitude of reduction of buffers/cache after a failed attempt is more or less the same as what is achieved by "drop_caches=1" applied before s2disk. 


> Why shrink_all_memory() isn't reclaiming enough in your case I don't know.
> It might be interesting to call it, sleep a second, and call it again if you
> didn't get enough (or similar), just to see what is going on.

I'll try to find out. If I come to new insights, I'll post them here.

> 
> For all these things - patches welcomed.

Unfortunately, I have only basic programming skills. A patch for "Hello World!" would be doable, but nothing much more sophisticated.
Comment 29 Rainer Fiebig 2014-09-28 22:17:24 UTC
Hi!

I think I've solved this one. Plus Bug_83821. 

It's actually quite simple but it nevertheless killed my weekend.

But first: shrink_all_memory() does work effectively. Right now, however, it is placed too far down in the code (Kernel 3.16.2/snapshot.c/int hibernate_preallocate_memory(void)). 

Therefore it only SEEMS that it doesn't work. At its current position it comes too late. And this is why s2disk sometimes fails at the first attempt even in seemingly moderate memory-usage-cases.
So scrap my suggestion no. 1) in comment 26.

Now for the solution.

Two statements need to be added immediately before line 1366:
	
	shrink_all_memory(count - max_size);
	saveable = count_data_pages();
--

shrink_all_memory(saveable - size) at line 1397 should be deleted.

That's all.
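In context, the relevant part of hibernate_preallocate_memory() (3.16.2) would then look roughly like this - only a sketch, the attached patch is what counts:

	/* Added: reclaim early and recount, so that "saveable" reflects
	   what is actually left after the caches have been shrunk. */
	shrink_all_memory(count - max_size);
	saveable = count_data_pages();

	/* Unchanged: the early-exit check at line 1366 now sees the
	   much smaller "saveable". */
	if (size >= saveable) {
		...
		goto out;
	}

	/* Removed: the shrink_all_memory(saveable - size) call at
	   line 1397 further down is dropped. */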


Explanation:

1) shrink_all_memory(count - max_size) frees a lot of memory. Comparable to drop_caches=1 issued before calling s2disk. 

By this we can handle memory usage close to s2disk's current maximum (50%) - at the first attempt.
As the image is relatively small, s2disk is perhaps even a bit faster than before.
Resumes are also fast and the system is responsive from the start - depending of course on the memory load, but better than with drop_caches=1.
In a high-memory-load case the monitor may briefly show the desktop colour before showing the login box.

2) After freeing memory the data pages need to be counted again and the result is assigned to "saveable" which is much smaller now than before.

Therefore chances are good that the following if-condition is true - even in case of higher memory-load:

L 1366	/*
	 * If the desired number of image pages is at least as large as the
	 * current number of saveable pages in memory, allocate page frames for
	 * the image and we're done.
	 */
if (size >= saveable) {
		pages = preallocate_image_highmem(save_highmem);
		pages += preallocate_image_memory(saveable - pages, avail_normal);
		goto out;
	}

3) If (size >= saveable) is false, the code following the if-clause can then try to reduce the image size even more.
	
4) In its new position shrink_all_memory(count - max_size) is executed on EVERY s2disk - even if only little memory is used.

I found out that this can prevent Bug 83821, which always seems to strike when an s2disk is executed after a previous s2disk and the memory content has not changed (much) in between - irrespective of memory load.

It is more severe than I first thought, because it also happens after a large amount of memory has been freed - for example by closing VMs - and an s2disk is performed immediately after that. The system doesn't crash, but the resume takes very long and the system needs quite some time to become responsive again.

shrink_all_memory(count - max_size) exerts enough pressure (similar to drop_caches=1) to reduce (and/or change) memory-content and buffers/cache on each s2disk.

   
Of course I'm going to test this further in the following days. But so far I have not had a single miss or problem.
It looks and feels promising.
 

Jay

--------------------------------------------

Some test-results.
openSuse 12.3/64bit/Kernel 3.16.2


High-load-case

Before suspend  (s2both): 

> free
             total       used       free     shared    buffers     cached
Mem:      11997572    8165828    3831744          0    1662480    1930716
-/+ buffers/cache:    4572632    7424940
Swap:     10714108          0   10714108

Suspend: OK.
Resume: OK.

After resume:

> free
             total       used       free     shared    buffers     cached
Mem:      11997572    6530164    5467408          0    1432996     545424
-/+ buffers/cache:    4551744    7445828
Swap:     10714108      74952   10639156


dmesg:

[  763.216271] Syncing filesystems ... done.
[  763.237421] Freezing user space processes ... (elapsed 0.001 seconds) done.
[  763.240200] PM: Preallocating image memory... 
[  763.240325] JRF: save_highmem (= count_highmem_pages()) 0
[  763.387440] JRF: saveable (= count_data_pages()) 2124174
[  763.387444] JRF: highmem 3061194
[  763.387444] JRF: count (+= highmem;) 3061194
[  763.387445] JRF: count (-= totalreserve_pages;) 3022991
[  763.387446] JRF: size (+= page_key_additional_pages(saveable)) 216
[  763.387447] JRF: max_size (maximum nr of saveable pages to leave in memory) 1510363
[  763.387448] JRF: totalram_pages 2999393
[  763.387448] JRF: PAGE_SIZE 4096
[  763.387449] JRF: image_size 5528481177
[  763.387450] JRF: image_size 5528481177
[  763.387451] JRF: size (= DIV_ROUND_UP(image_size, PAGE_SIZE) 1349727
[  763.874157] JRF: Pages reclaimed by shrink_all_memory(count - max_size); 657187 
[  764.037874] JRF: saveable after shrink_all_memory() 1454822
[  764.037877] JRF: pages (= minimum_image_size(saveable);) 0
[  764.037878] JRF: avail_normal 3061194
[  764.037879] JRF: avail_normal 3061194
[  764.473625] JRF: Before out.
[  764.473629] done (allocated 1673264 pages)
[  764.473632] PM: Allocated 6693056 kbytes in 1.23 seconds (5441.50 MB/s)
[  764.473633] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[  764.475027] Suspending console(s) (use no_console_suspend to debug)
[  764.475486] serial 00:06: disabled
[  764.475535] i8042 kbd 00:02: System wakeup enabled by ACPI
[  764.877372] PM: freeze of devices complete after 402.441 msecs
[  764.877640] PM: late freeze of devices complete after 0.264 msecs
[  764.878114] PM: noirq freeze of devices complete after 0.472 msecs
[  764.878336] ACPI: Preparing to enter system sleep state S4
[  764.878561] PM: Saving platform NVS memory
[  764.878748] Disabling non-boot CPUs ...
[  764.880015] kvm: disabling virtualization on CPU1
[  764.880023] smpboot: CPU 1 is now offline
[  764.881588] kvm: disabling virtualization on CPU2
[  764.881744] smpboot: CPU 2 is now offline
[  764.883234] kvm: disabling virtualization on CPU3
[  764.883243] smpboot: CPU 3 is now offline
[  764.883686] PM: Creating hibernation image:
[  764.998576] PM: Need to copy 1342658 pages
[  766.568177] PM: Hibernation image created (1342658 pages copied)s) done.

---------------------


5 suspends in a row (for Bug_83821).

1. Suspend: 
> free
             total       used       free     shared    buffers     cached
Mem:      11997572    1567268   10430304          0     103448     637328
-/+ buffers/cache:     826492   11171080
Swap:     10714108          0   10714108

OK.

> free
             total       used       free     shared    buffers     cached
Mem:      11997572    1864056   10133516          0     511660     507828
-/+ buffers/cache:     844568   11153004
Swap:     10714108      19336   10694772


2. Suspend: OK.

> free
             total       used       free     shared    buffers     cached
Mem:      11997572    1601452   10396120          0     401636     574508
-/+ buffers/cache:     625308   11372264
Swap:     10714108     222360   10491748


3. Suspend: OK.

> free
             total       used       free     shared    buffers     cached
Mem:      11997572    1296392   10701180          0     382436     396272
-/+ buffers/cache:     517684   11479888
Swap:     10714108     334848   10379260


4. Suspend: OK.

> free
             total       used       free     shared    buffers     cached
Mem:      11997572    1202920   10794652          0     395904     394068
-/+ buffers/cache:     412948   11584624
Swap:     10714108     459516   10254592

5. Suspend: OK.

> free
             total       used       free     shared    buffers     cached
Mem:      11997572    1003536   10994036          0     313016     310272
-/+ buffers/cache:     380248   11617324
Swap:     10714108     506564   10207544
Comment 30 Rainer Fiebig 2014-09-30 21:08:14 UTC
Changed "shrink_all_memory(count - max_size)" to "shrink_all_memory(saveable)".

The latter adapts to varying memory load while the former is a rather large fixed value.

Tests so far are good: s2disk now reliably (!) succeeds at the first attempt as long as memory load is not too close to 50% of RAM (see failed tests below). 
Bug 83821 is also gone. s2disk runs are usually quite fast, as are resumes. When memory load was high, the system is initially a little less responsive.

More testing in other settings necessary.

My proposal for a patch for snapshot.c (kernel 3.16.2) is attached. Reviews/feedback/improvements welcome. 

But if nobody cares that's also fine with me. Because I already have a better s2disk.

I think I've done enough here so far. 

Jay


--------------

Some more test-results:


Medium-load

Before suspend (s2both):

> free
             total       used       free     shared    buffers     cached
Mem:      11997572    6177576    5819996          0    1329736    1723856
-/+ buffers/cache:    3123984    8873588
Swap:     10714108     460184   10253924

Suspend: OK.
Resume: OK.

> free
             total       used       free     shared    buffers     cached
Mem:      11997572    5700248    6297324          0    1715288     872832
-/+ buffers/cache:    3112128    8885444
Swap:     10714108     504412   10209696

dmesg:

[ 1214.109862] Syncing filesystems ... done.
[ 1214.122223] Freezing user space processes ... (elapsed 0.001 seconds) done.
[ 1214.124957] PM: Preallocating image memory... 
[ 1214.277145] Info: saveable (= count_data_pages()) 1620532
[ 1214.940754] Info: saveable after shrink_all_memory(saveable) 1076637
[ 1215.094878] done (allocated 1076637 pages)
[ 1215.094882] PM: Allocated 4306548 kbytes in 0.97 seconds (4439.74 MB/s)
[ 1215.094883] Freezing remaining freezable tasks ... (elapsed 0.000 seconds) done
...
[ 1215.504562] PM: Creating hibernation image:
[ 1215.630606] PM: Need to copy 1074246 pages
-------------


High-load


Before suspend:

> free
             total       used       free     shared    buffers     cached
Mem:      11997572    9122040    2875532          0    1663680    2091736
-/+ buffers/cache:    5366624    6630948
Swap:     10714108          0   10714108

Suspend: OK. Power-off computer.
Power-on/resume: OK - but it took a while. The desktop colour was briefly displayed, and the login box was not immediately responsive.

> free
             total       used       free     shared    buffers     cached
Mem:      11997572    5476912    6520660          0      26108     703612
-/+ buffers/cache:    4747192    7250380
Swap:     10714108     744844    9969264

dmesg:

.
[ 1859.237925] PM: Preallocating image memory... 
[ 1859.378650] Info: saveable (= count_data_pages()) 2478720
[ 1860.094750] Info: saveable after shrink_all_memory(saveable) 1683136
[ 1867.169507] done (allocated 1673355 pages)
[ 1867.169512] PM: Allocated 6693420 kbytes in 7.93 seconds (844.06 MB/s)
[ 1867.169513] Freezing remaining freezable tasks ... (elapsed 0.612 seconds) done.
...
[ 1868.190797] PM: Creating hibernation image:
[ 1868.304322] PM: Need to copy 1342411 pages

-------------


Too-high-load 1


Suspend: Failure. But system returns to desktop.

Before suspend:

> free
             total       used       free     shared    buffers     cached
Mem:      11997572    9860764    2136808          0    2206728    1090152
-/+ buffers/cache:    6563884    5433688
Swap:     10714108     473700   10240408

------------


Too-high-load 2


Suspend: Failure. 

s2disk reached the following point but then hung: 

"Free swap: 9684708
Saving 1 526 997 image data pages (press backspace to abort)...0%"

Reset necessary.


Before suspend: 

> free
             total       used       free     shared    buffers     cached
Mem:      11997572    6490540    5507032          0      32600     577632
-/+ buffers/cache:    5880308    6117264
Swap:     10714108     739580    9974528

-----------
Comment 31 Rainer Fiebig 2014-09-30 21:10:05 UTC
Created attachment 152151 [details]
Proposal for a s2disk-snapshot.c-patch
Comment 32 Aaron Lu 2014-10-11 08:43:51 UTC
I didn't look at the patch, but for the patch to be considered, you will have to submit it to the mailing list.

Please send email to: "Rafael J. Wysocki" <rjw@rjwysocki.net> and cc: linux-pm@vger.kernel.org when submitting the patch.
Comment 33 Zhang Rui 2014-10-24 06:45:41 UTC
*** Bug 83821 has been marked as a duplicate of this bug. ***
Comment 34 Rainer Fiebig 2014-10-24 10:29:32 UTC
(In reply to Zhang Rui from comment #33)
> *** Bug 83821 has been marked as a duplicate of this bug. ***


Interesting. I'm really looking forward now to your solution - and where it deviates from mine.

Since posting it here, I've improved my patch a bit - resumes are faster and smoother now when memory-usage was high.

All in all: 
Lines of code modified: One.
Lines of code added: Zero.
Reliability on my 64-bit-system: 100 %.

In hindsight, it was easy. Took me a weekend. Should not have taken more than 2 years.

So long!

Jay


p.s. As was suggested I sent an email to rjw@rjwysocki.net - no reply. And no surprise.
Comment 35 Zhang Rui 2014-10-27 01:35:41 UTC
(In reply to Jay from comment #34)
> (In reply to Zhang Rui from comment #33)
> > *** Bug 83821 has been marked as a duplicate of this bug. ***
> 
> 
> Interesting. I'm really looking forward now to your solution - and where it
> deviates from mine.
> 
No, I don't have a solution. And I'm glad to see you have one. Thanks for your great work, Jay.

> Since posting it here, I've improved my patch a bit - resumes are faster and
> smoother now when memory-usage was high.
> 
yeah, Aaron, can you please take a look at Jay's patch and raise the issue/patch with Rafael?
Comment 36 Aaron Lu 2014-10-27 02:07:12 UTC
(In reply to Jay from comment #34)
> p.s. As was suggested I sent an email to rjw@rjwysocki.net - no reply. And
> no surprise.

Please also cc the mailing list linux-pm@vger.kernel.org if you haven't already; other people can then also give you suggestions/corrections/comments etc. for your patch.
Comment 37 Rainer Fiebig 2014-10-27 22:53:34 UTC
Thanks to both of you for the encouragement and advice!

I have used a slightly modified/simplified version of the first patch with kernel 3.16.2 now for a while and it is very reliable. I've also just tested it with kernel 3.14.22 and it's OK there, too.

So I'll dare to go into the lion's den and cc it to linux-pm@vger.kernel.org. I hope they won't eat me there.

But I also post it here (see attachment) - so it won't get lost in case I change my mind.

As you can see, it's really straightforward. 

So long!

Jay
Comment 38 Rainer Fiebig 2014-10-27 22:54:39 UTC
Created attachment 155521 [details]
Proposal for a modified/simplified patch for snapshot.c
Comment 39 Rainer Fiebig 2014-11-27 17:50:30 UTC
To whom it may concern.

Having a bug/blemish of this significance and not doing anything about it for more than 2 years is bad enough:
It's a disservice to "Linux on the desktop".

But having been handed a simple and easily testable solution (for two bugs) on a silver platter and completely ignoring it is even worse:
It's outright unfair to all those who would like to use s2disk but have given up on it for obvious reasons.

However - if you think this solution is total crap or even harmful: please tell me why. I've spent considerable time finding it, and five minutes of yours is surely not too much to ask.

And if you have something better: don't count on me for testing anymore - but it's most welcome. And please implement it asap. Because in its current state s2disk is already 14 (fourteen) years behind Windows 2000.

For me this was a totally strange and baffling experience. Don't need this again.

In this spirit: Good night and good luck!

Jay
Comment 40 Aaron Lu 2014-11-28 01:49:52 UTC
Hi Jay,

I'll take a look at your patch and let you know what I think, stay tuned, thanks.
Comment 41 Aaron Lu 2014-12-01 08:42:30 UTC
I'm trying to reproduce the issue here: I started 3 VMs and hibernation still succeeded. I wonder if this problem only occurs on 32-bit kernels?
Comment 42 Aaron Lu 2014-12-01 08:46:47 UTC
Created attachment 159311 [details]
Add debug prints to the snapshot code

BTW, can you please try the attached patch to give me a log of a failure hibernate? Thanks.
Comment 43 Toralf Förster 2014-12-01 17:04:29 UTC
(In reply to Aaron Lu from comment #41)
> I wonder if this problem only occurs on 32 bits kernel?
I'm pretty convinced, because using this snippet in my /etc/acpi/default.sh script works well:

                #       s2disk is broken for big images 
                #
                CUR=$(cat /sys/power/image_size)
                MAX=$(echo "1024^3 * 2 - 4 * 1024^2" | bc)
                if [[ $CUR -gt $MAX ]]; then
                        echo " adapting /sys/power/image_size from $CUR to $MAX ..."
                        echo $MAX > /sys/power/image_size
                fi

whereas values above 2^31 won't work.
Comment 44 Toralf Förster 2014-12-01 17:32:05 UTC
(In reply to Aaron Lu from comment #42)
> Created attachment 159311 [details]
> Add debug prints to the snapshot code
> 
> BTW, can you please try the attached patch to give me a log of a failure
> hibernate? Thanks.

it fails against 3.17.4 :

$ sudo patch -p1 < /home/tfoerste/devel/hibernate-debug.patch 
patching file kernel/power/snapshot.c
Hunk #1 succeeded at 1525 (offset -25 lines).
Hunk #2 FAILED at 1582.
Hunk #3 succeeded at 1577 (offset -25 lines).
Hunk #4 succeeded at 1598 (offset -25 lines).
Hunk #5 succeeded at 1613 (offset -25 lines).
Hunk #6 succeeded at 1630 (offset -25 lines).
Hunk #7 succeeded at 1642 (offset -25 lines).
Hunk #8 succeeded at 1650 (offset -25 lines).
Hunk #9 succeeded at 1662 (offset -25 lines).
Comment 45 Rainer Fiebig 2014-12-01 17:53:03 UTC
(In reply to Aaron Lu from comment #41)
> I'm trying to reproduce the issue here, I started 3 VMs and the hibernate
> still succeeded. I wonder if this problem only occurs on 32 bits kernel?


I run only 64-bit kernels. The poster of comment 18 also seemed to have a 64-bit system.
The original submitter had a 32-bit kernel, so 32- and 64-bit kernels are affected.

What counts here is not how many programs are running but how much memory is used.
For example - in this case s2disk fails with the original snapshot.c but works with my patch:

> free
             total       used       free     shared    buffers     cached
Mem:      11997892    8711912    3285980     615184    2099844    1666364
-/+ buffers/cache:    4945704    7052188
Swap:     10713084     406280   10306804

Try to generate a similar situation. If hibernate then still works on your system AND you cannot reproduce Bug 83821 (1), the original submitter, the poster of comment 18 and I may have a memory hardware bug in common.

However, I think a software issue is more likely.

For bug 47931 I've tracked it down to this:

If in the original snapshot.c the following condition is true (low memory usage), s2disk succeeds most of the time but not always (Bug 83821):


if (size >= saveable) {
		pages = preallocate_image_highmem(save_highmem);
		pages += preallocate_image_memory(saveable - pages, avail_normal);
		goto out;
	}


If false, the next statement is:


	/* Estimate the minimum size of the image. */
	pages = minimum_image_size(saveable);


And if minimum_image_size(saveable) does NOT return 0, things go wrong.

For convenience, here's this function:


static unsigned long minimum_image_size(unsigned long saveable)
{
	unsigned long size;

	size = global_page_state(NR_SLAB_RECLAIMABLE)
		+ global_page_state(NR_ACTIVE_ANON)
		+ global_page_state(NR_INACTIVE_ANON)
		+ global_page_state(NR_ACTIVE_FILE)
		+ global_page_state(NR_INACTIVE_FILE)
		- global_page_state(NR_FILE_MAPPED);

	return saveable <= size ? 0 : saveable - size;
}


My patch works because this function is now called with a much smaller value for "saveable". It returns 0 and all is good.

I don't know for certain why shrink_all_memory() also cures Bug 83821. 

I do know, however, that if it is bypassed - which is the case in low-memory-usage cases in the original snapshot.c - Bug 83821 is reproducibly there.




(1): Start your system. Immediately after login, hibernate. Resume. Do it one more time. The third or fourth attempt should fail.
Alternatively: in a situation like the one shown above, close some memory-intensive programs so that "free" is clearly > "used". Now hibernate should also fail. On my system both cases are 100% reproducible.
Comment 46 Aaron Lu 2014-12-02 03:24:14 UTC
Created attachment 159391 [details]
Add debug prints to the snapshot code

New debug print patch against v3.17.4.

I should have posted the memory status before hibernating:

             total       used       free     shared    buffers     cached
Mem:       3890192    3755444     134748      46804      29316     613540
-/+ buffers/cache:    3112588     777604
Swap:      4029436        140    4029296

I suppose that is already high enough?

And the message I got during preallocate memory is:

[  107.028127] PM: Basic memory bitmaps created
[  107.028456] PM: Preallocating image memory... 
[  107.072396] save_highmem=0 pages, 0MB
[  107.072402] saveable=989494 pages, 3865MB
[  107.072404] highmem=0 pages, 0MB
[  107.072406] additional_pages=88 pages, 0MB
[  107.072408] avail_normal=1018081 pages, 3976MB
[  107.072410] count=988350 pages, 3860MB
[  107.072412] max_size=493107 pages, 1926MB
[  107.072414] user_specified_image_size=388528 pages, 1517MB
[  107.072416] adjusted_image_size=388528 pages, 1517MB
[  107.072418] minimum_pages=112902 pages, 441MB
[  107.072420] target_image_size=388528 pages, 1517MB
[  109.185188] nr_should_reclaim=600966 pages, 2347MB
[  109.185194] nr_reclaimed=147357 pages, 575MB
[  109.185196] preallocated_high_mem=0 pages, 0MB
[  109.185198] to_alloc=495243 pages, 1934MB
[  109.185200] to_alloc_adjusted=495243 pages, 1934MB
[  112.668907] pages_allocated=495243 pages, 1934MB
[  114.523842] done (allocated 599822 pages)
[  114.523849] PM: Allocated 2399288 kbytes in 7.49 seconds (320.33 MB/s)

From the above log, the shrink_all_memory doesn't do what it's asked: 575MB reclaimed versus 2347MB requested, but the following allocation still succeeded.
Comment 47 Aaron Lu 2014-12-02 04:30:56 UTC
Created attachment 159401 [details]
dmesg after 5 times hibernation in a row and one more after started vm and closed vm

>(1): Start your system. Immediately after login hibernate. Resume.
> Do it one more time. The third or fourth attempt should fail.

I did this 5 times; no problem occurred.

> Alternatively: In a situation like shown above, close some memory-intensive 
> programs so that "free" is clearly > "used". Now hibernate should also fail. 
> In my system both cases are 100% reproducible.
$ free
             total       used       free     shared    buffers     cached
Mem:       3890192    3764096     126096      83676       3336     281184
-/+ buffers/cache:    3479576     410616
Swap:      4029436      85636    3943800

The above situation is after I started some VMs.

$ free
             total       used       free     shared    buffers     cached
Mem:       3890192     763132    3127060      64608       2596     364008
-/+ buffers/cache:     396528    3493664
Swap:      4029436     132624    3896812

The above is after I closed those VMs, and hibernate still succeeded.

I think it may be related to the hardware config; perhaps 4 GB of memory isn't enough to reproduce this problem. Please apply the debug print patch and let's see where the problem is, thanks.
Comment 48 Aaron Lu 2014-12-02 04:33:58 UTC
For Toralf and Eric, the problem occurs during the copy phase, not during preallocation. Jay, is this also the case for you? Perhaps we are dealing with two different problems here?
Comment 49 Rainer Fiebig 2014-12-02 09:55:41 UTC
Aaron, I use s2both. Maybe this is at the heart of the matter.

I mentioned that in earlier comments but should have stated it more clearly.
Comment 50 Rainer Fiebig 2014-12-02 17:47:52 UTC
(In reply to Aaron Lu from comment #46)
> Created attachment 159391 [details]
> Add debug prints to the snapshot code
> 
> New debug print patch against v3.17.4.
> 
> I should have posted the memory status, before hibernate:
> 
>              total       used       free     shared    buffers     cached
> Mem:       3890192    3755444     134748      46804      29316     613540
> -/+ buffers/cache:    3112588     777604
> Swap:      4029436        140    4029296
> 
> I suppose that is already high enough?

snip

Almost too high.

How can s2disk succeed in that situation? If "free" after "buffers/cache" is only 20% of total memory? 

I'm shocked. Because in my system, s2disk - pure or in s2both - never succeeds if "free" after "buffers/cache" is not at least 50% of total memory.
Have I missed/messed up some configuration stuff?
Please let me know.


OK, my new findings:

Bug 83821 occurs only in connection with s2both (suspend_hybrid). It's not there with pure s2disk.

Bug 47931 is still there - even with pure s2disk. See 2 failures and 1 success below.



Kernel 3.17.4, original snapshot.c, pure s2disk.


1) Failure.

> free
             total       used       free     shared    buffers     cached
Mem:      11997780    8667616    3330164     487716    1732632    2366068
-/+ buffers/cache:    4568916    7428864
Swap:     10713084          0   10713084


PM: Preallocating image memory... 
save_highmem=0 pages, 0MB
saveable=2262873 pages, 8839MB
highmem=0 pages, 0MB
additional_pages=220 pages, 0MB
avail_normal=3061370 pages, 11958MB
count=3023167 pages, 11809MB
max_size=1510449 pages, 5900MB
user_specified_image_size=1349750 pages, 5272MB
adjusted_image_size=1349751 pages, 5272MB
minimum_pages=1894339 pages, 7399MB
target_image_size=1510449 pages, 5900MB
nr_should_reclaim=752424 pages, 2939MB
nr_reclaimed=662852 pages, 2589MB
preallocated_high_mem=0 pages, 0MB
to_alloc=1512718 pages, 5909MB
to_alloc_adjusted=1512718 pages, 5909MB
pages_allocated=1167031 pages, 4558MB
to_alloc_from_highmem=345687 pages, 1350MB
pages_allocated_from_highmem=0 pages, 0MB

Restarting tasks ... done.


2) Failure.

> free
             total       used       free     shared    buffers     cached
Mem:      11997780    7792240    4205540     331444    1167268    1980436
-/+ buffers/cache:    4644536    7353244
Swap:     10713084     130016   10583068


PM: Preallocating image memory... 
save_highmem=0 pages, 0MB
saveable=2034193 pages, 7946MB
highmem=0 pages, 0MB
additional_pages=220 pages, 0MB
avail_normal=3061351 pages, 11958MB
count=3023148 pages, 11809MB
max_size=1510440 pages, 5900MB
user_specified_image_size=1349750 pages, 5272MB
adjusted_image_size=1349751 pages, 5272MB
minimum_pages=1883084 pages, 7355MB
target_image_size=1510440 pages, 5900MB
nr_should_reclaim=523753 pages, 2045MB
nr_reclaimed=477845 pages, 1866MB
preallocated_high_mem=0 pages, 0MB
to_alloc=1512708 pages, 5909MB
to_alloc_adjusted=1512708 pages, 5909MB
pages_allocated=1178267 pages, 4602MB
to_alloc_from_highmem=334441 pages, 1306MB
pages_allocated_from_highmem=0 pages, 0MB

Restarting tasks ... done.


3) Success. After closing a VM.

> free
             total       used       free     shared    buffers     cached
Mem:      11997780    3558144    8439636     289864     400040     924720
-/+ buffers/cache:    2233384    9764396
Swap:     10713084     261908   10451176


PM: Preallocating image memory... 
save_highmem=0 pages, 0MB
saveable=978838 pages, 3823MB
highmem=0 pages, 0MB
additional_pages=220 pages, 0MB
avail_normal=3061374 pages, 11958MB
count=3023171 pages, 11809MB
max_size=1510451 pages, 5900MB
user_specified_image_size=1349750 pages, 5272MB
adjusted_image_size=1349751 pages, 5272MB
done (allocated 978838 pages)
PM: Allocated 3915352 kbytes in 0.31 seconds (12630.16 MB/s)
Comment 51 Aaron Lu 2014-12-03 07:28:25 UTC
The minimum_pages value seems to be quite different. For my case, though the free memory is pretty small, the minimum_pages value is not big:
[  107.072418] minimum_pages=112902 pages, 441MB

And your case shows that:
minimum_pages=1894339 pages, 7399MB
Which is pretty big, bigger than max_size 5900MB.

I wonder what VM app are you using?
Comment 52 Rainer Fiebig 2014-12-03 11:15:25 UTC
(In reply to Aaron Lu from comment #51)
> The minimum_pages value seems to be quite different. For my case, though the
> free memory is pretty small, the minimum_pages value is not big:
> [  107.072418] minimum_pages=112902 pages, 441MB
> 
> And your case shows that:
> minimum_pages=1894339 pages, 7399MB
> Which is pretty big, bigger than max_size 5900MB.
> 
> I wonder what VM app are you using?

VirtualBox (4.3.18).
Comment 53 Rainer Fiebig 2014-12-03 21:02:17 UTC
(In reply to Aaron Lu from comment #51)
> The minimum_pages value seems to be quite different. For my case, though the
> free memory is pretty small, the minimum_pages value is not big:
> [  107.072418] minimum_pages=112902 pages, 441MB
> 
> And your case shows that:
> minimum_pages=1894339 pages, 7399MB
> Which is pretty big, bigger than max_size 5900MB.
> 
> I wonder what VM app are you using?


As minimum_image_size(saveable) returns such a high number, this function perhaps does not adequately account for situations where buffers/cache are (very) high.

Would it therefore perhaps make sense to NOT subtract global_page_state(NR_FILE_MAPPED) from the rest (which amounts roughly to buffers/cache)?

I couldn't find NR_FILE_MAPPED, only nr_mapped. If these are identical, not subtracting it would result in a significantly smaller return-value.  

I had a situation here where pages=minimum_image_size(saveable) would then have been around 1,380,000 pages instead of 2,100,000.

This would have been below max_size of 1,510,449 pages.
Comment 54 Aaron Lu 2014-12-04 09:10:11 UTC
(In reply to Jay from comment #53)
> (In reply to Aaron Lu from comment #51)
> > The minimum_pages value seems to be quite different. For my case, though
> the
> > free memory is pretty small, the minimum_pages value is not big:
> > [  107.072418] minimum_pages=112902 pages, 441MB
> > 
> > And your case shows that:
> > minimum_pages=1894339 pages, 7399MB
> > Which is pretty big, bigger than max_size 5900MB.
> > 
> > I wonder what VM app are you using?
> 
> 
> As minimum_image_size(saveable) returns such a high number, this function
> does perhaps not factor in well enough situations where buffers/cache are
> (very) high.
> 
> Would it therefore perhaps make sense to NOT subtract
> global_page_state(NR_FILE_MAPPED) from the rest (which amounts roughly to
> buffers/cache)?

Sorry, I don't know, not familiar with this field.

> 
> I couldn't find NR_FILE_MAPPED, only nr_mapped. If these are identical, not
> subtracting it would result in a significantly smaller return-value.  
> 
> I had a situation here where pages=minimum_image_size(saveable) would then
> have been around 1,380,000 pages instead of 2,100,000.
> 
> This would have been below max_size of 1,510,449 pages.

Does it work then?

I also tried to skip the shrink_all_memory call here since it doesn't do much, and memory preallocation still succeeded. It seems that on my side the memory management subsystem can get enough pages even when free memory is pretty small. I'm using virt-manager, which is a wrapper on top of qemu/kvm. It seems the software here plays a big role.
Comment 55 Rainer Fiebig 2014-12-04 17:18:28 UTC
(In reply to Aaron Lu from comment #54)
> (In reply to Jay from comment #53)
> > (In reply to Aaron Lu from comment #51)
> > > The minimum_pages value seems to be quite different. For my case, though
> the
> > > free memory is pretty small, the minimum_pages value is not big:
> > > [  107.072418] minimum_pages=112902 pages, 441MB
> > > 
> > > And your case shows that:
> > > minimum_pages=1894339 pages, 7399MB
> > > Which is pretty big, bigger than max_size 5900MB.
> > > 
> > > I wonder what VM app are you using?
> > 
> > 
> > As minimum_image_size(saveable) returns such a high number, this function
> > does perhaps not factor in well enough situations where buffers/cache are
> > (very) high.
> > 
> > Would it therefore perhaps make sense to NOT subtract
> > global_page_state(NR_FILE_MAPPED) from the rest (which amounts roughly to
> > buffers/cache)?
> 
> Sorry, I don't know, not familiar with this field.
> 
> > 
> > I couldn't find NR_FILE_MAPPED, only nr_mapped. If these are identical, not
> > subtracting it would result in a significantly smaller return-value.  
> > 
> > I had a situation here where pages=minimum_image_size(saveable) would then
> > have been around 1,380,000 pages instead of 2,100,000.
> > 
> > This would have been below max_size of 1,510,449 pages.
> 
> Does it work then?


It does indeed. At the first attempt.

Case 1 was a situation similar to the one before, where it had failed. Look at the difference in "minimum_pages"!
Resumes were good.

It also works with s2both. The only negative so far: I encountered Bug 83821 after having closed 2 VMs. 
However, I think I now know how to avoid that: by deactivating "threads = y" in /etc/suspend.conf. 
In 2 comparable cases after that, s2both succeeded. While writing the image to disk it paused for a moment at around 60%/85%, but instead of aborting it then continued to 100%. I could not notice a difference in speed.

If the solution for Bug 83821 proves true, this new approach may be a nice way to solve Bug 47931.

So - the idea seems right. But we need to clearly understand what NR_FILE_MAPPED is and whether it is really expendable here.
Who could help us with that?


1) s2disk

> free
             total       used       free     shared    buffers     cached
Mem:      11997780    8561508    3436272     450580    1665708    2296112
-/+ buffers/cache:    4599688    7398092
Swap:     10713084          0   10713084

PM: Preallocating image memory... 
save_highmem=0 pages, 0MB
saveable=2233721 pages, 8725MB
highmem=0 pages, 0MB
additional_pages=220 pages, 0MB
avail_normal=3061341 pages, 11958MB
count=3023138 pages, 11809MB
max_size=1510435 pages, 5900MB
user_specified_image_size=1349750 pages, 5272MB
adjusted_image_size=1349751 pages, 5272MB
minimum_pages=919339 pages, 3591MB
target_image_size=1349751 pages, 5272MB
nr_should_reclaim=883970 pages, 3453MB
nr_reclaimed=629022 pages, 2457MB
preallocated_high_mem=0 pages, 0MB
to_alloc=1512703 pages, 5908MB
to_alloc_adjusted=1512703 pages, 5908MB
pages_allocated=1512703 pages, 5908MB
done (allocated 1673387 pages)
PM: Allocated 6693548 kbytes in 1.40 seconds (4781.10 MB/s)


2) s2both

> free
             total       used       free     shared    buffers     cached
Mem:      11997780    9278744    2719036     280116    3529880    1857776
-/+ buffers/cache:    3891088    8106692
Swap:     10713084    1176508    9536576

PM: Preallocating image memory... 
save_highmem=0 pages, 0MB
saveable=2408075 pages, 9406MB
highmem=0 pages, 0MB
additional_pages=220 pages, 0MB
avail_normal=3061410 pages, 11958MB
count=3023207 pages, 11809MB
max_size=1510469 pages, 5900MB
user_specified_image_size=1349750 pages, 5272MB
adjusted_image_size=1349751 pages, 5272MB
minimum_pages=924554 pages, 3611MB
target_image_size=1349751 pages, 5272MB
nr_should_reclaim=1058324 pages, 4134MB
nr_reclaimed=852807 pages, 3331MB
preallocated_high_mem=0 pages, 0MB
to_alloc=1512738 pages, 5909MB
to_alloc_adjusted=1512738 pages, 5909MB
pages_allocated=1512738 pages, 5909MB
done (allocated 1673456 pages)
PM: Allocated 6693824 kbytes in 1.40 seconds (4781.30 MB/s)
...
PM: Need to copy 1332075 pages
PM: Hibernation image created (1332075 pages copied)

--
Modified:
static unsigned long minimum_image_size(unsigned long saveable)
{
	unsigned long size;

	size = global_page_state(NR_SLAB_RECLAIMABLE)
		+ global_page_state(NR_ACTIVE_ANON)
		+ global_page_state(NR_INACTIVE_ANON)
		+ global_page_state(NR_ACTIVE_FILE)
		+ global_page_state(NR_INACTIVE_FILE);
		/*- global_page_state(NR_FILE_MAPPED);*/

	return saveable <= size ? 0 : saveable - size;
}
Comment 56 Aaron Lu 2014-12-05 07:45:00 UTC
After boot, cat /proc/meminfo:
AnonPages:        421368 kB
Mapped:           213032 kB

After boot one VM, cat /proc/meminfo:
AnonPages:       2630312 kB
Mapped:            79100 kB

So my VM app mostly makes use of anonymous pages while yours doesn't, and that's probably why I do not have a problem in the preallocation phase.

As for removing NR_FILE_MAPPED from minimum_image_size: although this can make preallocation succeed, one problem is that in the later copy phase count_data_pages may return a value larger than what we have preallocated. But from the log, we preallocated 1512738 pages and "Need to copy 1332075 pages", so that isn't a problem either. But where do those NR_FILE_MAPPED pages go? I'm still thinking about this.
Comment 57 Rainer Fiebig 2014-12-05 22:29:03 UTC
(In reply to Aaron Lu from comment #56)
> After boot, cat /proc/meminfo:
> AnonPages:        421368 kB
> Mapped:           213032 kB
> 
> After boot one VM, cat /proc/meminfo:
> AnonPages:       2630312 kB
> Mapped:            79100 kB
> 
> So my VM app mostly makes use of anonymous pages while yours doesn't and
> that's probably why I do not have a problem in pre-allocation phase.
> 

That's right.

1-GB-VM and 2-GB-VM running:
AnonPages:	  936288 kB
Mapped:		3647764 kB

1-GB-VM closed:
AnonPages:	  938864 kB
Mapped:		2649324 kB

And 2-GB-VM closed:
AnonPages:	  918724 kB
Mapped:		  424880 kB

1-GB-VM started:
AnonPages:	  992816 kB
Mapped:		1468576 kB


And when NR_FILE_MAPPED is (very) high, minimum_image_size(saveable) becomes problematic.

Because when NR_FILE_MAPPED is high, "size" will be rather small. The function is therefore less likely to return 0, and otherwise only a relatively small amount will be subtracted from saveable. If "size" became negative, its value would be ADDED to saveable!

In either case "pages" will be so big that s2disk fails.

> And on the removal of NR_FILE_MAPPED from the minimum_image_size, although
> this can make preallocate succeed, one problem is that in the later copy
> phase, the count_data_pages may return a value larger than what we have
> pre-allocated. But from the log, we pre-allocated 1512738 pages and "Need to
> copy 1332075 pages", which isn't a problem either. But where does those
> NR_FILE_MAPPED pages go? I'm still thinking about this.

NOT subtracting NR_FILE_MAPPED has so far brought the best results for me. It's an option we should evaluate further.

s2both now even succeeded in a situation where it never did before on my system - when clearly over 50% of memory was used:

> free
             total       used       free     shared    buffers     cached
Mem:      11997780    9884088    2113692     466596     797972    1524352
-/+ buffers/cache:    7561764    4436016
Swap:     10713084    1079088    9633996

PM: Preallocating image memory... 
save_highmem=0 pages, 0MB
saveable=2564335 pages, 10016MB
highmem=0 pages, 0MB
additional_pages=220 pages, 0MB
avail_normal=3061292 pages, 11958MB
count=3023089 pages, 11808MB
max_size=1510410 pages, 5900MB
user_specified_image_size=1349750 pages, 5272MB
adjusted_image_size=1349751 pages, 5272MB
minimum_pages=932196 pages, 3641MB
target_image_size=1349751 pages, 5272MB
nr_should_reclaim=1214584 pages, 4744MB
nr_reclaimed=303484 pages, 1185MB
preallocated_high_mem=0 pages, 0MB
to_alloc=1512679 pages, 5908MB
to_alloc_adjusted=1512679 pages, 5908MB
pages_allocated=1512679 pages, 5908MB
done (allocated 1673338 pages)
PM: Allocated 6693352 kbytes in 18.39 seconds (363.96 MB/s)
...
PM: Need to copy 1343860 pages
PM: Hibernation image created (1343860 pages copied)
Comment 58 Rainer Fiebig 2014-12-08 00:16:04 UTC
Aaron, 

after some more thinking, tweaking and testing this is my bottom line:

- A high NR_FILE_MAPPED causes Bug 47931 (at least on my system).
- Omitting "- global_page_state(NR_FILE_MAPPED)" in "minimum_image_size()" solves this bug.
- In addition, on my system s2disk and s2both now even work in cases where significantly more than 50% of memory is used (after "buffers/cached"!).
- Also, Bug 75101 has not occurred once. Not even when I tried to provoke it.
- "shrink_all_memory(saveable - size)" is indeed expendable. s2disk/s2both work without it.
- Bug 83821 only occurs if s2both is used AND "threads = y" is set in /etc/suspend.conf.
- Bug 83821 can be prevented by an early, moderate shrink_all_memory() that is executed with every s2disk, not just when memory load is high.

By omitting "- global_page_state(NR_FILE_MAPPED)" AND using an early, moderate "shrink_all_memory(count_data_pages() / 10)" instead of "shrink_all_memory(saveable - size)", Bug 47931, Bug 83821 and perhaps Bug 75101 can be solved or prevented.

This is sort of a synthesis of my first patch and our findings concerning NR_FILE_MAPPED.

Reliability so far is 100%. System-responsiveness after resume is very good. I'll stick with it.
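
For illustration, a minimal sketch of the combination described above as a fragment of hibernate_preallocate_memory() - the exact placement and the surrounding code are assumptions on my part, this is not a finished patch:

	/* Early, moderate reclaim, done on every s2disk rather than only
	 * under high memory load; replaces shrink_all_memory(saveable - size). */
	shrink_all_memory(count_data_pages() / 10);
	...
	/* Estimate the minimum size of the image - with the
	 * "- global_page_state(NR_FILE_MAPPED)" term removed from
	 * minimum_image_size(). */
	pages = minimum_image_size(saveable);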

Maybe it's time now for a corresponding patch, and to ask Toralf and Eric and perhaps the submitter of Bug 75101 for their help in testing it.

What's your opinion?

---

Just for your interest: 
This is a case where "size" actually would have been negative and about 400,000 pages would have been added to "saveable":

> cat /proc/meminfo
MemTotal:       11997808 kB
MemFree:         4603516 kB
MemAvailable:    6077028 kB
Buffers:         1263316 kB
Cached:           732752 kB
SwapCached:       340864 kB
Active:           429416 kB
Inactive:        2403768 kB
Active(anon):     287160 kB
Inactive(anon):   946080 kB
Active(file):     142256 kB
Inactive(file):  1457688 kB
Unevictable:          68 kB
Mlocked:              68 kB
SwapTotal:      10713084 kB
SwapFree:        8782760 kB
Dirty:                60 kB
Writeback:             0 kB
AnonPages:        664324 kB
Mapped:          4559752 kB
Shmem:            396120 kB
Slab:             168872 kB
SReclaimable:      84992 kB
Comment 59 Aaron Lu 2014-12-08 08:52:14 UTC
One problem with minimum_image_size is that, with VirtualBox (perhaps other apps too), we have a situation where the sum of nr_slab_reclaimable + nr_active_anon + nr_inactive_anon + nr_active_file + nr_inactive_file is smaller than nr_file_mapped, so the calculated size overflows and minimum_image_size returns 0. I assume the overflow of size is unintended.

[  771.193960] nr_slab_reclaimable=11014 pages, 43MB
[  771.193962] nr_active_anon=81330 pages, 317MB
[  771.193965] nr_inactive_anon=69552 pages, 271MB
[  771.193967] nr_active_file=64751 pages, 252MB
[  771.193969] nr_inactive_file=59446 pages, 232MB
[  771.193971] nr_file_mapped=490676 pages, 1916MB
[  771.193973] minimum_pages=0 pages, 0MB
Comment 60 Rainer Fiebig 2014-12-08 22:34:32 UTC
(In reply to Aaron Lu from comment #59)
> One problem with minimum_image_size is that, with virtuablbox(perhaps other
> app too), we have a situation where the sum of nr_reclaimable_slab +
> nr_active_anon + nr_inactive_anon + nr_active_file + nr_inactive_file is
> smaller than nr_file_mapped, and the calculated size is overflowed and then
> minimum_image_size returns 0. I assume the overflow of size is un-intended.
> 
> [  771.193960] nr_slab_reclaimable=11014 pages, 43MB
> [  771.193962] nr_active_anon=81330 pages, 317MB
> [  771.193965] nr_inactive_anon=69552 pages, 271MB
> [  771.193967] nr_active_file=64751 pages, 252MB
> [  771.193969] nr_inactive_file=59446 pages, 232MB
> [  771.193971] nr_file_mapped=490676 pages, 1916MB
> [  771.193973] minimum_pages=0 pages, 0MB


static unsigned long minimum_image_size(unsigned long saveable)
{
	unsigned long size;

	size = global_page_state(NR_SLAB_RECLAIMABLE)
		+ global_page_state(NR_ACTIVE_ANON)
		+ global_page_state(NR_INACTIVE_ANON)
		+ global_page_state(NR_ACTIVE_FILE)
		+ global_page_state(NR_INACTIVE_FILE)
		- global_page_state(NR_FILE_MAPPED);

	return saveable <= size ? 0 : saveable - size;
}

If I understand that correctly, minimum_image_size(saveable) returns 0 if saveable <= size. In this case s2disk succeeds - at least in what I've seen so far.

If saveable is NOT <= size, it returns saveable - size. And this is problematic if NR_FILE_MAPPED is high, because in that case only a (too) small amount is subtracted from saveable or - if size becomes negative as in your example - something is even added to saveable.
In your example size would be -204583. And saveable - (-204583) would mean saveable + 204583. saveable would actually become bigger instead of smaller.

For one of the failure-cases I reported, you rightly noticed (Comment 51):
"And your case shows that:
minimum_pages=1894339 pages, 7399MB
Which is pretty big, bigger than max_size 5900MB."

Here it is again:

PM: Preallocating image memory...
save_highmem=0 pages, 0MB
saveable=2262873 pages, 8839MB
highmem=0 pages, 0MB
additional_pages=220 pages, 0MB
avail_normal=3061370 pages, 11958MB
count=3023167 pages, 11809MB
max_size=1510449 pages, 5900MB
user_specified_image_size=1349750 pages, 5272MB
adjusted_image_size=1349751 pages, 5272MB
minimum_pages=1894339 pages, 7399MB
target_image_size=1510449 pages, 5900MB

Here, minimum_pages is ca. 83% of saveable and bigger than max_size.


Compare that to the following case where s2disk succeeded:

PM: Preallocating image memory...
save_highmem=0 pages, 0MB
saveable=2564335 pages, 10016MB
highmem=0 pages, 0MB
additional_pages=220 pages, 0MB
avail_normal=3061292 pages, 11958MB
count=3023089 pages, 11808MB
max_size=1510410 pages, 5900MB
user_specified_image_size=1349750 pages, 5272MB
adjusted_image_size=1349751 pages, 5272MB
minimum_pages=932196 pages, 3641MB
target_image_size=1349751 pages, 5272MB

In this case, minimum_pages is only 36% of saveable. And clearly below max_size.

When minimum_image_size() was designed, it was probably hardly foreseeable that one day there would be apps with NR_FILE_MAPPED so huge that the function would return nonsensical values which cause s2disk to fail.

But s2disk/s2both do work very well - even if much more than 50% of memory is used - if we simply scrap NR_FILE_MAPPED from minimum_image_size().
Comment 61 Aaron Lu 2014-12-09 07:39:02 UTC
(In reply to Jay from comment #60)
> (In reply to Aaron Lu from comment #59)
> > One problem with minimum_image_size is that, with virtuablbox(perhaps other
> > app too), we have a situation where the sum of nr_reclaimable_slab +
> > nr_active_anon + nr_inactive_anon + nr_active_file + nr_inactive_file is
> > smaller than nr_file_mapped, and the calculated size is overflowed and then
> > minimum_image_size returns 0. I assume the overflow of size is un-intended.
> > 
> > [  771.193960] nr_slab_reclaimable=11014 pages, 43MB
> > [  771.193962] nr_active_anon=81330 pages, 317MB
> > [  771.193965] nr_inactive_anon=69552 pages, 271MB
> > [  771.193967] nr_active_file=64751 pages, 252MB
> > [  771.193969] nr_inactive_file=59446 pages, 232MB
> > [  771.193971] nr_file_mapped=490676 pages, 1916MB
> > [  771.193973] minimum_pages=0 pages, 0MB
> 
> 
> static unsigned long minimum_image_size(unsigned long saveable)
> {
>       unsigned long size;
> 
>       size = global_page_state(NR_SLAB_RECLAIMABLE)
>               + global_page_state(NR_ACTIVE_ANON)
>               + global_page_state(NR_INACTIVE_ANON)
>               + global_page_state(NR_ACTIVE_FILE)
>               + global_page_state(NR_INACTIVE_FILE)
>               - global_page_state(NR_FILE_MAPPED);
> 
>       return saveable <= size ? 0 : saveable - size;
> }
> 
> If I understand that correctly, minimum_image_size(saveable) returns 0 if
> saveable <= size. In this case s2disk succeeds - in what I've seen so far.

Not my case:
[  357.059296] minimum_pages=0 pages, 0MB
[  357.059298] target_image_size=388528 pages, 1517MB
[  357.519193] nr_should_reclaim=471303 pages, 1841MB
[  357.519200] nr_reclaimed=154727 pages, 604MB
[  357.519203] preallocated_high_mem=0 pages, 0MB
[  357.519205] to_alloc=495243 pages, 1934MB
[  357.519208] to_alloc_adjusted=495243 pages, 1934MB
[  359.815384] pages_allocated=464967 pages, 1816MB
[  359.815391] to_alloc_from_highmem=30276 pages, 118MB
[  359.815393] pages_allocated_from_highmem=0 pages, 0MB
then it fails, since x64 does not have highmem.

> 
> If saveable is NOT <= size it returns saveable - size. And this is
> problematic if NR_FILE_MAPPED is high. Because in this case only a (too)
> small amount is subtracted from saveable or - if size becomes negative as in
> your example - is even being added to saveable. 

size is an unsigned long, so when the computation goes negative the result actually wraps to a very big integer.

> In your example size would be -204 583. And saveable - (-204 583) would mean
> saveable + 204 583. saveable would actually become bigger instead of
> smaller. 

This doesn't happen: because size's type is unsigned, -204583 wraps to a very big integer; saveable is smaller than that, so 0 is returned.
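
To make the wrap concrete, here is a small stand-alone user-space sketch (plain C, not kernel code) that redoes the arithmetic with the page counts from the log in comment #59; the saveable value is borrowed from the log in comment #63 purely for illustration:

#include <stdio.h>

int main(void)
{
	unsigned long size = 11014UL	/* nr_slab_reclaimable */
		+ 81330UL		/* nr_active_anon */
		+ 69552UL		/* nr_inactive_anon */
		+ 64751UL		/* nr_active_file */
		+ 59446UL		/* nr_inactive_file */
		- 490676UL;		/* nr_file_mapped, larger than the sum above */
	unsigned long saveable = 867833UL;	/* from comment #63, for illustration */

	/* On a 64-bit machine size wraps to 2^64 - 204583, so "saveable <= size"
	 * is true and 0 is returned - matching "minimum_pages=0" in the log. */
	printf("size = %lu\n", size);
	printf("minimum_pages = %lu\n", saveable <= size ? 0UL : saveable - size);
	return 0;
}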

> 
> For one of the failure-cases I reported, you rightly noticed (Comment 51):
> "And your case shows that:
> minimum_pages=1894339 pages, 7399MB
> Which is pretty big, bigger than max_size 5900MB."
> 
> Here it is again:
> 
> PM: Preallocating image memory...
> save_highmem=0 pages, 0MB
> saveable=2262873 pages, 8839MB
> highmem=0 pages, 0MB
> additional_pages=220 pages, 0MB
> avail_normal=3061370 pages, 11958MB
> count=3023167 pages, 11809MB
> max_size=1510449 pages, 5900MB
> user_specified_image_size=1349750 pages, 5272MB
> adjusted_image_size=1349751 pages, 5272MB
> minimum_pages=1894339 pages, 7399MB
> target_image_size=1510449 pages, 5900MB
> 
> Here, minimum_pages is ca. 83% of saveable and bigger than max_size.

I think we have a problem with the calculation of minimum_pages.

> 
> 
> Compare that to the following case where s2disk succeeded:
> 
> PM: Preallocating image memory...
> save_highmem=0 pages, 0MB
> saveable=2564335 pages, 10016MB
> highmem=0 pages, 0MB
> additional_pages=220 pages, 0MB
> avail_normal=3061292 pages, 11958MB
> count=3023089 pages, 11808MB
> max_size=1510410 pages, 5900MB
> user_specified_image_size=1349750 pages, 5272MB
> adjusted_image_size=1349751 pages, 5272MB
> minimum_pages=932196 pages, 3641MB
> target_image_size=1349751 pages, 5272MB
> 
> In this case, minimum_pages is only 36% of saveable. And clearly below
> max_size.
> 
> When minimum_image_size() was designed, it was probably hardly foreseeable
> that one day there would be apps with NR_FILE_MAPPED so huge that the
> function would return nonsensical values which cause s2disk to fail.

Agree.

> 
> But s2disk/s2both does work very well - even if much more than 50% of memory
> is used - if we simply scrap NR_FILE_MAPPED from minimum_image_size().

This seems to be the case, but I have yet to understand why.
Comment 62 Rainer Fiebig 2014-12-09 21:31:08 UTC
(In reply to Aaron Lu from comment #61)
> (In reply to Jay from comment #60)
> > (In reply to Aaron Lu from comment #59)
> > > One problem with minimum_image_size is that, with virtuablbox(perhaps
> other
> > > app too), we have a situation where the sum of nr_reclaimable_slab +
> > > nr_active_anon + nr_inactive_anon + nr_active_file + nr_inactive_file is
> > > smaller than nr_file_mapped, and the calculated size is overflowed and
> then
> > > minimum_image_size returns 0. I assume the overflow of size is
> un-intended.
> > > 
> > > [  771.193960] nr_slab_reclaimable=11014 pages, 43MB
> > > [  771.193962] nr_active_anon=81330 pages, 317MB
> > > [  771.193965] nr_inactive_anon=69552 pages, 271MB
> > > [  771.193967] nr_active_file=64751 pages, 252MB
> > > [  771.193969] nr_inactive_file=59446 pages, 232MB
> > > [  771.193971] nr_file_mapped=490676 pages, 1916MB
> > > [  771.193973] minimum_pages=0 pages, 0MB
> > 
> > 
> > static unsigned long minimum_image_size(unsigned long saveable)
> > {
> >     unsigned long size;
> > 
> >     size = global_page_state(NR_SLAB_RECLAIMABLE)
> >             + global_page_state(NR_ACTIVE_ANON)
> >             + global_page_state(NR_INACTIVE_ANON)
> >             + global_page_state(NR_ACTIVE_FILE)
> >             + global_page_state(NR_INACTIVE_FILE)
> >             - global_page_state(NR_FILE_MAPPED);
> > 
> >     return saveable <= size ? 0 : saveable - size;
> > }
> > 
> > If I understand that correctly, minimum_image_size(saveable) returns 0 if
> > saveable <= size. In this case s2disk succeeds - in what I've seen so far.
> 
> Not my case:
> [  357.059296] minimum_pages=0 pages, 0MB
> [  357.059298] target_image_size=388528 pages, 1517MB
> [  357.519193] nr_should_reclaim=471303 pages, 1841MB
> [  357.519200] nr_reclaimed=154727 pages, 604MB
> [  357.519203] preallocated_high_mem=0 pages, 0MB
> [  357.519205] to_alloc=495243 pages, 1934MB
> [  357.519208] to_alloc_adjusted=495243 pages, 1934MB
> [  359.815384] pages_allocated=464967 pages, 1816MB
> [  359.815391] to_alloc_from_highmem=30276 pages, 118MB
> [  359.815393] pages_allocated_from_highmem=0 pages, 0MB
> then fail since x64 does not have highmem.
> 

I think we are both right here:

If, as I've said, saveable <= size is genuinely true, it returns 0 and s2disk succeeds - as you can see in the example at the bottom (no. 16!).
If, however, saveable <= size would be false without the overflow but minimum_image_size() returns 0 anyway, s2disk obviously fails - as it does if the returned value is too big (greater than max_size?).

> > 
> > If saveable is NOT <= size it returns saveable - size. And this is
> > problematic if NR_FILE_MAPPED is high. Because in this case only a (too)
> > small amount is subtracted from saveable or - if size becomes negative as
> in
> > your example - is even being added to saveable. 
> 
> size is an unsigned long so when it is negative, it is actually a very big
> integer.
> 
> > In your example size would be -204 583. And saveable - (-204 583) would
> mean
> > saveable + 204 583. saveable would actually become bigger instead of
> > smaller. 
> 
> This doesn't happen: since -204583 is a very big integer due to size's type
> is unsigned and saveable is smaller than it so 0 is returned.

Oops! Overlooked that. Good that it doesn't get bigger.

> 
> > 
> > For one of the failure-cases I reported, you rightly noticed (Comment 51):
> > "And your case shows that:
> > minimum_pages=1894339 pages, 7399MB
> > Which is pretty big, bigger than max_size 5900MB."
> > 
> > Here it is again:
> > 
> > PM: Preallocating image memory...
> > save_highmem=0 pages, 0MB
> > saveable=2262873 pages, 8839MB
> > highmem=0 pages, 0MB
> > additional_pages=220 pages, 0MB
> > avail_normal=3061370 pages, 11958MB
> > count=3023167 pages, 11809MB
> > max_size=1510449 pages, 5900MB
> > user_specified_image_size=1349750 pages, 5272MB
> > adjusted_image_size=1349751 pages, 5272MB
> > minimum_pages=1894339 pages, 7399MB
> > target_image_size=1510449 pages, 5900MB
> > 
> > Here, minimum_pages is ca. 83% of saveable and bigger than max_size.
> 
> I think we have a problem of calculating minimum_pages.
> 
> > 
> > 
> > Compare that to the following case where s2disk succeeded:
> > 
> > PM: Preallocating image memory...
> > save_highmem=0 pages, 0MB
> > saveable=2564335 pages, 10016MB
> > highmem=0 pages, 0MB
> > additional_pages=220 pages, 0MB
> > avail_normal=3061292 pages, 11958MB
> > count=3023089 pages, 11808MB
> > max_size=1510410 pages, 5900MB
> > user_specified_image_size=1349750 pages, 5272MB
> > adjusted_image_size=1349751 pages, 5272MB
> > minimum_pages=932196 pages, 3641MB
> > target_image_size=1349751 pages, 5272MB
> > 
> > In this case, minimum_pages is only 36% of saveable. And clearly below
> > max_size.
> > 
> > When minimum_image_size() was designed, it was probably hardly foreseeable
> > that one day there would be apps with NR_FILE_MAPPED so huge that the
> > function would return nonsensical values which cause s2disk to fail.
> 
> Agree.
> 
> > 
> > But s2disk/s2both does work very well - even if much more than 50% of
> memory
> > is used - if we simply scrap NR_FILE_MAPPED from minimum_image_size().
> 
> This seems to be the case, but I have yet to understand why.

Never wrong!

Another experiment.
I skipped minimum_image_size(saveable):
...
	/* Estimate the minimum size of the image. */
	pages = size; /*minimum_image_size(saveable);*/
...

s2both still worked. Writing the image to disk and especially resuming took their time, and the system was somewhat sluggish for a while. It was a heavy-load case, though (subsequent attempts with less memory load were much better, almost like before):

> free
             total       used       free     shared    buffers     cached
Mem:      11997780   10251944    1745836     463312    1337436    1931652
-/+ buffers/cache:    6982856    5014924
Swap:     10713084          0   10713084

PM: Preallocating image memory... 
save_highmem=0 pages, 0MB
saveable=2660568 pages, 10392MB
highmem=0 pages, 0MB
additional_pages=220 pages, 0MB
avail_normal=3061352 pages, 11958MB
count=3023149 pages, 11809MB
max_size=1510440 pages, 5900MB
user_specified_image_size=1349750 pages, 5272MB
adjusted_image_size=1349751 pages, 5272MB
minimum_pages=1349751 pages, 5272MB
target_image_size=1349751 pages, 5272MB
nr_should_reclaim=1310817 pages, 5120MB
nr_reclaimed=480572 pages, 1877MB
preallocated_high_mem=0 pages, 0MB
to_alloc=1512709 pages, 5909MB
to_alloc_adjusted=1512709 pages, 5909MB
pages_allocated=1512709 pages, 5909MB
done (allocated 1673398 pages)
PM: Allocated 6693592 kbytes in 15.55 seconds (430.45 MB/s)
...
PM: Need to copy 1346668 pages
PM: Hibernation image created (1346668 pages copied)
--

It seems we could do completely without minimum_image_size().

Anyway - I think it's safe to say now that NR_FILE_MAPPED is unfit for the calculation in minimum_image_size().
It can become too big. And it stays big because it is only minimally influenced by shrink_all_memory(). So even an early shrink_all_memory() won't help.
I also think that - if possible - subtraction should be avoided in the formula - we now know where this can lead.

So the options that I see are

- omitting NR_FILE_MAPPED 
- omitting NR_FILE_MAPPED AND omitting some of the values on the plus-side
- omitting minimum_image_size() completely
- develop a new way to calculate minimum_image_size()

The first one works like a charm so far on my box. In combination with an early, moderate shrink_all_memory(): 3 bugs gone + 50%-barrier broken.

The second one may be even better. But what to leave out?

The third: why not? Simpler, less code, less error-prone.

Number 4 would be for you. But I'd help with testing.



---
Success:

> free
             total       used       free     shared    buffers     cached
Mem:      11997572    7934668    4062904          0    2070348    1267884
-/+ buffers/cache:    4596436    7401136
Swap:     10714108     103724   10610384

PM: Preallocating image memory... 
1: save_highmem = count_highmem_pages(); 0
2: saveable = count_data_pages(); 1546294
3: highmem += zone_page_state(zone, NR_FREE_PAGES); 0
4: count += zone_page_state(zone, NR_FREE_PAGES); 3061840
5: count += highmem; 3061840
6: count -= totalreserve_pages; 3023637
7: size += page_key_additional_pages(saveable) 216
8: max_size (maximum nr of saveable pages to leave in memory) 1510686
9: totalram_pages 2999393
10: PAGE_SIZE 4096
11: image_size 5528481177
12: size = DIV_ROUND_UP(image_size, PAGE_SIZE); 1349727
13: size 1349727
16: pages = minimum_image_size(saveable); 0
17: avail_normal 3061840
18: pages_highmem = preallocate_image_highmem(highmem / 2); 0
19: alloc = count - max_size; 1512951
20: pages = preallocate_image_memory(alloc, avail_normal); 1512951
26: alloc = max_size - size; 160959
27: size = preallocate_highmem_fraction(alloc, highmem, count); 0
28: pages_highmem += size; 0
29: alloc -= size; 160959
30: size = preallocate_image_memory(alloc, avail_normal); 160959
31: pages_highmem += preallocate_image_highmem(alloc - size); 0
32: pages += pages_highmem + size; 1673910
done (allocated 1673910 pages)
PM: Allocated 6695640 kbytes in 1.72 seconds (3892.81 MB/s)
...
PM: Need to copy 1349433 pages
PM: Hibernation image created (1349433 pages copied)
---
Comment 63 Aaron Lu 2014-12-12 08:20:18 UTC
Jay,

For your system, it seems that as long as minimum_image_size returns a value smaller than max_size (5900MB), the preallocation will succeed. I think this is a good hint, but then I realized that on my system, even when minimum_image_size returns a value smaller than max_size, the preallocation still failed:
[  109.704036] save_highmem=0 pages, 0MB
[  109.704040] saveable=867833 pages, 3389MB
[  109.704043] highmem=0 pages, 0MB
[  109.704045] additional_pages=88 pages, 0MB
[  109.704046] avail_normal=1018100 pages, 3976MB
[  109.704048] count=988369 pages, 3860MB
[  109.704050] max_size=493116 pages, 1926MB
[  109.704052] user_specified_image_size=388518 pages, 1517MB
[  109.704054] adjusted_image_size=388518 pages, 1517MB
[  109.704067] minimum_pages=0 pages, 0MB
[  109.704069] target_image_size=388518 pages, 1517MB
[  110.091819] nr_should_reclaim=479315 pages, 1872MB
[  110.091825] nr_reclaimed=198780 pages, 776MB
[  110.091827] preallocated_high_mem=0 pages, 0MB
[  110.091829] to_alloc=495253 pages, 1934MB
[  110.091830] to_alloc_adjusted=495253 pages, 1934MB
[  111.824189] pages_allocated=474182 pages, 1852MB
[  111.824196] to_alloc_from_highmem=21071 pages, 82MB
[  111.824198] pages_allocated_from_highmem=0 pages, 0MB

[  111.980642] PM: Basic memory bitmaps freed
[  111.980646] Restarting tasks ... done.

So the change to minimum_image_size can't solve my problem, and perhaps not others' either. I can't think of a general solution for this preallocation problem - maybe it's simply that too much memory is in use?
Comment 64 Rainer Fiebig 2014-12-12 11:15:00 UTC
Aaron,

this is important:

You omitted this line 	"- global_page_state(NR_FILE_MAPPED);"
from the calculation of "size" in minimum_image_size()
AND minimum_image_size() returned 0
AND s2disk failed?
Comment 65 Rainer Fiebig 2014-12-14 12:02:00 UTC
The whole problem is rooted in the right part of this formula:

 * [number of saveable pages] - [number of pages that can be freed in theory]
 
Ideally, [number of pages that can be freed in theory] should be composed only of elements that can be freed to 100%, if only in theory.

1.) If (1) NR_SLAB_RECLAIMABLE, (2) NR_ACTIVE_ANON, (3) NR_INACTIVE_ANON, (4) NR_ACTIVE_FILE, (5) NR_INACTIVE_FILE are such elements, subtracting NR_FILE_MAPPED (or anything else) from them is simply a mistake.

2.) If (1) to (5) are not 100% freeable and something must be subtracted from them to achieve the "true" value, it must not be NR_FILE_MAPPED (or NR_ANON_PAGES) because this can become so huge that it blows the whole thing up.

If 1.) is true, the solution is obvious.

If 2.) is true, we need to find a better way to calculate [number of pages that can be freed in theory].
Comment 66 Aaron Lu 2014-12-15 02:32:51 UTC
(In reply to Jay from comment #64)
> Aaron,
> 
> this is important:
> 
> You omitted this line         "- global_page_state(NR_FILE_MAPPED);"
> from the calculation of "size" in minimum_image_size()
> AND minimum_image_size() returned 0
> AND s2disk failed?

I didn't omit "- global_page_state(NR_FILE_MAPPED)" from the calculation of size; it returns 0 because NR_FILE_MAPPED is bigger than the sum of the others, which caused an overflow. And yes, s2disk failed - as you can see from the log, there were still 82MB left to allocate.
Comment 67 Aaron Lu 2014-12-15 02:38:43 UTC
Jay,

For your case:
According to the logs you have posted, as long as minimum_image_size returns a value smaller than max_size, the preallocation will succeed.
For my case:
Even when minimum_image_size returns 0, the preallocation still failed.

Based on the above 2 results, I don't see a proper way to improve minimum_image_size.
Comment 68 Rainer Fiebig 2014-12-16 16:48:02 UTC
(In reply to Aaron Lu from comment #66)
> (In reply to Jay from comment #64)
> > Aaron,
> > 
> > this is important:
> > 
> > You omitted this line       "- global_page_state(NR_FILE_MAPPED);"
> > from the calculation of "size" in minimum_image_size()
> > AND minimum_image_size() returned 0
> > AND s2disk failed?
> 
> I didn't omit "- global_page_state(NR_FILE_MAPPED)" in the calculation of
> size, it returns 0 because NR_FILE_MAPPED is bigger than the sum of the
> others and that caused an overflow. 

That was to be expected.

>And yes, s2disk failed, as you can see
> from the log, there is 82MB still needed to allocate.

Well, I saw that.
Comment 69 Rainer Fiebig 2014-12-16 16:49:44 UTC
(In reply to Aaron Lu from comment #67)
> Jay,
> 
> For your case:
> According to the log you have posted, as long as the minimum_image_size is
> smaller than max_size, the preallocate will succeed.
> For my case:
> Even minimum_image_size returns 0, preallocate still failed.
> 
> Based on the above 2 results, I don't see a proper way to improve
> minimum_image_size.

No problem. I'm not trying to push or sell anything here.

Nevertheless I can say that by excluding NR_FILE_MAPPED from the calculation in minimum_image_size() - combined with an early, moderate shrink_all_memory() - I achieved a level of usability and reliability of s2disk/s2both that is simply incomparable to what I had before. And so far I haven't seen any serious negative effects like system instability or data loss.

For a recent stress-test I had 3 VirtualBox-VMs running (8 GB in total) plus gitk plus several other programs, and s2both still succeeded. Instead of failing, the system started to make use of swap space during s2disk. Only after I had started another 2-GB-VM did hibernate fail - almost to my relief.

So - I've found my solution. Whether you like to further explore it or not, is of course up to you.
Comment 70 Rainer Fiebig 2014-12-22 20:30:33 UTC
(In reply to Aaron Lu from comment #63)
> Jay,
> 
> For your system, it seems as long as the minimum_image_size returns a value
> that is smaller than max_size 5900MB, the preallocate will succeed. I think
> this is a good hint, but then I realized on my system even the
> minimum_image_size returns a value that is smaller than max_size, the
> preallocate still failed:
> [  109.704036] save_highmem=0 pages, 0MB
> [  109.704040] saveable=867833 pages, 3389MB
> [  109.704043] highmem=0 pages, 0MB
> [  109.704045] additional_pages=88 pages, 0MB
> [  109.704046] avail_normal=1018100 pages, 3976MB
> [  109.704048] count=988369 pages, 3860MB
> [  109.704050] max_size=493116 pages, 1926MB
> [  109.704052] user_specified_image_size=388518 pages, 1517MB
> [  109.704054] adjusted_image_size=388518 pages, 1517MB
> [  109.704067] minimum_pages=0 pages, 0MB
> [  109.704069] target_image_size=388518 pages, 1517MB
> [  110.091819] nr_should_reclaim=479315 pages, 1872MB
> [  110.091825] nr_reclaimed=198780 pages, 776MB
> [  110.091827] preallocated_high_mem=0 pages, 0MB
> [  110.091829] to_alloc=495253 pages, 1934MB
> [  110.091830] to_alloc_adjusted=495253 pages, 1934MB
> [  111.824189] pages_allocated=474182 pages, 1852MB
> [  111.824196] to_alloc_from_highmem=21071 pages, 82MB
> [  111.824198] pages_allocated_from_highmem=0 pages, 0MB
> 
> [  111.980642] PM: Basic memory bitmaps freed
> [  111.980646] Restarting tasks ... done.
> 
> So the change to minimum_image_size can't solve my problem, and perhaps
> others. I can't think of a general solution for this preallocation problem,
> maybe it's just the memory are used too much?


If minimum_image_size() returns 0, s2disk/s2both always succeeds in my system.

It does so because preallocate_image_pages() then still returns a value (for pages_allocated) that is identical to to_alloc_adjusted (see Comment 62).

Whereas in your system it returns a value that is less than to_alloc_adjusted.

I wonder whether this different behaviour is related to available memory. The submitter's system and mine have more than 4 GB (8 / 12 GB). But that's just an idea.

Reviewing previous cases I found that minimum_pages was always too high when s2disk failed and 0 in the immediately following, successful attempt. That pattern of high/failure, 0/success was consistent.

While it's mostly good to get a second chance: if a return-value of 0 for minimum_image_size() was intended to cause s2disk to fail, success in that case is somewhat misleading and obscures the reason for failure.
Comment 71 Aaron Lu 2015-03-19 07:55:30 UTC
Back to the original problem:

1
2012-09-15T05:17:54.114+02:00 n22 kernel: PM: Need to copy 719868 pages
2012-09-15T05:17:54.114+02:00 n22 kernel: PM: Normal pages needed: 146128 + 1024, available pages: 81456
2012-09-15T05:17:54.114+02:00 n22 kernel: PM: Not enough free memory
2012-09-15T05:17:54.114+02:00 n22 kernel: PM: Error -12 creating hibernation image

2
[142528.248008] PM: Normal pages needed: 530249 + 1024, available pages: 518040
[142528.248008] PM: Not enough free memory
[142528.248008] PM: Error -12 creating hibernation image

Somehow, it seems that the number of needed pages is bigger than the number of available ones, and that caused the problem. Is this still a problem on the latest kernel?
Comment 72 Rainer Fiebig 2015-04-23 20:40:06 UTC
(In reply to Aaron Lu from comment #71)
> Back to the original problem:
> 
> 1
> 22012-09-15T05:17:54.114+02:00 n22 kernel: PM: Need to copy 719868 pages
> 2012-09-15T05:17:54.114+02:00 n22 kernel: PM: Normal pages needed: 146128 +
> 1024, available pages: 81456
> 2012-09-15T05:17:54.114+02:00 n22 kernel: PM: Not enough free memory
> 2012-09-15T05:17:54.114+02:00 n22 kernel: PM: Error -12 creating hibernation
> image
> 
> 2
> [142528.248008] PM: Normal pages needed: 530249 + 1024, available pages:
> 518040
> [142528.248008] PM: Not enough free memory
> [142528.248008] PM: Error -12 creating hibernation image
> 
> Somehow, it seems that the needed pages is bigger than available ones and
> that caused problem. Is this still a problem on latest kernel?


It's still there with kernel 4.0: First attempt = failure, second attempt = success.

Also still there: Bug 83821 (only when using s2both).

With two modifications in snapshot.c both are gone:

1) Bug 47931:

static unsigned long minimum_image_size(unsigned long saveable) 
...
+ global_page_state(NR_INACTIVE_FILE);
/*- global_page_state(NR_FILE_MAPPED);*/
...
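
Spelled out, an illustrative sketch of change 1) based on the minimum_image_size() version quoted in comment #60 (not the literal patch; modification 2) is omitted here because only a fragment of hibernate_preallocate_memory() is shown above):

static unsigned long minimum_image_size(unsigned long saveable)
{
	unsigned long size;

	/* Sum the reclaimable counters only; with NR_FILE_MAPPED no longer
	 * subtracted, "size" can neither wrap around nor shrink towards zero
	 * when an application maps a huge amount of file pages. */
	size = global_page_state(NR_SLAB_RECLAIMABLE)
		+ global_page_state(NR_ACTIVE_ANON)
		+ global_page_state(NR_INACTIVE_ANON)
		+ global_page_state(NR_ACTIVE_FILE)
		+ global_page_state(NR_INACTIVE_FILE);

	return saveable <= size ? 0 : saveable - size;
}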

2) Bug 83821 (when only little memory is used):

int hibernate_preallocate_memory(void) 
...
if (size >= saveable) {
shrink_all_memory(saveable);
...

I've been using this for months now, which means several hundred s2both/resume cycles.

Only once (or twice? - can't remember) did s2both not work on the first attempt. And this was probably due to a problem with a VirtualBox-VM. 

No negative effects, the system is superbly stable.

I think these findings may warrant a closer look by developers.

Anyway - at least for me bugs 47931, 83821 and 75101 are history.
Comment 73 Aaron Lu 2015-04-24 01:51:15 UTC
Jay,

I think you are talking about another problem. The original problem reported by Toralf is that the preallocation phase succeeds, but later, when the data is actually being saved in swsusp_save, it turns out that the available memory is not enough - whereas your case fails at the preallocation phase. Feel free to open a new bug report for that, but we shouldn't discuss it in this thread :-)
Comment 74 Rainer Fiebig 2015-04-24 10:52:37 UTC
(In reply to Aaron Lu from comment #73)
> Jay,
> 
> I think you are talking about another problem. The original problem reported
> by Toralf is that the preallocation phase succeed, but later when the data
> is actually being saved in swsusp_save, it was found the available memory is
> not enough while your case is that it fails at the preallocation phase. Feel
> free to open a new bug report for that, but we shouldn't talk that in this
> thread :-)


Alright, I opened a new report (Bug 97201) so it won't be forgotten. Maybe s2disk users are an endangered species in the rising age of SSDs. But I still can't see any better energy-saving and productivity-enhancing feature in an OS, and I think it should work under all conditions.

I also opened a new report (Bug 97211) for Bug 83821 because it's IMO not a duplicate of this one and only occurs with s2both when "threads = y" in /etc/suspend.conf.

So long!
Comment 75 Hussam Al-Tayeb 2016-10-03 18:03:46 UTC
I have a similar issue. I occasionally test nouveau. When hibernating under nouveau, it seems Linux attempts to cache the contents of the dedicated video memory into RAM instead of swap on hibernation. This is possibly due to the assumption that it is going to be swapped to the swap partition anyway while creating the hibernation image.
My system has 4GB of physical RAM and the nvidia card has 2GB of memory. When less than 2GB of RAM is free, the system fails to hibernate, complaining that there is not enough memory to create the image.

This is odd, since I have a 32GB swap partition and it rarely ever uses more than 200 to 300MB of swap space.

In your case, this is likely Intel integrated graphics. But it seems to be the same behavior.

Maybe the hibernate logic needs polishing.
Comment 76 Hussam Al-Tayeb 2016-10-03 18:06:09 UTC
(In reply to Hussam Al-Tayeb from comment #75)
> I have a similar issue. I occasionally test nouveau. When hibernating under
> nouveau, it seems Linux attempts to cache the contents of the dedicated
> video memory into ram instead of swap on hibernation. This is possibly due
> to the assumption that it is going to be swapped to swap partition anyway
> while creating the hibernation image.
> My system has 4GB physical ram and nvidia card has 2GB memory. When less
> than 2GB ram are free, the system would fail to hibernate complaining not
> enough memory to create the image.
> 
> This is odd since I have a 32GB swap partition and it rarely ever uses more
> than 2 00 to 300MB of swap space.
> 
> In your case, this is likely an Intel integrated graphics. But it seems to
> be the same behavior.
> 
> Maybe hibernate logic need polishing.

Oddly enough, the second attempt seems to work, as most of the memory content is by then in swap, so there is enough free RAM.

With the proprietary nvidia driver, this issue never happens.
Comment 77 doa379 2016-10-07 21:25:32 UTC
The reason is that the kernel creates the image entirely in RAM before committing it to swap. This is a major flaw in the implementation of suspend-to-disk.
Comment 78 Hussam Al-Tayeb 2016-10-07 21:42:21 UTC
(In reply to doa379 from comment #77)
> The reason is the kernel creates the image entirely in RAM before committing
> it to swap. This is a major flaw in the implementation of suspend-to-disk.

It would be nice if there were a way to work around this, or at least some way to change the behavior.
Comment 79 doa379 2016-10-07 21:45:59 UTC
No. The way is to re-implement suspend-to-disk in the kernel.

https://bugzilla.kernel.org/show_bug.cgi?id=176461