Bug 196157

Summary: 100+ times slower disk writes on 4.x+/i386/16+RAM, compared to 3.x
Product: Memory Management
Reporter: Alkis Georgopoulos (alkisg)
Component: Page Allocator
Assignee: Andrew Morton (akpm)
Status: NEW
Severity: normal
Priority: P1
CC: fernando.filgueira, gurselm, gustep12, howaboutsynergy, nucleo, pmhahn, reserv0, tijs_vbuggen
Hardware: All
OS: Linux
Kernel Version: 4.x
Regression: No
Attachments: Disk read/write speed
Disk operations per second
Disk merged operations per second

Description Alkis Georgopoulos 2017-06-22 06:25:49 UTC
Many other users and I have an issue where disk writes start fast (e.g. 200 MB/sec), but after intensive disk usage they end up 100+ times slower (e.g. 2 MB/sec), and never get fast again until we run "echo 3 > /proc/sys/vm/drop_caches".

This issue happens on systems with any 4.x kernel, i386 arch, 16+ GB RAM.
It doesn't happen if we use 3.x kernels (i.e. it's a regression) or any 64bit kernels (i.e. it only affects i386).

My initial bug report was in Ubuntu:
https://bugs.launchpad.net/ubuntu/+source/linux-hwe/+bug/1698118

I included a test case there, which mostly says "Copy /lib around 100 times. You'll see that the first copy happens in 5 seconds, and the 30th copy may need more than 800 seconds".

Here is my latest version of the script (basically, step (3) below; a spelled-out version follows the three steps):
1) . /etc/os-release; echo -n "$VERSION, $(uname -r), $(dpkg --print-architecture), RAM="; awk '/MemTotal:/ { print $2 }' /proc/meminfo
2) mount /dev/sdb2 /mnt && rm -rf /mnt/tmp/lib && mkdir -p /mnt/tmp/lib && sync && echo 3 > /proc/sys/vm/drop_caches && chroot /mnt
3) mkdir -p /tmp/lib; cd /tmp/lib; s=/lib; d=1; echo -n "Copying $s to $d: "; while /usr/bin/time -f %e sh -c "cp -a '$s' '$d'; sync"; do s=$d; d=$((($d+1)%100)); echo -n "Copying $s to $d: "; done
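For readability, here are the same steps as a plain script (a rough sketch of the one-liners above; /dev/sdb2 and /mnt are just my local device and mount point, and the chroot from step (2) is replaced by working directly under /mnt/tmp/lib):

#!/bin/sh
# Sketch of the reproduction steps above; adjust device and mount point.
set -e

# Step 1: print distro, kernel, architecture and RAM
. /etc/os-release
echo -n "$VERSION, $(uname -r), $(dpkg --print-architecture), RAM="
awk '/MemTotal:/ { print $2 }' /proc/meminfo

# Step 2: mount the target filesystem and start from a cold cache
mount /dev/sdb2 /mnt
rm -rf /mnt/tmp/lib
mkdir -p /mnt/tmp/lib
sync
echo 3 > /proc/sys/vm/drop_caches

# Step 3: copy /lib over and over on that filesystem, timing each copy+sync
cd /mnt/tmp/lib
s=/lib; d=1
echo -n "Copying $s to $d: "
while /usr/bin/time -f %e sh -c "cp -a '$s' '$d'; sync"; do
    s=$d
    d=$(( (d + 1) % 100 ))
    echo -n "Copying $s to $d: "
done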

And here are some results, where you can see that all 4.x+ i386 kernels are affected:
-----------------------------------------------------------------------------
14.04, Trusty Tahr, 3.13.0-24-generic, i386, RAM=16076400 [Live CD]
8-13 secs

15.04 (Vivid Vervet), 3.19.0-15-generic, i386, RAM=16083080 [Live CD]
5-7 secs

15.10 (Wily Werewolf), 4.2.0-16-generic, i386, RAM=16082536 [Live CD]
4-350 secs

16.04.2 LTS (Xenial Xerus), 3.19.0-80-generic, i386, RAM=16294832 [HD install]
10-25 secs

16.04.2 LTS (Xenial Xerus), 4.2.0-42-generic, i386, RAM=16294392 [HD install]
14-89 secs

16.04.2 LTS (Xenial Xerus), 4.4.0-79-generic, i386, RAM=16293556 [HD install]
15-605 secs

16.04.2 LTS (Xenial Xerus), 4.8.0-54-generic, i386, RAM=16292708 [HD install]
6-160 secs

16.04.2 LTS (Xenial Xerus), 4.12.0-041200rc5-generic, i386, RAM=16292588 [HD install]
46-805 secs

16.04.2 LTS (Xenial Xerus), 4.8.0-36-generic, amd64, RAM=16131028 [Live CD]
4-11 secs

An example single run of the script:
-----------------------------------------------------------------------------
16.04.2 LTS (Xenial Xerus), 4.8.0-54-generic, i386, RAM=16292708 [HD install]
-----------------------------------------------------------------------------
Copying /lib to 1: 37.23
Copying 1 to 2: 6.74
Copying 2 to 3: 6.88
Copying 3 to 4: 7.89
Copying 4 to 5: 7.91
Copying 5 to 6: 9.03
Copying 6 to 7: 8.46
Copying 7 to 8: 8.10
Copying 8 to 9: 8.93
Copying 9 to 10: 10.51
Copying 10 to 11: 10.33
Copying 11 to 12: 11.08
Copying 12 to 13: 11.78
Copying 13 to 14: 14.18
Copying 14 to 15: 18.42
Copying 15 to 16: 23.19
Copying 16 to 17: 61.08
Copying 17 to 18: 155.88
Copying 18 to 19: 141.96
Copying 19 to 20: 152.98
Copying 20 to 21: 163.03
Copying 21 to 22: 154.85
Copying 22 to 23: 137.13
Copying 23 to 24: 146.08
Copying 24 to 25:

Thank you!
Comment 1 Andrew Morton 2017-06-22 19:37:39 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

hm, that's news to me.

Does anyone have access to a large i386 setup?  Interested in
reproducing this and figuring out what's going wrong?


On Thu, 22 Jun 2017 06:25:49 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:

> [full bug report quoted; identical to the description above]
Comment 2 Alkis Georgopoulos 2017-06-22 20:58:56 UTC
On 22/06/2017 10:37 PM, Andrew Morton wrote:
> 
> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
> 
> hm, that's news to me.
> 
> Does anyone have access to a large i386 setup?  Interested in
> reproducing this and figuring out what's going wrong?
> 


I can arrange ssh/vnc access to an i386 box with 16 GB RAM that has the 
issue, if some kernel dev wants to work on that. Please PM me for 
details - also tell me your preferred distro.
Comment 3 Michal Hocko 2017-06-23 07:23:46 UTC
On Thu 22-06-17 12:37:36, Andrew Morton wrote:
[...]
> > Me and a lot of other users have an issue where disk writes start fast
> (e.g.
> > 200 MB/sec), but after intensive disk usage, they end up 100+ times slower
> > (e.g. 2 MB/sec), and never get fast again until we run "echo 3 >
> > /proc/sys/vm/drop_caches".

What is your dirty limit configuration? Is your highmem dirtyable
(highmem_is_dirtyable)?
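(For reference, a quick way to dump the settings in question plus the current dirty/writeback state; just a sketch:)

# dirty-throttling knobs and whether highmem counts as dirtyable
sysctl vm.dirty_ratio vm.dirty_background_ratio vm.dirty_bytes vm.dirty_background_bytes vm.highmem_is_dirtyable
# current dirty/writeback pages and the lowmem/highmem split (32-bit highmem kernels)
grep -E 'Dirty|Writeback|LowFree|HighFree' /proc/meminfo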

> > This issue happens on systems with any 4.x kernel, i386 arch, 16+ GB RAM.
> > It doesn't happen if we use 3.x kernels (i.e. it's a regression) or any
> 64bit
> > kernels (i.e. it only affects i386).

I remember we've had some changes in the way dirty memory is
throttled, and 32b would be more sensitive to those changes. Anyway, I
would _strongly_ discourage you from using 32b kernels with that much
memory. You are going to hit walls constantly and many of those issues
will be inherent. Some of them less so, but rather non-trivial to fix
without regressing somewhere else. You can tune your system somehow, but
this will be fragile no matter what.

Sorry to say it, but 32b systems with tons of memory are far from a
priority for most mm people. Just use a 64b kernel. There are more
pressing problems to deal with.
Comment 4 Alkis Georgopoulos 2017-06-23 07:44:42 UTC
On 23/06/2017 10:13 AM, Michal Hocko wrote:
> On Thu 22-06-17 12:37:36, Andrew Morton wrote:
> 
> What is your dirty limit configuration. Is your highmem dirtyable
> (highmem_is_dirtyable)?
> 
>>> This issue happens on systems with any 4.x kernel, i386 arch, 16+ GB RAM.
>>> It doesn't happen if we use 3.x kernels (i.e. it's a regression) or any
>>> 64bit
>>> kernels (i.e. it only affects i386).
> 
> I remember we've had some changes in the way how the dirty memory is
> throttled and 32b would be more sensitive to those changes. Anyway, I
> would _strongly_ discourage you from using 32b kernels with that much of
> memory. You are going to hit walls constantly and many of those issues
> will be inherent. Some of them less so but rather non-trivial to fix
> without regressing somewhere else. You can tune your system somehow but
> this will be fragile no mater what.
> 
> Sorry to say that but 32b systems with tons of memory are far from
> priority of most mm people. Just use 64b kernel. There are more pressing
> problems to deal with.
> 



Hi, I'm attaching below all my settings from /proc/sys/vm.

I think that the regression also affects 4 GB and 8 GB RAM i386 systems,
but not in an exponential manner; i.e. copies there appear only 2-3
times slower than they used to be in 3.x kernels.

Now, I don't know the kernel internals, but if disk copies show up as
2-3 times slower, and the regression is in memory management, wouldn't
that mean that the memory management itself is *hundreds* of times slower,
for it to show up in disk-writing benchmarks?

I.e. I'm afraid that this regression doesn't affect 16+ GB RAM systems 
only; it just happens that it's clearly visible there.

And it might even affect 64bit systems with even more RAM; but I don't 
have any such system to test with.

Kind regards,
Alkis


root@pc:/proc/sys/vm# grep . *
admin_reserve_kbytes:8192
block_dump:0
compact_unevictable_allowed:1
dirty_background_bytes:0
dirty_background_ratio:10
dirty_bytes:0
dirty_expire_centisecs:1500
dirty_ratio:20
dirtytime_expire_seconds:43200
dirty_writeback_centisecs:1500
drop_caches:3
extfrag_threshold:500
highmem_is_dirtyable:0
hugepages_treat_as_movable:0
hugetlb_shm_group:0
laptop_mode:0
legacy_va_layout:0
lowmem_reserve_ratio:256	32	32
max_map_count:65530
min_free_kbytes:34420
mmap_min_addr:65536
mmap_rnd_bits:8
nr_hugepages:0
nr_overcommit_hugepages:0
nr_pdflush_threads:0
oom_dump_tasks:1
oom_kill_allocating_task:0
overcommit_kbytes:0
overcommit_memory:0
overcommit_ratio:50
page-cluster:3
panic_on_oom:0
percpu_pagelist_fraction:0
stat_interval:1
swappiness:60
user_reserve_kbytes:131072
vdso_enabled:1
vfs_cache_pressure:100
watermark_scale_factor:10
Comment 5 Michal Hocko 2017-06-23 11:38:41 UTC
On Fri 23-06-17 10:44:36, Alkis Georgopoulos wrote:
> On 23/06/2017 10:13 AM, Michal Hocko wrote:
> >On Thu 22-06-17 12:37:36, Andrew Morton wrote:
> >
> >What is your dirty limit configuration. Is your highmem dirtyable
> >(highmem_is_dirtyable)?
> >
> >>>This issue happens on systems with any 4.x kernel, i386 arch, 16+ GB RAM.
> >>>It doesn't happen if we use 3.x kernels (i.e. it's a regression) or any
> 64bit
> >>>kernels (i.e. it only affects i386).
> >
> >I remember we've had some changes in the way how the dirty memory is
> >throttled and 32b would be more sensitive to those changes. Anyway, I
> >would _strongly_ discourage you from using 32b kernels with that much of
> >memory. You are going to hit walls constantly and many of those issues
> >will be inherent. Some of them less so but rather non-trivial to fix
> >without regressing somewhere else. You can tune your system somehow but
> >this will be fragile no mater what.
> >
> >Sorry to say that but 32b systems with tons of memory are far from
> >priority of most mm people. Just use 64b kernel. There are more pressing
> >problems to deal with.
> >
> 
> 
> 
> Hi, I'm attaching below all my settings from /proc/sys/vm.
> 
> I think that the regression also affects 4 GB and 8 GB RAM i386 systems, but
> not in an exponential manner; i.e. copies there are appear only 2-3 times
> slower than they used to be in 3.x kernels.

If the regression shows with 4-8GB 32b systems then the priority for
fixing it would certainly be much higher.

> Now I don't know the kernel internals, but if disk copies show up to be 2-3
> times slower, and the regression is in memory management, wouldn't that mean
> that the memory management is *hundreds* of times slower, to show up in disk
> writing benchmarks?

Well, it is hard to judge what the real problem is here, but you have
to realize that a 32b system has some fundamental issues which come from
how the memory is split between kernel (lowmem - 896MB at maximum) and
highmem. The more memory you have, the more lowmem is consumed by kernel
data structures. Just consider that ~160MB of this space is eaten by
struct pages to describe 16GB of memory. There are other data structures
which can only live in low memory.
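(As a back-of-the-envelope check of that ~160MB figure, assuming 4 KiB pages and roughly 40 bytes per struct page on a 32-bit build:)

# 16 GiB / 4 KiB per page = 4,194,304 pages, each described by a struct page in lowmem
pages=$(( 16 * 1024 * 1024 / 4 ))          # MemTotal in KiB divided by the 4 KiB page size
echo $(( pages * 40 / 1024 / 1024 )) MiB    # ~160 MiB of lowmem just for the page array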

> I.e. I'm afraid that this regression doesn't affect 16+ GB RAM systems only;
> it just happens that it's clearly visible there.
> 
> And it might even affect 64bit systems with even more RAM; but I don't have
> any such system to test with.

Not really. 64b systems do not need the kernel/userspace split because the
address space is large enough. If there are any regressions since 3.0 then
we are certainly interested in hearing about them.
 
> root@pc:/proc/sys/vm# grep . *
> dirty_ratio:20
> highmem_is_dirtyable:0

This means that highmem is not dirtyable, so only 20% of the free
lowmem (+ page cache in that region) is considered, and writers might
get throttled quite early (this might be a really low number when
lowmem is already congested). Do you see the same problem when enabling
highmem_is_dirtyable = 1?
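(One way to flip it, temporarily and then persistently; the sysctl.d file name is just a convention:)

# for the current boot only:
sysctl -w vm.highmem_is_dirtyable=1
# to keep it across reboots:
echo 'vm.highmem_is_dirtyable = 1' > /etc/sysctl.d/99-highmem-dirtyable.conf
sysctl --system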
Comment 6 Alkis Georgopoulos 2017-06-26 05:28:16 UTC
On 23/06/2017 02:38 PM, Michal Hocko wrote:
> this means that the highmem is not dirtyable and so only 20% of the free
> lowmem (+ page cache in that region) is considered and writers might
> get throttled quite early (this might be a really low number when the
> lowmem is congested already). Do you see the same problem when enabling
> highmem_is_dirtyable = 1?
> 

Excellent advice! :)
Indeed, setting highmem_is_dirtyable=1 completely eliminates the issue!

Is that something that should be =1 by default, i.e. should I notify the
Ubuntu developers that the defaults they ship aren't appropriate,
or is it something that only owners of 16+ GB RAM systems should adjust in
their local configuration?

Thanks a lot!
Results of 2 test runs, with highmem_is_dirtyable=0 and 1:

1) echo 0 > highmem_is_dirtyable:
-----------------------------------------------------------------------------
16.04.2 LTS (Xenial Xerus), 4.8.0-56-generic, i386, RAM=16292548
-----------------------------------------------------------------------------
Copying /lib to 1: 18.60
Copying 1 to 2: 6.09
Copying 2 to 3: 6.04
Copying 3 to 4: 7.04
Copying 4 to 5: 6.28
Copying 5 to 6: 5.03
Copying 6 to 7: 6.50
Copying 7 to 8: 4.82
Copying 8 to 9: 5.49
Copying 9 to 10: 5.88
Copying 10 to 11: 5.09
Copying 11 to 12: 5.70
Copying 12 to 13: 5.19
Copying 13 to 14: 4.55
Copying 14 to 15: 4.69
Copying 15 to 16: 4.76
Copying 16 to 17: 5.38
Copying 17 to 18: 4.59
Copying 18 to 19: 4.26
Copying 19 to 20: 4.47
Copying 20 to 21: 4.32
Copying 21 to 22: 4.33
Copying 22 to 23: 5.55
Copying 23 to 24: 4.73
Copying 24 to 25: 4.80
Copying 25 to 26: 5.06
Copying 26 to 27: 16.84
Copying 27 to 28: 5.28
Copying 28 to 29: 5.45
Copying 29 to 30: 12.35
Copying 30 to 31: 5.90
Copying 31 to 32: 4.90
Copying 32 to 33: 4.76
Copying 33 to 34: 4.37
Copying 34 to 35: 5.82
Copying 35 to 36: 4.55
Copying 36 to 37: 8.80
Copying 37 to 38: 5.07
Copying 38 to 39: 5.69
Copying 39 to 40: 4.88
Copying 40 to 41: 5.26
Copying 41 to 42: 4.69
Copying 42 to 43: 5.10
Copying 43 to 44: 4.79
Copying 44 to 45: 4.54
Copying 45 to 46: 7.46
Copying 46 to 47: 5.54
Copying 47 to 48: 4.86
Copying 48 to 49: 6.12
Copying 49 to 50: 5.37
Copying 50 to 51: 7.63
Copying 51 to 52: 6.37
Copying 52 to 53: 5.81
...

2) echo 1 > highmem_is_dirtyable:
-----------------------------------------------------------------------------
16.04.2 LTS (Xenial Xerus), 4.8.0-56-generic, i386, RAM=16292548
-----------------------------------------------------------------------------
Copying /lib to 1: 21.47
Copying 1 to 2: 5.54
Copying 2 to 3: 6.63
Copying 3 to 4: 4.69
Copying 4 to 5: 5.38
Copying 5 to 6: 8.50
Copying 6 to 7: 9.34
Copying 7 to 8: 8.78
Copying 8 to 9: 9.48
Copying 9 to 10: 10.89
Copying 10 to 11: 10.52
Copying 11 to 12: 11.28
Copying 12 to 13: 14.70
Copying 13 to 14: 17.71
Copying 14 to 15: 52.43
Copying 15 to 16: 92.52
...
Comment 7 Michal Hocko 2017-06-26 05:46:31 UTC
On Mon 26-06-17 08:28:07, Alkis Georgopoulos wrote:
> On 23/06/2017 02:38 PM, Michal Hocko wrote:
> >this means that the highmem is not dirtyable and so only 20% of the free
> >lowmem (+ page cache in that region) is considered and writers might
> >get throttled quite early (this might be a really low number when the
> >lowmem is congested already). Do you see the same problem when enabling
> >highmem_is_dirtyable = 1?
> >
> 
> Excellent advice! :)
> Indeed, setting highmem_is_dirtyable=1 completely eliminates the issue!
> 
> Is that something that should be =1 by default,

Unfortunately, this is not something that can be applied in general.
This can lead to premature OOM killer invocations. E.g. a direct write
to the block device cannot use highmem, yet there won't be anything to
throttle those writes properly. Unfortunately, our documentation is
silent about this setting. I will post a patch later.
Comment 8 Alkis Georgopoulos 2017-06-26 07:02:31 UTC
On 26/06/2017 08:46 AM, Michal Hocko wrote:
> Unfortunatelly, this is not something that can be applied in general.
> This can lead to a premature OOM killer invocations. E.g. a direct write
> to the block device cannot use highmem, yet there won't be anything to
> throttle those writes properly. Unfortunately, our documentation is
> silent about this setting. I will post a patch later.


I should also note that highmem_is_dirtyable was 0 in all the 3.x kernel 
tests that I did; yet they didn't have the "slow disk writes" issue.

I.e. I think that setting highmem_is_dirtyable=1 works around the issue, 
but is not the exact point which caused the regression that we see in 
4.x kernels...

--
Kind regards,
Alkis Georgopoulos
Comment 9 Michal Hocko 2017-06-26 09:12:59 UTC
On Mon 26-06-17 10:02:23, Alkis Georgopoulos wrote:
> On 26/06/2017 08:46 AM, Michal Hocko wrote:
> >Unfortunatelly, this is not something that can be applied in general.
> >This can lead to a premature OOM killer invocations. E.g. a direct write
> >to the block device cannot use highmem, yet there won't be anything to
> >throttle those writes properly. Unfortunately, our documentation is
> >silent about this setting. I will post a patch later.
> 
> 
> I should also note that highmem_is_dirtyable was 0 in all the 3.x kernel
> tests that I did; yet they didn't have the "slow disk writes" issue.

Yes, this is possible. There were some changes in the dirty memory
throttling that could lead to visible behavior changes. I remember that
ab8fabd46f81 ("mm: exclude reserved pages from dirtyable memory") had a
noticeable effect. The patch is something that we really want and it is
unfortunate that it has eaten some more of the dirtyable lowmem.

> I.e. I think that setting highmem_is_dirtyable=1 works around the issue, but
> is not the exact point which caused the regression that we see in 4.x
> kernels...

Yes, as I've said, this is a workaround for something that is an
inherent 32b lowmem/highmem issue.
Comment 10 Alkis Georgopoulos 2017-06-29 06:15:00 UTC
I've been working on a system with highmem_is_dirtyable=1 for a couple
of hours.

While the disk benchmark showed no performance hit on intense disk
activity, there are other serious problems that make this workaround
unusable.

I.e. when there's intense disk activity, the mouse cursor moves with
extreme lag, like 1-2 fps. Switching with alt+tab from e.g. Thunderbird
to Pidgin takes 10 seconds. kswapd hits 100% CPU usage. Etc. etc., the
system becomes unusable until the disk activity settles down.
I was testing via SSH so I hadn't noticed the extreme lag.

All those symptoms go away when resetting highmem_is_dirtyable=0.

So currently 32bit installations with 16 GB RAM have no option but to
remove the extra RAM...
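(A less drastic variant of that last option would be to cap the usable RAM from the boot loader instead of physically removing it; untested here, sketched only for completeness:)

# in /etc/default/grub, limit the memory the kernel will use (example value):
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash mem=8G"
# then regenerate the grub configuration and reboot:
update-grub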


About ab8fabd46f81 ("mm: exclude reserved pages from dirtyable memory"),
would it make sense for me to compile a kernel and test if everything
works fine without it? I.e. if we see that this caused all those
regressions, would it be revisited?

And an unrelated idea, is there any way to tell linux to use a limited
amount of RAM for page cache, e.g. only 1 GB?

Kind regards,
Alkis Georgopoulos
Comment 11 Michal Hocko 2017-06-29 07:16:25 UTC
On Thu 29-06-17 09:14:55, Alkis Georgopoulos wrote:
> I've been working on a system with highmem_is_dirtyable=1 for a couple
> of hours.
> 
> While the disk benchmark showed no performance hit on intense disk
> activity, there are other serious problems that make this workaround
> unusable.
> 
> I.e. when there's intense disk activity, the mouse cursor moves with
> extreme lag, like 1-2 fps. Switching with alt+tab from e.g. thunderbird
> to pidgin needs 10 seconds. kswapd hits 100% cpu usage. Etc etc, the
> system becomes unusable until the disk activity settles down.
> I was testing via SSH so I hadn't noticed the extreme lag.
> 
> All those symptoms go away when resetting highmem_is_dirtyable=0.
> 
> So currently 32bit installations with 16 GB RAM have no option but to
> remove the extra RAM...

Or simply install a 64b kernel. You can keep 32b userspace if you need
it, but running a 32b kernel will always be a fight.
 
> About ab8fabd46f81 ("mm: exclude reserved pages from dirtyable memory"),
> would it make sense for me to compile a kernel and test if everything
> works fine without it? I.e. if we see that this caused all those
> regressions, would it be revisited?

The patch makes a lot of sense in general. I do not think we will revert
it based on a configuration which is rare. We might come up with some
tweaks in the dirty memory throttling, but that area is quite tricky
already. You can of course try to test without this commit applied (I
believe you would have to check out ab8fabd46f81 and revert the commit
there, because a later revert sounds more complicated to me. I might be
wrong here because I haven't tried that myself though).
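(Roughly, that test could look like this; checking out the parent commit is equivalent to checking out ab8fabd46f81 and reverting it. A sketch only:)

# build two kernels around the commit in question and compare the copy benchmark
git clone https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
cd linux
git checkout -b with-commit ab8fabd46f81        # tree as of that commit
git checkout -b without-commit ab8fabd46f81~1   # same tree, one commit earlier
# build, install and boot each branch in turn, then rerun the /lib copy loop on both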

> And an unrelated idea, is there any way to tell linux to use a limited
> amount of RAM for page cache, e.g. only 1 GB?

No.
Comment 12 Alkis Georgopoulos 2017-06-29 08:02:39 UTC
On 29/06/2017 10:16 AM, Michal Hocko wrote:
> 
> Or simply install 64b kernel. You can keep 32b userspace if you need
> it but running 32b kernel will be always a fight.

Results with 64bit kernel on 32bit userspace:
16.04.2 LTS (Xenial Xerus), 4.4.0-83-generic, i386, RAM=16131400
Copying /lib to 1: 27.00
Copying 1 to 2: 9.37
Copying 2 to 3: 8.80
Copying 3 to 4: 9.13
Copying 4 to 5: 9.25
Copying 5 to 6: 8.08
Copying 6 to 7: 8.00
Copying 7 to 8: 8.85
Copying 8 to 9: 8.67
Copying 9 to 10: 8.55
Copying 10 to 11: 8.67
Copying 11 to 12: 8.15
Copying 12 to 13: 7.57
Copying 13 to 14: 8.05
Copying 14 to 15: 8.22
Copying 15 to 16: 8.35
Copying 16 to 17: 8.50
Copying 17 to 18: 8.30
Copying 18 to 19: 7.97
Copying 19 to 20: 7.81
Copying 20 to 21: 7.11
Copying 21 to 22: 8.20
Copying 22 to 23: 7.54
Copying 23 to 24: 7.96
Copying 24 to 25: 8.04
Copying 25 to 26: 7.87
Copying 26 to 27: 7.70
Copying 27 to 28: 8.33
Copying 28 to 29: 6.88
Copying 29 to 30: 7.18

It doesn't have the 32bit slowness issue, and it's "only" 2 times slower
than the full 64bit installation (so maybe there's an additional delay
involved somewhere in userspace)...
...but it's also hard to set up (e.g. Ubuntu doesn't allow a 4.8 32bit
kernel to coexist with a 4.8 64bit one because they have the same file names,
so the 64bit kernel needs to be 4.4),
and it doesn't run some applications, e.g. VirtualBox or the proprietary
nvidia drivers...


Thank you very much for your continuous input on this. We'll see what we
can do to avoid the issue locally, probably just tell sysadmins to avoid
using -pae kernels with more than 8 GB RAM.
Comment 13 reserv0 2017-07-21 10:04:15 UTC
Greetings,

I'd like to point out that this bug seems identical to the one I reported here:
https://bugzilla.kernel.org/show_bug.cgi?id=110031

While setting vm.highmem_is_dirtyable=1 is indeed a workaround, it is in no way the reason for this bug, which was introduced in Linux v4.2.0 (v4.0.x and v4.1.x work perfectly fine in this respect, and with vm.highmem_is_dirtyable=0, so it's not related to any change between Linux v3 and Linux v4).

Another notable fact is that even 64-bit kernels were affected by this bug until Linux v4.8.4 came out, at which point the 64-bit compiled kernel worked fine again for me, while the 32-bit one kept crawling after about 4 GB of disk writes occurred.
Note also that this bug seems glibc-version-dependent, since one Linux distro (Rosa 2012 64-bit, a Mandriva fork) was affected while another (PCLinuxOS, also a 64-bit distro forked from Mandriva) was not on my system (with the same kernel configuration).

Definitely, there's something fishy in one of the v4.2 commits that badly broke the disk write and/or disk caching code.

I am extremely worried that after v4.1 becomes unmaintained, people who, like me, need to run some old 32-bit systems on their computer (e.g. to compile 32-bit binaries compatible with old systems) will be left without a solution other than running an outdated kernel with potential security holes.

Please, pretty please, do not give up on fixing this show stopper bug...
Comment 14 Alkis Georgopoulos 2017-07-21 14:16:36 UTC
I found a very nice summary of the problem, along with all the known workarounds, there:
http://flaterco.com/kb/PAE_slowdown.html

The author says that recompiling the kernel with this option turned on solved the issue for him:
VMSPLIT_2G:  Processor type and features → Memory split = 2G/2G user/kernel split (was 3G/1G user/kernel split)

I wonder, since 32-bit kernels with 8+ GB RAM are unusable anyway (as is also mentioned in Documentation/vm/highmem.txt), would it be possible to change that setting upstream in the kernel, so that it enables itself at runtime if it detects more than 8 GB RAM?
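For reference, the rebuild he describes would look roughly like this (a sketch; the source-tree path and starting config are placeholders):

# starting from the running kernel's config, switch the memory split to 2G/2G
cd /path/to/linux-source
cp /boot/config-"$(uname -r)" .config
scripts/config --disable VMSPLIT_3G --enable VMSPLIT_2G
make olddefconfig
make -j"$(nproc)" bzImage modules
# menuconfig path: Processor type and features -> Memory split -> 2G/2G user/kernel split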
Comment 15 reserv0 2017-07-21 14:42:19 UTC
(In reply to Alkis Georgopoulos from comment #14)
> I found a very nice summary of the problem, along with all the known
> workarounds, there:
> http://flaterco.com/kb/PAE_slowdown.html
> 
> The author says that recompiling the kernel with this option on, solved the
> issue for him:
> VMSPLIT_2G:  Processor type and features → Memory split = 2G/2G user/kernel
> split (was 3G/1G user/kernel split)

The problem is that this split causes many memory-hungry applications to become unusable, since they are already in a very tight fit with only 3 GB available for their virtual address space. So, again, such a workaround is totally unsuitable.

No, the only solution is to find the commit which is the culprit for that *regression*, which occurred between v4.1 and v4.2.0, and to plainly and simply revert it.
Comment 16 Alkis Georgopoulos 2017-08-18 14:11:18 UTC
Well, if my following thoughts are correct,
 1) No one will work on pinpointing the "commit which is the culprit". Or, it can't be pinpointed because it's not a single commit but a collection of essential commits that need more and more low memory,
 2) Setting Memory split = 2G/2G does solve the issue,
 3) And that causes a regression with apps that need more than 2 GB of address space,

then personally I prefer that to what we currently have.

I believe the impact now is far more grave than some rare application that would need more than 2 GB under i386.
Comment 17 reserv0 2017-08-28 21:56:38 UTC
(In reply to Alkis Georgopoulos from comment #16)

> Well, if my following thoughts are correct,
>  1) Noone will work on pinpointing the "commit with is the culprit". Or, it
> can't be pinointed because it's not a single commit but a collection of
> essential commits that need more and more low memory,

I'm no Linux kernel expert, but in addition to the fact that the bug was introduced in v4.2.0, there's also the fact that this bug did affect 64-bit kernels for me (on a Rosa 2012 64-bit installation, on the same computer with 32 GB of RAM) *and* that it got fixed for 64-bit kernels in Linux v4.8.4 (see my comment dated 2017-07-21 10:04:15 UTC)... This should help someone knowledgeable in this field narrow down dramatically the number of potential causes and bogus code changes for this bug.

>  2) Setting Memory split = 2G/2G does solve the issue,

No, it's just a work around.

>  3) And that causes a regression with apps that need more than 2 GB of
> address space,> then, personally I prefer that, to what we currently have.
> 
> I believe the impact now is far more grave than some rare application that
> would need more than 2 GB under i386.

Please, do not judge from your particular needs... It may be suitable for *your* needs, but it is not for mine.
Comment 18 reserv0 2017-09-19 22:33:58 UTC
Bug still present for 32 bits kernel in v4.13.2
Comment 19 reserv0 2017-11-24 18:02:37 UTC
Bug still present for 32 bits kernel in v4.14.2.
Comment 20 reserv0 2018-04-19 14:23:47 UTC
Bug still present for 32 bits kernel in v4.16.3, and with the "end of life" of v4.1 getting close, I'm worried I will be left without any option to run a maintained Linux kernel on 32 bits machines with 16 Gb of memory or more...
Comment 21 Andrew Morton 2018-04-19 20:36:39 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

https://bugzilla.kernel.org/show_bug.cgi?id=196157

People are still hurting from this.  It does seem a pretty major
regression for highmem machines.

I'm surprised that we aren't hearing about this from distros.  Maybe it
only affects a subset of highmem machines?

Anyway, can we please take another look at it?  Seems that we messed up
highmem dirty pagecache handling in the 4.2 timeframe.

Thanks.
Comment 22 reserv0 2018-04-20 07:55:35 UTC
On Thu, 4/19/18, Andrew Morton <akpm@linux-foundation.org> wrote:

> (switched to email.  Please respond via emailed reply-to-all, not via the
> bugzilla web interface).
>
> https://bugzilla.kernel.org/show_bug.cgi?id=196157
>
> People are still hurting from this.  It does seem a pretty major
> regression for highmem machines.
>
> I'm surprised that we aren't hearing about this from distros.  Maybe it
> only affects a subset of highmem machines?

Supposition: could it be that it only affects distros with a given glibc version (my
affected machines run glibc v2.13)?

Please also take note that I encountered this bug on the 64-bit flavor of the
same distro (Rosa 2012), on 64-bit capable machines, with Linux v4.2+ and
until Linux v4.8.4 was released (and another interesting fact is that another
64-bit distro on the same machines was not affected at all by that bug,
which would reinforce my suspicion about a glibc-triggered and
glibc-version-dependent bug).

> Anyway, can we please take another look at it?  Seems that we messed up
> highmem dirty pagecache handling in the 4.2 timeframe.

Oh, yes, please, do have a look ! :-D

In the meantime, could you guys also consider extending the lifetime of the
v4.1 kernel until this ***showstopper*** bug is resolved in the mainline kernel
version?

Many (many, many, many) thanks in advance !
Comment 23 reserv0 2018-06-04 14:13:53 UTC
Bug still present in v4.17.0, and with v4.1.52 now being marked "end of life", I will be left without any upgrade path...

Pretty please, consider keeping v4.1 live till this *showstopper* bug gets solved !
Comment 24 reserv0 2018-08-17 09:01:48 UTC
Bug still present for 32 bits kernel in v4.18.1, and now, v4.1 (last working Linux kernel for 32 bits machines with 16Gb or more RAM) has gone unmaintained...
Comment 25 Michal Hocko 2018-08-17 09:29:31 UTC
On Fri 17-08-18 09:01:41, Thierry wrote:
> Bug still present for 32 bits kernel in v4.18.1, and now, v4.1 (last
> working Linux kernel for 32 bits machines with 16Gb or more RAM) has
> gone unmaintained...

Have you tried to set highmem_is_dirtyable as suggested elsewhere?

I would like to stress that 16GB with 32b kernels doesn't really play
nice. Even small changes (a larger kernel memory footprint) can
lead to all sorts of problems. I would really recommend using 64b
kernels instead. There shouldn't be any real reason to stick with a 32b
highmem-based kernel for such a large beast. I strongly doubt the CPU
itself is 32b only.
Comment 26 reserv0 2018-08-17 11:29:51 UTC
On Fri, 8/17/18, Michal Hocko <mhocko@kernel.org> wrote:

> Have you tried to set highmem_is_dirtyable as suggested elsewhere?

I tried everything, and yes, that too, to no avail. The only solution is to limit the
available RAM to less than 12Gb, which is just unacceptable for me.
 
> I would like to stress out that 16GB with 32b kernels doesn't play really
> nice.

I would like to stress that 32 GB of RAM played totally nice and very smoothly
with v4.1 and older kernels... This got broken in v4.2 and has never been repaired since.
This is a very nasty regression, and my suggestion to keep v4.1 maintained till
that regression finally gets worked around fell on deaf ears...

> Even small changes (larger kernel memory footprint) can lead to all sorts of
> problems. I would really recommend using 64b kernels instead. There shouldn't
> be
> any real reason to stick with 32bhighmem based  kernel for such a large
> beast.
> I strongly doubt the cpu itself would be 32b only.

The reasons are many (one of them being the need to run old 32-bit
Linux distros, but without the bugs and security flaws of old, unmaintained kernels).

But the reasons are not the problem here. The problem is that v4.2 introduced a
bug (*) that has never been fixed since.

A shame, really. :-(

(*) and that bug also affected 64 bits kernels, at first, mind you, till v4.8.4 got
released; see my comment in my initial report here:
https://bugzilla.kernel.org/show_bug.cgi?id=110031#c14
Comment 27 Michal Hocko 2018-08-17 14:46:47 UTC
On Fri 17-08-18 11:29:45, Thierry wrote:
> On Fri, 8/17/18, Michal Hocko <mhocko@kernel.org> wrote:
> 
> > Have you tried to set highmem_is_dirtyable as suggested elsewhere?
> 
> I tried everything, and yes, that too, to no avail. The only solution is to
> limit the
> available RAM to less than 12Gb, which is just unacceptable for me.
>  
> > I would like to stress out that 16GB with 32b kernels doesn't play really
> nice.
> 
> I would like to stress out that 32 Gb of RAM played totally nice and very
> smoothly
> with v4.1 and older kernels... This got broken in v4.2 and never repaired
> since.
> This is a very nasty regression, and my suggestion to keep v4.1 maintained
> till
> that regression would finally get worked around fell into deaf ears...
> 
> > Even small changes (larger kernel memory footprint) can lead to all sorts
> of
> > problems. I would really recommend using 64b kernels instead. There
> shouldn't be
> > any real reason to stick with 32bhighmem based  kernel for such a large
> beast.
> > I strongly doubt the cpu itself would be 32b only.
> 
> The reasons are many (one of them dealing with being able to run old 32 bits
> Linux distros but without the bugs and security flaws of old, unmaintained
> kernels).

You can easily run a 32b distribution on top of a 64b kernel.

> But the reasons are not the problem here. The problem is that v4.2 introduced
> a
> bug (*) that was never fixed since.
> 
> A shame, really. :-(

Well, I guess nobody is disputing that this is really annoying. I do
agree! On the other hand, nobody has come up with an acceptable solution. I would
love to dive into solving this, but there are so many other things to
work on with much higher priority. Really, my todo list is huge and
growing. 32b kernels with that much memory are simply not all that high
on that list because there is a clear possibility of running a 64b kernel
on hardware which supports it.

I fully understand your frustration and feel sorry about that, but there
are only so many of us working on this subsystem. If you are willing to dive
into this then by all means. I am pretty sure you will find a word of
help and support, but be warned this is not really trivial.

good luck
Comment 28 Tijs 2018-10-16 07:27:28 UTC
Created attachment 279043 [details]
Disk read/write speed

Disk (write) speed is normal, until midnight when updatedb/file integrity check runs
Comment 29 Tijs 2018-10-16 07:29:16 UTC
Created attachment 279045 [details]
Disk operations per second

Disk operations per second show (only) a slight decrease after updatedb
Comment 30 Tijs 2018-10-16 07:30:28 UTC
Created attachment 279047 [details]
Disk merged operations per second

Disk merged operations seem to fail after updatedb
Comment 31 Tijs 2018-10-16 07:57:54 UTC
Just want to comment on this... I'm testing a 4.14.71 32-bit kernel with 12GB of RAM on an Intel Xeon E5620 (2.4GHz) with 8 cores/threads.

I run a cp -a/rm of a /lib directory to tmp on the same filesystem in a loop.

Everything runs smoothly until lots of inodes/dentries are visited on the hard disk due to the updatedb/file integrity check at midnight. It is depicted quite clearly in the images I attached. For some reason the merged disk operations seem to fail after that, resulting in poor write performance.

When updatedb runs at midnight, netdata shows an increase in slab memory until it tops out at a maximum of 275MB, while kernel stack is around 3MB and page cache is around 17MB. The slab memory shows an unreclaimable part of 47MB (constant) and a reclaimable part of 228MB (dynamic).

When I flush the dentry/inode cache, the system goes back to normal behavior:

echo 2 > /proc/sys/vm/drop_caches

During the slowdown, free -lm shows a decrease of available low memory, as if memory is being allocated constantly (and never freed) even though updatedb etc. had finished, and slab memory tops out at 275MB.

Memory at slow down:

             total       used       free     shared    buffers     cached
Mem:         12164       6034       6130          0        272       2044
Low:           631        575         55
High:        11533       5458       6074
-/+ buffers/cache:       3716       8448
Swap:         1991          0       1991

Memory after dentry/inode cache flush:
             total       used       free     shared    buffers     cached
Mem:         12164       4036       8127          0        279        236
Low:           631        408        223
High:        11533       3628       7904
-/+ buffers/cache:       3521       8643
Swap:         1991          0       1991

Some 2GB of high memory is freed at the same time when the cache is flushed.
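For anyone trying to catch this in the act, a small monitoring loop like the following (just a sketch) run next to the copy loop shows the lowmem/slab numbers above as they evolve:

# watch lowmem, slab and dirty/writeback state every 10 seconds
while sleep 10; do
    date
    grep -E 'LowFree|SReclaimable|SUnreclaim|Dirty|Writeback' /proc/meminfo
    free -lm | head -n 4
done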
Comment 32 Sebastian 2019-08-02 10:13:39 UTC
I just wanted to confirm that this bug affected me when I tried to make a backup image on an i386 kernel 4.x with only 8GB of RAM. Write speeds to SSD became as slow as ~2MB/sec, on all 3 SSDs connected to the system, even on SSDs that weren't participating in the backup. After troubleshooting for much of the night, and reproducing the bug every time, I finally found the right hint, booted amd64, and everything worked ok.

The problem is that the naive expectation is that i386 should be legacy, stable, good enough - or even preferred - for a simple job like imaging a disk on a 5 year old Intel PC (Core i7-3520M). It's odd to find out that the opposite is the case.
Comment 33 Sebastian 2019-08-02 17:08:08 UTC
P.S. Some more details:

- Thinkpad T430s with i7-3520M, 8GB RAM, and 3 SSDs (1x mSATA, 2x SATA)
- OS: Parted Magic 2019-05, default options (legacy BIOS boot, 32-bit distro loaded entirely to RAM)
- Started Clonezilla from within the Parted Magic desktop

Of course, running a Linux Desktop entirely in RAM just to use text-based Clonezilla is rather resource-inefficient to begin with. I see the folly of that. But the idea was that I could multi-task easily that way, e.g. watch CPU loads and perhaps use gparted; no web browser or network traffic was involved. 

On the plus side of things, this method was also a great way to reproduce and analyze the bug, and if I had *not* used the graphical desktop, I probably couldn't have (easily) tested read and write rates and other performance parameters (CPU) during and after Clonezilla's work. Multi-tasking without a desktop is a little beyond my typical routine; without a desktop I would typically just use a single terminal window (root shell?).
 
Most revealing troubleshooting:

- Write speed troubleshooting with dd to various test files showed the slowdown
- Troubleshooting read rates with hdparm showed the read rates remained fast

I am tempted to do some more testing, such as:

- once the bug manifests itself again, check whether writes to /dev/null are also slowed down, seeing as it slowed down all disk writes permanently, with no recovery unless rebooted.

- any other suggestions for test devices I could write to with dd to perhaps narrow down the bug?

Thanks!
Comment 34 Sebastian 2019-08-02 20:04:58 UTC
Kernel: 5.1.5-pmagic64

dd writes are: 

19.2 GB/s from /dev/zero to /dev/null
194 MB/s from /dev/urandom to /dev/null

182 MB/s from /dev/zero to /media/sda2/test64
97.3 MB/s from /dev/urandom to /media/sda2/test64



Kernel: 5.1.5-pmagic (i386)

dd writes are: 

33.8 GB/s from /dev/zero to /dev/null
149 MB/s from /dev/urandom to /dev/null

147 MB/s from /dev/zero to /media/sda2/test32
72.3 MB/s from /dev/urandom to /media/sda2/test32

Now, after writing about 5GB with dd to /media/sda2/test32, during which the write rate apparently dropped steeply (had to cancel after 20 minutes), performance remains low as follows:

32.7 GB/s from /dev/zero to /dev/null (same)
149 MB/s from /dev/urandom to /dev/null (same)

1.0 MB/s from /dev/zero to /media/sda2/test32 (150x lower)
1.1 MB/s from /dev/urandom to /media/sda2/test32 (70x lower)
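For anyone who wants to repeat this kind of test, a sketch of equivalent dd runs (the exact flags aren't shown above; conv=fdatasync makes dd include the final flush in the reported rate):

# raw pipe throughput, no disk involved:
dd if=/dev/zero    of=/dev/null bs=1M count=10240 status=progress
dd if=/dev/urandom of=/dev/null bs=1M count=2048  status=progress
# actual writes to the mounted SSD, flushed at the end:
dd if=/dev/zero    of=/media/sda2/test32 bs=1M count=5120 conv=fdatasync status=progress
dd if=/dev/urandom of=/media/sda2/test32 bs=1M count=1024 conv=fdatasync status=progress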
Comment 35 fernando 2020-01-11 01:17:36 UTC
Hello everyone, I registered only to post my findings about this bug, in case it helps.

I'm no expert in Linux, nor in English... :-)

Preamble:

Tested on the Lubuntu distro (also on Debian stretch with a 4.xxxx kernel, uninstalled because of this), with kernel:

Linux jerry 5.0.0-37-generic #40~18.04.1-Ubuntu SMP Thu Nov 14 12:06:39 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
HDDs: 4 SATA drives (2 WD, 2 Seagate), surface checked and with no defects.

How to reproduce the bug (in my experience):

cp a couple of terabytes from disk to disk (SATA HDDs in my case), and randomly (sometimes after 5 minutes, sometimes after an hour) the speed will drop to a crawl (1-5 MB/sec); you can check this with iotop.

My findings:

With these options in /etc/sysctl.conf (a quick way to apply and check them follows the list):

vm.dirty_background_ratio=5
vm.dirty_ratio=10
vm.dirty_bytes = 200000000
vm.swappiness=60
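To apply these without a reboot and see what the kernel actually ends up with (note that dirty_ratio/dirty_bytes are alternatives: setting one resets the other to 0, so whichever comes last in the file wins):

# reload the file and verify the effective values
sysctl -p /etc/sysctl.conf
sysctl vm.dirty_background_ratio vm.dirty_ratio vm.dirty_bytes vm.swappiness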

The copy regains speed, up to 75 MB/sec (once every 45 seconds it drops to 0, but it regains speed again after some of the buffers get cleared; you can see this behaviour in htop because the yellow buffers bar at the end of the memory section shrinks, and iotop shows a gain in speed after that).

But if you also add cron tasks like these at the same time (spaced 15 seconds apart, or 30 seconds if you want):

*/1 * * * * /bin/echo 2 > /proc/sys/vm/drop_caches
*/1 * * * * ( sleep 30; /bin/echo 2 > /proc/sys/vm/drop_caches )
*/1 * * * * ( sleep 15; /bin/echo 2 > /proc/sys/vm/drop_caches )
*/1 * * * * ( sleep 45; /bin/echo 2 > /proc/sys/vm/drop_caches )

The copy speed reaches the theoretical max speed of my HDD (about 130 MB/sec) every now and then, so I think the problem is in the disk buffering portion of the kernel, because when the buffer is empty the copy runs at full speed, and then it gets slow again when the buffer fills up.

I hope this helps to find the bug, because I'm not going back to Windows 10.

The bug also seems to be present in FreeBSD... so... probably an open-source driver?

Thanks, and sorry for my poor English; I hope this explanation gets across what I'm trying to describe.

FreeBSD search for similar problems:
https://www.google.com/search?q=freebsd+slow+disk+writes&oq=freebsd+slow+disk+writes
Comment 37 fernando 2020-02-03 20:48:14 UTC
I found a solution that works for every kernel I tried:

Download the source kernel you want from:

https://cdn.kernel.org/pub/linux/kernel/v5.x/

and compile it yourself and ... it works OK!!!!

At the moment I have tried kernels 4.16 and 5.4 with no problem at all.
Comment 38 reserv0 2020-02-20 12:41:13 UTC
(In reply to fernando from comment #37)
> I found a soution that works for every kernel I tried:
> 
> Download the source kernel you want from:
> 
> https://cdn.kernel.org/pub/linux/kernel/v5.x/
> 
> and compile it yourself and ... it works OK!!!!
> 
> At the moment tried with kernel 4.16 and 5.4 and no problem at all.

I'm afraid the problem is NOT solved. The last kernel version I tried (v5.4.4) did see an improvement (i.e. the problem surfaces later, after a larger amount of data has been written; e.g. in my test-case compilation it appears at 50% instead of at 25% of the total compilation), but the problem is still there, and only kernels v4.1.x and older are exempt from it.

Note that the amount of RAM you are using also impacts how fast the problem arises (or whether it will arise at all). I'm using 32 GB here.