Bug 207861 - mremap MAP_ANONYMOUS|MAP_SHARED grow provide bad mapping.
Summary: mremap MAP_ANONYMOUS|MAP_SHARED grow provide bad mapping.
Status: NEW
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Page Allocator (show other bugs)
Hardware: All Linux
: P1 high
Assignee: Andrew Morton
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-05-23 03:53 UTC by phi.debian@gmail.com
Modified: 2020-06-08 05:20 UTC (History)
0 users

See Also:
Kernel Version: 5.4.0-29-generic
Subsystem:
Regression: No
Bisected commit-id:


Attachments

Description phi.debian@gmail.com 2020-05-23 03:53:44 UTC
Hi All,

I bumped into this, in my case it involve MAP_ANONYMOUS|MAP_SHARED.
There is another bug https://bugzilla.kernel.org/show_bug.cgi?id=8691 that may be relatd, but this one involve MAP_GROWSDOWN, while I don't need that. The problem exhibit itself as a Bus Error in the user land.

Here is my test case that is a simple demonstrator, a streamline of my need in my real application.

The test case here exhibit the 'Bad address' that would result in a Bus error if used, adn then I propose a workaround for the one like me who could be blocked by this.

Since there is a work around, I setup a prio to High and not blocking.

=============================================================
#define _GNU_SOURCE  
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>

#include <sys/types.h>
#include <unistd.h>
#include <sys/mman.h>

#define checkaddr(p) access(p,0)
#define strchecka(p) (checkaddr(p),strerror(errno*(errno==EFAULT)))

int main(int c, char **v)
{ int   i;
  char  b[128], *p;
  union { char *p; long  l;}u;
  
  sprintf(b,"pmap %d | grep zero",getpid());

  p=mmap(0, 4096,PROT_READ|PROT_WRITE,MAP_ANONYMOUS|MAP_SHARED, -1, 0);
  if((void*)p == MAP_FAILED)
  { printf("mmap failed\n");
  }
  u.p=p;
  system((printf("After mmap\n"),b));
  printf("p=%#lx p[0]=%c\n",u.l,u.p[0]='a');

  p=mremap(p,4096,8192,MREMAP_MAYMOVE);
  system((printf("After mremap\n"),b));

  u.p=p+4094; *u.p='b';
  printf("p=%#lx p[0]=%c p[4094]=%c\n",u.l,p[0],*u.p);  
  printf("%#lx addr check => %s\n",u.l,strchecka(u.p));

  u.p=p+4096;
  printf("%#lx addr check => %s\n",u.l,strchecka(u.p));


  munmap(p+4096,4096);
  system((printf("After unmap p+4096\n"),b));
  u.p=mmap(p+4096, 4096,PROT_READ|PROT_WRITE,MAP_ANONYMOUS|MAP_SHARED, -1, 0);
  system((printf("After mmap p+4096\n"),b));
  printf("%#lx addr check => %s\n",u.l,strchecka(u.p));

  u.p=p; p[4096]='c';
  printf("p=%#lx p[0]=%c p[4094]=%c p[4096]=%c\n",u.l,u.p[0],p[4094],p[4096]);
  exit(0);
}
=============================================================

And the compile run on 
Linux phiw 5.4.0-29-generic #33-Ubuntu SMP Wed Apr 29 14:32:27 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

PW$ cc -o f2 f2.c

PW$ ./f2
After mmap
00007f022daf6000      4K rw-s- zero (deleted)
p=0x7f022daf6000 p[0]=a
After mremap
00007f022dac8000      8K rw-s- zero (deleted)
p=0x7f022dac8ffe p[0]=a p[4094]=b
0x7f022dac8ffe addr check => Success
0x7f022dac9000 addr check => Bad address
After unmap p+4096
00007f022dac8000      4K rw-s- zero (deleted)
After mmap p+4096
00007f022dac8000      4K rw-s- zero (deleted)
00007f022dac9000      4K rw-s- zero (deleted)
0x7f022dac9000 addr check => Success
p=0x7f022dac8000 p[0]=a p[4094]=b p[4096]=c

=============================================================

This work around  require 3 syscall to get the job done (instead of only one mremap) and end up with 2 mmap adjacent regions instead of only one as mremap would do.

I need this for a powerfull realloc, i.e basically what stated in the man page :)

I found another workaround based on having a file backing store with an unlinked uniq file, then at grow time, an ftruncate() then the mremap() would work (since not MAP_ANONYMOUS) but this doesn't fit my need, I got a high numbber of mmap() regions, and I could not afford to keep all those FDs open for the sake of mremap() grow.

Thanx for any pointer.
Cheers,
Phi
Comment 1 phi.debian@gmail.com 2020-05-23 05:26:25 UTC
While we are at it, is there a way to force a 2 adjacent region coalesce, can mremap() be used for that? may be with a MREMAP_COALESCE flag, this could a way to make the workaround more viable (multiple grow). The current workaround only works once....
Comment 2 Andrew Morton 2020-05-25 00:40:23 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Sat, 23 May 2020 03:53:44 +0000 bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=207861
> 
>             Bug ID: 207861
>            Summary: mremap MAP_ANONYMOUS|MAP_SHARED grow provide bad
>                     mapping.
>            Product: Memory Management
>            Version: 2.5
>     Kernel Version: 5.4.0-29-generic
>           Hardware: All
>                 OS: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: high
>           Priority: P1
>          Component: Page Allocator
>           Assignee: akpm@linux-foundation.org
>           Reporter: phi.debian@gmail.com
>         Regression: No
> 
> Hi All,

Hi.  (Again, please reply by email!)

> I bumped into this, in my case it involve MAP_ANONYMOUS|MAP_SHARED.
> There is another bug https://bugzilla.kernel.org/show_bug.cgi?id=8691 that
> may
> be relatd, but this one involve MAP_GROWSDOWN, while I don't need that. The
> problem exhibit itself as a Bus Error in the user land.
> 
> Here is my test case that is a simple demonstrator, a streamline of my need
> in
> my real application.
> 
> The test case here exhibit the 'Bad address' that would result in a Bus error
> if used, adn then I propose a workaround for the one like me who could be
> blocked by this.
> 
> Since there is a work around, I setup a prio to High and not blocking.

Nice report, thanks.

Yes, it looks like it's the same thing as the 13-year-old
https://bugzilla.kernel.org/show_bug.cgi?id=8691. 
MAP_ANONYMOUS|MAP_SHARED mmaps are backed by shmem and I guess it's
still the case that we don't grow the mapping in mremap() for
shmem-backed.  And returning success from mremap() in this situation
seems flat out rude.

Can folks please take a look?

> =============================================================
> #define _GNU_SOURCE  
> #include <stdlib.h>
> #include <stdio.h>
> #include <string.h>
> #include <errno.h>
> #include <fcntl.h>
> #include <signal.h>
> 
> #include <sys/types.h>
> #include <unistd.h>
> #include <sys/mman.h>
> 
> #define checkaddr(p) access(p,0)
> #define strchecka(p) (checkaddr(p),strerror(errno*(errno==EFAULT)))
> 
> int main(int c, char **v)
> { int   i;
>   char  b[128], *p;
>   union { char *p; long  l;}u;
> 
>   sprintf(b,"pmap %d | grep zero",getpid());
> 
>   p=mmap(0, 4096,PROT_READ|PROT_WRITE,MAP_ANONYMOUS|MAP_SHARED, -1, 0);
>   if((void*)p == MAP_FAILED)
>   { printf("mmap failed\n");
>   }
>   u.p=p;
>   system((printf("After mmap\n"),b));
>   printf("p=%#lx p[0]=%c\n",u.l,u.p[0]='a');
> 
>   p=mremap(p,4096,8192,MREMAP_MAYMOVE);
>   system((printf("After mremap\n"),b));
> 
>   u.p=p+4094; *u.p='b';
>   printf("p=%#lx p[0]=%c p[4094]=%c\n",u.l,p[0],*u.p);  
>   printf("%#lx addr check => %s\n",u.l,strchecka(u.p));
> 
>   u.p=p+4096;
>   printf("%#lx addr check => %s\n",u.l,strchecka(u.p));
> 
> 
>   munmap(p+4096,4096);
>   system((printf("After unmap p+4096\n"),b));
>   u.p=mmap(p+4096, 4096,PROT_READ|PROT_WRITE,MAP_ANONYMOUS|MAP_SHARED, -1,
>   0);
>   system((printf("After mmap p+4096\n"),b));
>   printf("%#lx addr check => %s\n",u.l,strchecka(u.p));
> 
>   u.p=p; p[4096]='c';
>   printf("p=%#lx p[0]=%c p[4094]=%c
>   p[4096]=%c\n",u.l,u.p[0],p[4094],p[4096]);
>   exit(0);
> }
> =============================================================
> 
> And the compile run on 
> Linux phiw 5.4.0-29-generic #33-Ubuntu SMP Wed Apr 29 14:32:27 UTC 2020
> x86_64
> x86_64 x86_64 GNU/Linux
> 
> PW$ cc -o f2 f2.c
> 
> PW$ ./f2
> After mmap
> 00007f022daf6000      4K rw-s- zero (deleted)
> p=0x7f022daf6000 p[0]=a
> After mremap
> 00007f022dac8000      8K rw-s- zero (deleted)
> p=0x7f022dac8ffe p[0]=a p[4094]=b
> 0x7f022dac8ffe addr check => Success
> 0x7f022dac9000 addr check => Bad address
> After unmap p+4096
> 00007f022dac8000      4K rw-s- zero (deleted)
> After mmap p+4096
> 00007f022dac8000      4K rw-s- zero (deleted)
> 00007f022dac9000      4K rw-s- zero (deleted)
> 0x7f022dac9000 addr check => Success
> p=0x7f022dac8000 p[0]=a p[4094]=b p[4096]=c
> 
> =============================================================
> 
> This work around  require 3 syscall to get the job done (instead of only one
> mremap) and end up with 2 mmap adjacent regions instead of only one as mremap
> would do.
> 
> I need this for a powerfull realloc, i.e basically what stated in the man
> page
> :)
> 
> I found another workaround based on having a file backing store with an
> unlinked uniq file, then at grow time, an ftruncate() then the mremap() would
> work (since not MAP_ANONYMOUS) but this doesn't fit my need, I got a high
> numbber of mmap() regions, and I could not afford to keep all those FDs open
> for the sake of mremap() grow.
> 
> Thanx for any pointer.
> Cheers,
> Phi
> 
> -- 
> You are receiving this mail because:
> You are the assignee for the bug.
Comment 3 phi.debian@gmail.com 2020-05-26 11:41:07 UTC
Created attachment 289293 [details]
attachment-16178-0.html

Hi All,

Following up, I think I got the big picture with what you said, and then I
think, that 'may be' there is no real use case for
MAP__ANONYMOUS|MAP_SHARED, well I guess we can even say there is none
otherwise they would have voiced long ago :) I got trapped into it in a
multithreaded app trying to be smart and grow the regions that way :). I
guess this could be closed with a simple little notice in the man page,
eventually in the BUGS section, specifying that using mremap() on
MAP_SHARED is not supported beside the virtual addr aliasing already
mentioned. I got trapped after rereading again and again the fine manual,
and thinking for a minute I must miss something.

Note that MAP__ANONYMOUS|MAP_SHARED works nicely, and this one is very
useful to implement performant 'realloc' without copy for page (multipages)
realloc, with all the precaution that no addr of the memory region got to
be taken due to the vaddr migration potential, that's already mentioned in
the man page.

So the fix is simply docco I guess :)

Cheers,
Phi
Comment 4 phi.debian@gmail.com 2020-06-08 05:20:13 UTC
Little typo in my last note above, I meant MAP_PRIVATE|MAP_ANONYMOUS works fine.

Note You need to log in before you can comment on or make changes to this bug.