Bug 9138 - kernel overwrites MAP_PRIVATE mmap
Status: REJECTED INVALID
Alias: None
Product: Memory Management
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 high
Assignee: Andrew Morton
 
Reported: 2007-10-09 06:28 UTC by Paolo Bonzini
Modified: 2008-01-05 18:28 UTC

Kernel Version: 2.6.20, 2.6.22


Description Paolo Bonzini 2007-10-09 06:28:25 UTC
Most recent kernel where this bug did not occur:
Distribution: Debian 2.6.8
Hardware Environment:
Software Environment:
Problem Description:

Steps to reproduce:

1) Download http://www.inf.unisi.ch/phd/bonzini/smalltalk-2.95d.tar.gz
2) Compile it with "./configure && make CFLAGS=-g" (CFLAGS=-g is only there to make debugging easier; the bug also reproduces without it).
3) Run "./gst"
4) Type "ObjectMemory snapshot"

It crashes. To reproduce again:

5) Run "rm gst.im; make gst.im"
6) Go to step 4.

The code that crashes is in save.c; it dereferences a NULL pointer:

Program received signal SIGBUS, Bus error.
0x08083bf3 in make_oop_table_to_be_saved (header=0xbff29934) at save.c:345
345 int numPointers = NUM_OOPS (oop->object);
(gdb) p oop->object->objClass
$1 = (OOP) 0x0

However, continuing the debugging session shows that the memory is zeroed *by the kernel*:

(gdb) p &oop->object->objClass
$2 = (OOP *) 0xb7cb8784

We set up a breakpoint a little earlier:

(gdb) b 279
Breakpoint 1 at 0x8083a5d: file save.c, line 279.
(gdb) shell rm gst.im; make gst.im
(gdb) run
Starting program: /home/bonzinip/smalltalk-2.95d/gst
GNU Smalltalk ready
st> ObjectMemory snapshot
"Global garbage collection... done"

Breakpoint 1, _gst_save_to_file (
    fileName=0x812a118 "/home/bonzinip/smalltalk-2.95d/gst.im") at save.c:279
279 ftruncate (imageFd, 0);

Now we set a watchpoint on the location that triggered the NULL access:

(gdb) set can-use-hw-watchpoints 0
(gdb) watch *$2
Watchpoint 2: *$2
(gdb) n
Watchpoint 2: *$2

Old value = (OOP) 0x126
New value = (OOP) 0x0
0xb7ee9438 in ftruncate64 () from /lib/libc.so.6

From a disassembly, you can see that it was zeroed by the kernel:

(gdb) disass 0xb7ee9431 0xb7ee94ec
0xb7ee9431 <ftruncate64+49>: mov $0xc2,%eax
0xb7ee9436 <ftruncate64+54>: int $0x80 <---
0xb7ee9438 <ftruncate64+56>: xchg %edi,%ebx
0xb7ee943a <ftruncate64+58>: mov %eax,%esi

I believe the reason is a bad interaction between the private mmap established in save.c:

  buf = mmap (NULL, file_size, PROT_READ, MAP_PRIVATE, imageFd, 0);

and truncating the inode on which the mmap was done. Indeed, if the gst.im file is unlinked before opening it, the bug disappears. You can try this from the Smalltalk interpreter, without patching the source code:

$ ./gst
GNU Smalltalk ready

st> (File name: 'gst.im') remove
a RealFileHandler
st> ObjectMemory snapshot
"Global garbage collection... done"
ObjectMemory
st>

(no bus error anymore).

I hope this long explanation is understandable!
Comment 1 Paolo Bonzini 2007-10-09 06:29:15 UTC
I could reproduce it on both i386 and x86-64.
Comment 2 Anonymous Emailer 2007-10-09 08:39:58 UTC
Reply-To: akpm@linux-foundation.org


(switching to email - please reply via emailed reply-to-all, not via the
bugzilla web interface)


So can you confirm that this behaviour was not present in 2.6.8 but is
present in 2.6.20?

Would it be possible to prevail upon you to cook up a little standalone
testcase?  

Thanks.
Comment 3 Anonymous Emailer 2007-10-09 09:00:27 UTC
Reply-To: paolo.bonzini@lu.unisi.ch


> So can you confirm that this behaviour was not present in 2.6.8 but is
> present in 2.6.20?

Yes.  I also have access to a Debian i686 2.6.22.9 and it shows the bug,
though I am not the one who compiled the kernel on either machine
(neither the i686 nor the x86-64).

> Would it be possible to prevail upon you to cook up a little standalone
> testcase?  

I already tried, to no avail.  I may have more time in November.

Paolo
Comment 4 Anonymous Emailer 2007-10-09 14:10:47 UTC
Reply-To: paolo.bonzini@lu.unisi.ch

This testcase is not a regression, but it is still a bug, and I believe
the root cause is the same: in this case, it is ftruncate modifying the
length of MAP_PRIVATE mmaps.


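#include <sys/mman.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>   /* system */
#include <unistd.h>   /* close, ftruncate */

int main(void)
{
   /* Create a small file and map it privately, read-only. */
   system ("echo foo > gst.im");
   int f = open ("gst.im", O_RDONLY);

   char *p = mmap(NULL, 1628636, PROT_READ, MAP_PRIVATE, f, 0);
   close (f);
   f = open("gst.im", O_RDWR|O_CREAT, 0666);
   printf ("%s", p);   /* first read of the mapping */

   /* Truncate the file behind the mapping, then read it again. */
   ftruncate (f, 0);
   mmap(NULL, 401408, PROT_READ|PROT_WRITE|PROT_EXEC,
        MAP_PRIVATE|MAP_ANON, -1, 0);
   printf ("%s", p);   /* second read of the mapping */
   return 0;
}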


Expected (and this is what it gives on e.g. Darwin):

foo
foo

Actual output:

foo
Bus error
Comment 5 Hugh Dickins 2007-10-09 14:39:27 UTC
On Tue, 9 Oct 2007, Paolo Bonzini wrote:
> > So can you confirm that this behaviour was not present in 2.6.8 but is
> > present in 2.6.20?
> 
> Yes.  I also have access to a Debian i686 2.6.22.9 and it shows the bug.

That's surprising, and sounds like a bug in 2.6.8 not in 2.6.20 or 2.6.22.

I may have misunderstood the steps, but you summarize:

> I believe the reason is a bad interaction between the private mmap
> established in save.c:
> 
>   buf = mmap (NULL, file_size, PROT_READ, MAP_PRIVATE, imageFd, 0);
> 
> and truncating the inode on which the mmap was done.

It is standard behaviour that truncating the inode on which an mmap
was done will generate SIGBUS on access to pages of the mmap beyond
the new end of file.  Easier to understand when MAP_SHARED, but even
when MAP_PRIVATE, and even when private pages have already been
C-O-Wed from the file.

Checking with SUSv3, I find it using the word "may" a lot, without
explicitly demanding this behaviour; but my recollection of the early
implementations of mmap in UNIX, which set the standard, is that they
behaved in this way - though I've often (like you) wished they did not.

Might it have been a different version of Smalltalk which was tested
with the 2.6.8 kernel, a version which didn't cause this to happen?
Comment 6 Anonymous Emailer 2007-10-09 21:42:23 UTC
Reply-To: paolo.bonzini@lu.unisi.ch


> It is standard behaviour that truncating the inode on which an mmap
> was done will generate SIGBUS on access to pages of the mmap beyond
> the new end of file.  Easier to understand when MAP_SHARED, but even
> when MAP_PRIVATE, and even when private pages have already been
> C-O-Wed from the file.

I would have expected MAP_PRIVATE to establish a snapshot of the file, 
as it appears to do on BSDs.  I find it hard to believe that code in the 
wild wants this behavior for MAP_PRIVATE (on the other hand, it is 
clearly the right thing for MAP_SHARED).

> Might it have been a different version of Smalltalk which was tested
> with the 2.6.8 kernel, a version which didn't cause this to happen?

Two weeks ago it started failing on x86-64 after a kernel update but 
still worked on i686; then, yesterday it also started failing on i686 
(guess what, after another kernel update).  It might well be that the 
bug was latent in 2.6.8 and was uncovered by another mmap-related change 
in the kernel, or something like that.

I can work around it by unlink+open; though it will break hard links, 
that's not a big deal.

Thanks for the explanation.

Paolo
