Bug 36372

Summary: 2.6.39 kernel doesn't work with PIE
Product: Platform Specific/Hardware
Reporter: H.J. Lu (hjl.tools)
Component: x86-64
Assignee: platform_x86_64 (platform_x86_64)
Status: RESOLVED OBSOLETE
Severity: normal
CC: alan, andi-bz, hpa
Priority: P1
Hardware: x86-64
OS: Linux
Kernel Version:
Subsystem:
Regression: No
Bisected commit-id:
Attachments: A testcase, A patch, An updated patch

Description H.J. Lu 2011-05-31 17:38:50 UTC
Created attachment 60252 [details]
A testcase

With GCC 4.6.0, some PIE programs are killed at startup once address
space randomization is disabled with

# echo 0 > /proc/sys/kernel/randomize_va_space

For example, I got

[root@gnu-68 168.wupwise]# echo 0 > /proc/sys/kernel/randomize_va_space
[root@gnu-68 168.wupwise]# ./wupwise
Killed
[root@gnu-68 168.wupwise]# echo 2 > /proc/sys/kernel/randomize_va_space
[hjl@gnu-68 168.wupwise]$ ./wupwise 
At line 35 of file init.f (unit = 10, file = '')
Fortran runtime error: File 'wupwise.in' does not exist
[hjl@gnu-68 168.wupwise]$
Comment 1 H.J. Lu 2011-05-31 22:58:18 UTC
[root@gnu-snb-1 168.wupwise]# ulimit -s  182047
[root@gnu-snb-1 168.wupwise]# ./wupwise 
Killed
[root@gnu-snb-1 168.wupwise]# ulimit -s  182048
[root@gnu-snb-1 168.wupwise]# ./wupwise 
At line 35 of file init.f (unit = 10, file = '')
Fortran runtime error: File 'wupwise.in' does not exist
[root@gnu-snb-1 168.wupwise]#
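
The `ulimit -s` values in this comment are in units of 1024 bytes, and the kernel sees the same setting as RLIMIT_STACK. A minimal sketch for reading it from inside a program, using the POSIX getrlimit() call; the helper name stack_limit_kib is made up for illustration and is not part of the report:

```c
#include <sys/resource.h>

/* Hypothetical helper: soft RLIMIT_STACK in KiB (the unit `ulimit -s`
 * prints), or -1 if the limit is unlimited or cannot be read. */
long long stack_limit_kib(void)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_STACK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return -1;
    return (long long)(rl.rlim_cur / 1024);
}
```

Calling stack_limit_kib() right before exec'ing the testcase would confirm which limit the new process inherits.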
Comment 2 H.J. Lu 2011-05-31 23:07:26 UTC
arch/x86/mm/mmap.c has

/*
 * Top of mmap area (just below the process stack).
 *
 * Leave an at least ~128 MB hole with possible stack randomization.
 */
#define MIN_GAP (128*1024*1024UL + stack_maxrandom_size())
#define MAX_GAP (TASK_SIZE/6*5)
...
static unsigned long mmap_base(void)
{
        unsigned long gap = rlimit(RLIMIT_STACK);

        if (gap < MIN_GAP)
                gap = MIN_GAP;
        else if (gap > MAX_GAP)
                gap = MAX_GAP;

        return PAGE_ALIGN(TASK_SIZE - gap - mmap_rnd());
}

This limits the maximum size of the .bss section in a PIE. The
testcase has

  [26] .data             PROGBITS        0000000000207700 007700 000004 00  WA  0   0  4
  [27] .bss              NOBITS          0000000000207720 007720 afc8020 00  WA  0   0 32

  LOAD           0x000000 0x0000000000000000 0x0000000000000000 0x0073ec 0x0073ec R E 0x200000
  LOAD           0x0073f0 0x00000000002073f0 0x00000000002073f0 0x000314 0xafc8350 RW  0x200000

and it fails because the image is placed too close to the task address limit.
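
The arithmetic behind this can be sketched in user space. The model below rests on assumptions that the report does not state: TASK_SIZE is taken as 0x7ffffffff000, stack_maxrandom_size() and mmap_rnd() are taken as 0 (randomization off), the first LOAD segment is assumed to be mapped directly below mmap_base, and the image is assumed to fail once its page-aligned end passes TASK_SIZE. All function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_BYTES        4096ULL
/* Assumption, not stated in the report: x86-64 TASK_SIZE = 2^47 - one page. */
#define ASSUMED_TASK_SIZE 0x7ffffffff000ULL

/* Round x up to the next page boundary, like the kernel's PAGE_ALIGN(). */
uint64_t page_align_up(uint64_t x)
{
    return (x + PAGE_BYTES - 1) & ~(PAGE_BYTES - 1);
}

/* mmap_base() from the listing, with stack_maxrandom_size() and mmap_rnd()
 * both taken as 0 (assumed effect of randomize_va_space = 0). */
uint64_t model_mmap_base(uint64_t stack_rlimit_bytes)
{
    const uint64_t min_gap = 128ULL * 1024 * 1024;
    const uint64_t max_gap = ASSUMED_TASK_SIZE / 6 * 5;
    uint64_t gap = stack_rlimit_bytes;

    if (gap < min_gap)
        gap = min_gap;
    else if (gap > max_gap)
        gap = max_gap;
    assert(gap <= ASSUMED_TASK_SIZE);
    return page_align_up(ASSUMED_TASK_SIZE - gap);
}

/* Page-aligned end of the image, ASSUMING the first LOAD segment is mapped
 * directly below mmap_base and later segments follow at their p_vaddr. */
uint64_t model_image_end(uint64_t mmap_base, uint64_t first_filesz,
                         uint64_t last_vaddr, uint64_t last_memsz)
{
    uint64_t load_bias = mmap_base - page_align_up(first_filesz);

    return page_align_up(load_bias + last_vaddr + last_memsz);
}

/* 1 if the modeled image stays at or below the assumed TASK_SIZE. */
int model_image_fits(uint64_t stack_limit_kib)
{
    uint64_t base = model_mmap_base(stack_limit_kib * 1024);
    /* Sizes taken from the two LOAD program headers quoted above. */
    uint64_t end = model_image_end(base, 0x73ec, 0x2073f0, 0xafc8350);

    return end <= ASSUMED_TASK_SIZE;
}
```

Under these assumptions model_image_fits(182047) is 0 and model_image_fits(182048) is 1, which lines up with the `ulimit -s` threshold seen in Comment 1. That is a consistency check on the numbers, not a confirmed diagnosis of the kernel's behavior.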
Comment 3 H.J. Lu 2011-06-01 01:21:08 UTC
Created attachment 60302 [details]
A patch

This patch works for me. But we should pick a better load bias
for PIE.  We can use 0x400000 for x86-64.
Comment 4 H.J. Lu 2011-06-01 03:09:21 UTC
Created attachment 60322 [details]
An updated patch

This patch is simpler. We should avoid a load address of 0, since
the image may then be mapped too close to the task address limit,
leaving no room for the second loadable segment.
Comment 5 Andi Kleen 2011-06-01 06:43:54 UTC
I don't think the patch is correct. You're putting the loaded segment
into the mmap area, so brk() cannot grow as the comment indicates
and it would conflict with mmaps.

Probably need to make the placement smarter.
Comment 6 H.J. Lu 2011-06-01 13:54:30 UTC
We can use the same load base as the linker uses.
Comment 7 H.J. Lu 2011-06-01 23:53:24 UTC
2 simple testcases:

[hjl@gnu-6 pie]$ cat foo1.c 
#include <stdio.h>

char foo[132109999];

int
main ()
{
  foo[sizeof (foo) - 2] = -34;
  printf ("%d\n", foo[sizeof (foo) - 2]);
  return 0;
}
[hjl@gnu-6 pie]$ gcc -fPIE -pie foo1.c; ./a.out
Bus error
[hjl@gnu-6 pie]$ cat foo2.c 
#include <stdio.h>

char foo[132121799];

int
main ()
{
  foo[sizeof (foo) - 2] = -34;
  printf ("%d\n", foo[sizeof (foo) - 2]);
  return 0;
}
[hjl@gnu-6 pie]$ gcc -fPIE -pie foo2.c; ./a.out
Killed
[hjl@gnu-6 pie]$
Comment 8 H.J. Lu 2011-06-02 04:07:41 UTC
(In reply to comment #5)
> I don't think the patch is correct. You're putting the loaded segment
> into the mmap area, so brk() cannot grow as the comment indicates
> and it would conflict with mmaps.
> 
> Probably need to make the placement smarter.

ELF_ET_DYN_BASE is only used to load PIE:

arch/x86/include/asm/elf.h:#define ELF_ET_DYN_BASE    (TASK_SIZE / 3 * 2)
arch/x86/include/asm/elf.h:#define COMPAT_ELF_ET_DYN_BASE   (TASK_UNMAPPED_BASE + 0x1000000)

and it should be used, as the testcases show.  If these values aren't
appropriate, they should be fixed.
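
For a sense of scale, the 64-bit definition can be evaluated directly. This is a sketch only: the TASK_SIZE value 0x7ffffffff000 is an assumption not given in this report, and the function names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption, not stated in the report: x86-64 TASK_SIZE = 2^47 - one page. */
#define ASSUMED_TASK_SIZE 0x7ffffffff000ULL

/* ELF_ET_DYN_BASE as defined in the quoted header line. */
uint64_t elf_et_dyn_base(uint64_t task_size)
{
    return task_size / 3 * 2;
}

/* Address space between a load base and the task address limit. */
uint64_t room_above(uint64_t base, uint64_t task_size)
{
    assert(base <= task_size);
    return task_size - base;
}
```

With the assumed TASK_SIZE, elf_et_dyn_base() gives 0x555555554aaa, and room_above() reports roughly a third of the user address space. That is far more than the 0xb1cf740 bytes spanned by the testcase's LOAD segments, although the stack and the mmap area share that region.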
Comment 9 Andrew Morton 2011-06-07 23:40:19 UTC
How are we coming along with this?

I recategorized it to x86.