Bug 114671

Summary: Compressed x86 kernel should be built as PIE
Product: Platform Specific/Hardware
Reporter: H.J. Lu (hjl.tools)
Component: i386
Assignee: platform_i386
Status: NEW
Severity: normal
CC: bjoernv, bugs-a17, hpa, mmarek
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 4.4.5 Tree: Mainline
Regression: No
Attachments: A patch; An updated patch

Description H.J. Lu 2016-03-15 19:13:45 UTC
Created attachment 209331
A patch

The 32-bit x86 assembler in binutils 2.26 will generate an R_386_GOT32X
relocation to get the symbol address in PIC.  The 32-bit compressed
x86 kernel is compiled with PIC, but linked as a normal executable.
Since the output isn't PIC, the linker optimizes R_386_GOT32X relocations
to their fixed symbol addresses.  However, the 32-bit compressed x86
kernel is loaded at a different address, which leads to a load failure:

Failed to allocate space for phdrs

during the decompression stage.

If the compressed x86 kernel is relocatable at run-time, it should be
compiled with -fPIE, instead of -fPIC, if possible, and built as a
Position Independent Executable (PIE) so that the linker won't optimize
R_386_GOT32X relocations to their fixed symbol addresses.
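
The failure mode described above can be illustrated with a toy model. This is only a sketch in Python, not kernel or linker code: the constants LINK_BASE, LOAD_BASE and SYM_OFFSET and both function names are hypothetical values chosen for illustration. It shows that an address fixed at link time stops being valid once the image is loaded somewhere else, while an address computed from the run-time base stays correct.

```python
# Toy model only: every address below is a made-up, hypothetical value.
LINK_BASE = 0x01000000   # address the image was linked to run at
LOAD_BASE = 0x02000000   # address the image was actually loaded at
SYM_OFFSET = 0x4000      # offset of some symbol inside the image


def fixed_address() -> int:
    """Address baked in by the linker when the output is not PIE."""
    return LINK_BASE + SYM_OFFSET


def runtime_address(load_base: int) -> int:
    """Address computed relative to where the image really is."""
    return load_base + SYM_OFFSET


if __name__ == "__main__":
    actual = LOAD_BASE + SYM_OFFSET
    # The fixed address no longer matches once the image has moved.
    print(hex(fixed_address()), fixed_address() == actual)
    # The base-relative address follows the image to its new location.
    print(hex(runtime_address(LOAD_BASE)), runtime_address(LOAD_BASE) == actual)
```

In this model the first comparison is false and the second is true, which mirrors why the non-PIE link breaks a compressed kernel that runs at a different address than it was linked for.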

To build 32-bit compressed x86 kernel as PIE, a linker with the bug
fix for

https://sourceware.org/bugzilla/show_bug.cgi?id=19827

commit 4e0c91e45402ebf4215066e4a61143896e831049
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Mar 15 11:46:51 2016 -0700

    Bind defined symbol locally in PIE

    Symbols defined in PIE should be bound locally, the same as -shared
    -Bsymbolic.

is required.

To build the 64-bit compressed x86 kernel loader as PIE, we need to disable
the relocation overflow check to avoid a relocation overflow error, using a
new linker command-line option, -z noreloc-overflow:

commit 4c10bbaa0912742322f10d9d5bb630ba4e15dfa7
Author: H.J. Lu <hjl.tools@gmail.com>
Date:   Tue Mar 15 11:07:06 2016 -0700

    Add -z noreloc-overflow option to x86-64 ld

    Add -z noreloc-overflow command-line option to the x86-64 ELF linker to
    disable relocation overflow check.  This can be used to avoid relocation
    overflow check if there will be no dynamic relocation overflow at
    run-time.
Comment 1 H.J. Lu 2016-03-16 15:17:45 UTC
(In reply to H.J. Lu from comment #0)
>
> To build 32-bit compressed x86 kernel as PIE, a linker with the bug
> fix for
> 
> https://sourceware.org/bugzilla/show_bug.cgi?id=19827
> 
> commit 4e0c91e45402ebf4215066e4a61143896e831049
> Author: H.J. Lu <hjl.tools@gmail.com>
> Date:   Tue Mar 15 11:46:51 2016 -0700
> 
>     Bind defined symbol locally in PIE
> 
>     Symbols defined in PIE should be bound locally, the same as -shared
>     -Bsymbolic.
> 
> is required.
> 

Technically, the linker isn't wrong to generate an R_386_32 relocation against
a locally defined symbol, which in this case is "_bss", in PIE.  It is just
less optimal than R_386_RELATIVE.  But the x86 kernel fails to properly handle
the R_386_32 relocation when relocating the kernel.
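
The difference between the two relocation types can be sketched with a small model. This is a Python illustration under stated assumptions, not the kernel's relocation code: the function relocate, its handle_abs32 switch, the word offsets and the load address are all hypothetical. Only the relocation type numbers and the formulas (R_386_32 is S + A, R_386_RELATIVE is B + A) come from the i386 ELF ABI. It shows that both types yield the same final address when both are processed, and that a relocator handling only R_386_RELATIVE leaves an R_386_32 slot unrelocated.

```python
R_386_32 = 1        # i386 ELF relocation type: S + A (symbol value + addend)
R_386_RELATIVE = 8  # i386 ELF relocation type: B + A (load base + addend)


def relocate(words, relocs, load_base, symbols, handle_abs32=True):
    """Apply REL-style relocations to a dict of {offset: 32-bit word}.

    `symbols` maps a name to its image-relative value (PIE linked at 0).
    `handle_abs32=False` models a hypothetical relocator that only
    understands R_386_RELATIVE.
    """
    out = dict(words)
    for offset, rtype, sym in relocs:
        addend = out[offset]  # REL format: the addend is stored in place
        if rtype == R_386_RELATIVE:
            out[offset] = (load_base + addend) & 0xFFFFFFFF
        elif rtype == R_386_32 and handle_abs32:
            out[offset] = (load_base + symbols[sym] + addend) & 0xFFFFFFFF
        # Any other case leaves the word untouched.
    return out


if __name__ == "__main__":
    base = 0x02000000          # hypothetical run-time load address
    syms = {"_bss": 0x4000}    # hypothetical image-relative symbol value

    relative = relocate({0x10: 0x4000}, [(0x10, R_386_RELATIVE, None)], base, syms)
    abs32 = relocate({0x10: 0}, [(0x10, R_386_32, "_bss")], base, syms)
    skipped = relocate({0x10: 0}, [(0x10, R_386_32, "_bss")], base, syms,
                       handle_abs32=False)
    print(hex(relative[0x10]), hex(abs32[0x10]), hex(skipped[0x10]))
```

R_386_RELATIVE needs no symbol lookup, which is why it is the cheaper form for a locally bound symbol. In the model, the skipped case keeps the stale in-place value instead of the run-time address of "_bss".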
Comment 2 H.J. Lu 2016-03-17 02:59:55 UTC
Created attachment 209601
An updated patch

This patch should work with all linkers.
Comment 3 Kris Karas 2016-03-18 19:37:52 UTC
Is attachment 209601 intended to work with released binutils-2.26?
kernel-4.1.20 + attachment-209601 + vmware = ENOBOOT  :-(
Even the "Failed to allocate space for phdrs" was missing.
Comment 4 H.J. Lu 2016-03-18 19:44:05 UTC
(In reply to Kris Karas from comment #3)
> Is attachment 209601 intended to work with released binutils-2.26?
> kernel-4.1.20 + attachment-209601 + vmware = ENOBOOT  :-(
> Even the "Failed to allocate space for phdrs" was missing.

Binutils 2.26 release has quite a few issues.  Try binutils 2.26 branch.
Comment 5 Kris Karas 2016-03-18 21:56:36 UTC
Tried binutils-2_26-branch with kernel 4.4.6 patched from 209601 as above.
No luck.  Slightly different than booting 4.1.20: console goes black and kernel halts (zero CPU usage according to VMware), never throws itself back to the BIOS.  Have to "reset guest" in VMware to get back to LILO.
Comment 6 H.J. Lu 2016-03-18 22:03:15 UTC
(In reply to Kris Karas from comment #5)
> Tried binutils-2_26-branch with kernel 4.4.6 patched from 209601 as above.
> No luck.  Slightly different than booting 4.1.20: console goes black and
> kernel halts (zero CPU usage according to VMware), never throws itself back
> to the BIOS.  Have to "reset guest" in VMware to get back to LILO.

Which GCC are you using? Are you running 32-bit or 64-bit kernel?
Comment 7 Kris Karas 2016-03-21 15:38:03 UTC
I'm running 32-bit mainline kernels.  GCC 5.3.0
No 64bit VM handy on ESXi, so haven't tried that.
I have 32 and 64 bit VMs under KVM, and they boot just fine.

All machines (VM and native) are running slackware-current, using its vanilla toolchain.  Here's the timeline:

2016-02-05  Binutils 2.26 (release) installed.
2016-02-08  GCC-5.3.0 patched, upstream #69140, fix SSE/Wine
2016-02-22  Kernel 4.4.2 compiled, boots.
2016-02-26  Kernel 4.4.3 compiled, boots.
2016-02-29  Binutils patched from upstream, 16 fixes.
2016-03-XX  All new kernel compiles fail.
2016-03-18  Binutils-2_26-branch + kernel patch 209601, no boot.

I have more work to do:
Back out 209601 from this bug report and try binutils-2_26-branch on vanilla.
Try to get back to binutils-release from early Feb.
Revisit kernels 4.1.17/4.4.3 to rule out anything from 4.1.18/4.4.4
Comment 8 Kris Karas 2016-03-22 00:59:41 UTC
I'm beginning to understand why I am confused.  It appears, in my particular case (i686 + vmware) that there is more than one bug at play.  I've spent most of the day compiling various kernel versions against four different incantations of binutils, with and without patch 209601 as proposed by H.J. Lu.  (Do I smell a kernel bisection cooking?)

Results:

I made 4 package versions of binutils as follows:
V1  A guess as to how Patrick Volkerding packaged it, circa January 2016.
    A few upstream patches (same as those for binutils-2.25) applied.
V2  binutils-2.26-i586-3 right out of slackware-current
    Has 16 "upstream" patches (selected by Volkerding) applied.
V3  binutils-2_26-branch from binutils-gdb.git, unpatched.
V4  HEAD/master from binutils-gdb.git, again no patches applied.

Kernel  Result
4.1.17  Fails V1 (maybe wonky packaging on my part)
        Boots with V2, V3, V4.  Does not depend upon "patch" (209601).

4.1.18  Does not boot with any binutils, patch or no.  Memory=SLAB

4.4.3   Fails V1  Memory=SLUB
        Boots V2
        Fails V3, V4 "Failed to allocate space for phdrs"
        Does not matter if the patch (209601) is applied or not.

4.4.4   All binutils, V1, V2, V3, V4 fail.  Memory=SLUB
        Reboots instantly back to BIOS.
        Does not matter if patch applied.

4.4.4   All binutils, V1, V2, V3, V4 fail.  Memory=SLAB
        Like 4.1.18, console wedges.  Not even the "phdrs" message.

4.4.6   V1 without patch fails - instant reboot.  Memory=SLUB
        V1 with patch fails - Console wedges, no output at all.
        V2, V3 fail with or without patch.  "Failed to allocate... phdrs"
        V4 without patch fails - "Failed to allocate space for phdrs"
        V4 with patch Succeeds!