Bug 197035 - objtool segfault with ORC unwinder enabled
Status: RESOLVED CODE_FIX
Alias: None
Product: Tools
Classification: Unclassified
Component: Other
Hardware: x86-64 Linux
Importance: P1 high
Assignee: Tools.Other
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-09-25 18:53 UTC by Rafael Ristovski
Modified: 2018-01-18 19:12 UTC

See Also:
Kernel Version: 4.14-next
Subsystem:
Regression: No
Bisected commit-id:


Description Rafael Ristovski 2017-09-25 18:53:19 UTC
I have noticed that when the ORC unwinder is enabled, make randomly aborts with:
`/bin/sh: line 1:  3050 Segmentation fault      ./tools/objtool/objtool orc generate --no-fp "arch/x86/kernel/x86_init.o"`

It appears to happen to random object files, so it's not an issue specific to the one in this example. It is not random in the sense of being intermittent, though: re-running the above command causes a segfault each time, so the issue lies in the generated object file.

When I first tested ORC, right after it was introduced in the kernel, it worked fine. About 3 weeks ago I started noticing the build failures. Thinking it was a regression that would get fixed soon, I switched back to the default unwinder for some time.

Running the culprit command under gdb yields the following:

```
Program received signal SIGSEGV, Segmentation fault.
elf_rebuild_rela_section (sec=sec@entry=0x7ffff6ada010) at elf.c:554
554			relas[idx].r_info = GELF_R_INFO(rela->sym->idx, rela->type);
(gdb) bt
#0  elf_rebuild_rela_section (sec=sec@entry=0x7ffff6ada010) at elf.c:554
#1  0x0000000000407d89 in create_orc_sections (file=file@entry=0x7ffffff7ca70) at orc_gen.c:210
#2  0x0000000000406d51 in check (_objname=<optimized out>, _no_fp=<optimized out>, no_unreachable=<optimized out>, orc=<optimized out>) at check.c:1948
#3  0x0000000000402268 in handle_internal_command (argv=0x7fffffffcbe0, argc=4) at objtool.c:110
#4  main (argc=4, argv=0x7fffffffcbe0) at objtool.c:133
(gdb) p *relas
$1 = {r_offset = 0, r_info = 4294967298, r_addend = 0}
```

In this case r_info appears to hold some erroneous value, but in a previous attempt it was 0.
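As a quick aid for reading the value printed by gdb, here is a minimal Python sketch that splits an ELF64 r_info into its two fields. It assumes the standard ELF64 layout (symbol index in the upper 32 bits, relocation type in the lower 32 bits); the function names are hypothetical and not part of objtool. Whether the decoded fields describe a real entry or just whatever happened to be in the buffer cannot be told from this trace alone.

```python
# Decode the r_info value printed by gdb, assuming the ELF64 layout:
# upper 32 bits = symbol table index, lower 32 bits = relocation type.

def decode_r_info(r_info):
    """Split an ELF64 r_info value into (symbol index, relocation type)."""
    return r_info >> 32, r_info & 0xFFFFFFFF


def encode_r_info(sym, rtype):
    """Inverse of decode_r_info, mirroring the ELF64 r_info packing."""
    return (sym << 32) | (rtype & 0xFFFFFFFF)


sym, rtype = decode_r_info(4294967298)  # value from the gdb output above
print(f"sym index = {sym}, type = {rtype}")  # prints "sym index = 1, type = 2"
```

On x86-64, relocation type 2 is R_X86_64_PC32, so the printed value has the shape of an ordinary relocation entry rather than obviously random bits.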

I have no idea how to further debug this, so any suggestions are welcome.
I have tested this with different kernel configs so I don't think it's an issue in configuration.

Compilers tested: GCC 8.0.0, GCC 5.4.0
Comment 1 Rafael Ristovski 2017-09-25 21:44:38 UTC
More gdb output (from a different file; I disabled -O2 for objtool to see the values that were previously optimized out):

(gdb) p relas
$1 = (GElf_Rela *) 0x4504a0
(gdb) p idx
$2 = 0
(gdb) p rela
$3 = (struct rela *) 0x44f240
(gdb) p rela->sym
$4 = (struct symbol *) 0x0

It seems that rela->sym is NULL, but I'm not sure whether that is the culprit.
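For context, the faulting statement in the backtrace (elf.c:554) reads rela->sym->idx, so a NULL rela->sym would be dereferenced at exactly that point. The following is a rough Python model of that loop, not objtool's code and not the eventual fix: the Symbol and Rela classes and the rebuild_rela_section function are hypothetical stand-ins, and the entries are packed as little-endian Elf64_Rela records (r_offset, r_info, r_addend). It only illustrates how a missing symbol could be reported instead of crashing.

```python
import struct
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Symbol:
    """Hypothetical stand-in for objtool's struct symbol."""
    idx: int


@dataclass
class Rela:
    """Hypothetical stand-in for objtool's struct rela."""
    offset: int
    addend: int
    rtype: int
    sym: Optional[Symbol]


def rebuild_rela_section(relas: List[Rela]) -> bytes:
    """Pack entries as little-endian Elf64_Rela records (24 bytes each)."""
    out = bytearray()
    for i, rela in enumerate(relas):
        if rela.sym is None:
            # The C statement would dereference NULL here; report it instead.
            raise ValueError(
                f"rela entry {i} at offset {rela.offset:#x} has no symbol"
            )
        r_info = (rela.sym.idx << 32) | rela.rtype
        out += struct.pack("<QQq", rela.offset, r_info, rela.addend)
    return bytes(out)
```

Calling rebuild_rela_section([Rela(0, 0, 2, None)]) raises ValueError naming the offending entry, which is the kind of diagnostic that would point at the symbol-less relocation rather than ending in a bare segfault.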
Comment 2 Rafael Ristovski 2017-09-26 18:26:18 UTC
Something I noticed:

```
./tools/objtool/objtool orc generate --no-fp "arch/x86/kernel/quirks.o";
arch/x86/kernel/quirks.o: warning: objtool: vt8237_force_enable_hpet()+0x3a: sibling call from callable instruction with modified stack frame
arch/x86/kernel/quirks.o: warning: objtool: nvidia_force_enable_hpet()+0x35: sibling call from callable instruction with modified stack frame
arch/x86/kernel/quirks.o: warning: objtool: ati_force_enable_hpet()+0x40: sibling call from callable instruction with modified stack frame
arch/x86/kernel/quirks.o: warning: objtool: force_hpet_resume()+0xd7: sibling call from callable instruction with modified stack frame
/bin/sh: line 1: 22955 Segmentation fault      ./tools/objtool/objtool orc generate --no-fp "arch/x86/kernel/quirks.o"
```

Commit id 4855022a52262411ce38c93dec4cb1470705c0a0 in -next seems to be related to this: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/commit/?h=next-20170926&id=4855022a52262411ce38c93dec4cb1470705c0a0
Comment 3 Markus 2017-12-26 14:41:35 UTC
I hit the same segfault in 4.14.9, where ORC seems to be enabled by default now.
Using gcc 6.4.0 here.
Comment 4 Markus 2018-01-04 16:57:40 UTC
@Rafael Ristovski: Do you use the gold linker?
Comment 5 Rafael Ristovski 2018-01-18 18:46:58 UTC
(In reply to Markus from comment #4)
> @Rafael Ristovski: Do you use the gold linker?

Sorry for the slow reply, just got back from vacation. Yes, I do use the gold linker.
Comment 6 Markus 2018-01-18 19:11:19 UTC
Then we had the same issue:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2a0098d70640dda192a79966c14d449e7a34d675

(Not yet in 4.15-rc8, but should be in the next version.)
Comment 7 Rafael Ristovski 2018-01-18 19:12:54 UTC
(In reply to Markus from comment #6)
> Then we had the same issue:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2a0098d70640dda192a79966c14d449e7a34d675
> 
> (Not yet in 4.15-rc8, but should be in the next version.)

I can confirm this indeed fixes the issue as I just finished compiling linux-next. Cheers!
