Bug 215217

Summary: Kernel fails to boot at an early stage when built with GCC_PLUGIN_LATENT_ENTROPY=y (PowerMac G4 3,6)
Product: Platform Specific/Hardware
Reporter: Erhard F. (erhard_f)
Component: PPC-32
Assignee: platform_ppc-32
Status: RESOLVED CODE_FIX
Severity: normal
CC: christophe.leroy, michael
Priority: P1
Hardware: PPC-32
OS: Linux
Kernel Version: 5.16-rc3
Subsystem:
Regression: No
Bisected commit-id:
Attachments: kernel .config (5.16-rc3, PowerMac G4 DP)
kernel .config (4.18.20, PowerMac G4 DP)
kernel .config (5.16-rc4 + CFLAGS_setup_32.o +=... , PowerMac G4 DP)
kernel zImage (5.16-rc5 + CFLAGS_setup_32.o/early_32.o +=... , PowerMac G4 DP)
kernel vmlinux.xz (5.16-rc5 + CFLAGS_setup_32.o/early_32.o +=... , PowerMac G4 DP)

Description Erhard F. 2021-12-04 22:48:03 UTC
Created attachment 299871 [details]
kernel .config (5.16-rc3, PowerMac G4 DP)

I get this on my PowerMac G4 DP when I build a kernel with GCC_PLUGIN_LATENT_ENTROPY=y. The kernel gets decompressed, but shortly afterwards the machine freezes, displaying in black letters on an entirely white screen:

done
found display   : /pci@f0000000/ATY,AlteracParent@10/ATY,Alterac_B@1, opening...

Same kernel config built without GCC_PLUGIN_LATENT_ENTROPY just boots fine. Happens on v5.15.5 and v5.16-rc3. I did not test other kernel versions yet.

My Talos II (Power9) on the other hand just works fine with a GCC_PLUGIN_LATENT_ENTROPY=y built kernel.
Comment 1 Christophe Leroy 2021-12-05 08:08:15 UTC
POWER9 doesn't have KASAN.

Did you try the G4 without KASAN?
Comment 2 Erhard F. 2021-12-05 09:51:03 UTC
Just rebuilt the kernel without KASAN but I still hit this bug.
Comment 3 Christophe Leroy 2021-12-05 17:48:18 UTC
I tried your config under QEMU and it works. So I don't know how I could help.

>> =============================================================
>> OpenBIOS 1.1 [Jul 22 2021 22:33]
>> Configuration device id QEMU version 1 machine id 1
>> CPUs: 1
>> Memory: 2048M
>> UUID: 00000000-0000-0000-0000-000000000000
>> CPU type PowerPC,G4
milliseconds isn't unique.
Welcome to OpenBIOS v1.1 built on Jul 22 2021 22:33
>> [ppc] Kernel already loaded (0x01000000 + 0x00f39460) (initrd 0x0203a000 +
>> 0x001d1a3b)
>> [ppc] Kernel command line: noreboot
>> switching to new context:
OF stdout device is: /pci@f2000000/mac-io@c/escc@13000/ch-a@13020
Preparing to boot Linux version 5.16.0-rc3-PowerMacG4+ (chleroy@PO20335.IDSI0.si.c-s.fr) (powerpc64-linux-gcc (GCC) 11.1.0, GNU ld (GNU Binutils) 2.36.1) #669 SMP Sun Dec 5 18:41:30 CET 2021
Detected machine type: 00000400
command line:  
memory layout at init:
  memory_limit : 00000000 (16 MB aligned)
  alloc_bottom : 01f3e000
  alloc_top    : 30000000
  alloc_top_hi : 80000000
  rmo_top      : 30000000
  ram_top      : 80000000
found display   : /pci@f2000000/QEMU,VGA@e, opening... done
copying OF device tree...
Building dt strings...
Building dt structure...
Device tree strings 0x01f3f000 -> 0x01f3e0a4
Device tree struct  0x01f40000 -> 0x7fde7ef8
Quiescing Open Firmware ...
Booting Linux via __start() @ 0x01000000 ...
Hello World !
Total memory = 2048MB; using 4096kB for hash table
Activating Kernel Userspace Execution Prevention
Activating Kernel Userspace Access Protection
Linux version 5.16.0-rc3-PowerMacG4+ (chleroy@PO20335.IDSI0.si.c-s.fr) (powerpc64-linux-gcc (GCC) 11.1.0, GNU ld (GNU Binutils) 2.36.1) #669 SMP Sun Dec 5 18:41:30 CET 2021
KASAN init done
ioremap() called early from pmac_feature_init+0x248/0xfe4. Use early_ioremap() instead
Found UniNorth memory controller & host bridge @ 0xf8000000 revision: 0x07
Mapped at 0xf53bf000
ioremap() called early from probe_one_macio+0x228/0x414. Use early_ioremap() instead
Found a Keylargo mac-io controller, rev: 0, mapped at 0x(ptrval)
PowerMac motherboard: PowerMac G4 AGP Graphics
ioremap() called early from udbg_scc_init+0x1dc/0x380. Use early_ioremap() instead
boot stdout isn't a display !
Using PowerMac machine description
printk: bootconsole [udbg0] enabled
CPU maps initialized for 1 thread per core
Comment 4 Erhard F. 2021-12-08 14:49:29 UTC
Created attachment 299935 [details]
kernel .config (4.18.20, PowerMac G4 DP)

Hmm, strange...

I tried to bisect but found out that this issue goes back to kernel v4.18.20 at least. That is the earliest version I am able to build with GCC_PLUGIN_LATENT_ENTROPY=y; earlier kernels error out with:

make: ngcc: No such file or directory
Cannot use CONFIG_GCC_PLUGINS: plugin support on gcc <= 5.1 is buggy on powerpc, please upgrade to gcc 5.2 or newer

I used gcc 9.4.0 and reduced the kernel .config a lot but it's still the same. v4.18.20 with GCC_PLUGIN_LATENT_ENTROPY=y freezes at this early boot stage, without GCC_PLUGIN_LATENT_ENTROPY booting continues.

GCC_PLUGIN_LATENT_ENTROPY=y causes no problems on my G5 either, so only the G4 is affected. Wild guess: might this be specific to gcc on 32-bit PowerPC?
Comment 5 Michael Ellerman 2021-12-09 03:17:57 UTC
It's likely there's some 32-bit boot code that is being instrumented in a way that causes it to crash.

We probably need to add some more uses of DISABLE_LATENT_ENTROPY_PLUGIN in arch/powerpc/kernel/Makefile.

To start with you could try adding:

  CFLAGS_setup_32.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
Comment 6 Erhard F. 2021-12-09 11:19:01 UTC
OK, I checked that out. There are already some in the Makefile:

CFLAGS_cputable.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_prom_init.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_btext.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_prom.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)

I added CFLAGS_setup_32.o += $(DISABLE_LATENT_ENTROPY_PLUGIN) on top of that, rebuilt the kernel after a make clean. No success so far, the kernel still does not boot.
Comment 7 Erhard F. 2021-12-09 11:20:20 UTC
Created attachment 299967 [details]
kernel .config (5.16-rc4 + CFLAGS_setup_32.o +=... , PowerMac G4 DP)
Comment 8 Christophe Leroy 2021-12-09 11:22:15 UTC
early_32.o should likely also have DISABLE_LATENT_ENTROPY_PLUGIN, maybe even more importantly than setup_32.o.
Comment 9 Christophe Leroy 2021-12-13 16:39:46 UTC
Erhard, were you able to redo the test with the latent entropy plugin also disabled for early_32.o?

If you can, try with it disabled for both early_32.o and setup_32.o.
Then, if it works, retry with it disabled only for early_32.o.
Comment 10 Erhard F. 2021-12-14 00:45:12 UTC
Created attachment 300015 [details]
kernel zImage (5.16-rc5 + CFLAGS_setup_32.o/early_32.o +=... , PowerMac G4 DP)

Unfortunately still no success.

Relevant section in the Makefile now looks like this:
CFLAGS_early_32.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_setup_32.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_cputable.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_prom_init.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_btext.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
CFLAGS_prom.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)

I'll attach the generated zImage, maybe you can make something out of it.
Comment 11 Christophe Leroy 2021-12-14 05:41:12 UTC
Ok, so that's not enough, must be something else.

I guess you are right, the next step should be analysis of the image.

The zImage, however, can hardly be used for that.

Could you provide the vmlinux file?
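
For reference, one way such an image analysis could be approached (a rough sketch, not necessarily how it was done in this thread): disassemble vmlinux with the toolchain's objdump and look for functions that access the global latent_entropy through an absolute `lis` + `lwz`/`stw` pair. The helper name `functions_touching` is made up for illustration, the register tracking is deliberately naive (it ignores `addi`, indexed loads and register reuse), and the sample lines and the address 0xc0da5b50 are taken from the extract in comment 13.

```python
import re

# "c0c0ad20 <apply_feature_fixups>:" style function headers from objdump -d
FUNC_RE = re.compile(r"^[0-9a-f]+ <([^>]+)>:$")
# "lis r27,-16166": register and signed 16-bit immediate
LIS_RE = re.compile(r"\blis\s+(r\d+),(-?\d+)")
# "lwz r30,23376(r27)" or "stw ...": displacement and base register
MEM_RE = re.compile(r"\b(?:lwz|stw)\s+r\d+,(-?\d+)\((r\d+)\)")


def functions_touching(disasm, target):
    """Return names of functions that load/store `target` via lis + lwz/stw."""
    hits, current, upper = [], None, {}
    for line in disasm.splitlines():
        line = line.strip()
        m = FUNC_RE.match(line)
        if m:
            current, upper = m.group(1), {}  # new function: forget tracked registers
            continue
        m = LIS_RE.search(line)
        if m:
            # lis places the immediate in the upper halfword of the register
            upper[m.group(1)] = (int(m.group(2)) & 0xFFFF) << 16
            continue
        m = MEM_RE.search(line)
        if m and m.group(2) in upper:
            addr = (upper[m.group(2)] + int(m.group(1))) & 0xFFFFFFFF
            if addr == target and current not in hits:
                hits.append(current)
    return hits


# Sample input: a few lines from the apply_feature_fixups extract in comment 13.
SAMPLE = """\
c0c0ad20 <apply_feature_fixups>:
c0c0ad20:       94 21 ff e0     stwu    r1,-32(r1)
c0c0ad38:       3f 60 c0 da     lis     r27,-16166
c0c0ad3c:       90 01 00 24     stw     r0,36(r1)
c0c0ad44:       83 db 5b 50     lwz     r30,23376(r27)
"""

print(functions_touching(SAMPLE, 0xC0DA5B50))
```

On a real image the input would be the full `objdump -d vmlinux` output and the target would be whatever address the symbol table reports for latent_entropy in that particular build. Early-boot functions that show up in the result would be candidates for DISABLE_LATENT_ENTROPY_PLUGIN.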
Comment 12 Erhard F. 2021-12-14 15:43:52 UTC
Created attachment 300027 [details]
kernel vmlinux.xz (5.16-rc5 + CFLAGS_setup_32.o/early_32.o +=... , PowerMac G4 DP)

Ok. I .xz-compressed it afterwards as it would be too big otherwise.
Comment 13 Christophe Leroy 2021-12-17 07:20:57 UTC
arch/powerpc/lib/feature-fixups.o also needs DISABLE_LATENT_ENTROPY_PLUGIN, see the extract from your vmlinux below:


c0c0ad20 <apply_feature_fixups>:
c0c0ad20:       94 21 ff e0     stwu    r1,-32(r1)
c0c0ad24:       3c 60 c0 db     lis     r3,-16165
c0c0ad28:       7c 08 02 a6     mflr    r0
c0c0ad2c:       38 63 55 50     addi    r3,r3,21840
c0c0ad30:       bf 41 00 08     stmw    r26,8(r1)
c0c0ad34:       7c 3f 0b 78     mr      r31,r1
c0c0ad38:       3f 60 c0 da     lis     r27,-16166         <== latent_entropy@h
c0c0ad3c:       90 01 00 24     stw     r0,36(r1)
c0c0ad40:       3f 80 c0 d4     lis     r28,-16172
c0c0ad44:       83 db 5b 50     lwz     r30,23376(r27)     <== latent_entropy@l
c0c0ad48:       4b 40 60 35     bl      c0010d7c <add_reloc_offset>
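
For readers following the listing: the marked `lis`/`lwz` pair forms an absolute, link-time address. Below is a minimal sketch of that arithmetic, using the two immediates exactly as objdump prints them. The helper name `ppc_hi_lo_address` is made up for illustration.

```python
def ppc_hi_lo_address(hi_imm, lo_imm):
    """Address accessed by `lis rX,hi_imm` followed by `lwz rY,lo_imm(rX)`.

    Both immediates are signed 16-bit values, as shown in the disassembly.
    """
    upper = (hi_imm & 0xFFFF) << 16        # lis: immediate goes into the upper halfword
    return (upper + lo_imm) & 0xFFFFFFFF   # lwz: signed displacement added, 32-bit wrap


# Immediates from the two marked instructions in apply_feature_fixups.
addr = ppc_hi_lo_address(-16166, 23376)
print(hex(addr))
```

This yields 0xc0da5b50, an address in the same 0xc0...... range as the function addresses in the listing. Note that the load is issued before the `bl add_reloc_offset`. If this code runs before the kernel is accessible at its link address (an assumption here, suggested by that relocation call), a load from an absolute address would be consistent with the early freeze.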
Comment 14 Erhard F. 2021-12-18 02:19:24 UTC
(In reply to Christophe Leroy from comment #13)
> arch/powerpc/lib/feature-fixups.o also need DISABLE_LATENT_ENTROPY_PLUGIN,
> see extract from you vmlinux below
I can confirm this works, thanks!

I need

arch/powerpc/kernel/Makefile: 
CFLAGS_early_32.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)
arch/powerpc/lib/Makefile:
CFLAGS_feature-fixups.o += $(DISABLE_LATENT_ENTROPY_PLUGIN)

to get it going on my G4 with GCC_PLUGIN_LATENT_ENTROPY=y. Modifying setup_32.o is not needed.
Comment 15 Erhard F. 2022-02-03 17:29:48 UTC
Fix landed in kernel 5.16.5. Thanks!