Bug 206501

Summary: Kernel 5.6-rc1 fails to boot on a PowerMac G4 3,6 with CONFIG_VMAP_STACK=y: Oops! Machine check, sig: 7 [#1]
Product: Platform Specific/Hardware
Reporter: Erhard F. (erhard_f)
Component: PPC-32
Assignee: platform_ppc-32
Status: RESOLVED CODE_FIX
Severity: normal
CC: christophe.leroy
Priority: P1
Hardware: PPC-32
OS: Linux
See Also: https://bugzilla.kernel.org/show_bug.cgi?id=205099
Kernel Version: 5.6.0-rc1
Subsystem:
Regression: No
Bisected commit-id:
Attachments: screenshot
kernel .config (5.6.0-rc1, PowerMac G4 DP)
powerpc/32s: Fix add_hash_page() for CONFIG_VMAP_STACK
dmesg (5.6.0-rc1 + Fix DSI and ISI... patch , PowerMac G4 DP)

Description Erhard F. 2020-02-11 19:25:24 UTC
Created attachment 287311 [details]
screenshot

The G4 boots fine with CONFIG_VMAP_STACK=n, but fails to boot with CONFIG_VMAP_STACK=y.

[...]
NIP [c001c194] create_hpte+0xa8/0x120
LR [c001c0c4] add_hash_page+0x88/0xb0
Call Trace:
[f101dde8] [c0181568] alloc_set_pte+0x184/0x214 (unreliable)
[f101de18] [c014d168] filemap_map_pages+0x21c/0x250
[f101de68] [c0181cf4] handle_mm_fault+0x66c/0x90c
[f101dee8] [c0019aac] do_page_fault+0x690/0x804
[f101df38] [c0014450] handle_page_fault+0x10/0x3c
--- interrupt: 401 at 0xb77ffd10
    LR = 0x0
Instruction dump:
6c64003f 6884ffx0 3884fff8 7c0903a6 84x40008 7c062800 4002fff8 41a2008c
68a50040 7c0903a6 3883fff8 84c40008 <54c60001> 4002fff8 41a20070 3c80c08e
---[ end trace cd24dd23c7db9d53 ]---

Machine check in kernel mode.
Caused by (from SRR1=141020): Transfer error ack signal
Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000007

(OCRed screenshot + corrections by hand)
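
For readers wondering how the "Transfer error ack signal" text relates to SRR1=141020, here is a rough sketch of the decoding. The mask and the case values are an assumption based on the generic 32-bit machine check handler in arch/powerpc/kernel/traps.c, the table is deliberately partial, and the names `MCHECK_CAUSE_MASK`, `MCHECK_CAUSES` and `decode_mcheck` are hypothetical.

```python
# Hypothetical helper, not kernel code. The mask and case values are assumed
# from the generic 32-bit machine check handler and the table is partial.
MCHECK_CAUSE_MASK = 0x601F0000

MCHECK_CAUSES = {
    0x00080000: "Machine check signal",
    0x00040000: "Transfer error ack signal",
    0x00140000: "Transfer error ack signal",  # 7450 MSS error and TEA
    0x00020000: "Data parity error signal",
    0x00010000: "Address parity error signal",
}


def decode_mcheck(srr1: int) -> str:
    """Map the cause bits of SRR1 to the message text for that cause."""
    return MCHECK_CAUSES.get(srr1 & MCHECK_CAUSE_MASK, "Unknown values in msr")


# SRR1 value taken from the panic output above.
print(decode_mcheck(0x141020))  # prints "Transfer error ack signal"
```

With the value from the screenshot, only the 0x00140000 bits survive the mask, which is the case reported as a transfer error acknowledge.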
Comment 1 Erhard F. 2020-02-11 19:26:23 UTC
Created attachment 287313 [details]
kernel .config (5.6.0-rc1, PowerMac G4 DP)
Comment 2 Christophe Leroy 2020-02-12 08:13:06 UTC
Created attachment 287321 [details]
powerpc/32s: Fix add_hash_page() for CONFIG_VMAP_STACK

Please test this patch.
Comment 3 Christophe Leroy 2020-02-12 11:53:09 UTC
Could you also test the patch https://patchwork.ozlabs.org/patch/1236804/ instead of the above patch.
Comment 4 Erhard F. 2020-02-12 15:10:05 UTC
Created attachment 287329 [details]
dmesg (5.6.0-rc1 + Fix DSI and ISI... patch , PowerMac G4 DP)

The first patch was not successful: I got no stack trace, but the boot process still got stuck.

The second patch was successful in that the G4 booted far enough to reveal a dmesg full of other problems, e.g. some 'Unrecoverable exceptions' ;)

Please find the dmesg attached.
Comment 5 Christophe Leroy 2020-02-12 17:47:33 UTC
Interesting.

NIP:  00000550
NIP:  0000045c
NIP:  00000c38
NIP:  00000370

The kernel seems to fault badly on the first write to the stack. This suggests that there is no page allocated for the stack yet, which is unexpected because copy_thread_tls() performs several writes to the stack, so the pages must already exist in the page tables.
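
A quick way to see why these NIP values point at exception entry code: on classic 32-bit PowerPC the exception vectors sit at fixed low offsets spaced 0x100 apart, so each NIP can be rounded down to its vector. The sketch below assumes the faulting instruction is still within the 0x100-byte slot of its vector; `VECTORS` and `vector_for` are hypothetical names and the table lists only the common vectors.

```python
# Illustrative only: classic 32-bit PowerPC exception vector offsets.
VECTORS = {
    0x100: "System reset",
    0x200: "Machine check",
    0x300: "DSI (data storage)",
    0x400: "ISI (instruction storage)",
    0x500: "External interrupt",
    0x600: "Alignment",
    0x700: "Program",
    0x800: "Floating-point unavailable",
    0x900: "Decrementer",
    0xC00: "System call",
    0xD00: "Trace",
}


def vector_for(nip: int):
    """Round a NIP down to its 0x100-aligned vector and look up the name."""
    base = nip & ~0xFF
    return base, VECTORS.get(base, "unknown")


# NIP values quoted from the dmesg above.
for nip in (0x550, 0x45C, 0xC38, 0x370):
    base, name = vector_for(nip)
    print(f"NIP {nip:08x} -> vector 0x{base:03x} ({name})")
```

Under that assumption, the four addresses fall inside the external interrupt, ISI, system call and DSI entry code respectively, which fits a fault taken very early in the handlers.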
Comment 6 Christophe Leroy 2020-02-12 18:15:19 UTC
I think I found the reason. I just realised that the things saved to SPRN_SPRG and in the thread struct get overwritten by the DSI taken at the stack write.

I'll prepare something to fix that.
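
To make the overwrite concrete, here is a toy model of the situation described above. It is not kernel code: `ToyCPU`, `exception_prolog` and the register values are all hypothetical, and a single `scratch` attribute stands in for the SPRN_SPRG and thread struct slots. It only shows how a nested DSI that reuses the same scratch storage loses the outer handler's saved values.

```python
# Toy model only: names and structure are hypothetical, not kernel code.

class ToyCPU:
    def __init__(self):
        # Stands in for the shared scratch storage (SPRG / thread struct slots).
        self.scratch = None


def exception_prolog(cpu, regs, stack_write_faults):
    """Stash regs in the shared scratch slot, touch the stack, reload them."""
    cpu.scratch = dict(regs)
    if stack_write_faults:
        # The stack write takes a DSI whose prolog reuses the same scratch slot.
        exception_prolog(cpu, {"r10": "dsi", "r11": "dsi"}, stack_write_faults=False)
    # The outer handler reloads whatever the scratch slot holds now.
    return dict(cpu.scratch)


original = {"r10": "outer", "r11": "outer"}
ok = exception_prolog(ToyCPU(), original, stack_write_faults=False)
clobbered = exception_prolog(ToyCPU(), original, stack_write_faults=True)
print(ok == original, clobbered == original)  # prints "True False"
```

In this model the outer handler only gets its own values back when the stack write does not fault, which is the failure mode the comment describes.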
Comment 7 Christophe Leroy 2020-02-13 10:45:35 UTC
Can you try version v2 of the patch, https://patchwork.ozlabs.org/patch/1237387/
Comment 8 Erhard F. 2020-02-13 11:34:33 UTC
(In reply to Christophe Leroy from comment #7)
> Can you try version v2 of the patch,
> https://patchwork.ozlabs.org/patch/1237387/
I can confirm that v2 works as intended. The G4 completes booting with VMAP_STACK enabled and without producing further stack traces. Thanks!
Comment 9 Christophe Leroy 2020-02-13 17:32:37 UTC
Great. Can we add your Tested-by: to the commit?
Comment 10 Erhard F. 2020-02-13 19:49:21 UTC
Sure. Thanks!
Comment 11 Erhard F. 2020-02-26 15:34:14 UTC
The fix landed in 5.6-rc3 and now works as expected. Thanks!