Bug 207129 - PowerMac G4 DP (5.6.2 debug kernel + inline KASAN) freezes shortly after booting with "do_IRQ: stack overflow: 1760"
Status: NEW
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: PPC-32
Hardware: PPC-32 Linux
Importance: P1 normal
Assignee: platform_ppc-32
 
Reported: 2020-04-05 21:32 UTC by Erhard F.
Modified: 2020-04-08 15:59 UTC

See Also:
Kernel Version: 5.6.2
Tree: Mainline
Regression: No


Attachments
kernel .config (5.6.2, INLINE KASAN, PowerMac G4 DP) (97.30 KB, text/plain)
2020-04-05 21:32 UTC, Erhard F.
screenshot01.jpg (1.22 MB, image/jpeg)
2020-04-06 12:26 UTC, Erhard F.
screenshot02.jpg (1.05 MB, image/jpeg)
2020-04-06 12:27 UTC, Erhard F.

Description Erhard F. 2020-04-05 21:32:40 UTC
Created attachment 288221
kernel .config (5.6.2, INLINE KASAN, PowerMac G4 DP)

I was trying to do some testing with the PowerMac G4 DP again, running a 5.6.2 debug kernel with inline KASAN. The G4 boots fine but freezes shortly afterwards under use, leaving no stack trace, only this message on the screen:

do_IRQ: stack overflow: 1760
CPU: 0 PID: 209 Comm: rsync Tainted: G        W        5.6.2-PowerMacG4+ #3
Call Trace:


The 120-second panic timer does not kick in. I have to power-cycle the G4 manually.
Comment 1 Christophe Leroy 2020-04-06 05:29:12 UTC
So it hangs in show_stack().

Does it also hang without CONFIG_DEBUG_STACKOVERFLOW? If not, it means we have a problem with check_stack_overflow().

Regardless of the result above, can you try increasing CONFIG_THREAD_SHIFT?

Can you maybe also do a test without CONFIG_VMAP_STACK?
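
For reference, the number in the "do_IRQ: stack overflow: 1760" message appears to be the stack pointer's offset within the thread stack, i.e. roughly the bytes still free. The following is only a user-space sketch of that kind of check, not the kernel code itself: the helper names, the 2048-byte threshold and the 8 KiB stack size are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define THREAD_SHIFT 13	/* assumed: 8 KiB stacks, as in the failing setup */
#define THREAD_SIZE  ((uintptr_t)1 << THREAD_SHIFT)
#define STACK_WARN   2048	/* assumed warning threshold, in bytes */

/*
 * Offset of sp inside a THREAD_SIZE-aligned stack. The stack grows
 * downwards, so this is the number of bytes left above the stack bottom.
 */
static uintptr_t stack_bytes_left(uintptr_t sp)
{
	return sp & (THREAD_SIZE - 1);
}

/* True when less than STACK_WARN bytes remain (hypothetical helper). */
static bool stack_nearly_full(uintptr_t sp)
{
	return stack_bytes_left(sp) < STACK_WARN;
}
```

With these assumptions, a stack pointer 1760 bytes above an 8 KiB-aligned stack base would trip the warning, which matches the value in the reported message.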
Comment 2 Erhard F. 2020-04-06 12:26:32 UTC
Created attachment 288229
screenshot01.jpg

Without CONFIG_DEBUG_STACKOVERFLOW things are better. The rsync completes, and the G4 was building packages for about 2 hours until I got these errors and a hard freeze:

[...]
Oops: kernel stack overflow, sig: 11 [#1]
BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
Modules linked in: ...
CPU: 1 PID: 17105 Comm: kworker/u4:5 Tainted: G        W        5.6.2-PowerMacG4+ #5
------------[ cut here  ]------------
kernel BUG at mm/usercopy.c:99!
Oops: Exception in kernel mode, sig: 5 [#2]
BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
Modules linked in: ...
CPU: 1 PID: 17185 Comm: kworker/u4:5 Tainted: G        W        5.6.2-PowerMacG4+ #5
usercopy: Kernel memory overwrite attempt detected to kernel text (offset 6336, size 4)!
------------[ cut here  ]------------
kernel BUG at mm/usercopy.c:99!
Oops: Exception in kernel mode, sig: 5 [#3]
BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
Modules linked in: ...
CPU: 1 PID: 17185 Comm: kworker/u4:5 Tainted: G        W        5.6.2-PowerMacG4+ #5
usercopy: Kernel memory overwrite attempt detected to kernel text (offset 5336, size 4)!
------------[ cut here  ]------------
kernel BUG at mm/usercopy.c:99!
Oops: Exception in kernel mode, sig: 5 [#4]
BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
Modules linked in: ...
CPU: 1 PID: 17185 Comm: kworker/u4:5 Tainted: G        W        5.6.2-PowerMacG4+ #5
usercopy: Kernel memory overwrite attempt detected to kernel text (offset 4336, size 4)!
------------[ cut here  ]------------
kernel BUG at mm/usercopy.c:99!
Oops: Exception in kernel mode, sig: 5 [#5]
BE PAGE_SIZE=4K MMU=Hash SMP NR_CPUS=2 PowerMac
Modules linked in: ...
Unrecoverable FP Unavailable Exception 801 at 9b8
CPU: 1 PID: 17185 Comm: kworker/u4:5 Tainted: G        W        5.6.2-PowerMacG4+ #5
usercopy: Kernel memory overwrite attempt detected to kernel text (offset 3336, size 4)!
------------[ cut here  ]------------

Now running with CONFIG_THREAD_SHIFT=14, which runs fine so far. I have not tried without CONFIG_VMAP_STACK yet.
Comment 3 Erhard F. 2020-04-06 12:27:01 UTC
Created attachment 288231
screenshot02.jpg
Comment 4 Erhard F. 2020-04-06 22:57:16 UTC
Without CONFIG_VMAP_STACK I had one crash after 2-3 hours of building, but the panic timer kicked in and rebooted the machine. Since then it has been building packages for hours again without any anomalies.
Comment 5 Christophe Leroy 2020-04-08 14:55:09 UTC
OK, so as a summary:
- With CONFIG_THREAD_SHIFT = 13 and CONFIG_DEBUG_STACKOVERFLOW, the system gets stuck
- With CONFIG_THREAD_SHIFT = 13 and without CONFIG_DEBUG_STACKOVERFLOW, the stack overflow is not really detected until it reaches kernel text!
- With CONFIG_THREAD_SHIFT = 14, it runs fine
- With CONFIG_VMAP_STACK, the automatic restart (panic timer) doesn't work
- Without CONFIG_VMAP_STACK, the automatic restart works

So I'll send a patch to set CONFIG_THREAD_SHIFT to 14 when CONFIG_KASAN is selected. x86 and arm64 already do that.

And I'll try to investigate the other points when I have time.
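
To make the THREAD_SHIFT values above concrete: the kernel stack size is 1 << THREAD_SHIFT bytes, so going from 13 to 14 doubles the stack from 8 KiB to 16 KiB. A minimal sketch of that arithmetic follows; thread_size() is a hypothetical helper for illustration, not a kernel function.

```c
#include <stddef.h>

/*
 * Kernel stack size in bytes for a given THREAD_SHIFT value,
 * following THREAD_SIZE = 1 << THREAD_SHIFT.
 */
static size_t thread_size(unsigned int thread_shift)
{
	return (size_t)1 << thread_shift;
}
```

Calling thread_size(13) gives 8192 and thread_size(14) gives 16384, which is the extra headroom that the KASAN instrumentation seems to need here.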
Comment 6 Erhard F. 2020-04-08 15:59:23 UTC
Yes, precisely summarized! Thanks for your efforts!

CONFIG_KASAN is only available on x86_64 though, not on 32-bit x86, AFAIK.
