The kernel panics when a simple devmem 0x0 command is run from the command prompt. This did not happen on 4.14 (at least); I started observing it after we recently moved to 4.17.

bcm958742k login: root (automatic login)
bcm958742k:~# /sbin/devmem 0x0
[ 192.082901] SError Interrupt on CPU4, code 0xbf000000 -- SError
[ 192.082904] CPU: 4 PID: 2424 Comm: devmem Not tainted 4.17.0-02102-gdcfa25a-dirty #109
[ 192.082905] Hardware name: Stingray Combo SVK (BCM958742K) (DT)
[ 192.082906] pstate: 60000000 (nZCv daif -PAN -UAO)
[ 192.082907] pc : 0000ffffa14600e8
[ 192.082907] lr : 000000000040cdec
[ 192.082908] sp : 0000fffff6b7fce0
[ 192.082909] x29: 0000fffff6b7fee0 x28: 0000000000000000
[ 192.082910] x27: 0000000000000000 x26: 0000000000000000
[ 192.082912] x25: 0000fffff6b80010 x24: 0000000000000003
[ 192.082913] x23: 0000000000001000 x22: 0000fffff6b80018
[ 192.082914] x21: 0000000000000000 x20: 0000ffffa1477000
[ 192.082915] x19: 0000000000000020 x18: 00000000000004ca
[ 192.082916] x17: 0000ffffa14600cc x16: 00000000004b6ff8
[ 192.082917] x15: 0000ffffa123fde0 x14: 0000ffffa124d2c8
[ 192.082918] x13: 0000ffffa138eac8 x12: 0000000000000000
[ 192.082919] x11: 0000000000000000 x10: 0101010101010101
[ 192.082921] x9 : 0000ffffa134e300 x8 : 00000000000000de
[ 192.082922] x7 : 0000ffffa134ec00 x6 : 0000000000000000
[ 192.082923] x5 : 0000000000000000 x4 : 0000000000000003
[ 192.082924] x3 : 0000000000000001 x2 : 0000000000000000
[ 192.082925] x1 : 0000000000000008 x0 : 000000000048c4e8
[ 192.082927] Kernel panic - not syncing: Asynchronous SError Interrupt
[ 192.082928] CPU: 4 PID: 2424 Comm: devmem Not tainted 4.17.0-02102-gdcfa25a-dirty #109
[ 192.082929] Hardware name: Stingray Combo SVK (BCM958742K) (DT)
[ 192.082930] Call trace:
[ 192.082939]  dump_backtrace+0x0/0x1b8
[ 192.082941]  show_stack+0x14/0x1c
[ 192.082943]  dump_stack+0x90/0xb0
[ 192.082945]  panic+0x140/0x2a8
[ 192.082946]  __stack_chk_fail+0x0/0x18
[ 192.082947]  arm64_serror_panic+0x74/0x80
[ 192.082948]  do_serror+0x48/0xa0
[ 192.082949]  el0_error_naked+0x10/0x18
[ 192.082954] SMP: stopping secondary CPUs
[ 192.082957] Kernel Offset: disabled
[ 192.082959] CPU features: 0x21806008
[ 192.082959] Memory Limit: none
[ 192.267536] ---[ end Kernel panic - not syncing: Asynchronous SError Interrupt ]---

The last commit that modified the earlier bad_mode framework says do_serror() should be updated for correctable errors, but this case is very basic and needs to be handled gracefully: an SError taken from user mode should not bring down the whole kernel.

Probable fix:

hv930220@hariv-server:~/ns2_master1/kernel$ git diff arch/arm64/kernel/traps.c
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index 8bbdc17..fc265dd 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -728,13 +728,19 @@ bool arm64_is_fatal_ras_serror(struct pt_regs *regs, unsign
 
 asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
 {
-	nmi_enter();
-
-	/* non-RAS errors are not containable */
-	if (!arm64_is_ras_serror(esr) || arm64_is_fatal_ras_serror(regs, esr))
-		arm64_serror_panic(regs, esr);
-
-	nmi_exit();
+	if (user_mode(regs)) {
+		pr_crit("USER SError Interrupt on CPU%d, code 0x%08x -- %s\n",
+			smp_processor_id(), esr, esr_get_class_string(esr));
+		die("Oops - user mode ", regs, 0);
+	} else {
+		pr_crit("KERNEL SError Interrupt on CPU%d, code 0x%08x -- %s\n",
+			smp_processor_id(), esr, esr_get_class_string(esr));
+		nmi_enter();
+		/* non-RAS errors are not containable */
+		if (!arm64_is_ras_serror(esr) || arm64_is_fatal_ras_serror(regs,
+			arm64_serror_panic(regs, esr);
+		nmi_exit();
+	}
 }
 
 void __pte_error(const char *file, int line, unsigned long val)
Created attachment 277219
Proposed fix to avoid issue.

Just a simple fix to avoid the issue. I will submit a formal patch soon.