The issue was found on the particular AMD ROME machine below:

Serial Number  diesel-sys9079-0001
Vendor         AuthenticAMD
Model Name     AMD EPYC 7601 32-Core Processor

The `kmem -s` reported "kmem: dma-kmalloc-512: slab: ffffe192c0001000 invalid freepointer: e5ffef4e9a040b7e" on a dumped vmcore.

How reproducible:
About 70%

Steps to Reproduce:
1. Install the latest kernel, for example:
   commit 089cf7f6ecb266b6a4164919a2e69bd2f938374a (HEAD -> v5.3-rc7, tag: v5.3-rc7)
   Author: Linus Torvalds <torvalds@linux-foundation.org>
   Date:   Mon Sep 2 09:57:40 2019 -0700

       Linux 5.3-rc7
2. Enable SME by setting "mem_encrypt=on" on command line
3. Trigger a sysrq panic
4. Run crash 'kmem -s' to check the vmcore

Actual results:
#crash vmlinux vmcore
......
crash> kmem -s | grep -i invalid
kmem: dma-kmalloc-512: slab: ffffe192c0001000 invalid freepointer: e5ffef4e9a040b7e
kmem: dma-kmalloc-512: slab: ffffe192c0001000 invalid freepointer: e5ffef4e9a040b7e
As we know, the kdump kernel reuses the first 640k area for some reason, so the old content of that area is copied to a backup area, which is done in purgatory(). When dumping the vmcore, the kdump kernel reads the old content of the first 640k area from the backup area.

The main cause seems clear: the kernel does not correctly handle the first 640k region when SME is enabled, so the old memory content is not properly copied to the backup area in purgatory(). As a result, the kdump kernel reads incorrect content from the backup region when dumping the vmcore.

This bug is definitely related to memory encryption. Any ideas about this?

Thanks.
Fixed in v5.5-rc1. Thanks.