Bug 198295 - AMD KASAN: use-after-free in find_cpio_data+0x9b5/0xa50
Status: RESOLVED CODE_FIX
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: x86-64
Hardware: Other Linux
Importance: P1 normal
Assignee: Borislav Petkov
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-12-28 05:41 UTC by higuita
Modified: 2018-01-24 11:09 UTC
CC List: 4 users

See Also:
Kernel Version: 4.14.8
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Kasan oops waking from suspend (83.37 KB, text/plain)
2017-12-28 05:41 UTC, higuita
new oops with 4.14.11 (9.20 KB, text/plain)
2018-01-08 21:34 UTC, higuita
.config for kernel 4.14.14 (175.87 KB, text/plain)
2018-01-21 03:57 UTC, higuita
test patch (480 bytes, patch)
2018-01-21 13:27 UTC, Borislav Petkov
Wake up (9.86 KB, text/plain)
2018-01-21 23:52 UTC, higuita

Description higuita 2017-12-28 05:41:57 UTC
Created attachment 273333
Kasan oops waking from suspend

Hi

I enabled KASAN to help debug an amdgpu problem and ended up finding this one when waking up from suspend. It only happens sometimes, and may explain why wake-up fails on roughly 1-5% of my suspends.
Comment 1 higuita 2018-01-08 21:34:14 UTC
Created attachment 273485
new oops with 4.14.11

New oops, this time with more debug info:

So, taking the report line "BUG: KASAN: use-after-free in find_cpio_data+0x80a/0x880" and resolving the offset to a source line:

    # nm vmlinux | grep find_cpio_data
    ffffffff848df750 t _GLOBAL__sub_D_65535_0_find_cpio_data
    ffffffff847ec0d0 t _GLOBAL__sub_I_65535_1_find_cpio_data
    ffffffff83337f60 T find_cpio_data

    # echo "obase=16;ibase=16;$(echo "ffffffff83337f60" | tr 'a-z' 'A-Z')+80A" | bc
    FFFFFFFF8333876A
    # eu-addr2line -e /usr/src/linux-4/vmlinux FFFFFFFF8333876A
    lib/earlycpio.c:81
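
The same symbol-plus-offset arithmetic can be sketched in C as an alternative to bc. This is only an illustration: parse_sym_offset and resolve_address are hypothetical helper names, not part of any kernel tool, and the base address used in the note below is the find_cpio_data value from the nm output above.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Parse the "+0xOFF/0xSIZE" suffix of a location such as
 * "find_cpio_data+0x80a/0x880". Returns 0 on success, -1 on failure.
 * (Hypothetical helper, for illustration only.)
 */
int parse_sym_offset(const char *loc, uint64_t *off, uint64_t *size)
{
    const char *plus = strchr(loc, '+');

    if (plus == NULL)
        return -1;
    if (sscanf(plus, "+%" SCNx64 "/%" SCNx64, off, size) != 2)
        return -1;
    /* The offset must fall inside the function body. */
    return (*off < *size) ? 0 : -1;
}

/* Absolute address = symbol base (from nm) + offset (from the oops). */
uint64_t resolve_address(uint64_t sym_base, uint64_t off)
{
    return sym_base + off;
}
```

Calling resolve_address(0xffffffff83337f60, 0x80a) gives 0xffffffff8333876a, the same value bc printed, which eu-addr2line then maps to lib/earlycpio.c:81.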


So the problem is here:

        while (len > cpio_header_len) {
                if (!*p) {   <------ line 81
                        /* All cpio headers need to be 4-byte aligned */
                        p += 4;
                        len -= 4;
                        continue;
                }
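
To make the failing access concrete, here is a minimal user-space sketch of the padding-skip step quoted above. It is not the kernel function: skip_zero_padding is a hypothetical name and the buffer is an ordinary byte array. The point is that every iteration reads buf[pos], so if the memory behind the buffer has already been freed, that read is the kind of access KASAN reports.

```c
#include <stddef.h>

/*
 * Hypothetical model of the quoted loop: step over zero padding in
 * 4-byte units while more than header_len bytes remain. Returns the
 * offset at which scanning stopped.
 */
size_t skip_zero_padding(const unsigned char *buf, size_t len, size_t header_len)
{
    size_t pos = 0;

    /* Mirrors "while (len > cpio_header_len)" with len shrinking by 4. */
    while (pos < len && len - pos > header_len) {
        if (buf[pos] != 0)
            break;      /* non-zero byte: a header may start here */
        pos += 4;       /* headers are 4-byte aligned, skip one word */
    }
    return pos;
}
```

With header_len set to 110 (the size of a newc cpio header: a 6-byte magic plus 13 eight-character hex fields), a 128-byte buffer whose first non-zero byte is at offset 8 makes the function return 8.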
Comment 2 Chen Yu 2018-01-11 03:36:33 UTC
Good catch. This looks like an AMD microcode loading bug in the suspend/resume path: on resume the system tries to load microcode from the initrd, which was already released during early boot. I think load_ucode_amd_ap() should be fixed to take the microcode from the saved ucode_patch regardless of whether a newer patch revision is available; see load_ucode_intel_ap() for reference.
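
A minimal user-space sketch of the idea in this comment, assuming the fix direction is "keep a private copy of the patch and use only that copy once the initrd is gone". All names here (ucode_state, save_patch, free_initrd, patch_for_resume) are hypothetical; they do not correspond to the kernel's microcode loader code or to the test patch attached later in this bug.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical state: the boot image and a copy that outlives it. */
struct ucode_state {
    unsigned char *initrd;      /* boot-time image, released after early boot */
    size_t initrd_len;
    unsigned char *saved_patch; /* private copy, valid across suspend/resume */
    size_t saved_len;
};

/* Copy len bytes at offset off out of the initrd while it is still valid. */
int save_patch(struct ucode_state *s, size_t off, size_t len)
{
    unsigned char *copy;

    if (s->initrd == NULL || len == 0 ||
        off > s->initrd_len || len > s->initrd_len - off)
        return -1;
    copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, s->initrd + off, len);
    free(s->saved_patch);       /* drop any older copy */
    s->saved_patch = copy;
    s->saved_len = len;
    return 0;
}

/* Release the initrd and clear the pointer so stale use is detectable. */
void free_initrd(struct ucode_state *s)
{
    free(s->initrd);
    s->initrd = NULL;
    s->initrd_len = 0;
}

/* Resume path: hand out only the saved copy, never the initrd. */
const unsigned char *patch_for_resume(const struct ucode_state *s, size_t *len)
{
    *len = s->saved_len;
    return s->saved_patch;
}
```

The design choice is that the resume path has no way to reach the initrd pointer at all, so it cannot read freed memory even when no newer patch revision was found at boot.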
Comment 3 Borislav Petkov 2018-01-20 12:41:15 UTC
Is that vmlinuz-4.14.8-slack-smp the real stable kernel, or does it have some patches on top? I'm assuming Slackware-something?

Also, can you upload your .config pls? I'd like to try to reproduce it here.

Thx.
Comment 4 higuita 2018-01-21 03:57:48 UTC
Created attachment 273767
.config for kernel 4.14.14

Yes, this is a plain kernel; no patches were applied.

Attached is my .config

The system is slackware-current x86_64. The hardware is an A10-7850K on an ASUS A88X-PLUS motherboard with the latest firmware, and I'm using an initrd to boot an LVM root filesystem.

Thanks for picking this up.
Comment 5 Borislav Petkov 2018-01-21 12:39:13 UTC
This config has

# CONFIG_KASAN is not set

but I'd like to be able to trigger the same report as you do.

So do a

$ grep KASAN config

on the 4.14.8 or 4.14.11 config with which you're seeing the warnings.
Comment 6 Borislav Petkov 2018-01-21 13:27:04 UTC
OK, never mind, I think I see it. Please try the attached hunk.
Comment 7 Borislav Petkov 2018-01-21 13:27:42 UTC
Created attachment 273771
test patch
Comment 8 higuita 2018-01-21 23:52:31 UTC
Created attachment 273781
Wake up

Well, it looks better: I do not see the KASAN error anymore! The dmesg wake-up messages are attached ... but there is a problem!

I could wake up once; on the second suspend/wake-up cycle, it crashed when waking the secondary CPU. I rebooted and could do the same once more, but now it crashes even on the first wake-up (3 attempts, one off, then a cold boot).

The oops scrolls a bit, but if you want, I can try to take a photo of what is still visible, or try to configure a serial console to capture the whole oops message.
Comment 9 Borislav Petkov 2018-01-22 11:33:47 UTC
(In reply to higuita from comment #8)
> The oops scrolls a bit, but if you want, I can try to take a photo of what
> is still visible, or try to configure a serial console to capture the
> whole oops message.

Yes, pls try that. I doubt it is microcode-loader related but I can try
and take a look.

Thx.
Comment 10 higuita 2018-01-24 03:21:26 UTC
Well, those crashes may have been caused by some tests I had done in the past: when using Ctrl+R to search for the suspend command, I may have executed them again ... and they had the --store-quirks-as-lkw option, so the options were stored.

I removed the file and now I can suspend and wake up without any problem, so sorry about the noise. The use-after-free issue should be fixed now: after several suspend and wake-up cycles, I get no more errors.

Thanks for the fix!

Will you be able to include it in 4.15?
Comment 11 Borislav Petkov 2018-01-24 11:09:49 UTC
(In reply to higuita from comment #10)
> will you be able to include it in 4.15?

I'll try to.

If not, stable@ is CCed so it'll percolate to the affected kernels
eventually.

Thanks, closing.
