Bug 195919 - Frequent kernel crashes on AMD Ryzen
Summary: Frequent kernel crashes on AMD Ryzen
Status: NEW
Alias: None
Product: Platform Specific/Hardware
Classification: Unclassified
Component: x86-64
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: platform_x86_64@kernel-bugs.osdl.org
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2017-05-29 22:19 UTC by blenheimears
Modified: 2019-09-05 10:38 UTC
CC List: 6 users

See Also:
Kernel Version: 4.11.3
Subsystem:
Regression: No
Bisected commit-id:


Attachments
Section of kernel log before crash (9.05 KB, text/plain)
2017-05-29 22:19 UTC, blenheimears

Description blenheimears 2017-05-29 22:19:48 UTC
Created attachment 256761
Section of kernel log before crash

After anywhere from a few hours to about a week of uptime, a kernel crash occurs. This is on a desktop system. After the crash, the system cannot be rebooted with the SysRq key. Sometimes the USB mouse's LED is off after the crash, and sometimes it stays on. Usually only garbage or nothing at all is written to the kernel log; occasionally, however, some information is written, which I have attached.

Hardware:
AMD Ryzen 7 1700X
ASRock X370 Killer SLI/ac
2x Crucial DDR4-2400 ECC 16GB
Comment 1 blenheimears 2017-05-29 23:34:56 UTC
This Launchpad report, which was not filed by me, appears to describe the same issue: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1690085
Comment 2 blenheimears 2017-06-04 21:42:11 UTC
I don't think this is a problem with the Linux kernel; it's probably a CPU bug. The kernel could possibly still implement a workaround for this issue.
Comment 3 blenheimears 2017-06-09 01:05:34 UTC
Definitely a bug in the CPU. If we don't plan to work around it, please close this bug.
Comment 4 James Le Cuirot 2017-10-05 20:25:50 UTC
(In reply to blenheimears from comment #3)
> Definitely a bug in the CPU. If we don't plan to work around it, please
> close this bug.

What makes you think this is a hardware issue? The segfault issue is, but my replacement CPU still exhibits freezes like this. This is probably the same as bug #196683.
Comment 5 eric.c.morgan 2017-10-26 23:23:22 UTC
I'm also having this issue with a new, week 33 Ryzen that does not segfault. I get a complete system crash every couple of days. I have also posted in the Ubuntu bug linked above.

Dragonflybsd from the Ubuntu thread talks about RCU kernel params to apply, which I did in a custom Linux kernel, 4.13.3. I still have crashes.

This appears to occur when the system is very idle.
Comment 6 eric.c.morgan 2017-10-27 00:10:37 UTC
I posted here: https://bugzilla.kernel.org/show_bug.cgi?id=196683

I'm also trying the kernel boot params suggested.
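
For anyone trying boot parameters like the ones mentioned above, it can help to confirm they actually reached the running kernel. The following is only a minimal sketch: the helper name `rcu_boot_params` is made up for illustration, it assumes the parameters of interest have names starting with `rcu`, and it splits the command line on whitespace without handling quoted values. The sample string uses `rcu_nocbs` purely as an example of such a parameter; this thread does not say which parameters were suggested.

```python
def rcu_boot_params(cmdline):
    """Return kernel command-line parameters whose name starts with 'rcu'.

    Hypothetical helper: 'cmdline' is the text of /proc/cmdline.
    Splitting on whitespace is a simplification (quoted values
    containing spaces are not handled).
    """
    params = {}
    for token in cmdline.split():
        # 'name=value' tokens are split once; bare flags get an empty value.
        name, _, value = token.partition("=")
        if name.startswith("rcu"):
            params[name] = value
    return params


# Illustrative input only; on a live system one would pass
# open("/proc/cmdline").read() instead.
sample = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 rcu_nocbs=0-15 quiet"
found = rcu_boot_params(sample)
```

An empty result on a real system would suggest the parameters were not picked up by the bootloader configuration.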
Comment 7 lejimster 2017-11-19 19:03:26 UTC
(In reply to blenheimears from comment #3)
> Definitely a bug in the CPU. If we don't plan to work around it, please
> close this bug.

I don't think this is a bug in the CPU, as it doesn't happen on other OSes. There are workarounds, such as disabling C-states, which have helped my Ryzen 7 1700/B350 system stay up. But it ideally needs addressing, as we are losing some power saving features.

Before I disabled C-State in the UEFI I was experiencing 1-2 "random" reboots each day, especially when leaving the system to go idle.

Using 4.14-rc3

Agree with James Le Cuirot, probably the same issue as #196683.
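
To see which idle states the kernel currently exposes (for example, before and after changing the C-state setting in the UEFI), the cpuidle entries in sysfs can be listed. This is a rough sketch, assuming the usual layout under `/sys/devices/system/cpu/cpuN/cpuidle/stateM/` with `name` and `disable` files; the function name `list_cpuidle_states` is made up for illustration, and the output reflects the kernel's view, which may not map one-to-one onto the firmware option.

```python
from pathlib import Path


def list_cpuidle_states(cpu_dir):
    """Return (state_dir, name, disabled) tuples for one CPU's idle states.

    Hypothetical helper: 'cpu_dir' is a directory such as
    /sys/devices/system/cpu/cpu0 (assumed sysfs layout).
    """
    cpuidle = Path(cpu_dir) / "cpuidle"
    # Sort numerically so that state10 comes after state2.
    state_dirs = sorted(cpuidle.glob("state[0-9]*"),
                        key=lambda p: int(p.name[len("state"):]))
    states = []
    for state in state_dirs:
        name = (state / "name").read_text().strip()
        disable_file = state / "disable"
        # A 'disable' file containing 1 means the state is switched off.
        disabled = (disable_file.exists()
                    and disable_file.read_text().strip() == "1")
        states.append((state.name, name, disabled))
    return states
```

On a live system this would be called as `list_cpuidle_states("/sys/devices/system/cpu/cpu0")`; if the `cpuidle` directory holds no state entries, the list is simply empty.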
Comment 8 CodingEagle02 2019-09-04 15:00:07 UTC
I've been lightly following this thread, but it is still a seemingly unresolved issue that impedes me from using Linux. Has there been any real progress on this? Or is the resolution still a vague 'we more or less know where the issue is but we're still waiting for someone upstream to take action to fix it'?
Comment 9 Borislav Petkov 2019-09-04 15:13:52 UTC
(In reply to blenheimears from comment #0)
> After anywhere from a few hours to about a week, a kernel crash will occur.

Can you trigger this with the latest upstream kernel, v5.2 currently, and *without* the proprietary nvidia and vbox* crap?

Thx.
Comment 10 CodingEagle02 2019-09-05 10:38:25 UTC
(Oops, I meant to reply to a related but different thread, sorry)
