Bug 207009 - WARNING: CPU: 10 PID: 0 at do_debug+0x192/0x220
Summary: WARNING: CPU: 10 PID: 0 at do_debug+0x192/0x220
Status: CLOSED WILL_NOT_FIX
Alias: None
Product: ACPI
Classification: Unclassified
Component: Other
Hardware: All Linux
Importance: P1 normal
Assignee: Zhang Rui
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-03-28 19:31 UTC by ilkka.prusi
Modified: 2020-11-19 05:27 UTC

See Also:
Kernel Version: 5.5.13, 5.6.0-rc7+
Subsystem:
Regression: No
Bisected commit-id:


Attachments
panic during booting (1.03 MB, image/jpeg)
2020-03-29 15:18 UTC, ilkka.prusi

Description ilkka.prusi 2020-03-28 19:31:52 UTC
With kernel 5.5.13, dmesg shows the following:
[ 3272.084146] ------------[ cut here ]------------
[ 3272.084155] WARNING: CPU: 10 PID: 0 at do_debug+0x192/0x220
[ 3272.084156] Modules linked in: fuse(E) btrfs(E) blake2b_generic(E) zstd_compress(E) zstd_decompress(E) ufs(E) ntfs(E) msdos(E) jfs(E) xfs(E) dm_mod(E) snd_seq_dummy(E) snd_hrtimer(E) snd_seq_midi(E) snd_seq_midi_event(E) snd_seq(E) nf_tables(E) nfnetlink(E) binfmt_misc(E) nls_ascii(E) nls_cp437(E) vfat(E) fat(E) snd_usb_audio(E) snd_usbmidi_lib(E) snd_rawmidi(E) eeepc_wmi(E) asus_wmi(E) snd_seq_device(E) edac_mce_amd(E) battery(E) amdgpu(E) sparse_keymap(E) video(E) r8169(E) snd_hda_codec_realtek(E) mc(E) kvm_amd(E) snd_hda_codec_hdmi(E) snd_hda_codec_generic(E) ledtrig_audio(E) kvm(E) rfkill(E) realtek(E) snd_hda_intel(E) snd_intel_dspcfg(E) libphy(E) wmi_bmof(E) snd_hda_codec(E) irqbypass(E) snd_hda_core(E) sr_mod(E) snd_hwdep(E) cdrom(E) gpu_sched(E) snd_pcm_oss(E) crct10dif_pclmul(E) crc32_pclmul(E) ttm(E) joydev(E) ghash_clmulni_intel(E) snd_mixer_oss(E) drm_kms_helper(E) snd_pcm(E) snd_timer(E) drm(E) aesni_intel(E) snd(E) crypto_simd(E) cryptd(E) sp5100_tco(E) agpgart(E)
[ 3272.084199]  glue_helper(E) ccp(E) i2c_algo_bit(E) sg(E) soundcore(E) i2c_piix4(E) efi_pstore(E) efivars(E) k10temp(E) rng_core(E) wmi(E) acpi_cpufreq(E) button(E) nfsd(E) auth_rpcgss(E) nfs_acl(E) lockd(E) grace(E) sunrpc(E) efivarfs(E) ip_tables(E) x_tables(E) autofs4(E) raid10(E) raid456(E) libcrc32c(E) async_raid6_recov(E) async_memcpy(E) async_pq(E) async_xor(E) xor(E) async_tx(E) raid6_pq(E) raid1(E) raid0(E) multipath(E) linear(E) md_mod(E) sd_mod(E) evdev(E) input_leds(E) hid_steam(E) hid_generic(E) usbhid(E) hid(E) ahci(E) xhci_pci(E) libahci(E) crc32c_intel(E) xhci_hcd(E) libata(E) usbcore(E) scsi_mod(E) gpio_amdpt(E) gpio_generic(E)
[ 3272.084234] CPU: 10 PID: 0 Comm: swapper/10 Tainted: G            E     5.5.13 #1
[ 3272.084235] Hardware name: System manufacturer System Product Name/TUF B450-PLUS GAMING, BIOS 2008 12/06/2019
[ 3272.084238] RIP: 0010:do_debug+0x192/0x220
[ 3272.084242] Code: 00 02 74 e0 fa 66 0f 1f 44 00 00 eb d7 e8 f6 77 0c 00 e9 bc fe ff ff e8 cc 79 0c 00 e9 2a ff ff ff f6 85 88 00 00 00 03 75 8a <0f> 0b 80 e4 bf 49 89 84 24 18 0b 00 00 f0 41 80 0c 24 10 48 81 a5
[ 3272.084243] RSP: 0018:fffffe0000223f20 EFLAGS: 00192046
[ 3272.084245] RAX: 0000000000004000 RBX: 0000000000000000 RCX: 0000000000000000
[ 3272.084246] RDX: 0000000000000001 RSI: 0000000000000003 RDI: ffffffff8205f900
[ 3272.084247] RBP: fffffe0000223f58 R08: 0000000000000000 R09: 0000000000000005
[ 3272.084248] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8887fac4cb00
[ 3272.084249] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 3272.084251] FS:  0000000000000000(0000) GS:ffff8887fea80000(0000) knlGS:0000000000000000
[ 3272.084253] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3272.084254] CR2: 00007f3a8c0f5000 CR3: 000000071e770000 CR4: 00000000003406e0
[ 3272.084255] Call Trace:
[ 3272.084257]  <#DB>
[ 3272.084264]  debug+0x28/0x60
[ 3272.084268] RIP: 0010:acpi_idle_do_entry+0x15/0x40
[ 3272.084271] Code: 66 90 fa 66 0f 1f 44 00 00 c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 0f b6 47 08 3c 01 74 1f 3c 02 74 20 8b 57 04 ec <48> 8b 05 ec 07 93 00 a9 00 00 00 80 75 08 48 8b 15 4a f0 e7 00 ed
[ 3272.084272] RSP: 0018:ffffc9000016fe18 EFLAGS: 495fa197
[ 3272.084274] RAX: 0000000000000000 RBX: 0000000000000002 RCX: 0000000000000068
[ 3272.084275] RDX: 0000000000000414 RSI: ffffffff820c21e0 RDI: ffff8887fa589498
[ 3272.084276] RBP: ffff8887fa589400 R08: 000002f9d6fe835d R09: 00000000000021dd
[ 3272.084277] R10: 0000000000001554 R11: ffff8887feaab824 R12: ffff8887fa589498
[ 3272.084278] R13: 0000000000000002 R14: 0000000000000002 R15: ffff8887fac4cb00
[ 3272.084281]  </#DB>
[ 3272.084285]  acpi_idle_enter+0xe8/0x2b0
[ 3272.084290]  cpuidle_enter_state+0x81/0x410
[ 3272.084293]  cpuidle_enter+0x29/0x40
[ 3272.084297]  do_idle+0x1d7/0x260
[ 3272.084301]  cpu_startup_entry+0x19/0x20
[ 3272.084304]  start_secondary+0x166/0x1c0
[ 3272.084307]  secondary_startup_64+0xa4/0xb0
[ 3272.084311] ---[ end trace 4f7494ff5c9ceb3d ]---
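When comparing splats across kernel versions, it can help to reduce each one to its list of call-trace frames. The following is a minimal sketch in Python, assuming the log was saved as text in the format shown above. The names `parse_frames`, `FRAME_RE`, `TIMESTAMP_RE`, and `SAMPLE` are hypothetical and not part of any kernel tooling.

```python
import re

# A frame looks like "acpi_idle_enter+0xe8/0x2b0": symbol+offset/size.
FRAME_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-fA-F]+)/0x(?P<size>[0-9a-fA-F]+)"
)
# Leading dmesg timestamp such as "[ 3272.084264] ".
TIMESTAMP_RE = re.compile(r"^\[\s*\d+\.\d+\]\s*")


def parse_frames(log_text):
    """Return (symbol, offset, size) tuples for the frames listed after
    "Call Trace:" and before the "end trace" marker.

    Register dumps, "Code:" lines, and "RIP:" lines are skipped because
    only lines consisting solely of a frame are accepted.
    """
    frames = []
    in_trace = False
    for raw in log_text.splitlines():
        line = TIMESTAMP_RE.sub("", raw).strip()
        if line.startswith("Call Trace:"):
            in_trace = True
            continue
        if line.startswith("---[ end trace"):
            break
        if not in_trace:
            continue
        match = FRAME_RE.fullmatch(line)
        if match:
            frames.append(
                (match["symbol"], int(match["offset"], 16), int(match["size"], 16))
            )
    return frames


# Excerpt copied from the log in this report.
SAMPLE = """
[ 3272.084255] Call Trace:
[ 3272.084257]  <#DB>
[ 3272.084264]  debug+0x28/0x60
[ 3272.084268] RIP: 0010:acpi_idle_do_entry+0x15/0x40
[ 3272.084281]  </#DB>
[ 3272.084285]  acpi_idle_enter+0xe8/0x2b0
[ 3272.084290]  cpuidle_enter_state+0x81/0x410
[ 3272.084311] ---[ end trace 4f7494ff5c9ceb3d ]---
"""
```

Calling `parse_frames(SAMPLE)` yields one tuple per frame, with offset and size converted from hex to integers, so two splats can be compared by symbol name even when offsets differ between builds.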
Comment 1 ilkka.prusi 2020-03-29 04:33:42 UTC
Potentially related bug #198833
Comment 2 ilkka.prusi 2020-03-29 15:17:26 UTC
Kernel 5.6.0-rc7+ also panics during boot.
Comment 3 ilkka.prusi 2020-03-29 15:18:27 UTC
Created attachment 288119
panic during booting

Image of panic
Comment 4 ilkka.prusi 2020-03-31 08:06:25 UTC
Kernel 5.5.11 seems to work better than newer kernels and does not show the same issues.

5.5.12 (and later) has locking changes that might explain these issues.
Comment 5 Zhang Rui 2020-06-30 06:17:07 UTC
(In reply to ilkka.prusi from comment #4)
> Kernel 5.5.11 seems to work better without same issues than newer kernels.
> 
So 5.5.11 always works well, right?

> 5.5.12 (and later) has changes in locking which might explain these issues.

Given that you've already narrowed the problem down to the 5.5.11 - 5.5.12 range,
please run git-bisect to find the offending commit.
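
For readers unfamiliar with bisection, the idea behind it is a binary search between a known-good and a known-bad version. The sketch below is an illustration in Python, not git's implementation. It assumes a linear history, a reliably good starting point, and that every commit after the first bad one is also bad. The function name `first_bad`, the commit labels, and the `is_bad` predicate are hypothetical. In practice each test means building and booting that kernel.

```python
def first_bad(commits, is_bad):
    """Binary-search an ordered commit list for the first bad commit.

    Assumes commits[0] is known good and commits[-1] is known bad.
    Returns (commit, number_of_tests_run).
    """
    lo, hi = 0, len(commits) - 1  # lo: known good, hi: known bad
    tests = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        tests += 1
        if is_bad(commits[mid]):
            hi = mid  # the first bad commit is at mid or earlier
        else:
            lo = mid  # the first bad commit is after mid
    return commits[hi], tests


# Hypothetical history of 16 commits in which "c11" introduced the problem.
commits = [f"c{i}" for i in range(16)]
culprit, tests = first_bad(commits, lambda c: int(c[1:]) >= 11)
```

The number of tests grows with the logarithm of the range, which is why bisecting a single stable point release is usually feasible. The search gives misleading results if the "good" end is not actually reliable.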
Comment 6 ilkka.prusi 2020-07-27 07:29:38 UTC
(In reply to Zhang Rui from comment #5)
> (In reply to ilkka.prusi from comment #4)
> > Kernel 5.5.11 seems to work better without same issues than newer kernels.
> > 
> so 5.5.11 always work well. right?
> 
> > 5.5.12 (and later) has changes in locking which might explain these issues.
> 
> Give that you've already narrow down the problem to 5.5.11 - 5.5.12.
> please run git-bisect to find out the offending commit.

Sorry I did not respond sooner; the notification got lost somewhere.

"Better" does not mean "no issues", just that this particular splat was not there before. That does not really mean anything, since any one of the crashes where the log could not be recovered might have been the same issue.

So I don't really have a known "good" configuration.

Today 5.8.0-rc7 again showed a similar splat, but at least I could get a picture of it (dmesg -w). I'll have to parse those pictures next.

Hopefully I'll get another system for comparison to rule out any hardware issues.
Comment 7 ilkka.prusi 2020-09-05 17:33:47 UTC
It looks like the problem only appears when both identical DIMMs (of a set) are installed in the computer, not when just one of them is. My guess is that the RAM is incompatible in some way.

Current -stable is also stable on the computer when only one of the DIMMs is installed.

I did try with different components (different motherboard, power supply, different display card etc.) and those did change the issue.
But it seems to be entirely a hardware problem.

I think this bug can be closed.
