Bug 215821
Summary: | kernel BUG at drivers/iommu/amd/init.c:851 amd_iommu_enable_interrupts+0x34d/0x420 when resuming from suspend to RAM | | |
---|---|---|---|
Product: | Drivers | Reporter: | Lahfa Samy (samy) |
Component: | IOMMU | Assignee: | drivers_iommu |
Status: | RESOLVED MOVED | | |
Severity: | high | CC: | matijs, samy, zheyuma97 |
Priority: | P1 | | |
Hardware: | x86-64 | | |
OS: | Linux | | |
Kernel Version: | 5.17.1-arch1-1 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Attachments: | dmesg output with initcall_debug, no_console_suspend, ignore_loglevel | | |
Follow-up on this bug: it went away on a recent kernel, but 5.17.9-arch1-1 now triggers another bug, for which I will file a separate report and link it to this one. The new bug is still in the AMD IOMMU code, inside the same function as in this report, but the oops trace looks slightly different this time.
Created attachment 300727
dmesg output with initcall_debug, no_console_suspend, ignore_loglevel

I believe this bug started recently, on kernel 5.17.x. I should downgrade to confirm this, but I'm fairly confident the issue wasn't present before my recent kernel upgrade. So far, testing under the Linux LTS kernel 5.15.32-arch1-1 shows that the issue is not present there.

The hardware is a ThinkPad T495 with an AMD Ryzen 7 PRO 3700U and a Radeon Vega RX10.
Currently installed linux-firmware: 20220209.6342082-1

If an X11 server is running, resuming fails: the screen never comes back on. Oddly, resuming from a TTY does work; I'm not sure whether that detail is relevant.

The attached dmesg was captured with the cmdline options initcall_debug, no_console_suspend and ignore_loglevel, to get as much output as possible.

Here are some relevant logs:

[ 82.540316] ACPI: PM: Preparing to enter system sleep state S3
[ 82.547782] ACPI: EC: event blocked
[ 82.547784] ACPI: EC: EC stopped
[ 82.547785] ACPI: PM: Saving platform NVS memory
[ 82.548228] Disabling non-boot CPUs ...
[ 82.550506] smpboot: CPU 1 is now offline
[ 82.553132] smpboot: CPU 2 is now offline
[ 82.555485] smpboot: CPU 3 is now offline
[ 82.557593] smpboot: CPU 4 is now offline
[ 82.559873] smpboot: CPU 5 is now offline
[ 82.561829] smpboot: CPU 6 is now offline
[ 82.563933] smpboot: CPU 7 is now offline
[ 82.565077] ACPI: PM: Low-level resume complete
[ 82.565107] ACPI: EC: EC started
[ 82.565108] ACPI: PM: Restoring platform NVS memory
[ 83.718277] ------------[ cut here ]------------
[ 83.718278] WARNING: CPU: 0 PID: 2572 at drivers/iommu/amd/init.c:851 amd_iommu_enable_interrupts+0x34d/0x420
[ 83.718290] Modules linked in: ccm cmac algif_hash algif_skcipher af_alg bnep lm92 uvcvideo videobuf2_vmalloc videobuf2_memops videobuf2_v4l2 videobuf2_common videodev mc btusb btrtl btbcm btintel btmtk bluetooth intel_rapl_msr ecdh_generic joydev mousedev intel_rapl_common crc16 edac_mce_amd snd_sof_amd_renoir snd_acp_config kvm_amd iwlmvm snd_sof_amd_acp kvm snd_sof_pci irqbypass snd_sof mac80211 snd_ctl_led snd_soc_acpi crct10dif_pclmul snd_hda_codec_realtek think_lmi crc32_pclmul libarc4 snd_hda_codec_hdmi snd_hda_codec_generic firmware_attributes_class crc32c_intel snd_soc_core ghash_clmulni_intel snd_hda_intel aesni_intel wmi_bmof snd_compress snd_intel_dspcfg iwlwifi snd_intel_sdw_acpi crypto_simd ac97_bus vfat snd_hda_codec snd_pcm_dmaengine cryptd iwlmei fat rapl snd_hda_core snd_pci_acp6x thinkpad_acpi snd_pci_acp5x snd_hwdep tpm_crb ledtrig_audio snd_pcm cfg80211 psmouse sp5100_tco platform_profile snd_rn_pci_acp3x ucsi_acpi zenpower(OE) snd_timer tpm_tis rfkill i2c_piix4
[ 83.718366] typec_ucsi snd ipmi_devintf typec snd_pci_acp3x tpm_tis_core ccp mei ipmi_msghandler r8168(OE) soundcore roles wmi tpm video rng_core i2c_scmi pinctrl_amd mac_hid acpi_cpufreq sg crypto_user acpi_call(OE) fuse bpf_preload ip_tables x_tables usbhid zfs(POE) zunicode(POE) zzstd(OE) zlua(OE) zavl(POE) icp(POE) zcommon(POE) znvpair(POE) spl(OE) serio_raw atkbd libps2 sdhci_pci cqhci sdhci xhci_pci xhci_pci_renesas mmc_core i8042 serio radeon amdgpu gpu_sched drm_ttm_helper ttm
[ 83.718413] CPU: 0 PID: 2572 Comm: systemd-sleep Tainted: P OE 5.17.1-arch1-1 #1 0ea933cb6bfe82a8dc16ab834a4bccdd297f98b7
[ 83.718418] Hardware name: LENOVO 20NKS28F00/20NKS28F00, BIOS R12ET55W(1.25 ) 07/06/2020
[ 83.718421] RIP: 0010:amd_iommu_enable_interrupts+0x34d/0x420
[ 83.718427] Code: ff ff 49 8b 7f 18 89 04 24 e8 9f 36 ee ff 8b 04 24 e9 4b fd ff ff 0f 0b 4d 8b 3f 49 81 ff 50 09 56 99 0f 85 05 fd ff ff eb 96 <0f> 0b 4d 8b 3f 49 81 ff 50 09 56 99 0f 85 f1 fc ff ff eb 82 31 f6
[ 83.718429] RSP: 0018:ffffa787405cbc68 EFLAGS: 00010046
[ 83.718432] RAX: 00000001262cdc89 RBX: 0000000000000000 RCX: 0000000000000000
[ 83.718434] RDX: 000000000000607e RSI: 00000000000059ae RDI: 00000001262c7c0b
[ 83.718436] RBP: 0000000080000000 R08: 0000000000000000 R09: 000000000000000f
[ 83.718437] R10: 0000000079726f6d R11: 000000006d656d20 R12: 000ffffffffffff8
[ 83.718439] R13: 0800000000000000 R14: ffffa787405cbc70 R15: ffff95d48004a800
[ 83.718441] FS: 00007fb3d354fe80(0000) GS:ffff95d76fa00000(0000) knlGS:0000000000000000
[ 83.718443] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 83.718445] CR2: 00007f42204d6ad0 CR3: 000000012dbe8000 CR4: 00000000003506f0
[ 83.718447] Call Trace:
[ 83.718450] <TASK>
[ 83.718455] ? early_enable_iommus+0x1c5/0x300
[ 83.718460] ? enable_iommus_v2+0x8e/0x130
[ 83.718464] syscore_resume+0x4b/0x160
[ 83.718469] suspend_devices_and_enter+0x6d3/0x7d0
[ 83.718476] pm_suspend.cold+0x2fb/0x342
[ 83.718482] state_store+0x71/0xd0
[ 83.718487] kernfs_fop_write_iter+0x11c/0x1b0
[ 83.718493] new_sync_write+0x15c/0x1f0
[ 83.718500] vfs_write+0x1eb/0x280
[ 83.718503] ksys_write+0x67/0xe0
[ 83.718506] do_syscall_64+0x5c/0x80
[ 83.718511] ? do_syscall_64+0x69/0x80
[ 83.718513] ? exc_page_fault+0x72/0x170
[ 83.718517] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 83.718522] RIP: 0033:0x7fb3d3f44257
[ 83.718526] Code: 0f 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[ 83.718528] RSP: 002b:00007ffeda5645a8 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 83.718531] RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007fb3d3f44257
[ 83.718532] RDX: 0000000000000004 RSI: 00007ffeda564690 RDI: 0000000000000004
[ 83.718534] RBP: 00007ffeda564690 R08: 000055ba9c2d1230 R09: 0000000000000000
[ 83.718535] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000004
[ 83.718536] R13: 000055ba9c2cd3c0 R14: 0000000000000004 R15: 00007fb3d403d7a0
[ 83.718540] </TASK>
[ 83.718541] ---[ end trace 0000000000000000 ]---
[ 83.719139] Enabling non-boot CPUs ...
[ 83.719211] x86: Booting SMP configuration:

A bug report was also filed downstream at bugs.archlinux.org.
For any further information, feel free to reach out to me in the comments.
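
For anyone who wants to check their own dmesg for the same warning, here is a minimal sketch that pulls the source location and function out of a `WARNING:` line. The regular expression, the `find_warnings` helper and the `sample` string are my own illustration (not part of any kernel tooling); the sample lines are copied from the log above, and the pattern may need adjusting for other warning formats.

```python
import re

# Hypothetical pattern for lines of the form:
#   WARNING: CPU: N PID: M at <file>:<line> <func>+<offset>/<size>
WARN_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) at "
    r"(?P<file>\S+):(?P<line>\d+) "
    r"(?P<func>[\w.]+)\+(?P<offset>0x[0-9a-f]+)/(?P<size>0x[0-9a-f]+)"
)


def find_warnings(dmesg_text):
    """Return one dict of captured fields per WARNING line in dmesg_text."""
    return [m.groupdict() for m in WARN_RE.finditer(dmesg_text)]


# Two lines copied from the log in this report.
sample = (
    "[ 83.718277] ------------[ cut here ]------------\n"
    "[ 83.718278] WARNING: CPU: 0 PID: 2572 at drivers/iommu/amd/init.c:851 "
    "amd_iommu_enable_interrupts+0x34d/0x420\n"
)

for w in find_warnings(sample):
    print(f"{w['file']}:{w['line']} in {w['func']} (PID {w['pid']})")
```

To run it against a full log, read the saved dmesg file into a string and pass that to `find_warnings` instead of `sample`.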