Bug 208489
Summary: | amdgpu: kernel oops when overclocking Vega M GPU (i7-8809G) | ||
---|---|---|---|
Product: | Drivers | Reporter: | crab2313 |
Component: | Video(DRI - non Intel) | Assignee: | drivers_video-dri |
Status: | RESOLVED CODE_FIX | ||
Severity: | normal | CC: | crab2313 |
Priority: | P1 | ||
Hardware: | x86-64 | ||
OS: | Linux | ||
Kernel Version: | 5.7.7 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: | full kernel dmesg |

Ok. The fix has been pushed to upstream.

Created attachment 290165 [details]
full kernel dmesg

CPU: Intel(R) Core(TM) i7-8809G CPU @ 3.10GHz
Intel Hades Canyon NUC Kit

❯ cat /sys/bus/pci/drivers/amdgpu/0000:01:00.0/pp_od_clk_voltage
OD_SCLK:
0: 225MHz 750mV
1: 400MHz 750mV
2: 535MHz 750mV
3: 715MHz 750mV
4: 960MHz 750mV
5: 1080MHz 750mV
6: 1140MHz 750mV
7: 1250MHz 750mV
OD_MCLK:
0: 300MHz 750mV
1: 500MHz 750mV
2: 800MHz 800mV
OD_RANGE:
SCLK: 225MHz 1600MHz
MCLK: 300MHz 1000MHz
VDDC: 750mV 750mV

After running this script:

#!/bin/sh
sudo sh -c "echo 's 7 1250 750' > /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/pp_od_clk_voltage"
sudo sh -c "echo 'c' > /sys/devices/pci0000:00/0000:00:01.0/0000:01:00.0/pp_od_clk_voltage"

the kernel oopses with the following dmesg:

[ 4.932714] Bluetooth: RFCOMM TTY layer initialized
[ 4.932722] Bluetooth: RFCOMM socket layer initialized
[ 4.932725] Bluetooth: RFCOMM ver 1.11
[ 9.120298] rfkill: input handler enabled
[ 9.922018] fuse: init (API version 7.31)
[ 10.492078] rfkill: input handler disabled
[ 12.680512] wlp6s0: authenticate with 50:d2:f5:f1:12:ed
[ 12.690803] wlp6s0: send auth to 50:d2:f5:f1:12:ed (try 1/3)
[ 12.728470] wlp6s0: authenticated
[ 12.728864] wlp6s0: associate with 50:d2:f5:f1:12:ed (try 1/3)
[ 12.759696] wlp6s0: RX AssocResp from 50:d2:f5:f1:12:ed (capab=0x31 status=0 aid=2)
[ 12.762966] wlp6s0: associated
[ 13.100624] IPv6: ADDRCONF(NETDEV_CHANGE): wlp6s0: link becomes ready
[ 606.958453] BUG: unable to handle page fault for address: ffff9032a4c849a4
[ 606.958455] #PF: supervisor read access in kernel mode
[ 606.958456] #PF: error_code(0x0000) - not-present page
[ 606.958457] PGD 173c01067 P4D 173c01067 PUD 0
[ 606.958459] Oops: 0000 [#1] PREEMPT SMP PTI
[ 606.958460] CPU: 7 PID: 2337 Comm: bash Not tainted 5.7.7-zen1-1-zen #1
[ 606.958461] Hardware name: Intel Corporation NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0054.2019.0214.1350 02/14/2019
[ 606.958528] RIP: 0010:phm_find_closest_vddci+0x3b/0x60 [amdgpu]
[ 606.958529] Code: c0 eb 09 48 83 c0 01 48 39 d0 74 19 44 0f b7 44 c3 0c 89 c5 66 41 39 f0 72 e9 44 89 c0 5b 5d c3 bd ff ff ff ff 0f 1f 44 00 00 <44> 0f b7 44 eb 0c 5b 5d 44 89 c0 c3 48 c7 c6 f0 c1 96 c0 48 c7 c7
[ 606.958530] RSP: 0018:ffffa3ecc18ff948 EFLAGS: 00010246
[ 606.958531] RAX: 00000000000002ee RBX: ffff902aa4c849a0 RCX: 0000000000000008
[ 606.958532] RDX: 0000000000000000 RSI: 0000000000000226 RDI: ffff902aa4c849a0
[ 606.958532] RBP: 00000000ffffffff R08: ffffa3ecc18ff9d4 R09: 0000000000000029
[ 606.958533] R10: 000000000000e401 R11: 0000000000000000 R12: ffff902aa6861600
[ 606.958534] R13: ffff902aa4c84000 R14: ffff902aa4c85301 R15: ffffa3ecc18ff9d4
[ 606.958534] FS: 00007fe8233c0b80(0000) GS:ffff902aaedc0000(0000) knlGS:0000000000000000
[ 606.958535] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 606.958536] CR2: ffff9032a4c849a4 CR3: 0000000467468005 CR4: 00000000003606e0
[ 606.958536] Call Trace:
[ 606.958587] vegam_get_dependency_volt_by_clk.isra.0+0x8e/0x220 [amdgpu]
[ 606.958637] vegam_populate_all_graphic_levels+0x26a/0x960 [amdgpu]
[ 606.958686] smu7_set_power_state_tasks+0x77c/0x12b0 [amdgpu]
[ 606.958734] phm_set_power_state+0x5a/0x80 [amdgpu]
[ 606.958784] psm_adjust_power_state_dynamic+0xca/0x1d0 [amdgpu]
[ 606.958831] hwmgr_handle_task+0x49/0xf0 [amdgpu]
[ 606.958882] pp_dpm_dispatch_tasks+0x3a/0x60 [amdgpu]
[ 606.958915] amdgpu_set_pp_od_clk_voltage+0x3cb/0x490 [amdgpu]
[ 606.958921] kernfs_fop_write+0xce/0x1b0
[ 606.958923] vfs_write+0x10a/0x420
[ 606.958925] __x64_sys_write+0x6d/0xf0
[ 606.958926] do_syscall_64+0x4e/0x160
[ 606.958928] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 606.958930] RIP: 0033:0x7fe823523b57
[ 606.958931] Code: 0c 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff eb b7 0f 1f 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 51 c3 48 83 ec 28 48 89 54 24 18 48 89 74 24
[ 606.958931] RSP: 002b:00007ffea3ba5e88 EFLAGS: 00000246 ORIG_RAX: 0000000000000001
[ 606.958932] RAX: ffffffffffffffda RBX: 0000000000000002 RCX: 00007fe823523b57
[ 606.958933] RDX: 0000000000000002 RSI: 00005648e5dda620 RDI: 0000000000000001
[ 606.958934] RBP: 00005648e5dda620 R08: 000000000000000a R09: 0000000000000001
[ 606.958934] R10: 00005648e5d20870 R11: 0000000000000246 R12: 0000000000000002
[ 606.958935] R13: 00007fe8235f4500 R14: 0000000000000002 R15: 00007fe8235f4700
[ 606.958936] Modules linked in: ccm fuse rfcomm xt_CHECKSUM xt_MASQUERADE xt_conntrack cmac algif_hash ipt_REJECT nf_reject_ipv4 algif_skcipher af_alg xt_tcpudp ip6table_mangle ip6table_nat iptable_mangle iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter tun mousedev input_leds hid_generic joydev usbhid hid xpad ff_memless bridge stp llc bnep msr intel_rapl_msr intel_rapl_common x86_pkg_temp_thermal intel_powerclamp coretemp iwlmvm kvm_intel snd_hda_codec_realtek kvm snd_hda_codec_generic iTCO_wdt mac80211 iTCO_vendor_support irqbypass 8250_dw ledtrig_audio snd_hda_codec_hdmi mei_hdcp nls_iso8859_1 tps6598x crct10dif_pclmul typec libarc4 wmi_bmof crc32_pclmul nls_cp437 ghash_clmulni_intel intel_wmi_thunderbolt vfat aesni_intel snd_hda_intel btusb fat btrtl snd_intel_dspcfg iwlwifi btbcm crypto_simd cryptd glue_helper snd_hda_codec intel_cstate btintel intel_uncore snd_hda_core intel_rapl_perf pcspkr
[ 606.958955] e1000e i2c_i801 cfg80211 snd_hwdep bluetooth igb snd_pcm mei_me ecdh_generic intel_lpss_pci rfkill dca snd_timer ecc intel_lpss mei idma64 intel_pch_thermal snd tpm_crb soundcore wmi tpm_tis i2c_multi_instantiate tpm_tis_core evdev tpm rng_core mac_hid tcp_bbr sch_cake sg crypto_user ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 sdhci_pci cqhci xhci_pci sdhci crc32c_intel xhci_hcd mmc_core amdgpu gpu_sched i2c_algo_bit ttm drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core drm agpgart
[ 606.958968] CR2: ffff9032a4c849a4
[ 606.958970] ---[ end trace d28ac9f0a176b773 ]---
[ 606.959020] RIP: 0010:phm_find_closest_vddci+0x3b/0x60 [amdgpu]
[ 606.959021] Code: c0 eb 09 48 83 c0 01 48 39 d0 74 19 44 0f b7 44 c3 0c 89 c5 66 41 39 f0 72 e9 44 89 c0 5b 5d c3 bd ff ff ff ff 0f 1f 44 00 00 <44> 0f b7 44 eb 0c 5b 5d 44 89 c0 c3 48 c7 c6 f0 c1 96 c0 48 c7 c7
[ 606.959022] RSP: 0018:ffffa3ecc18ff948 EFLAGS: 00010246
[ 606.959023] RAX: 00000000000002ee RBX: ffff902aa4c849a0 RCX: 0000000000000008
[ 606.959023] RDX: 0000000000000000 RSI: 0000000000000226 RDI: ffff902aa4c849a0
[ 606.959024] RBP: 00000000ffffffff R08: ffffa3ecc18ff9d4 R09: 0000000000000029
[ 606.959024] R10: 000000000000e401 R11: 0000000000000000 R12: ffff902aa6861600
[ 606.959025] R13: ffff902aa4c84000 R14: ffff902aa4c85301 R15: ffffa3ecc18ff9d4
[ 606.959026] FS: 00007fe8233c0b80(0000) GS:ffff902aaedc0000(0000) knlGS:0000000000000000
[ 606.959027] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 606.959027] CR2: ffff9032a4c849a4 CR3: 0000000467468005 CR4: 00000000003606e0
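
A note for anyone reading the oops above: the faulting address in CR2 can be reproduced from the register dump, which hints at where the bad pointer comes from. The sketch below is only an illustration. The scale of 8 and the displacement of 0xc come from hand-decoding the bytes marked `<44> 0f b7 44 eb 0c` in the Code: line as a 16-bit load from RBX + RBP*8 + 0xc; that decoding is an assumption and is not stated in the report. The helper name `effective_address` is made up for this example.

```python
MASK64 = (1 << 64) - 1


def effective_address(base: int, index: int, scale: int, disp: int) -> int:
    """Compute an x86-64 style base + index*scale + disp, wrapped to 64 bits."""
    return (base + index * scale + disp) & MASK64


# Register values copied from the oops.
RBX = 0xFFFF902AA4C849A0  # base register of the faulting load
RBP = 0x00000000FFFFFFFF  # index register of the faulting load
CR2 = 0xFFFF9032A4C849A4  # address the kernel reported as faulting

# Assumption: scale 8 and displacement 0xc, from decoding the faulting
# instruction bytes by hand.
addr = effective_address(RBX, RBP, 8, 0xC)

print(hex(addr))    # 0xffff9032a4c849a4
print(addr == CR2)  # True
```

The computed address matches CR2, and the index value 0xffffffff is what an unsigned 32-bit `i - 1` gives when `i` is 0. One plausible reading, not confirmed by anything in this report, is that phm_find_closest_vddci was handed an empty table and indexed one entry before its start.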