Bug 201077
Summary: | plugging in ac adapter causes amdgpu powerplay error, then crashes the system. kernel 4.19 fails to boot | ||
---|---|---|---|
Product: | Drivers | Reporter: | Utku Helvacı (proje.pdf) |
Component: | Video(DRI - non Intel) | Assignee: | drivers_video-other |
Status: | RESOLVED UNREPRODUCIBLE | ||
Severity: | blocking | CC: | alexdeucher, konoha02, proje.pdf, Teofilis.Martisius |
Priority: | P1 | ||
Hardware: | Intel | ||
OS: | Linux | ||
Kernel Version: | kernel versions newer than 4.16 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Attachments: |
Log file for a kernel that works from original reporter
Log file for a kernel that fails from original reporter
dmesg while charging
dmesg with battery "fully charged"
dmesg no ac adapter
faulty kernel with amdgpu.dpm=0 parameter
faulty kernel without amdgpu.dpm=0 parameter
Kernel 4.20-rc6 dmesg output
dmesg: mce: [Hardware Error]
sensors
sensors detect |
Description
Utku Helvacı
2018-09-10 14:53:35 UTC
journalctl -b-1 output: http://termbin.com/l8po
This is the output of 4.19-rc3, which gets stuck at a black screen with the fans running at full speed.

(In reply to Utku Helvacı from comment #1)
> journalctl -b-1 output: http://termbin.com/l8po
> this is output of 4.19-rc3, stuck at black screen and fans runs at full speed

OK, here is all the information I can give:
On the reddit page we tried disabling TPM device encryption, and it did not work.
On the manjaro page we tried disabling tlp; that made the system more stable, but the issues still occurred.
The amdgpu dc problem is not related to this one: I confirmed this by adding amdgpu.dc=0 to the kernel parameters, and nothing changed.
Nothing happened on the archlinux forum.
The issue first appears on 4.17 and 4.18; 4.19 is unbootable.
journalctl output: http://termbin.com/x630

This error message was shown while booting back into 4.16, after I had booted 4.18 and it crashed:
CPU 0: Machine Check: 0 Bank 4: fe00000000070f0f
TSC 0 ADDR d123b184 MISC d012000001000000
PROCESSOR 2:660f51 TIME 1537382400 SOCKET 0 APIC 0 microcode 6006118
journalctl -b0: http://termbin.com/ylf35
journalctl -b-1: http://termbin.com/sefi

I tested 4.19-rc4: it boots successfully but is still unusable.

Can you use git to bisect to identify what commit caused the problem?

I haven't done a git bisect before, but I will try my best. I am attending university and my computer has only 4 threads, so it may take a while.
Is there a way to skip the first 100 commits of powerplay and drm and then fall back 50 commits? I don't know the commands.

(In reply to Alex Deucher from comment #5)
> Can you use git to bisect to identify what commit caused the problem?

OK, I have found the specific commit by bisecting the kernel:
https://github.com/torvalds/linux/commit/320b164abb32db876866a4ff8c2cb710524ac6ea

(In reply to Utku Helvacı from comment #7)
> (In reply to Alex Deucher from comment #5)
> > Can you use git to bisect to identify what commit caused the problem?
>
> OK I have found the specific commit by bisecting the kernel
> https://github.com/torvalds/linux/commit/320b164abb32db876866a4ff8c2cb710524ac6ea

Checked again: this causes the bug. Here is the dmesg output: http://termbin.com/djzke

(In reply to Alex Deucher from comment #5)
> Can you use git to bisect to identify what commit caused the problem?

I have already found the commit. I am not sure you got the message because of the different timezone. I apologize if this is spamming.

Created attachment 279381 [details]
Log file for a kernel that works from original reporter
Created attachment 279383 [details]
Log file for a kernel that fails from original reporter
Created attachment 279647 [details]
dmesg while charging
I have the same laptop, but my system behaves differently. I'm using Debian Stretch with kernel 4.18.6 from backports (firmware and mesa drivers also from there).
When using the AC adapter, the system tends to freeze entirely from time to time, but the freeze lasts only a few seconds and happens rarely; once it is over, the system continues to work correctly.
Still, every time I use the AC adapter I expect freezes and the fan struggling, except nothing crashes, perhaps because I'm using Debian stable.
I get the following in dmesg:
[drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <uvd_v6_0> failed -110
[drm:amdgpu_device_resume [amdgpu]] *ERROR* amdgpu_device_ip_resume failed (-110).
These errors seem to go away when the battery reaches its maximum charge.
I'm addressing another bug, https://bugzilla.kernel.org/show_bug.cgi?id=201305 (I can't hibernate my laptop), where I report as well that my system never seems to use its maximum battery capacity: 28.0/32.2 Wh. The whole battery behaves strangely, although I can even use offline charging (with the laptop shut down).
I also have an unrelated bug with WiFi, since I have a Qualcomm card (the original reporter seems to have Intel, so I suppose it doesn't even matter).
Created attachment 279649 [details]
dmesg with battery "fully charged"
Battery "fully charged" dmesg (it sometimes says it is fully charged when it's at 98% or 99%, btw).
The laptop stops freezing after this, but the fan seems to be working twice as hard. Laptop gets hot ironically.
Created attachment 279651 [details]
dmesg no ac adapter
I guess it is worth uploading my dmesg with no AC adapter at all.
This is how I regularly use my laptop. No freezes, fan is on but I can't even hear it. Even if I'm using videogame emulators, or watching Full HD media, multiple monitors, etc, the laptop is not loud, although, when using heavy graphics the fan behaves as expected going hard, cooling the laptop just fine.
When using AC adapter for charging, the fan goes up and down, even if not using GPU.
Without AC, if the laptop is not being used and it's cool, the fan will stop working, as expected, and go back to work if needed. It works quite well without AC tho.
I'd like to be emphatic that if the battery is fully charged and the AC adapter is still connected, the fan will behave in a wrong way, but the system will not freeze. While if the battery is still charging the system will freeze from time to time (dmesg related I suppose) and the fan will behave like I already said.
This laptop's power management seems to be rather buggy, battery definitely lasts longer with OEM Windows 10. Tried asking Acer but they don't support Linux (I already knew, not a shock).
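To compare the three captures attached above (charging, fully charged, no AC adapter), one could count the amdgpu error lines in each file. This is only a minimal sketch, assuming the attachments were saved locally as text; the function name `count_amdgpu_errors` is hypothetical.

```shell
#!/bin/sh
# count_amdgpu_errors FILE: print how many lines of a saved dmesg capture
# contain an amdgpu "*ERROR*" message (hypothetical helper, not part of any tool).
count_amdgpu_errors() {
    # grep -c prints the number of matching lines (0 if there are none).
    # In a basic regular expression, \* matches a literal asterisk.
    grep -c 'amdgpu.*\*ERROR\*' "$1"
}
```

Calling it on each saved capture, for example `count_amdgpu_errors dmesg-charging.txt` (a placeholder file name), would show whether the resume errors appear only while the battery is charging.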
So you use Acer too; what model exactly? We might be using the same model.
As a temporary solution I use the 4.16 realtime kernel from the manjaro repositories.
4.16 is the latest working kernel for me.
After 4.20 is released I will try to mail lkml again.
I don't think this is Acer's fault. I think it has more to do with AMD not accounting for the RX 540 GPU when the commit was made, or something else, because older kernels work just fine.

(In reply to Neil from comment #12)
Can you try benchmarking the 4.16 kernel and the current kernel with glmark2 while the AC adapter is plugged in? 4.18 seemed normal to me, but it slowed down gaming performance heavily.

(In reply to Utku Helvacı from comment #15)
Hi, I have the Acer Aspire A515 41G, a good laptop. But in my case I can't use it with anything below kernel 4.14; otherwise it will not boot at all, or the fans will always be on (kernel 4.9, for example, boots, but the fans always run at high speed).
At the moment I'm not able to test a different kernel; I will do so later. Looking forward to kernel 4.21, because it seems to have big AMD/POLARIS12 changes.
Anyway, I found a workaround for the weird AC adapter behavior I'm having: if I use "amdgpu.dpm=0", the freezes/hangs are gone, but I still have to reboot the laptop in order to set the boot parameter. I don't use the parameter for normal use (no adapter) because I presume it might drain my battery even faster, and I also find it unnecessary.
Hope the RX 540 gets better on Linux.

I still have to use "DRI_PRIME=1 xxx" with some programs so they use my RX 540 instead of the Radeon R7 Graphics. I would recommend you try DRI_PRIME=1 to ensure that the program you need is using your dedicated graphics card (RX 540) instead of the integrated Radeon R7. With some programs I see HUGE improvements in fps; still, I believe this card can perform even better. At least it does with Windows 10.
Also, I cannot recommend enough that you try amdgpu.dpm=0, since I can confirm that I no longer get these errors while charging:
[drm:amdgpu_device_ip_resume_phase2 [amdgpu]] *ERROR* resume of IP block <uvd_v6_0> failed -110
[drm:amdgpu_device_resume [amdgpu]] *ERROR* amdgpu_device_ip_resume failed (-110).
But like I said, it's sad that at the moment the laptop needs different parameters for charging and for normal use.
I suppose vga_switcheroo could always do a better job to avoid having to set DRI_PRIME=1 with certain software, but this is way beyond my knowledge; hopefully it also gets fixed with the next amdgpu patches. (And hopefully I'm going to be able to hibernate my machine without any parameters (nomodeset) too) :(

On xubuntu, "amdgpu.dpm=0" did remove the IP resume errors, but I couldn't launch Steam; it caused the windows on screen to stop updating (only the mouse cursor moved, while apps kept running in the background). I will test on manjaro too.
Created attachment 279901 [details]
faulty kernel with amdgpu.dpm=0 parameter
This is the kernel including the faulty commit. The parameter does prevent the crash, but it also lowers the fps in games to the level seen with the AC adapter unplugged, even though it is in fact plugged in.
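Because `amdgpu.dpm=0` only takes effect when it is on the kernel command line at boot, it can help to confirm which of the two attached kernels actually ran with it. A minimal sketch, assuming a Linux system that exposes `/proc/cmdline`; the helper name `has_kernel_param` is hypothetical.

```shell
#!/bin/sh
# has_kernel_param PARAM [FILE]: succeed if PARAM appears as a whole token in
# the kernel command line. FILE defaults to /proc/cmdline; passing another
# file is only meant for checking a saved copy.
has_kernel_param() {
    param=$1
    file=${2:-/proc/cmdline}
    # Put each space-separated token on its own line, then require an exact,
    # fixed-string, whole-line match so "amdgpu.dpm=01" would not count.
    tr ' ' '\n' < "$file" | grep -qxF -- "$param"
}
```

For example, `has_kernel_param amdgpu.dpm=0 && echo "dpm disabled"` reports whether the running kernel was booted with the workaround. On GRUB-based distributions the parameter is usually added to `GRUB_CMDLINE_LINUX_DEFAULT` in `/etc/default/grub`; that detail is an assumption about the reporters' setups, not something stated in this thread.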
Created attachment 279903 [details]
faulty kernel without amdgpu.dpm=0 parameter
Created attachment 280021 [details]
Kernel 4.20-rc6 dmesg output
(In reply to Utku Helvacı from comment #21)
> Created attachment 280021 [details]
> Kernel 4.20-rc6 dmesg output

Kernel 4.20 is the same: heavy frame drops, low performance, freezes.
Created attachment 281235 [details]
dmesg: mce: [Hardware Error]
I was experimenting a bit with this laptop and noticed that 'DRI_PRIME=1 xxx' without the AC adapter works as badly as using the integrated card. But if I connect the AC adapter while the videogame or graphical software is running with 'DRI_PRIME=1', the performance automatically increases drastically. In fact, this is the very first time I have seen videogames running well at their correct speed even when they are graphically demanding. It's even better than Windows 10 performance lol; most games work correctly, even emulators.
The thing is, if the AC adapter is still connected when you exit the videogame or whatever, the system will freeze entirely, followed by an abrupt reboot. The freeze or hang is very similar to those caused by the 'drm:uvd_v6_0_enc_ring_test_ring [amdgpu]' error, but the reboot is totally new in my experience with this laptop.
When the system came back I noticed the following during boot
[ 0.967971] mce: [Hardware Error]: Machine check events logged
[ 0.967971] mce: [Hardware Error]: CPU 0: Machine Check: 0 Bank 4: fe00000000070f0f
[ 0.967971] mce: [Hardware Error]: TSC 0 ADDR d143b184 MISC d012000001000000
[ 0.967971] mce: [Hardware Error]: PROCESSOR 2:660f51 TIME 1550686386 SOCKET 0 APIC 0 microcode 600611a
Will attach my dmesg file.
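The status word in the machine check above can be unpacked by hand. This is a rough sketch, assuming the standard x86 MCi_STATUS layout (bits 63 down to 57 are VAL, OVER, UC, EN, MISCV, ADDRV and PCC, and bits 15 to 0 hold the MCA error code); the function name `decode_mci_status` is hypothetical, and the output does not by itself say which component failed.

```shell
#!/bin/sh
# decode_mci_status HEX16: print the top flag bits and the MCA error code of a
# 64-bit MCi_STATUS value given as 16 hex digits without a 0x prefix.
decode_mci_status() {
    status=$1
    # Split into two 32-bit halves so the arithmetic stays in range even in
    # shells that only provide signed 64-bit integers.
    hi=$(printf '%s' "$status" | cut -c1-8)
    lo=$(printf '%s' "$status" | cut -c9-16)
    hi=$((0x$hi))
    lo=$((0x$lo))
    # Bits 63..57 of the register are bits 31..25 of the high half.
    bit=31
    for name in VAL OVER UC EN MISCV ADDRV PCC; do
        echo "$name=$(( (hi >> bit) & 1 ))"
        bit=$((bit - 1))
    done
    # Bits 15..0 of the register hold the MCA error code.
    printf 'MCACOD=0x%04x\n' "$((lo & 0xffff))"
}
```

Running `decode_mci_status fe00000000070f0f` reports all seven flags as 1 and `MCACOD=0x0f0f`, that is, a valid, uncorrected error with processor context corrupt, which would be consistent with the abrupt reboot described above.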
Most of the time I use amdgpu.dpm=0 to get rid of hangs during graphically non-demanding activities; I wouldn't recommend it for graphics.
I also found Debian to be the most reliable distro for this hardware.
I found another page talking about the same bug on the same hardware: https://bugs.freedesktop.org/show_bug.cgi?id=109073

OK, I have started a real bisect this time. I am using the github repo and this command to start bisecting:
git bisect start 320b164abb32db876866a4ff8c2cb710524ac6ea 0adb32858b0bddf4ada5f364a84ed60b196dbcda -- drivers/gpu/drm/amd
I had to skip 9021d2edd259d992cf8b5b48791ab50829129de7 because it didn't compile with gcc 8 or gcc 6.
https://www.reddit.com/r/linuxquestions/comments/b52l7m/how_to_use_older_gcc_with_make/
Error message:
utku@utku-Linux:/home/utku2/Programlar/ram/linux$ make CC='/usr/bin/gcc-6' -j4
CHK include/config/kernel.release
CHK include/generated/uapi/linux/version.h
DESCEND objtool
CHK include/generated/utsrelease.h
CHK scripts/mod/devicetable-offsets.h
CHK include/generated/bounds.h
CHK include/generated/timeconst.h
CHK include/generated/asm-offsets.h
CALL scripts/checksyscalls.sh
CHK include/generated/compile.h
CHK kernel/config_data.h
Building modules, stage 2.
CPUSTR arch/x86/boot/cpustr.h
CC arch/x86/boot/compressed/misc.o
RELOCS arch/x86/boot/compressed/vmlinux.relocs
HOSTCC arch/x86/boot/compressed/mkpiggy
Unsupported relocation type: R_X86_64_PLT32 (4)
make[2]: *** [arch/x86/boot/compressed/Makefile:123: arch/x86/boot/compressed/vmlinux.relocs] Error 1
make[2]: *** Waiting for unfinished jobs....
MODPOST 179 modules
make[1]: *** [arch/x86/boot/Makefile:112: arch/x86/boot/compressed/vmlinux] Error 2
make: *** [arch/x86/Makefile:299: bzImage] Error 2
make: *** Waiting for unfinished jobs....

bisect log so far:
[utku2@utku2 linux]$ git bisect log
# bad: [320b164abb32db876866a4ff8c2cb710524ac6ea] Merge tag 'drm-for-v4.17' of git://people.freedesktop.org/~airlied/linux
# good: [0adb32858b0bddf4ada5f364a84ed60b196dbcda] Linux 4.16
git bisect start '320b164abb32db876866a4ff8c2cb710524ac6ea' '0adb32858b0bddf4ada5f364a84ed60b196dbcda' '--' 'drivers/gpu/drm/amd'
# skip: [9021d2edd259d992cf8b5b48791ab50829129de7] drm/amdgpu: mitigate workaround for i915
git bisect skip 9021d2edd259d992cf8b5b48791ab50829129de7

Sooo... I failed to compile the next commit even though I tried gcc-6, gcc-8 and clang, on both xubuntu and manjaro. I also couldn't find a way to use older binutils. It gives the exact same error.
If anyone can test the skipped commits, it would be really helpful. Thanks!
[utku2@utku2 linux]$ git bisect skip
Bisecting: 333 revisions left to test after this (roughly 8 steps)
[fcb7d51571e6ab542e3e6e84ae29c8f541460ca8] drm/amdgpu/sdma4: use num_instances for clock/powergating config
[utku2@utku2 linux]$ git bisect log
# bad: [320b164abb32db876866a4ff8c2cb710524ac6ea] Merge tag 'drm-for-v4.17' of git://people.freedesktop.org/~airlied/linux
# good: [0adb32858b0bddf4ada5f364a84ed60b196dbcda] Linux 4.16
git bisect start '320b164abb32db876866a4ff8c2cb710524ac6ea' '0adb32858b0bddf4ada5f364a84ed60b196dbcda' '--' 'drivers/gpu/drm/amd'
# skip: [9021d2edd259d992cf8b5b48791ab50829129de7] drm/amdgpu: mitigate workaround for i915
git bisect skip 9021d2edd259d992cf8b5b48791ab50829129de7
# skip: [4ee778dcc16b0ebbd4370a6de79c10bd88c89328] drm/amd/display: disable seamless vp adjustment for mirrored surface
git bisect skip 4ee778dcc16b0ebbd4370a6de79c10bd88c89328

I started bisecting with this command:
`git bisect start 320b164abb32db876866a4ff8c2cb710524ac6ea 0adb32858b0bddf4ada5f364a84ed60b196dbcda -- drivers/gpu/drm/amd`
While building `9021d2edd259d992cf8b5b48791ab50829129de7` I get this error:
```
utku@utku-Linux:/home/utku2/Programlar/ram/linux$ make CC='/usr/bin/gcc-6' -j4
CHK include/config/kernel.release
CHK include/generated/uapi/linux/version.h
DESCEND objtool
CHK include/generated/utsrelease.h
CHK scripts/mod/devicetable-offsets.h
CHK include/generated/bounds.h
CHK include/generated/timeconst.h
CHK include/generated/asm-offsets.h
CALL scripts/checksyscalls.sh
CHK include/generated/compile.h
CHK kernel/config_data.h
Building modules, stage 2.
CPUSTR arch/x86/boot/cpustr.h
CC arch/x86/boot/compressed/misc.o
RELOCS arch/x86/boot/compressed/vmlinux.relocs
HOSTCC arch/x86/boot/compressed/mkpiggy
Unsupported relocation type: R_X86_64_PLT32 (4)
make[2]: *** [arch/x86/boot/compressed/Makefile:123: arch/x86/boot/compressed/vmlinux.relocs] Error 1
make[2]: *** Waiting for unfinished jobs....
MODPOST 179 modules
make[1]: *** [arch/x86/boot/Makefile:112: arch/x86/boot/compressed/vmlinux] Error 2
make: *** [arch/x86/Makefile:299: bzImage] Error 2
make: *** Waiting for unfinished jobs....
```
While searching on the internet I saw that people tried building with `gcc-6` instead of the default `gcc-8`, so I gave it a shot; I even tried `clang-7`:
On `Xubuntu 18.10`, using the same .config file from `Manjaro`, `make CC='/usr/bin/gcc-6' -j4` gives the same error.
On `Manjaro`, `make CC='clang' -j4` gives the same error.
On `Manjaro` and `Xubuntu`, `make -j4` gives the same error.
After another internet search I tried `make CC='gcc -fno-pie -no-pie' -j4` on `Manjaro`, which gives the same error; `CC='gcc -fno-pie -no-pie'` probably doesn't work, but people did it this way, so I gave it a shot.
`make CFLAGS=' -no-pie' CCFLAGS=' -no-pie' CXXFLAGS=' -no-pie' -j4` on `Manjaro` gives the same error.
`make CFLAGS='-fno-pie -no-pie' CCFLAGS='-fno-pie -no-pie' CXXFLAGS='-fno-pie -no-pie' -j4` on `Manjaro` doesn't build at all and gives this error:
```
[utku2@utku2 ram]$ make CFLAGS='-fno-pie -no-pie' CCFLAGS='-fno-pie -no-pie' CXXFLAGS='-fno-pie -no-pie' -j4
SYSTBL arch/x86/include/generated/asm/syscalls_32.h
SYSHDR arch/x86/include/generated/asm/unistd_32_ia32.h
HOSTCC scripts/basic/bin2c
CHK include/config/kernel.release
SYSHDR arch/x86/include/generated/asm/unistd_64_x32.h
SYSTBL arch/x86/include/generated/asm/syscalls_64.h
UPD include/config/kernel.release
HYPERCALLS arch/x86/include/generated/asm/xen-hypercalls.h
WRAP arch/x86/include/generated/uapi/asm/bpf_perf_event.h
WRAP arch/x86/include/generated/uapi/asm/poll.h
CHK include/generated/uapi/linux/version.h
SYSHDR arch/x86/include/generated/uapi/asm/unistd_32.h
UPD include/generated/uapi/linux/version.h
CHK include/generated/utsrelease.h
SYSHDR arch/x86/include/generated/uapi/asm/unistd_64.h
UPD include/generated/utsrelease.h
SYSHDR arch/x86/include/generated/uapi/asm/unistd_x32.h
DESCEND objtool
HOSTCC /home/utku/Programlar/ram/tools/objtool/fixdep.o
HOSTLD /home/utku/Programlar/ram/tools/objtool/fixdep-in.o
LINK /home/utku/Programlar/ram/tools/objtool/fixdep
CC /home/utku/Programlar/ram/tools/objtool/exec-cmd.o
CC /home/utku/Programlar/ram/tools/objtool/help.o
CC /home/utku/Programlar/ram/tools/objtool/pager.o
CC /home/utku/Programlar/ram/tools/objtool/parse-options.o
CC /home/utku/Programlar/ram/tools/objtool/run-command.o
CC /home/utku/Programlar/ram/tools/objtool/sigchain.o
CC /home/utku/Programlar/ram/tools/objtool/subcmd-config.o
CC /home/utku/Programlar/ram/tools/objtool/arch/x86/decode.o
LD /home/utku/Programlar/ram/tools/objtool/libsubcmd-in.o
AR /home/utku/Programlar/ram/tools/objtool/libsubcmd.a
CC /home/utku/Programlar/ram/tools/objtool/builtin-check.o
CC /home/utku/Programlar/ram/tools/objtool/builtin-orc.o
LD /home/utku/Programlar/ram/tools/objtool/arch/x86/objtool-in.o
CC /home/utku/Programlar/ram/tools/objtool/check.o
CC /home/utku/Programlar/ram/tools/objtool/orc_gen.o
CC /home/utku/Programlar/ram/tools/objtool/orc_dump.o
CC /home/utku/Programlar/ram/tools/objtool/elf.o
CC /home/utku/Programlar/ram/tools/objtool/special.o
CC /home/utku/Programlar/ram/tools/objtool/objtool.o
CC /home/utku/Programlar/ram/tools/objtool/libstring.o
CC /home/utku/Programlar/ram/tools/objtool/str_error_r.o
HOSTCC arch/x86/tools/relocs_32.o
WRAP arch/x86/include/generated/asm/dma-contiguous.h
WRAP arch/x86/include/generated/asm/early_ioremap.h
WRAP arch/x86/include/generated/asm/mcs_spinlock.h
WRAP arch/x86/include/generated/asm/mm-arch-hooks.h
HOSTCC arch/x86/tools/relocs_64.o
HOSTCC arch/x86/tools/relocs_common.o
LD /home/utku/Programlar/ram/tools/objtool/objtool-in.o
LINK /home/utku/Programlar/ram/tools/objtool/objtool
/usr/bin/ld: /home/utku/Programlar/ram/tools/objtool/objtool-in.o: relocation R_X86_64_32S against symbol `inat_avx_tables' can not be used when making a PIE object; recompile with -fPIC
/usr/bin/ld: final link failed: nonrepresentable section on output
collect2: error: ld returned 1 exit status
make[2]: *** [Makefile:50: /home/utku/Programlar/ram/tools/objtool/objtool] Error 1
make[1]: *** [Makefile:63: objtool] Error 2
make: *** [Makefile:1675: tools/objtool] Error 2
make: *** Waiting for unfinished jobs....
HOSTLD arch/x86/tools/relocs
```
I also asked a question on Unix&Linux: https://unix.stackexchange.com/questions/508852/unsupported-relocation-type-r-x86-64-plt32-error-while-bisecting-kernel
If I can't find a solution to this, I will not continue bisecting until summer, because it took 5 hours just today.

OK, I finally found a way to continue bisecting and created a repository for it: https://github.com/tuxutku/rx_540_kernel_bisecting_files
I am currently using ubuntu 16.04.6 to build and manjaro to configure and install.
[utku2@utku2 linux]$ git bisect log
# bad: [320b164abb32db876866a4ff8c2cb710524ac6ea] Merge tag 'drm-for-v4.17' of git://people.freedesktop.org/~airlied/linux
# good: [0adb32858b0bddf4ada5f364a84ed60b196dbcda] Linux 4.16
git bisect start '320b164abb32db876866a4ff8c2cb710524ac6ea' '0adb32858b0bddf4ada5f364a84ed60b196dbcda' '--' 'drivers/gpu/drm/amd'
# good: [9021d2edd259d992cf8b5b48791ab50829129de7] drm/amdgpu: mitigate workaround for i915
git bisect good 9021d2edd259d992cf8b5b48791ab50829129de7
# bad: [aa5a5777304228819a52562d346bc3eb1b4873fa] drm/amd/display: Vari-bright looks disabled near end of MM14
git bisect bad aa5a5777304228819a52562d346bc3eb1b4873fa
# good: [81988f9c3d9907d7df0ea97e8e4842064b88b7b8] drm/amdgpu: use separate status for buffer funcs availability v2
git bisect good 81988f9c3d9907d7df0ea97e8e4842064b88b7b8

BIG UPDATE!: I have finally finished bisecting; all that is left now is to send e-mails! You can visit https://github.com/tuxutku/rx_540_kernel_bisecting_files to see all logs and the bisect history!

(In reply to Utku Helvacı from comment #31)
> BIG UPDATE!:
> I have finished bisecting finally, now its all left to send e-mails !
> you can visit: https://github.com/tuxutku/rx_540_kernel_bisecting_files to
> see all logs and bisect history!

Thank you. Please post your results to this bug as well so we can track it in one place.

[utku2@utku2 linux]$ git bisect bad
e1deba285156fb4023bb48f22068de5b60e34e15 is the first bad commit
commit e1deba285156fb4023bb48f22068de5b60e34e15
Author: Rex Zhu <Rex.Zhu@amd.com>
Date: Tue Feb 27 18:27:54 2018 +0800

    drm/amd/pp: Use amdgpu acpi helper functions in powerplay

    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
    Signed-off-by: Rex Zhu <Rex.Zhu@amd.com>
    Signed-off-by: Alex Deucher <alexander.deucher@amd.com>

:040000 040000 b8e6fd4f5269564ff9dba96359c4f26b02f68b4a e3555ae88191d1a060f67998dabf977551d3a127 M drivers
[utku2@utku2 linux]$ git bisect log
# bad: [320b164abb32db876866a4ff8c2cb710524ac6ea] Merge tag 'drm-for-v4.17' of git://people.freedesktop.org/~airlied/linux
# good: [0adb32858b0bddf4ada5f364a84ed60b196dbcda] Linux 4.16
git bisect start '320b164abb32db876866a4ff8c2cb710524ac6ea' '0adb32858b0bddf4ada5f364a84ed60b196dbcda' '--' 'drivers/gpu/drm/amd'
# good: [9021d2edd259d992cf8b5b48791ab50829129de7] drm/amdgpu: mitigate workaround for i915
git bisect good 9021d2edd259d992cf8b5b48791ab50829129de7
# bad: [aa5a5777304228819a52562d346bc3eb1b4873fa] drm/amd/display: Vari-bright looks disabled near end of MM14
git bisect bad aa5a5777304228819a52562d346bc3eb1b4873fa
# good: [81988f9c3d9907d7df0ea97e8e4842064b88b7b8] drm/amdgpu: use separate status for buffer funcs availability v2
git bisect good 81988f9c3d9907d7df0ea97e8e4842064b88b7b8
# bad: [128ccceaba8656573b8b0f86d3ab6e38094cc754] Merge branch 'drm-next-4.17' of git://people.freedesktop.org/~agd5f/linux into drm-next
git bisect bad 128ccceaba8656573b8b0f86d3ab6e38094cc754
# bad: [180a8bebdd50fc8ce4677e579d49d9b73880caa7] drm/amd/pp: Fix sclk in highest two levels when compute on smu7
git bisect bad 180a8bebdd50fc8ce4677e579d49d9b73880caa7
# bad: [ada6770e956b7f7d298bfef56fed457ade5bad9e] drm/amd/pp: Remove cgs_query_system_info
git bisect bad ada6770e956b7f7d298bfef56fed457ade5bad9e
# good: [a2c120ce6b686c753968b7b1293c7bb878440b7f] drm/amd/pp: Simplify the create of powerplay instance
git bisect good a2c120ce6b686c753968b7b1293c7bb878440b7f
# good: [589941e1a2d65f5425c91a5859a5454df64b6982] drm/amdgpu: Notify sbios device ready before send request
git bisect good 589941e1a2d65f5425c91a5859a5454df64b6982
# bad: [6848d73e889bb29cfede51df8c1d0496c9787454] drm/amd/pp: Remove the wrap functions for acpi in powerplay
git bisect bad 6848d73e889bb29cfede51df8c1d0496c9787454
# bad: [e1deba285156fb4023bb48f22068de5b60e34e15] drm/amd/pp: Use amdgpu acpi helper functions in powerplay
git bisect bad e1deba285156fb4023bb48f22068de5b60e34e15
# first bad commit: [e1deba285156fb4023bb48f22068de5b60e34e15] drm/amd/pp: Use amdgpu acpi helper functions in powerplay
[utku2@utku2 linux]$
Created attachment 282325 [details]
sensors
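The manual bisect session logged above can be sketched as two small helpers. This is only an illustration, assuming it is run inside a Linux kernel git checkout; the commit hashes are the ones quoted in this report, while the function names `start_bisect` and `record_verdict` are hypothetical.

```shell
#!/bin/sh
# Endpoints quoted in the bisect logs of this report.
BAD=320b164abb32db876866a4ff8c2cb710524ac6ea    # drm-for-v4.17 merge, crashes
GOOD=0adb32858b0bddf4ada5f364a84ed60b196dbcda   # Linux 4.16, works

# Begin bisecting, limited to commits that touch the AMD DRM drivers.
start_bisect() {
    git bisect start "$BAD" "$GOOD" -- drivers/gpu/drm/amd
}

# After building, installing and booting the kernel git checked out, record
# the verdict. "skip" is for commits that cannot be tested at all, such as
# the ones that failed to build earlier in this thread.
record_verdict() {
    case "$1" in
        good|bad|skip) git bisect "$1" ;;
        *) echo "usage: record_verdict good|bad|skip" >&2; return 2 ;;
    esac
}
```

Each round is then: build and boot the checked-out commit, plug in the AC adapter, and call `record_verdict` with the result until git reports the first bad commit. `git bisect log` prints the history of verdicts, as shown in the comments above.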
Created attachment 282327 [details]
sensors detect
Added sensor information since the bug is ACPI related: https://bugzilla.kernel.org/attachment.cgi?id=282325

I just tested drm-next git from the AUR, "linux-drm-next-git-5.2.827536.08269364808f-1-x86_64". Running non-vulkan apps on the integrated GPU is fine, and the system doesn't crash while running apps on the discrete graphics, but the problem is still occurring.

The issue is gone! As of 5.3.0-050300rc1-generic #201907212232 the issue no longer occurs. I downloaded and installed the kernel via the mainline kernel update utility on Pop_OS! and tested both vulkan and opengl. |