Bug 211649
Summary: | "drm/amd/display: reuse current context instead of recreating one" cause hdmi hotplug blackscreen on amdgpu | ||
---|---|---|---|
Product: | Drivers | Reporter: | youling257 |
Component: | Video(DRI - non Intel) | Assignee: | drivers_video-dri |
Status: | RESOLVED CODE_FIX | ||
Severity: | normal | CC: | alexdeucher, eric, eseifert, nicholas.kazlauskas, pmenzel+bugzilla.kernel.org, stefan |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 5.11-rc7 | Subsystem: | |
Regression: | No | Bisected commit-id: | |
Attachments: |
dmesg.txt
amdgpu_dm_dtn_log dmesg.txt trace.txt |
Description
youling257
2021-02-09 06:51:06 UTC
Please attach your xorg log (if using X) and your dmesg output. Please reproduce the blackscreen issue and attach the following logs/information: - dmesg.log (with drm.debug=6 set as part of your modeline) - Xorg.0.log if using X - The name of your desktop environment - eg. GNOME, KDE, etc. I'm unable to reproduce this locally on my end and it doesn't seem to have applied generically to DCN21/DCN30 per Dan's test report. replug hdmi, [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:67:crtc-0] flip_done timed out [ 252.377374] [drm:drm_atomic_helper_wait_for_flip_done [drm_kms_helper]] *ERROR* [CRTC:67:crtc-0] flip_done timed out [ 262.617295] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [CRTC:67:crtc-0] flip_done timed out [ 270.268771] [drm] amdgpu_dm_irq_schedule_work FAILED src 2 [ 272.857440] [drm:drm_atomic_helper_wait_for_dependencies [drm_kms_helper]] *ERROR* [PLANE:55:plane-3] flip_done timed out [ 272.906034] ------------[ cut here ]------------ [ 272.906037] WARNING: CPU: 0 PID: 15 at amdgpu_dm_atomic_commit_tail+0x2582/0x25f0 [amdgpu] [ 272.906404] Modules linked in: nct6775 hwmon_vid ccm binfmt_misc overlay exportfs kvm_amd kvm crc32c_intel crc32_pclmul snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd soundcore snd_intel_dspcfg amdgpu drm_kms_helper cec fb_sys_fops syscopyarea sysfillrect sysimgblt gpu_sched drm_ttm_helper ttm drm ccp rng_core hid_multitouch btusb iwlmvm btrtl btintel btbcm bluetooth ecdh_generic ecc mac80211 libarc4 iwlwifi cfg80211 igb i2c_algo_bit k10temp gpio_amdpt gpio_generic efi_pstore sdcardfs [ 272.906461] CPU: 0 PID: 15 Comm: kworker/0:1 Tainted: G W 5.11.0-rc5-android-x86_64+ #1 [ 272.906466] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450 Gaming-ITX/ac, BIOS P4.10 07/08/2020 [ 272.906468] Workqueue: events dm_irq_work_func [amdgpu] [ 272.906826] RIP: 0010:amdgpu_dm_atomic_commit_tail+0x2582/0x25f0 [amdgpu] [ 272.907181] Code: a0 fd ff ff 01 c7 85 9c fd ff ff 37 00 00 00 c7 85 a4 fd ff ff 20 00 00 00 e8 5a 30 13 00 e9 a0 fa ff ff 0f 0b e9 f0 f8 ff ff <0f> 0b e9 3f f9 ff ff 0f 0b 0f 0b e9 55 f9 ff ff 49 8b 06 41 0f b6 [ 272.907185] RSP: 0018:ffffa0a8000a3ab8 EFLAGS: 00210002 [ 272.907189] RAX: 0000000000000002 RBX: 0000000000000003 RCX: ffff996d91211918 [ 272.907191] RDX: 0000000000000001 RSI: 0000000000200297 RDI: ffff996d510a0178 [ 272.907194] RBP: ffffa0a8000a3db8 R08: 0000000000000005 R09: 0000000000000000 [ 272.907196] R10: ffffa0a8000a3a18 R11: ffffa0a8000a3a1c R12: 0000000000200287 [ 272.907198] R13: ffff996d6c35d800 R14: ffff996d91211800 R15: ffff996d6f4d3600 [ 272.907200] FS: 0000000000000000(0000) GS:ffff996e56e00000(0000) knlGS:0000000000000000 [ 272.907203] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 272.907206] CR2: 00000000144c2000 CR3: 000000010e262000 CR4: 00000000001506f0 [ 272.907209] Call Trace: [ 272.907219] commit_tail+0x94/0x130 [drm_kms_helper] [ 272.907251] drm_atomic_helper_commit+0x113/0x140 [drm_kms_helper] [ 272.907280] dm_restore_drm_connector_state+0xef/0x170 [amdgpu] [ 272.907635] handle_hpd_irq+0x118/0x150 [amdgpu] [ 272.907990] dm_irq_work_func+0x49/0x60 [amdgpu] [ 272.908346] process_one_work+0x1c1/0x3c0 [ 272.908353] worker_thread+0x4d/0x3d0 [ 272.908357] ? rescuer_thread+0x3b0/0x3b0 [ 272.908362] kthread+0x133/0x150 [ 272.908367] ? kthread_create_worker_on_cpu+0x70/0x70 [ 272.908373] ret_from_fork+0x22/0x30 [ 272.908379] ---[ end trace 9e7a0ad9c8574ed6 ]--- [ 272.908402] ------------[ cut here ]------------ [ 272.908403] WARNING: CPU: 0 PID: 15 at amdgpu_dm_atomic_commit_tail+0x258b/0x25f0 [amdgpu] [ 272.908757] Modules linked in: nct6775 hwmon_vid ccm binfmt_misc overlay exportfs kvm_amd kvm crc32c_intel crc32_pclmul snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_hda_codec_hdmi snd_hda_intel snd_hda_codec snd_hda_core snd_hwdep snd_pcm snd_timer snd soundcore snd_intel_dspcfg amdgpu drm_kms_helper cec fb_sys_fops syscopyarea sysfillrect sysimgblt gpu_sched drm_ttm_helper ttm drm ccp rng_core hid_multitouch btusb iwlmvm btrtl btintel btbcm bluetooth ecdh_generic ecc mac80211 libarc4 iwlwifi cfg80211 igb i2c_algo_bit k10temp gpio_amdpt gpio_generic efi_pstore sdcardfs [ 272.908803] CPU: 0 PID: 15 Comm: kworker/0:1 Tainted: G W 5.11.0-rc5-android-x86_64+ #1 [ 272.908807] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B450 Gaming-ITX/ac, BIOS P4.10 07/08/2020 [ 272.908809] Workqueue: events dm_irq_work_func [amdgpu] [ 272.909163] RIP: 0010:amdgpu_dm_atomic_commit_tail+0x258b/0x25f0 [amdgpu] [ 272.909517] Code: ff ff 37 00 00 00 c7 85 a4 fd ff ff 20 00 00 00 e8 5a 30 13 00 e9 a0 fa ff ff 0f 0b e9 f0 f8 ff ff 0f 0b e9 3f f9 ff ff 0f 0b <0f> 0b e9 55 f9 ff ff 49 8b 06 41 0f b6 8e 2d 01 00 00 48 c7 c6 38 [ 272.909521] RSP: 0018:ffffa0a8000a3ab8 EFLAGS: 00210086 [ 272.909524] RAX: 0000000000000001 RBX: 0000000000000003 RCX: ffff996d91211918 [ 272.909526] RDX: 0000000000000001 RSI: 0000000000200297 RDI: ffff996d510a0178 [ 272.909528] RBP: ffffa0a8000a3db8 R08: 0000000000000005 R09: 0000000000000000 [ 272.909530] R10: ffffa0a8000a3a18 R11: ffffa0a8000a3a1c R12: 0000000000200287 [ 272.909532] R13: ffff996d6c35d800 R14: ffff996d91211800 R15: ffff996d6f4d3600 [ 272.909534] FS: 0000000000000000(0000) GS:ffff996e56e00000(0000) knlGS:0000000000000000 [ 272.909537] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 272.909539] CR2: 00000000144c2000 CR3: 000000010e262000 CR4: 00000000001506f0 [ 272.909541] Call Trace: [ 272.909549] commit_tail+0x94/0x130 [drm_kms_helper] [ 272.909579] drm_atomic_helper_commit+0x113/0x140 [drm_kms_helper] [ 272.909608] dm_restore_drm_connector_state+0xef/0x170 [amdgpu] [ 272.909961] handle_hpd_irq+0x118/0x150 [amdgpu] [ 272.910315] dm_irq_work_func+0x49/0x60 [amdgpu] [ 272.910669] process_one_work+0x1c1/0x3c0 [ 272.910674] worker_thread+0x4d/0x3d0 [ 272.910678] ? rescuer_thread+0x3b0/0x3b0 [ 272.910682] kthread+0x133/0x150 [ 272.910687] ? kthread_create_worker_on_cpu+0x70/0x70 [ 272.910692] ret_from_fork+0x22/0x30 [ 272.910697] ---[ end trace 9e7a0ad9c8574ed7 ]--- Created attachment 295189 [details]
dmesg.txt
echo 6 > /sys/module/drm/parameters/debug
replug hdmi dmesg.
It would be good to know what OS/Userspace this is, I don't think I've seen anything that has atomic check failures on nearly every flip, eg: [ 179.142898] [drm:amdgpu_dm_atomic_check [amdgpu]] Atomic check failed with err: -22 Black screen comes from the page-flip timeout, but it's hard to say what the root cause is. When the screen is black can you cat the DTN log and then attach the result to this ticket? cat /sys/kernel/debug/dri/0/amdgpu_dm_dtn_log Thanks! Created attachment 295193 [details]
amdgpu_dm_dtn_log
I using amd 3400g running on androidx86 7 with mesa21.0
cat /sys/kernel/debug/dri/0/amdgpu_dm_dtn_log
That makes more sense then, that's the behavior I'd expect on CrOS/Android. Also explains why we haven't seen this on X/Wayland based compositors like GNOME on Ubuntu. One more log I'd like captured here, the trace log. Please record a log from before the black screen occurs until after. I've included a snippet below that shows how to capture this log: #!/bin/bash set +x OUTPUT_DIR=$(pwd) mount -o remount,rw / echo 4 > /sys/module/drm/parameters/debug echo 65536 > /sys/kernel/debug/tracing/buffer_size_kb cd /sys/kernel/debug/tracing/ echo amdgpu_dm* > set_ftrace_filter echo function > current_tracer echo 1 > events/amdgpu_dm/enable echo 1 > tracing_on echo > trace echo "Capturing a trace...\n" read -n 1 cat trace > $OUTPUT_DIR/trace.txt echo "Saved trace to ${OUTPUT_DIR}/trace.txt" dmesg > $OUTPUT_DIR/dmesg.txt echo "Saved dmesg to ${OUTPUT_DIR}/dmesg.txt" (In reply to Nicholas Kazlauskas from comment #7) > That makes more sense then, that's the behavior I'd expect on CrOS/Android. > > Also explains why we haven't seen this on X/Wayland based compositors like > GNOME on Ubuntu. > > One more log I'd like captured here, the trace log. Please record a log from > before the black screen occurs until after. > > I've included a snippet below that shows how to capture this log: > > #!/bin/bash > > set +x > OUTPUT_DIR=$(pwd) > > mount -o remount,rw / > echo 4 > /sys/module/drm/parameters/debug > echo 65536 > /sys/kernel/debug/tracing/buffer_size_kb > > cd /sys/kernel/debug/tracing/ > echo amdgpu_dm* > set_ftrace_filter > echo function > current_tracer > echo 1 > events/amdgpu_dm/enable > echo 1 > tracing_on > echo > trace > echo "Capturing a trace...\n" > read -n 1 > cat trace > $OUTPUT_DIR/trace.txt > echo "Saved trace to ${OUTPUT_DIR}/trace.txt" > dmesg > $OUTPUT_DIR/dmesg.txt > echo "Saved dmesg to ${OUTPUT_DIR}/dmesg.txt" run the script,only echo Capturing a trace...\n failed output trace.txt and dmesg.txt. Created attachment 295197 [details]
dmesg.txt
I delete "read -n 1", can output dmesg.txt and trace.txt.
trace.txt,
# tracer: function
#
# entries-in-buffer/entries-written: 0/0 #P:8
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# ||| / delay
# TASK-PID CPU# |||| TIMESTAMP FUNCTION
# | | | |||| | |
I don't expect the trace log to be completely empty. You'll need to leave the trace log running after enabling it as well (with tracing on/trace). You'll also need various config options set in your kernel: CONFIG_FUNCTION_TRACER CONFIG_FUNCTION_GRAPH_TRACER CONFIG_STACK_TRACER CONFIG_DYNAMIC_FTRACE Sorry if this is a little bit involved, but I'd like capture the exact sequence of atomic commits with the trace so I can reproduce the sequence on my end. Thanks! Created attachment 295201 [details]
trace.txt
cat /sys/kernel/debug/tracing/trace
Thanks, this was what I'm looking for. *** Bug 211821 has been marked as a duplicate of this bug. *** Can this bug be closed then? (In reply to Paul Menzel from comment #16) > Can this bug be closed then? Only the person that opens the bug can close it, but yes, it can be closed. *** Bug 211997 has been marked as a duplicate of this bug. *** |