<3>[ 236.837169] Freezing of tasks failed after 20.000 seconds (2 tasks refusing to freeze, wq_busy=0): <6>[ 236.837198] systemd-udevd D12744 302 252 0x80000104 <4>[ 236.837214] Call Trace: <4>[ 236.837228] ? __schedule+0x364/0xbf0 <4>[ 236.837242] schedule+0x2d/0x90 <4>[ 236.837252] __pm_runtime_barrier+0x9c/0x160 <4>[ 236.837264] ? wait_woken+0xa0/0xa0 <4>[ 236.837275] __pm_runtime_disable+0x84/0xe0 <4>[ 236.837292] snd_hda_codec_cleanup_for_unbind+0x203/0x210 [snd_hda_codec] <4>[ 236.837311] hda_codec_driver_probe+0x47/0x100 [snd_hda_codec] <4>[ 236.837324] driver_probe_device+0x306/0x480 <4>[ 236.837335] __driver_attach+0xdb/0x100 <4>[ 236.837344] ? driver_probe_device+0x480/0x480 <4>[ 236.837354] ? driver_probe_device+0x480/0x480 <4>[ 236.837365] bus_for_each_dev+0x74/0xc0 <4>[ 236.837377] bus_add_driver+0x15f/0x250 <4>[ 236.837387] ? 0xffffffffa00e3000 <4>[ 236.837395] driver_register+0x56/0xe0 <4>[ 236.837404] ? 0xffffffffa00e3000 <4>[ 236.837413] do_one_initcall+0x58/0x370 <4>[ 236.837425] ? do_init_module+0x1d/0x1ea <4>[ 236.837436] ? rcu_read_lock_sched_held+0x6f/0x80 <4>[ 236.837447] ? kmem_cache_alloc_trace+0x282/0x2e0 <4>[ 236.837460] do_init_module+0x56/0x1ea <4>[ 236.837471] load_module+0x2435/0x2b20 <4>[ 236.837490] ? __se_sys_finit_module+0xd3/0xf0 <4>[ 236.837501] __se_sys_finit_module+0xd3/0xf0 <4>[ 236.837515] do_syscall_64+0x55/0x190 <4>[ 236.837526] entry_SYSCALL_64_after_hwframe+0x49/0xbe <4>[ 236.837537] RIP: 0033:0x7f58dc89d839 <4>[ 236.837545] Code: Bad RIP value. <4>[ 236.837559] RSP: 002b:00007ffd689bab78 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 <4>[ 236.837574] RAX: ffffffffffffffda RBX: 000055df23929a20 RCX: 00007f58dc89d839 <4>[ 236.837587] RDX: 0000000000000000 RSI: 00007f58dc57c0e5 RDI: 0000000000000011 <4>[ 236.837599] RBP: 00007f58dc57c0e5 R08: 0000000000000000 R09: 00007ffd689bac90 <4>[ 236.837612] R10: 0000000000000011 R11: 0000000000000246 R12: 0000000000000000 <4>[ 236.837624] R13: 000055df23957ca0 R14: 0000000000020000 R15: 000055df23929a20 <6>[ 236.837648] alsactl D11880 627 1 0x80000106 <4>[ 236.837662] Call Trace: <4>[ 236.837671] ? __schedule+0x364/0xbf0 <4>[ 236.837681] ? rwsem_down_write_failed+0xa0/0x2d0 <4>[ 236.837693] ? rwsem_down_write_failed+0x1c2/0x2d0 <4>[ 236.837704] schedule+0x2d/0x90 <4>[ 236.837712] rwsem_down_write_failed+0x1c7/0x2d0 <4>[ 236.837728] ? call_rwsem_down_write_failed+0x13/0x20 <4>[ 236.837740] call_rwsem_down_write_failed+0x13/0x20 <4>[ 236.837752] ? snd_ctl_release+0x7e/0x120 <4>[ 236.837762] down_write+0x4e/0x60 <4>[ 236.837770] ? snd_ctl_release+0x7e/0x120 <4>[ 236.837779] snd_ctl_release+0x7e/0x120 <4>[ 236.837791] __fput+0xaf/0x1f0 <4>[ 236.837801] task_work_run+0x8e/0xc0 <4>[ 236.837813] do_exit+0x396/0xd40 <4>[ 236.837823] ? ksys_ioctl+0x35/0x60 <4>[ 236.837835] rewind_stack_do_exit+0x17/0x20 <6>[ 236.837858] OOM killer enabled. Boot logs: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4454/fi-kbl-8809g/boot0.log Logs during the execution of our testsuite: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4454/fi-kbl-8809g/dmesg0.log All the test artefacts: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_4454/fi-kbl-8809g/igt@gem_exec_suspend@basic-s4-devices.html
The oops file has a better description: <4>[ 6.285553] general protection fault: 0000 [#1] PREEMPT SMP PTI <4>[ 6.285619] CPU: 0 PID: 626 Comm: kworker/0:3 Not tainted 4.18.0-rc3-CI-CI_DRM_4454+ #1 <4>[ 6.285685] Hardware name: Intel Corporation NUC8i7HVK/NUC8i7HVB, BIOS HNKBLi70.86A.0040.2018.0516.1521 05/16/2018 <4>[ 6.285741] Workqueue: pm pm_runtime_work <4>[ 6.285765] RIP: 0010:__x86_indirect_thunk_rax+0x10/0x20 <4>[ 6.285793] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 e8 07 00 00 00 f3 90 0f ae e8 eb f9 48 89 04 24 <c3> 0f 1f 44 00 00 66 2e 0f 1f 84 00 00 00 00 00 e8 07 00 00 00 f3 <4>[ 6.285963] RSP: 0018:ffffc900010c3d20 EFLAGS: 00010202 <4>[ 6.285991] RAX: 6b6b6b6b6b6b6b6b RBX: ffff880270118008 RCX: 0000000000000001 <4>[ 6.286029] RDX: 0000000000000000 RSI: 00000000aafad925 RDI: ffff880270118008 <4>[ 6.286065] RBP: ffff8802591db3f8 R08: 00000000d381941d R09: 0000000000000001 <4>[ 6.286103] R10: 0000000000000000 R11: 0000000000000000 R12: ffff880270118560 <4>[ 6.286139] R13: 0000000000000000 R14: ffffffffa0194a30 R15: 0000000000000008 <4>[ 6.286176] FS: 0000000000000000(0000) GS:ffff88027ec00000(0000) knlGS:0000000000000000 <4>[ 6.286218] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 <4>[ 6.286248] CR2: 000055f779ae60a8 CR3: 0000000005210005 CR4: 00000000003606f0 <4>[ 6.286285] Call Trace: <4>[ 6.286303] ? alc_suspend+0x36/0x50 [snd_hda_codec_realtek] <6>[ 6.286331] input: HDA Intel PCH HDMI/DP,pcm=3 as /devices/pci0000:00/0000:00:1f.3/sound/card0/input15 <4>[ 6.286383] ? hda_call_codec_suspend+0x1d/0xa0 [snd_hda_codec] <4>[ 6.286416] ? hda_codec_runtime_suspend+0x54/0xa0 [snd_hda_codec] <4>[ 6.286450] ? __rpm_callback+0xb3/0x1b0 <4>[ 6.286471] ? hda_codec_runtime_resume+0x40/0x40 [snd_hda_codec] <4>[ 6.286502] ? rpm_callback+0x1a/0x70 <4>[ 6.286522] ? hda_codec_runtime_resume+0x40/0x40 [snd_hda_codec] <4>[ 6.286555] ? rpm_suspend+0x125/0x6d0 <4>[ 6.286577] ? pm_runtime_work+0x6e/0xb0 <4>[ 6.286599] ? process_one_work+0x248/0x6c0 <4>[ 6.287861] ? worker_thread+0x37/0x380 <4>[ 6.289111] ? process_one_work+0x6c0/0x6c0 <4>[ 6.290352] ? kthread+0x119/0x130 <4>[ 6.291593] ? kthread_flush_work_fn+0x10/0x10 <4>[ 6.292822] ? ret_from_fork+0x3a/0x50 <4>[ 6.294046] Modules linked in: snd_hda_codec_realtek(+) snd_hda_codec_generic amdgpu i915 x86_pkg_temp_thermal intel_powerclamp coretemp crct10dif_pclmul crc32_pclmul btusb btrtl btbcm snd_hda_codec_hdmi btintel bluetooth snd_hda_intel snd_hda_codec ghash_clmulni_intel chash gpu_sched e1000e ecdh_generic snd_hwdep ttm snd_hda_core igb snd_pcm mei_me mei prime_numbers pinctrl_sunrisepoint pinctrl_intel <6>[ 6.295228] input: HDA Intel PCH Line Out as /devices/pci0000:00/0000:00:1f.3/sound/card0/input16 <4>[ 6.296904] ---[ end trace 8b569215a2200dc6 ]---
How can I reproduce the bug?
(In reply to Takashi Iwai from comment #2) > How can I reproduce the bug? Try suspending/resuming in a loop for a couple of thousand times :s I hope this will not require one of these bad boys to reproduce: https://www.intel.com/content/www/us/en/products/boards-kits/nuc/kits/nuc8i7hvk.html So far, we have only seen this issue once. I will keep you updated when we get more runs with this issue.
Do you mean S4 or S3? The log you showed indicates the GPF at early boot, and is this while suspend/resume testing? Somehow the information doesn't match with the description... The more information is appreciated. Thanks.
Sorry about that. I followed the code and it looks like the test is verifying that hibernating while the GPU is active is done properly. The hibernation is triggered like this: # echo devices > /sys/power/pm_test # echo disk > /sys/power/state I hope this helps!
Last seen CI_DRM_4454 (7 months, 1 week / 4171 runs ago). This issue is not seen lately.
Still have not reproduced 8 months later. So much has changed since this bug was filed that it makes no sense to keep it open. Closing!