Bug 203033 - nouveau hung task
Summary: nouveau hung task
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: x86-64 Linux
Importance: P1 normal
Assignee: drivers_video-dri
Reported: 2019-03-25 14:45 UTC by Matteo Croce
Modified: 2022-11-08 18:23 UTC

See Also:
Kernel Version: 5.0.0
Subsystem:
Regression: No
Bisected commit-id:


Description Matteo Croce 2019-03-25 14:45:21 UTC
The GUI freezes randomly with nouveau, and this error appears in the syslog:

INFO: task kworker/u16:4:14755 blocked for more than 120 seconds.
      Not tainted 5.0.0-matteo #28
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/u16:4   D    0 14755      2 0x80000000
Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
Call Trace:
 ? __schedule+0x1a6/0x6c0
 schedule+0x2c/0x70
 schedule_timeout+0x268/0x380
 ? nvif_notify_get+0x94/0xa0 [nouveau]
 dma_fence_default_wait+0x204/0x270
 ? dma_fence_release+0x90/0x90
 dma_fence_wait_timeout+0xdd/0x100
 drm_atomic_helper_wait_for_fences+0x3a/0xc0 [drm_kms_helper]
 nv50_disp_atomic_commit_tail+0x7c/0x850 [nouveau]
 ? __switch_to_asm+0x34/0x70
 ? __switch_to_asm+0x40/0x70
 ? __switch_to_asm+0x40/0x70
 ? __switch_to_asm+0x40/0x70
 process_one_work+0x1fa/0x400
 worker_thread+0x2d/0x3d0
 ? process_one_work+0x400/0x400
 kthread+0x113/0x130
 ? kthread_create_on_node+0x60/0x60
 ret_from_fork+0x35/0x40
Comment 1 me 2019-10-20 20:11:30 UTC
I'm having the same issue. Arch Linux, 5.3.6-arch1-1-ARCH, using the nouveau driver on a GTX 970 (NV110).

INFO: task kworker/u8:0:7163 blocked for more than 122 seconds.
      Not tainted 5.3.6-arch1-1-ARCH #1
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
kworker/u8:0    D    0  7163      2 0x80004080
Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
Call Trace:
 ? __schedule+0x27f/0x6d0
 schedule+0x43/0xd0
 schedule_timeout+0x299/0x3d0
 dma_fence_default_wait+0x1b9/0x2c0
 ? dma_fence_wait_timeout+0x110/0x110
 dma_fence_wait_timeout+0x105/0x110
 drm_atomic_helper_wait_for_fences+0x61/0xc0 [drm_kms_helper]
 nv50_disp_atomic_commit_tail+0x7a/0x6c0 [nouveau]
 ? _raw_spin_unlock_irq+0x1d/0x30
 ? finish_task_switch+0x85/0x2e0
 ? __switch_to+0x86/0x460
 process_one_work+0x1d1/0x3a0
 worker_thread+0x4a/0x3d0
 kthread+0xfb/0x130
 ? process_one_work+0x3a0/0x3a0
 ? kthread_park+0x80/0x80
 ret_from_fork+0x35/0x40
Comment 2 Mathieu Malaterre 2020-05-11 15:01:21 UTC
Same here:

May 11 16:54:31 vostrodell kernel: [  605.330992] INFO: task kworker/u8:3:162 blocked for more than 120 seconds.
May 11 16:54:31 vostrodell kernel: [  605.330997]       Not tainted 5.4.0-0.bpo.4-amd64 #1 Debian 5.4.19-1~bpo10+1
May 11 16:54:31 vostrodell kernel: [  605.330999] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 11 16:54:31 vostrodell kernel: [  605.331001] kworker/u8:3    D    0   162      2 0x80004000
May 11 16:54:31 vostrodell kernel: [  605.331083] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
May 11 16:54:31 vostrodell kernel: [  605.331095] Call Trace:
May 11 16:54:31 vostrodell kernel: [  605.331108]  ? __schedule+0x2e6/0x6f0
May 11 16:54:31 vostrodell kernel: [  605.331111]  schedule+0x2f/0xa0
May 11 16:54:31 vostrodell kernel: [  605.331114]  schedule_timeout+0x20d/0x310
May 11 16:54:31 vostrodell kernel: [  605.331155]  ? nvif_notify_get+0x94/0xa0 [nouveau]
May 11 16:54:31 vostrodell kernel: [  605.331224]  ? nv84_fence_sync+0x40/0x40 [nouveau]
May 11 16:54:31 vostrodell kernel: [  605.331234]  dma_fence_default_wait+0x22f/0x290
May 11 16:54:31 vostrodell kernel: [  605.331241]  ? dma_fence_release+0x140/0x140
May 11 16:54:31 vostrodell kernel: [  605.331245]  dma_fence_wait_timeout+0xdd/0x100
May 11 16:54:31 vostrodell kernel: [  605.331264]  drm_atomic_helper_wait_for_fences+0x3c/0xd0 [drm_kms_helper]
May 11 16:54:31 vostrodell kernel: [  605.331332]  nv50_disp_atomic_commit_tail+0x72/0x710 [nouveau]
May 11 16:54:31 vostrodell kernel: [  605.331340]  ? __switch_to_asm+0x40/0x70
May 11 16:54:31 vostrodell kernel: [  605.331357]  ? __switch_to_asm+0x34/0x70
May 11 16:54:31 vostrodell kernel: [  605.331360]  ? __switch_to+0x7a/0x3e0
May 11 16:54:31 vostrodell kernel: [  605.331365]  ? __switch_to_asm+0x34/0x70
May 11 16:54:31 vostrodell kernel: [  605.331370]  process_one_work+0x1a7/0x360
May 11 16:54:31 vostrodell kernel: [  605.331377]  worker_thread+0x30/0x390
May 11 16:54:31 vostrodell kernel: [  605.331383]  ? create_worker+0x1a0/0x1a0
May 11 16:54:31 vostrodell kernel: [  605.331388]  kthread+0x112/0x130
May 11 16:54:31 vostrodell kernel: [  605.331394]  ? kthread_park+0x80/0x80
May 11 16:54:31 vostrodell kernel: [  605.331400]  ret_from_fork+0x35/0x40
Comment 3 Mathieu Malaterre 2020-05-11 15:02:55 UTC
Forgot to include hardware information:

$ lspci -s 01:00.0
01:00.0 VGA compatible controller: NVIDIA Corporation GT215 [GeForce GT 240] (rev a2)
Comment 4 Mathieu Malaterre 2020-05-11 15:03:22 UTC
$ uname -a
Linux vostrodell 5.4.0-0.bpo.4-amd64 #1 SMP Debian 5.4.19-1~bpo10+1 (2020-03-09) x86_64 GNU/Linux
Comment 5 James O'Beirne 2020-08-28 15:23:25 UTC
Having the same issue.

# lspci | grep VGA
1f:00.0 VGA compatible controller: NVIDIA Corporation GK208 [GeForce GT 710B] (rev a1)

# uname -a
Linux slug 4.19.0-10-amd64 #1 SMP Debian 4.19.132-1 (2020-07-24) x86_64 GNU/Linux

Aug 28 02:34:11 slug kernel: [38304.923718] INFO: task kworker/u32:0:5123 blocked for more than 120 seconds.
Aug 28 02:34:11 slug kernel: [38304.923725]       Tainted: G            E     4.19.0-10-amd64 #1 Debian 4.19.132-1
Aug 28 02:34:11 slug kernel: [38304.923727] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 28 02:34:11 slug kernel: [38304.923729] kworker/u32:0   D    0  5123      2 0x80000000
Aug 28 02:34:11 slug kernel: [38304.923798] Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
Aug 28 02:34:11 slug kernel: [38304.923799] Call Trace:
Aug 28 02:34:11 slug kernel: [38304.923808]  __schedule+0x2a2/0x870
Aug 28 02:34:11 slug kernel: [38304.923812]  schedule+0x28/0x80
Aug 28 02:34:11 slug kernel: [38304.923814]  schedule_timeout+0x26d/0x390
Aug 28 02:34:11 slug kernel: [38304.923875]  ? nvkm_client_map+0x10/0x10 [nouveau]
Aug 28 02:34:11 slug kernel: [38304.923880]  dma_fence_default_wait+0x238/0x2a0
Aug 28 02:34:11 slug kernel: [38304.923882]  ? dma_fence_release+0x90/0x90
Aug 28 02:34:11 slug kernel: [38304.923884]  dma_fence_wait_timeout+0x42/0xf0
Aug 28 02:34:11 slug kernel: [38304.923897]  drm_atomic_helper_wait_for_fences+0x63/0xc0 [drm_kms_helper]
Aug 28 02:34:11 slug kernel: [38304.923957]  nv50_disp_atomic_commit_tail+0x7c/0x880 [nouveau]
Aug 28 02:34:11 slug kernel: [38304.923963]  ? __switch_to+0x15b/0x440
Aug 28 02:34:11 slug kernel: [38304.923966]  ? __switch_to_asm+0x35/0x70
Aug 28 02:34:11 slug kernel: [38304.923971]  process_one_work+0x1a7/0x3a0
Aug 28 02:34:11 slug kernel: [38304.923975]  worker_thread+0x30/0x390
Aug 28 02:34:11 slug kernel: [38304.923978]  ? create_worker+0x1a0/0x1a0
Aug 28 02:34:11 slug kernel: [38304.923981]  kthread+0x112/0x130
Aug 28 02:34:11 slug kernel: [38304.923983]  ? kthread_bind+0x30/0x30
Aug 28 02:34:11 slug kernel: [38304.923985]  ret_from_fork+0x22/0x40
Comment 6 Murph 2022-11-08 18:23:28 UTC
Having a similar issue here. The GUI locks up but the system keeps running in the background (people in meetings can still see and hear me, and I can hear them). I can't switch to a different TTY, so I have to hard reset.

λ ~/ lspci | grep VGA
01:00.0 VGA compatible controller: NVIDIA Corporation TU117M [GeForce GTX 1650 Mobile / Max-Q] (rev a1)

λ ~/ uname -a
Linux murph-icl-gen2 5.15.74 #1-NixOS SMP Sat Oct 15 05:59:05 UTC 2022 x86_64 GNU/Linux

Nov 08 09:19:38 murph-icl-gen2 kernel: task:kworker/u32:42  state:D stack:    0 pid:66054 ppid:     2 flags:0x00004000
Nov 08 09:19:38 murph-icl-gen2 kernel: Workqueue: events_unbound nv50_disp_atomic_commit_work [nouveau]
Nov 08 09:19:38 murph-icl-gen2 kernel: Call Trace:
Nov 08 09:19:38 murph-icl-gen2 kernel:  <TASK>
Nov 08 09:19:38 murph-icl-gen2 kernel:  __schedule+0x2e1/0x1350
Nov 08 09:19:38 murph-icl-gen2 kernel:  ? nvkm_event_get+0x70/0x90 [nouveau]
Nov 08 09:19:38 murph-icl-gen2 kernel:  ? nvkm_client_notify_get+0x23/0x40 [nouveau]
Nov 08 09:19:38 murph-icl-gen2 kernel:  schedule+0x5b/0xd0
Nov 08 09:19:38 murph-icl-gen2 kernel:  schedule_timeout+0x104/0x140
Nov 08 09:19:38 murph-icl-gen2 kernel:  ? nouveau_fence_enable_signaling+0x2a/0x70 [nouveau]
Nov 08 09:19:38 murph-icl-gen2 kernel:  dma_fence_default_wait+0x1a8/0x240
Nov 08 09:19:38 murph-icl-gen2 kernel:  ? dma_fence_free+0x20/0x20
Nov 08 09:19:38 murph-icl-gen2 kernel:  dma_fence_wait_timeout+0xb6/0xd0
Nov 08 09:19:38 murph-icl-gen2 kernel:  drm_atomic_helper_wait_for_fences+0x82/0xe0 [drm_kms_helper]
Nov 08 09:19:38 murph-icl-gen2 kernel:  nv50_disp_atomic_commit_tail+0x90/0x850 [nouveau]
Nov 08 09:19:38 murph-icl-gen2 kernel:  process_one_work+0x1ee/0x390
Nov 08 09:19:38 murph-icl-gen2 kernel:  worker_thread+0x53/0x3e0
Nov 08 09:19:38 murph-icl-gen2 kernel:  ? process_one_work+0x390/0x390
Nov 08 09:19:38 murph-icl-gen2 kernel:  kthread+0x124/0x150
Nov 08 09:19:38 murph-icl-gen2 kernel:  ? set_kthread_struct+0x50/0x50
Nov 08 09:19:38 murph-icl-gen2 kernel:  ret_from_fork+0x1f/0x30
Nov 08 09:19:38 murph-icl-gen2 kernel:  </TASK>
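
The traces in this report all pass through the same frames: nv50_disp_atomic_commit_tail, then drm_atomic_helper_wait_for_fences, then dma_fence_default_wait. To check whether another log shows the same signature, something like the Python sketch below might help. It is only an illustration: the names parse_hung_tasks and waits_on_display_fence are made up here, and the regular expressions assume the "INFO: task ... blocked for more than N seconds" header format seen in the earlier comments (the newer "task:... state:D" header from 5.15 is not handled).

```python
import re

# Header printed by the kernel hung-task detector (format as seen in this report).
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>\S+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)

# One stack frame, e.g. " ? nvif_notify_get+0x94/0xa0 [nouveau]".
# A leading "? " marks a frame the unwinder considers unreliable.
FRAME_RE = re.compile(
    r"(?P<unreliable>\? )?(?P<func>[A-Za-z_][\w.]*)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<module>\w+)\])?\s*$"
)


def parse_hung_tasks(text):
    """Return one dict per hung-task header found in `text`.

    Reliable frames that follow a header are attached to that report
    until the next header appears; frames marked "?" are skipped.
    """
    reports = []
    current = None
    for line in text.splitlines():
        header = HUNG_RE.search(line)
        if header:
            current = {
                "comm": header["comm"],
                "pid": int(header["pid"]),
                "secs": int(header["secs"]),
                "frames": [],
            }
            reports.append(current)
            continue
        if current is None:
            continue
        frame = FRAME_RE.search(line)
        if frame and not frame["unreliable"]:
            current["frames"].append((frame["func"], frame["module"]))
    return reports


def waits_on_display_fence(report):
    """True if the trace matches the signature seen in this bug report."""
    funcs = {func for func, _module in report["frames"]}
    return {"dma_fence_default_wait", "nv50_disp_atomic_commit_tail"} <= funcs
```

Feeding it saved kernel log text (for example the output of dmesg written to a file) and filtering the result with waits_on_display_fence should list only the hung tasks that are stuck waiting on a fence from the nouveau display commit path.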
