Dec 2 21:05:25 local kernel: [ 0.955901] nouveau 0000:01:00.0: NVIDIA GM107 (117300a2)
Dec 2 21:05:25 local kernel: [ 0.992024] nouveau 0000:01:00.0: bios: version 82.07.9d.00.14
Dec 2 21:05:25 local kernel: [ 0.993477] nouveau 0000:01:00.0: fb: 4096 MiB GDDR5
Dec 2 21:05:25 local kernel: [ 0.993527] nouveau 0000:01:00.0: bus: MMIO read of 00000000 FAULT at 001228 [ IBUS ]
Dec 2 21:05:25 local kernel: [ 1.008241] nouveau 0000:01:00.0: bus: MMIO read of 00000000 FAULT at 10ac08 [ IBUS ]
Dec 2 21:05:25 local kernel: [ 1.061536] nouveau 0000:01:00.0: DRM: VRAM: 4096 MiB
Dec 2 21:05:25 local kernel: [ 1.061539] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB
Dec 2 21:05:25 local kernel: [ 1.061543] nouveau 0000:01:00.0: DRM: TMDS table version 2.0
Dec 2 21:05:25 local kernel: [ 1.061546] nouveau 0000:01:00.0: DRM: DCB version 4.0
Dec 2 21:05:25 local kernel: [ 1.061549] nouveau 0000:01:00.0: DRM: DCB outp 00: 04800fb6 04420010
Dec 2 21:05:25 local kernel: [ 1.061552] nouveau 0000:01:00.0: DRM: DCB outp 01: 02011fa6 04420010
Dec 2 21:05:25 local kernel: [ 1.061555] nouveau 0000:01:00.0: DRM: DCB outp 02: 02011f62 00020010
Dec 2 21:05:25 local kernel: [ 1.061558] nouveau 0000:01:00.0: DRM: DCB outp 03: 08022fc6 04420010
Dec 2 21:05:25 local kernel: [ 1.061561] nouveau 0000:01:00.0: DRM: DCB outp 04: 08022f82 00020010
Dec 2 21:05:25 local kernel: [ 1.061564] nouveau 0000:01:00.0: DRM: DCB outp 05: 01033fd6 04420020
Dec 2 21:05:25 local kernel: [ 1.061567] nouveau 0000:01:00.0: DRM: DCB outp 06: 01033f92 00020020
Dec 2 21:05:25 local kernel: [ 1.061570] nouveau 0000:01:00.0: DRM: DCB conn 00: 00002047
Dec 2 21:05:25 local kernel: [ 1.061573] nouveau 0000:01:00.0: DRM: DCB conn 01: 00001146
Dec 2 21:05:25 local kernel: [ 1.061575] nouveau 0000:01:00.0: DRM: DCB conn 02: 00010246
Dec 2 21:05:25 local kernel: [ 1.061578] nouveau 0000:01:00.0: DRM: DCB conn 03: 00020346
Dec 2 21:05:25 local kernel: [ 1.433020] nouveau 0000:01:00.0: DRM: MM: using COPY for buffer copies
Dec 2 21:05:25 local kernel: [ 1.535562] nouveau 0000:01:00.0: DRM: allocated 1920x1080 fb: 0x80000, bo 0000000071889fdf
Dec 2 21:05:25 local kernel: [ 1.853891] nouveau 0000:01:00.0: disp: 0x00006671[0]: INIT_GENERIC_CONDITON: unknown 0x07
Dec 2 21:05:25 local kernel: [ 2.034030] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
Dec 2 21:05:25 local kernel: [ 2.034061] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 0
Dec 2 22:35:07 local kernel: [ 5422.645466] nouveau 0000:01:00.0: gr: TRAP ch 4 [00ff85c000 X[3819]]
Dec 2 22:35:07 local kernel: [ 5422.645475] nouveau 0000:01:00.0: gr: GPC0/TPC3/MP trap: global 00000000 [] warp 3c000d [OOR_REG]
Dec 2 22:35:07 local kernel: [ 5422.646304] nouveau 0000:01:00.0: gr: TRAP ch 4 [00ff85c000 X[3819]]
Dec 2 22:35:07 local kernel: [ 5422.646316] nouveau 0000:01:00.0: gr: GPC0/PROP trap: 00000200 [] x = 0, y = 0, format = 0, storage type = fe
Dec 2 22:35:07 local kernel: [ 5422.646334] nouveau 0000:01:00.0: gr: TRAP ch 4 [00ff85c000 X[3819]]
Dec 2 22:35:07 local kernel: [ 5422.646346] nouveau 0000:01:00.0: gr: GPC0/PROP trap: 00000200 [] x = 384, y = 74, format = 0, storage type = fe
Dec 2 22:35:07 local kernel: [ 5422.646362] nouveau 0000:01:00.0: gr: TRAP ch 4 [00ff85c000 X[3819]]
Dec 2 22:35:07 local kernel: [ 5422.646373] nouveau 0000:01:00.0: gr: GPC0/PROP trap: 00000200 [] x = 352, y = 152, format = 0, storage type = fe
Dec 2 22:35:07 local kernel: [ 5422.646388] nouveau 0000:01:00.0: gr: TRAP ch 4 [00ff85c000 X[3819]]
Dec 2 22:35:07 local kernel: [ 5422.646399] nouveau 0000:01:00.0: gr: GPC0/PROP trap: 00000200 [] x = 448, y = 268, format = 0, storage type = fe
Dec 2 22:35:07 local kernel: [ 5422.646418] nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 000000000a721000 engine 00 [GR] client 0f [GPC0/PROP_0] reason 82 [] on channel 4 [00ff85c000 X[3819]]
Dec 2 22:35:07 local kernel: [ 5422.646425] nouveau 0000:01:00.0: fifo: channel 4: killed
Dec 2 22:35:07 local kernel: [ 5422.646427] nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
Dec 2 22:35:07 local kernel: [ 5422.646432] nouveau 0000:01:00.0: fifo: engine 0: scheduled for recovery
Dec 2 22:35:07 local kernel: [ 5422.646437] nouveau 0000:01:00.0: X[3819]: channel 4 killed!
Dec 2 22:35:31 local kernel: [ 5446.744051] sysrq: SysRq : Keyboard mode set to system default
Dec 2 22:35:32 local kernel: [ 5447.080135] sysrq: SysRq : Terminate All Tasks
It would be soooo cool if anyone would actually read this bug report and maybe try to fix it. I will assist in testing patches until this is resolved. And: I am willing to offer $100 for fixing this annoying bug! It keeps freezing my 4.19.39 kernel out of nowhere.

Some things I would like to bring into the discussion. It might have something to do with:
a) memory pressure, _and_
b) high CPU load _or_ a high number of context switches (I'm not sure about the latter).

The bug actually always occurs when I, e.g., compile two kernels at -j24 and have some other work running besides this, say a YT video. The bug is, however, definitely triggered by a graphics event, e.g. resizing/creating a window, scrolling a web page or watching a video.

[2019-05-04 15:43:24] err kern 03 kernel : [ 523.906459] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
[2019-05-04 15:43:24] notice kern 05 kernel : [ 523.906467] nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
[2019-05-04 15:43:24] notice kern 05 kernel : [ 523.906473] nouveau 0000:01:00.0: fifo: channel 2: killed
[2019-05-04 15:43:24] notice kern 05 kernel : [ 523.906479] nouveau 0000:01:00.0: fifo: engine 0: scheduled for recovery
[2019-05-04 15:43:24] warning kern 04 kernel : [ 523.906789] nouveau 0000:01:00.0: X[8006]: channel 2 killed!
[2019-05-04 15:43:24] err kern 03 kernel : nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
[2019-05-04 15:43:24] notice kern 05 kernel : nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
[2019-05-04 15:43:24] notice kern 05 kernel : nouveau 0000:01:00.0: fifo: channel 2: killed
[2019-05-04 15:43:24] notice kern 05 kernel : nouveau 0000:01:00.0: fifo: engine 0: scheduled for recovery
[2019-05-04 15:43:24] warning kern 04 kernel : nouveau 0000:01:00.0: X[8006]: channel 2 killed!
[2019-05-04 15:44:24] info kern 06 kernel : [ 584.121331] sysrq: SysRq : Keyboard mode set to system default
[2019-05-04 15:44:24] info kern 06 kernel : sysrq: SysRq : Keyboard mode set to system default
Here's another freeze report, from:

$ uname -a
Linux ceo1homenx 5.2.0-8-generic #9-Ubuntu SMP Mon Jul 8 13:07:27 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

syslog just before the lock:

Jul 21 09:45:20 ceo1homenx kernel: [89849.919490] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
Jul 21 09:45:20 ceo1homenx kernel: [89849.919500] nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
Jul 21 09:45:20 ceo1homenx kernel: [89849.919506] nouveau 0000:01:00.0: fifo: channel 8: killed
Jul 21 09:45:20 ceo1homenx kernel: [89849.919511] nouveau 0000:01:00.0: fifo: engine 0: scheduled for recovery
Jul 21 09:45:20 ceo1homenx kernel: [89849.919815] nouveau 0000:01:00.0: Xorg[1546]: channel 8 killed!
-- hard lock --
I think I got this issue, too:

→ uname -a
Linux sticke 4.19.66-1-MANJARO #1 SMP PREEMPT Fri Aug 9 18:01:53 UTC 2019 x86_64 GNU/Linux

→ journalctl -b-1
Aug 13 15:36:55 sticke kernel: nouveau 0000:01:00.0: fifo: fault 01 [WRITE] at 0000000000240000 engine 00 [GR] client 0f [GPC0/PROP_0] reason 82 [] on channel 2 [003fbec000 Xorg[634]]
Aug 13 15:36:55 sticke kernel: nouveau 0000:01:00.0: fifo: channel 2: killed
Aug 13 15:36:55 sticke kernel: nouveau 0000:01:00.0: fifo: runlist 0: scheduled for recovery
Aug 13 15:36:55 sticke kernel: nouveau 0000:01:00.0: fifo: engine 0: scheduled for recovery
Aug 13 15:36:55 sticke kernel: nouveau 0000:01:00.0: Xorg[634]: channel 2 killed!

Btw, I am running exactly the same OS (from a USB stick) on different hardware, where this problem does not occur. Please let me know if I should post more details (and which ones).
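
For anyone comparing their own logs against the reports in this thread, the recurring signatures (the CTXSW_TIMEOUT scheduler error, the fifo fault line, and the final "channel N killed!" line) can be counted from a saved log with a few regular expressions. This is only a rough sketch: the pattern table and the function name `count_signatures` are made up for illustration, and the patterns are derived purely from the lines quoted above, so other kernel versions may word these messages differently.

```python
import re
from collections import Counter

# Hypothetical pattern table, built only from the log lines quoted in this thread.
SIGNATURES = {
    "ctxsw_timeout": re.compile(r"nouveau .*fifo: SCHED_ERROR 0a \[CTXSW_TIMEOUT\]"),
    "fifo_fault": re.compile(r"nouveau .*fifo: fault [0-9a-f]{2} \["),
    "channel_killed": re.compile(r"nouveau .*channel \d+ killed!"),
}


def count_signatures(log_text):
    """Count how many lines of log_text match each known signature."""
    counts = Counter()
    for line in log_text.splitlines():
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Two lines copied from the journalctl output above.
    sample = (
        "Aug 13 15:36:55 sticke kernel: nouveau 0000:01:00.0: fifo: fault 01 [WRITE] "
        "at 0000000000240000 engine 00 [GR] client 0f [GPC0/PROP_0] reason 82 [] "
        "on channel 2 [003fbec000 Xorg[634]]\n"
        "Aug 13 15:36:55 sticke kernel: nouveau 0000:01:00.0: Xorg[634]: channel 2 killed!\n"
    )
    for name, n in count_signatures(sample).items():
        print(name, n)
```

Feeding it the text of `journalctl -k -b -1` (or a saved syslog file) should show at a glance whether a freeze looks like the timeout variant or the fault variant reported here.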
sf@sf-T3600 ~ % uname -a
Linux sf-T3600 6.4.10-arch1-1 #1 SMP PREEMPT_DYNAMIC Fri, 11 Aug 2023 11:03:36 +0000 x86_64 GNU/Linux

Aug 16 08:21:06 sf-T3600 kernel: nouveau 0000:03:00.0: fifo: fault 01 [WRITE] at 000000000002e000 engine 15 [PCE0] client 01 [HUB/PCOPY0] reason 02 [PAGE_NOT_PRES>
Aug 16 08:21:06 sf-T3600 kernel: nouveau 0000:03:00.0: fifo:000000:0001:[(udev-worker)[738]] rc scheduled
Aug 16 08:21:06 sf-T3600 kernel: nouveau 0000:03:00.0: fifo:000000: rc scheduled
Aug 16 08:21:06 sf-T3600 kernel: nouveau 0000:03:00.0: fifo:000000:0001:0001:[(udev-worker)[738]] errored - disabling channel
Aug 16 08:21:06 sf-T3600 kernel: nouveau 0000:03:00.0: DRM: channel 1 killed!
Aug 16 08:21:08 sf-T3600 kernel: sched: RT throttling activated
Aug 16 08:21:44 sf-T3600 rtkit-daemon[1065]: Supervising 8 threads of 5 processes of 1 users.
Aug 16 08:21:44 sf-T3600 rtkit-daemon[1065]: Supervising 8 threads of 5 processes of 1 users.
Aug 16 08:21:49 sf-T3600 kernel: ------------[ cut here ]------------
Aug 16 08:21:49 sf-T3600 kernel: WARNING: CPU: 0 PID: 19149 at mm/gup.c:1101 __get_user_pages+0x582/0x680
Aug 16 08:21:49 sf-T3600 kernel: Modules linked in: snd_seq_dummy snd_hrtimer snd_seq rfkill intel_rapl_msr intel_rapl_common uvcvideo x86_pkg_temp_thermal intel_>
Aug 16 08:21:49 sf-T3600 kernel: crct10dif_pclmul crc32_pclmul crc32c_intel polyval_clmulni uas polyval_generic usb_storage usbhid gf128mul ghash_clmulni_intel s>
Aug 16 08:21:49 sf-T3600 kernel: CPU: 0 PID: 19149 Comm: chrome_crashpad Tainted: G W OE 6.4.10-arch1-1 #1 2d4402bf7ad4a7ea488c9261840b8101c9d1e712
Aug 16 08:21:49 sf-T3600 kernel: Hardware name: Dell Inc. Precision T3600/08HPGT, BIOS A15 05/08/2017
Aug 16 08:21:49 sf-T3600 kernel: RIP: 0010:__get_user_pages+0x582/0x680
Aug 16 08:21:49 sf-T3600 kernel: Code: 00 e9 cb fd ff ff 48 03 bd 88 00 00 00 e9 c7 fb ff ff 48 81 e1 00 f0 ff ff e9 4b fc ff ff 48 81 e2 00 f0 ff ff e9 b5 fc ff >
Aug 16 08:21:49 sf-T3600 kernel: RSP: 0018:ffffb531ccc17bf8 EFLAGS: 00010202
Aug 16 08:21:49 sf-T3600 kernel: RAX: ffff94633a009cc0 RBX: 000000000005000a RCX: 00007ffdc4e02fff
Aug 16 08:21:49 sf-T3600 kernel: RDX: 0000000000000000 RSI: 00007eff84e4b000 RDI: ffff9463c986a080
Aug 16 08:21:49 sf-T3600 kernel: RBP: ffff946398f0bc80 R08: ffff94633a0b8008 R09: 0000000000000001
Aug 16 08:21:49 sf-T3600 kernel: R10: ffff94633a0b8080 R11: ffff94633a0b800c R12: 0000000000000000
Aug 16 08:21:49 sf-T3600 kernel: R13: ffff94633a009cc0 R14: ffffb531ccc17cbc R15: ffffb531ccc17cbc
Aug 16 08:21:49 sf-T3600 kernel: FS: 00007f7035d135c0(0000) GS:ffff946a2f600000(0000) knlGS:0000000000000000
Aug 16 08:21:49 sf-T3600 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Aug 16 08:21:49 sf-T3600 kernel: CR2: 000036cc0040c300 CR3: 000000018e95e004 CR4: 00000000000626f0
Aug 16 08:21:49 sf-T3600 kernel: Call Trace:
Aug 16 08:21:49 sf-T3600 kernel: <TASK>
Aug 16 08:21:49 sf-T3600 kernel: ? __get_user_pages+0x582/0x680
Aug 16 08:21:49 sf-T3600 kernel: ? __warn+0x81/0x130
Aug 16 08:21:49 sf-T3600 kernel: ? __get_user_pages+0x582/0x680
Aug 16 08:21:49 sf-T3600 kernel: ? report_bug+0x171/0x1a0
Aug 16 08:21:49 sf-T3600 kernel: ? handle_bug+0x3c/0x80
Aug 16 08:21:49 sf-T3600 kernel: ? exc_invalid_op+0x17/0x70
Aug 16 08:21:49 sf-T3600 kernel: ? asm_exc_invalid_op+0x1a/0x20
Aug 16 08:21:49 sf-T3600 kernel: ? __get_user_pages+0x582/0x680
Aug 16 08:21:49 sf-T3600 kernel: ? __get_user_pages+0x8a/0x680
Aug 16 08:21:49 sf-T3600 kernel: get_user_pages_remote+0x14a/0x400
Aug 16 08:21:49 sf-T3600 kernel: __access_remote_vm+0x1bf/0x420
Aug 16 08:21:49 sf-T3600 kernel: mem_rw.isra.0+0x111/0x1d0
Aug 16 08:21:49 sf-T3600 kernel: vfs_read+0xac/0x320
Aug 16 08:21:49 sf-T3600 kernel: ? mem_rw.isra.0+0x18a/0x1d0
Aug 16 08:21:49 sf-T3600 kernel: ? vfs_read+0xac/0x320
Aug 16 08:21:49 sf-T3600 kernel: __x64_sys_pread64+0x98/0xd0
Aug 16 08:21:49 sf-T3600 kernel: do_syscall_64+0x60/0x90
Aug 16 08:21:49 sf-T3600 kernel: ? __x64_sys_pread64+0xa8/0xd0
Aug 16 08:21:49 sf-T3600 kernel: ? syscall_exit_to_user_mode+0x1b/0x40
Aug 16 08:21:49 sf-T3600 kernel: ? do_syscall_64+0x6c/0x90
Aug 16 08:21:49 sf-T3600 kernel: ? do_syscall_64+0x6c/0x90
Aug 16 08:21:49 sf-T3600 kernel: entry_SYSCALL_64_after_hwframe+0x77/0xe1
Aug 16 08:21:49 sf-T3600 kernel: RIP: 0033:0x7f7035ae8d07
Aug 16 08:21:49 sf-T3600 kernel: Code: 08 89 3c 24 48 89 4c 24 18 e8 85 00 fa ff 4c 8b 54 24 18 48 8b 54 24 10 41 89 c0 48 8b 74 24 08 8b 3c 24 b8 11 00 00 00 0f >
Aug 16 08:21:49 sf-T3600 kernel: RSP: 002b:00007fff5f79a8f0 EFLAGS: 00000293 ORIG_RAX: 0000000000000011
Aug 16 08:21:49 sf-T3600 kernel: RAX: ffffffffffffffda RBX: 0000000000001000 RCX: 00007f7035ae8d07
Aug 16 08:21:49 sf-T3600 kernel: RDX: 0000000000001000 RSI: 00007fff5f79abe0 RDI: 0000000000000007
Aug 16 08:21:49 sf-T3600 kernel: RBP: 00007fff5f79aa90 R08: 0000000000000000 R09: 000055f24f499c20
Aug 16 08:21:49 sf-T3600 kernel: R10: 00007eff84e4a880 R11: 0000000000000293 R12: 00007eff84e4a880
Aug 16 08:21:49 sf-T3600 kernel: R13: 000036cc00218380 R14: 00007fff5f79abe0 R15: 0000000000001000
Aug 16 08:21:49 sf-T3600 kernel: </TASK>
Aug 16 08:21:49 sf-T3600 kernel: ---[ end trace 0000000000000000 ]---
Aug 16 08:22:08 sf-T3600 systemd[1]: Started Getty on tty5.
Please refile here: https://gitlab.freedesktop.org/drm/nouveau/-/issues/
4.19.6 is terribly old and outdated regardless. Please try at least 4.19.292.
I copied my info and created an issue here: https://gitlab.freedesktop.org/drm/nouveau/-/issues/256