Bug 206781 - GM104 (GeForce 840m) with many errors "fifo: SCHED_ERROR 20"
Summary: GM104 (GeForce 840m) with many errors "fifo: SCHED_ERROR 20"
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: Video(DRI - non Intel)
Hardware: All Linux
Importance: P1 high
Assignee: drivers_video-dri
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-03-07 21:41 UTC by gsedej
Modified: 2020-03-08 11:16 UTC
CC List: 1 user

See Also:
Kernel Version: 5.4
Subsystem:
Regression: No
Bisected commit-id:


Attachments
dmesg containing errors - shortened at end (77.82 KB, text/plain)
2020-03-07 21:41 UTC, gsedej

Description gsedej 2020-03-07 21:41:54 UTC
Created attachment 287821
dmesg containing errors - shortened at end

Laptop: HP Envy 15 with an i7-4700MQ and a GeForce 840M, running Ubuntu 18.04 and 20.04.

nouveau writes many megabytes to the system logs (kern.log and syslog), repeating the line
"nouveau 0000:07:00.0: fifo: SCHED_ERROR 20 []"


The full log is in the attachment.

Some relevant output from before the first "SCHED_ERROR":


nouveau 0000:07:00.0: NVIDIA GM108 (118010a2)
...
nouveau 0000:07:00.0: bios: version 82.08.14.00.0e
...
nouveau 0000:07:00.0: fb: 2048 MiB DDR3
nouveau 0000:07:00.0: bus: MMIO read of 00000000 FAULT at 6013d4 [ IBUS ]
...
nouveau 0000:07:00.0: bus: MMIO read of 00000000 FAULT at 10ac08 [ IBUS ]
nouveau 0000:07:00.0: fifo: fault 01 [WRITE] at 0000000000150000 engine 05 [BAR2] client 08 [HUB/HOST_CPU_NB] reason 02 [PTE] on channel -1 [007fd38000 unknown]
nouveau 0000:07:00.0: fifo: fault 01 [WRITE] at 0000000000000000 engine 05 [BAR2] client 08 [HUB/HOST_CPU_NB] reason 0a [UNSUPPORTED_APERTURE] on channel -1 [007fd38000 unknown]
vga_switcheroo: enabled
[TTM] Zone  kernel: Available graphics memory: 8164024 KiB
nouveau 0000:07:00.0: fifo: SCHED_ERROR 20 []
nouveau 0000:07:00.0: fifo: SCHED_ERROR 20 []
nouveau 0000:07:00.0: fifo: SCHED_ERROR 20 []
...
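
To put a number on the log spam, the repeated line can be counted instead of read. The following is only a minimal sketch: it assumes the message format quoted above, and the optional leading "[ timestamp]" prefix and the sample lines are assumptions about the attached dmesg, not taken from it.

```python
import re
from collections import Counter

# Matches the repeated message quoted in this report. The optional leading
# "[   12.345678] " timestamp is an assumption about the dmesg layout.
SCHED_RE = re.compile(
    r"^(?:\[\s*\d+\.\d+\]\s*)?"
    r"nouveau (?P<dev>[0-9a-f:.]+): fifo: SCHED_ERROR (?P<code>[0-9a-f]+) \["
)


def count_sched_errors(lines):
    """Return a Counter keyed by (PCI device, SCHED_ERROR code)."""
    counts = Counter()
    for line in lines:
        match = SCHED_RE.match(line.strip())
        if match:
            counts[(match.group("dev"), match.group("code"))] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample; in practice, iterate over the lines of a saved dmesg file.
    sample = [
        "vga_switcheroo: enabled",
        "nouveau 0000:07:00.0: fifo: SCHED_ERROR 20 []",
        "nouveau 0000:07:00.0: fifo: SCHED_ERROR 20 []",
        "[   12.345678] nouveau 0000:07:00.0: fifo: SCHED_ERROR 20 []",
    ]
    for (dev, code), n in count_sched_errors(sample).items():
        print(f"{dev}: SCHED_ERROR {code} seen {n} times")
```

The count per device and code is usually enough to show in a report how fast the log grows, without attaching the full file.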


I have also reported this at: https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/issues/522
Comment 1 Ilia Mirkin 2020-03-07 22:25:49 UTC
As mentioned on IRC, the MMIO faults at the start aren't anything to worry about.

These two I've never seen before:

nouveau 0000:07:00.0: fifo: fault 01 [WRITE] at 0000000000150000 engine 05 [BAR2] client 08 [HUB/HOST_CPU_NB] reason 02 [PTE] on channel -1 [007fd38000 unknown]
nouveau 0000:07:00.0: fifo: fault 01 [WRITE] at 0000000000000000 engine 05 [BAR2] client 08 [HUB/HOST_CPU_NB] reason 0a [UNSUPPORTED_APERTURE] on channel -1 [007fd38000 unknown]

The SCHED_ERROR is most frequently due to the ctxsw firmware timing out or hitting some other error. But given the faults above, it could easily be a follow-on from those.
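
When comparing fault lines like the two quoted above across several logs, it can help to split them into named fields. This is a rough sketch that assumes exactly the line layout shown in this report; the field names (`access`, `engine`, `client`, `reason`, and so on) are labels chosen here for illustration and are not nouveau's own terminology.

```python
import re

# Layout assumed from the two fault lines quoted in this report.
FAULT_RE = re.compile(
    r"fifo: fault (?P<access_code>[0-9a-f]+) \[(?P<access>[^\]]*)\] "
    r"at (?P<addr>[0-9a-f]+) "
    r"engine (?P<engine_code>[0-9a-f]+) \[(?P<engine>[^\]]*)\] "
    r"client (?P<client_code>[0-9a-f]+) \[(?P<client>[^\]]*)\] "
    r"reason (?P<reason_code>[0-9a-f]+) \[(?P<reason>[^\]]*)\] "
    r"on channel (?P<channel>-?\d+)"
)


def parse_fault(line):
    """Return a dict of fault fields, or None if the line is not a fifo fault."""
    match = FAULT_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["addr"] = int(fields["addr"], 16)   # faulting address as an integer
    fields["channel"] = int(fields["channel"])  # -1 in the lines quoted here
    return fields


if __name__ == "__main__":
    line = (
        "nouveau 0000:07:00.0: fifo: fault 01 [WRITE] at 0000000000150000 "
        "engine 05 [BAR2] client 08 [HUB/HOST_CPU_NB] reason 02 [PTE] "
        "on channel -1 [007fd38000 unknown]"
    )
    fault = parse_fault(line)
    print(fault["engine"], fault["reason"], hex(fault["addr"]))
```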
Comment 2 gsedej 2020-03-08 11:16:29 UTC
First, a practical question:
what can be done so that other users do not have to set "nouveau.noaccel=1" or other kernel options just to use the system (assuming they do not need 3D from the dedicated card)?
I was also told this might be an issue with the system's (Ubuntu's) logger or something similar.
As for me, I use Ubuntu 16.04 with the NVIDIA blob on this laptop. But whenever I try other Ubuntu versions, I get this spam in the system logs.
(The laptop is an HP ENVY 15-j192nf - K4D07EAR#ABF.)
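
For anyone checking whether a given boot actually had the workaround applied, the kernel command line (readable from /proc/cmdline on Linux) can be scanned for nouveau options. A small sketch; the sample command line below is hypothetical, and only "nouveau.noaccel=1" comes from this report.

```python
def nouveau_params(cmdline):
    """Return {name: value} for every 'nouveau.<name>=<value>' token."""
    params = {}
    for token in cmdline.split():
        if token.startswith("nouveau.") and "=" in token:
            name, value = token[len("nouveau."):].split("=", 1)
            params[name] = value
    return params


if __name__ == "__main__":
    # Hypothetical command line; on a live system, read the text of /proc/cmdline instead.
    sample = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro quiet splash nouveau.noaccel=1"
    params = nouveau_params(sample)
    print("noaccel set:", params.get("noaccel") == "1")
```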


As for the actual bug: how can this issue be diagnosed further?
I found similar issues:
https://bugzilla.kernel.org/show_bug.cgi?id=101911
https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/issues/175
https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/issues/352
https://bbs.archlinux.org/viewtopic.php?id=231444
