Bug 81711

Summary: [855gm] Thinkpad R51 reports [drm:i9xx_set_fifo_underrun_reporting] *ERROR* pipe A underrun
Product: Drivers Reporter: Stuart Foster (smf-linux)
Component: Video(DRI - Intel)Assignee: intel-gfx-bugs (intel-gfx-bugs)
Status: RESOLVED WILL_NOT_FIX    
Severity: normal CC: daniel, dev, intel-gfx-bugs, jarek.hryszko, jdelvare, jidanni, peter+linux, tiwai, ville.syrjala
Priority: P5    
Hardware: i386   
OS: Linux   
Kernel Version: 3.17 Subsystem:
Regression: Yes Bisected commit-id:
Attachments: .config dmesg lspci and xorg logs
linux-3.17-rc5 dmesg output
drm debug dmesg report
dmesg from drm-intel-nightly git build

Description Stuart Foster 2014-08-05 08:08:32 UTC
Created attachment 145201 [details]
.config dmesg lspci and xorg logs

Found the following when booting my Thinkpad R51 with the latest 3.16.0 kernel. Apart from the dmesg report the system is behaving normally, is this just a consequence of extra reporting in the kernel or is it indicating a real underlying
problem ?

[    0.478413] [drm] Initialized drm 1.1.0 20060810
[    0.478804] [drm] Memory usable by graphics device = 128M
[    0.478864] [drm] Replacing VGA console driver
[    0.480355] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[    0.480366] [drm] Driver supports precise vblank timestamp query.
[    0.485336] [drm:i9xx_set_fifo_underrun_reporting] *ERROR* pipe A underrun
[    0.497316] [drm] initialized overlay support
[    0.520239] [drm:i9xx_set_fifo_underrun_reporting] *ERROR* pipe A underrun
[    0.550374] fbcon: inteldrmfb (fb0) is primary device
[    1.135005] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
[    1.135095] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
[   84.457009] [drm:i9xx_check_fifo_underruns] *ERROR* pipe B underrun
[   86.541733] [drm:i9xx_check_fifo_underruns] *ERROR* pipe B underrun
[   86.555900] [drm:i9xx_set_fifo_underrun_reporting] *ERROR* pipe A underrun
[   86.598726] [drm:i9xx_set_fifo_underrun_reporting] *ERROR* pipe A underrun

The system is LFS, glibc 2.19 gcc 4.8.2

thanks.
Comment 1 Stuart Foster 2014-08-16 20:49:16 UTC
The condition is also present in linux-3.17-rc1.
Comment 2 Stuart Foster 2014-08-26 15:39:08 UTC
Also linux-3-17-rc2.
Comment 3 Stuart Foster 2014-09-01 15:43:00 UTC
Also linux-3.17-rc3
Comment 4 Stuart Foster 2014-09-11 16:02:54 UTC
Also linux-3.17-rc4
Comment 5 Stuart Foster 2014-09-16 09:08:27 UTC
Created attachment 150551 [details]
linux-3.17-rc5 dmesg output

With rc5 in addition to the fifo under run there is a stacktrace being generated. They of course may not be related.

[   38.244822] [drm:i9xx_check_fifo_underruns] *ERROR* pipe B underrun
[   38.258719] [drm:i9xx_set_fifo_underrun_reporting] *ERROR* pipe A underrun
[   38.352009] [drm:i9xx_set_fifo_underrun_reporting] *ERROR* pipe A underrun
[   38.398655] [drm:i9xx_set_fifo_underrun_reporting] *ERROR* pipe A underrun
[   40.107762] [drm:i8xx_irq_handler] *ERROR* pipe B underrun
[   40.439655] [drm:i9xx_set_fifo_underrun_reporting] *ERROR* pipe A underrun
[   40.482330] [drm:i9xx_set_fifo_underrun_reporting] *ERROR* pipe A underrun
[   40.525655] [drm:i9xx_set_fifo_underrun_reporting] *ERROR* pipe A underrun
[   85.069498] ------------[ cut here ]------------
[   85.069517] WARNING: CPU: 0 PID: 1935 at drivers/gpu/drm/i915/i915_gem.c:4457 i915_gem_free_object+0x355/0x370()
[   85.069520] Modules linked in: nfsv3 nfs fscache nfsd lockd sunrpc af_packet microcode sha256_generic cbc floppy ac e100 thermal 8250_pci m
ii snd_intel8x0 snd_ac97_codec 8250 battery i2c_i801 sg serial_core psmouse ac97_bus rng_core ehci_pci dm_crypt dm_mod usb_storage pl2303 usbs
erial snd_pcm_oss snd_pcm snd_timer snd_mixer_oss acpi_cpufreq processor wmi hdaps input_polldev thinkpad_acpi hwmon snd soundcore nvram ohci_
hcd uhci_hcd ehci_hcd usbcore usb_common unix
[   85.069596] CPU: 0 PID: 1935 Comm: X Not tainted 3.17.0-rc5 #1
[   85.069599] Hardware name: IBM 2889SG1/2889SG1, BIOS 1VET71WW (1.29 ) 12/01/2006
[   85.069602]  c14761ca c1034f4f c1575310 00000000 0000078f c15aa51c 00001169 c128e935
[   85.069612]  c128e935 c1c76b00 00000000 c1c76b68 c1c76b34 c1035019 00000009 00000000
[   85.069621]  c128e935 f6377980 f6314c00 f6001e00 f6ed4ee0 00000000 c1ce0000 c1c76b00
[   85.069631] Call Trace:
[   85.069640]  [<c14761ca>] ? dump_stack+0xa/0x13
[   85.069648]  [<c1034f4f>] ? warn_slowpath_common+0x7f/0xb0
[   85.069655]  [<c128e935>] ? i915_gem_free_object+0x355/0x370
[   85.069661]  [<c128e935>] ? i915_gem_free_object+0x355/0x370
[   85.069666]  [<c1035019>] ? warn_slowpath_null+0x19/0x20
[   85.069672]  [<c128e935>] ? i915_gem_free_object+0x355/0x370
[   85.069682]  [<c1252a95>] ? drm_gem_object_free+0x15/0x20
[   85.069689]  [<c1252d54>] ? drm_gem_object_handle_unreference_unlocked+0xa4/0xd0
[   85.069695]  [<c1252dfb>] ? drm_gem_handle_delete+0x7b/0xc0
[   85.069700]  [<c1253590>] ? drm_gem_handle_create+0x30/0x30
[   85.069705]  [<c1253f3d>] ? drm_ioctl+0x19d/0x5c0
[   85.069711]  [<c1253590>] ? drm_gem_handle_create+0x30/0x30
[   85.069718]  [<c10a9e22>] ? unmap_single_vma+0x3e2/0x510
[   85.069725]  [<c1096fc7>] ? lru_add_drain+0x7/0x10
[   85.069732]  [<c10b79b0>] ? free_pages_and_swap_cache+0x10/0x90
[   85.069738]  [<c10a89bb>] ? tlb_flush_mmu_free+0x1b/0x30
[   85.069743]  [<c10a91a9>] ? tlb_finish_mmu+0x9/0x40
[   85.069749]  [<c10adc58>] ? unmap_region+0x98/0xe0
[   85.069754]  [<c1253da0>] ? drm_copy_field+0x70/0x70
[   85.069759]  [<c10d717c>] ? do_vfs_ioctl+0x33c/0x4f0
[   85.069765]  [<c1252c7a>] ? drm_gem_vm_close+0x2a/0x60
[   85.069771]  [<c10adcde>] ? remove_vma+0x3e/0x50
[   85.069777]  [<c10adcde>] ? remove_vma+0x3e/0x50
[   85.069783]  [<c10afb00>] ? do_munmap+0x1f0/0x2d0
[   85.069788]  [<c10d7373>] ? SyS_ioctl+0x43/0x80
[   85.069796]  [<c147af0b>] ? sysenter_do_call+0x12/0x12
[   85.069800] ---[ end trace 88b45625225c3595 ]---
Comment 6 Stuart Foster 2014-09-22 12:32:49 UTC
Same two issues present in linux-3.17-rc6
Comment 7 Rodrigo Vivi 2014-09-23 00:38:22 UTC
It this a regression?

Could you please boot your kernel with drm.debug=0xe than re-attach dmesg?

Thanks
Comment 8 Stuart Foster 2014-09-23 08:30:37 UTC
Created attachment 151571 [details]
drm debug dmesg report

Herewith the debug dmesg you requested. As to the regression question, the stack trace looks familiar but I cannot find anything similar in any of the bugs I have a record of.

thanks
Comment 9 Jani Nikula 2014-09-24 11:47:12 UTC
You might want to try drm-intel-nightly from http://cgit.freedesktop.org/drm-intel.
Comment 10 Stuart Foster 2014-09-24 22:53:28 UTC
Created attachment 151851 [details]
dmesg from drm-intel-nightly git build

Both the underrun and stack trace are generated using the drm-intel-nightly build.
Comment 11 Stuart Foster 2014-09-29 10:21:26 UTC
Nothing different in 3.17.0-rc7.
Comment 12 Stuart Foster 2014-10-06 19:45:25 UTC
Same conditions in 3.17
Comment 13 Jean Delvare 2014-10-18 16:54:45 UTC
I see similar pipe underrun messages on my old Panasonic Toughbook CF-18 laptop after updating to openSUSE 13.2 RC1 (kernel based on 3.16.3.) Before that it was running openSUSE 11.4, where kernel was based on 3.7.10, and I never saw these error messages. So the problem indeed qualifies as a regression.

The graphics chip in my laptop is:
00:02.0 VGA compatible controller [0300]: Intel Corporation 82852/855GM Integrated Graphics Device [8086:3582] (rev 02)
00:02.1 Display controller [0380]: Intel Corporation 82852/855GM Integrated Graphics Device [8086:3582] (rev 02)

No errors nor warnings visible in Xorg.0.log.

Relevant kernel log messages:
[    5.462859] [drm] Initialized drm 1.1.0 20060810
[    5.516140] [drm] Memory usable by graphics device = 128M
[    5.516146] [drm] Replacing VGA console driver
[    5.516158] fb: switching to inteldrmfb from EFI VGA
[    5.521916] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[    5.521924] [drm] Driver supports precise vblank timestamp query.
[    5.580087] [drm] GMBUS [i915 gmbus panel] timed out, falling back to bit banging on pin 3
[    5.599601] [drm:i9xx_set_fifo_underrun_reporting] *ERROR* pipe A underrun
[    5.619742] [drm] initialized overlay support
[    5.698544] [drm:i9xx_set_fifo_underrun_reporting] *ERROR* pipe A underrun
[    5.780209] fbcon: inteldrmfb (fb0) is primary device
[    6.592586] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
[    6.592616] [drm] Initialized i915 1.6.0 20080730 for 0000:00:02.0 on minor 0
[    6.642304] [drm:i9xx_check_fifo_underruns] *ERROR* pipe B underrun
[    6.725855] [drm:i9xx_set_fifo_underrun_reporting] *ERROR* pipe A underrun
[   47.458008] [drm:i9xx_set_fifo_underrun_reporting] *ERROR* pipe A underrun
[   47.736185] [drm:i9xx_set_fifo_underrun_reporting] *ERROR* pipe A underrun
[   48.686871] [drm:i9xx_set_fifo_underrun_reporting] *ERROR* pipe A underrun
[   75.032451] [drm:i9xx_set_fifo_underrun_reporting] *ERROR* pipe A underrun
[   75.428211] [drm:i9xx_set_fifo_underrun_reporting] *ERROR* pipe A underrun
[ 1402.930605] [drm:i8xx_irq_handler] *ERROR* pipe B underrun
[ 5332.865613] [drm:i8xx_irq_handler] *ERROR* pipe B underrun
[ 6240.731628] [drm:i8xx_irq_handler] *ERROR* pipe B underrun
Comment 14 Dan Jacobson 2014-11-20 23:03:37 UTC
In addition, on the R51, upon starting chromium, I see
libGL error: failed to load driver: i915
Comment 15 Jean Delvare 2014-11-25 10:37:24 UTC
(In reply to Dan Jacobson from comment #14)
> In addition, on the R51, upon starting chromium, I see
> libGL error: failed to load driver: i915

I very much doubt this has anything to do with the initial bug report. More likely this is caused by the lack of DRI3 support for old hardware in the i915 driver.
Comment 16 Jean Delvare 2014-11-25 10:38:28 UTC
Would it help if I bisect the kernel to find out which commit introduced this issue?
Comment 17 Takashi Iwai 2014-11-25 11:28:38 UTC
(In reply to Jean Delvare from comment #15)
> (In reply to Dan Jacobson from comment #14)
> > In addition, on the R51, upon starting chromium, I see
> > libGL error: failed to load driver: i915
> 
> I very much doubt this has anything to do with the initial bug report. More
> likely this is caused by the lack of DRI3 support for old hardware in the
> i915 driver.

Right, this message is an infamous harmless message from Mesa, which failed to load from the update path.  You can ignore it safely.
Comment 18 Dan Jacobson 2014-11-25 11:37:39 UTC
(In reply to Takashi Iwai from comment #17)
> Right, this message is an infamous harmless message from Mesa, which failed
> to load from the update path.  You can ignore it safely.
Thanks. Hope they fix the message.
Comment 19 Dan Jacobson 2014-11-25 11:38:33 UTC
(In reply to Dan Jacobson from comment #14)
> In addition, on the R51
(Oops, I own an R50e.)
Comment 20 Jean Delvare 2014-12-02 09:58:42 UTC
I bisected this issue and here are my results.

First of all, there are two sets of error messages to consider: the one from i9xx_set_fifo_underrun_reporting, and the rest.

The error message from i9xx_set_fifo_underrun_reporting appeared with commit e69abff0d6794311d834de0fa2f188eb24a977b9 ("drm/i915: Check for FIFO underuns when disabling reporting on gmch platforms" in v3.16.) This commit is merely adding detection and reporting of the error condition, so this is not a regression per se.

The other messages appeared with commit fc2c807b7a2b2ca8dbe2aed2f5ae730c19beeda5 ("drm/i915: Make underruns DRM_ERROR" in v3.15.) Again this commit is only making the reporting of the error conditions unconditional, so this is not a regression per se.

As a summary, the underruns have always been there, just they were not reported as errors before.

Can someone please explain how bad these pipe underruns are? And if/how they can be fixed?

I admit I find it strange that messages that were reported at debug level before were suddenly considered so important that they needed to be reported at error level. I understand that driver developers would like to see these messages without having to turn debugging on, but may I suggest that info level would seem more appropriate?

The reason why I am suggesting that is that these early error messages kill the nice splash screen at boot time. Info messages would not.
Comment 21 Jani Nikula 2015-06-16 09:37:55 UTC
Updating priority to reflect reality. Please try later kernels.
Comment 22 Jean Delvare 2015-07-06 13:24:21 UTC
I just tested kernel 4.1.1 and I still get fifo underrun error messages in the kernel log at boot time.

Anyway my original request was not to fix all these underruns, but to adjust the log level to something more realistic. See comment #20.
Comment 23 Jani Nikula 2015-10-07 11:11:12 UTC
I think it's fair to say we are not working on this. Sorry.

Please file any new bug reports at the freedesktop.org bugzilla [1]. Thank you.

[1] https://bugs.freedesktop.org/enter_bug.cgi?product=DRI&component=DRM/Intel
Comment 24 Jarek Hryszko 2016-01-24 14:58:50 UTC
Gentlemen, I am sure that all of you who struggle with above issue will find this interesting -- apart from having my console messed up with the error message mentioned, my laptop also suffered from going into suspended mode, after random periods of time (mostly few seconds). So, I set Debian's setting "HandleSuspendKey" and "HandleLidSwitch" to ignore in /etc/systemd/logind.conf.
After that action, apart from getting rid of suspending problem, also [drm:i9xx_set_fifo... messages disappeared! Hope it will help.
Comment 25 Daniel Vetter 2016-01-25 09:28:13 UTC
Yeah, this happens on these old platforms - the "helpful" bios interferes with the display state and leaves a giantic mess behind. We try to recover from the most obvious cases (we read out hw state and restore what we want both on resume and after lid events). But if your bios just randomly suspends your box then there's not much we can do.

Now if you still see fifo underruns when initiating a normal suspend/resume, then that would be more cause for concern. In that case please file a bug like Jani explained.

Thanks, Daniel
Comment 26 Dan Jacobson 2016-02-09 03:37:24 UTC
(In reply to Jarek Hryszko from comment #24)
> setting "HandleSuspendKey" and "HandleLidSwitch" to ignore in
The messages didn't disappear for me. Never had the halting though, fortunately.