Bug 208737

Summary: No graphical output with 5.7.x
Product: Platform Specific/Hardware Reporter: Peter Ganzhorn (peter.ganzhorn)
Component: x86-64 Assignee: drivers_video-other
Status: NEW ---    
Severity: blocking CC: clarkaddison, joonas.lahtinen, nicolas.masse, niklas, peter.ganzhorn, vollmerpeter
Priority: P1    
Hardware: Intel   
OS: Linux   
Kernel Version: 5.7.x - 5.8.x Subsystem:
Regression: Yes Bisected commit-id:
Attachments: dmesg
journalctl -b output
xorg.log
kernel configuration
lspci -vvv
trace
trace with enable_dc=0

Description Peter Ganzhorn 2020-07-29 12:32:48 UTC
Created attachment 290667 [details]
dmesg

With Linux 5.7.x there is absolutely no graphical output.

Since Linux 5.7 my system does not produce any graphical output after the GRUB menu (i.e. GRUB shows loading kernel and initrd messages and the screen just stays the same at this point).
The system does boot, however, and I can ssh into it.
Trying to shut the system down via ssh does not work: the connection closes, but the machine does not power off. A new ssh connection cannot be established afterwards.

The affected system is an Intel Core i7-5775C (Broadwell with Iris Pro graphics and 128 MB L4 cache).
My display is connected via DisplayPort.

I have not spotted anything suspicious in my dmesg, but I've attached it. At first I suspected the i915 driver to be responsible, but even when compiling it as a module (it was built in with 5.6.x, which works fine) the behaviour is the same.

Please let me know what information I can provide to track the issue down.
Comment 1 Peter Ganzhorn 2020-07-29 12:33:31 UTC
Created attachment 290669 [details]
journalctl -b output
Comment 2 Peter Ganzhorn 2020-07-29 12:33:49 UTC
Created attachment 290671 [details]
xorg.log
Comment 3 Peter Ganzhorn 2020-07-29 12:35:00 UTC
Created attachment 290673 [details]
kernel configuration
Comment 4 Peter Ganzhorn 2020-08-03 08:48:56 UTC
Update: With Linux 5.8 the issue is even worse, ssh'ing into the machine is no longer possible. The visual behaviour is the same as before.
Therefore it is not only a graphical problem anymore and I am changing the bug to Platform Specific/Hardware as the reason is unknown to me.
Comment 5 Peter Vollmer 2020-08-30 13:28:31 UTC
Hi,

I seem to have the same issue. To be exact, this has been broken
since mainline Linux v5.7-rc1 (including v5.9-rc2): my laptop with an
Iris Pro (see attached "lspci -vvv") and an i7-4760HQ loses graphical
output during early boot and then hangs completely.
I can boot when i915 is blacklisted and auto-loading is prohibited.
When I then modprobe i915 manually, the graphics lock up
completely and reboot no longer works, but at least SSH keeps working. There
are no errors, warnings or anything suspicious in dmesg.

I tried to bisect this between v5.6 and v5.7-rc1 but had too many build
issues and skipped commits to get a clean candidate commit.

I also tried function graph tracing in i915 with "echo ':mod:i915'
> /sys/kernel/debug/tracing/set_ftrace_filter". I have attached that trace
once with no special parameters and once with "modprobe i915 enable_dc=0".
I looked into it, and it seems suspicious that at some point it degenerates into
repeating the same sequence:

...
 __i915_vm_close [i915]();
 i915_vma_unbind [i915]() {
   __i915_active_wait [i915]() {
     i915_active_acquire_if_busy [i915]();
   }
 }
...

So I'm at a loss for things to try and would appreciate any help
in getting this resolved.
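
The repetition described above can be made visible by counting how often each function shows up in a saved trace. This is a minimal sketch, assuming the trace lines carry the "[i915]" module tag as in the excerpt; top_i915_calls is a hypothetical helper name, not part of any tool.

```shell
#!/bin/sh
# Count how often each i915 function appears in a saved function_graph trace.
# Assumption: function names are followed by " [i915]" as in the excerpt above.
# top_i915_calls is a hypothetical helper name.
top_i915_calls() {
    # $1: path to the saved trace file
    # $2: number of entries to print (defaults to 10)
    grep -oE '[A-Za-z_][A-Za-z0-9_.]* \[i915\]' "$1" \
        | sed 's/ \[i915\]$//' \
        | sort | uniq -c | sort -rn | head -n "${2:-10}"
}
```

Running "top_i915_calls trace.txt 5" on a copy of the attached trace lists the five most frequent functions; a few names with counts far above the rest would point at the loop.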

@Peter Ganzhorn, can you confirm that you can also boot
with i915 disabled?
For example on Arch Linux add the following to /etc/modprobe.d/blacklist.conf

blacklist i915
install i915 /bin/false

and rebuild the initramfs with mkinitcpio.
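
A minimal sketch of that step as a script. It writes the two lines to a separate file instead of editing blacklist.conf, since modprobe reads every *.conf file in the directory; the file name i915-blacklist.conf and the helper name write_i915_blacklist are assumptions made for illustration.

```shell
#!/bin/sh
# Write the two modprobe lines shown above into their own .conf file.
# Assumptions: the file name i915-blacklist.conf and the helper name are
# illustrative only. Writing to /etc/modprobe.d needs root.
write_i915_blacklist() {
    dir="${1:-/etc/modprobe.d}"
    # "blacklist" stops alias-based auto-loading; the "install" line makes
    # an explicit "modprobe i915" fail as well.
    printf '%s\n' 'blacklist i915' 'install i915 /bin/false' \
        > "$dir/i915-blacklist.conf"
}
```

On Arch Linux this would be followed by "write_i915_blacklist" as root and then "mkinitcpio -P" to regenerate the initramfs images. Deleting the file and rebuilding again undoes the change.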
Comment 6 Peter Vollmer 2020-08-30 13:31:28 UTC
Created attachment 292225 [details]
lspci -vvv
Comment 7 Peter Vollmer 2020-08-30 13:32:48 UTC
Created attachment 292227 [details]
trace

echo ':mod:i915'> /sys/kernel/debug/tracing/set_ftrace_filter

with function_graph tracer and no additional modprobe parameters.
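
For anyone trying to reproduce this trace, the setup can be sketched as follows. It assumes debugfs is mounted at the path used above and that the script runs as root; arm_i915_trace is a hypothetical helper name, and the TRACEFS variable exists only so the path can be overridden.

```shell
#!/bin/sh
# Arm the function_graph tracer for i915 before loading the module.
# Assumptions: tracing files live under the debugfs path used in this report;
# arm_i915_trace is a hypothetical helper name. Needs root on a real system.
TRACEFS="${TRACEFS:-/sys/kernel/debug/tracing}"

arm_i915_trace() {
    echo 0 > "$TRACEFS/tracing_on"                   # pause while configuring
    echo ':mod:i915' > "$TRACEFS/set_ftrace_filter"  # restrict to i915 functions
    echo function_graph > "$TRACEFS/current_tracer"  # select the graph tracer
    echo 1 > "$TRACEFS/tracing_on"                   # start recording
}
```

After calling arm_i915_trace, run "modprobe i915" from an SSH session and copy "$TRACEFS/trace" to a file before the machine becomes unusable.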
Comment 8 Peter Vollmer 2020-08-30 13:34:33 UTC
Created attachment 292229 [details]
trace with enable_dc=0

echo ':mod:i915'> /sys/kernel/debug/tracing/set_ftrace_filter

with function_graph tracer and modprobe enable_dc=0
Comment 9 Peter Vollmer 2020-08-30 13:38:01 UTC
@Peter Ganzhorn, if you can confirm the part about
blacklisting i915 then it might make sense to change
the assignee to intel-gfx@lists.freedesktop.org.
I've also added that list to CC.
Comment 10 Peter Ganzhorn 2020-08-30 14:33:41 UTC
I've also tried a few things since I reported this and am able to boot without i915.

Also, intel_iommu=off lets me boot and I can ssh into the machine, but the graphical output still stops at "switching to inteldrmfb from simple".
Since this is clearly an i915 bug (without i915 the symptoms don't show), I reported it directly to the i915 issue tracker here: https://gitlab.freedesktop.org/drm/intel/-/issues/2381

I was hoping to see a little more activity there since this makes my machine virtually unusable after 5.6.x, but there has been no response since I reported it there either.

@Peter Vollmer: Thanks for adding intel-gfx to the CC list and suggesting changing the assignee.
Comment 11 Peter Ganzhorn 2020-08-30 14:38:11 UTC
@Peter Vollmer: Do you know how I can change the assignee of the bug report? I don't see any option to edit it, maybe you can give me a hint how to do it?
Comment 12 Peter Vollmer 2020-08-30 19:34:05 UTC
(In reply to Peter Ganzhorn from comment #11)
> @Peter Vollmer: Do you know how I can change the assignee of the bug report?
> I don't see any option to edit it, maybe you can give me a hint how to do it?

@Peter Ganzhorn: No clue, sorry. I thought you would have more options as the creator of the report. Maybe additional permissions are necessary.

Adding the mailing list to CC did not work either, but I was able to add one of the Intel contacts; maybe he can help. As a next step, I would raise this topic directly on the mailing list intel-gfx@lists.freedesktop.org.
Comment 13 Peter Ganzhorn 2020-09-04 07:31:59 UTC
I've finally gotten a response to my bug report at https://gitlab.freedesktop.org/drm/intel/-/issues/2381 from Ville Syrjälä from Intel.
It seems this definitely is an i915 bug and is already known at Intel. Ville provided a patch with a small hack which lets my machine boot with graphical output on 5.8 again.
As far as I can tell from his information, the patch does not fix the actual bug.
If Ville can help, maybe this bug report should be closed in favor of the one on the Intel freedesktop.org tracker?
I don't feel comfortable having two bug reports, but since it took over a month to get any reaction, opening both was probably necessary.

@Peter Vollmer: Can you confirm the hack from Ville Syrjälä also fixes the issue for you?
Comment 14 Peter Vollmer 2020-09-04 18:25:34 UTC
@Peter Ganzhorn: Yes, this workaround patch works for me too. I tested it with v5.8 and v5.9-rc3; in both cases the problem occurs without the patch and is gone with it. I would keep this bug open too, because this is the official kernel Bugzilla and I have referenced it in my mail to the intel-gfx@lists.freedesktop.org mailing list.