Created attachment 199651 [details] PC Specs (lshw) I recently upgraded from Linux 4.2.3 to 4.3.3 on my machine (specs in the attachment). With the newer kernel, the machine becomes completely unresponsive after a while, under varied conditions and workloads. Only a hard reset works. If the bug does not trigger on its own during normal use, it is easily reproducible by launching 'powertop --auto-tune'. I have no problems whatsoever on 4.2.3; it's very stable and delivers great performance. I encountered this problem on both Fedora 23 and Arch Linux using their respective stock 4.3.x kernels, which is why I decided to file it here against mainline. Regarding the discrete GPU, I'm using nouveau on Fedora and Nvidia's proprietary drivers plus bbswitch on Arch, and I keep the GPU powered down in both cases, so I don't think it has any influence on the problem. I'm also attaching two dmesg logs and a complete journalctl log, all taken on Fedora, which seem to stop when the complete lockup happens. In addition, there seems to be nothing unusual in /sys/class/drm/card0/error (no error state collected).
Created attachment 199661 [details] Dmesg (1)
Created attachment 199671 [details] Dmesg (2)
Created attachment 199681 [details] Journalctl
Created attachment 200091 [details] Turbostat I'm attaching a turbostat log. It was captured up to the moment of the freeze. I can't see any differences between this one and the one from 4.2.3, but it may be useful.
Created attachment 200101 [details] Journalctl (cleared) Another journal, but this time I've used sed to strip out all the WRITE and READ block reports; it may be more readable this way.
The problem does not seem to be tied to a specific workload; it appears almost random. I can't tell where the problem lies. If anyone needs more testing or other kinds of logs, I'll be more than happy to help.
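For anyone who wants to repeat the cleanup on their own journal, here is a minimal Python sketch of the same kind of filtering. It is not the sed command that was actually used. It assumes the noisy lines contain the text "READ block" or "WRITE block"; the sample lines are made up for illustration and are not taken from the attached logs.

```python
import re

# Assumption: block I/O reports contain "READ block" or "WRITE block".
# Adjust the pattern if the lines in your journal look different.
BLOCK_REPORT = re.compile(r"\b(?:READ|WRITE) block\b")


def strip_block_reports(lines):
    """Return only the lines that are not block I/O reports."""
    return [line for line in lines if not BLOCK_REPORT.search(line)]


# Hypothetical sample lines, invented for this example.
sample = [
    "example(1): WRITE block 100 on sdX (8 sectors)",
    "example(2): READ block 200 on sdX (16 sectors)",
    "kernel: unrelated message",
]

for line in strip_block_reports(sample):
    print(line)
```

To filter a real file, read it with `open(path).read().splitlines()` and pass the result to `strip_block_reports`.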
Tested, and I can confirm this behavior on 4.4 mainline on Fedora 23 as well!
UPDATE: Moved to DRI/Intel and lowered the importance to high, because the problem seems connected to the flag: i915.enable_fbc=1
It's strange that it works flawlessly on 4.2.3 and freezes completely on 4.3.3+. I can still use the PC without the flag, though.
Output of "tail /sys/kernel/debug/dri/0/i915_capabilities":
has_fbc: yes
has_pipe_cxsr: no
has_hotplug: yes
cursor_needs_physical: no
has_overlay: no
overlay_needs_physical: no
supports_tv: no
has_llc: yes
has_ddi: yes
has_fpga_dbg: yes
So frame buffer compression is supported by the hardware. /sys/class/drm/card0/error still shows no error after a lockup. I'm leaving the importance at "High" because FBC is an important power-saving feature on this laptop. Without it, turbostat reports a PC3 residency of only about 20-25% (and that is its maximum value); with FBC, the package stays in PC3 85-90% of the time (sadly it can't go deeper than PC3). That translates to about 3-4 W higher power consumption at idle or under light workloads. It does not seem to affect heavy workloads. If I can do more to help with debugging, I will gladly do it!
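As a rough illustration of how the capability check above could be scripted, here is a small Python sketch that parses "key: value" lines like the ones quoted from i915_capabilities. The function name `parse_capabilities` is hypothetical, and the sample text simply repeats the values posted in this comment; reading the real debugfs file normally requires root and a mounted debugfs.

```python
def parse_capabilities(text):
    """Parse 'key: value' lines into a dict, skipping lines without a colon."""
    caps = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            caps[key.strip()] = value.strip()
    return caps


# Values copied from the output quoted in this comment.
sample = """has_fbc: yes
has_pipe_cxsr: no
has_hotplug: yes
cursor_needs_physical: no
has_overlay: no
overlay_needs_physical: no
supports_tv: no
has_llc: yes
has_ddi: yes
has_fpga_dbg: yes"""

caps = parse_capabilities(sample)
print("FBC supported by hardware:", caps.get("has_fbc") == "yes")
```

Note that `has_fbc: yes` only says the hardware has the feature; it does not say whether the driver enables it by default.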
(In reply to Luca Di Maio from comment #8)
> UPDATE:
> Moved to DRI/Intel and lowered to high importance because it seems connected
> to the flag:
>
> i915.enable_fbc=1

Specifying that module parameter should taint the kernel. If FBC is not enabled by default, it's not supported by the driver for your hardware.
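One way to check whether a running kernel has been tainted this way is to look at the taint bitmask in /proc/sys/kernel/tainted. The Python sketch below assumes that unsafe module parameters set the "user request" taint flag (bit 6, shown as 'U'); that mapping is an assumption here and should be verified against the kernel documentation for the version in use. The helper names are hypothetical.

```python
# Assumption: setting an unsafe module parameter sets the "user request"
# taint flag, which is bit 6 of the mask ('U'). Verify for your kernel.
TAINT_USER_BIT = 6


def user_tainted(mask):
    """Return True if the 'user request' bit is set in a taint bitmask."""
    return bool(mask & (1 << TAINT_USER_BIT))


def read_taint_mask(path="/proc/sys/kernel/tainted"):
    """Read the current taint bitmask as an integer (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())


# Worked examples on plain integers, so no /proc access is needed.
print(user_tainted(0))   # untainted kernel
print(user_tainted(64))  # only bit 6 set
```

On a live system, `user_tainted(read_taint_mask())` combines the two helpers.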
Yes, I've now disabled it and the freezes seem to be gone, but it may still be worth debugging, because from 3.18 up to 4.2.3 it always worked perfectly! If I can help with debugging, I'll be happy to. Thanks for the answer!
You could try drm-intel-nightly branch of http://cgit.freedesktop.org/drm-intel which *may* work better. We track bugs at the freedesktop.org bugzilla; some of the FBC bugs there may be helpful. https://bugs.freedesktop.org/buglist.cgi?bug_status=NEW&bug_status=ASSIGNED&bug_status=REOPENED&bug_status=NEEDINFO&component=DRM%2FIntel&f1=cf_i915_features&f2=short_desc&j_top=OR&list_id=566944&o1=substring&o2=anywords&product=DRI&query_format=advanced&v1=Display%2FFBC&v2=FBC