Bug 82711
Summary: | After update to kernel soft lockup (oops) and incomplete boot and shutdown fail | | |
---|---|---|---|
Product: | Drivers | Reporter: | Mike Cloaked (mike.cloaked) |
Component: | Video(DRI - non Intel) | Assignee: | drivers_video-dri |
Status: | NEW --- | | |
Severity: | high | CC: | abandonedaccountubdprczb8hs, Actualize.in.Material+bugzillakernel, crouse.hackz, zazdxscf+bugzilla.kernel.org |
Priority: | P1 | | |
Hardware: | All | | |
OS: | Linux | | |
Kernel Version: | 3.16.1-1-ARCH | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Attachments: |
Systemd journal log after rebooting from crashed system
Journal log after cleaning out systemd journal and reboot
dmesg after selecting reboot from kdm
Systemd journal after failed reboot from kdm greeter
dmesg log after incomplete boot as per comment #6
systemd journal log after incomplete boot as per comment #6
systemd journal showing kernel oops with kernel 3.16.4 |
Created attachment 147031 [details]
Journal log after cleaning out systemd journal and reboot
After a normal-looking boot and a normal KDE session, I logged out of KDE and used the kdm greeter screen to request a reboot. The system appeared to exit X but then hung with a VT visible, without rebooting. At this point I captured the dmesg and journal files, which I will now attach.
Created attachment 147111 [details]
dmesg after selecting reboot from kdm
After selecting reboot from the kdm greeter, the system failed to reboot but left a working VT from which a root login was possible.
Created attachment 147121 [details]
Systemd journal after failed reboot from kdm greeter
After logging out from kde and selecting reboot, the system exited X but hung with a VT screen. This is the journal log after logging in as root to the VT at that point.
After capturing the log files in comments #3 and #4, the system was commanded to power off using "systemctl poweroff". The shutdown sequence started and was interrupted by a countdown timer displayed on the VT:
A stop job is running for K Display Manager (xxx xxx / 1min 30s)
Once the 90 seconds had elapsed, the system did complete the shutdown and power off.
After the system was powered off as in comment #5, the laptop was booted again to the kdm greeter. On changing to a text VT and attempting to log in as root, the screen filled with log messages. I attempted to capture the dmesg and journal log files as the system became progressively slower to respond to the keyboard and the fans sped up to a high level. I was able to capture the log files before the system became completely unresponsive and it was necessary to hold the power button to force a hard poweroff. The log files were then rsynced to a different computer and I will attach these two logs next.
Created attachment 147131 [details]
dmesg log after incomplete boot as per comment #6
Created attachment 147141 [details]
systemd journal log after incomplete boot as per comment #6
This may or may not apply, but maybe you could check whether your video driver (nouveau?) was (re)compiled for the new kernel? I got a similar NULL pointer dereference when I recompiled the kernel and my fglrx (not nouveau, in my case) driver failed to compile, so the previous module was kept.
The nouveau-dri package on my system was actually updated after I sent this report (three days after my last comment), and was therefore presumably compiled for the new kernel.
[2014-08-21 20:40] [PACMAN] upgraded nouveau-dri (10.2.5-1 -> 10.2.6-1)
The nouveau package was updated a little earlier, prior to the date of the original bug report.
[2014-07-29 12:37] [PACMAN] upgraded xf86-video-nouveau (1.0.10-2 -> 1.0.10-3)
However, since the nouveau-dri package was updated I have not had a repeat of the incomplete boot problem.
I have been continuing to test, waiting to see whether the problem recurs. I will test further and report back if there is any recurrence.
Created attachment 152771 [details]
systemd journal showing kernel oops with kernel 3.16.4
With the latest kernel linux 3.16.4-1 and the packages as follows in arch linux:
xf86-video-nouveau 1.0.11-2
xf86-video-intel 2.99.916-3
mesa-dri 10.3.0-3
I appear to boot to the KDE greeter and log in without problems, but shutdown from KDE gives a delay of a minute or two before the screen shows the kernel oops and a message that a reboot is needed. I captured the systemd journal log at this point, before attempting the reboot, which I did by logging back in as root and using systemctl. Reboot or shutdown then gives a 1min 30sec delay due to a "stop job" running before the system will shut down or reboot.
The only way I can get the laptop to behave sensibly on shutdown is to blacklist the nouveau module at boot using modprobe.blacklist=nouveau on the kernel line, or to add to the file /etc/modprobe.d/blacklist.conf a single line with "install nouveau /bin/false". Presumably this indicates there is a bug in the nouveau module for my graphics card. If there is a way to get some diagnostics to help pin this down so that a code fix can be found, please let me know what to do to generate suitable log files.
The graphics cards on this machine are:
00:02.0 VGA compatible controller: Intel Corporation 4th Gen Core Processor Integrated Graphics Controller (rev 06)
01:00.0 VGA compatible controller: NVIDIA Corporation GK107M [GeForce GT 750M] (rev a1)
Possibly related to Bug #70354 and/or #85791?
Bug #85791 https://bugzilla.kernel.org/show_bug.cgi?id=85791
Bug #70354 https://bugs.freedesktop.org/show_bug.cgi?id=70354
This bug is affecting me too. However, another option is to use only the integrated (Intel) card and disable the Nvidia card in the BIOS. That currently works for me. Hope it helps!
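For anyone checking which of the two nouveau workarounds mentioned above is active on a given machine, here is a minimal Python sketch. It only recognises the two forms quoted in this report (modprobe.blacklist=nouveau on the kernel line, and an "install nouveau /bin/false" line in a modprobe.d file). The function name and the returned labels are made up for illustration, and the real modprobe configuration parser handles more cases than this does.

```python
def nouveau_disabled(cmdline, modprobe_conf, module="nouveau"):
    """Return which of the two workarounds from this report are present.

    cmdline       -- text of the kernel command line (e.g. /proc/cmdline)
    modprobe_conf -- text of a modprobe.d file (e.g. blacklist.conf)
    The labels "cmdline" and "modprobe.d" are illustrative only.
    """
    reasons = []

    # Form 1: modprobe.blacklist=mod1,mod2 on the kernel line.
    for token in cmdline.split():
        if token.startswith("modprobe.blacklist="):
            if module in token.split("=", 1)[1].split(","):
                reasons.append("cmdline")
                break

    # Form 2: a line "install <module> /bin/false" in the config file.
    for raw in modprobe_conf.splitlines():
        fields = raw.split("#", 1)[0].split()  # drop comments, then tokenise
        if fields[:3] == ["install", module, "/bin/false"]:
            reasons.append("modprobe.d")
            break

    return reasons


if __name__ == "__main__":
    with open("/proc/cmdline") as f:
        cmdline = f.read()
    try:
        with open("/etc/modprobe.d/blacklist.conf") as f:
            conf = f.read()
    except FileNotFoundError:
        conf = ""
    print(nouveau_disabled(cmdline, conf))
```

An empty list means neither workaround was found in the inputs given, which does not by itself prove that the module will load.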
Created attachment 147021 [details]
Systemd journal log after rebooting from crashed system
After updating to kernel 3.16.1-1-ARCH, the Lenovo Y510p with hybrid Intel/Nvidia graphics usually fails to boot. Occasionally boot does complete, and when it did, partial logs were available from the systemd journal, despite my having to pull the power to shut down the system, which corrupts the journal files. A CPU soft lockup is one part of the log, and it seems that the graphics card also fails to initialise properly:
Aug 18 10:10:14 lenovo2 kernel: nouveau E[ PGRAPH][0000:01:00.0] init failed, -16
Aug 18 10:10:14 lenovo2 kernel: nouveau E[ PGRAPH][0000:01:00.0][0x0300e417][ffff880263bcfc00] engine failed, -16
Aug 18 10:10:14 lenovo2 kernel: nouveau E[ PGRAPH][0000:01:00.0][0x0000a097][ffff880261d45700] parent failed, -16
Aug 18 10:10:14 lenovo2 kernel: nouveau E[Xorg.bin[439]] 0xdddddddd:0xcccc0000 init failed with -16
Aug 18 10:10:14 lenovo2 kernel: nouveau E[Xorg.bin[439]] 0xffffffff:0xdddddddd init failed with -16
Aug 18 10:10:14 lenovo2 kernel: nouveau E[Xorg.bin[439]] 0xffffffff:0xffffffff init failed with -16
Graphics cards are:
00:02.0 VGA compatible controller: Intel Corporation 4th Gen Core Processor Integrated Graphics Controller (rev 06)
01:00.0 VGA compatible controller: NVIDIA Corporation GK107M [GeForce GT 750M] (rev a1)
I was not sure which component to select for this report, so I chose non-Intel DRI. If necessary that can be changed. Either way, when the bug hits during boot the VTs fill with log output. It is not clear that the logs contain all of the diagnostic information, but the attached log file is all that I was able to capture.
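The nouveau errors quoted in this report all end in -16. Assuming that value is a negated Linux errno, which is the usual kernel convention, it corresponds to EBUSY. The Python sketch below pulls such lines out of a saved journal or dmesg dump and names the error code. The regular expression is written only against the line shapes quoted here, the function name is illustrative, and other nouveau message formats may not match.

```python
import errno
import re

# Matches lines shaped like the ones quoted in this report, for example:
#   ... kernel: nouveau E[ PGRAPH][0000:01:00.0] init failed, -16
#   ... kernel: nouveau E[Xorg.bin[439]] 0xffffffff:0xffffffff init failed with -16
NOUVEAU_ERROR = re.compile(
    r"kernel: nouveau E\[\s*(?P<unit>[^\[\]]+(?:\[\d+\])?)\]"
    r".*?failed(?:,| with) (?P<code>-\d+)"
)


def summarize_nouveau_errors(lines):
    """Return (unit, code, errno_name) for each matching nouveau error line."""
    results = []
    for line in lines:
        m = NOUVEAU_ERROR.search(line)
        if m is None:
            continue
        code = int(m.group("code"))
        # Assumption: the code is a negated errno, so look up its absolute value.
        name = errno.errorcode.get(-code, "unknown")
        results.append((m.group("unit").strip(), code, name))
    return results


if __name__ == "__main__":
    import sys

    # Usage: python script.py saved-journal.txt
    with open(sys.argv[1], errors="replace") as f:
        for unit, code, name in summarize_nouveau_errors(f):
            print(f"{unit}: {code} ({name})")
```

Grouping by unit makes it easier to see whether the failures start in the PGRAPH engine or only show up in the Xorg client, which may help when comparing logs from different kernel versions.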