Bug 197103

Summary: BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 / IP: nouveau_fbcon_set_suspend_work+0x45/0xf0 [nouveau]
Product: Drivers
Reporter: Igor Raits (igor.raits)
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri
Status: RESOLVED CODE_FIX
Severity: normal
CC: pierre.morrow, regressions
Priority: P1
Hardware: All
OS: Linux
See Also: https://bugs.freedesktop.org/show_bug.cgi?id=102381
Kernel Version: 4.14.0-0.rc2.git4.1.fc28.x86_64
Subsystem:
Regression: Yes
Bisected commit-id:
Attachments: journal

Description Igor Raits 2017-10-02 07:27:42 UTC
Created attachment 258693
journal

This is a Lenovo ThinkPad W541. 4.13.0 works fine; as far as I know, none of the 4.14 builds work.

BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
IP: nouveau_fbcon_set_suspend_work+0x45/0xf0 [nouveau]
PGD 0 P4D 0
Oops: 0000 [#1] SMP
Modules linked in: pcc_cpufreq(-) intel_cstate(+) intel_uncore intel_rapl_perf iwlmvm(+) mac80211 joydev(+) wmi_bmof i2c_i801 iwlwifi btusb btrtl btbcm btintel bluetooth lpc_ich uvcvideo cfg80211 videobuf2_vmalloc snd_hda_codec_realtek videobuf2_memops videobuf2_v4l2 snd_hda_codec_hdmi(+) videobuf2_core snd_hda_codec_generic videodev media snd_hda_intel snd_hda_codec ecdh_generic snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm mei_me mei snd_timer ie31200_edac shpchp thinkpad_acpi snd soundcore rfkill tpm_tis fjes(-) tpm_tis_core tpm acpi_cpufreq(-) xfs libcrc32c dm_crypt nouveau i915 crct10dif_pclmul crc32_pclmul crc32c_intel mxm_wmi ghash_clmulni_intel ttm e1000e i2c_algo_bit serio_raw drm_kms_helper sdhci_pci sdhci mmc_core drm ptp pps_core wmi video
CPU: 4 PID: 35 Comm: kworker/4:0 Not tainted 4.14.0-0.rc2.git4.1.fc28.x86_64 #1
Hardware name: LENOVO 20EGS0R60R/20EGS0R60R, BIOS GNET81WW (2.29 ) 11/24/2016
Workqueue: events nouveau_fbcon_set_suspend_work [nouveau]
task: ffff9c8b6a50b4c0 task.stack: ffffbb5a81a28000
RIP: 0010:nouveau_fbcon_set_suspend_work+0x45/0xf0 [nouveau]
RSP: 0018:ffffbb5a81a2be28 EFLAGS: 00010286
RAX: ffff9c8b6212f000 RBX: ffff9c8b63a157f8 RCX: 0000000000000000
RDX: ffff9c8b63a14000 RSI: 0000000000000001 RDI: 0000000000000000
RBP: ffffbb5a81a2be30 R08: ffffbb5a81a2bf40 R09: ffffffff99e0b400
R10: 0000000000000001 R11: 0000000000000000 R12: ffff9c8b6d5dafc0
R13: ffff9c8b6d5df800 R14: ffff9c8b63a157f8 R15: 0000000000000000
FS:  0000000000000000(0000) GS:ffff9c8b6d400000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 0000000467f4e005 CR4: 00000000001606e0
Call Trace:
 process_one_work+0x26b/0x6c0
 worker_thread+0x35/0x3b0
 kthread+0x171/0x190
 ? process_one_work+0x6c0/0x6c0
 ? kthread_create_on_node+0x70/0x70
 ret_from_fork+0x2a/0x40
Code: 8b bb f0 f9 ff ff be 01 00 00 00 e8 46 d6 c7 ff 48 8b 83 f0 e9 ff ff 48 8b 50 28 48 8b 82 e8 11 00 00 48 85 c0 74 1c 48 8b 48 38 <8b> 49 08 89 88 58 02 00 00 48 8b 82 e8 11 00 00 48 8b 40 38 83
RIP: nouveau_fbcon_set_suspend_work+0x45/0xf0 [nouveau] RSP: ffffbb5a81a2be28
CR2: 0000000000000008
---[ end trace 01aa0aae272659a2 ]---
BUG: sleeping function called from invalid context at ./include/linux/percpu-rwsem.h:33
in_atomic(): 0, irqs_disabled(): 1, pid: 35, name: kworker/4:0
INFO: lockdep is turned off.
irq event stamp: 9496
hardirqs last  enabled at (9495): [<ffffffff999d5726>] _raw_spin_unlock_irqrestore+0x36/0x60
hardirqs last disabled at (9496): [<ffffffff999d770c>] error_entry+0x7c/0xd0
softirqs last  enabled at (9428): [<ffffffff9913d4cc>] srcu_invoke_callbacks+0xbc/0x190
softirqs last disabled at (9424): [<ffffffff9913d4cc>] srcu_invoke_callbacks+0xbc/0x190
CPU: 4 PID: 35 Comm: kworker/4:0 Tainted: G      D         4.14.0-0.rc2.git4.1.fc28.x86_64 #1
Hardware name: LENOVO 20EGS0R60R/20EGS0R60R, BIOS GNET81WW (2.29 ) 11/24/2016
Workqueue: events nouveau_fbcon_set_suspend_work [nouveau]
Call Trace:
 dump_stack+0x8e/0xd6
 ___might_sleep+0x164/0x250
 __might_sleep+0x4a/0x80
 exit_signals+0x33/0x240
 do_exit+0xb9/0xda0
 ? kthread+0x171/0x190
 rewind_stack_do_exit+0x17/0x20
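
A minimal sketch of how the first oops above can be read, using only values copied from the log. The RIP line gives the symbol, the offset into it, and the function size. The three bytes starting at the <8b> marker in the Code: line decode to a 32-bit load through RCX with an 8-byte displacement, and RCX is 0 in the register dump, which gives the fault address reported in CR2. The bytes just before the marker (48 8b 48 38) load RCX from offset 0x38 of the object in RAX, so a pointer stored there was NULL. The helper names are hypothetical and the decoder handles only this one instruction encoding; nothing here identifies which C field was involved.

```python
import re

# Values copied verbatim from the oops in the description.
RIP_LINE = "RIP: 0010:nouveau_fbcon_set_suspend_work+0x45/0xf0 [nouveau]"
CODE_AT_FAULT = bytes.fromhex("8b4908")  # bytes starting at the <8b> marker
RCX = 0x0000000000000000
CR2 = 0x0000000000000008

# Register names selected by the 3-bit ModRM fields (no REX extension bits).
REGS32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]
REGS64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]


def parse_rip(line):
    """Split 'symbol+0xOFFSET/0xSIZE [module]' into its parts (hypothetical helper)."""
    m = re.search(r"(\w+)\+0x([0-9a-f]+)/0x([0-9a-f]+)(?: \[(\w+)\])?", line)
    if m is None:
        raise ValueError("no symbol+offset/size found")
    symbol, offset, size, module = m.groups()
    return symbol, int(offset, 16), int(size, 16), module


def decode_mov_load_disp8(code):
    """Decode 'mov r32, [r64 + disp8]' (opcode 0x8b, ModRM mod=01, no SIB).

    Any other encoding raises ValueError; this is not a general disassembler.
    """
    opcode, modrm, disp = code[0], code[1], code[2]
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x8B or mod != 0b01 or rm == 0b100:
        raise ValueError("not a plain mov r32, [reg + disp8]")
    if disp >= 0x80:  # disp8 is sign-extended
        disp -= 0x100
    return REGS32[reg], REGS64[rm], disp


symbol, offset, size, module = parse_rip(RIP_LINE)
dest, base, disp = decode_mov_load_disp8(CODE_AT_FAULT)
fault_address = RCX + disp  # the base register is rcx, which the dump shows as 0

print(f"{module}: {symbol} faulted {offset:#x} bytes into a {size:#x}-byte function")
print(f"mov {dest}, [{base}{disp:+#x}] touches address {fault_address:#x}")
```

The computed address equals the CR2 value, which is consistent with a read of a member at offset 8 through a NULL pointer.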
Comment 1 Igor Raits 2017-10-02 07:30:27 UTC
Someone submitted the same bug on the FDO Bugzilla, but I think this is more of a kernel bug.

https://bugs.freedesktop.org/show_bug.cgi?id=102381
Comment 2 The Linux kernel's regression tracker (Thorsten Leemhuis) 2017-10-08 10:57:00 UTC
Did you try to bring this to the nouveau developer list? Not sure if the driver developers look here. Did you consider bisecting this?

(BTW: Thx for pointing me to this bug, I added it to this week's regression report)
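
For readers unfamiliar with bisecting, the idea can be sketched as a binary search between a known-good and a known-bad kernel. The sketch below is an illustration only: it assumes a strictly linear list of commits and a hypothetical is_bad test, whereas git bisect works on the real (non-linear) history and each test means building and booting a kernel. The commit names and the position of the first bad commit are made up.

```python
def bisect_first_bad(commits, is_bad):
    """Return (first bad commit, number of commits tested).

    `commits` is ordered oldest to newest; commits[0] is assumed to be
    known good and commits[-1] known bad, so neither is tested again.
    """
    good, bad = 0, len(commits) - 1
    tested = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tested += 1
        if is_bad(commits[mid]):
            bad = mid   # the regression is at mid or earlier
        else:
            good = mid  # the regression is after mid
    return commits[bad], tested


# Hypothetical history: 1001 commits, with the regression introduced at c700.
commits = [f"c{i}" for i in range(1001)]
first_bad, tested = bisect_first_bad(commits, lambda c: int(c[1:]) >= 700)
print(first_bad, tested)
```

Because each test halves the remaining range, the number of kernels to build grows only logarithmically with the number of commits between the good and bad versions.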
Comment 3 Pierre Moreau 2017-10-08 13:21:16 UTC
Nouveau bugs, regardless of whether they lie in the kernel, Mesa or the Nouveau DDX, should be reported at bugs.freedesktop.org (see https://nouveau.freedesktop.org/wiki/Bugs/). You might want to add yourself to the freedesktop bug, as updates are more likely to happen in that bug report than this one.
Updates to this bug report are sent to the dri-devel mailing list, but not every Nouveau developer follows it.

I’ll give 4.14-rc3 a try on my laptop (Optimus + GK107 as well).
Comment 4 Igor Raits 2017-10-17 16:14:29 UTC
(In reply to Thorsten Leemhuis from comment #2)
> Did you try to bring this to the nouveau developer list? Not sure if the
> driver developers look here. Did you consider bisecting this?
Not really, I am kinda busy with $dayjob, so I just tried to report something useful upstream.

If I get time, I will try to do this.
Comment 5 The Linux kernel's regression tracker (Thorsten Leemhuis) 2017-10-29 12:29:48 UTC
Is this still happening? There was a patch recently to fix oops without fbdev emulation: https://git.kernel.org/torvalds/c/4813766325374af6ed0b66879ba6a0bbb05c83b6
Comment 6 Igor Raits 2017-10-30 06:12:37 UTC
(In reply to Thorsten Leemhuis from comment #5)
> Is this still happening? There was a patch recently to fix oops without
> fbdev emulation:
> https://git.kernel.org/torvalds/c/4813766325374af6ed0b66879ba6a0bbb05c83b6

Doesn't seem to be happening anymore with Linux version 4.14.0-0.rc6.git3.2.fc28.x86_64 (mockbuild@buildvm-02.phx2.fedoraproject.org) (gcc version 7.2.1 20170829 (Red Hat 7.2.1-1) (GCC)) #1 SMP Thu Oct 26 21:53:34 UTC 2017