Bug 199115

Summary: [gma500] BUG: unable to handle kernel NULL pointer dereference at 0000000000000081
Product: Drivers
Reporter: Dominik Mierzejewski (dominik)
Component: Video(DRI - non Intel)
Assignee: drivers_video-dri
Status: RESOLVED CODE_FIX
Severity: normal
CC: jwrdegoede
Priority: P1
Hardware: x86-64
OS: Linux
See Also: https://bugs.freedesktop.org/show_bug.cgi?id=106470
Kernel Version: 4.15.7-300.fc27.x86_64
Subsystem:
Regression: No
Bisected commit-id:
Attachments: journalctl -k --no-hostname --no-pager output

Description Dominik Mierzejewski 2018-03-14 13:56:36 UTC
Created attachment 274725
journalctl -k --no-hostname --no-pager output

I'm consistently getting this kernel NULL pointer dereference on a Thecus N5550 NAS box (Intel Atom D2550 CPU) on every boot unless the gma500_gfx module is blacklisted:

Mar 14 13:44:02 kernel: BUG: unable to handle kernel NULL pointer dereference at 0000000000000081
Mar 14 13:44:02 kernel: IP: drm_fb_helper_is_bound.isra.16+0x5/0xa0 [drm_kms_helper]
Mar 14 13:44:03 kernel: PGD 0 P4D 0 
Mar 14 13:44:03 kernel: Oops: 0000 [#1] SMP NOPTI
Mar 14 13:44:03 kernel: Modules linked in: it87 hwmon_vid vfat fat snd_hda_codec_realtek snd_hda_codec_generic gma500_gfx snd_hda_codec_hdmi iTCO_wdt iTCO_vendor_support snd
Mar 14 13:44:03 kernel: CPU: 0 PID: 277 Comm: kworker/0:2 Not tainted 4.15.7-300.fc27.x86_64 #1
Mar 14 13:44:03 kernel: Hardware name: Intel Corporation Milstead Platform/Granite Well, BIOS CDV_T30 X64 09/17/2012
Mar 14 13:44:03 kernel: Workqueue: events output_poll_execute [drm_kms_helper]
Mar 14 13:44:03 kernel: RIP: 0010:drm_fb_helper_is_bound.isra.16+0x5/0xa0 [drm_kms_helper]
Mar 14 13:44:03 kernel: RSP: 0018:ffffb118005ebe28 EFLAGS: 00010286
Mar 14 13:44:03 kernel: RAX: 0000000000000000 RBX: ffff9da9b656c800 RCX: 000000000000e800
Mar 14 13:44:03 kernel: RDX: ffff9da9b632da00 RSI: 0000000000000031 RDI: ffff9da9b656c800
Mar 14 13:44:03 kernel: RBP: ffff9da9b656c8d0 R08: 00000000000251a0 R09: 0000000000000000
Mar 14 13:44:03 kernel: R10: fffff2ca40f61c00 R11: 0000000000000000 R12: 0000000000000001
Mar 14 13:44:03 kernel: R13: ffff9da9b71e7800 R14: ffff9da9b71e79f8 R15: 0000000000000000
Mar 14 13:44:03 kernel: FS:  0000000000000000(0000) GS:ffff9da9bb400000(0000) knlGS:0000000000000000
Mar 14 13:44:03 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Mar 14 13:44:03 kernel: CR2: 0000000000000081 CR3: 000000007b2a0000 CR4: 00000000000006f0
Mar 14 13:44:03 kernel: Call Trace:
Mar 14 13:44:03 kernel:  drm_fb_helper_hotplug_event.part.29+0x34/0xb0 [drm_kms_helper]
Mar 14 13:44:03 kernel:  output_poll_execute+0x185/0x1b0 [drm_kms_helper]
Mar 14 13:44:03 kernel:  process_one_work+0x175/0x390
Mar 14 13:44:03 kernel:  worker_thread+0x2e/0x380
Mar 14 13:44:03 kernel:  ? process_one_work+0x390/0x390
Mar 14 13:44:03 kernel:  kthread+0x113/0x130
Mar 14 13:44:03 kernel:  ? kthread_create_worker_on_cpu+0x70/0x70
Mar 14 13:44:03 kernel:  ret_from_fork+0x1f/0x40
Mar 14 13:44:03 kernel: Code: 4c d0 f8 8b 45 20 39 c3 7c e6 83 e8 01 4c 89 ef 31 db 89 45 20 e8 6c b0 ea c4 eb 9f 83 c3 02 eb bf 0f 1f 44 00 00 0f 1f 44 00 00 <48> 8b 56 50 
Mar 14 13:44:03 kernel: RIP: drm_fb_helper_is_bound.isra.16+0x5/0xa0 [drm_kms_helper] RSP: ffffb118005ebe28
Mar 14 13:44:03 kernel: CR2: 0000000000000081
Mar 14 13:44:03 kernel: ---[ end trace 0bc03676f9e43f5d ]---
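
The faulting address in the trace above can be cross-checked against the register dump. The following is a minimal sketch, not a root-cause analysis: it decodes only the four bytes marked `<48> 8b 56 50` in the Code: line, assuming they form a single `MOV r64, [base + disp8]` instruction with no SIB byte, and adds the displacement to the RSI value from the dump. The helper name `decode_mov_disp8` is made up for this illustration.

```python
# Decode the faulting instruction bytes from the "Code:" line of the oops
# and recompute the address the CPU tried to read.

def decode_mov_disp8(code: bytes):
    """Decode 'REX.W 8B /r' with a [base + disp8] operand (no SIB byte).

    Returns (destination register, base register, signed displacement).
    Hypothetical helper: it handles only this one encoding.
    """
    rex, opcode, modrm, disp = code
    assert rex == 0x48 and opcode == 0x8B, "expected REX.W + MOV r64, r/m64"
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    assert mod == 0b01 and rm != 0b100, "expected [base + disp8] without SIB"
    regs = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]
    # disp8 is sign-extended to 64 bits by the CPU
    disp_signed = disp - 256 if disp >= 0x80 else disp
    return regs[reg], regs[rm], disp_signed

FAULTING_BYTES = bytes.fromhex("488b5650")  # "<48> 8b 56 50" from the Code: line
RSI = 0x31                                   # RSI from the register dump
CR2 = 0x81                                   # faulting address reported in CR2

dest, base, disp = decode_mov_disp8(FAULTING_BYTES)
fault_address = RSI + disp
print(f"mov {dest}, [{base}+{disp:#x}] -> {fault_address:#x}")
```

Under these assumptions the instruction reads from RSI + 0x50, and 0x31 + 0x50 matches the CR2 value of 0x81. That suggests the function was handed a bogus, near-NULL pointer (0x31) rather than an exact NULL, but the trace alone does not show where that value came from.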

Full dmesg attached.

Comment 1 Dominik Mierzejewski 2018-06-26 17:20:50 UTC
Looks like this is no longer occurring in 4.17.2 (Fedora 28). I haven't checked whether the actual video output works after unblacklisting the gma500_gfx driver, but at least the NULL pointer dereference is gone. Sadly, I don't have time to bisect to find which commit fixed it.

Comment 2 Hans de Goede 2019-01-22 09:39:47 UTC
As discussed in https://bugs.freedesktop.org/show_bug.cgi?id=106470, graphics are now working on the Thecus N5550 NAS, so this can be closed.

The only remaining issue is that the gma500 driver thinks there is an LVDS panel; this can be worked around by passing "video=LVDS-1:d" on the kernel command line.
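
To confirm that the workaround is active on a running system, the kernel command line can be inspected. This is a minimal sketch: the function names `connector_disabled` and `read_cmdline` are made up for this illustration, and the check is a plain exact match on the token "video=LVDS-1:d", so it does not understand other forms of the video= option.

```python
def connector_disabled(cmdline: str, connector: str = "LVDS-1") -> bool:
    """Return True if the command line contains the token video=<connector>:d.

    Hypothetical helper: exact token match only.
    """
    wanted = f"video={connector}:d"
    return wanted in cmdline.split()


def read_cmdline(path: str = "/proc/cmdline") -> str:
    """Read the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read()


# Example with a made-up command line; on a real system use read_cmdline().
sample = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 video=LVDS-1:d quiet"
print(connector_disabled(sample))
```

On the NAS itself, `connector_disabled(read_cmdline())` would report whether the option reached the kernel.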

Fixing the LVDS issue properly (with a DMI-based blacklist) is being tracked here: https://bugzilla.redhat.com/show_bug.cgi?id=1665766