Bug 95621

Summary: BUG in drm_framebuffer_free_bug at drivers/gpu/drm/drm_crtc.c:530
Product: Drivers
Reporter: Zan Lynx (zlynx)
Component: Video(DRI - Intel)
Assignee: intel-gfx-bugs (intel-gfx-bugs)
Status: RESOLVED CODE_FIX
Severity: normal
CC: abacabadabacaba, benopen, intel-gfx-bugs, peter.weber
Priority: P1
Hardware: x86-64
OS: Linux
Kernel Version: 4.0-rc4
Subsystem:
Regression: No
Bisected commit-id:

Description Zan Lynx 2015-03-25 23:55:13 UTC
I believe I was trying to switch from the GDM login console to a text console.

Shortly after I tried to switch, the machine froze and I had to force a power-off with the power button.

This was in the log when I came back.

------------[ cut here ]------------
kernel BUG at drivers/gpu/drm/drm_crtc.c:530!
invalid opcode: 0000 [#1] SMP 
Modules linked in: vhost_net vhost macvtap macvlan rfcomm fuse ccm xt_CHECKSUM ipt_MASQUERADE nf_nat_masquerade_ipv4 tun nf_conntrack_netbios_ns nf_conntrack_broadcast ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ebtable_nat ebtable_broute bridge stp llc ebtable_filter ebtables ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_security ip6table_raw ip6table_filter ip6_tables iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_security iptable_raw bnep arc4 pn544_mei mei_phy pn544 hci intel_rapl iosf_mbi nfc x86_pkg_temp_thermal coretemp dell_wmi dell_laptop iTCO_wdt sparse_keymap kvm_intel iTCO_vendor_support iwlmvm dcdbas kvm snd_hda_codec_realtek mac80211 snd_hda_codec_hdmi vfat snd_hda_codec_generic
 fat i8k snd_hda_intel snd_hda_controller snd_hda_codec crct10dif_pclmul snd_hwdep crc32_pclmul uvcvideo snd_seq iwlwifi videobuf2_vmalloc btusb ghash_clmulni_intel videobuf2_core snd_seq_device bluetooth videobuf2_memops cfg80211 snd_pcm i2c_i801 v4l2_common hid_multitouch videodev serio_raw media rtsx_pci_ms snd_timer rfkill tpm_tis memstick snd ie31200_edac mei_me soundcore lpc_ich mei edac_core shpchp tpm int3403_thermal dell_smo8800 int3402_thermal processor_thermal_device int3400_thermal int340x_thermal_zone acpi_thermal_rel nfsd auth_rpcgss nfs_acl lockd grace sunrpc btrfs xor nouveau i915 raid6_pq rtsx_pci_sdmmc mmc_core ttm i2c_algo_bit drm_kms_helper mxm_wmi drm crc32c_intel rtsx_pci mfd_core video wmi
CPU: 0 PID: 857 Comm: systemd-logind Not tainted 4.0.0-0.rc4.git0.1.fc22.x86_64 #1
Hardware name: Dell Inc. Dell Precision M3800/Dell Precision M3800, BIOS A08 11/14/2014
task: ffff880468736ca0 ti: ffff88045bae4000 task.ti: ffff88045bae4000
RIP: 0010:[<ffffffffa0069659>]  [<ffffffffa0069659>] drm_framebuffer_free_bug+0x9/0x10 [drm]
RSP: 0018:ffff88045bae7988  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff880469217400 RCX: 0000000000000000
RDX: 0000000180150010 RSI: ffffea00074349c0 RDI: ffff880079068cc8
RBP: ffff88045bae7988 R08: 00000000d0d27f01 R09: 000000018015000f
R10: ffff8801d0d27f00 R11: 0000000000000007 R12: ffff880079068cc0
R13: ffff88046921bf00 R14: ffff8804692a0360 R15: 0000000000000080
FS:  00007fe8a33ed8c0(0000) GS:ffff88047fa00000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f99b61f1018 CR3: 000000045c4a3000 CR4: 00000000001407f0
Stack:
 ffff88045bae79a8 ffffffffa006b7dd ffff880469217400 ffff8804692a0000
 ffff88045bae79e8 ffffffffa00d06a1 00000000ffffffff ffff88046921bf00
 ffff8804692a0000 0000000000000000 0000000000200001 0000000000000080
Call Trace:
 [<ffffffffa006b7dd>] drm_plane_force_disable+0x9d/0xd0 [drm]
 [<ffffffffa00d06a1>] restore_fbdev_mode+0x51/0xf0 [drm_kms_helper]
 [<ffffffffa00d2769>] drm_fb_helper_restore_fbdev_mode_unlocked+0x29/0x80 [drm_kms_helper]
 [<ffffffffa00d27e2>] drm_fb_helper_set_par+0x22/0x50 [drm_kms_helper]
 [<ffffffffa01c5c1a>] intel_fbdev_set_par+0x1a/0x60 [i915]
 [<ffffffff81418346>] ? fb_set_var+0x2f6/0x480
 [<ffffffff81418286>] fb_set_var+0x236/0x480
 [<ffffffff810d5079>] ? check_preempt_wakeup+0x1a9/0x230
 [<ffffffff810d3e1c>] ? update_curr+0x5c/0x160
 [<ffffffff8140dd8c>] fbcon_blank+0x34c/0x390
 [<ffffffff8101365b>] ? __switch_to+0x19b/0x5f0
 [<ffffffff81494cca>] do_unblank_screen+0xda/0x1d0
 [<ffffffff8148931d>] complete_change_console+0x5d/0xf0
 [<ffffffff8148a7bb>] vt_ioctl+0x140b/0x1430
 [<ffffffff8147c601>] tty_ioctl+0x3f1/0xc30
 [<ffffffff8147d470>] ? tty_release+0x320/0x5a0
 [<ffffffff81232ba6>] do_vfs_ioctl+0x2c6/0x4d0
 [<ffffffff81232e31>] SyS_ioctl+0x81/0xa0
 [<ffffffff81787849>] system_call_fastpath+0x12/0x17
Code: c0 71 e1 4c 89 e7 e8 a7 5e 19 e1 48 83 c4 08 5b 41 5c 41 5d 5d c3 66 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 55 48 89 e5 <0f> 0b 0f 1f 44 00 00 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 53 
RIP  [<ffffffffa0069659>] drm_framebuffer_free_bug+0x9/0x10 [drm]
 RSP <ffff88045bae7988>
---[ end trace be68833852e5bc6b ]---
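A note on reading the Code: line in this oops: the byte in angle brackets is the one at RIP, and the pair 0f 0b is the x86 ud2 instruction, which is what BUG() executes and why the log says "invalid opcode". The following is a minimal Python sketch for extracting those bytes; the helper names are hypothetical, and the sample string is an excerpt of the Code: line above.

```python
def faulting_bytes(code_line, count=2):
    """Return `count` opcode bytes starting at the one marked <xx>."""
    tokens = code_line.split()
    if tokens and tokens[0] == "Code:":
        tokens = tokens[1:]
    for i, tok in enumerate(tokens):
        if tok.startswith("<") and tok.endswith(">"):
            # Strip the marker and append the bytes that follow it.
            return [tok.strip("<>")] + tokens[i + 1:i + count]
    return []


def is_ud2(code_line):
    """True if the trapping instruction is ud2 (0f 0b), as emitted by BUG()."""
    return faulting_bytes(code_line) == ["0f", "0b"]


# Excerpt of the Code: line from the oops above.
EXCERPT = "Code: 0f 1f 44 00 00 55 48 89 e5 <0f> 0b 0f 1f 44 00 00"

if __name__ == "__main__":
    print(faulting_bytes(EXCERPT), is_ud2(EXCERPT))
```

A ud2 at RIP only confirms that an explicit BUG() was hit in drm_framebuffer_free_bug; it says nothing about why the framebuffer reference count reached that point.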
Comment 1 Jani Nikula 2015-03-26 12:17:10 UTC
Please try drm-intel-fixes branch of http://cgit.freedesktop.org/drm-intel.
Comment 2 Evgeny Kapun 2015-04-01 13:58:38 UTC
This bug affects me as well. I use a vanilla 3.19.3 kernel on Debian. For me, it happens when I log out. At this point, the X server should restart, but instead the machine hangs.

Log:
------------[ cut here ]------------
kernel BUG at drivers/gpu/drm/drm_crtc.c:536!
invalid opcode: 0000 [#1] SMP 
Modules linked in: dm_mod ctr ccm binfmt_misc cpufreq_stats pci_stub cpufreq_powersave cpufreq_conservative vboxpci(O) cpufreq_userspace vboxnetadp(O) vboxnetflt(O) vboxdrv(O) arc4 iwldvm btusb i915 bluetooth mac80211 coretemp kvm_intel kvm iwlwifi snd_hda_codec_conexant pcmcia snd_hda_codec_generic cfg80211 joydev iTCO_wdt snd_hda_intel iTCO_vendor_support snd_hda_controller snd_hda_codec drm_kms_helper psmouse r852 evdev sm_common nand pcspkr snd_hwdep serio_raw nand_ecc snd_pcm_oss drm yenta_socket nand_bch bch nand_ids pcmcia_rsrc mtd pcmcia_core r592 memstick lpc_ich snd_mixer_oss snd_pcm i2c_i801 mfd_core thinkpad_acpi snd_timer nvram snd tpm_tis wmi acpi_cpufreq mei_me tpm processor shpchp rfkill video mei i2c_algo_bit i2c_core battery 8250_fintek ac soundcore button fuse parport_pc ppdev lp parport autofs4 ext4 crc16 mbcache jbd2 sg hid_generic sd_mod sr_mod cdrom usbhid hid ahci libahci libata scsi_mod firewire_ohci sdhci_pci sdhci firewire_core mmc_core crc_itu_t e1000e thermal thermal_sys ehci_pci uhci_hcd ehci_hcd usbcore ptp pps_core usb_common
CPU: 1 PID: 1534 Comm: Xorg Tainted: G           O   3.19.3 #1
Hardware name: LENOVO 2765RAG/2765RAG, BIOS 7VET66WW (2.16 ) 04/22/2009
task: ffff88007edacca0 ti: ffff88007ee1c000 task.ti: ffff88007ee1c000
RIP: 0010:[<ffffffffa03fc9c5>]  [<ffffffffa03fc9c5>] drm_framebuffer_free_bug+0x5/0x10 [drm]
RSP: 0018:ffff88007ee1fac0  EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff88007e65dc00 RCX: 0000000000000001
RDX: 0000000000000040 RSI: 0000000000000000 RDI: ffff88007f86ff48
RBP: ffff88007f86ff40 R08: ffff88007ea09170 R09: 0000000000000000
R10: ffff88007e088d00 R11: 0000000000003246 R12: ffff88007f835e00
R13: ffff88012e6bf378 R14: ffff88007ee1fbb8 R15: 0000000000000080
FS:  00007fcbb76c8980(0000) GS:ffff88013bc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00007fd45bf18010 CR3: 0000000109a4c000 CR4: 00000000000407e0
Stack:
 ffffffffa03fe7ad ffff88007faef000 ffff88007e65dc00 ffff88012e6bf000
 ffffffffa04aa679 ffff88007f835e00 ffff88012e6bf000 0000000000200001
 ffff8800b3844860 ffff88007ee1fbb8 ffffffffa04ac610 ffff88007f835e00
Call Trace:
 [<ffffffffa03fe7ad>] ? drm_plane_force_disable+0x9d/0xd0 [drm]
 [<ffffffffa04aa679>] ? restore_fbdev_mode+0x49/0xe0 [drm_kms_helper]
 [<ffffffffa04ac610>] ? drm_fb_helper_restore_fbdev_mode_unlocked+0x20/0x60 [drm_kms_helper]
 [<ffffffffa04ac672>] ? drm_fb_helper_set_par+0x22/0x50 [drm_kms_helper]
 [<ffffffffa0972ef6>] ? intel_fbdev_set_par+0x16/0x60 [i915]
 [<ffffffff813361ee>] ? fb_set_var+0x15e/0x3a0
 [<ffffffffa017e165>] ? jbd2_journal_dirty_metadata+0xd5/0x2a0 [jbd2]
 [<ffffffff8132d1e3>] ? fbcon_blank+0x223/0x2e0
 [<ffffffff813a4ea7>] ? do_unblank_screen+0xb7/0x1f0
 [<ffffffff8139b698>] ? vt_ioctl+0x698/0x1360
 [<ffffffff8138e9ee>] ? tty_ioctl+0x3de/0xc40
 [<ffffffff811d5c91>] ? dput+0x21/0x1a0
 [<ffffffff811d2648>] ? do_vfs_ioctl+0x2e8/0x4f0
 [<ffffffff810887dc>] ? task_work_run+0xbc/0xf0
 [<ffffffff811d28d1>] ? SyS_ioctl+0x81/0xa0
 [<ffffffff81550baf>] ? int_signal+0x12/0x17
 [<ffffffff8155090d>] ? system_call_fastpath+0x16/0x1b
Code: 02 03 ed e0 4c 89 e7 e8 da 1c 15 e1 5b 48 89 ef 5d 41 5c e9 5e 72 da e0 66 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 66 66 66 66 90 <0f> 0b 66 0f 1f 84 00 00 00 00 00 66 66 66 66 90 53 48 89 fb e8 
RIP  [<ffffffffa03fc9c5>] drm_framebuffer_free_bug+0x5/0x10 [drm]
 RSP <ffff88007ee1fac0>
---[ end trace c658404e860f3cfe ]---
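Both oopses in this report pass through the same fbdev restore path (drm_plane_force_disable, restore_fbdev_mode, and so on). Frames prefixed with "?" are ones the unwinder could not verify. As a rough way to compare two traces, here is a small Python sketch; the function names are hypothetical and the sample lines are copied from the two traces in this report.

```python
import re

# Matches "[<addr>] [? ]symbol+0xoff/0xsize" in a Call Trace line.
FRAME = re.compile(
    r"\[<[0-9a-f]+>\]\s+(\?\s+)?([A-Za-z_]\w*)\+0x[0-9a-f]+/0x[0-9a-f]+"
)


def frames(trace_lines):
    """Return (symbol, reliable) pairs; reliable is False for '?' frames."""
    out = []
    for line in trace_lines:
        m = FRAME.search(line)
        if m:
            out.append((m.group(2), m.group(1) is None))
    return out


def common_symbols(trace_a, trace_b):
    """Symbols of trace_a, in order, that also appear in trace_b."""
    names_b = {name for name, _ in frames(trace_b)}
    return [name for name, _ in frames(trace_a) if name in names_b]


TRACE_4_0_RC4 = [
    " [<ffffffffa006b7dd>] drm_plane_force_disable+0x9d/0xd0 [drm]",
    " [<ffffffffa00d06a1>] restore_fbdev_mode+0x51/0xf0 [drm_kms_helper]",
    " [<ffffffff81418346>] ? fb_set_var+0x2f6/0x480",
]
TRACE_3_19_3 = [
    " [<ffffffffa03fe7ad>] ? drm_plane_force_disable+0x9d/0xd0 [drm]",
    " [<ffffffffa04aa679>] ? restore_fbdev_mode+0x49/0xe0 [drm_kms_helper]",
    " [<ffffffffa017e165>] ? jbd2_journal_dirty_metadata+0xd5/0x2a0 [jbd2]",
]

if __name__ == "__main__":
    print(common_symbols(TRACE_4_0_RC4, TRACE_3_19_3))
```

Matching on symbol names rather than addresses is deliberate here, since module load addresses and function offsets differ between the two kernels.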
Comment 4 Benoît Wallecan 2015-04-01 17:38:49 UTC
(In reply to Evgeny Kapun from comment #3)
> Looks like this commit is relevant:
> 
> http://cgit.freedesktop.org/drm-intel/commit/?h=drm-intel-fixes&id=8218c3f4df3bb1c637c17552405039a6dd3c1ee1

I confirm this fixed the problem for me (patched against vanilla 3.19.3).
The bug can be triggered at boot time by GDM 3.16 (Wayland mode), or at shutdown.
Comment 5 Jani Nikula 2015-04-02 10:59:35 UTC
Also http://mid.gmane.org/551A449B.8020300@intel.com

This is fixed upstream, and the relevant commit is cc: stable, i.e. it should find its way into some v3.19.y release. Closing, thanks for the report.
Comment 6 Peter Weber 2015-04-12 12:41:51 UTC
Hi!
I'm using Arch Linux with GNOME 3.16 and also ran into this issue. I applied this patch to my self-compiled kernel (release 3.19.3) and it works.

But occasionally, switching from GNOME (X11) to a specific TTY seems to hang *when multiple monitors* are connected.

Steps to reproduce:
1. Repeatedly switch between GDM (TTY1, Wayland), GNOME (first unoccupied TTY, X11), and a TTY on which systemd starts a getty.
2. At some point, switching from GNOME to a specific TTY is no longer possible.
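
The switching in step 1 could be automated roughly as follows. This is only a sketch under assumptions: it uses the chvt command from the kbd package (which normally needs root), and the VT numbers 1, 2 and 3 are placeholders for GDM, the GNOME session and the getty TTY. The function names are hypothetical.

```python
import itertools
import subprocess
import time


def switch_sequence(vts, rounds):
    """VT numbers to visit: `rounds` full passes over `vts`, in order."""
    return list(itertools.islice(itertools.cycle(vts), rounds * len(vts)))


def run(vts, rounds, delay=2.0, dry_run=True):
    """Switch VTs with chvt; with dry_run=True only print the commands."""
    for vt in switch_sequence(vts, rounds):
        cmd = ["chvt", str(vt)]
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
            time.sleep(delay)  # give the mode switch time to complete


if __name__ == "__main__":
    # Assumed layout: VT1 = GDM, VT2 = GNOME session, VT3 = getty.
    run([1, 2, 3], rounds=10, dry_run=True)
```

The script defaults to a dry run so that it can be inspected before it actually switches consoles; whether a fixed delay is enough to trigger the hang is untested.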

Workaround:
Switch to some other TTY first, not the one originally wanted. After that, switching should be possible again.

Possibly related messages from dmesg:
...
[  278.625941] [drm:ironlake_irq_handler] *ERROR* CPU pipe A FIFO underrun
[  278.628987] [drm:intel_set_pch_fifo_underrun_reporting] *ERROR* uncleared pch fifo underrun on pch transcoder A
[  278.629002] [drm:cpt_irq_handler] *ERROR* PCH transcoder A FIFO underrun
[  283.067207] traps: polkitd[359] general protection ip:7f93bb4b9582 sp:7ffcf7336240 error:0 in libmozjs-17.0.so[7f93bb381000+3a8000]
[  283.067274] audit: type=1701 audit(1428835568.152:8): auid=4294967295 uid=102 gid=102 ses=4294967295 pid=359 comm="polkitd" exe="/usr/lib/polkit-1/polkitd" sig=11
[  532.180506] thinkpad_acpi: docked into hotplug port replicator
[  532.295384] usb 1-2: new high-speed USB device number 2 using xhci_hcd
[  532.409582] usb 1-2: New USB device found, idVendor=17ef, idProduct=100a
[  532.409592] usb 1-2: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[  532.411457] hub 1-2:1.0: USB hub found
[  532.411892] hub 1-2:1.0: 6 ports detected
[  532.692764] usb 1-2.2: new full-speed USB device number 3 using xhci_hcd
[  532.786116] usb 1-2.2: New USB device found, idVendor=17ef, idProduct=604d
[  532.786127] usb 1-2.2: New USB device strings: Mfr=0, Product=2, SerialNumber=0
[  532.786132] usb 1-2.2: Product: Lenovo USB Receiver
[  532.786474] usb 1-2.2: ep 0x81 - rounding interval to 64 microframes, ep desc says 80 microframes
[  532.786497] usb 1-2.2: ep 0x83 - rounding interval to 64 microframes, ep desc says 80 microframes
[  532.790763] input: Lenovo USB Receiver as /devices/pci0000:00/0000:00:1c.6/0000:0e:00.0/usb1/1-2/1-2.2/1-2.2:1.0/0003:17EF:604D.0001/input/input19
[  532.841848] hid-generic 0003:17EF:604D.0001: input,hidraw0: USB HID v1.11 Keyboard [Lenovo USB Receiver] on usb-0000:0e:00.0-2.2/input0
[  532.845191] input: Lenovo USB Receiver as /devices/pci0000:00/0000:00:1c.6/0000:0e:00.0/usb1/1-2/1-2.2/1-2.2:1.1/0003:17EF:604D.0002/input/input20
[  532.845646] hid-generic 0003:17EF:604D.0002: input,hidraw1: USB HID v1.11 Mouse [Lenovo USB Receiver] on usb-0000:0e:00.0-2.2/input1
[  532.853314] input: Lenovo USB Receiver as /devices/pci0000:00/0000:00:1c.6/0000:0e:00.0/usb1/1-2/1-2.2/1-2.2:1.2/0003:17EF:604D.0003/input/input21
[  532.904010] hid-generic 0003:17EF:604D.0003: input,hiddev0,hidraw2: USB HID v1.11 Keypad [Lenovo USB Receiver] on usb-0000:0e:00.0-2.2/input2
[  533.685648] thinkpad_acpi: EC reports that Thermal Table has changed
[ 6178.364357] traps: polkitd[2079] general protection ip:7f1a9ac57582 sp:7ffc3b851e30 error:0 in libmozjs-17.0.so[7f1a9ab1f000+3a8000]
[ 6178.364419] audit: type=1701 audit(1428841461.879:9): auid=4294967295 uid=102 gid=102 ses=4294967295 pid=2079 comm="polkitd" exe="/usr/lib/polkit-1/polkitd" sig=11
[ 6192.804819] [drm:ironlake_irq_handler] *ERROR* CPU pipe A FIFO underrun
[ 6192.804897] [drm:intel_set_pch_fifo_underrun_reporting] *ERROR* uncleared pch fifo underrun on pch transcoder A
[ 6192.804907] [drm:cpt_irq_handler] *ERROR* PCH transcoder A FIFO underrun
[ 6252.008162] traps: polkitd[2365] general protection ip:7ff0feb15582 sp:7ffdac6c57e0 error:0 in libmozjs-17.0.so[7ff0fe9dd000+3a8000]
[ 6252.008232] audit: type=1701 audit(1428841535.503:10): auid=4294967295 uid=102 gid=102 ses=4294967295 pid=2365 comm="polkitd" exe="/usr/lib/polkit-1/polkitd" sig=11
...

I think the "traps" and "audit" messages are related to the messages from "drm".
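
One way to examine that hunch is to measure how close in time the polkitd traps are to the drm FIFO underrun errors. The Python sketch below does this for lines copied from the dmesg excerpt above; the function names are hypothetical, and time proximity alone would not prove or disprove a causal link.

```python
import re

# "[  278.625941] message" -> (278.625941, "message")
LINE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(.*)$")


def parse(lines):
    """Return (timestamp_seconds, message) for each dmesg-style line."""
    events = []
    for line in lines:
        m = LINE.match(line)
        if m:
            events.append((float(m.group(1)), m.group(2)))
    return events


def nearest_gaps(events, a_marker, b_marker):
    """For each event containing a_marker, the distance in seconds to the
    nearest event containing b_marker."""
    a = [t for t, msg in events if a_marker in msg]
    b = [t for t, msg in events if b_marker in msg]
    if not b:
        return []
    return [(t, min(abs(t - u) for u in b)) for t in a]


SAMPLE = [
    "[  278.625941] [drm:ironlake_irq_handler] *ERROR* CPU pipe A FIFO underrun",
    "[  283.067207] traps: polkitd[359] general protection ip:7f93bb4b9582 sp:7ffcf7336240 error:0 in libmozjs-17.0.so[7f93bb381000+3a8000]",
    "[ 6178.364357] traps: polkitd[2079] general protection ip:7f1a9ac57582 sp:7ffc3b851e30 error:0 in libmozjs-17.0.so[7f1a9ab1f000+3a8000]",
    "[ 6192.804819] [drm:ironlake_irq_handler] *ERROR* CPU pipe A FIFO underrun",
    "[ 6252.008162] traps: polkitd[2365] general protection ip:7ff0feb15582 sp:7ffdac6c57e0 error:0 in libmozjs-17.0.so[7ff0fe9dd000+3a8000]",
]

if __name__ == "__main__":
    for t, gap in nearest_gaps(parse(SAMPLE), "traps: polkitd", "[drm:"):
        print(f"trap at {t:.3f}s, nearest drm error {gap:.1f}s away")
```

On these lines the gaps come out at roughly 4, 14 and 59 seconds, and in one case the trap precedes the drm error, so the timing neither confirms nor rules out a connection.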