Created attachment 162571 [details]
full dmesg

It does not happen very often, but when it does, the game freezes and has to be killed. radeon takes a little while, but recovers.

00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09)
01:00.0 VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Wimbledon XT [Radeon HD 7970M] (rev ff)

mesa was a then-recent git master build with master from https://github.com/iXit/Mesa-3D.git merged into it, so there's a possibility it doesn't happen with pure upstream mesa. I think it has only happened in skyrim with nine so far.

[106308.053804] BUG: unable to handle kernel paging request at ffff8004a2fa79e8
[106308.055548] IP: [<ffffffffa01c03ee>] ttm_eu_reserve_buffers+0xbe/0x390 [ttm]
[106308.057054] PGD 0
[106308.057059] Oops: 0000 [#1] PREEMPT SMP
[106308.057088] Modules linked in: hidp uvcvideo rfcomm joydev btrfs xor ecb bnep msr videobuf2_vmalloc videobuf2_memops videobuf2_core v4l2_common raid6_pq videodev media coretemp btusb arc4 mousedev intel_rapl bluetooth iosf_mbi x86_pkg_temp_thermal intel_powerclamp kvm_intel iTCO_wdt iTCO_vendor_support snd_hda_codec_hdmi kvm snd_hda_codec_realtek iwldvm snd_hda_codec_generic led_class mac80211 crct10dif_pclmul crc32_pclmul crc32c_intel snd_hda_intel ghash_clmulni_intel snd_hda_controller aesni_intel snd_hda_codec aes_x86_64 lrw gf128mul glue_helper snd_hwdep iwlwifi snd_pcm ablk_helper cfg80211 snd_timer psmouse cryptd r8169 i2c_i801 serio_raw snd pcspkr rtsx_pci_ms soundcore memstick rfkill lpc_ich mii wmi tpm_tis tpm mei_me mei shpchp evdev battery ac thermal processor mac_hid sch_fq_codel nfs
[106308.057104] lockd grace sunrpc fscache fuse ext4 crc16 mbcache jbd2 sr_mod cdrom sd_mod hid_generic usbhid hid rtsx_pci_sdmmc mmc_core atkbd libps2 ahci ehci_pci libahci libata xhci_pci xhci_hcd firewire_ohci ehci_hcd scsi_mod firewire_core crc_itu_t rtsx_pci usbcore usb_common i8042 serio radeon hwmon ttm i915 button intel_gtt video i2c_algo_bit drm_kms_helper drm i2c_core [last unloaded: uvcvideo]
[106308.057107] CPU: 2 PID: 3057 Comm: TESV.exe Not tainted 3.19.0-1-mainline #1
[106308.057108] Hardware name: CLEVO P170EM/P170EM, BIOS 4.6.5 08/22/2012
[106308.057109] task: ffff88070cae13e0 ti: ffff8804a30dc000 task.ti: ffff8804a30dc000
[106308.057115] RIP: 0010:[<ffffffffa01c03ee>] [<ffffffffa01c03ee>] ttm_eu_reserve_buffers+0xbe/0x390 [ttm]
[106308.057116] RSP: 0018:ffff8804a30df738 EFLAGS: 00010286
[106308.057117] RAX: 0000000000000000 RBX: ffff8804a30dfb30 RCX: 0000000000000008
[106308.057117] RDX: 0000000000000004 RSI: 0000000000000058 RDI: 0000000000000000
[106308.057118] RBP: ffff8804a30df788 R08: ffffc9002295b488 R09: 0000000000000000
[106308.057118] R10: ffffffff817215ea R11: ffffea0013907a00 R12: 0000000000000000
[106308.057119] R13: ffff8004a2fa7868 R14: ffff880808e08000 R15: ffff88031f24c388
[106308.057120] FS: 000000007ffd8000(0063) GS:ffff88082f280000(006b) knlGS:00000000eb4adb40
[106308.057121] CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
[106308.057121] CR2: ffff8004a2fa79e8 CR3: 000000070d02d000 CR4: 00000000001407e0
[106308.057122] Stack:
[106308.057124] ffff880809282d80 ffff8804a30df7e0 0100000000000000 ffff8804a30dfcd8
[106308.057125] ffff880337cba800 ffff8804a30dfb30 ffff8804a30dfa80 ffff8804a30dfb30
[106308.057127] ffff880808e08000 ffff8804a30dfae8 ffff8804a30df828 ffffffffa01eeaa7
[106308.057127] Call Trace:
[106308.057144] [<ffffffffa01eeaa7>] radeon_bo_list_validate+0x97/0x230 [radeon]
[106308.057158] [<ffffffffa0206f9d>] radeon_cs_parser_relocs+0x34d/0x440 [radeon]
[106308.057172] [<ffffffffa0207ac0>] radeon_cs_ioctl+0x2a0/0x810 [radeon]
[106308.057176] [<ffffffff81014935>] ? __switch_to+0x445/0x5f0
[106308.057186] [<ffffffffa001ccff>] drm_ioctl+0x1df/0x680 [drm]
[106308.057197] [<ffffffffa01cd04c>] radeon_drm_ioctl+0x4c/0x80 [radeon]
[106308.057208] [<ffffffffa02cf5d4>] radeon_kms_compat_ioctl+0x14/0x30 [radeon]
[106308.057211] [<ffffffff812276c0>] compat_SyS_ioctl+0xf0/0x1260
[106308.057214] [<ffffffff810f2fe4>] ? compat_SyS_futex+0x84/0x1a0
[106308.057216] [<ffffffff81091339>] ? task_work_run+0xd9/0xf0
[106308.057220] [<ffffffff815639b6>] sysenter_dispatch+0x7/0x25
[106308.057234] Code: 00 00 00 85 c0 0f 8f d2 01 00 00 41 80 7f 18 00 0f 85 a7 01 00 00 4d 8b 3f 49 39 df 0f 84 eb 01 00 00 48 83 7d c8 00 4d 8b 6f 10 <49> 8b bd 80 01 00 00 0f 84 65 01 00 00 80 7d c7 00 48 8b 75 c8
[106308.057238] RIP [<ffffffffa01c03ee>] ttm_eu_reserve_buffers+0xbe/0x390 [ttm]
[106308.057239] RSP <ffff8804a30df738>
[106308.057239] CR2: ffff8004a2fa79e8
[106308.069117] ---[ end trace 086e470f5f9bd070 ]---
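As a reading aid for the oops above, here is a minimal Python sketch that cross-checks the faulting address against the register dump. It assumes my own decoding of the instruction bytes marked `<49>` in the Code: line (`49 8b bd 80 01 00 00`) as `mov 0x180(%r13),%rdi`; the names `R13`, `CR2`, `DISPLACEMENT` and `faulting_address` are only illustrative.

```python
# Values copied from the register dump in the oops.
R13 = 0xFFFF8004A2FA7868
CR2 = 0xFFFF8004A2FA79E8  # address the kernel could not page in

# Assumption: the bytes "49 8b bd 80 01 00 00" decode as
# mov 0x180(%r13),%rdi, i.e. a load with a 32-bit little-endian displacement.
DISPLACEMENT = int.from_bytes(bytes.fromhex("80010000"), "little")


def faulting_address(base: int, disp: int) -> int:
    """Effective address of a base+displacement load, wrapped to 64 bits."""
    return (base + disp) & 0xFFFFFFFFFFFFFFFF


print(hex(DISPLACEMENT))
print(hex(faulting_address(R13, DISPLACEMENT)))
print(faulting_address(R13, DISPLACEMENT) == CR2)
```

If that decoding is right, the fault is a read through the pointer held in R13. R13 (ffff8004...) does not look like the ffff88... pointers in the other registers, which would be consistent with a corrupted list entry, though the log alone does not say how it got that way.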
Can you try decoding the backtrace with scripts/decode_stacktrace.sh from the kernel tree? Does it only happen with a 3.19 kernel, or also with older ones?
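For context on what decoding works from: each call trace entry carries a symbol, an offset into it, the symbol's size and (for modules) the module name, and a decoder maps symbol+offset to a source line using a build with debug info. A rough Python sketch of pulling those fields out of a trace line follows. `Frame`, `FRAME_RE` and `parse_frame` are hypothetical helpers written for illustration; they are not part of the kernel script.

```python
import re
from typing import NamedTuple, Optional


class Frame(NamedTuple):
    address: int
    symbol: str
    offset: int            # offset of the return address inside the symbol
    size: int              # total size of the symbol
    module: Optional[str]  # None for code built into vmlinux
    reliable: bool         # False for entries printed with a leading "?"


# Matches entries such as "[<ffffffffa01eeaa7>] name+0x97/0x230 [radeon]".
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+(?P<q>\?\s+)?"
    r"(?P<sym>[\w.]+)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?:\s+\[(?P<mod>\w+)\])?"
)


def parse_frame(line: str) -> Optional[Frame]:
    """Return the parsed frame, or None if the line is not a trace entry."""
    m = FRAME_RE.search(line)
    if m is None:
        return None
    return Frame(
        address=int(m["addr"], 16),
        symbol=m["sym"],
        offset=int(m["off"], 16),
        size=int(m["size"], 16),
        module=m["mod"],
        reliable=m["q"] is None,
    )


line = "[106308.057144] [<ffffffffa01eeaa7>] radeon_bo_list_validate+0x97/0x230 [radeon]"
print(parse_frame(line))
```

Entries printed with a "?" are ones the unwinder was not sure about, so the sketch marks them as not reliable rather than dropping them.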
I'll have to build a kernel with symbols later and replicate it. It sometimes takes even a few hours of gameplay to have this happen, so it could take some time. But I am relatively sure that it did not happen with 3.18.
Hm, interesting. I compiled 3.19-rc4 with debug symbols. I'm also testing Tom Stellard's VGPR register spilling llvm and mesa branches.

After a while of playing skyrim I got the familiar hang where skyrim just freezes, but I did NOT get "BUG: unable to handle kernel paging request" in the system log. Instead I got this in the terminal from which I started skyrim:

radeon: mmap failed, errno: 12
radeon: mmap failed, errno: 12
radeon: mmap failed, errno: 12
radeon: mmap failed, errno: 12

I'm not very good with the wine debugger... Attaching to the TESV.exe process and then getting a backtrace shows:

Wine-dbg>bt
Backtrace:
=>0 0xf7702bee __kernel_vsyscall+0xe() in [vdso].so (0x7eada510)
  1 0xf7514e02 __lll_lock_wait+0x21() in libpthread.so.0 (0x7eada510)
  2 0xf750f5ae __GI___pthread_mutex_lock+0x8d() in libpthread.so.0 (0x7eada510)
  3 0xed73ba1d in d3dadapter9.so.1 (+0x128a1c) (0x7eada510)
  4 0x0069df9c in tesv (+0x29df9b) (0x7eada510)
  5 0xfff0e400 (0x526077e9)

I would try to find out where exactly in d3dadapter9.so.1 this happens, but I don't get how to properly attach winedbg --gdb, and addr2line didn't give a line with code, so I probably used it wrong.
(In reply to Christoph Haag from comment #3)
> radeon: mmap failed, errno: 12

That's ENOMEM, so it looks like the kernel runs out of memory. Maybe a leak somewhere.

(In reply to Christoph Haag from comment #2)
> But I am relatively sure that it did not happen with 3.18.

Can you bisect?
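For anyone wanting to check the errno mapping themselves, a small Python sketch using only the standard library; the variable names are illustrative.

```python
import errno
import os

code = 12  # value printed in "radeon: mmap failed, errno: 12"

# errno.errorcode maps numeric codes to their symbolic names.
name = errno.errorcode[code]

# os.strerror gives the platform's human-readable message for the code.
print(name, "-", os.strerror(code))
```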
I didn't answer for a while because I didn't have too much time, but also because it hasn't happened anymore. I think it has meanwhile been fixed, wherever the problem was.