Subject : 2.6.36-rc6: WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:235 radeon_fence_wait+0x35a/0x3c0
Submitter : Alexey Dobriyan <adobriyan@gmail.com>
Date : 2010-09-29 21:29
Message-ID : 20100929212923.GA5578@core2.telecom.by
References : http://marc.info/?l=linux-kernel&m=128579579400315&w=2

This entry is being used for tracking a regression from 2.6.35. Please don't close it until the problem is fixed in the mainline.
Is this really a regression from 2.6.35? Could also be a userspace issue.
Well, normally userspace shouldn't be able to wreck the kernel. (I know, graphics are somewhat difficult in that respect.. but still) Alexey, does this still happen on 2.6.37 or something newer?
I do not use that card anymore. It happened only once, IIRC.
Created attachment 50022: dmesg log

Hi,

it happened to me with 2.6.38-rc7 several times. Seems to be quite easy to reproduce on my system:

1. start system (gentoo), login to KDE
2. start Stellarium in window mode
3. resize the window

Crash usually occurs when the window is almost maximized (about 1800 x 1000 px). Sometimes the lockup happens only once and system seems to be fine, sometimes there are two or more lockups and it also happened that display shut off.

kernel message:

WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:248 radeon_fence_wait+0x39e/0x400()
Hardware name: H55M-USB3
GPU lockup (waiting for 0x00002922 last fence id 0x00002921)
Modules linked in: sit tunnel4 ipv6 coretemp it87 hwmon_vid iptable_mangle iptable_nat nf_nat kvm_intel kvm snd_hda_codec_realtek snd_hda_intel snd_usb_audio snd_hda_codec snd_pcm snd_timer snd_hwdep snd_usbmidi_lib snd_rawmidi snd r8169 i2c_i801 mii soundcore snd_page_alloc
Pid: 2336, comm: stellarium Not tainted 2.6.38-rc7 #1
Call Trace:
 [<ffffffff81039ffb>] ? warn_slowpath_common+0x7b/0xc0
 [<ffffffff8103a0f5>] ? warn_slowpath_fmt+0x45/0x50
 [<ffffffff8129821e>] ? radeon_fence_wait+0x39e/0x400
 [<ffffffff81055210>] ? autoremove_wake_function+0x0/0x30
 [<ffffffff812609cd>] ? ttm_bo_wait+0x10d/0x1c0
 [<ffffffff812b0e8b>] ? radeon_gem_wait_idle_ioctl+0x8b/0x110
 [<ffffffff8124aa9c>] ? drm_ioctl+0x38c/0x450
 [<ffffffff810a2136>] ? __pte_alloc+0xc6/0xd0
 [<ffffffff812b0e00>] ? radeon_gem_wait_idle_ioctl+0x0/0x110
 [<ffffffff810a4ccd>] ? handle_mm_fault+0xfd/0x220
 [<ffffffff810242e9>] ? do_page_fault+0x199/0x410
 [<ffffffff810a9daf>] ? mmap_region+0x1df/0x4b0
 [<ffffffff810d1711>] ? do_vfs_ioctl+0x91/0x510
 [<ffffffff810d1bd9>] ? sys_ioctl+0x49/0x80
 [<ffffffff810024fb>] ? system_call_fastpath+0x16/0x1b

lspci:

01:00.0 VGA compatible controller: ATI Technologies Inc Juniper [Radeon HD 5700 Series] (prog-if 00 [VGA controller])
        Subsystem: Micro-Star International Co., Ltd. Device 2140
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Memory at e0000000 (64-bit, prefetchable) [size=256M]
        Memory at fbcc0000 (64-bit, non-prefetchable) [size=128K]
        I/O ports at ee00 [size=256]
        [virtual] Expansion ROM at fbc00000 [disabled] [size=128K]
        Capabilities: [50] Power Management version 3
        Capabilities: [58] Express Legacy Endpoint, MSI 00
        Capabilities: [a0] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [100] Vendor Specific Information: ID=0001 Rev=1 Len=010 <?>
        Capabilities: [150] Advanced Error Reporting
        Kernel driver in use: radeon
Hi,

the same happens here with kernel 2.6.36.4 and 2.6.37.2 while watching some YouTube videos with Opera or Iceweasel. I'm on Debian stable here with a Radeon HD 3450.

radeon 0000:01:00.0: GPU lockup CP stall for more than 10035msec
------------[ cut here ]------------
WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:235 radeon_fence_wait+0x235/0x2d3()
Hardware name: MS-7376
GPU lockup (waiting for 0x00008D65 last fence id 0x00008D63)
Modules linked in: xt_limit xt_tcpudp iptable_mangle ipt_LOG ipt_MASQUERADE nf_nat xt_DSCP ipt_REJECT nf_conntrack_irc nf_conntrack_ftp nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack iptable_filter ip_tables x_tables aes_generic fuse loop arc4 ecb crypto_blkcipher cryptomgr aead crypto_algapi rt73usb crc_itu_t rt2x00usb rt2x00lib mac80211 cfg80211 hid_cherry usbhid [last unloaded: scsi_wait_scan]
Pid: 2071, comm: Xorg Not tainted 2.6.36.4 #1
Call Trace:
 [<ffffffff810322e0>] ? warn_slowpath_common+0x78/0x8c
 [<ffffffff81032393>] ? warn_slowpath_fmt+0x45/0x4a
 [<ffffffff811e8407>] ? radeon_fence_wait+0x235/0x2d3
 [<ffffffff81046eeb>] ? autoremove_wake_function+0x0/0x2a
 [<ffffffff811bcb3e>] ? ttm_bo_wait+0xc7/0x16e
 [<ffffffff811f992b>] ? radeon_gem_wait_idle_ioctl+0x7a/0xdf
 [<ffffffff811ab980>] ? drm_ioctl+0x236/0x2ea
 [<ffffffff811f98b1>] ? radeon_gem_wait_idle_ioctl+0x0/0xdf
 [<ffffffff8100a545>] ? save_i387_xstate+0x12e/0x1bd
 [<ffffffff81001906>] ? do_signal+0x58b/0x679
 [<ffffffff8109db9d>] ? do_vfs_ioctl+0x418/0x465
 [<ffffffff81001c1f>] ? sys_rt_sigreturn+0x1c7/0x228
 [<ffffffff8109dc26>] ? sys_ioctl+0x3c/0x5c
 [<ffffffff81001e6b>] ? system_call_fastpath+0x16/0x1b
---[ end trace 9d3e75f9935ec99b ]---
[drm] Disabling audio support
radeon 0000:01:00.0: GPU softreset
radeon 0000:01:00.0: R_008010_GRBM_STATUS=0xE57024E0
radeon 0000:01:00.0: R_008014_GRBM_STATUS2=0x00110103
radeon 0000:01:00.0: R_000E50_SRBM_STATUS=0x200010C0
radeon 0000:01:00.0: R_008020_GRBM_SOFT_RESET=0x00007FEE
radeon 0000:01:00.0: R_008020_GRBM_SOFT_RESET=0x00000001
radeon 0000:01:00.0: R_008010_GRBM_STATUS=0xA0003030
radeon 0000:01:00.0: R_008014_GRBM_STATUS2=0x00000003
radeon 0000:01:00.0: R_000E50_SRBM_STATUS=0x200080C0
radeon 0000:01:00.0: GPU reset succeed
[drm] ring test succeeded in 1 usecs
[drm] ib test succeeded in 1 usecs
[drm] Enabling audio support
If userspace is feeding bad commands to the GPU, it might hiccup. There is not much you can do on the kernel side about that (besides sanity checking the whole command stream; I'm not sure what is currently done in that regard). But even if you manage to do that, userspace could still try to exploit bugs in the GPU silicon. If everything carries on as well as before, I'd say the kernel has done its homework.

Under that assumption this bug will be closed as a userspace bug and thus invalid for the kernel. On the other hand, if you manage to find a kernel > 2.6.32 which does not exhibit the problem, we should probably revisit that assumption.

You might try upgrading your userspace graphics stack, and you could take a look at bugzilla.freedesktop.org and file a bug there. I guess there are many of those, as the symptom (GPU lockup) would fit a lot of userspace bugs. But if you are able to reproduce this, it might even have a chance of getting fixed over there.
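
For anyone comparing kernels or collecting data for a freedesktop.org report, here is a minimal sketch that pulls the lockup lines out of a saved dmesg log. It assumes only the message wording quoted in the reports in this thread ("GPU lockup (waiting for ... last fence id ...)" and "GPU reset succeed"); other kernel versions may word these differently. The function names are made up for illustration and are not part of any kernel or DRM tooling.

```python
import re

# Pattern based solely on the message format quoted in this thread, e.g.
#   "GPU lockup (waiting for 0x00002922 last fence id 0x00002921)"
LOCKUP_RE = re.compile(
    r"GPU lockup \(waiting for 0x([0-9A-Fa-f]+) last fence id 0x([0-9A-Fa-f]+)\)"
)


def find_gpu_lockups(log_text):
    """Return (waiting_for, last_fence_id) integer pairs, one per lockup message."""
    return [(int(w, 16), int(l, 16)) for w, l in LOCKUP_RE.findall(log_text)]


def reset_succeeded(log_text):
    """True if the log contains the reset message seen in the 2.6.36.4 report."""
    return "GPU reset succeed" in log_text


# Sample taken from the two messages quoted in this thread.
sample = (
    "GPU lockup (waiting for 0x00002922 last fence id 0x00002921)\n"
    "GPU lockup (waiting for 0x00008D65 last fence id 0x00008D63)\n"
    "radeon 0000:01:00.0: GPU reset succeed\n"
)

for waiting, last in find_gpu_lockups(sample):
    print(f"lockup: waiting for {waiting:#x}, last fence id {last:#x}")
print("reset succeeded:", reset_succeeded(sample))
```

To use it on a real system, save the log first (for example `dmesg > dmesg.txt`) and pass the file contents to `find_gpu_lockups` instead of `sample`.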