Bug 13227

Summary: Potential BUG_ON assertion fails in drm_gem_object_free
Product: Drivers Reporter: Alexey Khoroshilov (khoroshilov)
Component: Video(DRI - Intel)Assignee: Eric Anholt (eric)
Status: RESOLVED CODE_FIX    
Severity: normal CC: chris, eric, gordon.jin, jbarnes
Priority: P1    
Hardware: All   
OS: Linux   
Kernel Version: 2.6.30-rc4 Subsystem:
Regression: No Bisected commit-id:

Description Alexey Khoroshilov 2009-05-02 22:39:22 UTC
drm_gem_object_free() (drivers/gpu/drm/drm_gem.c) requires dev->struct_mutex to be locked before a call.

> drm_gem_object_free(struct kref *kref)
> {
>     struct drm_gem_object *obj = (struct drm_gem_object *) kref;
>     struct drm_device *dev = obj->dev;
> 
>     BUG_ON(!mutex_is_locked(&dev->struct_mutex));
> ...

drm_gem_object_free is mostly called through drm_gem_object_unreference(include/drm/drmP.h).

> static inline void
> drm_gem_object_unreference(struct drm_gem_object *obj)
> {
>         if (obj == NULL)
>                 return;
> 
>         kref_put(&obj->refcount, drm_gem_object_free);
> }

The potential issues may come from the following sources.

1. i915_gem_set_tiling (drivers/gpu/drm/i915/i915_gem_tiling.c)

> int
> i915_gem_set_tiling(struct drm_device *dev, void *data,
>            struct drm_file *file_priv)
> {
>         struct drm_i915_gem_set_tiling *args = data;
>         drm_i915_private_t *dev_priv = dev->dev_private;
>         struct drm_gem_object *obj;
>         struct drm_i915_gem_object *obj_priv;
>
>         obj = drm_gem_object_lookup(dev, file_priv, args->handle);
>         if (obj == NULL)
>             return -EINVAL;
>         obj_priv = obj->driver_private;
>
>         if (!i915_tiling_ok(dev, args->stride, obj->size, args->tiling_mode))
>         {
>               drm_gem_object_unreference(obj);  <----- ISSUE #1: unreference
>               before unlock
>               return -EINVAL;
>         }
>
>         mutex_lock(&dev->struct_mutex);
>         ...
line 317:
>               ret = i915_gem_object_unbind(obj);
>               if (ret != 0) {
>                       WARN(ret != -ERESTARTSYS,
>                            "failed to unbind object for tiling switch");
>                       args->tiling_mode = obj_priv->tiling_mode;
>                       mutex_unlock(&dev->struct_mutex);
>                       drm_gem_object_unreference(obj);  <----- ISSUE #2:
>                       unreference after unlock
>
>                       return ret;

2. i915_gem_pread_ioctl and i915_gem_pwrite_ioctl (drivers/gpu/drm/i915/i915_gem.c)

> int
> i915_gem_pread_ioctl(struct drm_device *dev, void *data,
>                      struct drm_file *file_priv)
> {
>       struct drm_i915_gem_pread *args = data;
>       struct drm_gem_object *obj;
>       struct drm_i915_gem_object *obj_priv;
>       int ret;
>
>       obj = drm_gem_object_lookup(dev, file_priv, args->handle);
>       if (obj == NULL)
>               return -EBADF;
>       obj_priv = obj->driver_private;
>
>       /* Bounds check source.
>        *
>        * XXX: This could use review for overflow issues...
>        */
>       if (args->offset > obj->size || args->size > obj->size ||
>           args->offset + args->size > obj->size) {
>               drm_gem_object_unreference(obj);  <----- ISSUE #3: unreference
>               without lock
>               return -EINVAL;
>       }
>
>       if (i915_gem_object_needs_bit17_swizzle(obj)) {
>               ret = i915_gem_shmem_pread_slow(dev, obj, args, file_priv);
>       } else {
>               ret = i915_gem_shmem_pread_fast(dev, obj, args, file_priv);
>               if (ret != 0)
>                       ret = i915_gem_shmem_pread_slow(dev, obj, args,
>                                                       file_priv);
>       }
>
>       drm_gem_object_unreference(obj);  <----- ISSUE #4: unreference without
>       lock
>
>       return ret;
> }

The same is for i915_gem_pwrite_ioctl().
Comment 1 Gordon Jin 2009-09-17 03:33:05 UTC
wondering if this issue still exist in the latest kernel?
Comment 2 Alexey Khoroshilov 2009-09-17 17:31:53 UTC
(In reply to comment #1)
> wondering if this issue still exist in the latest kernel?

I have checked Linus tree as of today.

The first two issues have been fixed, the third and the fourth ones are still present.

Is a more up-to-date tree available anywhere?
Comment 3 Jesse Barnes 2009-12-02 19:08:11 UTC
Yep, these look like real locking bugs.  Unfortunately just taking the struct mutex in the pread/pwrite ioctls won't work since the callees take those and have ordering requirements vs. the mmap sem.  I guess we'll have to push the bounds checking and final unref into the callees as well.

Eric, I think you messed with this code last to fix the mmap vs. struct mutex inversion?
Comment 4 Jesse Barnes 2009-12-02 19:08:35 UTC
Eric?
Comment 5 Chris Wilson 2010-05-07 16:43:01 UTC
This was cleaned up with the introduction of drm_gem_object_unreference_unlocked() - though we may have missed a few, or even added some more since.