[Symptom]
Processes that try to open a cx88-blackbird driven MPEG device hang.

[Cause]
Nested mutex_lock() calls on the same mutex (which are not allowed) result in a deadlock.

[Details]
There has been recent work on removing BKL (Big Kernel Lock) calls from kernel code (see http://kernelnewbies.org/BigKernelLock). This was not done properly for the cx88-blackbird driver:

Source file: drivers/media/video/cx88/cx88-blackbird.c
Function: int mpeg_open(struct file *file)
Problem: the calls to drv->request_acquire(drv) and drv->request_release(drv) hang because they try to lock a mutex that has already been locked by a previous call to mutex_lock(&dev->core->lock).

...
    static int mpeg_open(struct file *file)
    {
    [...]
        mutex_lock(&dev->core->lock); // MUTEX LOCKED !!!

        /* Make sure we can acquire the hardware */
        drv = cx8802_get_driver(dev, CX88_MPEG_BLACKBIRD);
        if (drv) {
            err = drv->request_acquire(drv); // HANGS !!!
            if(err != 0) {
                dprintk(1,"%s: Unable to acquire hardware, %d\n", __func__, err);
                mutex_unlock(&dev->core->lock);;
                return err;
            }
        }
    [...]

Here's the relevant kernel log extract (Linux version 2.6.38-1-amd64 (Debian 2.6.38-1)):

...
Mar 24 21:25:10 xen kernel: [ 241.472067] INFO: task v4l_id:1000 blocked for more than 120 seconds.
Mar 24 21:25:10 xen kernel: [ 241.478845] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mar 24 21:25:10 xen kernel: [ 241.482412] v4l_id D ffff88006bcb6540 0 1000 1 0x00000000
Mar 24 21:25:10 xen kernel: [ 241.486031] ffff88006bcb6540 0000000000000086 ffff880000000001 ffff88006981c380
Mar 24 21:25:10 xen kernel: [ 241.489694] 0000000000013700 ffff88006be5bfd8 ffff88006be5bfd8 0000000000013700
Mar 24 21:25:10 xen kernel: [ 241.493301] ffff88006bcb6540 ffff88006be5a010 ffff88006bcb6540 000000016be5a000
Mar 24 21:25:10 xen kernel: [ 241.496766] Call Trace:
Mar 24 21:25:10 xen kernel: [ 241.500145] [<ffffffff81321c4a>] ? __mutex_lock_common+0x127/0x193
Mar 24 21:25:10 xen kernel: [ 241.503630] [<ffffffff81321d82>] ? mutex_lock+0x1a/0x33
Mar 24 21:25:10 xen kernel: [ 241.507145] [<ffffffffa09dd155>] ? cx8802_request_acquire+0x66/0xc6 [cx8802]
Mar 24 21:25:10 xen kernel: [ 241.510699] [<ffffffffa0aab7f2>] ? mpeg_open+0x7a/0x1fc [cx88_blackbird]
Mar 24 21:25:10 xen kernel: [ 241.514279] [<ffffffff8123bfb6>] ? kobj_lookup+0x139/0x173
Mar 24 21:25:10 xen kernel: [ 241.517856] [<ffffffffa062d5fd>] ? v4l2_open+0xb3/0xdf [videodev]
Thanks. LKML thread: http://thread.gmane.org/gmane.linux.kernel/1118815
Created attachment 52902
patch that fixes cx88_blackbird driver lock issues

Ben Hutchings provided me with a patch that solved the deadlock during mpeg_open() but left some other lock issues unresolved. I was able to do the remaining work and fix all the issues I have had since kernel 2.6.37. (Tested on a PC with 2 Hauppauge HVR1300 TV cards.) Everything works fine for me now.

The new patch (cx88-2.6.38-fix-driver-deadlocks.patch) is attached ...
I know this isn't a help forum, but I wanted to see if this patch works and I'm not sure how to apply it. I tried this:

    patch cx88-blackbird.c /home/fred/cx88-2.6.38-fix-driver-deadlocks.patch

but it didn't work. Am I doing this wrong?
(In reply to comment #3)
> I know this isn't a help forum but I wanted to see if this patch works but
> I'm
> not sure how to use the patch.
> I tried this
>
> patch cx88-blackbird.c /home/fred/cx88-2.6.38-fix-driver-deadlocks.patch
>
> but it didn't work. Am I doing this wrong?

Hi Andrew,

you need to patch the kernel sources. If you have extracted them into a folder, let's say /usr/src/linux-source-2.6.38/, change into this folder and execute

    patch -p1 < /home/fred/cx88-2.6.38-fix-driver-deadlocks.patch

This should work.

Andi.
I tried patching, but I get an error saying it was previously patched. I'm running Gentoo, so maybe the patch is already applied. I'm wondering if I actually have the same bug as everyone here. The problem I have is that no channels actually get detected on either my HVR-1300 or my WinTV Nova-T, although the scan goes through the motions without any errors. Is this associated with this bug, or is it a different issue completely?
You could try the patch in comment #147 from https://bugs.launchpad.net/mythtv/+bug/439163 with regards to that.
Hi Damien,

Damien Churchill wrote:
> You could try the patch in comment #147 from
> https://bugs.launchpad.net/mythtv/+bug/439163 with regards to that.

The launchpad bug you reference seems to be about something else:
(1) It is from 2009-09-29, way before the BKL conversion.
(2) It is about a card being recognized incorrectly rather than hangs.

Are you sure you have the right bug? If so, have you tried the patches at [1] (which seem to work ok)? Naturally I'd be very interested in problems with the patch series (regressions or locking problems the patch missed).

[1] http://thread.gmane.org/gmane.linux.drivers.video-input-infrastructure/31187
(In reply to comment #7)
> Are you sure you have the right bug? If so, have you tried the
> patches at [1] (which seem to work ok)? Naturally I'd be very
> interested in problems with the patch series (regressions or
> locking problems the patch missed).
>

Sorry, I was replying to the person above me; I should have made that clearer. He has the same card as I do and seems to experience a similar issue to me (no channels being found), and I discovered that patch whilst hunting around earlier.

I was also experiencing this bug until I applied the attached patch. It has fixed the problem, so I can confirm the patch works well on a Hauppauge HVR1300.
Created attachment 53722
patch from http://thread.gmane.org/gmane.linux.drivers.video-input-infrastructure/31187

Ah, thanks, Damien.

I'm attaching a patch that squashes together the fixes from http://thread.gmane.org/gmane.linux.drivers.video-input-infrastructure/31187 for convenience. It's almost the same as Andi's patch but closes a few more races and keeps the reference count as a reference count to avoid breaking when multiple dvb frontends try concurrently to access the device.

The no-channels-found problem is probably somehow related to <https://bugzilla.kernel.org/show_bug.cgi?id=26962>.
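To illustrate what "keeping the reference count as a reference count" can look like, here is a rough userspace sketch in C with POSIX threads. It is not taken from the attached patch: struct fake_core, fake_core_get() and fake_core_put() are hypothetical names, and the idea shown is simply a plain user counter that is only read or changed while the device lock is held, so that the first user acquires the (imaginary) hardware and the last user releases it.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical miniature of a shared device core (assumption, not driver code). */
struct fake_core {
    pthread_mutex_t lock;
    int users;     /* number of current users; protected by lock */
    int hw_active; /* 1 while the imaginary hardware is acquired */
};

static void fake_core_init(struct fake_core *core)
{
    pthread_mutex_init(&core->lock, NULL);
    core->users = 0;
    core->hw_active = 0;
}

/* Register one more user; only the first user acquires the hardware. */
static void fake_core_get(struct fake_core *core)
{
    pthread_mutex_lock(&core->lock);
    if (core->users++ == 0)
        core->hw_active = 1;
    pthread_mutex_unlock(&core->lock);
}

/* Drop one user; only the last user releases the hardware. */
static void fake_core_put(struct fake_core *core)
{
    pthread_mutex_lock(&core->lock);
    assert(core->users > 0); /* an unbalanced put is a caller bug */
    if (--core->users == 0)
        core->hw_active = 0;
    pthread_mutex_unlock(&core->lock);
}
```

Because the counter and the hardware state change under the same lock, two frontends that open the device concurrently cannot both believe they are the first user, and a release by one of them does not switch the hardware off while the other is still using it.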
Excellent, I'll have to rebuild and give your patch a try. I can say that removing the 4 lines as suggested in the Launchpad link (comment 147) does actually allow w_scan, scan and gstreamer all to work with my card now which is a definite improvement. Not having looked at the code I have no idea if it's a bad fix however.
Comment on attachment 52902
patch that fixes cx88_blackbird driver lock issues

Premature; replaced by https://bugzilla.kernel.org/attachment.cgi?id=53722
This and related problems should be fixed by

- 8a317a87 ([media] cx88: protect per-device driver list with device lock)
- 1fe70e96 ([media] cx88: fix locking of sub-driver operations)
- 1d6213ab ([media] cx88: hold device lock during sub-driver initialization)
- 344d6c6b ([media] cx88: protect cx8802_devlist with a mutex)
- 579b2b45 ([media] cx88: gracefully reject attempts to use unregistered cx88-blackbird driver)
- f4bd4be8 ([media] cx88: don't use atomic_t for core->mpeg_users)

which have hit mainline (hoorah!).

For the sake of people using old kernels: the regression was probably introduced in v2.6.37-rc1~64^2~350 (V4L/DVB: cx88: Remove BKL, 2010-09-15) or thereabouts (BKL removal). Anything older than that should be okay.
Many thanks to Jonathan! I'm closing this bug report.
A patch referencing this bug report has been merged in v3.0-rc1:

commit 1fe70e963028f34ba5e32488a7870ff4b410b19b
Author: Jonathan Nieder <jrnieder@gmail.com>
Date:   Sun May 1 06:29:37 2011 -0300

    [media] cx88: fix locking of sub-driver operations
So in which kernel version will this be fixed? 2.6.39?
(In reply to comment #15)
> So in which kernel version will this be fixed? 2.6.39?

Both 2.6.38 and 2.6.39 got the fix:
http://lwn.net/Articles/445974/
http://lwn.net/Articles/445972/