Bug 199323
Summary: | DVBSky USB TV tuners do not work since 4.10 due to mutex issues | ||
---|---|---|---|
Product: | Drivers | Reporter: | Olli Salonen (olli.salonen) |
Component: | Video(Other) | Assignee: | drivers_video-other |
Status: | RESOLVED CODE_FIX | ||
Severity: | normal | CC: | berndkuhls, forum, forum, guido.aulisi, jahutchinson99, kernel, lbm, n_alp, o.freyermuth, olli.salonen, peterz, sergey79, stefan.bruens, wolfgang |
Priority: | P1 | ||
Hardware: | All | ||
OS: | Linux | ||
Kernel Version: | 4.10 | Subsystem: | |
Regression: | Yes | Bisected commit-id: | |
Attachments: |
Debug logging for I2C
media: dvbsky: use a single mutex and state buffers for all R/W ops mutex fix syslog before reboot m88ds3103-serialize-reset-messages.patch |
Description
Olli Salonen
2018-04-08 17:36:40 UTC
The diseqc command failures are due to I2C read failures. When the problem starts it can be seen that any I2C read returns "07 ff 49 00" or a subset of that depending on how many bytes are read. In the log snippet below it can be seen that after the request: Apr 09 10:33:46 nucserver kernel: dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: >>> 08 68 02 03 11 Apr 09 10:33:46 nucserver kernel: dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: <<< 07 Any I2C response back starts with 07. Apr 09 10:33:46 nucserver kernel: dvb_core:dvb_frontend_swzigzag_autotune: usb 1-1: dvb_frontend_swzigzag_autotune: drift:0 inversion:0 auto_step:2 auto_sub_step:0 started_auto_step:0 Apr 09 10:33:46 nucserver kernel: m88ds3103:m88ds3103_set_frontend: m88ds3103 4-0068: delivery_system=5 modulation=0 frequency=2119000 symbol_rate=2960000 inversion=2 pilot=2 rolloff=0 Apr 09 10:33:46 nucserver kernel: dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: >>> 08 68 02 07 80 Apr 09 10:33:46 nucserver kernel: dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: <<< 08 Apr 09 10:33:46 nucserver kernel: dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: >>> 09 01 01 68 3f Apr 09 10:33:46 nucserver kernel: dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: <<< 08 ff Apr 09 10:33:46 nucserver kernel: dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: >>> 08 68 02 03 11 Apr 09 10:33:46 nucserver kernel: dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: <<< 07 Apr 09 10:33:46 nucserver kernel: dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: >>> 09 01 01 60 3d Apr 09 10:33:46 nucserver kernel: dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: <<< 07 ff Apr 09 10:33:46 nucserver kernel: dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: >>> 08 68 02 07 00 Apr 09 10:33:46 nucserver kernel: dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: <<< 07 Apr 09 10:33:46 nucserver kernel: ts2020:ts2020_set_params: frequency=2119500 offset=500 f_vco_khz=4238000 pll_n=2198 div_ref=14 div_out=2 Apr 09 10:33:46 nucserver kernel: dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: >>> 08 68 02 03 11 Apr 09 10:33:46 nucserver kernel: dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: <<< 07 I've now confirmed that I can get a 4.10 kernel with working DVBSky S960 by reverting the following 4 patches: 549bdd3 Revert "locking/mutex: Add lock handoff to avoid starvation" 3210f31 Revert "locking/mutex: Restructure wait loop" 418a170 Revert "locking/mutex: Simplify some ww_mutex code in __mutex_lock_common()" 0b1fb8f Revert "locking/mutex: Enable optimistic spinning of woken waiter" c470abd Linux 4.10 Created attachment 275433 [details]
Debug logging for I2C
This bug is also present in kernel 4.9.35 and in 4.4.50. I downgraded 4.9.35 to 4.4.50 with same error, after approx 20 tuning requests all subsequent tuning requests fail. Upgraded kernel to 4.9.13, error occurred but after many more tuning requests. For me the error occurs but downgrading does not reproduce the error. This bug is still present in Kernel 4.17.2 and renders the widely spread DVBSky 960 v2 useless in Linux. I’m not sure if I’m skilled enough to revert the four patches and rebuild the kernel but I will try. I hope this will be fixed in a future kernel release. Probably the same issue here after upgrading from kernel 4.9.80 to 4.14.62. After the signal is lost (e.g. due to bad weather) and when it comes back - the tuner (DVBSky S960) does not lock the signal anymore. A reboot is required. I was able to reproduce this issue multiple times by blocking the signal from the LNB. Never had this issue with kernel 4.9. I also have this same problem with the DVBSky S960 USB adapter after upgrading to kernel v4.9.111. After a number of tuning requests the device locks up with timeout (-110) errors repeating in dmesg: [934966.146448] m88ds3103 1-0068: i2c wr failed=-110 [934968.283100] usb 1-1: dvb_usb_v2: usb_bulk_msg() failed=-110 [934968.283107] usb 1-1: failed=-110 As reported by others, a reboot is required in order to clear the problem. In kernel 4.18.0 dvbsky.c driver is completely broken - commit 7d95fb7 has to be reverted to make the card work. I tried backporting commit 7d95fb7 to a 4.9 kernel (just out of curiosity) and had the same problem as sergey79@gmail.com - my DVBSky S960 device did not work whatsoever! (In reply to sergey79 from comment #7) > In kernel 4.18.0 dvbsky.c driver is completely broken - commit 7d95fb7 has > to be reverted to make the card work. Confirmed, same problem with 4.18.7, fixed by reverting 7d95fb7. Same here, commit 7d95fb7 hardbreaks the card :-(. I tried my luck on the mailing list, to no avail: https://www.spinics.net/lists/linux-media/msg140839.html https://www.spinics.net/lists/linux-media/msg141067.html What's the way to contact the linux-media developers? It seems both bugzilla and mailing list are not monitored. The tuning/mutex issue that was introduced with kernel 4.10 seem to be fixed in current kernel 4.19. Can anyone confirm? Commit 7d95fb7 still has to be reverted though. (In reply to Oliver Freyermuth from comment #11) > I tried my luck on the mailing list, to no avail: > https://www.spinics.net/lists/linux-media/msg140839.html > https://www.spinics.net/lists/linux-media/msg141067.html > What's the way to contact the linux-media developers? > It seems both bugzilla and mailing list are not monitored. I found this thread, which is where the troublesome commit 7d95fb7 was worked on: https://www.spinics.net/lists/linux-media/msg133995.html Not sure whether replying to that thread would get this issue on the developers radar. On Wed, Oct 03, 2018 at 03:35:59PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote: > Not sure whether replying to that thread would get this issue on the > developers > radar. Going by this one: https://www.spinics.net/lists/linux-media/msg133161.html Since you all seem to be able to reproduce, could you try with something like the below on top? It is unsafe, it can crash your kernel (if it compiles at all of course), but if it runs it should give us some clues as to who is holding the mutex and how long it takes to acquire it. Output should end up in: /sys/kernel/debug/tracing/trace --- diff --git a/drivers/media/usb/dvb-usb-v2/dvbsky.c b/drivers/media/usb/dvb-usb-v2/dvbsky.c index 1aa88d94e57f..ccbd80c1ae0b 100644 --- a/drivers/media/usb/dvb-usb-v2/dvbsky.c +++ b/drivers/media/usb/dvb-usb-v2/dvbsky.c @@ -51,7 +51,12 @@ static int dvbsky_usb_generic_rw(struct dvb_usb_device *d, int ret; struct dvbsky_state *state = d_to_priv(d); - mutex_lock(&d->usb_mutex); + if (!mutex_trylock(&d->usb_mutex)) { + struct task_struct *p = __mutex_owner(&d->usb_mutex); + trace_printk("usb_mutex owned by: %d/%s\n", p->pid, p->comm); + mutex_lock(&d->usb_mutex); + trace_printk("usb_mutex acquired\n"); + } if (wlen != 0) memcpy(state->obuf, wbuf, wlen); (In reply to peterz from comment #14) > Since you all seem to be able to reproduce, could you try with something > like the below on top? Sorry to ask before trying it out, but - on top of a "vanilla" kernel or after reverting commit 7d95fb7 ? If I read 7d95fb7 correctly, it removed the call to dvbsky_usb_generic_rw from the stream_ctrl code, and essentially keeps the usb_mutex locked for the time the stream_mutex was locked previously. I must even wonder if the issue caused by 7d95fb7 is a mutex issue, or whether the difference is that dvbsky_usb_generic_rw always wrote to the "global" state-buffers obuf and ibuf, which dvb_usbv2_generic_rw_locked naturally does not do anymore. (In reply to peterz from comment #14) > Since you all seem to be able to reproduce, could you try with something > like the below on top? > > Output should end up in: /sys/kernel/debug/tracing/trace > Ok, tested that patch against a "vanilla" kernel 4.18, i.e. with 7d95fb7 applied. Card does not tune, as is the case whenever 7d95fb7 is applied. Nothing in trace log, nothing in kernel log. Do I need to activate tracing somehow? Sorry for my ignorance in case that's obvious :-(. On Wed, Oct 03, 2018 at 08:47:38PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote: > https://bugzilla.kernel.org/show_bug.cgi?id=199323 > > --- Comment #16 from Oliver Freyermuth (o.freyermuth@googlemail.com) --- > (In reply to peterz from comment #14) > > Since you all seem to be able to reproduce, could you try with something > > like the below on top? > > > > Output should end up in: /sys/kernel/debug/tracing/trace > > > > Ok, tested that patch against a "vanilla" kernel 4.18, i.e. with 7d95fb7 > applied. > Card does not tune, as is the case whenever 7d95fb7 is applied. Nothing in > trace log, nothing in kernel log. Do I need to activate tracing somehow? > Sorry for my ignorance in case that's obvious :-(. Hurm.. IIRC all you need is CONFIG_TRACING=y for trace_printk() to work. So if you have the ftrace debugfs stuff, it _should_ all work. Also, you should get a big 'scary' message in dmesg when trace_printk() is active, that's usually a good sign that it works. And yes, on top of vanilla, as we want to know what is preventing things from working there. (In reply to peterz from comment #17) > Hurm.. IIRC all you need is CONFIG_TRACING=y for trace_printk() to work. > So if you have the ftrace debugfs stuff, it _should_ all work. > CONFIG_TRACING=y CONFIG_TRACING_SUPPORT=y are on, and /sys/kernel/debug/tracing/tracing_on contains "1". Sadly, no output, and no scary message in dmesg. > Also, you should get a big 'scary' message in dmesg when trace_printk() > is active, that's usually a good sign that it works. > > And yes, on top of vanilla, as we want to know what is preventing things > from working there. Let's presume it's something we won't see this way. For example, what about the changed timing? Before 7d95fb7, the usb-mutex was locked and unlocked twice in dvbsky_stream_ctrl via the call to dvbsky_usb_generic_rw, in addition to locking the stream-mutex. Now, it's locked only once. That should change the timing between the two calls, no? Since I have no experience with kernel programming, I don't have a feeling how much it will affect the timing, though. Maybe it's insignificant? Any other idea on how to debug it? Argh! My bad. I did only "modules_install" and presumed it would depend on "modules", but it does not :-(. Retrying again now, let's hope we get trace output now. Ok, that did the trick. Scary message seen. This is what I got in trace: # TASK-PID CPU# |||| TIMESTAMP FUNCTION # | | | |||| | | kworker/1:1-67 [001] .... 963.283565: dvbsky_usb_generic_rw: usb_mutex owned by: 8585/kdvb-ad-0-fe-0 kdvb-ad-0-fe-0-8585 [000] .... 963.287983: dvbsky_usb_generic_rw: usb_mutex owned by: 67/kworker/1:1 kworker/1:1-67 [001] .... 963.287990: dvbsky_usb_generic_rw: usb_mutex acquired kdvb-ad-0-fe-0-8585 [000] .... 963.288234: dvbsky_usb_generic_rw: usb_mutex acquired kworker/1:1-67 [001] .... 963.595574: dvbsky_usb_generic_rw: usb_mutex owned by: 8585/kdvb-ad-0-fe-0 kdvb-ad-0-fe-0-8585 [000] .... 963.600189: dvbsky_usb_generic_rw: usb_mutex owned by: 67/kworker/1:1 kworker/1:1-67 [001] .... 963.600205: dvbsky_usb_generic_rw: usb_mutex acquired kdvb-ad-0-fe-0-8585 [000] .... 963.600367: dvbsky_usb_generic_rw: usb_mutex acquired kworker/1:1-67 [001] .... 963.907570: dvbsky_usb_generic_rw: usb_mutex owned by: 8585/kdvb-ad-0-fe-0 kdvb-ad-0-fe-0-8585 [000] .... 963.912435: dvbsky_usb_generic_rw: usb_mutex owned by: 67/kworker/1:1 kworker/1:1-67 [001] .... 963.912454: dvbsky_usb_generic_rw: usb_mutex acquired kdvb-ad-0-fe-0-8585 [000] .... 963.912690: dvbsky_usb_generic_rw: usb_mutex acquired kworker/1:1-67 [001] .... 964.523534: dvbsky_usb_generic_rw: usb_mutex owned by: 8604/kdvb-ad-0-fe-0 kworker/1:1-67 [001] .... 964.524490: dvbsky_usb_generic_rw: usb_mutex acquired kdvb-ad-0-fe-0-8604 [005] .... 964.524499: dvbsky_usb_generic_rw: usb_mutex owned by: 67/kworker/1:1 kdvb-ad-0-fe-0-8604 [005] .... 964.524635: dvbsky_usb_generic_rw: usb_mutex acquired kworker/1:1-67 [001] .... 987.930957: dvbsky_usb_generic_rw: usb_mutex owned by: 8584/w_scan kworker/1:1-67 [001] .... 987.931394: dvbsky_usb_generic_rw: usb_mutex acquired This was when running a scan with w_scan. Does that help? Mauro has just submitted the revert of 7d95fb7 upstream, including 4.18 stable. Apparently, this caused similar issues on RPi, even though in my case it was "standard" Intel x86_64 hardware with an xhci host controller. He presumed the failure might have been caused by changes in the USB subsystem, since the patch initially worked for him. I'll add links to the mail thread as soon as the mails made it to the archive at spinics later. That took long. I notified him on IRC some 2 months ago about this issue. And if this is fix, there is still another older bug present which lets the device stop working after some hours: https://forum.libreelec.tv/thread/12452-problems-with-dvbsky-technotrend-usb-dvb-s2-tuners-since-le8/ I can still confirm this bug. (In reply to DocMAX from comment #22) > there is still another older bug present which lets the > device stop working after some hours This seem to be fixed in kernel 4.19 (probably with latest mutex.c commits). I'm running 4.19-rc6 and I was unable to reproduce that issue (it was easy to reproduce with 4.14 and 4.18). Here the promised link to the mail archives: https://www.spinics.net/lists/linux-media/msg141418.html (In reply to sergey79 from comment #23) > This seem to be fixed in kernel 4.19 (probably with latest mutex.c commits). > I'm running 4.19-rc6 and I was unable to reproduce that issue (it was easy > to reproduce with 4.14 and 4.18). I spoke too soon. Today it stopped working after 3 days. Going back to kernel 4.9... The debug tracing attached to this thread suggests that the device malfunctions shortly after receiving an extra {10} message in-between the {37 00 00}, {36 03 00} message pair: dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: >>> 37 00 00 dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: >>> 10 dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: <<< ff ff dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: >>> 36 03 00 ... m88ds3103:m88ds3103_diseqc_send_master_cmd: m88ds3103 4-0068: diseqc tx timeout m88ds3103:m88ds3103_diseqc_send_master_cmd: m88ds3103 4-0068: failed=-110 Looks to me like this is the bug Mauro was trying to resolve in 7d95fb7, i.e. by removing stream_mutex, and using only the usb_mutex inside dvbsky_stream_ctrl i.e. the same mutex used elsewhere in the driver, to ensure correct serialization. However, as we know, this patch appears to render the device unusable for several users including myself. I've done a couple of minor re-writes to 7d95fb7; most notably so it uses the state buffers rather than passing along a static variable. So far, I've only managed to verify that the device works correctly with this patch applied. I performed several scans & channel tuning attempts and everything worked as normal. However, it can take several days/weeks before the -110 timeout errors occur on my x86_64 machine, so I would appreciate if others who are able to reproduce the issue more readily could test the patch attached to this thread. Note: This should be applied after reverting 7d95fb7. Created attachment 279005 [details]
media: dvbsky: use a single mutex and state buffers for all R/W ops
(In reply to James from comment #27) > Created attachment 279005 [details] > media: dvbsky: use a single mutex and state buffers for all R/W ops That patch didn't fix the issue for me (I also reported this on libreelec forum). The quickest way to reproduce this is to try to tune to transponders when there is no signal from LNB. Otherwise it could take days. So if there is no fix for the mutex thing, how can i fix the stop working problem? If i plug out and in the usb cable the device works. So is a "usb port reset" possible on command line whenever the device freezes? OK, finally, at least on Arch-Linux, it's fixed with kernel 4.19.2. Created attachment 279899 [details]
mutex fix
Comment on attachment 279899 [details]
mutex fix
I found a solution for the mutex issue.
After applying that patch the random tuner failures are gone. Tested with DVBSky S960, kernel 4.19.7. (In reply to sergey79 from comment #32) > Comment on attachment 279899 [details] > mutex fix > > I found a solution for the mutex issue. IMHO that patch isn't an acceptable solution, it disables lock handoff (commit 9d659ae14b545c4296e812c70493bfdc999b5c1c), which I think is not acceptable upstream. It was a fast way to revert that commit (see [0]) and test if it's still causing [1]. I think using single mutex was the right way to patch all handoff issues, but it seems not to work here. [0]: https://bugzilla.kernel.org/show_bug.cgi?id=195471#c27 [1]: https://bugzilla.kernel.org/show_bug.cgi?id=195471 (In reply to Guido Aulisi from comment #34) > IMHO that patch isn't an acceptable solution I agree but it's better than nothing. At least I can use new kernels now. > I think using single mutex was the right way to patch all handoff issues, > but it seems not to work here. That is right, the original "single mutex" commit that appeared in kernel 4.18 completely broke the driver. With the patch from James (media: dvbsky: use a single mutex and state buffers for all R/W ops) the driver works but it didn't fix the original mutex issue. Created attachment 280127 [details] syslog before reboot (In reply to sergey79 from comment #32) > Comment on attachment 279899 [details] > mutex fix > > I found a solution for the mutex issue. While this patch overall worked great during the last days the machine rebooted a couple of minutes ago, syslog is attached, kernel 4.19.12 x86_64. (In reply to Bernd Kuhls from comment #36) > While this patch overall worked great during the last days the machine > rebooted a couple of minutes ago, syslog is attached, kernel 4.19.12 x86_64. I'm using kernel 4.19 with this patch for 2 weeks, running VDR server 24/7. No issues so far. Created attachment 280407 [details]
m88ds3103-serialize-reset-messages.patch
I managed to find some time to run this issue to ground, and have developed the following patch which should resolve issue introduced by the following commit:
9d659ae: ("locking/mutex: Add lock handoff to avoid starvation")
Would be interested to know whether this patch gives others the same success it gives me, before I submit for upstream consideration.
Whilst debugging, I also spotted a race condition in the driver which has been around for a few years, but that's a separate issue for another day and another thread. In the meantime, this should hopefully get the driver as stable as before commit 9d659ae.
(In reply to James from comment #38) > Created attachment 280407 [details] > m88ds3103-serialize-reset-messages.patch I tested your patch with kernel 4.19.14 and I can confirm that it fixed the issue! (In reply to James from comment #38) > Created attachment 280407 [details] > m88ds3103-serialize-reset-messages.patch Same here, while sometimes I even had failure on the first tune, it now seems back to rock stability. Tested with 4.20.1. Thanks a lot! (In reply to Oliver Freyermuth from comment #40) > (In reply to James from comment #38) > > Created attachment 280407 [details] > > m88ds3103-serialize-reset-messages.patch > > Same here, while sometimes I even had failure on the first tune, it now > seems back to rock stability. > Tested with 4.20.1. > > Thanks a lot! Thanks for the confirmation. I have submitted the patch for the consideration of the upstream devs: https://patchwork.kernel.org/patch/10761423/ (In reply to James from comment #38) > Created attachment 280407 [details] > m88ds3103-serialize-reset-messages.patch Tested with 4.20.1 and had no issues in the last four days. (In reply to James from comment #26) > The debug tracing attached to this thread suggests that the device > malfunctions shortly after receiving an extra {10} message in-between the > {37 00 00}, {36 03 00} message pair: > > dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: >>> 37 00 > 00 > dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: >>> 10 > dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: <<< ff ff > dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: >>> 36 03 > 00 "10" is the remote control poll. Would be interesting if this still works after Un-reverting and using "options dvb_usb_dvbsky disable_rc=1" ... (In reply to Stefan Brüns from comment #43) > (In reply to James from comment #26) > > The debug tracing attached to this thread suggests that the device > > malfunctions shortly after receiving an extra {10} message in-between the > > {37 00 00}, {36 03 00} message pair: > > > > dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: >>> 37 00 > > 00 > > dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: >>> 10 > > dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: <<< ff ff > > dvb_usb_v2:dvb_usb_v2_generic_io: usb 1-1: dvb_usb_v2_generic_io: >>> 36 03 > > 00 > > "10" is the remote control poll. Would be interesting if this still works > after Un-reverting and using "options dvb_usb_dvbsky disable_rc=1" ... I had also spotted this and did indeed try setting disable_rc=1 which did not fix the issue. The capture you are looking at relates to my earlier patch attempt to remove the stream_mutex, and use only the usb_mutex; which proved not to be the issue. The real issue is described in the comments I included in the patch, where i also included a capture that is more accurately indicitive of the problem.... i.e. In the following capture: - thread #1325 is running m88ds3103_set_frontend - thread #42 is running ts2020_stat_work a> [1325] usb 1-1: dvb_usb_v2_generic_io: >>> 08 68 02 07 80 [1325] usb 1-1: dvb_usb_v2_generic_io: <<< 08 [42] usb 1-1: dvb_usb_v2_generic_io: >>> 09 01 01 68 3f [42] usb 1-1: dvb_usb_v2_generic_io: <<< 08 ff [42] usb 1-1: dvb_usb_v2_generic_io: >>> 08 68 02 03 11 [42] usb 1-1: dvb_usb_v2_generic_io: <<< 07 [42] usb 1-1: dvb_usb_v2_generic_io: >>> 09 01 01 60 3d [42] usb 1-1: dvb_usb_v2_generic_io: <<< 07 ff b> [1325] usb 1-1: dvb_usb_v2_generic_io: >>> 08 68 02 07 00 [1325] usb 1-1: dvb_usb_v2_generic_io: <<< 07 [42] usb 1-1: dvb_usb_v2_generic_io: >>> 08 68 02 03 11 [42] usb 1-1: dvb_usb_v2_generic_io: <<< 07 [42] usb 1-1: dvb_usb_v2_generic_io: >>> 09 01 01 60 21 [42] usb 1-1: dvb_usb_v2_generic_io: <<< 07 ff [42] usb 1-1: dvb_usb_v2_generic_io: >>> 08 68 02 03 11 [42] usb 1-1: dvb_usb_v2_generic_io: <<< 07 [42] usb 1-1: dvb_usb_v2_generic_io: >>> 09 01 01 60 66 [42] usb 1-1: dvb_usb_v2_generic_io: <<< 07 ff [1325] usb 1-1: dvb_usb_v2_generic_io: >>> 08 68 02 03 11 [1325] usb 1-1: dvb_usb_v2_generic_io: <<< 07 [1325] usb 1-1: dvb_usb_v2_generic_io: >>> 08 60 02 10 0b [1325] usb 1-1: dvb_usb_v2_generic_io: <<< 07 Two i2c messages are sent to perform a reset in m88ds3103_set_frontend: a. 0x07, 0x80 b. 0x07, 0x00 However, as shown in the capture, the regmap mutex is being handed over to another thread (ts2020_stat_work in this example) in between these two messages. From here, the device responds to every i2c message with an 07 message, and will only return to normal operation following a power cycle. I managed to capture the issue several times with similar debug traces to that above. They all followed this same pattern. The ts2020_stat_work thread wakes up every 2000ms to gather signal strength stats from the device. From my debug captures i found that the problem occurs if that tread happens to wake up whilst another thread is in m88ds3103_set_frontend (per example above). The patch resolves this by using regmap_multi_reg_write to group the two reset messages, ensuring both are processed before the regmap mutex/i2c gate is unlocked. |