Bug 207219 - [BISECTED] Sony Vaio laptop built-in ricoh webcam no longer found on Fedora 31 with 5.4.11 kernel
Summary: [BISECTED] Sony Vaio laptop built-in ricoh webcam no longer found on Fedora 3...
Status: NEW
Alias: None
Product: Drivers
Classification: Unclassified
Component: USB (show other bugs)
Hardware: x86-64 Linux
: P1 normal
Assignee: Default virtual assignee for Drivers/USB
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2020-04-12 23:35 UTC by William Bader
Modified: 2020-05-02 02:57 UTC (History)
3 users (show)

See Also:
Kernel Version: 5.5.15
Subsystem:
Regression: Yes
Bisected commit-id:


Attachments
/sys/kernel/debug/usb/usbmon/1u trace on bad kernet (24.46 KB, text/plain)
2020-04-13 17:54 UTC, William Bader
Details
trace on good kernel (33.67 KB, text/plain)
2020-04-13 17:54 UTC, William Bader
Details
5.6.6 config (213.98 KB, text/plain)
2020-04-27 17:26 UTC, William Bader
Details
logs from /sys/kernel/debug/usb/usbmon/3u and 4u (83.79 KB, application/x-bzip)
2020-04-28 22:17 UTC, William Bader
Details
usbmon logs (14.89 KB, application/x-bzip)
2020-04-30 23:26 UTC, William Bader
Details
usbmon log (35.69 KB, text/plain)
2020-05-01 21:55 UTC, William Bader
Details

Description William Bader 2020-04-12 23:35:34 UTC
Linux no longer finds the built-in Ricoh webcam on my Sony Vaio laptop.

Let me know if there is anything that I can try. Steve on RedHat's bugzilla showed me how to build kernels from source and then test patches.

/var/log/messages shows the lines below, and Linux does not create /dev/video*.

Mar 30 15:00:58 scslaptop37 kernel: usb 1-1.3: config 247 has too many interfaces: 120, using maximum allowed: 32
Mar 30 15:00:58 scslaptop37 kernel: usb 1-1.3: config 247 descriptor has 1 excess byte, ignoring
Mar 30 15:00:58 scslaptop37 kernel: usb 1-1.3: config 247 has 0 interfaces, different from the descriptor's value: 120
Mar 30 15:00:58 scslaptop37 kernel: usb 1-1.3: New USB device found, idVendor=05ca, idProduct=18c0, bcdDevice= 7.32
Mar 30 15:00:58 scslaptop37 kernel: usb 1-1.3: New USB device strings: Mfr=1, Product=2, SerialNumber=0
Mar 30 15:00:58 scslaptop37 kernel: usb 1-1.3: Product: USB2.0 Camera
Mar 30 15:00:58 scslaptop37 kernel: usb 1-1.3: Manufacturer: Ricoh Company Ltd.
Mar 30 15:00:58 scslaptop37 kernel: usb 1-1.3: can't set config #247, error -32

I did a kernel bisection and identified the 5.4.10 commit [7cbdf96cda1fbffb17ec26ea65e1fe63c9aed430] usb: missing parentheses in USE_NEW_SCHEME
The change is just one line:
drivers/usb/core/hub.c
-#define USE_NEW_SCHEME(i, scheme)      ((i) / 2 == (int)scheme)
+#define USE_NEW_SCHEME(i, scheme)      ((i) / 2 == (int)(scheme))
The macro is called as
  USE_NEW_SCHEME(retry, old_scheme_first_port || old_scheme_first || quick_enumeration)
so the parentheses change it from using the new scheme to using the old scheme if old_scheme_first or quick_enumeration is set. That seems to make sense, but the change doesn't seem to work for the webcam.

There is a discussion at https://bugzilla.redhat.com/show_bug.cgi?id=1818952

I tried applying the patches at https://bugzilla.kernel.org/show_bug.cgi?id=203419#c27 (testpatch that doesn't clear TT buffer after protocol STALL) but they didn't help.

Here are the results of booting unpatched kernels with
dyndbg='module xhci_hcd =pmf ; module usbcore =pmf'
and running
journalctl --no-hostname -k | grep -v audit | grep -C1 -n 'usb 1-1.3'

Good kernel (5.4.10 through 093d658a06cd1831c629ceeee207572895c1a872)

943-Apr 12 00:10:37 kernel: psmouse serio1: synaptics: queried max coordinates: x [..5692], y [..4714]
944:Apr 12 00:10:37 kernel: usb 1-1.3: new high-speed USB device number 4 using ehci-pci
945-Apr 12 00:10:37 kernel: psmouse serio1: synaptics: Touchpad model: 1, fw: 7.4, id: 0x1c0b1, caps: 0xd04731/0xa40000/0xa0000/0x0, board id: 0, fw id: 631103
946:Apr 12 00:10:37 kernel: usbcore:usb_parse_configuration: usb 1-1.3: skipped 1 descriptor after configuration
947:Apr 12 00:10:37 kernel: usbcore:usb_parse_interface: usb 1-1.3: skipped 5 descriptors after interface
948:Apr 12 00:10:37 kernel: usbcore:usb_parse_endpoint: usb 1-1.3: skipped 1 descriptor after endpoint
949:Apr 12 00:10:37 kernel: usbcore:usb_parse_interface: usb 1-1.3: skipped 17 descriptors after interface
950:Apr 12 00:10:37 kernel: usbcore:usb_get_langid: usb 1-1.3: default language 0x0409
951:Apr 12 00:10:37 kernel: usbcore:usb_new_device: usb 1-1.3: udev 4, busnum 1, minor = 3
952:Apr 12 00:10:37 kernel: usb 1-1.3: New USB device found, idVendor=05ca, idProduct=18c0, bcdDevice= 7.32
953:Apr 12 00:10:37 kernel: usb 1-1.3: New USB device strings: Mfr=1, Product=2, SerialNumber=0
954:Apr 12 00:10:37 kernel: usb 1-1.3: Product: USB2.0 Camera
955:Apr 12 00:10:37 kernel: usb 1-1.3: Manufacturer: Ricoh Company Ltd.
956:Apr 12 00:10:37 kernel: usbcore:usb_probe_device: usb 1-1.3: usb_probe_device
957:Apr 12 00:10:37 kernel: usbcore:usb_choose_configuration: usb 1-1.3: configuration #1 chosen from 1 choice
958:Apr 12 00:10:37 kernel: usbcore:usb_set_configuration: usb 1-1.3: adding 1-1.3:1.0 (config #1, interface 0)
959:Apr 12 00:10:37 kernel: usbcore:usb_set_configuration: usb 1-1.3: adding 1-1.3:1.1 (config #1, interface 1)
960-Apr 12 00:10:37 kernel: input: SynPS/2 Synaptics TouchPad as /devices/platform/i8042/serio1/input/input4
--
1142-Apr 12 00:10:43 kernel: tun: Universal TUN/TAP device driver, 1.6
1143:Apr 12 00:10:45 kernel: usbcore:usb_port_suspend: usb 1-1.3: usb auto-suspend, wakeup 0
1144-Apr 12 00:10:47 kernel: [drm] enabling PCIE gen 2 link speeds, disable with radeon.pcie_gen2=0
--
1222-Apr 12 00:11:08 kernel: usbcore:hub_event: hub 1-1:1.0: state 7 ports 6 chg 0000 evt 0000
1223:Apr 12 00:11:08 kernel: usbcore:usb_port_resume: usb 1-1.3: usb auto-resume
1224:Apr 12 00:11:09 kernel: usbcore:wait_for_connected: usb 1-1.3: Waited 0ms for CONNECT
1225:Apr 12 00:11:09 kernel: usbcore:finish_port_resume: usb 1-1.3: finish resume
1226:Apr 12 00:11:11 kernel: usbcore:usb_port_suspend: usb 1-1.3: usb auto-suspend, wakeup 0
1227-Apr 12 00:11:11 kernel: usbcore:hub_suspend: hub 1-1:1.0: hub_suspend

bad kernel (Fedora distributed 5.5.15-200.fc31.x86_64):

931-Apr 12 00:29:30 kernel: Code: 1f 80 00 00 00 00 e8 9b c2 ff ff 48 8d bd 38 ff ff ff be 3d 00 00 00 48 89 85 28 ff ff ff 48 89 85 38 ff ff ff e8 2c f6 ff ff <80> 38 69 0f 85 2b 02 00 00 80 78 01 70 0f 85 21 02 00 00 0f b6 58
932:Apr 12 00:29:30 kernel: usb 1-1.3: new high-speed USB device number 4 using ehci-pci
933:Apr 12 00:29:30 kernel: usb 1-1.3: config 247 has too many interfaces: 120, using maximum allowed: 32
934:Apr 12 00:29:30 kernel: usb 1-1.3: config 247 descriptor has 1 excess byte, ignoring
935:Apr 12 00:29:30 kernel: usb 1-1.3: config 247 has 0 interfaces, different from the descriptor's value: 120
936:Apr 12 00:29:30 kernel: usbcore:usb_parse_configuration: usb 1-1.3: skipped 7 descriptors after configuration
937:Apr 12 00:29:30 kernel: usbcore:usb_get_langid: usb 1-1.3: default language 0x0409
938:Apr 12 00:29:30 kernel: usbcore:usb_new_device: usb 1-1.3: udev 4, busnum 1, minor = 3
939:Apr 12 00:29:30 kernel: usb 1-1.3: New USB device found, idVendor=05ca, idProduct=18c0, bcdDevice= 7.32
940:Apr 12 00:29:30 kernel: usb 1-1.3: New USB device strings: Mfr=1, Product=2, SerialNumber=0
941:Apr 12 00:29:30 kernel: usb 1-1.3: Product: USB2.0 Camera
942:Apr 12 00:29:30 kernel: usb 1-1.3: Manufacturer: Ricoh Company Ltd.
943:Apr 12 00:29:30 kernel: usbcore:usb_probe_device: usb 1-1.3: usb_probe_device
944:Apr 12 00:29:30 kernel: usbcore:usb_choose_configuration: usb 1-1.3: configuration #247 chosen from 1 choice
945:Apr 12 00:29:30 kernel: usb 1-1.3: can't set config #247, error -32
946-Apr 12 00:29:30 kernel: psmouse serio1: synaptics: queried max coordinates: x [..5692], y [..4714]
Comment 1 Alan Stern 2020-04-13 17:16:14 UTC
You should collect usbmon traces showing what happens when the camera is initialized, under both a good and a bad kernel.  The best way to do this is probably:

  cat /sys/kernel/debug/usb/usbmon/1u >OUTPUT-FILENAME &
  echo 0 >/sys/bus/usb/devices/1-1/bConfigurationValue
  echo 1 >/sys/bus/usb/devices/1-1/bConfigurationValue
  ... wait a few seconds ...
  fg
  ^C

It's obvious that the camera is sending nonsense data to the system; maybe the trace will help show why.
Comment 2 William Bader 2020-04-13 17:54:02 UTC
Created attachment 288423 [details]
/sys/kernel/debug/usb/usbmon/1u trace on bad kernet

I made a trace on good and bad kernels.
Let me know if I captured enough data.
My boot command line has 'rhgb quiet'.
If that limits the data, I can remove it and do the capture again.
Comment 3 William Bader 2020-04-13 17:54:33 UTC
Created attachment 288425 [details]
trace on good kernel
Comment 4 Alan Stern 2020-04-14 15:19:46 UTC
In fact, I think your problem isn't caused by commit 7cbdf96cda1f at all.  It's caused by commit bd0e6c9614b9 ("usb: hub: try old enumeration scheme first for high speed devices").  The bisection settled on 7cbdf96cda1f because the bug it fixed was hiding your real problem.

Try starting with a vanilla (bad) kernel and reverting commit bd0e6c9614b9; let's see if that fixes the camera problem.

Incidentally, the usbmon traces didn't help very much.  The trace under the bad kernel showed that the camera had already gotten messed up before the trace began, probably when the camera was probed during boot.  The camera bug (and it is a bug in the camera, not in the kernel) showed up in the trace in two ways: The camera connected to the USB bus at full speed instead of high speed, and when it was connected it didn't respond to any requests.
Comment 5 William Bader 2020-04-14 21:59:23 UTC
Thanks for looking at it.

>Try starting with a vanilla (bad) kernel and reverting commit bd0e6c9614b9;
>let's see if that fixes the camera problem.

Yes, that fixed the camera problem.

What is the next step?

>it is a bug in the camera, not in the kernel

It wouldn't surprise me.

Here is what I did:

>$ git clone --branch linux-5.4.y
>https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.4
>$ git status
> On branch linux-5.4.y
> Your branch is up to date with 'origin/linux-5.4.y'.
> nothing to commit, working tree clean
>$ git log --oneline -n1 HEAD
> bc844d58f697 (HEAD -> linux-5.4.y, tag: v5.4.32, origin/linux-5.4.y) Linux
> 5.4.32
>$ # <- build, install, and confirm that the webcam does not work
>$ git log --oneline -n1 bd0e6c9614b9
> bd0e6c9614b9 usb: hub: try old enumeration scheme first for high speed
> devices
>$ git revert bd0e6c9614b9
>$ git status
> On branch linux-5.4.y
> Your branch is ahead of 'origin/linux-5.4.y' by 1 commit.
> (use "git push" to publish your local commits)
> nothing to commit, working tree clean
>$ git diff v5.4.32 | grep '^[+-]'
>--- a/Documentation/admin-guide/kernel-parameters.txt
>+++ b/Documentation/admin-guide/kernel-parameters.txt
>-                       scheme,  applies only to low and full-speed devices
>-                        (default 0 = off).
>+                       scheme (default 0 = off).
>--- a/drivers/usb/core/hub.c
>+++ b/drivers/usb/core/hub.c
>-       int quick_enumeration = (udev->speed == USB_SPEED_HIGH);
>-       return USE_NEW_SCHEME(retry, old_scheme_first_port || old_scheme_first
>-                             || quick_enumeration);
>+       return USE_NEW_SCHEME(retry, old_scheme_first_port ||
>old_scheme_first);
>$ # <- the commit makes quick_enumeration use the new scheme before
>"[7cbdf96cda1fbffb17ec26ea65e1fe63c9aed430] usb: missing parentheses in
>USE_NEW_SCHEME"
>$ # <- and the old scheme after.
>$ make mrproper # <- was this needed?
>$ cp ../config .config # <- my saved config based on Fedora's
>$ uemacs .config # bump CONFIG_LOCALVERSION to ".localversion13"
>$ cp -p .config ../config-`grep Linux .config | head -1 | awk '{print
>$3}'`-`grep -i CONFIG_LOCALVERSION= .config | sed -e 's/.*=".//' -e 's/"//g'`
>$ make oldconfig
>$ time make
>$ time make binrpm-pkg
>$ sudo dnf install
>kernel-5.4.32.localversion13_00001_g0dfde04ccc52-1.x86_64.rpm
><- reboot
>$ uname -r
>5.4.32.localversion13-00001-g0dfde04ccc52
><- the webcam works
Comment 6 Alan Stern 2020-04-15 01:26:15 UTC
On Tue, 14 Apr 2020 bugzilla-daemon@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=207219
> 
> --- Comment #5 from William Bader (williambader@hotmail.com) ---
> Thanks for looking at it.
> 
> >Try starting with a vanilla (bad) kernel and reverting commit bd0e6c9614b9;
> >let's see if that fixes the camera problem.
> 
> Yes, that fixed the camera problem.
> 
> What is the next step?

I think the next step is for me to revert that commit.  It was written
by Zeng Tao; he said that it makes Linux behave more like Windows.  
However, based on the behavior of your webcam, I doubt that is true.  
If Windows tried to use the "old scheme" for enumerating the webcam, it 
wouldn't work at all.  At least, not if the webcam behaves the same as 
it does in your tests.

Zeng, any comments?  You can see the discussion on Bugzilla at the URL 
above.

Alan Stern
Comment 7 William Bader 2020-04-16 23:22:44 UTC
It might also affect some USB flash drives: https://bugzilla.redhat.com/show_bug.cgi?id=1818952#c163
Comment 8 Alan Stern 2020-04-17 15:24:32 UTC
Don't worry about that "device not accepting address 2, error -71" line.  Clearly it was just a temporary problem, since the next attempt at initializing that device succeeded.  Unlike the case with your webcam, where a single failure renders the device unusable until the next cold boot.

It's not unusual for some devices to be a little fussy when they start up.  As long they work okay after a retry or two, you shouldn't be concerned.
Comment 9 William Bader 2020-04-25 21:51:04 UTC
Thanks for working on it so quickly. I got an email a few days ago that a patch "USB: hub: Revert commit bd0e6c9614b9 ("usb: hub: try old enumeration" was added to the usb-linus branch of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
I can see it as https://kernel.googlesource.com/pub/scm/linux/kernel/git/gregkh/usb/+/3155f4f40811c5d7e3c686215051acf504e05565
Considering that reverting the commit will make USB enumeration slower and that apparently no one else has reported a problem, I would be fine with adding something like usbcore.old_scheme_first=1 to my kernel command line (although that particular option didn't help on kernel 5.5.15-200.fc31.x86_64).
Comment 10 Alan Stern 2020-04-26 14:25:33 UTC
It won't slow down enumeration all that much.

If you had a problem then it's likely that a bunch of other people had the same problem too; they just didn't bother to report it.  Fixing your system will help all those other people as well -- certainly it should help anyone who has the same kind of laptop with the same kind of camera.  (Unless the problem was a hardware flaw in your webcam, not any others.)

In theory we could add another usbcore module parameter for this.  But in fact, usbcore already has too many module parameters; adding new ones is discouraged.  Besides, that sort of thing isn't guaranteed always to work.  Some devices will want the old scheme to come first, and others (like your webcam) will want the new scheme to come first.  What can you do if your system has both types of device present?

If you really want to track this down farther, here's something you can try.  Set up your system so that the ehci-pci driver doesn't get loaded automatically, in either the running system or the initrd image.  Then when the system has started up, start up a usbmon trace from the 0u file (that is, all buses) and load ehci-pci manually.  That way we can see exactly what happens when the webcam gets initialized for the first time.
Comment 11 William Bader 2020-04-26 17:55:03 UTC
I see the instructions for making a usbmon trace in an earlier comment.
How do I keep the ehci-pci driver from getting loaded, and how do I load it manually?

Is this how to keep it from being loaded?
# echo blacklist ehci_hcd > /etc/modprobe.d/ehci_hcd.conf
# cp /boot/initrd.$(uname -r).img save
# mkinitrd -o /boot/initrd.$(uname -r).img $(uname -r)
# reboot

How do I load it? I was guessing modprobe, but "lsmod | grep hci" shows only sdhci, cqhci, and sdhci_pci.

"Set up your system so that the ehci-pci driver doesn't get loaded automatically, in either the running system or the initrd image.  Then when the system has started up, start up a usbmon trace from the 0u file (that is, all buses) and load ehci-pci manually.  That way we can see exactly what happens when the webcam gets initialized for the first time."
Comment 12 Alan Stern 2020-04-26 21:07:04 UTC
Yes, that procedure should prevent the driver from being loaded, except that the module you want to affect is ehci-pci, not ehci-hcd.

You load the driver manually by doing:

   insmod /lib/modules/$(uname -r)/kernel/drivers/usb/host/ehci-pci.ko

Using modprobe won't work when you have blacklisted the driver.  Also, make sure your .config has:

   CONFIG_USB_EHCI_PCI=m

so that the driver is built as a module.

(An alternate approach, that doesn't involve blacklisting, is to simply rename the driver file.  Then modprobe won't be able to find and load it.  But of course you still have to rebuild the initrd image.)
Comment 13 William Bader 2020-04-27 17:26:30 UTC
Created attachment 288777 [details]
5.6.6 config

I checked the kernels that I built for bisection, and they all had CONFIG_USB_EHCI_PCI=y.
I am trying to build a vanilla 5.6.6 kernel because I have to build a new one anyway, and Fedora 31 is currently at 5.6.6-200.fc31.x86_64 (You wanted me to test on a kernel where the webcam does not work, right?)
When I set CONFIG_USB_EHCI_PCI=m and run 'make oldconfig', it changes back to y. Are there other options that also have to be modules?
I got an automated email that the change to revert the commit is about to be included, so does it still matter if I do more testing?
Comment 14 Alan Stern 2020-04-28 18:44:57 UTC
I have no idea what's going wrong with "make oldconfig".  You could try running "make menuconfig"; maybe that will provide a clue.

It's up to you whether you want to do more testing.
Comment 15 William Bader 2020-04-28 22:17:28 UTC
Created attachment 288811 [details]
logs from /sys/kernel/debug/usb/usbmon/3u and 4u

I made the logs. Here is what I did:
use make menuconfig to set
 CONFIG_USB_EHCI_HCD=m CONFIG_USB_OHCI_HCD=m CONFIG_USB_OHCI_HCD_PCI=m CONFIG_USB_UHCI_HCD=m CONFIG_USB_XHCI_HCD=m CONFIG_USB_XHCI_PCI=m
build, install, and cold boot with new kernel
confirm no /dev/video
rename /lib/modules/$(uname -r)/kernel/drivers/usb/host/ehci-pci.ko
sudo mkinitrd --force /boot/initramfs-$(uname -r).img $(uname -r)
Creating: target|kernel|dracut args|basicmodules
dracut-install: could not get modinfo from 'ehci_pci': No such file or directory
ls -l /boot
35318679 Apr 28 21:34 initramfs-5.6.6.localversion14.img <- before
35314182 Apr 28 21:43 initramfs-5.6.6.localversion14.img <- after
cold reboot
I did not have a mouse. I could log into Mate desktop but I couldn't find a way to launch an xterm.
Switch to a text console.
no /dev/video
cat /sys/kernel/debug/usb/usbmon/3u > usbmon3.txt &
cat /sys/kernel/debug/usb/usbmon/4u > usbmon4.txt &
lsusb
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
restore ehci-pci.ko
insmod /lib/modules/$(uname -r)/kernel/drivers/usb/host/ehci-pci.ko
the mouse is working but still no /dev/video
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 003 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
 Bus 003 Device 006: ID 0489:e036 Foxconn / Hon Hai
 Bus 003 Device 007: ID 05ca:18c0 Ricoh Co., Ltd <- webcam
 Bus 004 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
 Bus 004 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
 Bus 004 Device 003: ID 093a:2510 Pixart Imaging, Inc. Optical Mouse
echo 0 >/sys/bus/usb/devices/3-1/bConfigurationValue
echo 1 >/sys/bus/usb/devices/3-1/bConfigurationValue
the webcam is working
Comment 16 Alan Stern 2020-04-30 16:49:17 UTC
There's something wrong with what you did, but I can't tell what.  To start with, before ehci-pci.ko was loaded there should have been no usbmon/3u or usbmon/4u files, so your cat commands should have failed.  Those filenames correspond directly to the USB bus numbers, and your lsusb output shows only buses 1 and 2.

For another, the device numbering is off.  On bus 3, device 1 is the root hub and device 2 is the integrated hub -- fine.  But then the Foxconn and Ricoh devices should have been numbers 3 and 4, not 6 and 7.  Clearly something happened before you started your monitoring.

Also, your usbmon3 trace shows the webcam working after it was initialized using the old scheme!  It's pretty hard to tell what that means, though, especially since the actions shown in the trace probably weren't the first time the webcam had been initialized since the last cold reboot.
Comment 17 William Bader 2020-04-30 23:26:18 UTC
Created attachment 288863 [details]
usbmon logs

>To start with, before ehci-pci.ko was loaded there should have been no
>usbmon/3u or usbmon/4u files

I might have transcribed the lines in the wrong order. The mouse wasn't working, so I had to type into a text console and I couldn't copy and paste to another window.

>your usbmon3 trace shows the webcam working after it was initialized using the
>old scheme!

I know. I am pretty sure that I did a cold boot, and that I tested for /dev/video* after booting.
Since I reported the problem, I have continued using the official Fedora kernels for day-to-day work, and I use the test kernels only for short periods for testing.
I have been checking the webcam every day out of curiosity when I boot my laptop in the morning, and maybe 1 out of 10 times the old scheme works.

In any case, I just did the procedure again, taking better care.

cold boot from kernel 5.6.6.localversion14

no /dev/video
ls: cannot access '/dev/vid*': No such file or directory

lsusb
 Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

sudo ls -l /sys/kernel/debug/usb/usbmon
total 0
-rw------- 1 root root 0 May  1 00:03 0s
-rw------- 1 root root 0 May  1 00:03 0u
-rw------- 1 root root 0 May  1 00:03 1s
-rw------- 1 root root 0 May  1 00:03 1t
-rw------- 1 root root 0 May  1 00:03 1u
-rw------- 1 root root 0 May  1 00:03 2s
-rw------- 1 root root 0 May  1 00:03 2t
-rw------- 1 root root 0 May  1 00:03 2u

restore ehci-pci.ko

lsusb unchanged with only 2 devices

insmod /lib/modules/$(uname -r)/kernel/drivers/usb/host/ehci-pci.ko
cat /sys/kernel/debug/usb/usbmon/3u > usbmon3.txt &
cat /sys/kernel/debug/usb/usbmon/4u > usbmon4.txt &

no /dev/video

lsusb
Bus 004 Device 003: ID 093a:2510 Pixart Imaging, Inc. Optical Mouse
Bus 004 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
Bus 004 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 003 Device 004: ID 05ca:18c0 Ricoh Co., Ltd
Bus 003 Device 005: ID 0489:e036 Foxconn / Hon Hai
Bus 003 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

echo 0 >/sys/bus/usb/devices/3-1/bConfigurationValue
echo 1 >/sys/bus/usb/devices/3-1/bConfigurationValue
no /dev/video*
Comment 18 Alan Stern 2020-05-01 19:21:41 UTC
Okay.  You need to start the usbmon data collection _before_ you do the insmod, so that the trace can capture _all_ of the USB traffic.  And since the usbmon/4u file doesn't exist before you do the insmod, you will have to capture the trace from the usbmon/0u file instead (that file reports all the USB traffic on all the buses).

Also, it wouldn't hurt to get a copy of the system log showing what happens when you run the test.  Before running insmod, do:

   dmesg -C >/dev/null

After you run insmod, wait a little while (something like 10 seconds), then stop the usbmon trace, and do:

  dmesg >logfile.txt

You don't even need to do the writes to the bConfigurationValue file.  Let's see what the usbmon trace and the logfile show.
Comment 19 William Bader 2020-05-01 21:55:33 UTC
Created attachment 288867 [details]
usbmon log

I put the commands in a script to make the procedure faster and more reproducible without a mouse to cut and paste.
I have the shell log of the script and the dmesg output below.
I attached the usbmon log.

+ echo 'Test webcam by loading ehci-pci after booting'
Test webcam by loading ehci-pci after booting
+ rel=5.6.6.localversion14
+ date='date +%Y%m%d-%H%M%S'
+ mod=/lib/modules/5.6.6.localversion14/kernel/drivers/usb/host/ehci-pci.ko
+ uname -r
5.6.6.localversion14
++ uname -r
+ '[' 5.6.6.localversion14 '!=' 5.6.6.localversion14 ']'
+ ls -l '/dev/vid*'
ls: cannot access '/dev/vid*': No such file or directory
+ lsusb
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
+ '[' '!' -f /lib/modules/5.6.6.localversion14/kernel/drivers/usb/host/ehci-pci.ko- ']'
+ '[' -f /lib/modules/5.6.6.localversion14/kernel/drivers/usb/host/ehci-pci.ko ']'
+ cp -pv /lib/modules/5.6.6.localversion14/kernel/drivers/usb/host/ehci-pci.ko- /lib/modules/5.6.6.localversion14/kernel/drivers/usb/host/ehci-pci.ko
'/lib/modules/5.6.6.localversion14/kernel/drivers/usb/host/ehci-pci.ko-' -> '/lib/modules/5.6.6.localversion14/kernel/drivers/usb/host/ehci-pci.ko'
+ sleep 1
+ lsusb
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
+ sleep 1
+ dmesg
+ tail
[   29.325653] wlp2s0: associated
[   29.325778] ath: EEPROM regdomain: 0x826c
[   29.325779] ath: EEPROM indicates we should expect a country code
[   29.325780] ath: doing EEPROM country->regdmn map search
[   29.325781] ath: country maps to regdmn code: 0x37
[   29.325782] ath: Country alpha2 being used: PT
[   29.325782] ath: Regpair used: 0x37
[   29.325784] ath: regdomain 0x826c dynamically updated by country element
[   29.357848] IPv6: ADDRCONF(NETDEV_CHANGE): wlp2s0: link becomes ready
[   29.417277] wlp2s0: Limiting TX power to 20 (20 - 0) dBm as advertised by ac:61:75:cc:bd:58
+ dmesg -C
+ pid=2205
+ cat /sys/kernel/debug/usb/usbmon/0u
+ '[' -z 2205 ']'
+ sleep 1
++ date +%Y%m%d-%H%M%S
+ insmod /lib/modules/5.6.6.localversion14/kernel/drivers/usb/host/ehci-pci.ko
+ sleep 10
+ dmesg
++ date +%Y%m%d-%H%M%S
+ kill 2205

I should have added this to the end of the script:
$ lsusb
Bus 004 Device 003: ID 093a:2510 Pixart Imaging, Inc. Optical Mouse
Bus 004 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
Bus 004 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 002 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
Bus 003 Device 004: ID 05ca:18c0 Ricoh Co., Ltd 
Bus 003 Device 005: ID 0489:e036 Foxconn / Hon Hai 
Bus 003 Device 002: ID 8087:0024 Intel Corp. Integrated Rate Matching Hub
Bus 003 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
$ ls -l /dev/vid*
ls: cannot access '/dev/vid*': No such file or directory

dmesg log:

[  144.769704] ehci-pci: EHCI PCI platform driver
[  144.770153] ehci-pci 0000:00:1a.0: EHCI Host Controller
[  144.770385] ehci-pci 0000:00:1a.0: new USB bus registered, assigned bus number 3
[  144.770415] ehci-pci 0000:00:1a.0: debug port 2
[  144.774351] ehci-pci 0000:00:1a.0: cache line size of 64 is not supported
[  144.774360] ehci-pci 0000:00:1a.0: irq 16, io mem 0xf7d08000
[  144.781318] ehci-pci 0000:00:1a.0: USB 2.0 started, EHCI 1.00
[  144.781435] usb usb3: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.06
[  144.781437] usb usb3: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[  144.781439] usb usb3: Product: EHCI Host Controller
[  144.781440] usb usb3: Manufacturer: Linux 5.6.6.localversion14 ehci_hcd
[  144.781440] usb usb3: SerialNumber: 0000:00:1a.0
[  144.781628] hub 3-0:1.0: USB hub found
[  144.781635] hub 3-0:1.0: 2 ports detected
[  144.781886] ehci-pci 0000:00:1d.0: EHCI Host Controller
[  144.781927] ehci-pci 0000:00:1d.0: new USB bus registered, assigned bus number 4
[  144.781938] ehci-pci 0000:00:1d.0: debug port 2
[  144.785861] ehci-pci 0000:00:1d.0: cache line size of 64 is not supported
[  144.785892] ehci-pci 0000:00:1d.0: irq 23, io mem 0xf7d07000
[  144.792254] ehci-pci 0000:00:1d.0: USB 2.0 started, EHCI 1.00
[  144.792333] usb usb4: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.06
[  144.792335] usb usb4: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[  144.792337] usb usb4: Product: EHCI Host Controller
[  144.792338] usb usb4: Manufacturer: Linux 5.6.6.localversion14 ehci_hcd
[  144.792340] usb usb4: SerialNumber: 0000:00:1d.0
[  144.792558] hub 4-0:1.0: USB hub found
[  144.792572] hub 4-0:1.0: 2 ports detected
[  145.106318] usb 3-1: new high-speed USB device number 2 using ehci-pci
[  145.114281] usb 4-1: new high-speed USB device number 2 using ehci-pci
[  145.121056] usb 3-1: New USB device found, idVendor=8087, idProduct=0024, bcdDevice= 0.00
[  145.121063] usb 3-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[  145.121681] hub 3-1:1.0: USB hub found
[  145.121802] hub 3-1:1.0: 6 ports detected
[  145.129798] usb 4-1: New USB device found, idVendor=8087, idProduct=0024, bcdDevice= 0.00
[  145.129804] usb 4-1: New USB device strings: Mfr=0, Product=0, SerialNumber=0
[  145.130315] hub 4-1:1.0: USB hub found
[  145.130439] hub 4-1:1.0: 6 ports detected
[  145.392316] usb 3-1.2: new full-speed USB device number 3 using ehci-pci
[  145.400313] usb 4-1.2: new low-speed USB device number 3 using ehci-pci
[  145.472947] usb 3-1.2: New USB device found, idVendor=0489, idProduct=e036, bcdDevice= 0.01
[  145.472955] usb 3-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[  145.472959] usb 3-1.2: Product: Bluetooth USB Host Controller
[  145.472962] usb 3-1.2: Manufacturer: Atheros Communications
[  145.472966] usb 3-1.2: SerialNumber: Alaska Day 2006
[  145.483172] usb 4-1.2: New USB device found, idVendor=093a, idProduct=2510, bcdDevice= 1.00
[  145.483178] usb 4-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[  145.483182] usb 4-1.2: Product: USB OPTICAL MOUSE
[  145.483186] usb 4-1.2: Manufacturer: PIXART
[  145.487523] input: PIXART USB OPTICAL MOUSE as /devices/pci0000:00/0000:00:1d.0/usb4/4-1/4-1.2/4-1.2:1.0/0003:093A:2510.0001/input/input13
[  145.487824] hid-generic 0003:093A:2510.0001: input,hidraw0: USB HID v1.11 Mouse [PIXART USB OPTICAL MOUSE] on usb-0000:00:1d.0-1.2/input0
[  145.538206] usb 3-1.3: new high-speed USB device number 4 using ehci-pci
[  145.539293] Bluetooth: Core ver 2.22
[  145.539311] NET: Registered protocol family 31
[  145.539312] Bluetooth: HCI device and connection manager initialized
[  145.539317] Bluetooth: HCI socket layer initialized
[  145.539319] Bluetooth: L2CAP socket layer initialized
[  145.539323] Bluetooth: SCO socket layer initialized
[  145.547061] usbcore: registered new interface driver btusb
[  145.556685] usb 3-1.3: config 247 has too many interfaces: 120, using maximum allowed: 32
[  145.556688] usb 3-1.3: config 247 descriptor has 1 excess byte, ignoring
[  145.556690] usb 3-1.3: config 247 has 0 interfaces, different from the descriptor's value: 120
[  145.557672] usb 3-1.3: New USB device found, idVendor=05ca, idProduct=18c0, bcdDevice= 7.32
[  145.557675] usb 3-1.3: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[  145.557676] usb 3-1.3: Product: USB2.0 Camera
[  145.557678] usb 3-1.3: Manufacturer: Ricoh Company Ltd.
[  145.557903] usb 3-1.3: can't set config #247, error -32
[  145.594790] usbcore: registered new interface driver ath3k
[  145.799929] usb 3-1.2: USB disconnect, device number 3
[  145.972344] usb 3-1.2: new full-speed USB device number 5 using ehci-pci
[  151.252988] usb 3-1.2: New USB device found, idVendor=0489, idProduct=e036, bcdDevice= 0.02
[  151.252996] usb 3-1.2: New USB device strings: Mfr=1, Product=2, SerialNumber=3
[  151.253000] usb 3-1.2: Product: Bluetooth USB Host Controller
[  151.253003] usb 3-1.2: Manufacturer: Atheros Communications
[  151.253007] usb 3-1.2: SerialNumber: Alaska Day 2006
Comment 20 Alan Stern 2020-05-02 01:52:29 UTC
That got it.  It's kind of interesting to compare this usbmon trace with the earlier one you made under a "good" kernel.  Here's where the difference first shows up.  The first lines show the system asking the device for the first nine bytes of its configuration descriptor, and the second lines show the device's reply (the last nine hex bytes on the line).

Good trace:

ffff895e0a2063c0 123165855 S Ci:1:007:0 s 80 06 0200 0000 0009 9 <
ffff895e0a2063c0 123166213 C Ci:1:007:0 0 9 = 09026502 02010080 64

Bad trace:

ffffa0e9e4fdb680 145518314 S Ci:3:004:0 s 80 06 0200 0000 0009 9 <
ffffa0e9e4fdb680 145518517 C Ci:3:004:0 0 9 = 09027602 78f7e4ff 02

As you can see, the replies are very different.  The third and fourth bytes are supposed to contain the descriptor's full length (16-bit little endian); the values are 0x0265 vs. 0x0276, or 613 vs. 630 in decimal.  This explains the difference in the next request, where the system asks the device to send the entire configuration descriptor.  The good trace asks for 613 bytes and the bad trace asks for 630.

Good trace:

ffff895e0a2063c0 123166260 S Ci:1:007:0 s 80 06 0200 0000 0265 613 <
ffff895e0a2063c0 123170157 C Ci:1:007:0 0 613 = 09026502 02010080 64080b00 020e0300 04090400 00010e01 00040d24 0100014d

Bad trace:

ffffa0e9e4fdb680 145518526 S Ci:3:004:0 s 80 06 0200 0000 0276 630 <
ffffa0e9e4fdb680 145522024 C Ci:3:004:0 0 630 = 09027602 78f7e4ff 029e5f02 4675e490 b197f0a3 f07b017a b0790012 28ef7404

Comparing the responses (which only contain the first 32 bytes of each reply) shows that they are almost completely different.  It looks like the device is sending random data in the bad trace!

As we suspected, there's nothing really wrong with the "bad" kernel.  The difference is that it initializes the device using the old scheme instead of the new scheme.  And the initialization works, sort of, but the device goes crazy and starts sending garbage data.  Later on in the bad trace, the device stops responding entirely.

Clearly the webcam is buggy.  But since it does work well enough when initialized with the old scheme, it looks like the patch that was just accepted is the right thing to do.

The kernel log even shows why the Foxconn device ends up being number 5 instead of number 3.  Originally it is number 3, but then it disconnects from the USB bus (evidently as the result of a firmware upload and a restart) and reconnects.  The system assigns it the next device number (5) at that point.
Comment 21 William Bader 2020-05-02 02:57:24 UTC
Thanks for analyzing it!

Note You need to log in before you can comment on or make changes to this bug.