Bug 214131

Summary: ch341 communication problem
Product: Drivers
Reporter: Paul Größel (pb.g)
Component: USB
Assignee: Johan Hovold (johan)
Status: RESOLVED CODE_FIX
Severity: normal
CC: bondar.den, brandys, flole, hkz85825915, johan, kernel_bugzilla, kovszilard
Priority: P1
Hardware: All
OS: Linux
Kernel Version: 5.14-rc5
Subsystem:
Regression: No
Bisected commit-id:
Attachments: captured communication, lsusb -v

Description Paul Größel 2021-08-22 07:11:49 UTC
Created attachment 298411 [details]
captured communication

Several users have reported that their USB-serial adapters using the ch341 driver no longer communicate correctly with kernels somewhere above 5.10.56-1.

The problem is that flashing an ESP8266 board with the Arduino IDE is no longer possible. It fails with the error:

"/dev/ttyUSB0 failed to connect: Failed to connect to ESP8266: Timed out waiting for packet header"

I can confirm this issue up to kernel 5.14-rc5. 
Downgrading my kernel to 5.10.52 also solves the problem.

I attached two communication samples from github user jypma.

The issue was discussed here:
https://github.com/espressif/esptool/issues/653
Comment 1 loqs 2021-08-22 23:32:30 UTC
What if you revert 3c18e9baee0ef97510dcda78c82285f52626764b,
which was back-ported to 5.10.58 and 5.13.10?

I believe it is the same bug as discussed here https://bugs.archlinux.org/task/71830
Comment 2 Paul Größel 2021-08-23 07:09:22 UTC
Indeed, removing 

.bulk_in_size      = 512,

and recompiling my 5.13.10 kernel solved the issue.
Comment 3 Johan Hovold 2021-08-23 09:07:27 UTC
Thanks for reporting and tracking down the commit that caused the
regression.

Could you be a bit more specific about the symptoms here? Judging from a
quick look at the github thread, it appears that the device is still
working although timing may have changed slightly. The arch thread
indicates that the device doesn't even enumerate, which does not seem to
be the case here.

Also please provide the output of "lsusb -v" for this device.
Comment 4 Paul Größel 2021-08-23 09:14:29 UTC
Created attachment 298433 [details]
lsusb -v
Comment 5 Paul Größel 2021-08-23 09:14:49 UTC
I agree, I couldn't find any enumeration-related symptoms here:

dmesg:

[ 7572.641499] usb 3-3.4: new full-speed USB device number 7 using xhci_hcd
[ 7572.731906] usb 3-3.4: New USB device found, idVendor=1a86, idProduct=7523, bcdDevice= 2.63
[ 7572.731910] usb 3-3.4: New USB device strings: Mfr=0, Product=2, SerialNumber=0
[ 7572.731912] usb 3-3.4: Product: USB2.0-Serial
[ 7572.788929] ch341 3-3.4:1.0: ch341-uart converter detected
[ 7572.802958] usb 3-3.4: ch341-uart converter now attached to ttyUSB0

lsusb -v: see attachment above
Comment 6 Paul Größel 2021-08-23 09:18:26 UTC
Sorry, I have no idea how to narrow this down to more specific symptoms. I do not own a protocol analyzer, nor am I a coder.
Comment 7 Johan Hovold 2021-08-24 11:44:23 UTC
On Mon, Aug 23, 2021 at 09:14:49AM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> https://bugzilla.kernel.org/show_bug.cgi?id=214131
> 
> --- Comment #5 from Paul Größel (pb.g@gmx.de) ---
> I agree, I couldn't find any enumerate related symptoms here:

I was able to reproduce the problem here. The device doesn't send a
zero-length packet (ZLP) when the received data is a multiple of the
endpoint max-packet size, so the bulk transfer doesn't complete (e.g.
your flashing application may not receive replies).

We need to revert the offending commit until we can figure out how to
configure the device to send ZLPs.
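
To illustrate the stall described above, here is a minimal user-space sketch in C. It is a simplified model, not kernel code. It assumes a bulk-IN transfer completes only when the host buffer fills or when a packet shorter than the endpoint max-packet size (including a zero-length packet) arrives. The function name `bulk_in_completes`, its parameters, and the 32-byte max-packet size used in the note below are illustrative assumptions and are not taken from the driver or from the attached lsusb output.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified model of one USB bulk-IN transfer (hypothetical helper).
 *
 * pending:   bytes the device has queued; it sends full packets while it can
 * maxpacket: endpoint max-packet size
 * buf_size:  size of the host read buffer (a multiple of maxpacket)
 * sends_zlp: whether the device follows a final full packet with a ZLP
 *
 * Returns true if the transfer completes, false if the host is left waiting.
 */
static bool bulk_in_completes(size_t pending, size_t maxpacket,
                              size_t buf_size, bool sends_zlp)
{
    size_t received = 0;

    assert(maxpacket > 0);
    assert(buf_size >= maxpacket && buf_size % maxpacket == 0);

    while (pending > 0) {
        size_t pkt = pending < maxpacket ? pending : maxpacket;

        received += pkt;
        pending -= pkt;

        if (pkt < maxpacket)
            return true;        /* short packet terminates the transfer */
        if (received == buf_size)
            return true;        /* buffer full terminates the transfer */
    }

    /*
     * Only full packets were received and the buffer is not full:
     * the transfer ends only if the device sends a zero-length packet.
     */
    return received > 0 && sends_zlp;
}
```

Under this model, 64 bytes of reply data on a 32-byte endpoint with a 512-byte buffer never completes without a ZLP: `bulk_in_completes(64, 32, 512, false)` is false. With a buffer equal to the max-packet size, `bulk_in_completes(64, 32, 32, false)` is true, because the first full packet already fills the buffer.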

Thanks again for reporting this, and sorry about the breakage.

Johan
Comment 8 Johan Hovold 2021-08-25 07:21:23 UTC
For the record, I've applied the revert now and it should be backported
to the stable trees shortly:

	https://git.kernel.org/pub/scm/linux/kernel/git/johan/usb-serial.git/commit/?h=usb-linus&id=df7b16d1c00ecb3da3a30c999cdb39f273c99a2f
Comment 9 Bogusław Brandys 2021-10-04 09:21:09 UTC
5.4.0-88-generic #99-Ubuntu SMP Thu Sep 23 17:29:00 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

Ubuntu 20.04.3 LTS

The problem with the ch341 driver re-appeared, while 5.4.0-86 works fine.
Comment 10 Denis Bondar 2021-10-09 20:08:07 UTC
Hi,
This version probably has the same or similar problem:

Linux home 5.11.0-37-generic #41~20.04.2-Ubuntu
Comment 11 Johan Hovold 2021-10-11 08:55:24 UTC
On Mon, Oct 04, 2021 at 09:21:09AM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:

> --- Comment #9 from Bogusław Brandys (brandys@o2.pl) ---
> 5.4.0-88-generic #99-Ubuntu SMP Thu Sep 23 17:29:00 UTC 2021 x86_64 x86_64
> x86_64 GNU/Linux
> 
> Ubuntu 20.04.3 LTS
> 
> problem with ch341 driver re-appeared while in 5.4.0.-86 is working fine.

This issue has been fixed in mainline (and stable), but we have no idea
what Ubuntu puts in their kernels. Please report it to them.

Johan
Comment 12 Denis Bondar 2021-10-11 09:57:10 UTC
(In reply to Johan Hovold from comment #11)
> On Mon, Oct 04, 2021 at 09:21:09AM +0000,
> bugzilla-daemon@bugzilla.kernel.org wrote:
> 
> This issue has been fixed in mainline (and stable), but we have no idea
> what Ubuntu puts in their kernels. Please report it them. 
> 
> Johan

Thank you very much. Sorry for the inconvenience.
Comment 13 Flole 2022-01-27 18:14:48 UTC
(In reply to Johan Hovold from comment #7)
> We need to revert the offending commit until we can figure out how to
> configure the device to send ZLPs.

Maybe the "alternative" driver at https://github.com/juliagoda/CH341SER can provide some clues? It has some magic constants, and as far as I can see it doesn't set the bulk_in_size at all. So it should be enabled and obviously it's working properly, or at least nobody reported an error in the repo.
Comment 14 Johan Hovold 2022-02-01 11:16:21 UTC
On Thu, Jan 27, 2022 at 06:14:48PM +0000, bugzilla-daemon@bugzilla.kernel.org wrote:
> --- Comment #13 from Flole (flole@flole.de) ---
> (In reply to Johan Hovold from comment #7)
> > We need to revert the offending commit until we can figure out how to
> > configure the device to send ZLPs.
> 
> Maybe the "alternative" driver at  https://github.com/juliagoda/CH341SER can
> provide some clues? It has some magic constants and as far as I can see it
> doesn't set the bulk_in_size at all? So it should be enabled and obviously
> it's working properly, or at least nobody reported an error in the repo.

Not setting the bulk_in_size in the driver means using the endpoint
max-packet size, which is precisely what was needed in mainline to avoid
the ZLP stalls.

So I'm afraid it seems unlikely that that driver will provide any clues
here.
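
The point about an unset bulk_in_size can be sketched in a few lines of C. This is a user-space illustration, not the usb-serial core itself. The helper name `bulk_in_buffer_size` is hypothetical, and the rule that the larger of the two values is used is an assumption about how the core picks the size. The thread itself only states that an unset bulk_in_size means the endpoint max-packet size is used.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: choose the bulk-IN read buffer size.
 *
 * driver_bulk_in_size: value of .bulk_in_size in the driver (0 if unset)
 * ep_maxpacket:        max-packet size reported by the endpoint
 *
 * Assumption: the larger of the two values is used, so an unset
 * (zero) bulk_in_size falls back to the endpoint max-packet size.
 */
static size_t bulk_in_buffer_size(size_t driver_bulk_in_size,
                                  size_t ep_maxpacket)
{
    assert(ep_maxpacket > 0);

    return driver_bulk_in_size > ep_maxpacket ? driver_bulk_in_size
                                              : ep_maxpacket;
}
```

With an illustrative 32-byte endpoint, `bulk_in_buffer_size(0, 32)` gives 32, so every full packet fills the buffer and no ZLP is needed. `bulk_in_buffer_size(512, 32)` gives 512, which is the configuration that depended on the device sending ZLPs.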

Johan