Bug 14163 - Data loss in CDC-ACM reception
Summary: Data loss in CDC-ACM reception
Status: CLOSED CODE_FIX
Alias: None
Product: Drivers
Classification: Unclassified
Component: USB
Hardware: All Linux
Importance: P1 normal
Assignee: Greg Kroah-Hartman
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2009-09-11 13:49 UTC by Simon Berg
Modified: 2009-11-07 16:27 UTC

See Also:
Kernel Version: 2.6.30.5
Subsystem:
Regression: Yes
Bisected commit-id:



Description Simon Berg 2009-09-11 13:49:50 UTC
I have a CDC-ACM device sending pseudo-random data as fast as possible. On my computer I check the received sequence using a pseudo-random generator identical to the one in the device. On a 2.6.29.2 kernel this runs without any errors.
On a 2.6.30.5 kernel it fails after approx. 1e6 bytes.
Using usbmon I can see that all data is correctly received, but one packet (of 128 bytes) is lost on its way to the tty.
Checking git, I found out that the older kernel had the line
tty->low_latency = 1;
in cdc-acm.c

After reintroducing that line into the driver, it worked without errors again.
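
The host-side check described above can be sketched in C roughly as follows. This is a hypothetical reconstruction, not the reporter's actual test program: the report does not say which generator the device uses, so xorshift32 stands in for it here, and the names struct prng, prng_next_byte, fill_chunk and verify_chunk are invented for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical generator state. The report does not name the PRNG;
 * xorshift32 is an assumption. The seed must be nonzero. */
struct prng {
    uint32_t state;
};

/* Advance the generator one step and return the low byte of its state. */
uint8_t prng_next_byte(struct prng *g)
{
    uint32_t x = g->state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    g->state = x;
    return (uint8_t)(x & 0xff);
}

/* Device side (simulated): fill buf with the next len bytes of the sequence. */
void fill_chunk(struct prng *g, uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] = prng_next_byte(g);
}

/* Host side: compare len received bytes against the expected sequence.
 * Returns the index of the first mismatch, or -1 if every byte matches.
 * On a mismatch the generator has already consumed the expected byte,
 * so the caller should stop or resynchronise rather than continue. */
long verify_chunk(struct prng *g, const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != prng_next_byte(g))
            return (long)i;
    }
    return -1;
}
```

On the host, a loop would read() from the ACM tty (for example /dev/ttyACM0) and hand each chunk to verify_chunk() with the same seed as the device. A return value other than -1 marks the offset where the stream diverges, which is where a dropped 128-byte packet would show up.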
Comment 1 Andrew Morton 2009-09-13 23:00:15 UTC
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Fri, 11 Sep 2009 13:49:50 GMT bugzilla-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=14163
> 
>            Summary: Data loss in CDC-ACM reception
>            Product: Drivers
>            Version: 2.5
>     Kernel Version: 2.6.30.5
>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: USB
>         AssignedTo: greg@kroah.com
>         ReportedBy: ksb@shell.linux.se
>         Regression: No
> 
> 
> I have a CDC-ACM device sending pseudo-random data as fast as possible. On my
> computer I check the received sequence using a pseudo-random generator
> identical to the one in the device. On a 2.6.29.2 kernel this runs without
> any errors.
> On a 2.6.30.5 kernel it fails after approx. 1e6 bytes.
> Using usbmon I can see that all data is correctly received, but one packet (of
> 128 bytes) is lost on its way to the tty.
> Checking git, I found out that the older kernel had the line
> tty->low_latency = 1;
> in cdc-acm.c
>
> After reintroducing that line into the driver, it worked without errors again.
> 

I'll mark this as a regression.

Can we just set ->low_latency again?  What are the implications of that?
Comment 2 Anonymous Emailer 2009-09-14 08:19:16 UTC
Reply-To: oliver@neukum.org

On Monday, 14 September 2009 00:59:40, Andrew Morton wrote:

> > Checking git, I found out that the older kernel had the line
> > tty->low_latency = 1;
> > in cdc-acm.c
> >
> > After reintroducing that line into the driver, it worked without errors
> > again.
>
> I'll mark this as a regression.
>
> Can we just set ->low_latency again?  What are the implications of that?

/**
 *	tty_flip_buffer_push	-	terminal
 *	@tty: tty to push
 *
 *	Queue a push of the terminal flip buffers to the line discipline. This
 *	function must not be called from IRQ context if tty->low_latency is set.

Apparently we cannot. Possibly throttling has to kick in earlier.
Increasing the buffer's size wouldn't work either, as we'd hit kmalloc's limit.

	Regards
		Oliver
Comment 3 Alan 2009-09-15 14:46:47 UTC
> > After reintroducing that line into the driver, it worked without errors again.
> > 
> 
> I'll mark this as a regression.
> 
> Can we just set ->low_latency again?  What are the implications of that?

Locking violations and crashes. It's always been unsafe to do that. If
it's dropping a frame very occasionally, that means the BH is getting
excessively delayed, which points at problems elsewhere. Right now the
input buffering is 64K; edit the tty buffer code and you can just tweak
it to say 128K and see if that helps.
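
To put the 64K and 128K figures in perspective, the small C helper below estimates how long the bottom half could be delayed before a buffer of a given size overflows. It is only a back-of-the-envelope sketch: buffer_headroom_ms is a made-up name, the model assumes the sender never pauses and that nothing drains the buffer in the meantime, and the 1.1 MiB/s rate is the throughput the reporter measures later in this thread, not a property of the tty layer.

```c
/* Milliseconds of continuous input that a buffer of buffer_bytes can
 * absorb at rate_bytes_per_s before it is full (simplified model:
 * constant rate, no draining while the push is delayed). */
double buffer_headroom_ms(double buffer_bytes, double rate_bytes_per_s)
{
    return buffer_bytes / rate_bytes_per_s * 1000.0;
}
```

Under these assumptions, buffer_headroom_ms(64 * 1024, 1.1 * 1024 * 1024) comes to roughly 57 ms, and doubling the buffer to 128K doubles that to roughly 114 ms, so the larger buffer only helps if the delay stays below that bound.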
Comment 4 Daniel Qarras 2009-10-23 18:22:22 UTC
This might be related to an issue I'm now seeing with Fedora 12 Beta / 2.6.31.1. On earlier Fedora releases my Nokia E75 worked well, but now when connecting I see these errors in the dmesg output and, e.g., NetworkManager cannot use the device anymore.

usb 1-8: new high speed USB device using ehci_hcd and address 14
usb 1-8: New USB device found, idVendor=0421, idProduct=010e
usb 1-8: New USB device strings: Mfr=1, Product=2, SerialNumber=3
usb 1-8: Product: Nokia E75
usb 1-8: Manufacturer: Nokia
usb 1-8: SerialNumber: 1234567890
usb 1-8: configuration #1 chosen from 1 choice
cdc_acm 1-8:1.1: ttyACM0: USB ACM device
usb 1-8: bad CDC descriptors
usb 1-8: bad CDC descriptors
cdc_phonet: probe of 1-8:1.10 failed with error -22
Comment 5 Daniel Qarras 2009-10-26 21:08:47 UTC
I've now tested kernel.org kernels 2.6.28.8, 2.6.29.6, 2.6.30.9, and 2.6.31.3 and with them no errors are printed on my system. Perhaps my issue is not related, after all.
Comment 6 Daniel Qarras 2009-10-26 21:09:24 UTC
FWIW, the downstream report for my issue is at

https://bugzilla.redhat.com/show_bug.cgi?id=530714
Comment 7 Alan Stern 2009-11-06 18:48:45 UTC
Is everything okay now with 2.6.31.5?  If it is, you can close out the bug report.
Comment 8 Daniel Qarras 2009-11-06 20:46:17 UTC
FWIW, for me cdc-acm on kernel-2.6.31.5-122.fc12.i686 is OK, but I'm not the original reporter, so I'll let him decide.
Comment 9 Greg Kroah-Hartman 2009-11-06 21:45:19 UTC
OK, will close this out now; it should be resolved.
Comment 10 Simon Berg 2009-11-07 16:27:21 UTC
I ran the test on 2.6.31.5 for several minutes without any errors.
According to dd I had a speed of about 1.1 MiB/s, which I consider fairly good on full-speed USB.
It seems that the problem is fixed.
