The USB layer will output text device names "as is", sometimes not in 7bit ASCII or UTF-8 but ISO-8859-1

These names are reused by the xorg evdev driver to identify devices:

Section "InputDevice"
    Identifier "track-expl"
    Driver "evdev"
    Option "Protocol" "evdev"
    Option "Name" "Microsoft Microsoft Trackball Explorer
Reply-To: akpm@linux-foundation.org

On Sat, 7 Apr 2007 03:12:49 -0700 bugme-daemon@bugzilla.kernel.org wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=8310
>
>            Summary: USB device names are not sanitized for UTF-8
>     Kernel Version: 2.6.21-rc5
>             Status: NEW
>           Severity: normal
>              Owner: drivers_input-devices@kernel-bugs.osdl.org
>          Submitter: Nicolas.Mailhot@LaPoste.net
>
>
> The USB layer will output text device names "as is", sometimes not in 7bit ASCII
> or UTF-8 but ISO-8859-1
>
> These names are reused by the xorg evdev driver to identify devices:
>
> Section "InputDevice"
>     Identifier "track-expl"
>     Driver "evdev"
>     Option "Protocol" "evdev"
>     Option "Name" "Microsoft Microsoft Trackball Explorer_"
>     Option "ZAxisMapping" "4 5"
>     Option "Buttons" "7"
> EndSection
>
> evdev matching requires the "Name" string match byte-to-byte to the string
> exposed by the kernel
>
> That means finding a way to create a ISO-8859-1 conf file on a UTF-8 distro (not
> easy nowadays)
>
> Also all the Xorg.conf tools will rewrite the file in UTF-8 at the slightest
> opportunity, breaking the matching and killing X startup. Some of the commonly
> installed tools rewrite the file at each boot, and you get to fix X setup
> manually every time
>
> Can't the USB layer filter byte strings that are incompatible with today's main
> Linux encoding ?
Sure enough, hid-core.c just does strlcpy() of the ISO-8859-1 strings returned by usbcore, based on the UTF-16 returned by various devices. It's not clear what "bug" this intends to report though. Complaining that non-ASCII characters are emitted? That it doesn't convert the UTF-16 into UTF-8? That it doesn't return UTF-16 in the first place? That the X tools are stupid? It'd be simple enough for hid-core to morph characters with the high bit set into the '?' used by usbcore for characters with the high byte set...
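For illustration only, the "morph high-bit characters into '?'" idea could look roughly like the sketch below. The function name `sanitize_to_ascii` is hypothetical and is not taken from hid-core.c or usbcore; it simply shows the byte-level filtering being described.

```c
/* Hypothetical helper, not the actual hid-core code: overwrite every byte
 * that has the high bit set (0x80..0xFF) with '?', so the resulting
 * string contains 7-bit ASCII only. The string is modified in place. */
static void sanitize_to_ascii(char *s)
{
    for (; *s != '\0'; s++) {
        if ((unsigned char)*s & 0x80)
            *s = '?';
    }
}
```

Note that this loses information: every non-ASCII character ends up as the same placeholder.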
I didn't want to suggest a solution, just point out the problem. But since you insist, my preferred resolution would be (in this order):

1. convert the UTF-16 into UTF-8
2. filter strings to report 7bit ASCII only (will probably fail spectacularly as soon as the Chinese start using Chinese names for gadgets targeted at their internal market)

Anything else cannot be used with UTF-8 userspace. (Also, why is the kernel reporting UTF-16 strings when almost no one uses it under Linux?)

I definitely agree xorg is stupid to do matching that's encoding-sensitive. This can be mitigated by moving to the de-facto common Linux encoding today: UTF-8. Asking the tools that process xorg.conf to know that encoding conversion rules apply to everything but the evdev name strings (which must be treated as opaque byte strings) is way too baroque to succeed.
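As a rough illustration of the first option (converting UTF-16 into UTF-8), a minimal userspace sketch follows. The function `utf16le_to_utf8` is a hypothetical name, not a kernel API, and the choice to replace unpaired surrogates with '?' is an assumption made for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not a kernel API): convert `len` bytes of UTF-16LE
 * in `src` into a NUL-terminated UTF-8 string in `dst` (capacity `cap`).
 * Returns the number of UTF-8 bytes written, excluding the NUL.
 * Unpaired surrogates become '?'. If `dst` is too small, output stops
 * on a character boundary. */
static size_t utf16le_to_utf8(const uint8_t *src, size_t len,
                              char *dst, size_t cap)
{
    size_t i = 0, out = 0;

    if (cap == 0)
        return 0;

    while (i + 1 < len) {
        /* Read one little-endian 16-bit code unit. */
        uint32_t cp = (uint32_t)src[i] | ((uint32_t)src[i + 1] << 8);
        i += 2;

        if (cp >= 0xD800 && cp <= 0xDBFF) {        /* high surrogate */
            uint32_t lo = 0;
            if (i + 1 < len)
                lo = (uint32_t)src[i] | ((uint32_t)src[i + 1] << 8);
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                cp = 0x10000 + ((cp - 0xD800) << 10) + (lo - 0xDC00);
                i += 2;
            } else {
                cp = '?';
            }
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) { /* stray low surrogate */
            cp = '?';
        }

        size_t need = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (out + need >= cap)                     /* keep room for the NUL */
            break;

        if (need == 1) {
            dst[out++] = (char)cp;
        } else if (need == 2) {
            dst[out++] = (char)(0xC0 | (cp >> 6));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
        } else if (need == 3) {
            dst[out++] = (char)(0xE0 | (cp >> 12));
            dst[out++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
        } else {
            dst[out++] = (char)(0xF0 | (cp >> 18));
            dst[out++] = (char)(0x80 | ((cp >> 12) & 0x3F));
            dst[out++] = (char)(0x80 | ((cp >> 6) & 0x3F));
            dst[out++] = (char)(0x80 | (cp & 0x3F));
        }
    }
    dst[out] = '\0';
    return out;
}
```

Unlike the 7bit ASCII filter, this keeps every character the device reported, at the cost of names that may be longer in bytes than in characters.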
If we do any sanitization in the kernel, I think it should be done when we populate dev->manufacturer and dev->product in USB core. It would also make sense for X to use strstr when matching name and phys strings.
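To make the strstr suggestion concrete, here is a small sketch of substring matching. The function `name_matches` is hypothetical and is not the evdev driver's code; it only shows how a configured name without the non-ASCII tail could match whatever encoding the kernel happens to report.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical matcher, not the evdev driver's code: accept a device when
 * the configured name occurs anywhere inside the name the kernel reports,
 * instead of requiring a byte-to-byte match of the whole string. */
static bool name_matches(const char *kernel_name, const char *configured)
{
    return strstr(kernel_name, configured) != NULL;
}
```

With this approach a configuration entry such as "Trackball Explorer" would match regardless of how the trailing non-ASCII character is encoded, though it could also match more devices than intended.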
Any more thought on how to proceed on this issue? Should Xorg developers be contacted also? Thanks.
The xorg devs already wrote me that it was not their problem and that they didn't want to fix the xorg side (actually, xorg input has its share of problems, so trying to work around a kernel mistake would not really be a good use of developer time).

A patch to fix this issue kernel-side has been posted on mailing lists and is being discussed. It works for my system.
Subject: Re: USB device names are not sanitized for UTF-8

On 6/14/07, bugme-daemon@bugzilla.kernel.org <bugme-daemon@bugzilla.kernel.org> wrote:
> http://bugzilla.kernel.org/show_bug.cgi?id=8310
>
> ------- Comment #6 from Nicolas.Mailhot@LaPoste.net 2007-06-14 09:52 -------
> The xorg devs already wrote me it was not their problem and they didn't want to
> fix the xorg side (actually, xorg input has his share of problems so trying to
> workaround a kernel mistake would not really be a good developer time use)
>
> A patch to fix this issue kernel-side has been posted on mailing lists and is
> being discussed. It works for my system

This is great, can you please attach the patch and maybe link to the discussion? Thanks!
> > A patch to fix this issue kernel-side has been posted on mailing lists and is
> > being discussed. It works for my system
> This is great, can you please attach the patch and maybe link to the
> discussion?

http://article.gmane.org/gmane.linux.usb.devel/54768
Alan, your patch worked great, are you planning to submit it? Then we can close the bug :) Thanks, --Natalie
No, the patch isn't suitable. I'm going to add some library routines to the kernel for converting to/from UTF-8. Then the USB code will use the library routines.
USB core has been updated to translate strings from UTF-16LE to UTF-8, and it seems to work well with my Microsoft Intellimouse Explorer, closing.