Bug 8289 - filesystem breaks Unicode canonical equivalence
Status: REJECTED WILL_NOT_FIX
Alias: None
Product: File System
Classification: Unclassified
Component: ext3
Hardware: i386 Linux
Importance: P2 normal
Assignee: Alexey Dobriyan
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2007-03-31 03:34 UTC by Denis Moyogo Jacquerye
Modified: 2008-09-08 23:15 UTC

See Also:
Kernel Version: 2.6.20
Subsystem:
Regression: ---
Bisected commit-id:



Description Denis Moyogo Jacquerye 2007-03-31 03:34:16 UTC
Most recent kernel where this bug did *NOT* occur: none
Distribution: Any
Hardware Environment: Any
Software Environment: Any
Problem Description: canonically equivalent file/directory names are not
considered equivalent.
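
To illustrate the problem description, here is a minimal sketch in Python. The example name "é.txt" and the helper names `same_bytes` and `canonically_equivalent` are illustrative assumptions, not taken from the original report. It shows that two canonically equivalent spellings differ as UTF-8 byte strings but compare equal once both are normalized to NFC.

```python
import unicodedata

# Two spellings of the same user-visible name (hypothetical example name).
precomposed = "\u00e9.txt"   # U+00E9 LATIN SMALL LETTER E WITH ACUTE
decomposed = "e\u0301.txt"   # U+0065 followed by U+0301 COMBINING ACUTE ACCENT


def same_bytes(a: str, b: str) -> bool:
    """Compare the raw UTF-8 encodings, as a byte-oriented filesystem would."""
    return a.encode("utf-8") == b.encode("utf-8")


def canonically_equivalent(a: str, b: str) -> bool:
    """Compare the two names after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


if __name__ == "__main__":
    print(precomposed.encode("utf-8"))
    print(decomposed.encode("utf-8"))
    print("same bytes:", same_bytes(precomposed, decomposed))
    print("canonically equivalent:", canonically_equivalent(precomposed, decomposed))
```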

Steps to reproduce:
- create a file name "
Comment 1 Alexey Dobriyan 2008-09-08 22:33:20 UTC
Well, normalization by default will break everyone who does not expect it.

Regardless, there is no place for such tricky code in the kernel.

I think some LD_PRELOAD hack which redefines all system calls which take
pathname as argument will do the trick.
Comment 2 Denis Moyogo Jacquerye 2008-09-08 23:15:35 UTC
(In reply to comment #1)
> Well, normalization by default will break everyone who does not expect it.
If you expect Unicode support, you should expect normalization.
http://unicode.org/faq/normalization.html
http://unicode.org/reports/tr15/

But it's true that applications could handle it. Or it could be optional in the kernel.
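
As a rough sketch of what application-side handling might look like (an assumption for illustration, not something proposed in this thread), an application could look up an existing directory entry by comparing NFC-normalized names instead of raw names. The helper name `find_equivalent` is hypothetical.

```python
import os
import tempfile
import unicodedata
from typing import Optional


def find_equivalent(directory: str, name: str) -> Optional[str]:
    """Return the entry in `directory` that is canonically equivalent to `name`.

    Returns the name exactly as stored on disk, or None if nothing matches.
    """
    wanted = unicodedata.normalize("NFC", name)
    for entry in os.listdir(directory):
        if unicodedata.normalize("NFC", entry) == wanted:
            return entry
    return None


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        # Store the file under the decomposed spelling...
        open(os.path.join(tmp, "e\u0301.txt"), "w").close()
        # ...and look it up using the precomposed spelling.
        print(repr(find_equivalent(tmp, "\u00e9.txt")))
```

This scans the whole directory on every lookup, so it is only meant to show the idea. It also assumes that file names decode as UTF-8 under the current locale.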

HFS does it. Have there been serious complaints about HFS supporting Unicode properly by doing normalization?

> Regardless, there is no place for such tricky code in the kernel.
HFS does it. If there's a place in HFS, why not in any FS?

> I think some LD_PRELOAD hack which redefines all system calls which take
> pathname as argument will do the trick.

Where should I look to know how to do that?

Thanks
Comment 3 Alexey Dobriyan 2008-09-09 00:04:28 UTC
On Mon, Sep 08, 2008 at 11:15:35PM -0700, bugme-daemon@bugzilla.kernel.org wrote:
> > Well, normalization by default will break everyone who does not expect it.
> If you expect Unicode support, you should expect normalization.
> http://unicode.org/faq/normalization.html
> http://unicode.org/reports/tr15/

Don't mix things.

You expect Unicode support from the operating system.

Whether the kernel or some core library should be responsible for it is a whole
separate question.

> But it's true that applications could handle it. Or it could be optional in
> the kernel.

Everything could be optional in the kernel.

> HFS does it. Have there been serious complaints about HFS supporting Unicode
> properly by doing normalization?

HFS+ is different because this normalization stuff is deeply embedded in the
filesystem's on-disk format. Linux can't just write whatever the VFS gives it
to an HFS+ filesystem.

Normal Linux filesystems like ext2 (and the VFS in general) do not care.
All they care about is the absence of / in a path component.
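
A minimal Python sketch of what this means in practice. The function name `create_both_spellings` and the example name "é.txt" are illustrative assumptions. It creates the same visible name under two canonically equivalent spellings and lists the directory as raw bytes. On a byte-oriented filesystem such as ext2 or ext3 one would expect two separate entries. On a filesystem that normalizes names, the second open would be expected to hit the same file. The result depends on the filesystem that backs the temporary directory.

```python
import os
import tempfile


def create_both_spellings(directory: str) -> list:
    """Create "é.txt" under two equivalent spellings and return the raw entries."""
    names = [
        "\u00e9.txt".encode("utf-8"),   # precomposed: bytes c3 a9
        "e\u0301.txt".encode("utf-8"),  # decomposed: bytes 65 cc 81
    ]
    dir_bytes = os.fsencode(directory)
    for raw in names:
        # Byte paths are passed to the kernel unchanged.
        with open(os.path.join(dir_bytes, raw), "wb") as handle:
            handle.write(b"")
    # Listing a bytes path returns the stored names as bytes.
    return sorted(os.listdir(dir_bytes))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        for entry in create_both_spellings(tmp):
            print(entry)
```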

> > Regardless, there is no place for such tricky code in the kernel.
> HFS does it. If there's a place in HFS, why not in any FS?

Because other filesystems are more forgiving.

Again, study LD_PRELOAD first.
