Bug 3452

Summary: [Patch] Update fs/nls/nls_cp936.c (Chinese codepage)
Product: File System
Reporter: hashao (hashao2)
Component: Other
Assignee: xexz (xexz)
Status: CLOSED CODE_FIX
Severity: normal
CC: protasnb
Priority: P2
Hardware: i386
OS: Linux
Kernel Version: 2.6.9
Subsystem:
Regression: ---
Bisected commit-id:
Attachments: Patch to update the cp936 with mapping from MS site.
Patch to update the cp936 with mapping from MS site.
Patch to update the cp936 with mapping from MS site.
Patch to update the cp950 with mapping from MS site.
[Patch] Update the cp936 with correct mapping from MS site.
[Patch] Update the cp950 with correct mapping from MS site.

Description hashao 2004-09-24 02:59:13 UTC
The current conversion table for codepage cp936 (Chinese Simplified) contains
many wrong mappings. I don't know where the original table came from. As a
result, Chinese filenames created on a vfat partition under Linux can contain
characters that cannot be accessed under Windows.

The cp936 table can be found at:

http://www.microsoft.com/globaldev/reference/dbcs/936.htm

Examples: CP936 code points 0x8179 and 0x81ED.
Comment 1 hashao 2004-09-24 03:02:37 UTC
Created attachment 3712
Patch to update the cp936 with mapping from MS site.

Also adds a GBK alias for the code page.
Comment 2 hashao 2004-09-24 03:16:04 UTC
Created attachment 3713
Patch to update the cp936 with mapping from MS site.

The Unicode mapping now starts from 0x0000 instead of 0x0100, because some
symbols fall in the 0x0000-0x0100 range.
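
To illustrate why the starting offset matters, here is a minimal sketch of a two-level Unicode-to-codepage lookup, indexed first by the high byte of the Unicode value and then by the low byte. It is not taken from the attached patch: the names page_00, page_4e, page_uni2charset and sketch_uni2char are hypothetical, and the two table entries are illustrative values believed to match the published CP936 table, so they should be checked against it. If the top-level table had no page for high byte 0x00, the non-ASCII symbols in that range could not be converted.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical page for Unicode high byte 0x00. Each low byte owns two
 * output bytes (lead, trail); a 0,0 pair means "no mapping". */
static const uint8_t page_00[512] = {
    [0xD7 * 2] = 0xA1, [0xD7 * 2 + 1] = 0xC1,  /* U+00D7, illustrative */
};

/* Hypothetical page for Unicode high byte 0x4E. */
static const uint8_t page_4e[512] = {
    [0x2D * 2] = 0xD6, [0x2D * 2 + 1] = 0xD0,  /* U+4E2D, illustrative */
};

/* Top-level table indexed by the Unicode high byte. Leaving out the
 * [0x00] slot would reproduce the "starts at 0x0100" situation. */
static const uint8_t *const page_uni2charset[256] = {
    [0x00] = page_00,
    [0x4E] = page_4e,
};

/* Convert one UCS-2 value. Returns the number of bytes written
 * (1 or 2), or -1 if there is no mapping or no room. */
int sketch_uni2char(uint16_t uni, uint8_t *out, size_t outlen)
{
    uint8_t hi = (uint8_t)(uni >> 8);
    uint8_t lo = (uint8_t)(uni & 0xFF);
    const uint8_t *page;

    assert(out != NULL);
    if (outlen < 1)
        return -1;
    if (uni < 0x80) {               /* plain ASCII passes through */
        out[0] = (uint8_t)uni;
        return 1;
    }
    page = page_uni2charset[hi];
    if (page == NULL)
        return -1;                  /* whole page missing */
    if (page[lo * 2] == 0 && page[lo * 2 + 1] == 0)
        return -1;                  /* no entry on this page */
    if (outlen < 2)
        return -1;
    out[0] = page[lo * 2];
    out[1] = page[lo * 2 + 1];
    return 2;
}
```

With these tables, sketch_uni2char(0x00D7, buf, 2) fills buf with the two bytes stored on page_00, while a value such as 0x00D8 that has no entry returns -1.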
Comment 3 hashao 2004-09-24 03:19:14 UTC
Created attachment 3714
Patch to update the cp936 with mapping from MS site.

Remove debug garbage.
Comment 4 hashao 2004-09-25 03:55:29 UTC
Created attachment 3720
Patch to update the cp950 with mapping from MS site.


This one is for codepage CP950, which is for traditional Chinese.

The conversion table is based on GNU glibc's BIG5.gz charmap
(/usr/share/i18n/charmaps/BIG5.gz), which has some additional mappings for
popular extensions.

The actual Microsoft table can be found at:
http://www.microsoft.com/globaldev/reference/dbcs/950.htm

P.S. The GBK table in glibc is the same as the MS table.
Comment 5 hashao 2005-01-06 22:51:47 UTC
Created attachment 4346
[Patch] Update the cp936 with correct mapping from MS site.

Fix a bug in the mapping function's handling of ASCII.
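
The report does not say what the ASCII bug was, so the following is only a sketch of where ASCII handling usually sits in a double-byte codepage decoder: single bytes below 0x80 are returned unchanged before any table lookup is attempted. The names lead_d6, page_charset2uni and sketch_char2uni are hypothetical, the single table entry is an illustrative value to be checked against the published CP936 table, and bytes of 0x80 and above without a page are simply rejected here, which a real table may handle differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical page for lead byte 0xD6, indexed by the trail byte.
 * A value of 0 means "no mapping". */
static const uint16_t lead_d6[256] = {
    [0xD0] = 0x4E2D,                /* illustrative entry */
};

/* Top-level table indexed by the lead byte. */
static const uint16_t *const page_charset2uni[256] = {
    [0xD6] = lead_d6,
};

/* Decode one character. Returns the number of input bytes consumed
 * (1 or 2), or -1 on invalid or truncated input. */
int sketch_char2uni(const uint8_t *in, size_t inlen, uint16_t *uni)
{
    const uint16_t *page;

    assert(in != NULL && uni != NULL);
    if (inlen < 1)
        return -1;
    if (in[0] < 0x80) {             /* ASCII: one byte, no table lookup */
        *uni = in[0];
        return 1;
    }
    if (inlen < 2)
        return -1;                  /* lead byte without a trail byte */
    page = page_charset2uni[in[0]];
    if (page == NULL || page[in[1]] == 0)
        return -1;                  /* unknown lead byte or unmapped pair */
    *uni = page[in[1]];
    return 2;
}
```

Handling the ASCII case first keeps ordinary filename characters independent of the double-byte tables, so an error in those tables cannot affect them.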
Comment 6 hashao 2005-01-06 22:55:10 UTC
Created attachment 4347
[Patch] Update the cp950 with correct mapping from MS site.

The same ASCII character fix as for cp936.
Comment 7 Natalie Protasevich 2007-10-16 06:31:54 UTC
Hashao,
Is the problem still present in recent kernels? I would be surprised if it is; it has probably been fixed by now. Can you please confirm so we can close the bug?
Thanks.
Comment 8 hashao 2007-10-22 02:16:26 UTC
Yes, it is fixed.