display charset

TAKAHASHI Motonobu monyo at samba.gr.jp
Wed Aug 15 15:15:57 GMT 2001


I am posting here what I have already told you directly, for
discussion on samba-technical and sugj-tech.

Andrew Tridgell wrote:
>This presents a real problem though, because using iconv as the basis
>for our character set handling doesn't give us enough information. We
>need to know how many columns each character consumes.
>
>Do you know if other languages do this? (ie. number of characters !=
>number of columns)

You can get this information from EastAsianWidth.txt provided by
Unicode.org, but I think we need to check whether this file is correct.
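
As a minimal sketch of how the EastAsianWidth.txt data could be used to
count columns: Python's unicodedata module exposes that property through
east_asian_width(). The function name display_columns is hypothetical, and
treating only Wide (W) and Fullwidth (F) as 2 columns, everything else
(including Ambiguous) as 1, and combining marks as 0 is an assumption,
not something the data file prescribes.

```python
import unicodedata


def display_columns(text: str) -> int:
    """Count columns: Wide (W) and Fullwidth (F) characters take 2, others 1."""
    columns = 0
    for ch in text:
        if unicodedata.combining(ch):
            # Assumption: combining marks sit on the previous character
            # and take no column of their own.
            continue
        width_class = unicodedata.east_asian_width(ch)
        columns += 2 if width_class in ("W", "F") else 1
    return columns
```

With this policy a kanji counts as 2 columns and a half-width katakana
character as 1, so the number of characters and the number of columns
differ as soon as wide characters appear.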

In Japanese encodings such as SJIS and EUC-JP, a 1-byte character
consumes 1 column and the others consume 2 columns, though in EUC a
2-byte character beginning with 0x8e (half-width katakana) consumes
only 1 column.
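
The rule can be sketched directly on the encoded bytes. This is only an
illustration under the usual layout of the two encodings: in EUC-JP, 0x8e
introduces a 2-byte half-width katakana sequence and 0x8f a 3-byte
sequence; in SJIS, half-width katakana are the single bytes 0xa1-0xdf.
The function names are hypothetical and the input is assumed to be
well-formed (no validation of trail bytes).

```python
def euc_jp_columns(data: bytes) -> int:
    """Count display columns of well-formed EUC-JP bytes."""
    columns = 0
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:        # ASCII: 1 byte, 1 column
            i += 1
            columns += 1
        elif lead == 0x8E:     # half-width katakana: 2 bytes, 1 column
            i += 2
            columns += 1
        elif lead == 0x8F:     # 3-byte sequence: 2 columns
            i += 3
            columns += 2
        else:                  # ordinary 2-byte character: 2 columns
            i += 2
            columns += 2
    return columns


def sjis_columns(data: bytes) -> int:
    """Count display columns of well-formed SJIS bytes."""
    columns = 0
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80 or 0xA1 <= lead <= 0xDF:
            i += 1             # ASCII or half-width katakana: 1 column
            columns += 1
        else:
            i += 2             # 2-byte character: 2 columns
            columns += 2
    return columns
```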

In JIS there are escape codes, so it is more complex. I think we do
not need to support JIS for filenames. JIS filenames have a security
problem, as I mentioned before.
Indeed, JIS-encoded filenames are not used now as far as I know.

>so, if we want to get this right then we'll need 3 functions per
>character set:
>
>1) a function to convert to ucs2
>2) a function to convert from ucs2

iconv() has another problem.

For example, the full-width minus sign (0x817c in SJIS) is converted to
  - U+2212 (MINUS SIGN) by iconv() in libiconv and Solaris 8
  - U+FF0D (FULLWIDTH HYPHEN-MINUS) by WideCharToMultiByte()

The problem I pointed out is not a bug but a matter of DESIGN. iconv()
uses the "normal" Unicode characters (here, one from the Mathematical
Operators block), following the Unicode.org mapping, which we can get from
  ftp://ftp.unicode.org/Public/MAPPINGS/EASTASIA/JIS/SHIFTJIS.TXT

WideCharToMultiByte() uses compatibility characters (here, one from the
Halfwidth and Fullwidth Forms block) to achieve maximum compatibility
with the traditional character sets.

>Eventually I think we may need to offer the use of iconv, but also
>allow for dynamic loading of better charset tables for some
>languages. 

Unfortunately, supporting "Windows-ized" Japanese requires such a
dedicated mechanism.
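
One way to picture such a mechanism is a small override table layered on
top of a generic conversion. This is a sketch only, not how Samba does
it: WINDOWS_OVERRIDES and decode_with_overrides are hypothetical names,
Python's "shift_jis" codec stands in for iconv(), and the table is
deliberately incomplete (the wave dash entry is a further example of the
same kind of difference, added for illustration).

```python
# Code points produced by the generic mapping, replaced by the ones
# Windows uses for the same SJIS bytes. Illustrative, not exhaustive.
WINDOWS_OVERRIDES = {
    0x2212: 0xFF0D,  # MINUS SIGN -> FULLWIDTH HYPHEN-MINUS (SJIS 0x817c)
    0x301C: 0xFF5E,  # WAVE DASH  -> FULLWIDTH TILDE        (SJIS 0x8160)
}


def decode_with_overrides(data: bytes,
                          base_codec: str = "shift_jis",
                          overrides=None) -> str:
    """Decode with the generic codec, then apply per-language overrides."""
    if overrides is None:
        overrides = WINDOWS_OVERRIDES
    return data.decode(base_codec).translate(overrides)
```

Passing an empty table gives the plain generic behaviour, so the override
table could be loaded only for the languages that need it.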

-----
TAKAHASHI, Motonobu(monyo)         monyo at samba.org
Personal - http://home.monyo.com/
Samba Team - http://samba.org/     Samba-JP - http://www.samba.gr.jp/  
JWNTUG - http://www.jwntug.or.jp/  Analog-JP - http://www.jp.analog.cx/
MCSE+I, SCNA, CCNA, Turbo-CI




