display charset

TAKAHASHI Motonobu monyo at samba.gr.jp
Wed Aug 15 15:15:57 GMT 2001

I am posting here, for discussion on samba-technical and sugj-tech,
the same points I have already made to you directly.

Andrew Tridgell wrote:
>This presents a real problem though, because using iconv as the basis
>for our character set handling doesn't give us enough information. We
>need to know how many columns each character consumes.
>Do you know if other languages do this? (ie. number of characters !=
>number of columns)

You can get this information from EastAsianWidth.txt, provided by
Unicode.org, but I think we need to check whether this file is correct.
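
As a rough illustration of using the East Asian Width property, here is
a minimal sketch in Python, whose unicodedata module exposes the data
from EastAsianWidth.txt. The function name display_columns is only a
name chosen for this example, and counting "Ambiguous" characters as 1
column is an assumption; combining characters are not handled.

```python
import unicodedata


def display_columns(text: str) -> int:
    """Estimate how many terminal columns a string occupies."""
    columns = 0
    for ch in text:
        # "W" (Wide) and "F" (Fullwidth) characters take 2 columns.
        # Everything else, including "A" (Ambiguous), is counted as 1
        # column here; that choice is an assumption of this sketch.
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            columns += 2
        else:
            columns += 1
    return columns
```

Whether the result matches what a given terminal really draws still
depends on the correctness of the table, which is the point above.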

In Japanese encodings such as SJIS and EUC-JP, a 1-byte character
consumes 1 column and the others consume 2 columns, except that in
EUC-JP a 2-byte character beginning with 0x8e (half-width katakana)
consumes only 1 column.
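
The byte-based rule can be sketched as follows. This is only a sketch,
assuming the usual EUC-JP layout as I understand it (0x8e introduces a
2-byte half-width katakana, 0x8f introduces a 3-byte JIS X 0212
character) and the usual SJIS lead-byte ranges. The function names are
made up for the example, and malformed input is not validated.

```python
def euc_jp_columns(data: bytes) -> int:
    """Count display columns of EUC-JP encoded bytes from lead bytes."""
    columns = 0
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:      # ASCII: 1 byte, 1 column
            i += 1
            columns += 1
        elif lead == 0x8E:   # half-width katakana: 2 bytes, 1 column
            i += 2
            columns += 1
        elif lead == 0x8F:   # JIS X 0212: 3 bytes, 2 columns
            i += 3
            columns += 2
        else:                # JIS X 0208: 2 bytes, 2 columns
            i += 2
            columns += 2
    return columns


def sjis_columns(data: bytes) -> int:
    """Count display columns of SJIS encoded bytes from lead bytes."""
    columns = 0
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80 or 0xA1 <= lead <= 0xDF:
            # ASCII or 1-byte half-width katakana: 1 column
            i += 1
            columns += 1
        else:
            # lead byte of a 2-byte character: 2 columns
            i += 2
            columns += 2
    return columns
```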

In JIS there are escape sequences, so it is more complex. I think we
do not need to support JIS for filenames. JIS filenames have a security
problem, as I mentioned before.
Indeed, as far as I know, JIS-encoded filenames are not used now.

>so, if we want to get this right then we'll need 3 functions per
>character set:
>1) a function to convert to ucs2
>2) a function to convert from ucs2

iconv() has another problem.

For example, the full-width minus sign (0x817c in SJIS) is converted to
  - U+2212 (MINUS SIGN) with iconv() in libiconv and Solaris 8
  - U+FF0D (FULLWIDTH HYPHEN-MINUS) in WideCharToMultiByte()
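
The same difference can be observed without iconv() or the Windows API.
The sketch below uses Python's built-in codecs as a stand-in, on the
assumption that its "shift_jis" codec follows the Unicode.org style
mapping and its "cp932" codec follows the Microsoft mapping. The helper
name code_point is made up for this example.

```python
FULLWIDTH_MINUS_SJIS = b"\x81\x7c"


def code_point(data: bytes, codec: str) -> str:
    """Decode a single character and return its code point as U+XXXX."""
    ch = data.decode(codec)
    return f"U+{ord(ch):04X}"


# Same two bytes, decoded under the two mapping traditions.
print("shift_jis:", code_point(FULLWIDTH_MINUS_SJIS, "shift_jis"))
print("cp932:    ", code_point(FULLWIDTH_MINUS_SJIS, "cp932"))
```

If the two lines print different code points, a name converted on one
side will not round-trip through the other side's table.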

The problem I pointed out is not a bug but a DESIGN matter. iconv() uses
Letterlike Symbols, which are "normal" Unicode characters, and follows
the Unicode.org mapping, which we can get from

WideCharToMultiByte() uses CJK Compatibility Ideographs to achieve
maximum compatibility with the traditional character sets.

>Eventually I think we may need to offer the use of iconv, but also
>allow for dynamic loading of better charset tables for some

To support "Windows-ized Japanese", unfortunately, such a special
system is necessary.
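
One possible shape for such a system is a small override table applied
on top of a generic conversion. The following is only a sketch of that
idea, not how Samba does or should do it: the table WINDOWS_OVERRIDES
and the function decode_windows_style are hypothetical, the table holds
just the one pair mentioned in this mail, and Python's "shift_jis"
codec stands in for the generic iconv-style mapping.

```python
# Hypothetical override table: code point produced by the generic
# mapping -> code point expected by Windows clients.
WINDOWS_OVERRIDES = {0x2212: 0xFF0D}


def decode_windows_style(data: bytes, base_codec: str = "shift_jis") -> str:
    """Decode with the generic codec, then apply the overrides."""
    return data.decode(base_codec).translate(WINDOWS_OVERRIDES)
```

A real table would need every character where the two mappings
disagree, and a matching table for the reverse direction.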

TAKAHASHI, Motonobu(monyo)         monyo at samba.org
Personal - http://home.monyo.com/
Samba Team - http://samba.org/     Samba-JP - http://www.samba.gr.jp/  
JWNTUG - http://www.jwntug.or.jp/  Analog-JP - http://www.jp.analog.cx/
