What does Japanese Samba fix?

TAKAHASHI Motonobu monyo at home.monyo.com
Fri Feb 8 10:01:06 GMT 2002


Shirish Kalele wrote:

>From what I could make out, these are the main changes the Japanese
>edition of Samba has:
>
>1. SWAT pages are i18n'ed (based on the HTTP language negotiation).
>
>2. Japanese documentation has been included.
>
>3. Support for user-defined characters. (I didn't understand what you
>mean by "user-defined")

I think

3. Support for all Japanese characters (including "user-defined
   characters")

is more accurate.

Anyway, basically you are right.

* The Japanese character set includes "vendor-defined characters"
  and "user-defined characters" for historical reasons.
  As you may know, there are a great many KANJI characters, more
  than 60000 or 70000.

  Some of them (more than 6300 chars) are standardized by JIS
  (Japanese Industrial Standards). In those days, however, every
  vendor such as NEC, Sony, DEC, IBM, etc. defined many more KANJI
  chars as "vendor-defined characters". In addition, some areas are
  reserved for users to define their own characters; these are the
  "user-defined characters".

  If you are interested in this, O'Reilly's "Understanding Japanese
  Information Processing" will give you useful information.

>The thing I'm particularly interested in, is the fact that some kanji
>characters in filenames are not handled correctly by English Samba
>edition.

Unfortunately no.

  The problems are mainly caused by the following:

  1) Japanese strings are not expected in some places.

    For example: domain names, computer names, volume names,
    command (e.g. smbclient) input/output, and so on.

    More testing is needed to see whether non-ASCII characters
    are supported there.

  2) Problems specific to Japanese

    Some characters are sent on the wire with different character
    codes by Windows 9x/Me and by Windows NT/2000/XP. We have to
    recognize such different codes as the same character. This routine
    is included in Japanese Samba but not in original Samba.
    This problem will be partly fixed if UCS-2 is used on the wire.
    Even if we set the "use UCS-2" bit, some strings are still sent in
    the ANSI character set, so this problem may still occur.

    There are some special "KANJI" characters, such as Roman numerals,
    Cyrillic and Greek letters, and other non-English alphabets, which
    have "case" information.
    Basically KANJI characters do not have "case", but these special
    "KANJI" characters have case information *only* on Japanese
    Windows NT/2000/XP :(
    This routine is partly included in original Samba.

  3) is_sjis() cannot recognize user-defined characters as Japanese.
    This problem is fixed in Samba 2.2.3.
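
To illustrate 3), here is a minimal sketch in Python of a Shift_JIS
lead-byte test. It is not the actual is_sjis() from Samba; the function
names and the include_extensions flag are hypothetical. It assumes the
Windows code page 932 layout, where lead bytes 0xF0-0xF9 hold the
user-defined characters and 0xFA-0xFC hold the IBM extension characters.

```python
def is_sjis_lead_byte(b: int, include_extensions: bool = True) -> bool:
    """Return True if byte value b can start a two-byte Shift_JIS character.

    Hypothetical helper for illustration only.
    """
    # Lead bytes used by the JIS X 0208 part of Shift_JIS.
    if 0x81 <= b <= 0x9F or 0xE0 <= b <= 0xEF:
        return True
    # Assumption: CP932 layout. 0xF0-0xF9 = user-defined characters,
    # 0xFA-0xFC = IBM extension (vendor-defined) characters.
    return include_extensions and 0xF0 <= b <= 0xFC


def is_sjis_trail_byte(b: int) -> bool:
    """Return True if byte value b can be the second byte of such a character."""
    return 0x40 <= b <= 0xFC and b != 0x7F


# 0xF040 is the first code in the user-defined area.
print(is_sjis_lead_byte(0xF0, include_extensions=False))  # False
print(is_sjis_lead_byte(0xF0))                            # True
```

A test limited to the first two ranges treats a user-defined character
as two unrelated single bytes, which is the kind of failure described
in 3).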

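Going back to 2), the two points can be sketched in Python as well.
This is only an illustration, not the routine in Japanese Samba. It
assumes that the "different character codes" are duplicate encodings
such as the Roman numeral one, which code page 932 contains both in
the NEC area (0x8754) and in the IBM extension area (0xFA4A), and it
uses Unicode case folding as a stand-in for the case table of Japanese
Windows NT/2000/XP, which may differ. The function names are
hypothetical.

```python
CODEC = "cp932"  # Python's name for the Windows Japanese "ANSI" code page


def same_character(a: bytes, b: bytes) -> bool:
    """Compare two CP932 byte strings after mapping both to Unicode.

    Duplicate codes for one character decode to the same code point,
    so they compare equal here even though the raw bytes differ.
    """
    return a.decode(CODEC) == b.decode(CODEC)


def same_ignoring_case(a: bytes, b: bytes) -> bool:
    """Case-insensitive comparison using Unicode case folding.

    Assumption: Unicode folding is only an approximation of what the
    Windows side does for Roman numerals, Greek and Cyrillic letters.
    """
    return a.decode(CODEC).casefold() == b.decode(CODEC).casefold()


nec_roman_one = b"\x87\x54"  # Roman numeral one, NEC area
ibm_roman_one = b"\xfa\x4a"  # Roman numeral one, IBM extension area

print(nec_roman_one == ibm_roman_one)                 # raw bytes differ
print(same_character(nec_roman_one, ibm_roman_one))   # compared via Unicode

upper_alpha = "\u0391".encode(CODEC)  # Greek capital alpha
lower_alpha = "\u03b1".encode(CODEC)  # Greek small alpha
print(same_ignoring_case(upper_alpha, lower_alpha))
```

Going through Unicode is also why using UCS-2 on the wire helps with
the first point: both codes become one code point before comparison.
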
-----
TAKAHASHI, Motonobu(monyo)         monyo at samba.gr.jp
Samba Team - http://samba.org/     Samba-JP - http://www.samba.gr.jp/  
JWNTUG - http://www.jwntug.or.jp/  Analog-JP - http://www.jp.analog.cx/
MCSE(NT40,W2K), SCNA, CCNA, Turbo-CI




More information about the samba-technical mailing list