samba-3.0.0beta1 codeset issue on non-Linux

TAKAHASHI Motonobu monyo at home.monyo.com
Sun Jun 15 18:39:43 GMT 2003


Steve Langasek wrote:
>Are you again speaking of pre-NT versions of Windows?  Does the Japanese
>edition of Win2K not use Unicode internally?

Both Windows 9x and Windows NT+ clients can send Unicode on the wire, so
I think character handling problems between Windows clients and Samba
are basically solved.

>It's my understanding that round-trip conversion of CJK characters only
>breaks if you use a different platform for each leg of the conversion.

A round trip through Unicode can break easily, even on one platform.
For example, take the character U+FF5E (FULLWIDTH TILDE). If I convert
U+FF5E from Unicode to EUC-JP and then convert the result back to
Unicode with GNU libiconv, it becomes U+301C (WAVE DASH), not U+FF5E.
This case can occur if we put "unix charset = EUC-JP" in smb.conf.
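
The underlying mismatch can be seen without Samba. The sketch below is
only an illustration and makes several assumptions: it uses Python's
built-in codecs rather than GNU libiconv, so the exact tables may differ
from what Samba sees, and the helper name round_trip() is hypothetical.
It shows that the Microsoft table (cp932) and the JIS-based tables
(shift_jis, euc_jp) assign the same JIS X 0208 position to two different
Unicode characters.

```python
# Illustration only: Python's built-in codecs, not GNU libiconv or Samba.
FULLWIDTH_TILDE = "\uff5e"  # what the Microsoft table (cp932) uses
WAVE_DASH = "\u301c"        # what the JIS-based tables use


def round_trip(text, encode_with, decode_with):
    """Encode text with one codec and decode the bytes with another."""
    return text.encode(encode_with).decode(decode_with)


# Both legs use the Microsoft table: the character survives.
same = round_trip(FULLWIDTH_TILDE, "cp932", "cp932")

# Second leg uses the JIS-based table: the character changes.
changed = round_trip(FULLWIDTH_TILDE, "cp932", "shift_jis")

# The EUC-JP bytes for the same JIS X 0208 position decode the same way.
from_euc = b"\xa1\xc1".decode("euc_jp")

print("cp932 -> cp932:     U+%04X" % ord(same))
print("cp932 -> shift_jis: U+%04X" % ord(changed))
print("euc_jp 0xA1C1:      U+%04X" % ord(from_euc))
```

With these codecs, the mixed-table trip returns U+301C rather than the
U+FF5E that went in, which is the same kind of substitution described
above.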

/* Of course, there is no problem when converting on the Windows platform
  with MultiByteToWideChar() and WideCharToMultiByte(). */

So conversion problems still remain when Samba writes (non-UTF-8)
filename strings to UNIX or reads them back from it.

>If you have a mix of NT and pre-NT clients, however, it's probably
>best to disable unicode support on the server to ensure all filenames
>pass through the same set of conversions.

For Japanese, this breaks things, because Windows 9x clients and
Windows NT/2000/XP clients send different codes for a particular
character in the DOS code pages. So we cannot simply pass filename
strings through. Of course, using Unicode solves this problem.

-----
TAKAHASHI, Motonobu (monyo)                    monyo at home.monyo.com
                                               http://www.monyo.com/



More information about the samba-technical mailing list