samba-3.0.0beta1 codeset issue on non-Linux

TAKAHASHI Motonobu monyo at
Sun Jun 15 18:39:43 GMT 2003

Steve Langasek wrote:
>Are you again speaking of pre-NT versions of Windows?  Does the Japanese
>edition of Win2K not use Unicode internally?

Both Windows 9x and Windows NT+ clients can send Unicode on the wire, so
character handling problems between Windows clients and Samba are
basically solved, I think. 

>It's my understanding that round-trip conversion of CJK characters only
>breaks if you use a different platform for each leg of the conversion.

Even when converting through Unicode, the round trip can easily break.
For example, take the character U+FF5E (FULLWIDTH TILDE). If I convert
U+FF5E from Unicode to EUC-JP and then convert the result back to
Unicode with GNU libiconv, it becomes U+301C (WAVE DASH), not U+FF5E.
This can happen if we put "unix charset = EUC-JP" in smb.conf.
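
To make the mismatch concrete, here is a small Python sketch. It is an
illustration only: Python's "cp932" codec stands in for the Microsoft
mapping table and its "shift_jis" and "euc_jp" codecs stand in for the
JIS X 0208 based tables. This is not the code path that Samba or GNU
libiconv actually takes, and the variable names are mine.

```python
def codepoint(ch: str) -> str:
    """Format a single character as U+XXXX."""
    return f"U+{ord(ch):04X}"

original = "\uff5e"  # FULLWIDTH TILDE, as a Windows client would send it

# Leg 1: encode with the Microsoft table (assumption: Python's cp932
# codec is used here as a stand-in for the Windows conversion).
sjis_bytes = original.encode("cp932")

# Leg 2: decode the very same bytes with the JIS X 0208 based table.
round_tripped = sjis_bytes.decode("shift_jis")

# The same JIS X 0208 cell in its EUC-JP form (bytes 0xA1 0xC1),
# decoded with the JIS X 0208 based EUC-JP table.
from_euc = b"\xa1\xc1".decode("euc_jp")

print(codepoint(original), "->", sjis_bytes.hex(), "->", codepoint(round_tripped))
print("EUC-JP a1c1 ->", codepoint(from_euc))
```

The character that comes back is not the one that went in, even though
every single step is a "valid" conversion.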

/* Of course there is no problem when converting on the Windows platform
  with MultiByteToWideChar() and WideCharToMultiByte(). */

So conversion problems still remain whenever Samba writes (non UTF-8)
filename strings to UNIX or reads them back.
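
One way to see which filenames are at risk is to test whether a name
survives a trip through the unix charset. The helper below is a
hypothetical sketch (survives_round_trip is my own name, not a Samba
function), and it reflects Python's conversion tables, which may differ
from the iconv implementation on a given server.

```python
def survives_round_trip(name: str, charset: str) -> bool:
    """Return True if name encodes to charset and decodes back unchanged."""
    try:
        return name.encode(charset).decode(charset) == name
    except UnicodeError:
        # The name cannot be represented in this charset at all.
        return False

# Sample names: plain ASCII and a Japanese filename.
for name in ("report.txt", "\u65e5\u672c\u8a9e.txt"):
    print(name, survives_round_trip(name, "euc_jp"))
```

A name that fails this check with the configured unix charset would not
come back from the filesystem as the same Unicode string.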

>If you have a mix of NT and pre-NT clients, however, it's probably
>best to disable unicode support on the server to ensure all filenames
>pass through the same set of conversions.

For Japanese, this breaks things because Windows 9x clients and
Windows NT/2000/XP clients send different codes for a particular
character in the DOS code pages, so we cannot simply pass filename
strings through. Of course, using Unicode solves this problem.

TAKAHASHI, Motonobu (monyo)                    monyo at
