samba-3.0.0beta1 codeset issue on non-Linux
TAKAHASHI Motonobu
monyo at home.monyo.com
Sun Jun 15 18:39:43 GMT 2003
Steve Langasek wrote:
>Are you again speaking of pre-NT versions of Windows? Does the Japanese
>edition of Win2K not use Unicode internally?
Both Windows 9x and Windows NT+ clients can send Unicode on the wire, so
character handling problems between Windows clients and Samba are
basically solved, I think.
>It's my understanding that round-trip conversion of CJK characters only
>breaks if you use a different platform for each leg of the conversion.
A round trip through Unicode can break easily, even on a single
platform. For example, take the character U+FF5E (FULLWIDTH TILDE).
If I convert U+FF5E from Unicode to EUC-JP and then convert the result
back to Unicode with GNU libiconv, it becomes U+301C (WAVE DASH), not
U+FF5E. This can happen if we put "unix charset = EUC-JP" in smb.conf.
/* Of course, there is no problem when converting on the Windows
platform with MultiByteToWideChar() and WideCharToMultiByte(). */
So the conversion problems still remain when Samba writes/reads
(non-UTF-8) filename strings to/from UNIX.
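
To make the table mismatch concrete, here is a minimal sketch in Python.
It is only an illustration under assumptions: Python's bundled codecs
stand in for the conversion tables, they are not GNU libiconv or Samba's
own code, and the helper name roundtrip() is made up for this example.
The "cp932" codec follows Microsoft's table, while "shift_jis" and
"euc_jp" follow the JIS X 0208 table, so the same byte pair comes back
as a different Unicode character depending on which table decodes it.

```python
# Sketch only: Python codecs stand in for the two mapping tables.
#   cp932              -> Microsoft's table (0x8160 <-> U+FF5E)
#   shift_jis, euc_jp  -> JIS X 0208 table  (row 1, cell 33 <-> U+301C)

FULLWIDTH_TILDE = "\uff5e"
WAVE_DASH = "\u301c"


def roundtrip(text, encode_with, decode_with):
    """Hypothetical helper: encode with one codec, decode with another."""
    return text.encode(encode_with).decode(decode_with)


# Same table on both legs: the character survives.
same = roundtrip(FULLWIDTH_TILDE, "cp932", "cp932")

# Microsoft table on the way out, JIS table on the way back:
# the character silently becomes a different code point.
mixed = roundtrip(FULLWIDTH_TILDE, "cp932", "shift_jis")

# The EUC-JP bytes for the same JIS X 0208 cell also decode to WAVE DASH.
euc = b"\xa1\xc1".decode("euc_jp")

for label, ch in (("same", same), ("mixed", mixed), ("euc_jp", euc)):
    print(f"{label}: U+{ord(ch):04X}")
```

A filename written with one table and read back with the other no longer
compares equal to the name the client sent, which is the kind of
breakage described above.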
>If you have a mix of NT and pre-NT clients, however, it's probably
>best to disable unicode support on the server to ensure all filenames
>pass through the same set of conversions.
In Japanese, this breaks things because Windows 9x clients and
Windows NT/2000/XP clients send different codes for a particular
character in the DOS code pages. So we cannot simply pass filename
strings through. Of course, with Unicode this problem is solved.
-----
TAKAHASHI, Motonobu (monyo) monyo at home.monyo.com
http://www.monyo.com/
More information about the samba-technical mailing list