i18n question.

TAKAHASHI Motonobu monyo at home.monyo.com
Sat Mar 6 13:55:14 GMT 2004

First, I think the two matters below should be discussed separately.

|(2) Separate internal charset from "unix charset"
|(3) Suggest UCS-2 as the "internal charset"

I think using UTF-8 internally is acceptable, but anyway separating
Samba "internal charset" from "unix charset" is important.


For (2), are you discussing it mainly from the viewpoint of performance?
I think the stability of the code is also important. From the viewpoint
of stability, separating "internal charset" from "unix charset" is
better. Remember that "unix charset" is variable, not always UTF-8.

From the viewpoint of performance, would using UTF-8 as both "unix
charset" and "internal charset", for example, solve the problem?
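
To illustrate why "unix charset" cannot be assumed to be UTF-8, here is
a minimal Python sketch (not Samba code). The file name and the list of
charsets are my own example values; it only shows that the same name
becomes different byte strings depending on the configured charset.

```python
# Illustrative only: the file name and charset list are assumed examples,
# not values taken from Samba.
def encode_in_unix_charsets(name, charsets):
    """Return {charset: bytes} for one file name in several charsets."""
    return {cs: name.encode(cs) for cs in charsets}

name = "日本語.txt"
encoded = encode_in_unix_charsets(name, ["utf-8", "euc_jp", "cp932"])

for charset, raw in encoded.items():
    # Each charset yields its own byte sequence for the same name.
    print(charset, len(raw), raw.hex())
```

Code that treats the on-disk bytes as if they were always UTF-8 would
misread the EUC-JP and CP932 forms, which is the stability concern.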


For (3), assuming that we separate "internal charset" from "unix

tridge at samba.org wrote:

| > (3) Suggest UCS-2 as the "internal charset"
| >   The internal charset should be any of Unicode.
| >   Currently UCS-2 is better than UTF-8, because UCS-2 is a charset
| >   sent from Windows.
|As I said before, UCS-2 is dead. My understanding is that Microsoft
|have already switched over to sending UTF-16 on the wire. If you have
|evidence that this isn't the case then please let me know.

Sorry, I cannot find any evidence that UTF-16 is used on the wire.

As the URL says:

Windows uses UTF-16 internally, but I think SMB does not use...
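
For reference, the practical difference between UCS-2 and UTF-16 shows
up only for characters outside the Basic Multilingual Plane. The sketch
below is plain Python, not Samba code, and the helper name
fits_in_ucs2 is hypothetical. It splits a little-endian UTF-16 byte
string into 16-bit units and checks for surrogates, which UCS-2 cannot
represent.

```python
# Illustrative only: fits_in_ucs2 is a hypothetical helper, not a Samba API.
def utf16le_units(text):
    """Encode text as UTF-16LE and return its 16-bit code units."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]

def fits_in_ucs2(text):
    """True if no surrogate code unit (0xD800-0xDFFF) is needed."""
    return all(not (0xD800 <= u <= 0xDFFF) for u in utf16le_units(text))

bmp_char = "\u3042"        # HIRAGANA LETTER A, inside the BMP
non_bmp_char = "\U00020BB7"  # a CJK Extension B ideograph, outside the BMP

print([hex(u) for u in utf16le_units(bmp_char)])      # one 16-bit unit
print([hex(u) for u in utf16le_units(non_bmp_char)])  # a surrogate pair
print(fits_in_ucs2(bmp_char), fits_in_ucs2(non_bmp_char))
```

So a capture containing surrogate pairs in a name would be one kind of
evidence that UTF-16 rather than UCS-2 is in use.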

Anyway, my opinion is that

> Simply I suggest using same charset as Windows uses on the wire.

So if SMB currently supports UTF-16, then using UTF-16 is better
because it avoids a code conversion between the wire and Samba's
internal representation. UTF-8 is also acceptable, but it needs one
more code conversion:

  Windows -----------> Samba -(Convert)-> Filesystem
    UTF-16             UTF-16             Unix charset

  Windows -(Convert)-> Samba -(Convert)-> Filesystem
    UTF-16             UTF-8              Unix charset
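
The two paths above can be sketched in Python as follows. This is only
a model of the conversion count, not how Samba is implemented: the
function names, the conversion log, and the choice of EUC-JP as the
"unix charset" are all assumptions made for the example.

```python
# Illustrative model only: names and the EUC-JP choice are assumptions.
def convert(data, src, dst, log):
    """Convert bytes from charset src to dst, recording the step in log."""
    log.append((src, dst))
    return data.decode(src).encode(dst)

def store_with_utf16_internal(wire, unix_charset):
    log = []
    internal = wire  # wire format is kept as-is, so no conversion here
    on_disk = convert(internal, "utf-16-le", unix_charset, log)
    return on_disk, log

def store_with_utf8_internal(wire, unix_charset):
    log = []
    internal = convert(wire, "utf-16-le", "utf-8", log)      # wire -> internal
    on_disk = convert(internal, "utf-8", unix_charset, log)  # internal -> disk
    return on_disk, log

wire_name = "日本語.txt".encode("utf-16-le")
disk_a, steps_a = store_with_utf16_internal(wire_name, "euc_jp")
disk_b, steps_b = store_with_utf8_internal(wire_name, "euc_jp")

print(len(steps_a), steps_a)
print(len(steps_b), steps_b)
print(disk_a == disk_b)
```

Both paths end with the same bytes on disk; the UTF-8 internal path
simply records one more conversion step than the UTF-16 internal path.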

TAKAHASHI, Motonobu (monyo)                    monyo at home.monyo.com
