monyo at home.monyo.com
Sat Mar 6 13:55:14 GMT 2004
First, I think the following two matters should be discussed separately.
|(2) Separate internal charset from "unix charset"
|(3) Suggest UCS-2 as the "internal charset"
I think using UTF-8 internally is acceptable, but in any case separating
Samba's "internal charset" from "unix charset" is important.
For (2), are you discussing it mainly from the viewpoint of performance?
I think the stability of the code is also important. From the viewpoint
of stability, separating "internal charset" from "unix charset" is
better. Remember that "unix charset" is variable, not always UTF-8.
From the viewpoint of performance, wouldn't using UTF-8 as both "unix
charset" and "internal charset", for example, solve the problem?
For (3), assuming that we separate "internal charset" from "unix
tridge at samba.org wrote:
| > (3) Suggest UCS-2 as the "internal charset"
| > The internal charset should be any of Unicode.
| > Currently UCS-2 is better than UTF-8, because UCS-2 is a charset
| > sent from Windows.
|As I said before, UCS-2 is dead. My understanding is that Microsoft
|have already switched over to sending UTF-16 on the wire. If you have
|evidence that this isn't the case then please let me know.
Sorry, I cannot find any evidence that UTF-16 is used on the wire.
As the URL says:
Windows uses UTF-16 internally, but I think SMB does not use...
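
The UCS-2 versus UTF-16 question only becomes visible for characters outside the Basic Multilingual Plane: for BMP text the two encodings produce identical bytes, while UTF-16 needs a surrogate pair (two 16-bit units) for anything above U+FFFF. The following is a minimal Python sketch of that difference. The helper names and sample file names are hypothetical, and it says nothing about what SMB actually sends on the wire.

```python
def utf16_code_units(text: str) -> int:
    """Number of 16-bit code units needed to encode text as UTF-16."""
    return len(text.encode("utf-16-le")) // 2


def fits_in_ucs2(text: str) -> bool:
    """True if every character lies in the BMP, so one 16-bit unit each suffices."""
    return all(ord(ch) <= 0xFFFF for ch in text)


# Hypothetical file names used only for illustration.
bmp_name = "\u65e5\u672c\u8a9e.txt"   # three BMP kanji plus ".txt"
astral_name = "\U00020BB7.txt"        # U+20BB7 lies outside the BMP

for name in (bmp_name, astral_name):
    print(len(name), utf16_code_units(name), fits_in_ucs2(name))
```

For the first name the character count and the code-unit count agree, so a UCS-2 reader and a UTF-16 reader see the same thing. For the second name UTF-16 uses one more code unit than there are characters, which is the case a UCS-2-only implementation cannot represent.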
Anyway, my opinion is that
> Simply I suggest using the same charset as Windows uses on the wire.
So if SMB currently supports UTF-16, then using UTF-16 is better
because it avoids a code conversion between the wire and Samba's
internal representation. UTF-8 is also acceptable, but it needs one
more code conversion:

  Windows -----------> Samba -(Convert)-> Filesystem
  UTF-16               UTF-16             Unix charset

  Windows -(Convert)-> Samba -(Convert)-> Filesystem
  UTF-16               UTF-8              Unix charset

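
The two pipelines above can be sketched in Python to make the conversion count concrete. This is only an illustration under stated assumptions: `convert` is a hypothetical stand-in for an iconv-style call (it is not Samba's API), the wire format is assumed to be UTF-16LE, and EUC-JP is picked as an example of a non-UTF-8 "unix charset".

```python
from typing import Tuple


def convert(data: bytes, src: str, dst: str) -> bytes:
    """Re-encode bytes from charset src to charset dst (iconv-style stand-in)."""
    return data.decode(src).encode(dst)


def via_utf16_internal(wire: bytes, unix_charset: str) -> Tuple[bytes, int]:
    """Internal charset equals the wire charset: one conversion in total."""
    internal = wire  # nothing to convert between the wire and Samba
    on_disk = convert(internal, "utf-16-le", unix_charset)
    return on_disk, 1


def via_utf8_internal(wire: bytes, unix_charset: str) -> Tuple[bytes, int]:
    """Internal charset is UTF-8: two conversions in total."""
    internal = convert(wire, "utf-16-le", "utf-8")
    on_disk = convert(internal, "utf-8", unix_charset)
    return on_disk, 2


# Hypothetical file name as it would arrive on the wire (assumed UTF-16LE).
name = "\u65e5\u672c\u8a9e.txt"
wire = name.encode("utf-16-le")

disk_a, steps_a = via_utf16_internal(wire, "euc_jp")
disk_b, steps_b = via_utf8_internal(wire, "euc_jp")
print(disk_a == disk_b, steps_a, steps_b)
```

Both paths produce the same bytes for the filesystem; the difference is only the number of conversions. When "unix charset" is itself UTF-8, the second step of the UTF-8 pipeline leaves the bytes unchanged.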
TAKAHASHI, Motonobu (monyo) monyo at home.monyo.com