[Draft #2] Samba 3.0 roadmap...idmap storage & central idmap
repository
Michael Sweet
mike at easysw.com
Tue Jul 9 13:31:06 GMT 2002
Simo Sorce wrote:
> ...
>>Also, some SMB clients are using UTF-16 now (superset of UCS-2 to
>>support code points in other Unicode planes) instead of UCS-2.
>
>
> which clients?
IIRC, MacOS X and Windows XP clients use UTF-16, although unless
you use characters outside the Basic Multilingual Plane (some
Chinese text, for example) you will never notice.
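
A minimal sketch of where the two encodings part ways, in Python. The
helper names utf16_units and is_plain_ucs2 are made up for illustration
and are not Samba code. Inside the BMP, UTF-16 and UCS-2 produce the same
16-bit units; outside it, UTF-16 emits a surrogate pair that a UCS-2-only
consumer would misread as two characters.

```python
def utf16_units(text):
    """Return the 16-bit code units of text encoded as little-endian UTF-16."""
    raw = text.encode("utf-16-le")
    return [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]


def is_plain_ucs2(units):
    """True if no unit is a surrogate, i.e. the data is also valid UCS-2."""
    return not any(0xD800 <= u <= 0xDFFF for u in units)


bmp_char = "\u4e2d"          # a CJK ideograph inside the BMP
non_bmp_char = "\U00020000"  # a CJK Extension B ideograph outside the BMP

print([hex(u) for u in utf16_units(bmp_char)])      # one 16-bit unit
print([hex(u) for u in utf16_units(non_bmp_char)])  # two units: a surrogate pair
```
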
> ...
>>In addition, no matter what Unicode representation is used, you
>>still have to deal with different representations of the "same"
>>character (is it a single character "a" with an umlaut, or "a"
>>plus a combining umlaut character?, etc.)
>
>
> If for that problem it does not matter which rep to use, then better go
> with the one that eases programming (and easily avoids lots of errors,
> especially in inside-string character or string search and
> uppercasing/lowercasing).
The issue is more that clients are free to send whichever
representation they want, and you may need to convert it to
whichever of the 4 Unicode normalization forms (NFC, NFD, NFKC,
NFKD) your local OS requires in order to do proper comparisons.
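
To make the representation problem concrete, here is a small Python
sketch using the standard unicodedata module. The same_text helper is
my own name for illustration; which form to normalize to is an
assumption that depends on the local OS.

```python
import unicodedata

precomposed = "\u00e4"   # "a" with an umlaut as a single code point
decomposed = "a\u0308"   # "a" followed by COMBINING DIAERESIS


def same_text(a, b, form="NFC"):
    """Compare two strings after converting both to one normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)


print(precomposed == decomposed)           # raw comparison of code points
print(same_text(precomposed, decomposed))  # comparison after normalization
```

The raw comparison fails even though both strings display identically;
normalizing both sides to the same form first makes them compare equal.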
To make life even more interesting, case comparisons are
locale-dependent. That is, "A" with an umlaut may not compare
equal to "a" with an umlaut in some locales (or shouldn't,
anyway).
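
A rough Python sketch of the locale dependence. I am swapping in the
well-known Turkish dotted/dotless "i" rather than the umlaut case, since
that is one I am sure of. The LOWER_OVERRIDES table and both function
names are hypothetical; a real implementation would take its case
mappings from the OS or a Unicode library rather than a hand-written
table.

```python
# Hypothetical per-locale lowercase overrides, for illustration only.
LOWER_OVERRIDES = {
    "tr": {"I": "\u0131", "\u0130": "i"},  # I -> dotless i, dotted capital I -> i
}


def lower_for_locale(text, locale):
    """Lowercase text, applying any overrides defined for the given locale."""
    overrides = LOWER_OVERRIDES.get(locale, {})
    return "".join(overrides.get(ch, ch.lower()) for ch in text)


def equal_ignoring_case(a, b, locale):
    """Case-insensitive comparison under the rules of one locale."""
    return lower_for_locale(a, locale) == lower_for_locale(b, locale)


print(equal_ignoring_case("I", "i", "en"))  # default rules
print(equal_ignoring_case("I", "i", "tr"))  # Turkish rules
```

The same pair of names matches under one locale and not under the other,
so the server cannot pick a single case-folding rule and expect it to
agree with every client.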
--
______________________________________________________________________
Michael Sweet, Easy Software Products mike at easysw.com
Printing Software for UNIX http://www.easysw.com