[Draft #2] Samba 3.0 roadmap...idmap storage & central idmap repository

Michael Sweet mike at easysw.com
Tue Jul 9 13:31:06 GMT 2002

Simo Sorce wrote:
> ...
>>Also, some SMB clients are using UTF-16 now (superset of UCS-2 to
>>support code points in other Unicode planes) instead of UCS-2.
> which clients?

IIRC, Mac OS X and Windows XP clients use UTF-16, although you
will never notice unless you use characters outside the Basic
Multilingual Plane (some Chinese characters, for example).
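
To illustrate the difference, here is a minimal Python sketch (the
helper name utf16_code_units is made up for this example). It shows
that a character inside the Basic Multilingual Plane takes one 16-bit
unit in UTF-16, exactly as in UCS-2, while a character in another
plane takes a surrogate pair that UCS-2 cannot represent:

```python
def utf16_code_units(text):
    """Return the 16-bit code units of text encoded as UTF-16 (little-endian)."""
    data = text.encode("utf-16-le")
    return [int.from_bytes(data[i:i + 2], "little")
            for i in range(0, len(data), 2)]

bmp_char = "\u4e2d"          # CJK ideograph inside the BMP
astral_char = "\U00020000"   # CJK Extension B ideograph, outside the BMP

bmp_units = utf16_code_units(bmp_char)        # one unit, same as UCS-2
astral_units = utf16_code_units(astral_char)  # two units: a surrogate pair

print([hex(u) for u in bmp_units])
print([hex(u) for u in astral_units])
```

Code that assumes one 16-bit unit per character keeps working until
such a pair shows up, which is why most users never see a difference.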

> ...
>>In addition, no matter what Unicode representation is used, you
>>still have to deal with different representations of the "same"
>>character (is it a single character "a" with an umlaut, or "a"
>>plus a combining umlaut character?, etc.)
> If for that problem it does not matter which rep to use, then better go
> with the one that eases programming (and easily avoids lots of errors,
> especially in inside-string character or string search and
> uppercasing/lowercasing).

The issue is more that clients are free to provide whichever
representation they want, and you may need to convert it to one
of the four Unicode normalization forms (NFC, NFD, NFKC, or NFKD),
whichever your local OS requires, in order to do the proper
comparisons.
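
As a rough sketch of what that conversion looks like, the following
Python uses the standard unicodedata module. The helper same_text is
a made-up name for this example, and which form a given OS or file
system actually wants is left open here:

```python
import unicodedata

precomposed = "\u00e4"   # "a" with umlaut as a single code point
decomposed = "a\u0308"   # "a" followed by a combining diaeresis

def same_text(a, b, form="NFC"):
    """Compare two strings after converting both to one normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

# The raw strings differ even though they display identically.
print(precomposed == decomposed)

# After normalization to any one of the four forms they compare equal.
for form in ("NFC", "NFD", "NFKC", "NFKD"):
    print(form, same_text(precomposed, decomposed, form))
```

The point is that both sides of a comparison have to be brought to
the same form first; comparing the bytes the client sent is not enough.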

To make life even more interesting, case comparisons are
locale-dependent.  That is, "A" with an umlaut may not compare
equal to "a" with an umlaut in some locales (or shouldn't,
anyway).
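
A small Python sketch of the idea follows. It uses the well-known
Turkish dotted/dotless "i" rule rather than the umlaut case above,
since that one is easy to show without any OS locale support. The
function names and the "lang" parameter are made up for this example;
a real implementation would more likely rely on the platform's locale
facilities or a library such as ICU:

```python
def fold_for_locale(text, lang):
    """Case-fold text, applying Turkish/Azerbaijani rules when requested."""
    if lang in ("tr", "az"):
        # In these languages "I" lowercases to dotless U+0131 and
        # dotted capital U+0130 lowercases to plain "i".
        text = text.replace("\u0130", "i").replace("I", "\u0131")
    return text.casefold()

def equal_ignoring_case(a, b, lang="en"):
    """Case-insensitive comparison under the folding rules of one language."""
    return fold_for_locale(a, lang) == fold_for_locale(b, lang)

print(equal_ignoring_case("FILE", "file", "en"))        # same name in English
print(equal_ignoring_case("FILE", "file", "tr"))        # different in Turkish
print(equal_ignoring_case("FILE", "f\u0131le", "tr"))   # dotless i matches "I"
```

So the same pair of names can match for one client and not for
another, depending on which locale's rules the server applies.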

Michael Sweet, Easy Software Products                  mike at easysw.com
Printing Software for UNIX                       http://www.easysw.com
