[Draft #2] Samba 3.0 roadmap...idmap storage & central idmap repository

Michael Sweet mike at easysw.com
Tue Jul 9 13:31:06 GMT 2002


Simo Sorce wrote:
> ...
>>Also, some SMB clients are using UTF-16 now (superset of UCS-2 to
>>support code points in other Unicode planes) instead of UCS-2.
> 
> 
> which clients?

IIRC, MacOS X and Windows XP clients use UTF-16, although you will
never notice unless you use characters outside the Basic
Multilingual Plane (e.g. some of the rarer Chinese characters).
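
To illustrate the difference, here is a minimal Python sketch (the
helper name utf16_units is made up for this example, it is not part
of Samba). For a character inside the BMP the UTF-16 encoding is the
same single 16-bit unit UCS-2 would use; a character outside the BMP
becomes a surrogate pair, which a UCS-2-only reader would see as two
separate "characters".

```python
# Sketch only: utf16_units is a hypothetical helper for illustration.

def utf16_units(text):
    """Return the 16-bit code units of the UTF-16 (big-endian) encoding."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

bmp_char = "\u00e4"          # "a" with an umlaut, inside the BMP
astral_char = "\U00020000"   # a CJK Extension B ideograph, outside the BMP

bmp_units = utf16_units(bmp_char)        # one unit, identical to UCS-2
astral_units = utf16_units(astral_char)  # two units: a surrogate pair

print([hex(u) for u in bmp_units])
print([hex(u) for u in astral_units])
```

Any code that assumes "one 16-bit unit == one character" (string
length, indexing, truncation) is only safe for the first case.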

> ...
>>In addition, no matter what Unicode representation is used, you
>>still have to deal with different representations of the "same"
>>character (is it a single character "a" with an umlaut, or "a"
>>plus a combining umlaut character?, etc.)
> 
> 
> If for that problem it does not matter which rep to use, then better go
> with the one that eases programming (and easily avoids lots of errors,
> especially in in-string character or string searches and
> uppercasing/lowercasing).

The issue is more that clients are free to send whichever
representation they want, and you may need to convert it to
whichever of the 4 Unicode normalization forms (NFC, NFD, NFKC,
NFKD) your local OS requires in order to do proper comparisons.
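
As a rough sketch of what that conversion looks like, the Python
snippet below uses the standard unicodedata module. The function name
same_text and the default choice of NFC are assumptions made for the
example; which form you actually need depends on the local OS.

```python
import unicodedata

composed = "\u00e4"      # single code point: "a" with an umlaut
decomposed = "a\u0308"   # "a" followed by a combining diaeresis

def same_text(a, b, form="NFC"):
    """Compare two strings after converting both to one normalization form."""
    return unicodedata.normalize(form, a) == unicodedata.normalize(form, b)

# The raw strings differ even though they render as the same character.
print(composed == decomposed)
# After normalizing both sides to the same form they compare equal.
print(same_text(composed, decomposed))
```

The important part is that both sides go through the same form before
the comparison; comparing the raw strings sent by two different
clients is not reliable.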

To make life even more interesting, case-insensitive comparisons
are locale-dependent.  That is, "A" with an umlaut may not compare
equal to "a" with an umlaut in some locales (or shouldn't, anyway).
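
The best-known instance of this is not the umlaut but the Turkish
dotted/dotless "i", so the sketch below uses that instead. The
function names are made up for the example, and lower_turkish is a
hand-rolled approximation of Turkish lowercasing, not a real locale
API.

```python
def lower_default(text):
    """Locale-independent lowercasing (Python's built-in behaviour)."""
    return text.lower()

def lower_turkish(text):
    """Hypothetical helper: Turkish maps "I" to dotless i and dotted "I" to "i"."""
    return text.replace("I", "\u0131").replace("\u0130", "i").lower()

def equal_ignoring_case(a, b, lower=lower_default):
    """Case-insensitive comparison using the given lowercasing rule."""
    return lower(a) == lower(b)

# Same pair of strings, two different answers depending on the rule.
print(equal_ignoring_case("TITLE", "title"))
print(equal_ignoring_case("TITLE", "title", lower_turkish))
```

So a server cannot pick one case-folding rule and expect it to match
what every client's locale considers "the same name".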

-- 
______________________________________________________________________
Michael Sweet, Easy Software Products                  mike at easysw.com
Printing Software for UNIX                       http://www.easysw.com


More information about the samba-technical mailing list