[cifs-protocol] Active Directory server sort Unicode normalization
douglas.bagnall at catalyst.net.nz
Tue Jan 19 00:04:36 UTC 2016
I am trying to improve server sort for Samba. Is there a
machine-readable version of the file "Active Directory Sort Table
v02.pdf" from http://www.microsoft.com/en-us/download/details.aspx?id=1175
available anywhere, and if so, under what license can it be used?
I'm thinking of something akin to the text files in
Alternatively, is one of these more or less the same as the PDF (which
*looks* to be the case)? And how are these licensed?
Or is there a description somewhere of how to derive these tables
from the Unicode documents?
Another question: does Active Directory have any way of sorting
characters outside the Basic Multilingual Plane? Following RFC 4518,
the character "🅌" ("SQUARED SD" https://codepoints.net/U+1F14C) would
be NFKC-normalized and sort equivalently to "SD", but I can't see how
Windows would deal with that.
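To make the RFC 4518 side of this concrete, here is a minimal Python sketch using only the standard unicodedata module. It shows what NFKC does to U+1F14C and what the character looks like as UTF-16 code units. The helper names nfkc and utf16_units are made up for illustration, and nothing here claims to reproduce what Active Directory actually does.

```python
import unicodedata

SQUARED_SD = "\U0001F14C"  # U+1F14C SQUARED SD, outside the BMP


def nfkc(text):
    """Return the NFKC normalization of text (hypothetical helper)."""
    return unicodedata.normalize("NFKC", text)


def utf16_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-be")
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


if __name__ == "__main__":
    # NFKC applies the compatibility decomposition <square> 0053 0044.
    print(nfkc(SQUARED_SD))
    # A 16-bit table keyed on single code units only ever sees two surrogates.
    print([hex(unit) for unit in utf16_units(SQUARED_SD)])
```

The second print is the crux of the question: a sort table indexed by values up to 0xFFFF has no entry for the character itself, only for the two surrogate code units.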
In general it looks like Windows *almost* follows the RFC, but retains
a bit of low-precedence information in the case weight field
(distinguishing e.g. the plain digit 1 from the superscript digit ¹)
that would be lost to strict NFKC normalization. It also stops at
0xFFFF. Is that a fair summary?
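As a rough illustration of the difference I mean, the Python sketch below compares a strict NFKC sort key with one that keeps a low-precedence tie-breaker. The tie-breaker here is simply the original string, which is an assumption chosen for brevity and not the Windows case weight field. Both function names are hypothetical.

```python
import unicodedata


def strict_key(text):
    """Strict NFKC key: '1' and superscript '¹' become indistinguishable."""
    return unicodedata.normalize("NFKC", text)


def tiebreak_key(text):
    """NFKC key first, then the original string as a low-precedence tie-breaker.

    The tie-breaker is an illustrative stand-in, not the Windows case weight.
    """
    return (unicodedata.normalize("NFKC", text), text)


if __name__ == "__main__":
    values = ["2", "\u00b9", "1"]  # "\u00b9" is SUPERSCRIPT ONE
    # With the strict key the two "ones" compare equal, so their relative
    # order just reflects the input order (Python's sort is stable).
    print(sorted(values, key=strict_key))
    # With the tie-breaker the plain digit sorts before the superscript.
    print(sorted(values, key=tiebreak_key))
```

With the strict key the plain and superscript digits compare equal; with the tie-breaker they stay adjacent but have a defined order, which is the behaviour the PDF table appears to describe.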