Doubts about Samba's unicode translation tables

Xavi Hernandez xhernandez at gmail.com
Fri Apr 19 09:04:40 UTC 2024


Hi all,

I'm currently trying to integrate Samba with CephFS, and one of the
important things to improve is accessing CephFS files in a case-insensitive
way without needing to scan the entire directory from smbd.

During this work I've found that Samba does the case-insensitive comparison
using two UTF-16 translation tables (one that converts to uppercase and
another that converts to lowercase).
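
To make clear what I mean by a table-based comparison, here is a minimal
sketch in Python. It is only an illustration, not Samba's code: the table is
a stand-in built from Python's own Unicode data, and the function names are
made up for this example.

```python
import struct


def build_upcase_table():
    """Return a 65536-entry list mapping each UTF-16 code unit to uppercase.

    Assumption: the mapping comes from Python's Unicode data. It is neither
    Samba's table nor the contents of $UpCase.
    """
    table = list(range(0x10000))
    for unit in range(0x10000):
        if 0xD800 <= unit <= 0xDFFF:
            continue  # surrogate halves are left unchanged
        upper = chr(unit).upper()
        # Keep only one-to-one mappings that still fit in one code unit.
        if len(upper) == 1 and ord(upper) < 0x10000:
            table[unit] = ord(upper)
    return table


def utf16_units(name):
    """Split a string into its UTF-16 code units."""
    raw = name.encode("utf-16-le")
    return struct.unpack("<%dH" % (len(raw) // 2), raw)


def names_equal_ignore_case(a, b, upcase):
    """Compare two names unit by unit after mapping both through the table."""
    ua, ub = utf16_units(a), utf16_units(b)
    if len(ua) != len(ub):
        return False
    return all(upcase[x] == upcase[y] for x, y in zip(ua, ub))
```

In this sketch an equality test only ever consults the uppercase table,
because both names are mapped in the same direction before being compared.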

Looking at how NTFS does the same thing, I've found that it also uses a
UTF-16 table, stored in the special $UpCase file located in the root of
the volume.

The first question is: why does Samba use two tables while Windows only
requires one?
What is the lowercase translation table used for in Samba?
Is Samba's case-insensitive comparison method actually equivalent to the
one Windows uses?

I've also extracted the $UpCase file from a Windows 11 machine and found
that Samba's uppercase table is very similar but not identical (339 entries
have different values). Is this expected?
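
A comparison like this can be sketched as follows, assuming each table is
available as a raw dump of 65536 little-endian 16-bit values. That layout
and the file names in the usage comment are assumptions made for the
example.

```python
import struct

TABLE_ENTRIES = 0x10000  # one entry per UTF-16 code unit


def load_table(raw):
    """Decode a raw dump of 65536 little-endian 16-bit values."""
    if len(raw) != TABLE_ENTRIES * 2:
        raise ValueError("expected %d bytes, got %d"
                         % (TABLE_ENTRIES * 2, len(raw)))
    return struct.unpack("<%dH" % TABLE_ENTRIES, raw)


def diff_tables(a, b):
    """Return (code_unit, value_in_a, value_in_b) for every differing entry."""
    return [(unit, x, y)
            for unit, (x, y) in enumerate(zip(a, b))
            if x != y]


# Usage with hypothetical file names:
#   with open("upcase_windows.bin", "rb") as f:
#       windows = load_table(f.read())
#   with open("upcase_samba.bin", "rb") as f:
#       samba = load_table(f.read())
#   for unit, w, s in diff_tables(windows, samba):
#       print("U+%04X: windows=U+%04X samba=U+%04X" % (unit, w, s))
```

Printing the differing code units rather than just counting them makes it
easier to see which ranges of characters the two tables disagree on.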

I'm new to Samba, so I will be very grateful for any insights you might
give me about how the Unicode tables work in Samba and any other important
details related to case-insensitive access.

Best regards,

Xavi


More information about the samba-technical mailing list