Managing DNs in libads only in utf8

Jeremy Allison jra at samba.org
Tue Feb 27 07:14:24 GMT 2007


On Tue, Feb 27, 2007 at 01:50:23AM -0500, simo wrote:
> Hello technical people,
> 
> after a report about a possible problem with how we manage DNs,
> I discovered we currently may have some problems in case "unix charset"
> is not set to UTF-8 and we are using security = ads. *
> 
> The problem is that we always convert everything coming out of ldap to
> the local unix charset and then we convert** it back to utf8 before using
> it (see ads_get_dn()).
> 
> The problem in doing this is that we convert some DN this way:
> utf8 -> local -> utf8
> 
> If the local unix charset is not able to represent one of the characters
> of the DN, we actually corrupt the DN by doing the double conversion.

OK, what kinds of things break here?

This is *exactly* the same problem that
people have with filenames/usernames when
using SJIS or EUC (Japanese character sets)
as the Samba unix charset when mixing with
Windows clients that might send UTF16 names
that cannot be represented in the SJIS or EUC
charset the server is using.
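
A rough sketch of the double conversion from the quoted text, written in
Python rather than Samba's own C. The function name round_trip, the sample
DNs, and the choice to substitute '?' for characters the local charset
cannot hold are all assumptions made for illustration; what libads actually
does on a failed conversion may differ.

```python
def round_trip(dn_utf8: bytes, unix_charset: str) -> bytes:
    """Convert utf8 -> unix charset -> utf8, as in the quoted description.

    Characters the unix charset cannot represent become '?'
    (an assumption for this sketch, not Samba's documented behaviour).
    """
    text = dn_utf8.decode("utf-8")
    local = text.encode(unix_charset, errors="replace")
    return local.decode(unix_charset).encode("utf-8")


# Hypothetical DNs: kanji exist in Shift_JIS, hangul do not.
ok_dn = "CN=山田,DC=example,DC=com".encode("utf-8")
bad_dn = "CN=한글,DC=example,DC=com".encode("utf-8")

print(round_trip(ok_dn, "shift_jis") == ok_dn)    # DN survives the round trip
print(round_trip(bad_dn, "shift_jis") == bad_dn)  # DN is altered by the round trip
```

The second DN comes back with its CN changed, so it no longer names the
same object it did on the wire.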

What do people who want these charsets
do? They live with it, as the advantage to
them of having SJIS or EUC on the server outweighs
the advantage of utf8. They just ensure the
clients "don't do that".

So before you go down this route I'd like
a good example of what will unexpectedly
fail vs. the complexity of "remembering"
that some strings are now natively utf8 internally
rather than "unix charset". Remember you need
to track this and convert across all boundaries.

Right now it's simple - internal -> onto wire means
convert from unix -> utf8, wire -> internal means
convert utf8 -> unix. If you blur that boundary
I think you will break more than you fix.
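
The boundary rule could be sketched roughly as the pair of helpers below,
in Python rather than Samba's C. The names to_wire and from_wire and the
UNIX_CHARSET constant are hypothetical stand-ins, not Samba functions or
settings, and the strict error handling is an assumption of this sketch.

```python
UNIX_CHARSET = "shift_jis"  # hypothetical stand-in for the "unix charset" setting


def to_wire(internal: bytes) -> bytes:
    """internal -> onto wire: convert unix charset -> utf8."""
    return internal.decode(UNIX_CHARSET).encode("utf-8")


def from_wire(wire: bytes) -> bytes:
    """wire -> internal: convert utf8 -> unix charset.

    Raises UnicodeEncodeError if the unix charset cannot represent
    a character (strict handling is an assumption of this sketch).
    """
    return wire.decode("utf-8").encode(UNIX_CHARSET)


# A string held internally in the unix charset crosses the boundary and back.
internal = "山田".encode(UNIX_CHARSET)
assert from_wire(to_wire(internal)) == internal
```

With one charset for everything held internally, each string needs exactly
one conversion at each boundary and nothing has to record which encoding a
given string is in.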

Jeremy.

