[LDB] Store index DNs as canonical case

Andrew Bartlett abartlet at samba.org
Mon Aug 31 16:22:06 MDT 2009


On Mon, 2009-08-31 at 13:10 -0400, simo wrote:
> On Mon, 2009-08-31 at 23:27 +1000, Andrew Bartlett wrote:
> > The attached patch reworks our index code to always store the canonical
> > casefolded form of the DN in an index.  It does not work yet, and needs
> > to add an 'index version' to the ldb to trigger a reindex.  The
> > casefolded index entries should be backward compatible, because the
> > previous code accepted any case variation, so we are simply being more
> > strict in what we now write.  
> > 
> > This was inspired by a bug where we would not delete index entries
> > because the DN was not in a canonical form, and the existing
> > strcasecmp() didn't match.  
> > 
> > (strcasecmp isn't the right option any more anyway)
> > 
> > This stems from the fact that LDB DNs were just case-insensitive strings
> > originally, but have become far more complex since then. 
> > 
> > Any comments would be most welcome while I chase down the remaining
> > issues. 
> 
> Comment:
> this means that the index string format depends on the case sensitivity
> of an attribute. This is a change in behavior, although I see you
> recognize the need to re-index the db on upgrade.

Given that the on-disk TDB_KEY DN=<casefold_dn> already varies like
this, we simply get closer to what I think we should have done in the
first place (stored the TDB key in the index)!

> Question:
> Have you done any performance testing?

Not yet.  While I hope to improve performance, this was actually
motivated by correctness (on delete, the old code did a
strcasecmp() - masked behind ldb_attr_cmp() - on the DN string, and
could therefore possibly remove the wrong DN). 

> Aside: I have seen some odd behavior with indexes. I think we need to be
> a bit smarter with some search filters and reorder internal searches so
> that we parse the indexes with the smallest number of entries first, esp.
> when you have an 'and' expression of the form (&(foo=x)(bar=y)).
> Have you looked into any of this by chance?

No.  To do that we would have to start storing the number of values in
the index somehow.  

We do reorder to look for 'unique' indexes first, but sadly there are
fewer of these than tridge and I hoped for when we added that (only
objectGUID and objectSID). 

Andrew Bartlett

-- 
Andrew Bartlett
http://samba.org/~abartlet/
Authentication Developer, Samba Team           http://samba.org
Samba Developer, Cisco Inc.