[s4] LDB TDB indexes

Tue Nov 17 14:37:31 MST 2009

Hi Matthias,

 > Obviously this doesn't work with my patch. For sure we'll have to be 
 > careful how to handle existing indexes and maybe convert them. Since 
 > this problem is important I would appreciate to hear tridge's comment on 
 > my thoughts.

Do you have an example of a search that gets the wrong result with the
current index DNs?

 > - DN comparison: The function doesn't seem that efficient. I "upgraded" it a bit
 >  to be more powerful (added a second length check and do both before the string
 >  comparison)

I think this change is OK, but do you have any reason to think it
really is faster? (ie. does it change the overall speed of something
like a provision?). Remember that strncmp() is a function that returns
as soon as it finds one character different.

> - The outside API contains "DN" string arguments: Bad. Since in this way we
>  fully rely on the outside calls regarding the right DN format. Solution: Use
>  always a "struct ldb_dn" entry. Since this one is interchangeable and we can
>  handle it in our preferred way.

have you tested what the performance impact of this change is? It
might be OK, but I'd want some assurance that you had done some real
performance testing.

> - DN normalisation: I think the actual call "ldb_dn_get_linearized"
>   isn't the right one. Since we e.g. have the DNs "DC=A,DC=org" and
>   "DC=a,DC=org" (linearised form). As we see they don't seem to be
>   the same but in fact they should be (DNs are case insensitive when
>   matching). Therefore I propose to use "ldb_dn_get_casefold" which
>   normalises them in the right way (upcases them).  And the the two
>   example DNs become the same as they should.

If this is a real bug then please give an example, and we should also
add a test case to our testsuite that demonstrates it. 

I can't actually see how the existing code could produce an incorrect
answer, as the ltdb_dn_list_remove_duplicates() along with the post
filtering will ensure we don't end up with duplicates, and the use of
the correct case in the index keys means we should always find all the
records. I could be wrong though, so please provide an example.

I'm wary of using ldb_dn_get_casefold() as it is sometimes much slower
than ldb_dn_get_linearized(). If it is really needed then we'll find a
way of making it fast, but first you need to convince me its needed.

Cheers, Tridge