dirsync review

Tue Mar 22 15:20:35 MDT 2011

Hi Tridge,

First Thanks for the review,

> Hi Matthieu,
>
> Andrew and I looked over the dirsync code today. Impressive work!
>
> The comments below are in two parts. One is "micro" comments on the
> individual lines of code. The other is higher level comments on the
> broad approach taken.
>
> Here are the 'micro' comments:
>
>   - the strncmp() in rootdse.c for LDB_CONTROL_DIRSYNC_OID should be a
>     strcmp(), otherwise it will match other OIDs (OIDs can be different
>     lengths)
>
I'll correct it but correct me if I'm wrong this should only happen if 
someone has the bad idea of using the same base OID than the one of 
LDB_CONTROL_DIRSYNC_OID ?

>   - the DSDB_CONTROL_DIRSYNC_RELAX_CHECK_OID doesn't seem to be ever
>     set. It's used in acl_read, but not set anywhere. Did you miss a
>     patch?
Yep seems it went away, I'll readd it.
>   - DSDB_CONTROL_DIRSYNC_RELAX_CHECK_OID might be better implemented as
>     a handle flag, like LDB_HANDLE_FLAG_UNTRUSTED. The control works,
>     but adding/checking controls is a lot more expensive than flags.
>
Basically it's not a question of trusted/not trusted or more exactly 
it's just the opposite as acl_read needs to have a special behavior when 
dealing with deleted objects for a simple user, so marking the request 
trusted won't help, we can introduce a new ldb flags but I'm not too 
sure Simo will accept.
Is there a way to add a flag in a ldb request other than defining it in 
ldb_private.h ?

>   - it might be better to call dsdb_module_search_ex() something like
>     dsdb_module_search_tree(). Otherwise next time we add a new feature
>     we'll end up with ex2 and so on :-)
>
>   - a talloc_free(tmp_ctx) is missing when ldb_parse_tree() fails in
>     dsdb_module_search()
>
>   - in dirsync_ldb_search() the ldb_dn_is_special() check should be
>     earlier, before dsc is allocated
>
>   - same with the ldb_request_get_control() call
>
>   - the check for it being critical can come before the
>     talloc_get_type() call
>
>   - you need to check if the dirsync_ctl is NULL after the
>     talloc_get_type() (as users could supply the control with no data)
>
>   - its best to make attribute lists the same scope as the call, so
>     they are easier to find in the code, so acl_attrs[] should be
>     declared inside the LDAP_DIRSYNC_OBJECT_SECURITY if() check
>
>   - you don't need to check for acl_res->count != 1 after a
>     dsdb_module_search_dn(). When searching on one dn, you either get 1
>     answer, or an error (its checked for inside the search)
>
>   - its nice to keep the talloc hierarchy when creating arrays, as it
>     makes debugging easier. so attrs[0] = talloc_asprintf(dsc, "*")
>     should be attrs[0] = talloc_asprintf(attrs, "*") (there are a few
>     other places like this)
>
>   - please use ldb_operr() instead of "return
>     LDB_ERR_OPERATIONS_ERROR", as that sets a error message with file
>     and line, so we can find the error (there are quite a few of these
>     in dirsync.c)
>
>   - Similarly, use ldb_error() instead of just returning a LDB_ERR_* in
>     modules, unless you know an error string is already set. It helps a
>     lot to get the ldap protocol to send back an error string saying
>     exactly what went wrong
>
>   - the code checks for failures of functions that do allocation in
>     some places but not others (eg. calls to ldb_attr_list_copy_add())
>
Ok
>   - in this code:
>
> 		attrs[0] = talloc_asprintf(dsc, "*");
> 		attrs[1] = "parentGUID";
>
>     why use a allocation for the first one, and a constant string for
>     the 2nd one? If it is needed for some reason, then I think a
>     comment would be useful :-)
>
Don't know I must have started with this and then realize that I need 
parentGuid and that I don't need to talloc it much ...
>   - the function dirsync_ldb_search() adds a lot of controls. Do you
>     know if we should fail if the user already has specified one of
>     these? Should we do this for example:
>
> 	ret = ldb_request_add_control(req, LDB_CONTROL_SHOW_RECYCLED_OID, false, NULL);
> 	if (ret != LDB_SUCCESS&&  ret != LDB_ERR_ATTRIBUTE_OR_VALUE_EXISTS) {
> 		return ret;
> 	}
In fact I faced the problem today while playing more with dirsync, the 
good solution is just not to add it if it's already present.
I added some tests.
>
>   - its probably worth hanging most things off dsc instead of req, if
>     its related to the dirsync control, eg. the ldb_parse_tree in the
>     USN check. That will make it a bit easier to track down leaks if
>     they happen
>
>   - the code that creates the ldb_parse_tree structure would probably
>     be much clearer using talloc_asprintf() of an expression, and then
>     using that expression. Creating a ldb_parse_tree() manually does
>     work, but its very hard to read the code. For example, do this:
>
>       expression = talloc_asprintf(dsc, "(&%s(uSNChanged>=%llu))",
>                                    ldb_filter_from_tree(dsc, tree),
> 				  dsc->fromreqUSN + 1);
>
>     I know its less efficient than the code you have, but it is much
>     easier to read, and I don't think this function is speed critical
>
Ok I think I took this code from a portion of metze in replmetadata or 
deleted module.
>   - saved_controls is never used, so just pass NULL to
>     ldb_save_controls() (we never use the saved controls in Samba, so
>     we should probably have a ldb_remove_control() function to make it
>     clearer)
>
>   - in the LDB_REPLY_REFERRAL handling, I think it would be worth
>     checking that the 8 chars that are being skipped really are what
>     you expect (ie. a ldap://)
>
Ok
>   - why the strange cast in this code:
>
> 		val = (struct ldb_val*)&res2->msgs[0]->elements[0].values[0].data;
>
>     perhaps that should be just:
>
> 		val =&res2->msgs[0]->elements[0].values[0];
Ok
>   - why steal the val here?
>
> 		talloc_steal(res->msgs[0]->elements, val);
>
Because it has been talloc to a context related to res2, and from my 
understanding of how ldb_msg_add_value works, is seems that val will not 
be "stealed" in the context of res as it is just doing 
el->values[el->num_values] = *val; (so basically copying data and length 
from *val to el->values[el->num_values]).

>   - I think this code should just use ldb_oom():
>
> 		control = talloc_zero(ares->controls, struct ldb_dirsync_control);
> 		if (control == NULL) {
> 			return ldb_module_done(dsc->req, NULL, NULL, LDB_ERR_OPERATIONS_ERROR);
> 		}
>     (there are a few others like this)
>
Ok
>   - when allocating things like struct drsuapi_DsReplicaCursor, its
>     best to use talloc_zero() not talloc(). I think the code may have
>     left some fields unset.
>
Ok
>   - this looks very strange:
>
> 		cookie->blob.guid1 = *(talloc_steal(cookie, dsc->our_invocation_id));
>
>     I think you just want this:
>
> 		cookie->blob.guid1 = *dsc->our_invocation_id;
I just feared that the value of cookie->blob.guid1 might be freed 
earilier than wanted  (as when the context of dsc->our_invocation_id 
will be free, who knows how long the cookie will last as it might be 
stolen on a long living context).

>   - use ldb_error() when returning errors for things like
>     ndr_push_struct_blob() so we have a record of what the error is in
>     the ldap response packet
Ok
>   - why use strncmp() here:
>
> 		if (strncmp(msg->elements[i].name, "uSNChanged", sizeof("uSNChanged") -1) == 0) {
>
>     I think strcmp() is correct. Otherwise you'll match things starting
>     with uSNChanged. You only really need strncmp() when you are
>     comparing with a blob that has a specified length, such as when it
>     is a ldb_val, and even then just doing the strncmp() isn't enough,
>     you'd have to check for the termination character.
>
>     There are several cases like this in the code, please check all
>     strncmp() calls in dirsync.c - I think they should all be strcmp()
>     or ldb_attr_cmp(). I think most of them should be ldb_attr_cmp() as
>     they are not case sensitive
>
>     There are also quite a few cases where the code uses strcmp() which
>     I suspect should be ldb_attr_cmp(), for example looking for
>     "instanceType"
>
>
>   - what is this for:
>
> 			val = strtoull((const char *)msg->elements[i].values[0].data, NULL, 0);
>
>     I don't think val is used afterwards?
>
it's used at the end, line 548.
>   - this is another cast and steal that I don't think is right:
>
> 			replMetaData = (struct ldb_val *)talloc_steal(dsc,&msg->elements[i].values[0].data);
>
>     (one is a structure, the other an element of the structure, this
>     could well cause a crash on some systems. The code relies on
>     structure order and structure packing, which is dangerous)
>
>     if you need the steal, then I think what you want is this:
>
>        	  	struct ldb_val replMetaData = data_blob_null;
> 		if (some_condition) {
> 		        replMetaData = msg->elements[i].values[0];
> 			talloc_steal(dsc, replMetaData.data);
> 		}
> 		/* later on .... */
> 		if (replMetaData.data != NULL) {
> 		   /* use it .... */
> 		}
>
>    - the O() analysis may be a bit optimistic, as
>      ldb_msg_remove_element() costs O(n), as it does a memmove(), so I
>      think its O(n^3). It may be better to first create a new array
>      equal in size to the old array, then populate it with ones that
>      match the filter.
Really ? I thought that memmove wan't that costly

>    - the check
>
>        tmp = strstr(txt, ",CN=Deleted Objects,");
>
>      doesn't look right. That string could appear anywhere in a DN, and
>      not have a special meaning.
>
>      Perhaps its better to check for the isDeleted attribute?
>
Is this attribute always returned ?

>    - this code looks a bit strange:
>
> 		tmpval.data = NULL;
> 		tmpval.length = 0;
> 		/* code cut out ..... */
> 		msg->dn = ldb_dn_from_ldb_val(msg, ldb,&tmpval);
>
>     do you just want the empty DN here? I think
>       ldb_dn_new(msg, ldb, "")
>     would be clearer if that is what's wanted
>
Well It seems that I tried this but didn't get the expected results, 
I'll try one more time !

>    - in this code:
>
> 		for (i = msg->num_elements - 1; i>= 0; i--) {
> 			attr = dsdb_attribute_by_lDAPDisplayName(dsc->schema,
> 					msg->elements[i].name);
>
>      I can't see where attr ends up being used. Is it needed?
>
>   - in that same loop, it looks like what is being done is to remove
>     all elements if they aren't objectGUID. Perhaps just do this:
>
>      if (ldb_msg_find_element(msg, "objectGUID") != NULL) {
>         guidfound = true;
>      }

But I have then to had to create a new message and just put the 
objectGUID, it might be quicker ... right !
>   - the code to cope with guidfound==false seems strange. Under what
>     circumstances would we not be able to get the objectGUID if we ask
>     for it? Parsing the deleted objected DEL: string seems very strange
>     to me!
Well that's an hypothesis, maybe it can't happen and the objectGUID will 
always be available (and in this case we just ignore this code).
>
> Now for the 'broad' comments on the basic approach.
>
> One of the things that concerns me is the efficiency of the approach
> in the code. Let's say that a directory has 100k records, and someone
> uses dirsync to pull them all, 100 elements at a time. If I have
> understood the protocol correctly, then I think that means 1000 calls
> will be made, the first without a cookie and the rest with a cookie
> returned by the previous call.
Right and wrong as for the moment we send the whole result that is 100K 
the first time ... (might be a bit too much I agree).
> I think that the dirsync.c code as currently implemented will end up
> doing an unindexed search over all 100k elements in the database 1000
> times. So we are about 1000 times less efficient than we should be.
>
> An alternative approach is what the getncchanges.c code does, which is
> to load a list of matching objectGUIDs on the first step of the
> search, which we save in the guids[] array of
> drsuapi_getncchanges_state. Then subsequent calls return records from
> this list of GUIDs, by fetching those records one at a time. This
> means we only do one non-indexed search for a complete directory
> return.
>
> I think this same strategy could be used in dirsync. You'd keep a
> persistent set of dirsync state structures in the dirsync module
> private data. When a dirsync search comes in, then if it doesn't have
> a cookie then create a new one, and remember it. For real efficiency,
> use an idr tree.
What is an idr tree?

> If the request did have a cookie then look it up in the idr tree, and
> get back the state structure for that request.
Well in theory it could work but a couple of remarks: if I have a 
partial search this is indicated in the control returned by dirsync, and 
we don't return uptodateness vector (it's how windows do) so in fact 
this information should be kept up to the moment where the last cookie 
is returned (the one with the uptodateness vector). Your idea is working 
well because you expect the client to keep on querying the server up to 
the moment when the control has the flag "more information" set to 0 
(this field is using the flags field of the input control as the 
structure is the same for received and sent controls). Basically if I 
receive a partial cookie I look for the usn and the timestamp and from 
it I get the guid list (we make the supposition that we already filtered 
out the case when it's not our invocation_id when dealing with partial 
cookie.
But there is nothing that oblige me to have the same search expression 
between 2 searches (ie. I asked samaccount=s*, received cookie1, then I 
present cookie1 and search for samaccountname=p*),  that's a bit stupid 
but it seems correct in terms of dirsync use (so we might be obliged to 
store the search expression).

> The state structure would have an array of GUIDs that match the simple
> search uSNChanged>=N from the first search. That is a fairly compact
> representation of all the records that could be returned.
Well I guess we could also store the last index of the treated object, 
then get it for the next partial search in anycase this means that we 
have to store objectGUID by usnChanged.
> Then the code would just walk this list, fetch each record one at a
> time, and adding it to the returned list if it matches. You'd have
> just one non-base search per set of dirsync calls.
>
> It might also be worth having a timer that deletes stale dirsync
> cookies, in case a caller asks for one then doesn't come back
> again. Does the spec talk about the state timing out?
No but the spec is not very verbose.  One more time this is just for 
dealing more efficiently with the case: "I have a lot of stuff to 
return", it is envisioned to be mostly the case on the first synchro 
(when I'm trying to get my 100K objects) but it could happen also if I 
have an AD that is very active and that I don't query much often. 
Already for the next requests (I mean after receiving the control with 
"more information" set to 0) we will have an indexed (or soon to be) as 
we are using the usnchanged as the first search criteria, and we expect 
the amount of data to be returned to be much smaller (ie. even with a 
100K object AD, the first time you receive 100K objects but then day by 
day you might just have 100 changed objects maybe 1000).

This would be useless after a search that returns the "more information" 
flags set to 0 as the next time I'll do my search the list of objects 
that will satisfy my search criteria will be (or could be) pretty different.
> Do you think that approach would work?
>
Well of course that's a good idea ! But:

* it means a lot of rework of the code
* code for dealing with more than 1 DC is not finished yet I would like 
to have it working first
* I pretty badly need this to work and the sooner the better as I want 
to have a very clean situation with my HQ and it's not the case right 
now, ok it's been like this for 2 years but the thing is that in the 
same time I'm hopping to move to another job
* I'm not aware yet of someone having a 100K users directory and willing 
to use dirsync

So as a conclusion I would quote someone that you know well :"first have 
it working, then make it quick", can this plan work, I work on the more 
than 1 DC stuff I clean all the bugs/pb/remarks that you noted in the 
first part, I put it in production. Then I go back to it and make it fast.

Matthieu.

-- 
Matthieu Patou
Samba Team        http://samba.org
Private repo      http://git.samba.org/?p=mat/samba.git;a=summary