[PATCH] Some follow-up replmd clean-ups and flappy test fix

Tim Beale timbeale at catalyst.net.nz
Thu Nov 22 00:12:28 UTC 2018


Attached are some follow-up patches to the recent replmd changes:

- The getnc_exop test_link_utdv_hwm test started flapping recently. It
turns out the flappiness was unrelated to any of the recent DRS changes,
but I thought I'd better check. I've fixed the flappiness anyway.
- We can prevent replmd_process_linked_attribute() from messing with the
msg (thus invalidating pointers to msg->elements[]) by just not passing
it the msg (turns out it's not really needed).
- Tidied up replmd_replicated_apply_isDeleted() further by refactoring
out the early-return logic and getting rid of unnecessary indent.

CI pass: https://gitlab.com/samba-team/devel/samba/pipelines/37454488

Merge request: https://gitlab.com/samba-team/samba/merge_requests/109

Review appreciated. Thanks.


-------------- next part --------------
From 927ebca98bffad5a79080f8ca10577efbf10ab10 Mon Sep 17 00:00:00 2001
From: Tim Beale <timbeale at catalyst.net.nz>
Date: Tue, 20 Nov 2018 17:15:41 +1300
Subject: [PATCH 1/5] tests: Add better error for DRS tests

We've got a flappy test hitting this assertion failure, but we can't
tell why it's failing intermittently (probably because we're bumping the
RID-Set, but there's no way to confirm this).

Add some extra debug info if the test assertion fails.

Signed-off-by: Tim Beale <timbeale at catalyst.net.nz>
---
 source4/torture/drs/python/drs_base.py | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/source4/torture/drs/python/drs_base.py b/source4/torture/drs/python/drs_base.py
index 2e8598d..77abd43 100644
--- a/source4/torture/drs/python/drs_base.py
+++ b/source4/torture/drs/python/drs_base.py
@@ -384,15 +384,15 @@ class DrsBaseTestCase(SambaToolCmdTest):
         """
         Check that a ctr6 matches the specified parameters.
         """
-        self.assertEqual(ctr6.object_count, len(expected_dns))
+        ctr6_dns = self._get_ctr6_dn_list(ctr6)
+        self.assertEqual(ctr6.object_count, len(expected_dns),
+                         "Received unexpected objects (%s)" % ctr6_dns)
         self.assertEqual(ctr6.linked_attributes_count, len(expected_links))
         self.assertEqual(ctr6.more_data, more_data)
         self.assertEqual(ctr6.nc_object_count, nc_object_count)
         self.assertEqual(ctr6.nc_linked_attributes_count, nc_linked_attributes_count)
         self.assertEqual(ctr6.drs_error[0], drs_error)
 
-        ctr6_dns = self._get_ctr6_dn_list(ctr6)
-
         i = 0
         for dn in expected_dns:
             # Expect them back in the exact same order as specified.
-- 
2.7.4


From 99529f370fb7d8ce5697f6cb3ab8bc857babbe18 Mon Sep 17 00:00:00 2001
From: Tim Beale <timbeale at catalyst.net.nz>
Date: Tue, 20 Nov 2018 17:30:37 +1300
Subject: [PATCH 2/5] tests: Fix flappiness in DRS tests due to RID Set
 changing

The test_link_utdv_hwm test case in getnc_exop has started getting
slightly flappy (8 failures in the last 2 weeks). The problem is the
test case creates a new computer, which can occasionally result in a new
RID pool being allocated.

The problem can be reproduced by running the test case repeatedly (it
usually fails after ~250 times).

This patch updates the _check_ctr6() assertion to filter out the 'CN=RID
Set' object, if it happens to be present.

Signed-off-by: Tim Beale <timbeale at catalyst.net.nz>
---
 source4/torture/drs/python/drs_base.py | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/source4/torture/drs/python/drs_base.py b/source4/torture/drs/python/drs_base.py
index 77abd43..d19e625 100644
--- a/source4/torture/drs/python/drs_base.py
+++ b/source4/torture/drs/python/drs_base.py
@@ -384,9 +384,20 @@ class DrsBaseTestCase(SambaToolCmdTest):
         """
         Check that a ctr6 matches the specified parameters.
         """
-        ctr6_dns = self._get_ctr6_dn_list(ctr6)
-        self.assertEqual(ctr6.object_count, len(expected_dns),
+        ctr6_raw_dns = self._get_ctr6_dn_list(ctr6)
+
+        # filter out changes to the RID Set objects, as these can happen
+        # intermittently and mess up the test assertions
+        ctr6_dns = []
+        for dn in ctr6_raw_dns:
+            if "CN=RID Set," in dn or "CN=RID Manager$," in dn:
+                print("Removing {0} from GetNCChanges reply".format(dn))
+            else:
+                ctr6_dns.append(dn)
+
+        self.assertEqual(len(ctr6_dns), len(expected_dns),
                          "Received unexpected objects (%s)" % ctr6_dns)
+        self.assertEqual(ctr6.object_count, len(ctr6_raw_dns))
         self.assertEqual(ctr6.linked_attributes_count, len(expected_links))
         self.assertEqual(ctr6.more_data, more_data)
         self.assertEqual(ctr6.nc_object_count, nc_object_count)
-- 
2.7.4


From cef021c407dedce8a5a80b128f2414b258e0aa13 Mon Sep 17 00:00:00 2001
From: Tim Beale <timbeale at catalyst.net.nz>
Date: Tue, 20 Nov 2018 11:45:07 +1300
Subject: [PATCH 3/5] replmd: Avoid passing msg to
 replmd_process_linked_attribute()

We can prevent anyone from inadvertently adding/removing msg->elements[]
in replmd_process_linked_attribute() by just not passing msg into the
function. Currently we only actually need the source DN and a memory
context for reallocating old_el->values.

The warning comment has been moved to a more appropriate place.

Signed-off-by: Tim Beale <timbeale at catalyst.net.nz>
---
 source4/dsdb/samdb/ldb_modules/repl_meta_data.c | 40 ++++++++++++++-----------
 1 file changed, 22 insertions(+), 18 deletions(-)

diff --git a/source4/dsdb/samdb/ldb_modules/repl_meta_data.c b/source4/dsdb/samdb/ldb_modules/repl_meta_data.c
index d89776e..0a01bd4 100644
--- a/source4/dsdb/samdb/ldb_modules/repl_meta_data.c
+++ b/source4/dsdb/samdb/ldb_modules/repl_meta_data.c
@@ -7939,13 +7939,11 @@ static int replmd_check_singleval_la_conflict(struct ldb_module *module,
 
 /**
  * Processes one linked attribute received via replication.
- * @param msg the source object for the link. For optimization, the same msg
- * can be used across multiple calls to replmd_process_linked_attribute().
- * @note this function should not add or remove any msg attributes (it should
- * only add/modify values for the linked attribute being processed). Otherwise
- * msg->elements is realloc'd and old_el/pdn_list pointers will be invalidated
+ * @param src_dn the DN of the source object for the link
  * @param attr schema info for the linked attribute
  * @param la_entry the linked attribute info received via DRS
+ * @param element_ctx mem context for msg->element[] (when adding a new value
+ * we need to realloc old_el->values)
  * @param old_el the corresponding msg->element[] for the linked attribute
  * @param pdn_list a (binary-searchable) parsed DN array for the existing link
  * values in the msg. E.g. for a group, this is the existing members.
@@ -7956,11 +7954,12 @@ static int replmd_check_singleval_la_conflict(struct ldb_module *module,
 static int replmd_process_linked_attribute(struct ldb_module *module,
 					   TALLOC_CTX *mem_ctx,
 					   struct replmd_private *replmd_private,
-					   struct ldb_message *msg,
+					   struct ldb_dn *src_dn,
 					   const struct dsdb_attribute *attr,
 					   struct la_entry *la_entry,
 					   struct ldb_request *parent,
 					   struct ldb_message_element *old_el,
+					   TALLOC_CTX *element_ctx,
 					   struct parsed_dn *pdn_list,
 					   replmd_link_changed *change)
 {
@@ -7987,12 +7986,12 @@ static int replmd_process_linked_attribute(struct ldb_module *module,
 	if (!W_ERROR_IS_OK(status)) {
 		ldb_asprintf_errstring(ldb, "Failed to parsed linked attribute blob for %s on %s - %s\n",
 				       attr->lDAPDisplayName,
-				       ldb_dn_get_linearized(msg->dn),
+				       ldb_dn_get_linearized(src_dn),
 				       win_errstr(status));
 		return LDB_ERR_OPERATIONS_ERROR;
 	}
 
-	ret = replmd_check_target_exists(module, dsdb_dn, la_entry, msg->dn,
+	ret = replmd_check_target_exists(module, dsdb_dn, la_entry, src_dn,
 					 true, &guid, &ignore_link);
 
 	if (ret != LDB_SUCCESS) {
@@ -8021,7 +8020,7 @@ static int replmd_process_linked_attribute(struct ldb_module *module,
 
 	if (!replmd_link_update_is_newer(pdn, la)) {
 		DEBUG(3,("Discarding older DRS linked attribute update to %s on %s from %s\n",
-			 old_el->name, ldb_dn_get_linearized(msg->dn),
+			 old_el->name, ldb_dn_get_linearized(src_dn),
 			 GUID_string(mem_ctx, &la->meta_data.originating_invocation_id)));
 		return LDB_SUCCESS;
 	}
@@ -8038,7 +8037,7 @@ static int replmd_process_linked_attribute(struct ldb_module *module,
 	 */
 	if (active) {
 		ret = replmd_check_singleval_la_conflict(module, replmd_private,
-							 mem_ctx, msg->dn, la,
+							 mem_ctx, src_dn, la,
 							 dsdb_dn, pdn, pdn_list,
 							 old_el, schema, attr,
 							 seq_num,
@@ -8055,7 +8054,7 @@ static int replmd_process_linked_attribute(struct ldb_module *module,
 			/* remove the existing backlink */
 			ret = replmd_add_backlink(module, replmd_private,
 						  schema, 
-						  msg->dn,
+						  src_dn,
 						  &pdn->guid, false, attr,
 						  parent);
 			if (ret != LDB_SUCCESS) {
@@ -8090,7 +8089,7 @@ static int replmd_process_linked_attribute(struct ldb_module *module,
 			}
 		}
 
-		old_el->values = talloc_realloc(msg->elements, old_el->values,
+		old_el->values = talloc_realloc(element_ctx, old_el->values,
 						struct ldb_val, old_el->num_values+1);
 		if (!old_el->values) {
 			ldb_module_oom(module);
@@ -8124,7 +8123,7 @@ static int replmd_process_linked_attribute(struct ldb_module *module,
 
 		/* Set the new link as inactive/deleted to avoid conflicts */
 		ret = replmd_delete_link_value(module, replmd_private, old_el,
-					       msg->dn, schema, attr, seq_num,
+					       src_dn, schema, attr, seq_num,
 					       false, &guid, dsdb_dn,
 					       val_to_update);
 
@@ -8137,7 +8136,7 @@ static int replmd_process_linked_attribute(struct ldb_module *module,
 		/* if the new link is active, then add the new backlink */
 		ret = replmd_add_backlink(module, replmd_private,
 					  schema,
-					  msg->dn,
+					  src_dn,
 					  &guid, true, attr,
 					  parent);
 		if (ret != LDB_SUCCESS) {
@@ -8281,7 +8280,12 @@ static int replmd_process_la_group(struct ldb_module *module,
 
 	/*
 	 * go through and process the link target value(s) for this particular
-	 * source object and attribute
+	 * source object and attribute. For optimization, the same msg is used
+	 * across multiple calls to replmd_process_linked_attribute().
+	 * Note that we should not add or remove any msg attributes inside the
+	 * loop (we should only add/modify *values* for the attribute being
+	 * processed). Otherwise msg->elements is realloc'd and old_el/pdn_list
+	 * pointers will be invalidated
 	 */
 	for (la = DLIST_TAIL(la_group->la_entries); la; la=prev) {
 		prev = DLIST_PREV(la);
@@ -8304,9 +8308,9 @@ static int replmd_process_la_group(struct ldb_module *module,
 		}
 		ret = replmd_process_linked_attribute(module, tmp_ctx,
 						      replmd_private,
-						      msg, attr, la, NULL,
-						      old_el, pdn_list,
-						      &change_type);
+						      msg->dn, attr, la, NULL,
+						      msg->elements, old_el,
+						      pdn_list, &change_type);
 		if (ret != LDB_SUCCESS) {
 			replmd_txn_cleanup(replmd_private);
 			return ret;
-- 
2.7.4


From 17c544a5cb1d039f38e97d047682971bcf68ef15 Mon Sep 17 00:00:00 2001
From: Tim Beale <timbeale at catalyst.net.nz>
Date: Tue, 20 Nov 2018 15:54:31 +1300
Subject: [PATCH 4/5] replmd: Move logic into new
 replmd_should_apply_isDeleted() function

It's easier to follow the logic involved here when it's split out into a
separate function.

This patch should not alter the existing logic/functionality.

Note the 'else' case is somewhat redundant, but it avoids excessive
whitespace changes to the function. It'll be tidied up in the next
patch.

Signed-off-by: Tim Beale <timbeale at catalyst.net.nz>
---
 source4/dsdb/samdb/ldb_modules/repl_meta_data.c | 52 +++++++++++++++++++++----
 1 file changed, 45 insertions(+), 7 deletions(-)

diff --git a/source4/dsdb/samdb/ldb_modules/repl_meta_data.c b/source4/dsdb/samdb/ldb_modules/repl_meta_data.c
index 0a01bd4..97ab6ac 100644
--- a/source4/dsdb/samdb/ldb_modules/repl_meta_data.c
+++ b/source4/dsdb/samdb/ldb_modules/repl_meta_data.c
@@ -6799,26 +6799,64 @@ static int replmd_replicated_apply_next(struct replmd_replicated_request *ar)
 }
 
 /*
+ * Returns true if we need to do extra processing to handle deleted object
+ * changes received via replication
+ */
+static bool replmd_should_apply_isDeleted(struct replmd_replicated_request *ar,
+					  struct ldb_message *msg)
+{
+	struct ldb_dn *deleted_objects_dn;
+	int ret;
+
+	if (!ar->isDeleted) {
+
+		/* not a deleted object, so don't set isDeleted */
+		return false;
+	}
+
+	ret = dsdb_get_deleted_objects_dn(ldb_module_get_ctx(ar->module),
+					  msg, msg->dn,
+					  &deleted_objects_dn);
+
+	/*
+	 * if the Deleted Object container lookup failed, then just apply
+	 * isDeleted (note that it doesn't exist for the Schema partition)
+	 */
+	if (ret != LDB_SUCCESS) {
+		return true;
+	}
+
+	/*
+	 * the Deleted Objects container has isDeleted set but is not entirely
+	 * a deleted object, so DON'T re-apply isDeleted to it
+	 */
+	if (ldb_dn_compare(msg->dn, deleted_objects_dn) == 0) {
+		return false;
+	}
+
+	return true;
+}
+
+/*
  * This is essentially a wrapper for replmd_replicated_apply_next()
  *
  * This is needed to ensure that both codepaths call this handler.
  */
 static int replmd_replicated_apply_isDeleted(struct replmd_replicated_request *ar)
 {
-	struct ldb_dn *deleted_objects_dn;
 	struct ldb_message *msg = ar->objs->objects[ar->index_current].msg;
 	int ret;
+	bool apply_isDeleted;
 
-	if (!ar->isDeleted) {
+	apply_isDeleted = replmd_should_apply_isDeleted(ar, msg);
 
-		/* not a deleted object, so nothing to do */
+	if (!apply_isDeleted) {
+
+		/* nothing to do */
 		ar->index_current++;
 		return replmd_replicated_apply_next(ar);
-	}
 
-	ret = dsdb_get_deleted_objects_dn(ldb_module_get_ctx(ar->module), msg,
-					  msg->dn, &deleted_objects_dn);
-	if (ret != LDB_SUCCESS || ldb_dn_compare(msg->dn, deleted_objects_dn) != 0) {
+	} else {
 		/*
 		 * Do a delete here again, so that if there is
 		 * anything local that conflicts with this
-- 
2.7.4


From fcbd93a4b8337eb3c58b0bfaf940f56fe0a88db9 Mon Sep 17 00:00:00 2001
From: Tim Beale <timbeale at catalyst.net.nz>
Date: Tue, 20 Nov 2018 16:02:05 +1300
Subject: [PATCH 5/5] replmd: remove unnecessary indent

The previous refactor now means we return early if we don't need to
re-apply isDeleted to the object. The 'else' is redundant and we can
remove it to avoid unnecessary indent.

This patch is basically just a whitespace change. It should not alter
functionality.

Signed-off-by: Tim Beale <timbeale at catalyst.net.nz>
---
 source4/dsdb/samdb/ldb_modules/repl_meta_data.c | 107 ++++++++++++------------
 1 file changed, 53 insertions(+), 54 deletions(-)

diff --git a/source4/dsdb/samdb/ldb_modules/repl_meta_data.c b/source4/dsdb/samdb/ldb_modules/repl_meta_data.c
index 97ab6ac..cfa63af 100644
--- a/source4/dsdb/samdb/ldb_modules/repl_meta_data.c
+++ b/source4/dsdb/samdb/ldb_modules/repl_meta_data.c
@@ -6847,6 +6847,9 @@ static int replmd_replicated_apply_isDeleted(struct replmd_replicated_request *a
 	struct ldb_message *msg = ar->objs->objects[ar->index_current].msg;
 	int ret;
 	bool apply_isDeleted;
+	struct ldb_request *del_req = NULL;
+	struct ldb_result *res = NULL;
+	TALLOC_CTX *tmp_ctx = NULL;
 
 	apply_isDeleted = replmd_should_apply_isDeleted(ar, msg);
 
@@ -6855,67 +6858,63 @@ static int replmd_replicated_apply_isDeleted(struct replmd_replicated_request *a
 		/* nothing to do */
 		ar->index_current++;
 		return replmd_replicated_apply_next(ar);
+	}
 
-	} else {
-		/*
-		 * Do a delete here again, so that if there is
-		 * anything local that conflicts with this
-		 * object being deleted, it is removed.  This
-		 * includes links.  See MS-DRSR 4.1.10.6.9
-		 * UpdateObject.
-		 *
-		 * If the object is already deleted, and there
-		 * is no more work required, it doesn't do
-		 * anything.
-		 */
-
-		/* This has been updated to point to the DN we eventually did the modify on */
+	/*
+	 * Do a delete here again, so that if there is
+	 * anything local that conflicts with this
+	 * object being deleted, it is removed.  This
+	 * includes links.  See MS-DRSR 4.1.10.6.9
+	 * UpdateObject.
+	 *
+	 * If the object is already deleted, and there
+	 * is no more work required, it doesn't do
+	 * anything.
+	 */
 
-		struct ldb_request *del_req;
-		struct ldb_result *res;
+	/* This has been updated to point to the DN we eventually did the modify on */
 
-		TALLOC_CTX *tmp_ctx = talloc_new(ar);
-		if (!tmp_ctx) {
-			ret = ldb_oom(ldb_module_get_ctx(ar->module));
-			return ret;
-		}
+	tmp_ctx = talloc_new(ar);
+	if (!tmp_ctx) {
+		ret = ldb_oom(ldb_module_get_ctx(ar->module));
+		return ret;
+	}
 
-		res = talloc_zero(tmp_ctx, struct ldb_result);
-		if (!res) {
-			ret = ldb_oom(ldb_module_get_ctx(ar->module));
-			talloc_free(tmp_ctx);
-			return ret;
-		}
+	res = talloc_zero(tmp_ctx, struct ldb_result);
+	if (!res) {
+		ret = ldb_oom(ldb_module_get_ctx(ar->module));
+		talloc_free(tmp_ctx);
+		return ret;
+	}
 
-		/* Build a delete request, which hopefully will artually turn into nothing */
-		ret = ldb_build_del_req(&del_req, ldb_module_get_ctx(ar->module), tmp_ctx,
-					msg->dn,
-					NULL,
-					res,
-					ldb_modify_default_callback,
-					ar->req);
-		LDB_REQ_SET_LOCATION(del_req);
-		if (ret != LDB_SUCCESS) {
-			talloc_free(tmp_ctx);
-			return ret;
-		}
+	/* Build a delete request, which hopefully will artually turn into nothing */
+	ret = ldb_build_del_req(&del_req, ldb_module_get_ctx(ar->module), tmp_ctx,
+				msg->dn,
+				NULL,
+				res,
+				ldb_modify_default_callback,
+				ar->req);
+	LDB_REQ_SET_LOCATION(del_req);
+	if (ret != LDB_SUCCESS) {
+		talloc_free(tmp_ctx);
+		return ret;
+	}
 
-		/*
-		 * This is the guts of the call, call back
-		 * into our delete code, but setting the
-		 * re_delete flag so we delete anything that
-		 * shouldn't be there on a deleted or recycled
-		 * object
-		 */
-		ret = replmd_delete_internals(ar->module, del_req, true);
-		if (ret == LDB_SUCCESS) {
-			ret = ldb_wait(del_req->handle, LDB_WAIT_ALL);
-		}
+	/*
+	 * This is the guts of the call, call back
+	 * into our delete code, but setting the
+	 * re_delete flag so we delete anything that
+	 * shouldn't be there on a deleted or recycled
+	 * object
+	 */
+	ret = replmd_delete_internals(ar->module, del_req, true);
+	if (ret == LDB_SUCCESS) {
+		ret = ldb_wait(del_req->handle, LDB_WAIT_ALL);
+	}
 
-		talloc_free(tmp_ctx);
-		if (ret != LDB_SUCCESS) {
-			return ret;
-		}
+	talloc_free(tmp_ctx);
+	if (ret != LDB_SUCCESS) {
+		return ret;
 	}
 
 	ar->index_current++;
-- 
2.7.4



More information about the samba-technical mailing list