[WIP][PATCH] Safe LDB locking for better replication

Andrew Bartlett abartlet at samba.org
Fri Jun 16 04:55:47 UTC 2017


On Thu, 2017-06-15 at 16:58 +1200, Andrew Bartlett via samba-technical
wrote:
> On Wed, 2017-06-14 at 22:02 +1200, Andrew Bartlett via samba-
> technical
> wrote:
> > > Further comments most welcome, and I plan to re-write it to use
> > > read_lock() and read_unlock() operations tomorrow. 
> 
> Attached is the patch set against master
> 
> As mentioned previously, my TODO list is:
>  - fix tests not to hang when failing

They don't seem to hang, at least when reverting the whole-db lock.

>  - add test for the talloc_free(req) unlock case

I've added such a test.

>  - add some kind of test for the partitions locking 

I've added a test for this.

>  - a dbcheck rule for out of date @ATTRIBUTES and @INDEXLIST (rather
> than a runtime check)

This is still TODO.

With the proviso that I need that dbcheck rule (or drop this aspect of
the patches), can I please get your review on this series.  I'll run
some more builds, but I'm pretty confident on the patch series at this
point and would like to get this into master as soon as you are
comfortable. 

It would be good to have at least a week of this in master before rc1
freezes, so we can deal with any unexpected fallout, despite our
thorough testing.

Thanks,

Andrew Bartlett
-- 
Andrew Bartlett
https://samba.org/~abartlet/
Authentication Developer, Samba Team         https://samba.org
Samba Development and Support, Catalyst IT   
https://catalyst.net.nz/services/samba



-------------- next part --------------
From d03814aef6de22e8fba27aa6276dff4e42e45e2b Mon Sep 17 00:00:00 2001
From: Stefan Metzmacher <metze at samba.org>
Date: Sun, 11 Jun 2017 23:19:01 +0200
Subject: [PATCH 01/24] krb5_wrap: handle KRB5_ERR_HOST_REALM_UNKNOWN in
 smb_krb5_get_realm_from_hostname()

Signed-off-by: Stefan Metzmacher <metze at samba.org>
---
 lib/krb5_wrap/krb5_samba.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/lib/krb5_wrap/krb5_samba.c b/lib/krb5_wrap/krb5_samba.c
index 2e43f797144..0c8b402c21e 100644
--- a/lib/krb5_wrap/krb5_samba.c
+++ b/lib/krb5_wrap/krb5_samba.c
@@ -2669,6 +2669,10 @@ char *smb_krb5_get_realm_from_hostname(TALLOC_CTX *mem_ctx,
 	}
 
 	kerr = krb5_get_host_realm(ctx, hostname, &realm_list);
+	if (kerr == KRB5_ERR_HOST_REALM_UNKNOWN) {
+		realm_list = NULL;
+		kerr = 0;
+	}
 	if (kerr != 0) {
 		DEBUG(3,("kerberos_get_realm_from_hostname %s: "
 			"failed %s\n",
-- 
2.11.0


From abcce032c41a8ae7dac0940f597f612aa061ccb2 Mon Sep 17 00:00:00 2001
From: Andrew Bartlett <abartlet at samba.org>
Date: Wed, 14 Jun 2017 13:11:56 +1200
Subject: [PATCH 02/24] selftest: Fix failure message in dsdb_schema_info

The rename changes the CN, not the lDAPDisplayName

Signed-off-by: Andrew Bartlett <abartlet at samba.org>
---
 source4/dsdb/tests/python/dsdb_schema_info.py | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/source4/dsdb/tests/python/dsdb_schema_info.py b/source4/dsdb/tests/python/dsdb_schema_info.py
index 0ae95b3836b..f3452d67acb 100755
--- a/source4/dsdb/tests/python/dsdb_schema_info.py
+++ b/source4/dsdb/tests/python/dsdb_schema_info.py
@@ -141,7 +141,7 @@ systemOnly: FALSE
         try:
             self.sam_db.rename(attr_dn, attr_dn_new)
         except LdbError, (num, _):
-            self.fail("failed to change lDAPDisplayName for %s: %s" % (attr_name, _))
+            self.fail("failed to change CN for %s: %s" % (attr_name, _))
 
         # compare resulting schemaInfo
         schi_after = self._getSchemaInfo()
@@ -187,7 +187,7 @@ systemOnly: FALSE
         try:
             self.sam_db.rename(class_dn, class_dn_new)
         except LdbError, (num, _):
-            self.fail("failed to change lDAPDisplayName for %s: %s" % (class_name, _))
+            self.fail("failed to change CN for %s: %s" % (class_name, _))
 
         # compare resulting schemaInfo
         schi_after = self._getSchemaInfo()
-- 
2.11.0


From 66de8247655c124df21d711e086d3eef7df4124b Mon Sep 17 00:00:00 2001
From: Andrew Bartlett <abartlet at samba.org>
Date: Thu, 15 Jun 2017 16:19:17 +1200
Subject: [PATCH 03/24] selftest: Correctly print message when nbt is not up in
 20 seconds

Signed-off-by: Andrew Bartlett <abartlet at samba.org>
BUG: https://bugzilla.samba.org/show_bug.cgi?id=12843
---
 selftest/target/Samba4.pm | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/selftest/target/Samba4.pm b/selftest/target/Samba4.pm
index 316ef8346d9..c52c99b1759 100755
--- a/selftest/target/Samba4.pm
+++ b/selftest/target/Samba4.pm
@@ -207,7 +207,7 @@ sub wait_for_start($$)
 		}
 		$count++;
 	} while ($ret != 0 && $count < 20);
-	if ($count == 10) {
+	if ($count == 20) {
 		warn("nbt not reachable after 20 retries\n");
 		teardown_env($self, $testenv_vars);
 		return 0;
-- 
2.11.0


From 2214923f7b4eced6bf4a3acefa2d8199b3f9405e Mon Sep 17 00:00:00 2001
From: Andrew Bartlett <abartlet at samba.org>
Date: Thu, 15 Jun 2017 16:20:11 +1200
Subject: [PATCH 04/24] selftest: Also wait for winbindd to start

This ensures that the posixacl.py test does not race against winbindd starting up and so
give wrong mappings

Signed-off-by: Andrew Bartlett <abartlet at samba.org>
BUG: https://bugzilla.samba.org/show_bug.cgi?id=12843
---
 selftest/target/Samba4.pm | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/selftest/target/Samba4.pm b/selftest/target/Samba4.pm
index c52c99b1759..ea81d7dd9da 100755
--- a/selftest/target/Samba4.pm
+++ b/selftest/target/Samba4.pm
@@ -245,6 +245,28 @@ sub wait_for_start($$)
 			sleep(1);
 		}
 	}
+
+	my $wbinfo =  Samba::bindir_path($self, "wbinfo");
+
+	$count = 0;
+	do {
+		my $cmd = "NSS_WRAPPER_PASSWD=$testenv_vars->{NSS_WRAPPER_PASSWD} ";
+		$cmd .= "NSS_WRAPPER_GROUP=$testenv_vars->{NSS_WRAPPER_GROUP} ";
+		$cmd .= "SELFTEST_WINBINDD_SOCKET_DIR=$testenv_vars->{SELFTEST_WINBINDD_SOCKET_DIR} ";
+		$cmd .= "$wbinfo -p";
+		$ret = system($cmd);
+
+		if ($ret != 0) {
+			sleep(1);
+		}
+		$count++;
+	} while ($ret != 0 && $count < 20);
+	if ($count == 20) {
+		warn("winbind not reachable after 20 retries\n");
+		teardown_env($self, $testenv_vars);
+		return 0;
+	}
+
 	print $self->getlog_env($testenv_vars);
 
 	return $ret
-- 
2.11.0


From 6650754908f334344e60efe611849d91c9b4b53f Mon Sep 17 00:00:00 2001
From: Andrew Bartlett <abartlet at samba.org>
Date: Fri, 16 Jun 2017 14:13:42 +1200
Subject: [PATCH 05/24] selftest: confirm that two attributes are also
 correctly set in the @ records

This shows that the current behaviour in dsdb_schema_set_indices_and_attributes(), while
not ideal, is not actually buggy.

Signed-off-by: Andrew Bartlett <abartlet at samba.org>
---
 python/samba/tests/dsdb_schema_attributes.py | 41 ++++++++++++++++++++++++++--
 1 file changed, 39 insertions(+), 2 deletions(-)

diff --git a/python/samba/tests/dsdb_schema_attributes.py b/python/samba/tests/dsdb_schema_attributes.py
index 28f9078a5b9..df6c8bb11b2 100644
--- a/python/samba/tests/dsdb_schema_attributes.py
+++ b/python/samba/tests/dsdb_schema_attributes.py
@@ -112,9 +112,7 @@ systemOnly: FALSE
         self.assertIn(attr_ldap_name, [str(x) for x in idx_res[0]["@IDXATTR"]])
 
 
-
     def test_AddUnIndexedAttribute(self):
-
         # create names for an attribute to add
         (attr_name, attr_ldap_name, attr_dn) = self._make_obj_names("schemaAttributes-Attr-")
         ldif = self._make_attr_ldif(attr_name, attr_dn, 2)
@@ -136,3 +134,42 @@ systemOnly: FALSE
         idx_res = self.samdb.search(base="@INDEXLIST", scope=ldb.SCOPE_BASE)
 
         self.assertNotIn(attr_ldap_name, [str(x) for x in idx_res[0]["@IDXATTR"]])
+
+
+    def test_AddTwoIndexedAttributes(self):
+        # create names for an attribute to add
+        (attr_name, attr_ldap_name, attr_dn) = self._make_obj_names("schemaAttributes-Attr-")
+        ldif = self._make_attr_ldif(attr_name, attr_dn, 3,
+                                    "searchFlags: %d" % samba.dsdb.SEARCH_FLAG_ATTINDEX)
+
+        # add the new attribute
+        self.samdb.add_ldif(ldif)
+        self._ldap_schemaUpdateNow()
+
+        # create names for an attribute to add
+        (attr_name2, attr_ldap_name2, attr_dn2) = self._make_obj_names("schemaAttributes-Attr-")
+        ldif = self._make_attr_ldif(attr_name2, attr_dn2, 4,
+                                    "searchFlags: %d" % samba.dsdb.SEARCH_FLAG_ATTINDEX)
+
+        # add the new attribute
+        self.samdb.add_ldif(ldif)
+        self._ldap_schemaUpdateNow()
+
+        # Check @ATTRIBUTES
+
+        attr_res = self.samdb.search(base="@ATTRIBUTES", scope=ldb.SCOPE_BASE)
+
+        self.assertIn(attr_ldap_name, attr_res[0])
+        self.assertEquals(len(attr_res[0][attr_ldap_name]), 1)
+        self.assertEquals(attr_res[0][attr_ldap_name][0], "CASE_INSENSITIVE")
+
+        self.assertIn(attr_ldap_name2, attr_res[0])
+        self.assertEquals(len(attr_res[0][attr_ldap_name2]), 1)
+        self.assertEquals(attr_res[0][attr_ldap_name2][0], "CASE_INSENSITIVE")
+
+        # Check @INDEXLIST
+
+        idx_res = self.samdb.search(base="@INDEXLIST", scope=ldb.SCOPE_BASE)
+
+        self.assertIn(attr_ldap_name, [str(x) for x in idx_res[0]["@IDXATTR"]])
+        self.assertIn(attr_ldap_name2, [str(x) for x in idx_res[0]["@IDXATTR"]])
-- 
2.11.0


From e564d200f3e1de2e9422baf49f28fe65cdca0cc7 Mon Sep 17 00:00:00 2001
From: Andrew Bartlett <abartlet at samba.org>
Date: Thu, 8 Jun 2017 23:05:26 +1200
Subject: [PATCH 06/24] dsdb: Rework schema_init module to avoid database write
 and use valid memory

The schema can go away unless the second argument (the memory context) is supplied

There is no need to write the @ATTRIBUTES and @INDEXLIST on every DB load
we only need to write it if the schema is changed, and the repl_meta_data module
will notice if that happens and trigger the DSDB_EXTENDED_SCHEMA_UPDATE_NOW_OID
extended operation.

Signed-off-by: Andrew Bartlett <abartlet at samba.org>
---
 source4/dsdb/samdb/ldb_modules/schema_load.c | 64 ++++++++++++++++++----------
 1 file changed, 41 insertions(+), 23 deletions(-)

diff --git a/source4/dsdb/samdb/ldb_modules/schema_load.c b/source4/dsdb/samdb/ldb_modules/schema_load.c
index 6ffa4650dd9..3fba97e4f20 100644
--- a/source4/dsdb/samdb/ldb_modules/schema_load.c
+++ b/source4/dsdb/samdb/ldb_modules/schema_load.c
@@ -358,29 +358,15 @@ failed:
 	return ret;
 }	
 
-
-static int schema_load_init(struct ldb_module *module)
+static int schema_load(struct ldb_context *ldb,
+		       struct ldb_module *module)
 {
-	struct schema_load_private_data *private_data;
-	struct ldb_context *ldb = ldb_module_get_ctx(module);
 	struct dsdb_schema *schema;
 	void *readOnlySchema;
 	int ret, metadata_ret;
-
-	private_data = talloc_zero(module, struct schema_load_private_data);
-	if (private_data == NULL) {
-		return ldb_oom(ldb);
-	}
-	private_data->module = module;
+	TALLOC_CTX *frame = talloc_stackframe();
 	
-	ldb_module_set_private(module, private_data);
-
-	ret = ldb_next_init(module);
-	if (ret != LDB_SUCCESS) {
-		return ret;
-	}
-
-	schema = dsdb_get_schema(ldb, NULL);
+	schema = dsdb_get_schema(ldb, frame);
 
 	metadata_ret = schema_metadata_open(module);
 
@@ -394,10 +380,12 @@ static int schema_load_init(struct ldb_module *module)
 				ldb_debug_set(ldb, LDB_DEBUG_FATAL,
 					      "schema_load_init: dsdb_set_schema_refresh_fns() failed: %d:%s: %s",
 					      ret, ldb_strerror(ret), ldb_errstring(ldb));
+				TALLOC_FREE(frame);
 				return ret;
 			}
 		}
 
+		TALLOC_FREE(frame);
 		return LDB_SUCCESS;
 	}
 
@@ -408,11 +396,12 @@ static int schema_load_init(struct ldb_module *module)
 	 * have to update the backend server schema too */
 	if (readOnlySchema != NULL) {
 		struct dsdb_schema *new_schema;
-		ret = dsdb_schema_from_db(module, private_data, 0, &new_schema);
+		ret = dsdb_schema_from_db(module, frame, 0, &new_schema);
 		if (ret != LDB_SUCCESS) {
 			ldb_debug_set(ldb, LDB_DEBUG_FATAL,
 				      "schema_load_init: dsdb_schema_from_db() failed: %d:%s: %s",
 				      ret, ldb_strerror(ret), ldb_errstring(ldb));
+			TALLOC_FREE(frame);
 			return ret;
 		}
 
@@ -422,6 +411,7 @@ static int schema_load_init(struct ldb_module *module)
 			ldb_debug_set(ldb, LDB_DEBUG_FATAL,
 				      "schema_load_init: dsdb_set_schema() failed: %d:%s: %s",
 				      ret, ldb_strerror(ret), ldb_errstring(ldb));
+			TALLOC_FREE(frame);
 			return ret;
 		}
 
@@ -432,26 +422,54 @@ static int schema_load_init(struct ldb_module *module)
 			ldb_debug_set(ldb, LDB_DEBUG_FATAL,
 				      "schema_load_init: dsdb_set_schema_refresh_fns() failed: %d:%s: %s",
 				      ret, ldb_strerror(ret), ldb_errstring(ldb));
+			TALLOC_FREE(frame);
 			return ret;
 		}
 	} else {
 		ldb_debug_set(ldb, LDB_DEBUG_FATAL,
 			      "schema_load_init: failed to open metadata.tdb");
+		TALLOC_FREE(frame);
 		return metadata_ret;
 	}
 
-	schema = dsdb_get_schema(ldb, NULL);
+	schema = dsdb_get_schema(ldb, frame);
 
 	/* We do this, invoking the refresh handler, so we know that it works */
 	if (schema == NULL) {
 		ldb_debug_set(ldb, LDB_DEBUG_FATAL,
 			      "schema_load_init: dsdb_get_schema failed");
+		TALLOC_FREE(frame);
 		return LDB_ERR_OPERATIONS_ERROR;
 	}
 
-	/* Now check the @INDEXLIST is correct, or fix it up */
-	ret = dsdb_schema_set_indices_and_attributes(ldb, schema,
-						     true);
+	TALLOC_FREE(frame);
+	return LDB_SUCCESS;
+}
+
+static int schema_load_init(struct ldb_module *module)
+{
+	struct ldb_context *ldb = ldb_module_get_ctx(module);
+	struct schema_load_private_data *private_data;
+	int ret;
+
+	private_data = talloc_zero(module, struct schema_load_private_data);
+	if (private_data == NULL) {
+		return ldb_oom(ldb);
+	}
+	private_data->module = module;
+
+	ldb_module_set_private(module, private_data);
+
+	ret = ldb_next_init(module);
+	if (ret != LDB_SUCCESS) {
+		return ret;
+	}
+
+	ret = schema_load(ldb, module);
+	if (ret != LDB_SUCCESS) {
+		return ret;
+	}
+
 	if (ret != LDB_SUCCESS) {
 		ldb_asprintf_errstring(ldb, "Failed to update "
 				       "@INDEXLIST and @ATTRIBUTES "
-- 
2.11.0


From c133720c4641630b3f027c22a1a845ec04cf47f8 Mon Sep 17 00:00:00 2001
From: Andrew Bartlett <abartlet at samba.org>
Date: Thu, 8 Jun 2017 23:17:20 +1200
Subject: [PATCH 07/24] dsdb: Do not prevent searches for @ATTRIBUTES because
 the DB is not set up yet

Signed-off-by: Andrew Bartlett <abartlet at samba.org>
---
 source4/dsdb/samdb/ldb_modules/show_deleted.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/source4/dsdb/samdb/ldb_modules/show_deleted.c b/source4/dsdb/samdb/ldb_modules/show_deleted.c
index 773dcfbf3fb..6b5fdaaa2c0 100644
--- a/source4/dsdb/samdb/ldb_modules/show_deleted.c
+++ b/source4/dsdb/samdb/ldb_modules/show_deleted.c
@@ -51,6 +51,11 @@ static int show_deleted_search(struct ldb_module *module, struct ldb_request *re
 	int ret;
 	const char *attr_filter = NULL;
 
+	/* do not manipulate our control entries */
+	if (ldb_dn_is_special(req->op.search.base)) {
+		return ldb_next_request(module, req);
+	}
+
 	ldb = ldb_module_get_ctx(module);
 
 	state = talloc_get_type(ldb_module_get_private(module), struct show_deleted_state);
-- 
2.11.0


From 9b72e5e31acf8ffbdb752a611977008d8269d7fb Mon Sep 17 00:00:00 2001
From: Andrew Bartlett <abartlet at samba.org>
Date: Fri, 31 Mar 2017 17:34:13 +1300
Subject: [PATCH 08/24] tdb: Remove locking from tdb_traverse_read()

This restores the original intent of tdb_traverse_read() in
7dd31288a701d772e45b1960ac4ce4cc1be782ed

This is needed to avoid a deadlock with tdb_lockall() and the
transaction start, as ldb_tdb should take the allrecord lock during a
search (which calls tdb_traverse), and can otherwise deadlock against
a transaction starting in another process

We add a test to show that a transaction can now start while a read
traverse is in progress

This allows more operations to happen in parallel.  The blocking point
is moved to the prepare commit.

This in turn permits a roughly doubling of unindexed search
performance, because currently ldb_tdb omits to take the lock due to
an unrelated bug, but taking the allrecord lock triggers the
above-mentioned deadlock.

This behaviour was added in 251aaafe3a9213118ac3a92def9ab2104c40d12a for
Solaris 10 in 2005. But the run-fcntl-deadlock test works also on Solaris 10,
see https://lists.samba.org/archive/samba-technical/2017-April/119876.html.

Signed-off-by: Andrew Bartlett <abartlet at samba.org>
---
 lib/tdb/common/traverse.c          | 10 +---------
 lib/tdb/test/run-nested-traverse.c | 31 +++++++++++++++++++++++++++----
 2 files changed, 28 insertions(+), 13 deletions(-)

diff --git a/lib/tdb/common/traverse.c b/lib/tdb/common/traverse.c
index f33ef34ab8d..f62306e5560 100644
--- a/lib/tdb/common/traverse.c
+++ b/lib/tdb/common/traverse.c
@@ -244,7 +244,7 @@ out:
 
 
 /*
-  a read style traverse - temporarily marks the db read only
+  a read style traverse - temporarily marks each record read only
 */
 _PUBLIC_ int tdb_traverse_read(struct tdb_context *tdb,
 		      tdb_traverse_func fn, void *private_data)
@@ -252,19 +252,11 @@ _PUBLIC_ int tdb_traverse_read(struct tdb_context *tdb,
 	struct tdb_traverse_lock tl = { NULL, 0, 0, F_RDLCK };
 	int ret;
 
-	/* we need to get a read lock on the transaction lock here to
-	   cope with the lock ordering semantics of solaris10 */
-	if (tdb_transaction_lock(tdb, F_RDLCK, TDB_LOCK_WAIT)) {
-		return -1;
-	}
-
 	tdb->traverse_read++;
 	tdb_trace(tdb, "tdb_traverse_read_start");
 	ret = tdb_traverse_internal(tdb, fn, private_data, &tl);
 	tdb->traverse_read--;
 
-	tdb_transaction_unlock(tdb, F_RDLCK);
-
 	return ret;
 }
 
diff --git a/lib/tdb/test/run-nested-traverse.c b/lib/tdb/test/run-nested-traverse.c
index 22ee3e2a2a6..aeaa0859793 100644
--- a/lib/tdb/test/run-nested-traverse.c
+++ b/lib/tdb/test/run-nested-traverse.c
@@ -41,7 +41,30 @@ static int traverse2(struct tdb_context *tdb, TDB_DATA key, TDB_DATA data,
 	return 0;
 }
 
-static int traverse1(struct tdb_context *tdb, TDB_DATA key, TDB_DATA data,
+static int traverse1r(struct tdb_context *tdb, TDB_DATA key, TDB_DATA data,
+		     void *p)
+{
+	ok1(correct_key(key));
+	ok1(correct_data(data));
+	ok1(external_agent_operation(agent, TRANSACTION_START, tdb_name(tdb))
+	    == SUCCESS);
+	ok1(external_agent_operation(agent, STORE, tdb_name(tdb))
+	    == SUCCESS);
+	ok1(external_agent_operation(agent, TRANSACTION_COMMIT, tdb_name(tdb))
+	    == WOULD_HAVE_BLOCKED);
+	tdb_traverse(tdb, traverse2, NULL);
+
+	/* That should *not* release the all-records lock! */
+	ok1(external_agent_operation(agent, TRANSACTION_START, tdb_name(tdb))
+	    == SUCCESS);
+	ok1(external_agent_operation(agent, STORE, tdb_name(tdb))
+	    == SUCCESS);
+	ok1(external_agent_operation(agent, TRANSACTION_COMMIT, tdb_name(tdb))
+	    == WOULD_HAVE_BLOCKED);
+	return 0;
+}
+
+static int traverse1w(struct tdb_context *tdb, TDB_DATA key, TDB_DATA data,
 		     void *p)
 {
 	ok1(correct_key(key));
@@ -50,7 +73,7 @@ static int traverse1(struct tdb_context *tdb, TDB_DATA key, TDB_DATA data,
 	    == WOULD_HAVE_BLOCKED);
 	tdb_traverse(tdb, traverse2, NULL);
 
-	/* That should *not* release the transaction lock! */
+	/* That should *not* release the all-records lock! */
 	ok1(external_agent_operation(agent, TRANSACTION_START, tdb_name(tdb))
 	    == WOULD_HAVE_BLOCKED);
 	return 0;
@@ -80,8 +103,8 @@ int main(int argc, char *argv[])
 	data.dsize = strlen("world");
 
 	ok1(tdb_store(tdb, key, data, TDB_INSERT) == 0);
-	tdb_traverse(tdb, traverse1, NULL);
-	tdb_traverse_read(tdb, traverse1, NULL);
+	tdb_traverse(tdb, traverse1w, NULL);
+	tdb_traverse_read(tdb, traverse1r, NULL);
 	tdb_close(tdb);
 
 	return exit_status();
-- 
2.11.0


From 767b57412e9831408e50e04754d8eae17e923e5a Mon Sep 17 00:00:00 2001
From: Stefan Metzmacher <metze at samba.org>
Date: Tue, 11 Apr 2017 17:27:33 +0200
Subject: [PATCH 09/24] TODO: tdb: version 1.3.14

* allow tdb_traverse_read before tdb_transaction[_prepare]_commit()
---
 lib/tdb/ABI/tdb-1.3.14.sigs | 70 +++++++++++++++++++++++++++++++++++++++++++++
 lib/tdb/wscript             |  2 +-
 2 files changed, 71 insertions(+), 1 deletion(-)
 create mode 100644 lib/tdb/ABI/tdb-1.3.14.sigs

diff --git a/lib/tdb/ABI/tdb-1.3.14.sigs b/lib/tdb/ABI/tdb-1.3.14.sigs
new file mode 100644
index 00000000000..48f4278890a
--- /dev/null
+++ b/lib/tdb/ABI/tdb-1.3.14.sigs
@@ -0,0 +1,70 @@
+tdb_add_flags: void (struct tdb_context *, unsigned int)
+tdb_append: int (struct tdb_context *, TDB_DATA, TDB_DATA)
+tdb_chainlock: int (struct tdb_context *, TDB_DATA)
+tdb_chainlock_mark: int (struct tdb_context *, TDB_DATA)
+tdb_chainlock_nonblock: int (struct tdb_context *, TDB_DATA)
+tdb_chainlock_read: int (struct tdb_context *, TDB_DATA)
+tdb_chainlock_read_nonblock: int (struct tdb_context *, TDB_DATA)
+tdb_chainlock_unmark: int (struct tdb_context *, TDB_DATA)
+tdb_chainunlock: int (struct tdb_context *, TDB_DATA)
+tdb_chainunlock_read: int (struct tdb_context *, TDB_DATA)
+tdb_check: int (struct tdb_context *, int (*)(TDB_DATA, TDB_DATA, void *), void *)
+tdb_close: int (struct tdb_context *)
+tdb_delete: int (struct tdb_context *, TDB_DATA)
+tdb_dump_all: void (struct tdb_context *)
+tdb_enable_seqnum: void (struct tdb_context *)
+tdb_error: enum TDB_ERROR (struct tdb_context *)
+tdb_errorstr: const char *(struct tdb_context *)
+tdb_exists: int (struct tdb_context *, TDB_DATA)
+tdb_fd: int (struct tdb_context *)
+tdb_fetch: TDB_DATA (struct tdb_context *, TDB_DATA)
+tdb_firstkey: TDB_DATA (struct tdb_context *)
+tdb_freelist_size: int (struct tdb_context *)
+tdb_get_flags: int (struct tdb_context *)
+tdb_get_logging_private: void *(struct tdb_context *)
+tdb_get_seqnum: int (struct tdb_context *)
+tdb_hash_size: int (struct tdb_context *)
+tdb_increment_seqnum_nonblock: void (struct tdb_context *)
+tdb_jenkins_hash: unsigned int (TDB_DATA *)
+tdb_lock_nonblock: int (struct tdb_context *, int, int)
+tdb_lockall: int (struct tdb_context *)
+tdb_lockall_mark: int (struct tdb_context *)
+tdb_lockall_nonblock: int (struct tdb_context *)
+tdb_lockall_read: int (struct tdb_context *)
+tdb_lockall_read_nonblock: int (struct tdb_context *)
+tdb_lockall_unmark: int (struct tdb_context *)
+tdb_log_fn: tdb_log_func (struct tdb_context *)
+tdb_map_size: size_t (struct tdb_context *)
+tdb_name: const char *(struct tdb_context *)
+tdb_nextkey: TDB_DATA (struct tdb_context *, TDB_DATA)
+tdb_null: dptr = 0xXXXX, dsize = 0
+tdb_open: struct tdb_context *(const char *, int, int, int, mode_t)
+tdb_open_ex: struct tdb_context *(const char *, int, int, int, mode_t, const struct tdb_logging_context *, tdb_hash_func)
+tdb_parse_record: int (struct tdb_context *, TDB_DATA, int (*)(TDB_DATA, TDB_DATA, void *), void *)
+tdb_printfreelist: int (struct tdb_context *)
+tdb_remove_flags: void (struct tdb_context *, unsigned int)
+tdb_reopen: int (struct tdb_context *)
+tdb_reopen_all: int (int)
+tdb_repack: int (struct tdb_context *)
+tdb_rescue: int (struct tdb_context *, void (*)(TDB_DATA, TDB_DATA, void *), void *)
+tdb_runtime_check_for_robust_mutexes: bool (void)
+tdb_set_logging_function: void (struct tdb_context *, const struct tdb_logging_context *)
+tdb_set_max_dead: void (struct tdb_context *, int)
+tdb_setalarm_sigptr: void (struct tdb_context *, volatile sig_atomic_t *)
+tdb_store: int (struct tdb_context *, TDB_DATA, TDB_DATA, int)
+tdb_storev: int (struct tdb_context *, TDB_DATA, const TDB_DATA *, int, int)
+tdb_summary: char *(struct tdb_context *)
+tdb_transaction_cancel: int (struct tdb_context *)
+tdb_transaction_commit: int (struct tdb_context *)
+tdb_transaction_prepare_commit: int (struct tdb_context *)
+tdb_transaction_start: int (struct tdb_context *)
+tdb_transaction_start_nonblock: int (struct tdb_context *)
+tdb_transaction_write_lock_mark: int (struct tdb_context *)
+tdb_transaction_write_lock_unmark: int (struct tdb_context *)
+tdb_traverse: int (struct tdb_context *, tdb_traverse_func, void *)
+tdb_traverse_read: int (struct tdb_context *, tdb_traverse_func, void *)
+tdb_unlock: int (struct tdb_context *, int, int)
+tdb_unlockall: int (struct tdb_context *)
+tdb_unlockall_read: int (struct tdb_context *)
+tdb_validate_freelist: int (struct tdb_context *, int *)
+tdb_wipe_all: int (struct tdb_context *)
diff --git a/lib/tdb/wscript b/lib/tdb/wscript
index 09bc0a3192c..478255038c7 100644
--- a/lib/tdb/wscript
+++ b/lib/tdb/wscript
@@ -1,7 +1,7 @@
 #!/usr/bin/env python
 
 APPNAME = 'tdb'
-VERSION = '1.3.13'
+VERSION = '1.3.14'
 
 blddir = 'bin'
 
-- 
2.11.0


From 8621ba78966ac499ac606e619103234a12ccc490 Mon Sep 17 00:00:00 2001
From: Garming Sam <garming at catalyst.net.nz>
Date: Thu, 30 Mar 2017 12:03:17 +1300
Subject: [PATCH 10/24] ldb_tdb: Ensure we correctly decrement
 ltdb->read_lock_count

If we do not do this, then we never take the all record lock, and instead do a lock
for every record as we go, which is very slow during a large search

Signed-off-by: Andrew Bartlett <abartlet at samba.org>
Signed-off-by: Garming Sam <garming at catalyst.net.nz>
---
 lib/ldb/ldb_tdb/ldb_tdb.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/ldb/ldb_tdb/ldb_tdb.c b/lib/ldb/ldb_tdb/ldb_tdb.c
index f470e023d19..ad15a5e70a5 100644
--- a/lib/ldb/ldb_tdb/ldb_tdb.c
+++ b/lib/ldb/ldb_tdb/ldb_tdb.c
@@ -119,6 +119,7 @@ int ltdb_unlock_read(struct ldb_module *module)
 	struct ltdb_private *ltdb = talloc_get_type(data, struct ltdb_private);
 	if (ltdb->in_transaction == 0 && ltdb->read_lock_count == 1) {
 		tdb_unlockall_read(ltdb->tdb);
+		ltdb->read_lock_count--;
 		return 0;
 	}
 	ltdb->read_lock_count--;
-- 
2.11.0


From e00cbc57ab7f793d5798db762098686cad4af6bf Mon Sep 17 00:00:00 2001
From: Andrew Bartlett <abartlet at samba.org>
Date: Tue, 25 Apr 2017 22:33:53 +1200
Subject: [PATCH 11/24] ldb: Show that writes do not appear during an
 ldb_search()

A modify or rename during a search must not cause a search to change
output, and attributes having an index should in particular not see
any change in behaviour in this respect

Signed-off-by: Andrew Bartlett <abartlet at samba.org>
---
 lib/ldb/tests/ldb_mod_op_test.c | 340 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 340 insertions(+)

diff --git a/lib/ldb/tests/ldb_mod_op_test.c b/lib/ldb/tests/ldb_mod_op_test.c
index fc4b1344797..98793e73f04 100644
--- a/lib/ldb/tests/ldb_mod_op_test.c
+++ b/lib/ldb/tests/ldb_mod_op_test.c
@@ -1505,6 +1505,334 @@ static void test_ldb_search_against_transaction(void **state)
 
 }
 
+/*
+ * This test is also complex.
+ * The purpose is to test if a modify can occour during a ldb_search()
+ * This would be a failure if if in process
+ * (1) and (2):
+ *  - (1) ltdb_search() starts and calls back for one entry
+ *  - (2) one of the entries to be matched is modified
+ *  - (1) the indexed search tries to return the modified entry, but
+ *        it is no longer found, either:
+ *          - despite it still matching (dn changed)
+ *          - it no longer matching (attrs changed)
+ *
+ * We also try un-indexed to show that the behaviour differs on this
+ * point, which it should not (an index should only impact search
+ * speed).
+ */
+
+struct modify_during_search_test_ctx {
+	struct ldbtest_ctx *test_ctx;
+	int res_count;
+	pid_t child_pid;
+	struct ldb_dn *basedn;
+	bool got_cn;
+	bool got_2_cn;
+	bool rename;
+};
+
+/*
+ * This purpose of this callback is to trigger a write in
+ * the child process while a search is in progress.
+ *
+ * In tdb 1.3.12 tdb_traverse_read() take the read transaction lock
+ * however in ldb 1.1.29 ltdb_search() forgets to take the all-record
+ * lock (except the very first time) due to a ref-counting bug.
+ *
+ * We assume that if the write will proceed, it will proceed in a 3
+ * second window after the function is called.
+ */
+
+static int test_ldb_modify_during_search_callback1(struct ldb_request *req,
+						   struct ldb_reply *ares)
+{
+	int ret;
+	int pipes[2];
+	char buf[2];
+	struct modify_during_search_test_ctx *ctx = req->context;
+	switch (ares->type) {
+	case LDB_REPLY_ENTRY:
+	{
+		const struct ldb_val *cn_val
+			= ldb_dn_get_component_val(ares->message->dn, 0);
+		const char *cn = (char *)cn_val->data;
+		ctx->res_count++;
+		if (strcmp(cn, "test_search_cn") == 0) {
+			ctx->got_cn = true;
+		} else if (strcmp(cn, "test_search_2_cn") == 0) {
+			ctx->got_2_cn = true;
+		}
+		if (ctx->res_count == 2) {
+			return LDB_SUCCESS;
+		}
+		break;
+	}
+	case LDB_REPLY_REFERRAL:
+		return LDB_SUCCESS;
+
+	case LDB_REPLY_DONE:
+		return ldb_request_done(req, LDB_SUCCESS);
+	}
+
+	ret = pipe(pipes);
+	assert_int_equal(ret, 0);
+
+	ctx->child_pid = fork();
+	if (ctx->child_pid == 0 && ctx->rename) {
+		TALLOC_CTX *tmp_ctx = NULL;
+		struct ldb_dn *dn, *new_dn;
+		TALLOC_FREE(ctx->test_ctx->ldb);
+		TALLOC_FREE(ctx->test_ctx->ev);
+		ctx->test_ctx->ev = tevent_context_init(ctx->test_ctx);
+		if (ctx->test_ctx->ev == NULL) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+
+		ctx->test_ctx->ldb = ldb_init(ctx->test_ctx,
+					      ctx->test_ctx->ev);
+		if (ctx->test_ctx->ldb == NULL) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+
+		ret = ldb_connect(ctx->test_ctx->ldb,
+				  ctx->test_ctx->dbpath, 0, NULL);
+		if (ret != LDB_SUCCESS) {
+			exit(ret);
+		}
+
+		tmp_ctx = talloc_new(ctx->test_ctx);
+		if (tmp_ctx == NULL) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+
+		if (ctx->got_cn) {
+			/* Modify the other one */
+			dn = ldb_dn_new_fmt(tmp_ctx, ctx->test_ctx->ldb,
+					    "cn=test_search_2_cn,"
+					    "dc=search_test_entry");
+		} else {
+			dn = ldb_dn_new_fmt(tmp_ctx, ctx->test_ctx->ldb,
+					    "cn=test_search_cn,"
+					    "dc=search_test_entry");
+		}
+		if (dn == NULL) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+
+		new_dn = ldb_dn_new_fmt(tmp_ctx, ctx->test_ctx->ldb,
+					"cn=test_search_cn_renamed,"
+					"dc=search_test_entry");
+		if (new_dn == NULL) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+
+		ret = ldb_transaction_start(ctx->test_ctx->ldb);
+		if (ret != 0) {
+			exit(ret);
+		}
+
+		if (write(pipes[1], "GO", 2) != 2) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+
+		ret = ldb_rename(ctx->test_ctx->ldb, dn, new_dn);
+		if (ret != 0) {
+			exit(ret);
+		}
+
+		ret = ldb_transaction_commit(ctx->test_ctx->ldb);
+		exit(ret);
+
+	} else if (ctx->child_pid == 0) {
+		TALLOC_CTX *tmp_ctx = NULL;
+		struct ldb_message *msg;
+		struct ldb_message_element *el;
+		TALLOC_FREE(ctx->test_ctx->ldb);
+		TALLOC_FREE(ctx->test_ctx->ev);
+		ctx->test_ctx->ev = tevent_context_init(ctx->test_ctx);
+		if (ctx->test_ctx->ev == NULL) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+
+		ctx->test_ctx->ldb = ldb_init(ctx->test_ctx,
+					      ctx->test_ctx->ev);
+		if (ctx->test_ctx->ldb == NULL) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+
+		ret = ldb_connect(ctx->test_ctx->ldb,
+				  ctx->test_ctx->dbpath, 0, NULL);
+		if (ret != LDB_SUCCESS) {
+			exit(ret);
+		}
+
+		tmp_ctx = talloc_new(ctx->test_ctx);
+		if (tmp_ctx == NULL) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+
+		msg = ldb_msg_new(tmp_ctx);
+		if (msg == NULL) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+
+		if (ctx->got_cn) {
+			/* Modify the other one */
+			msg->dn = ldb_dn_new_fmt(msg, ctx->test_ctx->ldb,
+						 "cn=test_search_2_cn,"
+						 "dc=search_test_entry");
+		} else {
+			msg->dn = ldb_dn_new_fmt(msg, ctx->test_ctx->ldb,
+						 "cn=test_search_cn,"
+						 "dc=search_test_entry");
+		}
+		if (msg->dn == NULL) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+
+		ret = ldb_msg_add_string(msg, "filterAttr", "TRUE");
+		if (ret != 0) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+		el = ldb_msg_find_element(msg, "filterAttr");
+		if (el == NULL) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+		el->flags = LDB_FLAG_MOD_REPLACE;
+
+		ret = ldb_transaction_start(ctx->test_ctx->ldb);
+		if (ret != 0) {
+			exit(ret);
+		}
+
+		if (write(pipes[1], "GO", 2) != 2) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+
+		ret = ldb_modify(ctx->test_ctx->ldb, msg);
+		if (ret != 0) {
+			exit(ret);
+		}
+
+		ret = ldb_transaction_commit(ctx->test_ctx->ldb);
+		exit(ret);
+	}
+
+	ret = read(pipes[0], buf, 2);
+	assert_int_equal(ret, 2);
+
+	sleep(3);
+
+	return LDB_SUCCESS;
+}
+
+static void test_ldb_modify_during_search(void **state, bool add_index,
+					  bool rename)
+{
+	struct search_test_ctx *search_test_ctx = talloc_get_type_abort(*state,
+			struct search_test_ctx);
+	struct modify_during_search_test_ctx
+		ctx =
+		{ .res_count = 0,
+		  .test_ctx = search_test_ctx->ldb_test_ctx,
+		  .rename = rename
+		};
+
+	int ret;
+	struct ldb_request *req;
+	pid_t pid;
+	int wstatus;
+
+	if (add_index) {
+		struct ldb_message *msg;
+		struct ldb_dn *indexlist = ldb_dn_new(search_test_ctx,
+						      search_test_ctx->ldb_test_ctx->ldb,
+						      "@INDEXLIST");
+		assert_non_null(indexlist);
+
+		msg = ldb_msg_new(search_test_ctx);
+		assert_non_null(msg);
+
+		msg->dn = indexlist;
+
+		ret = ldb_msg_add_string(msg, "@IDXATTR", "cn");
+		assert_int_equal(ret, LDB_SUCCESS);
+
+		ret = ldb_add(search_test_ctx->ldb_test_ctx->ldb,
+			      msg);
+
+		assert_int_equal(ret, LDB_SUCCESS);
+	}
+
+	tevent_loop_allow_nesting(search_test_ctx->ldb_test_ctx->ev);
+
+	ctx.basedn
+		= ldb_dn_new_fmt(search_test_ctx,
+				 search_test_ctx->ldb_test_ctx->ldb,
+				 "%s",
+				 search_test_ctx->base_dn);
+	assert_non_null(ctx.basedn);
+
+
+	/*
+	 * This search must be over multiple items, and should include
+	 * the new name after a rename, to show that it would match
+	 * both before and after that modify
+	 */
+	ret = ldb_build_search_req(&req,
+				   search_test_ctx->ldb_test_ctx->ldb,
+				   search_test_ctx,
+				   ctx.basedn,
+				   LDB_SCOPE_SUBTREE,
+				   "(&(!(filterAttr=*))"
+				   "(|(cn=test_search_cn_renamed)(cn=test_search_cn)(cn=test_search_2_cn)))",
+				   NULL,
+				   NULL,
+				   &ctx,
+				   test_ldb_modify_during_search_callback1,
+				   NULL);
+	assert_int_equal(ret, 0);
+	ret = ldb_request(search_test_ctx->ldb_test_ctx->ldb, req);
+
+	if (ret == LDB_SUCCESS) {
+		ret = ldb_wait(req->handle, LDB_WAIT_ALL);
+	}
+	assert_int_equal(ret, 0);
+	assert_int_equal(ctx.res_count, 2);
+	assert_int_equal(ctx.got_cn, true);
+	assert_int_equal(ctx.got_2_cn, true);
+
+	pid = waitpid(ctx.child_pid, &wstatus, 0);
+	assert_int_equal(pid, ctx.child_pid);
+
+	assert_true(WIFEXITED(wstatus));
+
+	assert_int_equal(WEXITSTATUS(wstatus), 0);
+
+
+}
+
+static void test_ldb_modify_during_indexed_search(void **state)
+{
+	return test_ldb_modify_during_search(state, true, false);
+}
+
+static void test_ldb_modify_during_unindexed_search(void **state)
+{
+	return test_ldb_modify_during_search(state, false, false);
+}
+
+static void test_ldb_rename_during_indexed_search(void **state)
+{
+	return test_ldb_modify_during_search(state, true, true);
+}
+
+static void test_ldb_rename_during_unindexed_search(void **state)
+{
+	return test_ldb_modify_during_search(state, false, true);
+}
+
 static int ldb_case_test_setup(void **state)
 {
 	int ret;
@@ -2054,6 +2382,18 @@ int main(int argc, const char **argv)
 		cmocka_unit_test_setup_teardown(test_ldb_search_against_transaction,
 						ldb_search_test_setup,
 						ldb_search_test_teardown),
+		cmocka_unit_test_setup_teardown(test_ldb_modify_during_unindexed_search,
+						ldb_search_test_setup,
+						ldb_search_test_teardown),
+		cmocka_unit_test_setup_teardown(test_ldb_modify_during_indexed_search,
+						ldb_search_test_setup,
+						ldb_search_test_teardown),
+		cmocka_unit_test_setup_teardown(test_ldb_rename_during_unindexed_search,
+						ldb_search_test_setup,
+						ldb_search_test_teardown),
+		cmocka_unit_test_setup_teardown(test_ldb_rename_during_indexed_search,
+						ldb_search_test_setup,
+						ldb_search_test_teardown),
 		cmocka_unit_test_setup_teardown(test_ldb_attrs_case_insensitive,
 						ldb_case_test_setup,
 						ldb_case_test_teardown),
-- 
2.11.0


From 82007423b6623a28715920a82ea26ddc6767b155 Mon Sep 17 00:00:00 2001
From: Andrew Bartlett <abartlet at samba.org>
Date: Mon, 22 May 2017 16:18:20 +1200
Subject: [PATCH 12/24] ldb: Add test encoding current locking behaviour during
 ldb_search()

Currently, a lock is not held against modifications once the final
record is returned via a callback, so modifications can be made
during the DONE callback.  This makes it hard to write modules
that interpert an ldb search result and do further processing
so will change in the future to allow the full search to be
atomic.

Signed-off-by: Andrew Bartlett <abartlet at samba.org>
---
 lib/ldb/tests/ldb_mod_op_test.c | 204 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 204 insertions(+)

diff --git a/lib/ldb/tests/ldb_mod_op_test.c b/lib/ldb/tests/ldb_mod_op_test.c
index 98793e73f04..e2090380aa6 100644
--- a/lib/ldb/tests/ldb_mod_op_test.c
+++ b/lib/ldb/tests/ldb_mod_op_test.c
@@ -1833,6 +1833,207 @@ static void test_ldb_rename_during_unindexed_search(void **state)
 	return test_ldb_modify_during_search(state, false, true);
 }
 
+/*
+ * This test is also complex.
+ *
+ * The purpose is to test if a modify can occour during a ldb_search()
+ * before the end of the callback
+ *
+ * This would be a failure if if in process
+ * (1) and (2):
+ *  - (1) ldb_search() starts and calls back for a number of entries
+ *  - (2) an entry in the DB is allowed to change before the callback returns
+ *  - (1) the callback can see the modification
+ *
+ */
+
+/*
+ * This purpose of this callback is to trigger a write in
+ * the child process while a search DONE callback is in progress.
+ *
+ * In ldb 1.1.29 ldb_search() omitted to take a all-record
+ * lock for the full duration of the search and callbacks
+ *
+ * We assume that if the write will proceed, it will proceed in a 3
+ * second window after the function is called.
+ */
+
+static int test_ldb_modify_during_whole_search_callback1(struct ldb_request *req,
+							 struct ldb_reply *ares)
+{
+	int ret;
+	int pipes[2];
+	char buf[2];
+	struct modify_during_search_test_ctx *ctx = req->context;
+	struct ldb_dn *search_dn;
+	struct ldb_result *res2;
+
+	switch (ares->type) {
+	case LDB_REPLY_ENTRY:
+	case LDB_REPLY_REFERRAL:
+		return LDB_SUCCESS;
+
+	case LDB_REPLY_DONE:
+		break;
+	}
+
+	ret = pipe(pipes);
+	assert_int_equal(ret, 0);
+
+	ctx->child_pid = fork();
+	if (ctx->child_pid == 0) {
+		TALLOC_CTX *tmp_ctx = NULL;
+		struct ldb_message *msg;
+		struct ldb_message_element *el;
+		TALLOC_FREE(ctx->test_ctx->ldb);
+		TALLOC_FREE(ctx->test_ctx->ev);
+		ctx->test_ctx->ev = tevent_context_init(ctx->test_ctx);
+		if (ctx->test_ctx->ev == NULL) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+
+		ctx->test_ctx->ldb = ldb_init(ctx->test_ctx,
+					      ctx->test_ctx->ev);
+		if (ctx->test_ctx->ldb == NULL) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+
+		ret = ldb_connect(ctx->test_ctx->ldb,
+				  ctx->test_ctx->dbpath, 0, NULL);
+		if (ret != LDB_SUCCESS) {
+			exit(ret);
+		}
+
+		tmp_ctx = talloc_new(ctx->test_ctx);
+		if (tmp_ctx == NULL) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+
+		msg = ldb_msg_new(tmp_ctx);
+		if (msg == NULL) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+
+		msg->dn = ldb_dn_new_fmt(msg, ctx->test_ctx->ldb,
+					 "cn=test_search_cn,"
+					 "dc=search_test_entry");
+		if (msg->dn == NULL) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+
+		ret = ldb_msg_add_string(msg, "filterAttr", "TRUE");
+		if (ret != 0) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+		el = ldb_msg_find_element(msg, "filterAttr");
+		if (el == NULL) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+		el->flags = LDB_FLAG_MOD_REPLACE;
+
+		ret = ldb_transaction_start(ctx->test_ctx->ldb);
+		if (ret != 0) {
+			exit(ret);
+		}
+
+		if (write(pipes[1], "GO", 2) != 2) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+
+		ret = ldb_modify(ctx->test_ctx->ldb, msg);
+		if (ret != 0) {
+			exit(ret);
+		}
+
+		ret = ldb_transaction_commit(ctx->test_ctx->ldb);
+		exit(ret);
+	}
+
+	ret = read(pipes[0], buf, 2);
+	assert_int_equal(ret, 2);
+
+	sleep(3);
+
+	/*
+	 * If writes are not blocked until after this function, we
+	 * will be able to successfully search for this modification
+	 * here
+	 */
+
+	search_dn = ldb_dn_new_fmt(ares, ctx->test_ctx->ldb,
+				   "cn=test_search_cn,"
+				   "dc=search_test_entry");
+
+	ret = ldb_search(ctx->test_ctx->ldb, ares,
+			 &res2, search_dn, LDB_SCOPE_BASE, NULL,
+			 "filterAttr=TRUE");
+	assert_int_equal(ret, 0);
+
+	/* We got the result */
+	assert_int_equal(res2->count, 1);
+
+	return ldb_request_done(req, LDB_SUCCESS);
+}
+
+static void test_ldb_modify_during_whole_search(void **state)
+{
+	struct search_test_ctx *search_test_ctx = talloc_get_type_abort(*state,
+			struct search_test_ctx);
+	struct modify_during_search_test_ctx
+		ctx =
+		{
+		  .test_ctx = search_test_ctx->ldb_test_ctx,
+		};
+
+	int ret;
+	struct ldb_request *req;
+	pid_t pid;
+	int wstatus;
+
+	tevent_loop_allow_nesting(search_test_ctx->ldb_test_ctx->ev);
+
+	ctx.basedn
+		= ldb_dn_new_fmt(search_test_ctx,
+				 search_test_ctx->ldb_test_ctx->ldb,
+				 "%s",
+				 search_test_ctx->base_dn);
+	assert_non_null(ctx.basedn);
+
+
+	/*
+	 * The search just needs to call DONE, we don't care about the
+	 * contents of the search for this test
+	 */
+	ret = ldb_build_search_req(&req,
+				   search_test_ctx->ldb_test_ctx->ldb,
+				   search_test_ctx,
+				   ctx.basedn,
+				   LDB_SCOPE_SUBTREE,
+				   "(&(!(filterAttr=*))"
+				   "(cn=test_search_cn))",
+				   NULL,
+				   NULL,
+				   &ctx,
+				   test_ldb_modify_during_whole_search_callback1,
+				   NULL);
+	assert_int_equal(ret, 0);
+	ret = ldb_request(search_test_ctx->ldb_test_ctx->ldb, req);
+
+	if (ret == LDB_SUCCESS) {
+		ret = ldb_wait(req->handle, LDB_WAIT_ALL);
+	}
+	assert_int_equal(ret, 0);
+
+	pid = waitpid(ctx.child_pid, &wstatus, 0);
+	assert_int_equal(pid, ctx.child_pid);
+
+	assert_true(WIFEXITED(wstatus));
+
+	assert_int_equal(WEXITSTATUS(wstatus), 0);
+
+
+}
+
 static int ldb_case_test_setup(void **state)
 {
 	int ret;
@@ -2394,6 +2595,9 @@ int main(int argc, const char **argv)
 		cmocka_unit_test_setup_teardown(test_ldb_rename_during_indexed_search,
 						ldb_search_test_setup,
 						ldb_search_test_teardown),
+		cmocka_unit_test_setup_teardown(test_ldb_modify_during_whole_search,
+						ldb_search_test_setup,
+						ldb_search_test_teardown),
 		cmocka_unit_test_setup_teardown(test_ldb_attrs_case_insensitive,
 						ldb_case_test_setup,
 						ldb_case_test_teardown),
-- 
2.11.0


From 11a4f98e8c3ff878d6c3cf50c0405bf52ce2df27 Mon Sep 17 00:00:00 2001
From: Andrew Bartlett <abartlet at samba.org>
Date: Thu, 15 Jun 2017 12:10:51 +1200
Subject: [PATCH 13/24] ldb: Add read_lock and read_unlock to ldb_module_ops

This will be used to implement read locking in ldb_tdb

Signed-off-by: Andrew Bartlett <abartlet at samba.org>
---
 lib/ldb/common/ldb_modules.c | 38 ++++++++++++++++++++++++++++++++++++++
 lib/ldb/include/ldb_module.h |  4 ++++
 2 files changed, 42 insertions(+)

diff --git a/lib/ldb/common/ldb_modules.c b/lib/ldb/common/ldb_modules.c
index ca93299ab00..f50b6c56130 100644
--- a/lib/ldb/common/ldb_modules.c
+++ b/lib/ldb/common/ldb_modules.c
@@ -641,6 +641,44 @@ int ldb_next_end_trans(struct ldb_module *module)
 	return ret;
 }
 
+int ldb_next_read_lock(struct ldb_module *module)
+{
+	int ret;
+	FIND_OP(module, read_lock);
+	ret = module->ops->read_lock(module);
+	if (ret == LDB_SUCCESS) {
+		return ret;
+	}
+	if (!ldb_errstring(module->ldb)) {
+		/* Set a default error string, to place the blame somewhere */
+		ldb_asprintf_errstring(module->ldb, "read_lock error in module %s: %s (%d)", module->ops->name, ldb_strerror(ret), ret);
+	}
+	if ((module && module->ldb->flags & LDB_FLG_ENABLE_TRACING)) {
+		ldb_debug(module->ldb, LDB_DEBUG_TRACE, "ldb_next_read_lock error: %s",
+			  ldb_errstring(module->ldb));
+	}
+	return ret;
+}
+
+int ldb_next_read_unlock(struct ldb_module *module)
+{
+	int ret;
+	FIND_OP(module, read_unlock);
+	ret = module->ops->read_unlock(module);
+	if (ret == LDB_SUCCESS) {
+		return ret;
+	}
+	if (!ldb_errstring(module->ldb)) {
+		/* Set a default error string, to place the blame somewhere */
+		ldb_asprintf_errstring(module->ldb, "read_unlock error in module %s: %s (%d)", module->ops->name, ldb_strerror(ret), ret);
+	}
+	if ((module && module->ldb->flags & LDB_FLG_ENABLE_TRACING)) {
+		ldb_debug(module->ldb, LDB_DEBUG_TRACE, "ldb_next_read_unlock error: %s",
+			  ldb_errstring(module->ldb));
+	}
+	return ret;
+}
+
 int ldb_next_prepare_commit(struct ldb_module *module)
 {
 	int ret;
diff --git a/lib/ldb/include/ldb_module.h b/lib/ldb/include/ldb_module.h
index 3d56e68eddd..0f5feeb084c 100644
--- a/lib/ldb/include/ldb_module.h
+++ b/lib/ldb/include/ldb_module.h
@@ -77,6 +77,8 @@ struct ldb_module_ops {
 	int (*end_transaction)(struct ldb_module *);
 	int (*del_transaction)(struct ldb_module *);
 	int (*sequence_number)(struct ldb_module *, struct ldb_request *);
+	int (*read_lock)(struct ldb_module *);
+	int (*read_unlock)(struct ldb_module *);
 	void *private_data;
 };
 
@@ -203,6 +205,8 @@ int ldb_next_end_trans(struct ldb_module *module);
 int ldb_next_del_trans(struct ldb_module *module);
 int ldb_next_prepare_commit(struct ldb_module *module);
 int ldb_next_init(struct ldb_module *module);
+int ldb_next_read_lock(struct ldb_module *module);
+int ldb_next_read_unlock(struct ldb_module *module);
 
 void ldb_set_errstring(struct ldb_context *ldb, const char *err_string);
 void ldb_asprintf_errstring(struct ldb_context *ldb, const char *format, ...) PRINTF_ATTRIBUTE(2,3);
-- 
2.11.0


From 6d06c8101aedf42e4e0015d6a197b33d1a0ef01b Mon Sep 17 00:00:00 2001
From: Andrew Bartlett <abartlet at samba.org>
Date: Fri, 12 May 2017 01:39:08 +0200
Subject: [PATCH 14/24] ldb_tdb: Implement read_lock and read_unlock module
 operations

This allows Samba to provide a consistent view of the DB
despite the use of multiple databases via the partitions module
and over multiple callbacks via a module stack.

Signed-off-by: Andrew Bartlett <abartlet at samba.org>
---
 lib/ldb/ldb_tdb/ldb_tdb.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/ldb/ldb_tdb/ldb_tdb.c b/lib/ldb/ldb_tdb/ldb_tdb.c
index ad15a5e70a5..2ac1967ee15 100644
--- a/lib/ldb/ldb_tdb/ldb_tdb.c
+++ b/lib/ldb/ldb_tdb/ldb_tdb.c
@@ -1577,6 +1577,8 @@ static const struct ldb_module_ops ltdb_ops = {
 	.end_transaction   = ltdb_end_trans,
 	.prepare_commit    = ltdb_prepare_commit,
 	.del_transaction   = ltdb_del_trans,
+	.read_lock         = ltdb_lock_read,
+	.read_unlock       = ltdb_unlock_read,
 };
 
 /*
-- 
2.11.0


From f9863838aa4e18102938763b9f05d65f596db305 Mon Sep 17 00:00:00 2001
From: Andrew Bartlett <abartlet at samba.org>
Date: Thu, 15 Jun 2017 13:56:46 +1200
Subject: [PATCH 15/24] ldb: Lock the whole backend database for the duration
 of a search

We must hold locks not just for the duration of each search, but for the whole search
as our module stack may make multiple search requests to build up the whole result.

This is explains a number of replication and read corruption issues in Samba

Signed-off-by: Andrew Bartlett <abartlet at samba.org>
---
 lib/ldb/common/ldb.c            | 158 +++++++++++++++++++++++++++++++++++++++-
 lib/ldb/tests/ldb_mod_op_test.c |  38 +++++++++-
 2 files changed, 193 insertions(+), 3 deletions(-)

diff --git a/lib/ldb/common/ldb.c b/lib/ldb/common/ldb.c
index 700d89c65d8..e07b8560041 100644
--- a/lib/ldb/common/ldb.c
+++ b/lib/ldb/common/ldb.c
@@ -966,6 +966,145 @@ static int ldb_msg_check_element_flags(struct ldb_context *ldb,
 	return LDB_SUCCESS;
 }
 
+/*
+ * This context allows us to make the unlock be a talloc destructor
+ *
+ * This ensures that a request started, but not waited on, will still
+ * unlock.
+ */
+struct ldb_db_lock_context {
+	struct ldb_request *req;
+	struct ldb_context *ldb;
+};
+
+/*
+ * We have to have a the unlock on a destructor so that we unlock the
+ * DB if a caller calls talloc_free(req).  We trust that the ldb
+ * context has not already gone away.
+ */
+static int ldb_db_lock_destructor(struct ldb_db_lock_context *lock_context)
+{
+	int ret;
+	struct ldb_module *next_module;
+	FIRST_OP_NOERR(lock_context->ldb, read_unlock);
+	if (next_module != NULL) {
+		ret = next_module->ops->read_unlock(next_module);
+	} else {
+		ret = LDB_SUCCESS;
+	}
+
+	if (ret != LDB_SUCCESS) {
+		ldb_debug(lock_context->ldb,
+			  LDB_DEBUG_FATAL,
+			  "Failed to unlock db: %s / %s",
+			  ldb_errstring(lock_context->ldb),
+			  ldb_strerror(ret));
+	}
+	return 0;
+}
+
+static int ldb_lock_backend_callback(struct ldb_request *req,
+				     struct ldb_reply *ares)
+{
+	struct ldb_db_lock_context *lock_context;
+	int ret;
+
+	lock_context = talloc_get_type(req->context,
+				       struct ldb_db_lock_context);
+
+	if (!ares) {
+		return ldb_module_done(lock_context->req, NULL, NULL,
+					LDB_ERR_OPERATIONS_ERROR);
+	}
+	if (ares->error != LDB_SUCCESS || ares->type == LDB_REPLY_DONE) {
+		ret = ldb_module_done(lock_context->req, ares->controls,
+				      ares->response, ares->error);
+		/*
+		 * If this is a LDB_REPLY_DONE or an error, unlock the
+		 * DB by calling the destructor on this context
+		 */
+		talloc_free(lock_context);
+		return ret;
+	}
+
+	/* Otherwise pass on the callback */
+	switch (ares->type) {
+	case LDB_REPLY_ENTRY:
+		return ldb_module_send_entry(lock_context->req, ares->message, ares->controls);
+
+	case LDB_REPLY_REFERRAL:
+		return ldb_module_send_referral(lock_context->req, ares->referral);
+	default:
+		/* Can't happen */
+		return LDB_ERR_OPERATIONS_ERROR;
+	}
+}
+
+/*
+ * Do an ldb_search() with a lock held, but release it if the request
+ * is freed with talloc_free()
+ */
+static int lock_search(struct ldb_module *lock_module, struct ldb_request *req)
+{
+	/* Used in FIRST_OP_NOERR to find where to send the lock request */
+	struct ldb_module *next_module = NULL;
+
+	struct ldb_request *down_req = NULL;
+	struct ldb_db_lock_context *lock_context;
+	struct ldb_context *ldb = ldb_module_get_ctx(lock_module);
+	int ret;
+
+	lock_context = talloc(req, struct ldb_db_lock_context);
+	if (lock_context == NULL) {
+		return ldb_oom(ldb);
+	}
+
+	lock_context->ldb = ldb;
+	lock_context->req = req;
+
+	ret = ldb_build_search_req_ex(&down_req, ldb, req,
+				      req->op.search.base,
+				      req->op.search.scope,
+				      req->op.search.tree,
+				      req->op.search.attrs,
+				      req->controls,
+				      lock_context,
+				      ldb_lock_backend_callback,
+				      req);
+	LDB_REQ_SET_LOCATION(down_req);
+	if (ret != LDB_SUCCESS) {
+		return ret;
+	}
+
+	/* call DB lock */
+	FIRST_OP_NOERR(ldb, read_lock);
+	if (next_module != NULL) {
+		ret = next_module->ops->read_lock(next_module);
+	} else {
+		ret = LDB_ERR_UNSUPPORTED_CRITICAL_EXTENSION;
+	}
+
+	if (ret == LDB_ERR_UNSUPPORTED_CRITICAL_EXTENSION) {
+		/* We might be talking LDAP */
+		ldb_reset_err_string(ldb);
+		ret = 0;
+		TALLOC_FREE(lock_context);
+
+		return ldb_next_request(lock_module, req);
+	} else if ((ret != LDB_SUCCESS) && (ldb->err_string == NULL)) {
+		/* if no error string was setup by the backend */
+		ldb_asprintf_errstring(ldb, "Failed to get DB lock: %s (%d)",
+				       ldb_strerror(ret), ret);
+	} else {
+		talloc_set_destructor(lock_context, ldb_db_lock_destructor);
+	}
+
+	if (ret != LDB_SUCCESS) {
+		return ret;
+	}
+
+	return ldb_next_request(lock_module, down_req);
+}
 
 /*
   start an ldb request
@@ -991,15 +1130,32 @@ int ldb_request(struct ldb_context *ldb, struct ldb_request *req)
 	/* call the first module in the chain */
 	switch (req->operation) {
 	case LDB_SEARCH:
+	{
+		/*
+		 * A fake module to allow ldb_next_request() to be
+		 * re-used and to keep the locking out of this fucntion.
+		 */
+		struct ldb_module_ops lock_module_ops = {
+			.name = "lock_searches",
+			.search = lock_search
+		};
+		struct ldb_module lock_module = {
+			.ldb = ldb,
+			.next = ldb->modules,
+			.ops = &lock_module_ops
+		};
+		next_module = &lock_module;
+
 		/* due to "ldb_build_search_req" base DN always != NULL */
 		if (!ldb_dn_validate(req->op.search.base)) {
 			ldb_asprintf_errstring(ldb, "ldb_search: invalid basedn '%s'",
 					       ldb_dn_get_linearized(req->op.search.base));
 			return LDB_ERR_INVALID_DN_SYNTAX;
 		}
-		FIRST_OP(ldb, search);
+
 		ret = next_module->ops->search(next_module, req);
 		break;
+	}
 	case LDB_ADD:
 		if (!ldb_dn_validate(req->op.add.message->dn)) {
 			ldb_asprintf_errstring(ldb, "ldb_add: invalid dn '%s'",
diff --git a/lib/ldb/tests/ldb_mod_op_test.c b/lib/ldb/tests/ldb_mod_op_test.c
index e2090380aa6..c134d342aed 100644
--- a/lib/ldb/tests/ldb_mod_op_test.c
+++ b/lib/ldb/tests/ldb_mod_op_test.c
@@ -1430,6 +1430,12 @@ static int test_ldb_search_against_transaction_callback1(struct ldb_request *req
 				   ctx,
 				   test_ldb_search_against_transaction_callback2,
 				   NULL);
+	/*
+	 * NOTE WELL:
+	 * If these assertions fail, we will also fail to
+	 * clean up as we hold locks
+	 */
+
 	assert_int_equal(ret, 0);
 	ret = ldb_request(ctx->test_ctx->ldb, req);
 
@@ -1967,10 +1973,16 @@ static int test_ldb_modify_during_whole_search_callback1(struct ldb_request *req
 	ret = ldb_search(ctx->test_ctx->ldb, ares,
 			 &res2, search_dn, LDB_SCOPE_BASE, NULL,
 			 "filterAttr=TRUE");
+
+	/*
+	 * NOTE WELL:
+	 * If these assertions fail, we will also fail to
+	 * clean up as we hold locks
+	 */
 	assert_int_equal(ret, 0);
 
-	/* We got the result */
-	assert_int_equal(res2->count, 1);
+	/* We should not have got the result */
+	assert_int_equal(res2->count, 0);
 
 	return ldb_request_done(req, LDB_SUCCESS);
 }
@@ -1989,6 +2001,8 @@ static void test_ldb_modify_during_whole_search(void **state)
 	struct ldb_request *req;
 	pid_t pid;
 	int wstatus;
+	struct ldb_dn *search_dn;
+	struct ldb_result *res2;
 
 	tevent_loop_allow_nesting(search_test_ctx->ldb_test_ctx->ev);
 
@@ -2031,6 +2045,26 @@ static void test_ldb_modify_during_whole_search(void **state)
 
 	assert_int_equal(WEXITSTATUS(wstatus), 0);
 
+	/*
+	 * If writes are blocked until after the search function, we
+	 * will be able to successfully search for this modification
+	 * now
+	 */
+
+	search_dn = ldb_dn_new_fmt(search_test_ctx,
+				   search_test_ctx->ldb_test_ctx->ldb,
+				   "cn=test_search_cn,"
+				   "dc=search_test_entry");
+
+	ret = ldb_search(search_test_ctx->ldb_test_ctx->ldb,
+			 search_test_ctx,
+			 &res2, search_dn, LDB_SCOPE_BASE, NULL,
+			 "filterAttr=TRUE");
+	assert_int_equal(ret, 0);
+
+	/* We got the result */
+	assert_int_equal(res2->count, 1);
+
 
 }
 
-- 
2.11.0


From 48f2b025044c34e5659e7dca4e8a2faab734ee14 Mon Sep 17 00:00:00 2001
From: Andrew Bartlett <abartlet at samba.org>
Date: Fri, 16 Jun 2017 12:18:39 +1200
Subject: [PATCH 16/24] ldb: Correct comment about version numbers

(ldb releases have been made while this patch set was in train)

Signed-off-by: Andrew Bartlett <abartlet at samba.org>
---
 lib/ldb/tests/ldb_mod_op_test.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/ldb/tests/ldb_mod_op_test.c b/lib/ldb/tests/ldb_mod_op_test.c
index c134d342aed..14e5eadc581 100644
--- a/lib/ldb/tests/ldb_mod_op_test.c
+++ b/lib/ldb/tests/ldb_mod_op_test.c
@@ -1285,7 +1285,7 @@ static void test_search_match_basedn(void **state)
  *  - (2) the ldb_transaction_commit() is called.
  *        This returns LDB_ERR_BUSY if the deadlock is detected
  *
- * With ldb 1.1.29 and tdb 1.3.12 we avoid this only due to a missing
+ * With ldb 1.1.31 and tdb 1.3.12 we avoid this only due to a missing
  * lock call in ltdb_search() due to a refcounting bug in
  * ltdb_lock_read()
  */
@@ -1327,7 +1327,7 @@ static int test_ldb_search_against_transaction_callback2(struct ldb_request *req
  * we take any locks in the tdb_traverse_read() handler.
  *
  * In tdb 1.3.12 tdb_traverse_read() take the read transaction lock
- * however in ldb 1.1.29 ltdb_search() forgets to take the all-record
+ * however in ldb 1.1.31 ltdb_search() forgets to take the all-record
  * lock (except the very first time) due to a ref-counting bug.
  *
  */
-- 
2.11.0


From ec1d6950a8b7a3d093f27e8beed941d5cfc01545 Mon Sep 17 00:00:00 2001
From: Andrew Bartlett <abartlet at samba.org>
Date: Fri, 16 Jun 2017 12:19:00 +1200
Subject: [PATCH 17/24] ldb: Add test to show that locks are released on
 TALLOC_FREE(req)

Signed-off-by: Andrew Bartlett <abartlet at samba.org>
---
 lib/ldb/tests/ldb_mod_op_test.c | 229 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 229 insertions(+)

diff --git a/lib/ldb/tests/ldb_mod_op_test.c b/lib/ldb/tests/ldb_mod_op_test.c
index 14e5eadc581..23c487fd5db 100644
--- a/lib/ldb/tests/ldb_mod_op_test.c
+++ b/lib/ldb/tests/ldb_mod_op_test.c
@@ -2068,6 +2068,232 @@ static void test_ldb_modify_during_whole_search(void **state)
 
 }
 
+/*
+ * This test is also complex.
+ *
+ * The purpose is to test if a modify can occour during a ldb_search()
+ * before the request is destroyed with TALLOC_FREE()
+ *
+ * This would be a failure if in process
+ * (1) and (2):
+ *  - (1) ldb_search() starts and waits
+ *  - (2) an entry in the DB is allowed to change before the ldb_wait() is called
+ *  - (1) the original process can see the modification before the TALLOC_FREE()
+ * also we check that
+ *  - (1) the original process can see the modification after the TALLOC_FREE()
+ *
+ */
+
+/*
+ * This purpose of this callback is to trigger a write in
+ * the child process before the ldb_wait() is called
+ *
+ * In ldb 1.1.31 ldb_search() omitted to take a all-record
+ * lock for the full duration of the search and callbacks
+ *
+ * We assume that if the write will proceed, it will proceed in a 3
+ * second window after the function is called.
+ */
+
+static int test_ldb_modify_before_ldb_wait_callback1(struct ldb_request *req,
+						     struct ldb_reply *ares)
+{
+	switch (ares->type) {
+	case LDB_REPLY_ENTRY:
+	case LDB_REPLY_REFERRAL:
+		return LDB_SUCCESS;
+
+	case LDB_REPLY_DONE:
+		break;
+	}
+
+	return ldb_request_done(req, LDB_SUCCESS);
+}
+
+static void test_ldb_modify_before_ldb_wait(void **state)
+{
+	struct search_test_ctx *search_test_ctx = talloc_get_type_abort(*state,
+			struct search_test_ctx);
+
+	int ret;
+	struct ldb_request *req;
+	pid_t pid;
+	int wstatus;
+	struct ldb_dn *search_dn;
+	struct ldb_dn *basedn;
+	struct ldb_result *res2;
+	int pipes[2];
+	char buf[2];
+	pid_t child_pid;
+
+	search_dn = ldb_dn_new_fmt(search_test_ctx,
+				   search_test_ctx->ldb_test_ctx->ldb,
+				   "cn=test_search_cn,"
+				   "dc=search_test_entry");
+
+	basedn
+		= ldb_dn_new_fmt(search_test_ctx,
+				 search_test_ctx->ldb_test_ctx->ldb,
+				 "%s",
+				 search_test_ctx->base_dn);
+	assert_non_null(basedn);
+
+
+	/*
+	 * The search just needs to call DONE, we don't care about the
+	 * contents of the search for this test
+	 */
+	ret = ldb_build_search_req(&req,
+				   search_test_ctx->ldb_test_ctx->ldb,
+				   search_test_ctx,
+				   basedn,
+				   LDB_SCOPE_SUBTREE,
+				   "(&(!(filterAttr=*))"
+				   "(cn=test_search_cn))",
+				   NULL,
+				   NULL,
+				   NULL,
+				   test_ldb_modify_before_ldb_wait_callback1,
+				   NULL);
+	assert_int_equal(ret, 0);
+	ret = ldb_request(search_test_ctx->ldb_test_ctx->ldb, req);
+
+	ret = pipe(pipes);
+	assert_int_equal(ret, 0);
+
+	child_pid = fork();
+	if (child_pid == 0) {
+		TALLOC_CTX *tmp_ctx = NULL;
+		struct ldb_message *msg;
+		struct ldb_message_element *el;
+		TALLOC_FREE(search_test_ctx->ldb_test_ctx->ldb);
+		TALLOC_FREE(search_test_ctx->ldb_test_ctx->ev);
+		search_test_ctx->ldb_test_ctx->ev = tevent_context_init(search_test_ctx->ldb_test_ctx);
+		if (search_test_ctx->ldb_test_ctx->ev == NULL) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+
+		search_test_ctx->ldb_test_ctx->ldb = ldb_init(search_test_ctx->ldb_test_ctx,
+					     search_test_ctx->ldb_test_ctx->ev);
+		if (search_test_ctx->ldb_test_ctx->ldb == NULL) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+
+		ret = ldb_connect(search_test_ctx->ldb_test_ctx->ldb,
+				  search_test_ctx->ldb_test_ctx->dbpath, 0, NULL);
+		if (ret != LDB_SUCCESS) {
+			exit(ret);
+		}
+
+		tmp_ctx = talloc_new(search_test_ctx->ldb_test_ctx);
+		if (tmp_ctx == NULL) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+
+		msg = ldb_msg_new(tmp_ctx);
+		if (msg == NULL) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+
+		/*
+		 * We must re-create this DN from a string to ensure
+		 * it does not reference the now-gone LDB context of
+		 * the parent
+		 */
+		msg->dn = ldb_dn_new_fmt(search_test_ctx,
+					 search_test_ctx->ldb_test_ctx->ldb,
+					 "cn=test_search_cn,"
+					 "dc=search_test_entry");
+
+		if (msg->dn == NULL) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+
+		ret = ldb_msg_add_string(msg, "filterAttr", "TRUE");
+		if (ret != 0) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+		el = ldb_msg_find_element(msg, "filterAttr");
+		if (el == NULL) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+		el->flags = LDB_FLAG_MOD_REPLACE;
+
+		ret = ldb_transaction_start(search_test_ctx->ldb_test_ctx->ldb);
+		if (ret != 0) {
+			exit(ret);
+		}
+
+		if (write(pipes[1], "GO", 2) != 2) {
+			exit(LDB_ERR_OPERATIONS_ERROR);
+		}
+
+		ret = ldb_modify(search_test_ctx->ldb_test_ctx->ldb, msg);
+		if (ret != 0) {
+			exit(ret);
+		}
+
+		ret = ldb_transaction_commit(search_test_ctx->ldb_test_ctx->ldb);
+		exit(ret);
+	}
+
+	ret = read(pipes[0], buf, 2);
+	assert_int_equal(ret, 2);
+
+	sleep(3);
+
+	/*
+	 * If writes are not blocked until after the (never called) ldb_wait(), we
+	 * will be able to successfully search for this modification
+	 * here
+	 */
+
+	ret = ldb_search(search_test_ctx->ldb_test_ctx->ldb, search_test_ctx,
+			 &res2, search_dn, LDB_SCOPE_BASE, NULL,
+			 "filterAttr=TRUE");
+
+	/*
+	 * NOTE WELL:
+	 * If these assertions fail, we will also fail to
+	 * clean up as we hold locks
+	 */
+	assert_int_equal(ret, 0);
+
+	/* We should not have got the result */
+	assert_int_equal(res2->count, 0);
+
+	TALLOC_FREE(req);
+
+	pid = waitpid(child_pid, &wstatus, 0);
+	assert_int_equal(pid, child_pid);
+
+	assert_true(WIFEXITED(wstatus));
+
+	assert_int_equal(WEXITSTATUS(wstatus), 0);
+
+	/*
+	 * If writes are blocked until after the search request was freed, we
+	 * will be able to successfully search for this modification
+	 * now
+	 */
+
+	search_dn = ldb_dn_new_fmt(search_test_ctx,
+				   search_test_ctx->ldb_test_ctx->ldb,
+				   "cn=test_search_cn,"
+				   "dc=search_test_entry");
+
+	ret = ldb_search(search_test_ctx->ldb_test_ctx->ldb,
+			 search_test_ctx,
+			 &res2, search_dn, LDB_SCOPE_BASE, NULL,
+			 "filterAttr=TRUE");
+	assert_int_equal(ret, 0);
+
+	/* We got the result */
+	assert_int_equal(res2->count, 1);
+
+
+}
+
 static int ldb_case_test_setup(void **state)
 {
 	int ret;
@@ -2632,6 +2858,9 @@ int main(int argc, const char **argv)
 		cmocka_unit_test_setup_teardown(test_ldb_modify_during_whole_search,
 						ldb_search_test_setup,
 						ldb_search_test_teardown),
+		cmocka_unit_test_setup_teardown(test_ldb_modify_before_ldb_wait,
+						ldb_search_test_setup,
+						ldb_search_test_teardown),
 		cmocka_unit_test_setup_teardown(test_ldb_attrs_case_insensitive,
 						ldb_case_test_setup,
 						ldb_case_test_teardown),
-- 
2.11.0


From b8d66c129b455ad372a5f03b67e1983ad900b3ee Mon Sep 17 00:00:00 2001
From: Andrew Bartlett <abartlet at samba.org>
Date: Fri, 16 Jun 2017 15:44:46 +1200
Subject: [PATCH 18/24] ldb: Extend api.py testsuite to show transaction
 contents can not be seen outside the transaction

Signed-off-by: Andrew Bartlett <abartlet at samba.org>
---
 lib/ldb/tests/python/api.py | 66 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)

diff --git a/lib/ldb/tests/python/api.py b/lib/ldb/tests/python/api.py
index 4d290ef0c01..f9999baeb9a 100755
--- a/lib/ldb/tests/python/api.py
+++ b/lib/ldb/tests/python/api.py
@@ -1315,6 +1315,72 @@ class LdbResultTests(TestCase):
                 found = True
         self.assertTrue(found)
 
+
+    # Show that search results can't see into a transaction
+    def test_search_against_trans(self):
+        found11 = False
+
+        (r1, w1) = os.pipe()
+
+        (r2, w2) = os.pipe()
+
+        # For the first element, fork a child that will
+        # write to the DB
+        pid = os.fork()
+        if pid == 0:
+            # In the child, re-open
+            del(self.l)
+            gc.collect()
+
+            child_ldb = ldb.Ldb(self.filename)
+            # start a transaction
+            child_ldb.transaction_start()
+
+            # write to it
+            child_ldb.add({"dn": "OU=OU11,DC=SAMBA,DC=ORG",
+                           "name": b"samba.org"})
+
+            os.write(w1, b"added")
+
+            # Now wait for the search to be done
+            os.read(r2, 6)
+
+            # and commit
+            try:
+                child_ldb.transaction_commit()
+            except LdbError as err:
+                # We print this here to see what went wrong in the child
+                print(err)
+                os._exit(1)
+
+            os.write(w1, b"transaction")
+            os._exit(0)
+
+        self.assertEqual(os.read(r1, 5), b"added")
+
+        # This should not turn up until the transaction is concluded
+        res11 = self.l.search(base="OU=OU11,DC=SAMBA,DC=ORG",
+                            scope=ldb.SCOPE_BASE)
+        self.assertEqual(len(res11), 0)
+
+        os.write(w2, b"search")
+
+        # Now wait for the transaction to be done.  This should
+        # deadlock, but the search doesn't hold a read lock for the
+        # iterator lifetime currently.
+        self.assertEqual(os.read(r1, 11), b"transaction")
+
+        # This should now turn up, as the transaction is over
+        res11 = self.l.search(base="OU=OU11,DC=SAMBA,DC=ORG",
+                            scope=ldb.SCOPE_BASE)
+        self.assertEqual(len(res11), 1)
+
+        self.assertFalse(found11)
+
+        (got_pid, status) = os.waitpid(pid, 0)
+        self.assertEqual(got_pid, pid)
+
+
     def test_search_iter_against_trans(self):
         found = False
         found11 = False
-- 
2.11.0


From aff259e15ecf4b7a728878fbbc7c25bf06173666 Mon Sep 17 00:00:00 2001
From: Andrew Bartlett <abartlet at samba.org>
Date: Fri, 16 Jun 2017 15:49:16 +1200
Subject: [PATCH 19/24] ldb: Extend api.py testsuite to show
 transaction_commit() blocks against the whole-db read lock

The new ldb whole-db lock behaviour now allows this test

Signed-off-by: Andrew Bartlett <abartlet at samba.org>
---
 lib/ldb/tests/python/api.py | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/lib/ldb/tests/python/api.py b/lib/ldb/tests/python/api.py
index f9999baeb9a..0c3b20e28ad 100755
--- a/lib/ldb/tests/python/api.py
+++ b/lib/ldb/tests/python/api.py
@@ -1393,16 +1393,13 @@ class LdbResultTests(TestCase):
 
         (r2, w2) = os.pipe()
 
-        l = next(res)
-        if str(l.dn) == "OU=OU10,DC=SAMBA,DC=ORG":
-            found = True
-
         # For the first element, with the sequence open (which
         # means with ldb locks held), fork a child that will
         # write to the DB
         pid = os.fork()
         if pid == 0:
             # In the child, re-open
+            del(res)
             del(self.l)
             gc.collect()
 
@@ -1439,25 +1436,26 @@ class LdbResultTests(TestCase):
 
         os.write(w2, b"search")
 
-        # Now wait for the transaction to be done.  This should
-        # deadlock, but the search doesn't hold a read lock for the
-        # iterator lifetime currently.
-        self.assertEqual(os.read(r1, 11), b"transaction")
+        # allow the transaction to start
+        time.sleep(1)
 
         # This should not turn up until the search finishes and
         # removed the read lock, but for ldb_tdb that happened as soon
         # as we called the first res.next()
         res11 = self.l.search(base="OU=OU11,DC=SAMBA,DC=ORG",
                             scope=ldb.SCOPE_BASE)
-        self.assertEqual(len(res11), 1)
+        self.assertEqual(len(res11), 0)
 
-        # These results were actually collected at the first next(res) call
+        # These results are all collected at the first next(res) call
         for l in res:
             if str(l.dn) == "OU=OU10,DC=SAMBA,DC=ORG":
                 found = True
             if str(l.dn) == "OU=OU11,DC=SAMBA,DC=ORG":
                 found11 = True
 
+        # Now wait for the transaction to be done.
+        self.assertEqual(os.read(r1, 11), b"transaction")
+
         # This should now turn up, as the transaction is over and all
         # read locks are gone
         res11 = self.l.search(base="OU=OU11,DC=SAMBA,DC=ORG",
-- 
2.11.0


From 3734b9843a7b3e6d5a1e6fedbf9a5b6b91481531 Mon Sep 17 00:00:00 2001
From: Stefan Metzmacher <metze at samba.org>
Date: Tue, 11 Apr 2017 17:50:08 +0200
Subject: [PATCH 20/24] TODO: ldb: version 1.1.32

* fix ldb_tdb locking (performance) problems
* fix ldb_tdb search inconsistencies
* add cmocka based tests for the locking issues

TODO: review...
---
 lib/ldb/wscript | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/ldb/wscript b/lib/ldb/wscript
index 1fce3b3ba57..8bea264bca4 100644
--- a/lib/ldb/wscript
+++ b/lib/ldb/wscript
@@ -1,7 +1,7 @@
 #!/usr/bin/env python
 
 APPNAME = 'ldb'
-VERSION = '1.1.31'
+VERSION = '1.1.32'
 
 blddir = 'bin'
 
-- 
2.11.0


From 1d53913f1ed0c9567a0944b9d1f24546dc09d04f Mon Sep 17 00:00:00 2001
From: Andrew Bartlett <abartlet at samba.org>
Date: Fri, 16 Jun 2017 15:49:45 +1200
Subject: [PATCH 21/24] dsdb: Add test showing that the whole DB (including
 partitions) is locked during a search

Signed-off-by: Andrew Bartlett <abartlet at samba.org>
---
 python/samba/tests/dsdb.py       | 108 ++++++++++++++++++++++++++++++++++++++-
 selftest/knownfail.d/ldb-locking |   1 +
 2 files changed, 108 insertions(+), 1 deletion(-)
 create mode 100644 selftest/knownfail.d/ldb-locking

diff --git a/python/samba/tests/dsdb.py b/python/samba/tests/dsdb.py
index 4a34bacb73f..71f5ae5366a 100644
--- a/python/samba/tests/dsdb.py
+++ b/python/samba/tests/dsdb.py
@@ -26,7 +26,8 @@ from samba.dcerpc import drsblobs
 import ldb
 import os
 import samba
-
+import gc
+import time
 
 class DsdbTests(TestCase):
 
@@ -149,3 +150,108 @@ class DsdbTests(TestCase):
         version = self.samdb.get_attribute_replmetadata_version(dn, "description")
         self.samdb.set_attribute_replmetadata_version(dn, "description", version + 2)
         self.assertEqual(self.samdb.get_attribute_replmetadata_version(dn, "description"), version + 2)
+
+    def test_db_lock(self):
+        basedn = self.samdb.get_default_basedn()
+        backend_filename = "%s.ldb" % basedn.get_casefold()
+        backend_subpath = os.path.join("sam.ldb.d",
+                                       backend_filename)
+        backend_path = self.lp.private_path(backend_subpath)
+        (r1, w1) = os.pipe()
+
+        pid = os.fork()
+        if pid == 0:
+            # In the child, close the main DB, re-open just one DB
+            del(self.samdb)
+            gc.collect()
+
+            backenddb = ldb.Ldb(backend_path)
+
+
+            backenddb.transaction_start()
+
+            backenddb.add({"dn":"@DSDB_LOCK_TEST"})
+            backenddb.delete("@DSDB_LOCK_TEST")
+
+            # Obtain a write lock
+            backenddb.transaction_prepare_commit()
+            os.write(w1, b"added")
+            time.sleep(2)
+
+            # Drop the write lock
+            backenddb.transaction_cancel()
+            os._exit(0)
+
+        self.assertEqual(os.read(r1, 5), b"added")
+
+        start = time.time()
+
+        # We need to hold this iterator open to hold the all-record
+        # lock.
+        res = self.samdb.search_iterator()
+
+        # This should take at least 2 seconds because the transaction
+        # has a write lock on one backend db open
+
+        # Relase the locks
+        for l in res:
+            pass
+
+        end = time.time()
+        self.assertGreater(end - start, 1.9)
+
+        (got_pid, status) = os.waitpid(pid, 0)
+        self.assertEqual(got_pid, pid)
+
+
+    def test_full_db_lock(self):
+        basedn = self.samdb.get_default_basedn()
+        backend_filename = "%s.ldb" % basedn.get_casefold()
+        backend_subpath = os.path.join("sam.ldb.d",
+                                       backend_filename)
+        backend_path = self.lp.private_path(backend_subpath)
+        (r1, w1) = os.pipe()
+
+        pid = os.fork()
+        if pid == 0:
+            # In the child, close the main DB, re-open just one DB
+            del(self.samdb)
+            gc.collect()
+
+            backenddb = ldb.Ldb(backend_path)
+
+
+            backenddb.transaction_start()
+
+            backenddb.add({"dn":"@DSDB_LOCK_TEST"})
+            backenddb.delete("@DSDB_LOCK_TEST")
+
+            # Obtain a write lock
+            backenddb.transaction_prepare_commit()
+            os.write(w1, b"added")
+            time.sleep(2)
+
+            # Drop the write lock
+            backenddb.transaction_cancel()
+            os._exit(0)
+
+        self.assertEqual(os.read(r1, 5), b"added")
+
+        start = time.time()
+
+        # We need to hold this iterator open to hold the all-record
+        # lock.
+        res = self.samdb.search_iterator()
+
+        # This should take at least 2 seconds because the transaction
+        # has a write lock on one backend db open
+
+        end = time.time()
+        self.assertGreater(end - start, 1.9)
+
+        # Relase the locks
+        for l in res:
+            pass
+
+        (got_pid, status) = os.waitpid(pid, 0)
+        self.assertEqual(got_pid, pid)
diff --git a/selftest/knownfail.d/ldb-locking b/selftest/knownfail.d/ldb-locking
new file mode 100644
index 00000000000..5b569d26e86
--- /dev/null
+++ b/selftest/knownfail.d/ldb-locking
@@ -0,0 +1 @@
+samba.tests.dsdb.samba.tests.dsdb.DsdbTests.test_full_db_lock\(ad_dc_ntvfs:local\)
-- 
2.11.0


From f831c2c5ffaaf40d06457178ba09ed99122477ba Mon Sep 17 00:00:00 2001
From: Andrew Bartlett <abartlet at samba.org>
Date: Tue, 23 May 2017 15:11:59 +1200
Subject: [PATCH 22/24] dsdb: Teach the Samba partition module how to lock all
 the DB backends

Signed-off-by: Andrew Bartlett <abartlet at samba.org>
---
 selftest/knownfail.d/ldb-locking           |   1 -
 source4/dsdb/samdb/ldb_modules/partition.c | 149 ++++++++++++++++++++++++++++-
 2 files changed, 148 insertions(+), 2 deletions(-)
 delete mode 100644 selftest/knownfail.d/ldb-locking

diff --git a/selftest/knownfail.d/ldb-locking b/selftest/knownfail.d/ldb-locking
deleted file mode 100644
index 5b569d26e86..00000000000
--- a/selftest/knownfail.d/ldb-locking
+++ /dev/null
@@ -1 +0,0 @@
-samba.tests.dsdb.samba.tests.dsdb.DsdbTests.test_full_db_lock\(ad_dc_ntvfs:local\)
diff --git a/source4/dsdb/samdb/ldb_modules/partition.c b/source4/dsdb/samdb/ldb_modules/partition.c
index 08a959cabb5..5620a34bfc8 100644
--- a/source4/dsdb/samdb/ldb_modules/partition.c
+++ b/source4/dsdb/samdb/ldb_modules/partition.c
@@ -814,7 +814,7 @@ static int partition_start_trans(struct ldb_module *module)
 		ldb_debug(ldb_module_get_ctx(module), LDB_DEBUG_TRACE, "partition_start_trans() -> (metadata partition)");
 	}
 
-	/* This order must match that in prepare_commit() */
+	/* This order must match that in prepare_commit() and lock_backend */
 	ret = ldb_next_start_trans(module);
 	if (ret != LDB_SUCCESS) {
 		return ret;
@@ -1144,6 +1144,151 @@ static int partition_sequence_number(struct ldb_module *module, struct ldb_reque
 	return ldb_module_done(req, NULL, ext, LDB_SUCCESS);
 }
 
+/* lock or unlock all the backends */
+static int partition_lock(struct ldb_module *module)
+{
+	int i;
+	int ret;
+	int ret2;
+	struct ldb_context *ldb = ldb_module_get_ctx(module);
+	struct partition_private_data *data = talloc_get_type(ldb_module_get_private(module),
+							      struct partition_private_data);
+	if (ldb_module_flags(ldb) & LDB_FLG_ENABLE_TRACING) {
+		ldb_debug(ldb, LDB_DEBUG_TRACE, "partition_lock_backend() -> (metadata partition)");
+	}
+
+	/*
+	 * It is important to only do this for LOCK because:
+	 * - we don't want to unlock what we did not lock
+	 *
+	 * - we don't want to make a new lock on the sam.ldb
+	 *   (triggered inside this routine due to the seq num check)
+	 *   during an unlock phase as that will violate the lock
+	 *   ordering
+	 */
+
+	/*
+	 * This will lock the metadata partition (sam.ldb) and
+	 * will also call event loops, so we do it before we
+	 * get the whole db lock.
+		 */
+	ret = partition_reload_if_required(module, data, NULL);
+	if (ret != LDB_SUCCESS) {
+		return ret;
+	}
+
+	/*
+	 * This order must match that in prepare_commit(), start with
+	 * the metadata partition (sam.ldb) lock
+	 */
+	ret = ldb_next_read_lock(module);
+	if (ret != LDB_SUCCESS) {
+		ldb_debug_set(ldb,
+			      LDB_DEBUG_FATAL,
+			      "Failed to lock db: %s / %s for metadata partition",
+			      ldb_errstring(ldb),
+			      ldb_strerror(ret));
+
+		return ret;
+	}
+
+	for (i=0; data && data->partitions && data->partitions[i]; i++) {
+		if ((module && ldb_module_flags(ldb) & LDB_FLG_ENABLE_TRACING)) {
+			ldb_debug(ldb, LDB_DEBUG_TRACE,
+				  "partition_lock_backend() -> %s",
+				  ldb_dn_get_linearized(data->partitions[i]->ctrl->dn));
+		}
+		ret = ldb_next_read_lock(data->partitions[i]->module);
+		if (ret == LDB_SUCCESS) {
+			continue;
+		}
+
+		ldb_debug_set(ldb,
+			      LDB_DEBUG_FATAL,
+			      "Failed to lock db: %s / %s for %s",
+			      ldb_errstring(ldb),
+			      ldb_strerror(ret),
+			      ldb_dn_get_linearized(data->partitions[i]->ctrl->dn));
+
+		/* Back it out, if it fails on one */
+		for (i--; i >= 0; i--) {
+			ret2 = ldb_next_read_unlock(data->partitions[i]->module);
+			if (ret2 != LDB_SUCCESS) {
+				ldb_debug(ldb,
+					  LDB_DEBUG_FATAL,
+					  "Failed to unlock db: %s / %s",
+					  ldb_errstring(ldb),
+					  ldb_strerror(ret2));
+			}
+		}
+		ret2 = ldb_next_read_unlock(module);
+		if (ret2 != LDB_SUCCESS) {
+			ldb_debug(ldb,
+				  LDB_DEBUG_FATAL,
+				  "Failed to unlock db: %s / %s",
+				  ldb_errstring(ldb),
+				  ldb_strerror(ret2));
+		}
+		return ret;
+	}
+
+	return LDB_SUCCESS;
+}
+
+static int partition_unlock(struct ldb_module *module)
+{
+	int i;
+	int ret;
+	int ret2;
+	struct ldb_context *ldb = ldb_module_get_ctx(module);
+	struct partition_private_data *data = talloc_get_type(ldb_module_get_private(module),
+							      struct partition_private_data);
+	if (ldb_module_flags(ldb) & LDB_FLG_ENABLE_TRACING) {
+		ldb_debug(ldb, LDB_DEBUG_TRACE, "partition_unlock_backend() -> (metadata partition)");
+	}
+
+	/*
+	 * This order must match that in read_lock(), start with
+	 * the metadata partition (sam.ldb) lock
+	 */
+	ret = ldb_next_read_unlock(module);
+	if (ret != LDB_SUCCESS) {
+		ldb_debug_set(ldb,
+			      LDB_DEBUG_FATAL,
+			      "Failed to unlock db: %s / %s for metadata partition",
+			      ldb_errstring(ldb),
+			      ldb_strerror(ret));
+	}
+
+	for (i=0; data && data->partitions && data->partitions[i]; i++) {
+		if ((module && ldb_module_flags(ldb) & LDB_FLG_ENABLE_TRACING)) {
+			ldb_debug(ldb, LDB_DEBUG_TRACE,
+				  "partition_unlock_backend() -> %s",
+				  ldb_dn_get_linearized(data->partitions[i]->ctrl->dn));
+		}
+		ret2 = ldb_next_read_unlock(data->partitions[i]->module);
+		if (ret2 != LDB_SUCCESS) {
+			ldb_debug_set(ldb,
+				      LDB_DEBUG_FATAL,
+				      "Failed to lock db: %s / %s for %s",
+				      ldb_errstring(ldb),
+				      ldb_strerror(ret),
+				      ldb_dn_get_linearized(data->partitions[i]->ctrl->dn));
+
+			/*
+			 * Don't overwrite the original failure code
+			 * if there was one
+			 */
+			if (ret == LDB_SUCCESS) {
+				ret = ret2;
+			}
+		}
+	}
+
+	return ret;
+}
+
+
 /* extended */
 static int partition_extended(struct ldb_module *module, struct ldb_request *req)
 {
@@ -1198,6 +1343,8 @@ static const struct ldb_module_ops ldb_partition_module_ops = {
 	.prepare_commit    = partition_prepare_commit,
 	.end_transaction   = partition_end_trans,
 	.del_transaction   = partition_del_trans,
+	.read_lock         = partition_lock,
+	.read_unlock       = partition_unlock
 };
 
 int ldb_partition_module_init(const char *version)
-- 
2.11.0


From 3bf5168d5071782c1884ea316b78f1738c88a28e Mon Sep 17 00:00:00 2001
From: Andrew Bartlett <abartlet at samba.org>
Date: Wed, 7 Jun 2017 10:44:50 +1200
Subject: [PATCH 23/24] dsdb: Do not run dsdb_replace() on the calculated
 difference between old and new schema

We can set the database @INDEXLIST and @ATTRIBUTES to the full calculated
values, not the difference, and let the ldb layer work it out under the
transaction lock.

Signed-off-by: Andrew Bartlett <abartlet at samba.org>
---
 source4/dsdb/schema/schema_set.c | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/source4/dsdb/schema/schema_set.c b/source4/dsdb/schema/schema_set.c
index 977c9e339b6..df27e19a944 100644
--- a/source4/dsdb/schema/schema_set.c
+++ b/source4/dsdb/schema/schema_set.c
@@ -174,7 +174,12 @@ int dsdb_schema_set_indices_and_attributes(struct ldb_context *ldb,
 			goto op_error;
 		}
 		if (mod_msg->num_elements > 0) {
-			ret = dsdb_replace(ldb, mod_msg, 0);
+			/*
+			 * Do the replace with the constructed message,
+			 * to avoid needing a lock between this search
+			 * and the replace
+			 */
+			ret = dsdb_replace(ldb, msg, 0);
 		}
 		talloc_free(mod_msg);
 	}
@@ -210,7 +215,12 @@ int dsdb_schema_set_indices_and_attributes(struct ldb_context *ldb,
 			goto op_error;
 		}
 		if (mod_msg->num_elements > 0) {
-			ret = dsdb_replace(ldb, mod_msg, 0);
+			/*
+			 * Do the replace with the constructed message,
+			 * to avoid needing a lock between this search
+			 * and the replace
+			 */
+			ret = dsdb_replace(ldb, msg_idx, 0);
 		}
 		talloc_free(mod_msg);
 	}
-- 
2.11.0


From 11d24e388c51b6338079fe689a1186ea004263fb Mon Sep 17 00:00:00 2001
From: Andrew Bartlett <abartlet at samba.org>
Date: Sat, 10 Jun 2017 19:23:34 +1200
Subject: [PATCH 24/24] dsdb: Add comment explaining requirements on
 DSDB_EXTENDED_SCHEMA_UPDATE_NOW_OID

Signed-off-by: Andrew Bartlett <abartlet at samba.org>
---
 source4/dsdb/samdb/ldb_modules/schema_load.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/source4/dsdb/samdb/ldb_modules/schema_load.c b/source4/dsdb/samdb/ldb_modules/schema_load.c
index 3fba97e4f20..92160943a89 100644
--- a/source4/dsdb/samdb/ldb_modules/schema_load.c
+++ b/source4/dsdb/samdb/ldb_modules/schema_load.c
@@ -530,12 +530,13 @@ static int schema_load_del_transaction(struct ldb_module *module)
 	return ldb_next_del_trans(module);
 }
 
+/* This is called in a transaction held by the callers */
 static int schema_load_extended(struct ldb_module *module, struct ldb_request *req)
 {
 	struct ldb_context *ldb = ldb_module_get_ctx(module);
 	struct dsdb_schema *schema;
 	int ret;
-	
+
 	if (strcmp(req->op.extended.oid, DSDB_EXTENDED_SCHEMA_UPDATE_NOW_OID) != 0) {
 		return ldb_next_request(module, req);
 	}
-- 
2.11.0



More information about the samba-technical mailing list