[PATCH] One-level searches were inefficient

Tim Beale timbeale at catalyst.net.nz
Thu Jan 31 04:39:13 UTC 2019


I noticed that some types of one-level LDB searches could end up being
inefficient and trying to filter more objects than they needed to. The
attached patch addresses the problem, and tries to improve the comments
in the surrounding code a bit.

CI pass: https://gitlab.com/catalyst-samba/samba/pipelines/45501963

Review appreciated. Thanks.

-------------- next part --------------
From d2373e6132267ec1d877439ac94b05d235734d9e Mon Sep 17 00:00:00 2001
From: Tim Beale <timbeale at catalyst.net.nz>
Date: Thu, 10 Jan 2019 13:34:18 +1300
Subject: [PATCH 1/4] ldb: Avoid inefficient one-level searches

Commit 88ae60ed186c9 introduced a problem that made one-level
searches inefficient if there were a lot of child objects in the same
level, and the requested object didn't exist. Basically, it ignored the
case where ldb_kv_index_dn() returned LDB_ERR_NO_SUCH_OBJECT, i.e. the
indexed lookup was successful, but didn't find a match. At which point,
there was no more processing we needed to do.

The behaviour after 88ae60ed186c9 was to fall-through and run the
ldb_kv_index_filter() function over *all* the children. This still
returned the correct result, but could be costly if there were a lot of
children.

The case 88ae60ed186c9 was trying to fix was where we could not do
an indexed search (e.g. trying to match on a 'attribute=*' filter). In
which case we want to ignore the LDB_ERR_OPERATIONS_ERROR and just run
ldb_kv_index_filter() over all the children. This is still more
efficient than the fallback of doing a full database scan.

This patch adds in a short-circuit for the NO_SUCH_OBJECT case, so we
can skip the unnecessary ldb_kv_index_filter() work.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13762

Signed-off-by: Tim Beale <timbeale at catalyst.net.nz>
---
 lib/ldb/ldb_key_value/ldb_kv_index.c | 17 ++++++++++++++---
 1 file changed, 14 insertions(+), 3 deletions(-)

diff --git a/lib/ldb/ldb_key_value/ldb_kv_index.c b/lib/ldb/ldb_key_value/ldb_kv_index.c
index 6c21c19..2ada48d 100644
--- a/lib/ldb/ldb_key_value/ldb_kv_index.c
+++ b/lib/ldb/ldb_key_value/ldb_kv_index.c
@@ -2051,13 +2051,24 @@ int ldb_kv_search_indexed(struct ldb_kv_context *ac, uint32_t *match_count)
 			}
 			/*
 			 * Here we load the index for the tree.
-			 *
+			 */
+			ret = ldb_kv_index_dn(
+			    ac->module, ldb_kv, ac->tree, idx_one_tree_list);
+
+			/*
+			 * We can stop if we're sure the object doesn't exist
+			 */
+			if (ret == LDB_ERR_NO_SUCH_OBJECT) {
+				talloc_free(idx_one_tree_list);
+				talloc_free(dn_list);
+				return LDB_ERR_NO_SUCH_OBJECT;
+			}
+
+			/*
 			 * We only care if this is successful, if the
 			 * index can't trim the result list down then
 			 * the ONELEVEL index is still good enough.
 			 */
-			ret = ldb_kv_index_dn(
-			    ac->module, ldb_kv, ac->tree, idx_one_tree_list);
 			if (ret == LDB_SUCCESS) {
 				if (!list_intersect(ldb,
 						    ldb_kv,
-- 
2.7.4


From 7c3417898fa3404f1512a930f38b84ff5f52bf8d Mon Sep 17 00:00:00 2001
From: Tim Beale <timbeale at catalyst.net.nz>
Date: Thu, 10 Jan 2019 13:53:47 +1300
Subject: [PATCH 2/4] ldb: Remove comment that no longer makes sense

This comment was written before the GUID_index_attribute block of code
existed. So we now *do* load the index values and *do* check for a
strict intersect, so the comment is redundant.

Signed-off-by: Tim Beale <timbeale at catalyst.net.nz>
---
 lib/ldb/ldb_key_value/ldb_kv_index.c | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/lib/ldb/ldb_key_value/ldb_kv_index.c b/lib/ldb/ldb_key_value/ldb_kv_index.c
index 2ada48d..9c65b6f 100644
--- a/lib/ldb/ldb_key_value/ldb_kv_index.c
+++ b/lib/ldb/ldb_key_value/ldb_kv_index.c
@@ -2008,11 +2008,6 @@ int ldb_kv_search_indexed(struct ldb_kv_context *ac, uint32_t *match_count)
 		return ldb_operr(ldb);
 
 	case LDB_SCOPE_ONELEVEL:
-		/*
-		 * If we ever start to also load the index values for
-		 * the tree, we must ensure we strictly intersect with
-		 * this list, as we trust the ONELEVEL index
-		 */
 		ret = ldb_kv_index_dn_one(ac->module,
 					  ldb_kv,
 					  ac->base,
-- 
2.7.4


From 6d7ccfab1119bfe9f130b5636f77104f32df3283 Mon Sep 17 00:00:00 2001
From: Tim Beale <timbeale at catalyst.net.nz>
Date: Thu, 10 Jan 2019 14:19:19 +1300
Subject: [PATCH 3/4] ldb: Elaborate on ldb_kv_search_indexed() comments

Disclaimer: this is based on my limited understanding of what the code
is doing.

Signed-off-by: Tim Beale <timbeale at catalyst.net.nz>
---
 lib/ldb/ldb_key_value/ldb_kv_index.c | 33 ++++++++++++++++++++++++++-------
 1 file changed, 26 insertions(+), 7 deletions(-)

diff --git a/lib/ldb/ldb_key_value/ldb_kv_index.c b/lib/ldb/ldb_key_value/ldb_kv_index.c
index 9c65b6f..d8bdf61 100644
--- a/lib/ldb/ldb_key_value/ldb_kv_index.c
+++ b/lib/ldb/ldb_key_value/ldb_kv_index.c
@@ -2008,6 +2008,12 @@ int ldb_kv_search_indexed(struct ldb_kv_context *ac, uint32_t *match_count)
 		return ldb_operr(ldb);
 
 	case LDB_SCOPE_ONELEVEL:
+
+		/*
+		 * First, load all the one-level child objects (regardless of
+		 * whether they match the search filter or not). The database
+		 * maintains a one-level index, so retrieving this is quick.
+		 */
 		ret = ldb_kv_index_dn_one(ac->module,
 					  ldb_kv,
 					  ac->base,
@@ -2019,9 +2025,12 @@ int ldb_kv_search_indexed(struct ldb_kv_context *ac, uint32_t *match_count)
 		}
 
 		/*
-		 * If we have too many matches, running the filter
-		 * tree over the SCOPE_ONELEVEL can be quite expensive
-		 * so we now check the filter tree index as well.
+		 * If we have too many children, running ldb_kv_index_filter()
+		 * over all the child objects can be quite expensive. So next
+		 * we do a separate indexed query using the search filter.
+		 *
+		 * This should be quick, but it may return objects that are not
+		 * the direct one-level child objects we're interested in.
 		 *
 		 * We only do this in the GUID index mode, which is
 		 * O(n*log(m)) otherwise the intersection below will
@@ -2044,8 +2053,9 @@ int ldb_kv_search_indexed(struct ldb_kv_context *ac, uint32_t *match_count)
 				talloc_free(dn_list);
 				return LDB_ERR_OPERATIONS_ERROR;
 			}
+
 			/*
-			 * Here we load the index for the tree.
+			 * Try to do an indexed database search
 			 */
 			ret = ldb_kv_index_dn(
 			    ac->module, ldb_kv, ac->tree, idx_one_tree_list);
@@ -2060,9 +2070,18 @@ int ldb_kv_search_indexed(struct ldb_kv_context *ac, uint32_t *match_count)
 			}
 
 			/*
-			 * We only care if this is successful, if the
-			 * index can't trim the result list down then
-			 * the ONELEVEL index is still good enough.
+			 * Once we have a successful search result, we
+			 * intersect it with the one-level children (dn_list).
+			 * This should give us exactly the result we're after
+			 * (we still need to run ldb_kv_index_filter() to
+			 * handle potential index truncation cases).
+			 *
+			 * The indexed search may fail because we don't support
+			 * indexing on that type of search operation, e.g.
+			 * matching against '*'. In which case we fall through
+			 * and run ldb_kv_index_filter() over all the one-level
+			 * children (which is still better than bailing out here
+			 * and falling back to a full DB scan).
 			 */
 			if (ret == LDB_SUCCESS) {
 				if (!list_intersect(ldb,
-- 
2.7.4


From 3519f1f0599844d7b50f235cc1e2f0a306a4246b Mon Sep 17 00:00:00 2001
From: Tim Beale <timbeale at catalyst.net.nz>
Date: Thu, 10 Jan 2019 14:25:06 +1300
Subject: [PATCH 4/4] ldb: Rename variable

The old name confused me because it's not really related to the
one-level index at all. It's the result from evaluating the indexed
search specified in the ac->tree.

Signed-off-by: Tim Beale <timbeale at catalyst.net.nz>
---
 lib/ldb/ldb_key_value/ldb_kv_index.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/lib/ldb/ldb_key_value/ldb_kv_index.c b/lib/ldb/ldb_key_value/ldb_kv_index.c
index d8bdf61..9a4a0db 100644
--- a/lib/ldb/ldb_key_value/ldb_kv_index.c
+++ b/lib/ldb/ldb_key_value/ldb_kv_index.c
@@ -2041,15 +2041,15 @@ int ldb_kv_search_indexed(struct ldb_kv_context *ac, uint32_t *match_count)
 		 * fast enough in the small case.
 		 */
 		if (ldb_kv->cache->GUID_index_attribute != NULL) {
-			struct dn_list *idx_one_tree_list
+			struct dn_list *indexed_search_result
 				= talloc_zero(ac, struct dn_list);
-			if (idx_one_tree_list == NULL) {
+			if (indexed_search_result == NULL) {
 				talloc_free(dn_list);
 				return ldb_module_oom(ac->module);
 			}
 
 			if (!ldb_kv->cache->attribute_indexes) {
-				talloc_free(idx_one_tree_list);
+				talloc_free(indexed_search_result);
 				talloc_free(dn_list);
 				return LDB_ERR_OPERATIONS_ERROR;
 			}
@@ -2058,13 +2058,14 @@ int ldb_kv_search_indexed(struct ldb_kv_context *ac, uint32_t *match_count)
 			 * Try to do an indexed database search
 			 */
 			ret = ldb_kv_index_dn(
-			    ac->module, ldb_kv, ac->tree, idx_one_tree_list);
+			    ac->module, ldb_kv, ac->tree,
+			    indexed_search_result);
 
 			/*
 			 * We can stop if we're sure the object doesn't exist
 			 */
 			if (ret == LDB_ERR_NO_SUCH_OBJECT) {
-				talloc_free(idx_one_tree_list);
+				talloc_free(indexed_search_result);
 				talloc_free(dn_list);
 				return LDB_ERR_NO_SUCH_OBJECT;
 			}
@@ -2087,8 +2088,8 @@ int ldb_kv_search_indexed(struct ldb_kv_context *ac, uint32_t *match_count)
 				if (!list_intersect(ldb,
 						    ldb_kv,
 						    dn_list,
-						    idx_one_tree_list)) {
-					talloc_free(idx_one_tree_list);
+						    indexed_search_result)) {
+					talloc_free(indexed_search_result);
 					talloc_free(dn_list);
 					return LDB_ERR_OPERATIONS_ERROR;
 				}
-- 
2.7.4



More information about the samba-technical mailing list