RFC: dbwrap_ctdb and empty vs deleted records

Ralph Böhme slow at samba.org
Mon Aug 8 16:35:09 UTC 2016


On Mon, Aug 08, 2016 at 12:56:20PM +0200, Michael Adam wrote:
> On 2016-08-08 at 12:38 +0200, Michael Adam wrote:
> > Hi Ralph,
> > 
> > after thinking for quite some time about this, and
> > looking through the code with increasingly watery
> > eyes, I have a few comments:
> > 
> > In summary, I think the patch is wrong in this exact
> > form. But I have a proposal of a modification that
> > should work always.
> > 
> > See the details below:
> > 
> > ...
> >
> > Alternative patch attached.
> 
> OOps, that one was buggy.
> New attempt attached :-)

after a fruitful pair-programming session with Michael we came up with
a simplified and hopefully correct version that I also just pushed. :)

Cheerio!
-slow
-------------- next part --------------
From f68a25904dfc6af62d361dfb31ec04a39348bec2 Mon Sep 17 00:00:00 2001
From: Ralph Boehme <slow at samba.org>
Date: Sat, 23 Jul 2016 11:08:13 +0200
Subject: [PATCH 1/2] s4/torture: add a test for ctdb-tombstrone-record
 deadlock

This tests for a possible deadlock between smbd and ctdb dealing with
ctdb tombstone records.

Commit 925625b52886d40b50fc631bad8bdc81970f7598 explains the deadlock in
more details and contains the fix. It's a fix for a regression
introduced by the patch for bug 10008 (1cae59ce112c).

If you ever want to use this test against that specific commit:

$ git checkout 925625b52886d40b50fc631bad8bdc81970f7598
$ git cherry-pick THIS_COMMIT

This should not deadlock on a ctdb cluster.

$ git revert 925625b52886d40b50fc631bad8bdc81970f7598

This will deadlock.

Bug: https://bugzilla.samba.org/show_bug.cgi?id=12005

Pair-Programmed-With: Michael Adam <obnox at samba.org>

Signed-off-by: Ralph Boehme <slow at samba.org>
Signed-off-by: Michael Adam <obnox at samba.org>
---
 source4/torture/smb2/lock.c | 64 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 64 insertions(+)

diff --git a/source4/torture/smb2/lock.c b/source4/torture/smb2/lock.c
index 68e353d..3900abf 100644
--- a/source4/torture/smb2/lock.c
+++ b/source4/torture/smb2/lock.c
@@ -3033,6 +3033,69 @@ done:
 	return ret;
 }
 
+/**
+ * Test lock interaction between smbd and ctdb with tombstone records.
+ *
+ * Re-locking an unlocked record could lead to a deadlock between
+ * smbd and ctdb. Make sure we don't regress.
+ *
+ * https://bugzilla.samba.org/show_bug.cgi?id=12005
+ * https://bugzilla.samba.org/show_bug.cgi?id=10008
+ */
+static bool test_deadlock(struct torture_context *torture,
+			  struct smb2_tree *tree)
+{
+	NTSTATUS status;
+	bool ret = true;
+	struct smb2_handle _h;
+	struct smb2_handle *h = NULL;
+	uint8_t buf[200];
+	const char *fname = BASEDIR "\\deadlock.txt";
+
+	if (!lpcfg_clustering(torture->lp_ctx)) {
+		torture_skip(torture, "Test must be run on a ctdb cluster\n");
+		return true;
+	}
+
+	status = torture_smb2_testdir(tree, BASEDIR, &_h);
+	torture_assert_ntstatus_ok(torture, status,
+				   "torture_smb2_testdir failed");
+	smb2_util_close(tree, _h);
+
+	status = torture_smb2_testfile(tree, fname, &_h);
+	torture_assert_ntstatus_ok_goto(torture, status, ret, done,
+					"torture_smb2_testfile failed");
+	h = &_h;
+
+	ZERO_STRUCT(buf);
+	status = smb2_util_write(tree, *h, buf, 0, ARRAY_SIZE(buf));
+	torture_assert_ntstatus_ok_goto(torture, status, ret, done,
+					"smb2_util_write failed");
+
+	status = test_smb2_lock(tree, *h, 0, 1, true);
+	torture_assert_ntstatus_ok_goto(torture, status, ret, done,
+					"test_smb2_lock failed");
+
+	status = test_smb2_unlock(tree, *h, 0, 1);
+	torture_assert_ntstatus_ok_goto(torture, status, ret, done,
+					"test_smb2_unlock failed");
+
+	status = test_smb2_lock(tree, *h, 0, 1, true);
+	torture_assert_ntstatus_ok_goto(torture, status, ret, done,
+					"test_smb2_lock failed");
+
+	status = test_smb2_unlock(tree, *h, 0, 1);
+	torture_assert_ntstatus_ok_goto(torture, status, ret, done,
+					"test_smb2_unlock failed");
+
+done:
+	if (h != NULL) {
+		smb2_util_close(tree, *h);
+	}
+	smb2_deltree(tree, BASEDIR);
+	return ret;
+}
+
 /* basic testing of SMB2 locking
 */
 struct torture_suite *torture_smb2_lock_init(void)
@@ -3068,6 +3131,7 @@ struct torture_suite *torture_smb2_lock_init(void)
 	torture_suite_add_2smb2_test(suite, "overlap", test_overlap);
 	torture_suite_add_1smb2_test(suite, "truncate", test_truncate);
 	torture_suite_add_1smb2_test(suite, "replay", test_replay);
+	torture_suite_add_1smb2_test(suite, "ctdb-delrec-deadlock", test_deadlock);
 
 	suite->description = talloc_strdup(suite, "SMB2-LOCK tests");
 
-- 
2.5.5


From 17dcdd37b6ac1bbf4911556e99ce5428f962d4cd Mon Sep 17 00:00:00 2001
From: Ralph Boehme <slow at samba.org>
Date: Mon, 8 Aug 2016 16:58:51 +0200
Subject: [PATCH 2/2] dbwrap_ctdb: treat empty records in ltdb as non-existing

When fetching records from remote ctdb nodes via ctdbd_parse() or in
db_ctdb_traverse(), we already check for tombstone records and skip
them. This was originally also done for the ltdb checks.

See also bug: https://bugzilla.samba.org/show_bug.cgi?id=10008
(commit 1cae59ce112ccb51b45357a52b902f80fce1eef1).

Commit 925625b52886d40b50fc631bad8bdc81970f7598 reverted part of the
patch of bug 10008 due to a deadlock it introduced.

This patch re-introduces the consistent treatment of empty records in
the ltdb but avoids the deadlock by correctly signalling
NT_STATUS_NOT_FOUND if an empty record is found authoritatively in
the ltdb and not calling ctdb in this case.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=12005

Pair-Programmed-With: Michael Adam <obnox at samba.org>

Signed-off-by: Ralph Boehme <slow at samba.org>
Signed-off-by: Michael Adam <obnox at samba.org>
---
 source3/lib/dbwrap/dbwrap_ctdb.c | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/source3/lib/dbwrap/dbwrap_ctdb.c b/source3/lib/dbwrap/dbwrap_ctdb.c
index 3bbb9be..5123b91 100644
--- a/source3/lib/dbwrap/dbwrap_ctdb.c
+++ b/source3/lib/dbwrap/dbwrap_ctdb.c
@@ -1221,6 +1221,7 @@ struct db_ctdb_parse_record_state {
 	uint32_t my_vnn;
 	bool ask_for_readonly_copy;
 	bool done;
+	bool empty_record;
 };
 
 static void db_ctdb_parse_record_parser(
@@ -1240,7 +1241,16 @@ static void db_ctdb_parse_record_parser_nonpersistent(
 		(struct db_ctdb_parse_record_state *)private_data;
 
 	if (db_ctdb_can_use_local_hdr(header, state->my_vnn, true)) {
-		state->parser(key, data, state->private_data);
+		/*
+		 * A record consisting only of the ctdb header can be
+		 * a validly created empty record or a tombstone
+		 * record of a deleted record (not vacuumed yet). Mark
+		 * it accordingly.
+		 */
+		state->empty_record = (data.dsize == 0);
+		if (!state->empty_record) {
+			state->parser(key, data, state->private_data);
+		}
 		state->done = true;
 	} else {
 		/*
@@ -1267,6 +1277,7 @@ static NTSTATUS db_ctdb_parse_record(struct db_context *db, TDB_DATA key,
 	state.parser = parser;
 	state.private_data = private_data;
 	state.my_vnn = ctdbd_vnn(ctx->conn);
+	state.empty_record = false;
 
 	if (ctx->transaction != NULL) {
 		struct db_ctdb_transaction_handle *h = ctx->transaction;
@@ -1298,6 +1309,20 @@ static NTSTATUS db_ctdb_parse_record(struct db_context *db, TDB_DATA key,
 	status = db_ctdb_ltdb_parse(
 		ctx, key, db_ctdb_parse_record_parser_nonpersistent, &state);
 	if (NT_STATUS_IS_OK(status) && state.done) {
+		if (state.empty_record) {
+			/*
+			 * We know authoritatively, that this is an empty
+			 * record. Since ctdb does not distinguish between empty
+			 * and deleted records, this can be a record stored as
+			 * empty or a not-yet-vacuumed tombstone record of a
+			 * deleted record. Now Samba right now can live without
+			 * empty records, so we can safely report this record
+			 * as non-existing.
+			 *
+			 * See bugs 10008 and 12005.
+			 */
+			return NT_STATUS_NOT_FOUND;
+		}
 		return NT_STATUS_OK;
 	}
 
-- 
2.5.5



More information about the samba-technical mailing list