RFC: dbwrap_ctdb and empty vs deleted records
Ralph Böhme
slow at samba.org
Mon Aug 8 16:35:09 UTC 2016
On Mon, Aug 08, 2016 at 12:56:20PM +0200, Michael Adam wrote:
> On 2016-08-08 at 12:38 +0200, Michael Adam wrote:
> > Hi Ralph,
> >
> > after thinking for quite some time about this, and
> > looking through the code with increasingly watery
> > eyes, I have a few comments:
> >
> > In summary, I think the patch is wrong in this exact
> > form. But I have a proposal of a modification that
> > should work always.
> >
> > See the details below:
> >
> > ...
> >
> > Alternative patch attached.
>
> OOps, that one was buggy.
> New attempt attached :-)
after a fruitful pair-programming session with Michael we came up with
a simplified and hopefully correct version that I also just pushed. :)
Cheerio!
-slow
-------------- next part --------------
From f68a25904dfc6af62d361dfb31ec04a39348bec2 Mon Sep 17 00:00:00 2001
From: Ralph Boehme <slow at samba.org>
Date: Sat, 23 Jul 2016 11:08:13 +0200
Subject: [PATCH 1/2] s4/torture: add a test for ctdb-tombstrone-record
deadlock
This tests for a possible deadlock between smbd and ctdb dealing with
ctdb tombstone records.
Commit 925625b52886d40b50fc631bad8bdc81970f7598 explains the deadlock in
more details and contains the fix. It's a fix for a regression
introduced by the patch for bug 10008 (1cae59ce112c).
If you ever want to use this test against that specific commit:
$ git checkout 925625b52886d40b50fc631bad8bdc81970f7598
$ git cherry-pick THIS_COMMIT
This should not deadlock on a ctdb cluster.
$ git revert 925625b52886d40b50fc631bad8bdc81970f7598
This will deadlock.
Bug: https://bugzilla.samba.org/show_bug.cgi?id=12005
Pair-Programmed-With: Michael Adam <obnox at samba.org>
Signed-off-by: Ralph Boehme <slow at samba.org>
Signed-off-by: Michael Adam <obnox at samba.org>
---
source4/torture/smb2/lock.c | 64 +++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 64 insertions(+)
diff --git a/source4/torture/smb2/lock.c b/source4/torture/smb2/lock.c
index 68e353d..3900abf 100644
--- a/source4/torture/smb2/lock.c
+++ b/source4/torture/smb2/lock.c
@@ -3033,6 +3033,69 @@ done:
return ret;
}
+/**
+ * Test lock interaction between smbd and ctdb with tombstone records.
+ *
+ * Re-locking an unlocked record could lead to a deadlock between
+ * smbd and ctdb. Make sure we don't regress.
+ *
+ * https://bugzilla.samba.org/show_bug.cgi?id=12005
+ * https://bugzilla.samba.org/show_bug.cgi?id=10008
+ */
+static bool test_deadlock(struct torture_context *torture,
+ struct smb2_tree *tree)
+{
+ NTSTATUS status;
+ bool ret = true;
+ struct smb2_handle _h;
+ struct smb2_handle *h = NULL;
+ uint8_t buf[200];
+ const char *fname = BASEDIR "\\deadlock.txt";
+
+ if (!lpcfg_clustering(torture->lp_ctx)) {
+ torture_skip(torture, "Test must be run on a ctdb cluster\n");
+ return true;
+ }
+
+ status = torture_smb2_testdir(tree, BASEDIR, &_h);
+ torture_assert_ntstatus_ok(torture, status,
+ "torture_smb2_testdir failed");
+ smb2_util_close(tree, _h);
+
+ status = torture_smb2_testfile(tree, fname, &_h);
+ torture_assert_ntstatus_ok_goto(torture, status, ret, done,
+ "torture_smb2_testfile failed");
+ h = &_h;
+
+ ZERO_STRUCT(buf);
+ status = smb2_util_write(tree, *h, buf, 0, ARRAY_SIZE(buf));
+ torture_assert_ntstatus_ok_goto(torture, status, ret, done,
+ "smb2_util_write failed");
+
+ status = test_smb2_lock(tree, *h, 0, 1, true);
+ torture_assert_ntstatus_ok_goto(torture, status, ret, done,
+ "test_smb2_lock failed");
+
+ status = test_smb2_unlock(tree, *h, 0, 1);
+ torture_assert_ntstatus_ok_goto(torture, status, ret, done,
+ "test_smb2_unlock failed");
+
+ status = test_smb2_lock(tree, *h, 0, 1, true);
+ torture_assert_ntstatus_ok_goto(torture, status, ret, done,
+ "test_smb2_lock failed");
+
+ status = test_smb2_unlock(tree, *h, 0, 1);
+ torture_assert_ntstatus_ok_goto(torture, status, ret, done,
+ "test_smb2_unlock failed");
+
+done:
+ if (h != NULL) {
+ smb2_util_close(tree, *h);
+ }
+ smb2_deltree(tree, BASEDIR);
+ return ret;
+}
+
/* basic testing of SMB2 locking
*/
struct torture_suite *torture_smb2_lock_init(void)
@@ -3068,6 +3131,7 @@ struct torture_suite *torture_smb2_lock_init(void)
torture_suite_add_2smb2_test(suite, "overlap", test_overlap);
torture_suite_add_1smb2_test(suite, "truncate", test_truncate);
torture_suite_add_1smb2_test(suite, "replay", test_replay);
+ torture_suite_add_1smb2_test(suite, "ctdb-delrec-deadlock", test_deadlock);
suite->description = talloc_strdup(suite, "SMB2-LOCK tests");
--
2.5.5
From 17dcdd37b6ac1bbf4911556e99ce5428f962d4cd Mon Sep 17 00:00:00 2001
From: Ralph Boehme <slow at samba.org>
Date: Mon, 8 Aug 2016 16:58:51 +0200
Subject: [PATCH 2/2] dbwrap_ctdb: treat empty records in ltdb as non-existing
When fetching records from remote ctdb nodes via ctdbd_parse() or in
db_ctdb_traverse(), we already check for tombstone records and skip
them. This was originally also done for the ltdb checks.
See also bug: https://bugzilla.samba.org/show_bug.cgi?id=10008
(commit 1cae59ce112ccb51b45357a52b902f80fce1eef1).
Commit 925625b52886d40b50fc631bad8bdc81970f7598 reverted part of the
patch of bug 10008 due to a deadlock it introduced.
This patch re-introduces the consistent treatment of empty records in
the ltdb but avoids the deadlock by correctly signalling
NT_STATUS_NOT_FOUND if an empty record is found authoritatively in
the ltdb and not calling ctdb in this case.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=12005
Pair-Programmed-With: Michael Adam <obnox at samba.org>
Signed-off-by: Ralph Boehme <slow at samba.org>
Signed-off-by: Michael Adam <obnox at samba.org>
---
source3/lib/dbwrap/dbwrap_ctdb.c | 27 ++++++++++++++++++++++++++-
1 file changed, 26 insertions(+), 1 deletion(-)
diff --git a/source3/lib/dbwrap/dbwrap_ctdb.c b/source3/lib/dbwrap/dbwrap_ctdb.c
index 3bbb9be..5123b91 100644
--- a/source3/lib/dbwrap/dbwrap_ctdb.c
+++ b/source3/lib/dbwrap/dbwrap_ctdb.c
@@ -1221,6 +1221,7 @@ struct db_ctdb_parse_record_state {
uint32_t my_vnn;
bool ask_for_readonly_copy;
bool done;
+ bool empty_record;
};
static void db_ctdb_parse_record_parser(
@@ -1240,7 +1241,16 @@ static void db_ctdb_parse_record_parser_nonpersistent(
(struct db_ctdb_parse_record_state *)private_data;
if (db_ctdb_can_use_local_hdr(header, state->my_vnn, true)) {
- state->parser(key, data, state->private_data);
+ /*
+ * A record consisting only of the ctdb header can be
+ * a validly created empty record or a tombstone
+ * record of a deleted record (not vacuumed yet). Mark
+ * it accordingly.
+ */
+ state->empty_record = (data.dsize == 0);
+ if (!state->empty_record) {
+ state->parser(key, data, state->private_data);
+ }
state->done = true;
} else {
/*
@@ -1267,6 +1277,7 @@ static NTSTATUS db_ctdb_parse_record(struct db_context *db, TDB_DATA key,
state.parser = parser;
state.private_data = private_data;
state.my_vnn = ctdbd_vnn(ctx->conn);
+ state.empty_record = false;
if (ctx->transaction != NULL) {
struct db_ctdb_transaction_handle *h = ctx->transaction;
@@ -1298,6 +1309,20 @@ static NTSTATUS db_ctdb_parse_record(struct db_context *db, TDB_DATA key,
status = db_ctdb_ltdb_parse(
ctx, key, db_ctdb_parse_record_parser_nonpersistent, &state);
if (NT_STATUS_IS_OK(status) && state.done) {
+ if (state.empty_record) {
+ /*
+ * We know authoritatively, that this is an empty
+ * record. Since ctdb does not distinguish between empty
+ * and deleted records, this can be a record stored as
+ * empty or a not-yet-vacuumed tombstone record of a
+ * deleted record. Now Samba right now can live without
+ * empty records, so we can safely report this record
+ * as non-existing.
+ *
+ * See bugs 10008 and 12005.
+ */
+ return NT_STATUS_NOT_FOUND;
+ }
return NT_STATUS_OK;
}
--
2.5.5
More information about the samba-technical
mailing list