RFC: dbwrap_ctdb and empty vs deleted records

Michael Adam obnox at samba.org
Mon Aug 8 10:56:20 UTC 2016


On 2016-08-08 at 12:38 +0200, Michael Adam wrote:
> Hi Ralph,
> 
> after thinking for quite some time about this, and
> looking through the code with increasingly watery
> eyes, I have a few comments:
> 
> In summary, I think the patch is wrong in this exact
> form. But I have a proposal of a modification that
> should work always.
> 
> See the details below:
> 
> ...
>
> Alternative patch attached.

OOps, that one was buggy.
New attempt attached :-)

(Note: to be applied after your state-struct moving patch.)

Michael
-------------- next part --------------
From 0091919492e05e25a2df19ae1b34deaea68e24b2 Mon Sep 17 00:00:00 2001
From: Ralph Boehme <slow at samba.org>
Date: Fri, 1 Jul 2016 17:21:04 +0200
Subject: [PATCH] dbwrap_ctdb: treat tombstone records in ltdb as non-existing

When fetching records from remote ctdb nodes via ctdbd_parse() or in
db_ctdb_traverse(), we already check for tombstone records and skip
them. Add the same check to db_ctdb_ltdb_parser() when fetching records
from the ltbd.

See also bug: https://bugzilla.samba.org/show_bug.cgi?id=10008

Commit 925625b52886d40b50fc631bad8bdc81970f7598 reverted part of the
patch in bug 10008 due to a deadlock it introduced.

This patch avoids the deadlock because it ensures db_ctdb_parse_record()
returns NT_STATUS_NOT_FOUND if an empty record is found in the ltdb and
not calling ctdb.

Bug: https://bugzilla.samba.org/show_bug.cgi?id=12005

Pair-Programmed-With: Michael Adam <obnox at samba.org>

Signed-off-by: Ralph Boehme <slow at samba.org>
Signed-off-by: Michael Adam <obnox at samba.org>
---
 source3/lib/dbwrap/dbwrap_ctdb.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/source3/lib/dbwrap/dbwrap_ctdb.c b/source3/lib/dbwrap/dbwrap_ctdb.c
index ededa1e..a1fc9b4 100644
--- a/source3/lib/dbwrap/dbwrap_ctdb.c
+++ b/source3/lib/dbwrap/dbwrap_ctdb.c
@@ -79,6 +79,7 @@ struct db_ctdb_ltdb_parse_state {
 	void (*parser)(TDB_DATA key, struct ctdb_ltdb_header *header,
 		       TDB_DATA data, void *private_data);
 	void *private_data;
+	bool empty_record;
 };
 
 static int db_ctdb_ltdb_parser(TDB_DATA key, TDB_DATA data,
@@ -91,6 +92,16 @@ static int db_ctdb_ltdb_parser(TDB_DATA key, TDB_DATA data,
 		return -1;
 	}
 
+	if (data.dsize == sizeof(struct ctdb_ltdb_header)) {
+		/*
+		 * A record consisting only of the ctdb header
+		 * can be a validly created empty record or a
+		 * tombstone record of a deleted record (not
+		 * vacuumed yet). Mark it.
+		 */
+		state->empty_record = true;
+	}
+
 	state->parser(
 		key, (struct ctdb_ltdb_header *)data.dptr,
 		make_tdb_data(data.dptr + sizeof(struct ctdb_ltdb_header),
@@ -105,6 +116,7 @@ struct db_ctdb_parse_record_state {
 	uint32_t my_vnn;
 	bool ask_for_readonly_copy;
 	bool done;
+	bool empty_record;
 };
 
 static NTSTATUS db_ctdb_ltdb_parse(
@@ -113,17 +125,25 @@ static NTSTATUS db_ctdb_ltdb_parse(
 		       TDB_DATA data, void *private_data),
 	void *private_data)
 {
+	struct db_ctdb_parse_record_state *rec_state =
+		(struct db_ctdb_parse_record_state *)private_data;
 	struct db_ctdb_ltdb_parse_state state;
 	int ret;
 
 	state.parser = parser;
 	state.private_data = private_data;
+	state.empty_record = false;
 
 	ret = tdb_parse_record(db->wtdb->tdb, key, db_ctdb_ltdb_parser,
 			       &state);
 	if (ret == -1) {
 		return NT_STATUS_NOT_FOUND;
 	}
+
+	if (state.empty_record) {
+		rec_state->empty_record = true;
+	}
+
 	return NT_STATUS_OK;
 }
 
@@ -1267,6 +1287,7 @@ static NTSTATUS db_ctdb_parse_record(struct db_context *db, TDB_DATA key,
 	state.parser = parser;
 	state.private_data = private_data;
 	state.my_vnn = ctdbd_vnn(ctx->conn);
+	state.empty_record = false;
 
 	if (ctx->transaction != NULL) {
 		struct db_ctdb_transaction_handle *h = ctx->transaction;
@@ -1298,6 +1319,20 @@ static NTSTATUS db_ctdb_parse_record(struct db_context *db, TDB_DATA key,
 	status = db_ctdb_ltdb_parse(
 		ctx, key, db_ctdb_parse_record_parser_nonpersistent, &state);
 	if (NT_STATUS_IS_OK(status) && state.done) {
+		if (state.empty_record) {
+			/*
+			 * We know authoritatively, that this is an empty
+			 * record. Since ctdb does not distinguish between empty
+			 * and deleted records, this can be a record stored as
+			 * empty or a not-yet-vacuumed tombstone record of a
+			 * deleted record. Now Samba right now can live without
+			 * empty records, so we can safely report this record
+			 * as non-existing.
+			 *
+			 * See bugs 10008 and 12005.
+			 */
+			return NT_STATUS_NOT_FOUND;
+		}
 		return NT_STATUS_OK;
 	}
 
-- 
2.5.5

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 181 bytes
Desc: not available
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20160808/b8924b8d/signature.sig>


More information about the samba-technical mailing list