[SCM] SAMBA-CTDB repository - branch v3-4-ctdb updated - 29173408a3d9050d83aee87b033b0e8937ce9240

Michael Adam obnox at samba.org
Mon Nov 2 17:01:05 MST 2009


The branch, v3-4-ctdb has been updated
       via  29173408a3d9050d83aee87b033b0e8937ce9240 (commit)
       via  ddc9184de972168ebb8a7f70efb1e1a19800a745 (commit)
       via  b991de4f20eb46410fc38aae9aad675eaa86ab06 (commit)
       via  7e9c18c7d46b45bb1a2ed4af1f2aacf401c4ff0b (commit)
       via  d5cf73232aff9bd7e5ebb8b70c8f28aaea3780a9 (commit)
       via  bee87fb72bdf4a2374e4ad341bd3376e963be5b8 (commit)
       via  2ba7cb6004251fa0c14a2204bf809803181f76b3 (commit)
       via  0080b44e63e28716b9b44622fa1c2454580a730b (commit)
       via  05031c3629d283632af36fe067727a32b643cc6a (commit)
       via  28a72732a1f613132044f371533c46b3b4e09be9 (commit)
       via  bb3c277954e9ae934d96aa3de6300e62c98a157c (commit)
       via  21b97f4a498ea712140c6724a762f6b3fb3fc7a7 (commit)
       via  9065e4b5f7b4b02179b2263a254a7e7f0c2b508b (commit)
       via  2902d4738a2c8207f7d6cf5b998f01db536d58b7 (commit)
      from  aab0b67097154b444ddb464d65fe128d33f2dab8 (commit)

http://gitweb.samba.org/?p=obnox/samba-ctdb.git;a=shortlog;h=v3-4-ctdb


- Log -----------------------------------------------------------------
commit 29173408a3d9050d83aee87b033b0e8937ce9240
Author: Michael Adam <obnox at samba.org>
Date:   Tue Nov 3 00:51:27 2009 +0100

    s3:registry: add an extra check for dsize==0 to regdb_fetch_keys_internal()
    
    Don't only rely on dptr == NULL.
    I stumbled over this one when rewriting some of the dbwrap_ctdb code.
    
    Michael

commit ddc9184de972168ebb8a7f70efb1e1a19800a745
Author: Michael Adam <obnox at samba.org>
Date:   Tue Nov 3 00:47:37 2009 +0100

    s3:registry: add safety check for return value of tdb_unpack to regdb_fetch_keys_internal()
    
    Prevents segfaults in some situations.
    
    (For a non existent or empty record, we sometimes rely on the fetch operation
     to return dsize==0 and sometimes we rely on dptr==NULL.)
    
    Michael

commit b991de4f20eb46410fc38aae9aad675eaa86ab06
Author: Michael Adam <obnox at samba.org>
Date:   Sat Oct 31 13:16:34 2009 +0100

    s3:dbwrap_ctdb: add debug message to transaction_fetch_start()
    
    for the case that another local process has started a transaction
    bewteen releasing the transaction_lock record and starting the
    transaction.
    
    Michael

commit 7e9c18c7d46b45bb1a2ed4af1f2aacf401c4ff0b
Author: Michael Adam <obnox at samba.org>
Date:   Sat Oct 31 13:13:04 2009 +0100

    s3:dbwrap_ctdb: split combined check in two and add descriptive debug
    
    in db_ctdb_transaction_fetch_start() for error conditions when re-fetching
    the transaction_lock record inside the transaction
    
    Michael

commit d5cf73232aff9bd7e5ebb8b70c8f28aaea3780a9
Author: Michael Adam <obnox at samba.org>
Date:   Thu Oct 29 00:01:45 2009 +0100

    s3:dbwrap_ctdb: fix race condition with concurrent transactions on the same node.
    
    In ctdb_transaction_commit(), when the trans2_commit control fails, there
    is a race condition in the 1 second sleep between the local transaction_cancel
    and the call to ctdb_replay_transaction(): The database is not locked, and
    neither is the transaction_lock record. So another client can start and possibly
    complete a new transaction in this gap, but only on the same node: The locking
    of the transaction_lock record on a different node which involves migration of
    the record to the other node has been disabled by introduction of the
    transaction_active flag on the db which closes precisely this gap from the start
    of the commit until the call to TRANS2_FINISH or TRANS2_ERROR.
    But this mechanism does not cover the case where a process on the same node
    tries to start a transaction: There is no obstacle to locking the transaction_lock
    record because the record does not need to be migrated.
    
    This commit closes this race condition in ctdb_transaction_fetch_start()
    by using the new ctdb_ctrl_transaction_active() call to ask the local
    ctdb daemon whether it has a transaction running on the database.
    If so, the check is repeated until the running transaction is done.
    
    This does introduce an additional call to the local ctdbd when starting
    transactions, but it does close the (hopefully) last race condition.
    
    Michael

commit bee87fb72bdf4a2374e4ad341bd3376e963be5b8
Author: Michael Adam <obnox at samba.org>
Date:   Wed Oct 28 23:56:59 2009 +0100

    s3:configure: add a check for the new CTDB_CONTROL_TRANS2_ACTIVE
    
    Michael

commit 2ba7cb6004251fa0c14a2204bf809803181f76b3
Author: Michael Adam <obnox at samba.org>
Date:   Wed Oct 28 23:56:03 2009 +0100

    s3:dbwrap_ctdb: add new db_ctdb_transaction_active() that calls CTDB_CONTROL_TRANS2_COMMIT
    
    Michael

commit 0080b44e63e28716b9b44622fa1c2454580a730b
Author: Michael Adam <obnox at samba.org>
Date:   Wed Oct 28 01:54:04 2009 +0100

    s3:dbwrap_ctdb: fix a race in starting concurrent transactions on a single node
    
    There are two races in concurrent transactions on a single node.
    One in starting a transaction and one with replay during commit.
    
    This commit closes the first race by storing the client pid in the
    transaction-lock record and comparing the stored pid against its own
    pid after releasing the lock and refetching the record inside the
    transaction.
    
    Michael

commit 05031c3629d283632af36fe067727a32b643cc6a
Author: Michael Adam <obnox at samba.org>
Date:   Wed Oct 28 01:50:15 2009 +0100

    s3:dbwrap_ctdb: use db_ctdb_ltdb_fetch() inside db_ctdb_transaction_fetch_start
    
    Michael

commit 28a72732a1f613132044f371533c46b3b4e09be9
Author: Michael Adam <obnox at samba.org>
Date:   Wed Oct 28 01:28:38 2009 +0100

    s3:dbwrap_ctdb: use db_ctdb_ltdb_fetch() inside db_ctdb_transaction_fetch()
    
    Michael

commit bb3c277954e9ae934d96aa3de6300e62c98a157c
Author: Michael Adam <obnox at samba.org>
Date:   Tue Nov 3 00:55:41 2009 +0100

    s3:dbwrap_ctdb: add a function db_ctdb_ltdb_fetch()
    
    This fetches a record from the db and splits out the ctdb header.
    
    Michael

commit 21b97f4a498ea712140c6724a762f6b3fb3fc7a7
Author: Michael Adam <obnox at samba.org>
Date:   Thu Oct 22 16:27:45 2009 +0200

    s3:dbrwap_ctdb: add a function db_ctdb_ltdb_store()
    
    and use it in db_ctdb_store() and db_ctdb_transaction_store().
    
    Michael

commit 9065e4b5f7b4b02179b2263a254a7e7f0c2b508b
Author: Michael Adam <obnox at samba.org>
Date:   Thu Oct 22 14:37:51 2009 +0200

    s3:dbwrap_ctdb: reformat a comment slightly to enhance clearness.
    
    Michael

commit 2902d4738a2c8207f7d6cf5b998f01db536d58b7
Author: Michael Adam <obnox at samba.org>
Date:   Mon May 25 21:59:40 2009 +0200

    s3:dbwrap_ctdb: fix some function header comments
    
    Michael
    (cherry picked from commit f5a5c6a5dcf6be2486c53138e24f8d76b64f882e)
    
    Signed-off-by: Michael Adam <obnox at samba.org>

-----------------------------------------------------------------------

Summary of changes:
 source3/configure.in              |   17 +++
 source3/lib/dbwrap_ctdb.c         |  249 +++++++++++++++++++++++++++++--------
 source3/registry/reg_backend_db.c |    6 +-
 3 files changed, 216 insertions(+), 56 deletions(-)


Changeset truncated at 500 lines:

diff --git a/source3/configure.in b/source3/configure.in
index 69587e0..2151f0a 100644
--- a/source3/configure.in
+++ b/source3/configure.in
@@ -5055,6 +5055,23 @@ else
 	ctdb_broken="missing transaction support"
 fi
 
+AC_HAVE_DECL(CTDB_CONTROL_TRANS2_ACTIVE,[
+#include "confdefs.h"
+#define NO_CONFIG_H
+#include "replace.h"
+#include "system/wait.h"
+#include "system/network.h"
+#include <talloc.h>
+#include <tdb.h>
+#include <ctdb.h>
+#include <ctdb_private.h>
+])
+if test x"$ac_cv_have_CTDB_CONTROL_TRANS2_ACTIVE_decl" = x"yes"; then
+	ctdb_broken=no
+else
+	ctdb_broken="transaction support too old"
+fi
+
 # in ctdb 1.0.57 ctdb_control_tcp was temparary renamed to ctdb_tcp_client
 AC_CHECK_TYPE(struct ctdb_tcp_client,[
 	AC_DEFINE([ctdb_control_tcp],[ctdb_tcp_client],[ctdb ipv4 support])
diff --git a/source3/lib/dbwrap_ctdb.c b/source3/lib/dbwrap_ctdb.c
index f492d5f..8563990 100644
--- a/source3/lib/dbwrap_ctdb.c
+++ b/source3/lib/dbwrap_ctdb.c
@@ -26,9 +26,11 @@
 struct db_ctdb_transaction_handle {
 	struct db_ctdb_ctx *ctx;
 	bool in_replay;
-	/* we store the reads and writes done under a transaction one
-	   list stores both reads and writes, the other just writes
-	*/
+	/*
+	 * we store the reads and writes done under a transaction:
+	 * - one list stores both reads and writes (m_all),
+	 * - the other just writes (m_write)
+	 */
 	struct ctdb_marshall_buffer *m_all;
 	struct ctdb_marshall_buffer *m_write;
 	uint32_t nesting;
@@ -73,6 +75,91 @@ static NTSTATUS tdb_error_to_ntstatus(struct tdb_context *tdb)
 }
 
 
+/**
+ * fetch a record from the tdb, separating out the header
+ * information and returning the body of the record.
+ */
+static NTSTATUS db_ctdb_ltdb_fetch(struct db_ctdb_ctx *db,
+				   TDB_DATA key,
+				   struct ctdb_ltdb_header *header,
+				   TALLOC_CTX *mem_ctx,
+				   TDB_DATA *data)
+{
+	TDB_DATA rec;
+	NTSTATUS status;
+
+	rec = tdb_fetch(db->wtdb->tdb, key);
+	if (rec.dsize < sizeof(struct ctdb_ltdb_header)) {
+		status = NT_STATUS_NOT_FOUND;
+		if (data) {
+			ZERO_STRUCTP(data);
+		}
+		if (header) {
+			header->dmaster = (uint32_t)-1;
+			header->rsn = 0;
+		}
+		goto done;
+	}
+
+	if (header) {
+		*header = *(struct ctdb_ltdb_header *)rec.dptr;
+	}
+
+	if (data) {
+		data->dsize = rec.dsize - sizeof(struct ctdb_ltdb_header);
+		if (data->dsize == 0) {
+			data->dptr = NULL;
+		} else {
+			data->dptr = (unsigned char *)talloc_memdup(mem_ctx,
+					rec.dptr
+					 + sizeof(struct ctdb_ltdb_header),
+					data->dsize);
+			if (data->dptr == NULL) {
+				status = NT_STATUS_NO_MEMORY;
+				goto done;
+			}
+		}
+	}
+
+	status = NT_STATUS_OK;
+
+done:
+	SAFE_FREE(rec.dptr);
+	return status;
+}
+
+/*
+ * Store a record together with the ctdb record header
+ * in the local copy of the database.
+ */
+static NTSTATUS db_ctdb_ltdb_store(struct db_ctdb_ctx *db,
+				   TDB_DATA key,
+				   struct ctdb_ltdb_header *header,
+				   TDB_DATA data)
+{
+	TALLOC_CTX *tmp_ctx = talloc_stackframe();
+	TDB_DATA rec;
+	int ret;
+
+	rec.dsize = data.dsize + sizeof(struct ctdb_ltdb_header);
+	rec.dptr = (uint8_t *)talloc_size(tmp_ctx, rec.dsize);
+
+	if (rec.dptr == NULL) {
+		talloc_free(tmp_ctx);
+		return NT_STATUS_NO_MEMORY;
+	}
+
+	memcpy(rec.dptr, header, sizeof(struct ctdb_ltdb_header));
+	memcpy(sizeof(struct ctdb_ltdb_header) + (uint8_t *)rec.dptr, data.dptr, data.dsize);
+
+	ret = tdb_store(db->wtdb->tdb, key, rec, TDB_REPLACE);
+
+	talloc_free(tmp_ctx);
+
+	return (ret == 0) ? NT_STATUS_OK
+			  : tdb_error_to_ntstatus(db->wtdb->tdb);
+
+}
 
 /*
   form a ctdb_rec_data record from a key/data pair
@@ -211,24 +298,56 @@ static struct ctdb_rec_data *db_ctdb_marshall_loop_next(struct ctdb_marshall_buf
 }
 
 
+static int32_t db_ctdb_transaction_active(uint32_t db_id)
+{
+	int32_t status;
+	NTSTATUS ret;
+	TDB_DATA indata;
+
+	indata.dptr = (uint8_t *)&db_id;
+	indata.dsize = sizeof(db_id);
+
+	ret = ctdbd_control_local(messaging_ctdbd_connection(),
+				  CTDB_CONTROL_TRANS2_ACTIVE, 0, 0,
+				  indata, NULL, NULL, &status);
 
-/* start a transaction on a database */
+	if (!NT_STATUS_IS_OK(ret)) {
+		DEBUG(2, ("ctdb control TRANS2_ACTIVE failed\n"));
+		return -1;
+	}
+
+	return status;
+}
+
+
+/**
+ * CTDB transaction destructor
+ */
 static int db_ctdb_transaction_destructor(struct db_ctdb_transaction_handle *h)
 {
 	tdb_transaction_cancel(h->ctx->wtdb->tdb);
 	return 0;
 }
 
-/* start a transaction on a database */
+/**
+ * start a transaction on a ctdb database:
+ * - lock the transaction lock key
+ * - start the tdb transaction
+ */
 static int db_ctdb_transaction_fetch_start(struct db_ctdb_transaction_handle *h)
 {
 	struct db_record *rh;
+	struct db_ctdb_rec *crec;
 	TDB_DATA key;
 	TALLOC_CTX *tmp_ctx;
 	const char *keyname = CTDB_TRANSACTION_LOCK_KEY;
 	int ret;
 	struct db_ctdb_ctx *ctx = h->ctx;
 	TDB_DATA data;
+	pid_t pid;
+	NTSTATUS status;
+	struct ctdb_ltdb_header header;
+	int32_t transaction_status;
 
 	key.dptr = (uint8_t *)discard_const(keyname);
 	key.dsize = strlen(keyname);
@@ -242,6 +361,34 @@ again:
 		talloc_free(tmp_ctx);
 		return -1;
 	}
+	crec = talloc_get_type_abort(rh->private_data, struct db_ctdb_rec);
+
+	transaction_status = db_ctdb_transaction_active(ctx->db_id);
+	if (transaction_status == 1) {
+		unsigned long int usec = (1000 + random()) % 100000;
+		DEBUG(3, ("Transaction already active on db_id[0x%08x]."
+			  "Re-trying after %lu microseconds...",
+			  ctx->db_id, usec));
+		talloc_free(tmp_ctx);
+		usleep(usec);
+		goto again;
+	}
+
+	/*
+	 * store the pid in the database:
+	 * it is not enought that the node is dmaster...
+	 */
+	pid = getpid();
+	data.dptr = (unsigned char *)&pid;
+	data.dsize = sizeof(pid_t);
+	status = db_ctdb_ltdb_store(ctx, key, &(crec->header), data);
+	if (!NT_STATUS_IS_OK(status)) {
+		DEBUG(0, (__location__ " Failed to store pid in transaction "
+			  "record: %s\n", nt_errstr(status)));
+		talloc_free(tmp_ctx);
+		return -1;
+	}
+
 	talloc_free(rh);
 
 	ret = tdb_transaction_start(ctx->wtdb->tdb);
@@ -251,24 +398,46 @@ again:
 		return -1;
 	}
 
-	data = tdb_fetch(ctx->wtdb->tdb, key);
-	if ((data.dptr == NULL) ||
-	    (data.dsize < sizeof(struct ctdb_ltdb_header)) ||
-	    ((struct ctdb_ltdb_header *)data.dptr)->dmaster != get_my_vnn()) {
-		SAFE_FREE(data.dptr);
+	status = db_ctdb_ltdb_fetch(ctx, key, &header, tmp_ctx, &data);
+	if (!NT_STATUS_IS_OK(status)) {
+		DEBUG(0, (__location__ " failed to refetch transaction lock "
+			  "record inside transaction: %s - retrying\n",
+			  nt_errstr(status)));
+		tdb_transaction_cancel(ctx->wtdb->tdb);
+		talloc_free(tmp_ctx);
+		goto again;
+	}
+
+	if (header.dmaster != get_my_vnn()) {
+		DEBUG(3, (__location__ " refetch transaction lock record : "
+			  "we are not dmaster any more "
+			  "(dmaster[%u] != my_vnn[%u]) - retrying\n",
+			  header.dmaster, get_my_vnn()));
+		tdb_transaction_cancel(ctx->wtdb->tdb);
+		talloc_free(tmp_ctx);
+		goto again;
+	}
+
+	if ((data.dsize != sizeof(pid_t)) || (*(pid_t *)(data.dptr) != pid)) {
+		DEBUG(3, (__location__ " refetch transaction lock record: "
+			  "another local process has started a transaction "
+			  "(stored pid [%u] != my pid [%u]) - retrying\n",
+			  *(pid_t *)(data.dptr), pid));
 		tdb_transaction_cancel(ctx->wtdb->tdb);
 		talloc_free(tmp_ctx);
 		goto again;
 	}
 
-	SAFE_FREE(data.dptr);
 	talloc_free(tmp_ctx);
 
 	return 0;
 }
 
 
-/* start a transaction on a database */
+/**
+ * CTDB dbwrap API: transaction_start function
+ * starts a transaction on a persistent database
+ */
 static int db_ctdb_transaction_start(struct db_context *db)
 {
 	struct db_ctdb_transaction_handle *h;
@@ -320,24 +489,14 @@ static int db_ctdb_transaction_fetch(struct db_ctdb_ctx *db,
 				     TDB_DATA key, TDB_DATA *data)
 {
 	struct db_ctdb_transaction_handle *h = db->transaction;
+	NTSTATUS status;
 
-	*data = tdb_fetch(h->ctx->wtdb->tdb, key);
+	status = db_ctdb_ltdb_fetch(h->ctx, key, NULL, mem_ctx, data);
 
-	if (data->dptr != NULL) {
-		uint8_t *oldptr = (uint8_t *)data->dptr;
-		data->dsize -= sizeof(struct ctdb_ltdb_header);
-		if (data->dsize == 0) {
-			data->dptr = NULL;
-		} else {
-			data->dptr = (uint8 *)
-				talloc_memdup(
-					mem_ctx, data->dptr+sizeof(struct ctdb_ltdb_header),
-					data->dsize);
-		}
-		SAFE_FREE(oldptr);
-		if (data->dptr == NULL && data->dsize != 0) {
-			return -1;
-		}
+	if (NT_STATUS_EQUAL(status, NT_STATUS_NOT_FOUND)) {
+		*data = tdb_null;
+	} else if (!NT_STATUS_IS_OK(status)) {
+		return -1;
 	}
 
 	if (!h->in_replay) {
@@ -461,6 +620,7 @@ static int db_ctdb_transaction_store(struct db_ctdb_transaction_handle *h,
 	int ret;
 	TDB_DATA rec;
 	struct ctdb_ltdb_header header;
+	NTSTATUS status;
 
 	/* we need the header so we can update the RSN */
 	rec = tdb_fetch(h->ctx->wtdb->tdb, key);
@@ -501,17 +661,12 @@ static int db_ctdb_transaction_store(struct db_ctdb_transaction_handle *h,
 		return -1;
 	}
 
-	rec.dsize = data.dsize + sizeof(struct ctdb_ltdb_header);
-	rec.dptr = (uint8_t *)talloc_size(tmp_ctx, rec.dsize);
-	if (rec.dptr == NULL) {
-		DEBUG(0,(__location__ " Failed to alloc record\n"));
-		talloc_free(tmp_ctx);
-		return -1;
+	status = db_ctdb_ltdb_store(h->ctx, key, &header, data);
+	if (NT_STATUS_IS_OK(status)) {
+		ret = 0;
+	} else {
+		ret = -1;
 	}
-	memcpy(rec.dptr, &header, sizeof(struct ctdb_ltdb_header));
-	memcpy(sizeof(struct ctdb_ltdb_header) + (uint8_t *)rec.dptr, data.dptr, data.dsize);
-
-	ret = tdb_store(h->ctx->wtdb->tdb, key, rec, TDB_REPLACE);
 
 	talloc_free(tmp_ctx);
 
@@ -775,24 +930,8 @@ static NTSTATUS db_ctdb_store(struct db_record *rec, TDB_DATA data, int flag)
 {
 	struct db_ctdb_rec *crec = talloc_get_type_abort(
 		rec->private_data, struct db_ctdb_rec);
-	TDB_DATA cdata;
-	int ret;
-
-	cdata.dsize = sizeof(crec->header) + data.dsize;
 
-	if (!(cdata.dptr = SMB_MALLOC_ARRAY(uint8, cdata.dsize))) {
-		return NT_STATUS_NO_MEMORY;
-	}
-
-	memcpy(cdata.dptr, &crec->header, sizeof(crec->header));
-	memcpy(cdata.dptr + sizeof(crec->header), data.dptr, data.dsize);
-
-	ret = tdb_store(crec->ctdb_ctx->wtdb->tdb, rec->key, cdata, TDB_REPLACE);
-
-	SAFE_FREE(cdata.dptr);
-
-	return (ret == 0) ? NT_STATUS_OK
-			  : tdb_error_to_ntstatus(crec->ctdb_ctx->wtdb->tdb);
+	return db_ctdb_ltdb_store(crec->ctdb_ctx, rec->key, &(crec->header), data);
 }
 
 
diff --git a/source3/registry/reg_backend_db.c b/source3/registry/reg_backend_db.c
index dec43ae..96d581c 100644
--- a/source3/registry/reg_backend_db.c
+++ b/source3/registry/reg_backend_db.c
@@ -1465,7 +1465,7 @@ static WERROR regdb_fetch_keys_internal(struct db_context *db, const char *key,
 
 	value = regdb_fetch_key_internal(db, frame, key);
 
-	if (value.dptr == NULL) {
+	if (value.dsize == 0 || value.dptr == NULL) {
 		DEBUG(10, ("regdb_fetch_keys: no subkeys found for key [%s]\n",
 			   key));
 		goto done;
@@ -1474,6 +1474,10 @@ static WERROR regdb_fetch_keys_internal(struct db_context *db, const char *key,
 	buf = value.dptr;
 	buflen = value.dsize;
 	len = tdb_unpack( buf, buflen, "d", &num_items);
+	if (len == (uint32_t)-1) {
+		werr = WERR_NOT_FOUND;
+		goto done;
+	}
 
 	werr = regsubkey_ctr_reinit(ctr);
 	W_ERROR_NOT_OK_GOTO_DONE(werr);


-- 
SAMBA-CTDB repository


More information about the samba-cvs mailing list