[SCM] CTDB repository - branch master updated - ctdb-1.10-103-g81663b8

Ronnie Sahlberg sahlberg at samba.org
Thu Feb 24 15:27:09 MST 2011


The branch, master has been updated
       via  81663b81687c0ba681500cca6aa8174bb9587ad2 (commit)
       via  f7dfeb7143f574c2434f7dd16917380dfd1f4f64 (commit)
       via  70ba153b532528bdccea70c5ea28972257f384c1 (commit)
       via  9e0898db6df52d9bc799dd87bfea8c72d5f70ba0 (commit)
       via  e886ff24f4e3e250944289db95916b948893d26c (commit)
       via  f416e76838fe2adf629d4356d1cc87054b1af164 (commit)
       via  761cb235193564a0f337d0308f0a9e6de0ef2710 (commit)
       via  a14917c983c3b9bbbf38f5ddeecdbbe5bde32364 (commit)
       via  1237e15df4af58a3d220eea42a4b75e21e65029f (commit)
       via  d871a38978219e004833608c11aae98fe47614b9 (commit)
       via  2c2d1646eb753ea9561f085bcb101153267b052b (commit)
       via  cab95570dc1eefb08abbac5ae411c29f699b51cc (commit)
       via  983c1ca2e18ecd60fca69bfe9e116125cc695857 (commit)
       via  12cf0619255b12230843cd8bb49cbfdea376ca2f (commit)
      from  62b7fe853db37c0a90e48a0332a3426a8dcb4ed8 (commit)

http://gitweb.samba.org/?p=sahlberg/ctdb.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit 81663b81687c0ba681500cca6aa8174bb9587ad2
Author: Michael Adam <obnox at samba.org>
Date:   Wed Nov 24 08:01:01 2010 +0100

    server: add a comment explaining the call redirect logic in ctdb_call_send_redirect().
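
Editorial illustration: the redirect decision described by the comment this commit adds (see the server/ctdb_call.c hunk in the diff) can be sketched roughly as follows. This is a simplified sketch, not the body of ctdb_call_send_redirect(). The names `struct call_pkt`, `redirect_call` and the `max_redirect` parameter are assumptions made for this example; `max_redirect` stands in for the "MaxRedirectCount" tunable.

```c
#include <stdint.h>

/* Hypothetical, simplified view of a call packet. */
struct call_pkt {
	uint32_t destnode;  /* node the packet is sent to next */
	uint32_t hopcount;  /* number of redirects taken so far */
};

/*
 * Pick the next destination for a call packet that arrived on a node
 * which is not (or is no longer) the record's DMASTER.
 *
 * this_node:    the node performing the redirect
 * lmaster:      the record's LMASTER
 * last_dmaster: DMASTER stored in this node's copy of the record
 * max_redirect: stand-in for the "MaxRedirectCount" tunable (must be > 0)
 */
static void redirect_call(struct call_pkt *pkt, uint32_t this_node,
			  uint32_t lmaster, uint32_t last_dmaster,
			  uint32_t max_redirect)
{
	if (this_node == lmaster) {
		/* The LMASTER forwards to the DMASTER it knows about. */
		pkt->destnode = last_dmaster;
	} else if (pkt->hopcount % max_redirect == 0) {
		/* Direct-chase budget used up: go back to the LMASTER. */
		pkt->destnode = lmaster;
	} else {
		/* Chase the record via the DMASTER left in the old copy. */
		pkt->destnode = last_dmaster;
	}
	pkt->hopcount++;
}
```

In this sketch a fresh packet (hopcount 0) on a non-LMASTER node goes to the LMASTER first, and every max_redirect-th hop returns there again, which bounds how long a packet chases a migrating record directly.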

commit f7dfeb7143f574c2434f7dd16917380dfd1f4f64
Author: Michael Adam <obnox at samba.org>
Date:   Wed Feb 23 17:39:57 2011 +0100

    recover: finish pending trans3 commits when a recovery is finished.
    
    When the end_recovery control is received, pending trans3 commits are
    finished. During the recovery, all the actions like persistent_callback
    and persistent_store_timeout had been disabled to let the recovery do
    its job. After the recovery is completed, send the reply to the waiting
    clients.

commit 70ba153b532528bdccea70c5ea28972257f384c1
Author: Michael Adam <obnox at samba.org>
Date:   Wed Feb 23 17:38:40 2011 +0100

    persistent: add ctdb_persistent_finish_trans3_commits().
    
    This function walks all databases and checks for running trans3 commits.
    It sends replies to all of them (with error code) and ends them.
    To be called when a recovery finishes.
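
Editorial illustration: the walk this commit describes could look roughly like the sketch below, written in plain C without talloc. All type and function names here (`struct client`, `struct commit_state`, `struct db`, `send_reply`, `finish_pending_commits`, `COMMIT_SOMEFAIL`) are hypothetical stand-ins for this example, not the CTDB definitions.

```c
#include <stddef.h>
#include <stdlib.h>

#define COMMIT_SOMEFAIL 1  /* hypothetical error code for this sketch */

/* Hypothetical client record; counts the replies it has received. */
struct client {
	int replies;
	int last_status;
};

/* State of one running commit. */
struct commit_state {
	struct client *client;
};

/* One attached database; pending is non-NULL while a commit is running. */
struct db {
	struct db *next;
	struct commit_state *pending;
};

/* Stand-in for sending a control reply back to the client. */
static void send_reply(struct client *c, int status)
{
	c->replies++;
	c->last_status = status;
}

/*
 * Walk all databases, answer every pending commit with an error and
 * release its state. Returns the number of commits that were ended.
 */
static int finish_pending_commits(struct db *db_list)
{
	struct db *db;
	int finished = 0;

	for (db = db_list; db != NULL; db = db->next) {
		if (db->pending == NULL) {
			continue;
		}
		send_reply(db->pending->client, COMMIT_SOMEFAIL);
		free(db->pending);
		db->pending = NULL;
		finished++;
	}
	return finished;
}
```

Calling the function a second time is harmless in this sketch, since every pending pointer has already been cleared.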

commit 9e0898db6df52d9bc799dd87bfea8c72d5f70ba0
Author: Michael Adam <obnox at samba.org>
Date:   Wed Feb 23 17:37:42 2011 +0100

    daemon: correctly end a running trans3_commit if the client disconnects.

commit e886ff24f4e3e250944289db95916b948893d26c
Author: Michael Adam <obnox at samba.org>
Date:   Wed Feb 23 17:35:27 2011 +0100

    persistent: add a client context to the persistent_state and track the db_id
    
    The db_id is tracked in the client context as an indication that a
    transaction commit is in progress. This is cleared in the persistent_state
    talloc destructor.
    
    This is in order to properly treat running trans3_commits if the client
    disconnects.

commit f416e76838fe2adf629d4356d1cc87054b1af164
Author: Michael Adam <obnox at samba.org>
Date:   Wed Feb 23 00:03:07 2011 +0100

    persistent: reject trans3_control when a commit is already active.
    
    This should actually never happen.

commit 761cb235193564a0f337d0308f0a9e6de0ef2710
Author: Michael Adam <obnox at samba.org>
Date:   Wed Feb 23 00:01:13 2011 +0100

    persistent: allocate the persistent state in the ctdb_db struct in trans3_commit
    
    Make sure that ctdb_db->persistent_state is correctly NULL-ed when
    the state is freed. This way, we can use ctdb_db->persistent_state
    as an indication for whether a transaction commit is currently
    running.
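
Editorial illustration: the idea of using the pointer itself as the "commit in progress" flag can be sketched in plain C as below. The real code relies on a talloc destructor; here a hypothetical `commit_end()` helper plays that role, and `struct db`, `struct commit_state` and `commit_begin()` are likewise assumptions made for this example.

```c
#include <stdlib.h>

struct db;

/* State of one running commit, with a back pointer to its database. */
struct commit_state {
	struct db *db;
};

/* pending is non-NULL exactly while a commit is running on this db. */
struct db {
	struct commit_state *pending;
};

/* Start a commit. Returns -1 if one is already running or on OOM. */
static int commit_begin(struct db *db)
{
	struct commit_state *st;

	if (db->pending != NULL) {
		return -1;
	}
	st = calloc(1, sizeof(*st));
	if (st == NULL) {
		return -1;
	}
	st->db = db;
	db->pending = st;
	return 0;
}

/*
 * Plays the role of the destructor: clear the back reference first,
 * then release the state, so the db never holds a dangling pointer.
 */
static void commit_end(struct commit_state *st)
{
	if (st == NULL) {
		return;
	}
	if (st->db != NULL) {
		st->db->pending = NULL;
	}
	free(st);
}
```

Because clearing the pointer is tied to releasing the state, every code path that frees the state also resets the indicator, which is the property the commit message asks for.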

commit a14917c983c3b9bbbf38f5ddeecdbbe5bde32364
Author: Michael Adam <obnox at samba.org>
Date:   Wed Feb 23 00:23:18 2011 +0100

    persistent: add a ctdb_db context to the ctdb_persistent_state struct.

commit 1237e15df4af58a3d220eea42a4b75e21e65029f
Author: Michael Adam <obnox at samba.org>
Date:   Wed Feb 23 00:00:04 2011 +0100

    persistent: add a ctdb_persistent_state member to the ctdb_db context.
    
    To be used for tracking running transaction commits through recoveries.

commit d871a38978219e004833608c11aae98fe47614b9
Author: Michael Adam <obnox at samba.org>
Date:   Tue Feb 22 22:49:52 2011 +0100

    persistent_callback: print "no error message given" instead of "(null)"

commit 2c2d1646eb753ea9561f085bcb101153267b052b
Author: Michael Adam <obnox at samba.org>
Date:   Tue Feb 22 22:47:30 2011 +0100

    persistent: reduce indentation for the finishing moves in ctdb_persistent_callback

commit cab95570dc1eefb08abbac5ae411c29f699b51cc
Author: Michael Adam <obnox at samba.org>
Date:   Tue Feb 22 22:44:16 2011 +0100

    persistent: if a node failed to update_record, trigger a recovery
    
    and stop processing of the update_record replies in order to let
    the recovery finish the trans3_commit control.

commit 983c1ca2e18ecd60fca69bfe9e116125cc695857
Author: Michael Adam <obnox at samba.org>
Date:   Tue Feb 22 22:24:50 2011 +0100

    persistent_store_timeout: do not really time out the trans3_commit control in recovery
    
    If a recovery was started, then all further processing of the update_record
    controls sent by the trans3_commit control and timing them out is disabled.
    The recovery should trigger sending the reply for the update record control
    when finished.

commit 12cf0619255b12230843cd8bb49cbfdea376ca2f
Author: Michael Adam <obnox at samba.org>
Date:   Tue Feb 22 22:24:50 2011 +0100

    persistent_callback: ignore the update_record return code of remote nodes in recovery
    
    If a recovery was started, then all further processing of the update_record
    controls sent by the trans3_commit control is disabled. The recovery should
    trigger sending the reply for the update record control when finished.
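
Editorial illustration: taken together, the three commits above (cab9557, 983c1ca, 12cf061) make the reply and timeout handlers step aside while a recovery is active. A much simplified sketch of that control flow follows. The enum and function names (`reply_action`, `handle_node_reply`, `handle_timeout`, and so on) are hypothetical, and the ALLFAIL/SOMEFAIL bookkeeping of the real callback is left out.

```c
enum recovery_mode { RECOVERY_NORMAL, RECOVERY_ACTIVE };

enum reply_action {
	ACTION_WAIT,          /* more node replies are outstanding */
	ACTION_DEFER,         /* leave the commit to end-of-recovery handling */
	ACTION_REPLY_OK,      /* all nodes succeeded: answer the client */
	ACTION_REPLY_TIMEOUT  /* answer the client with a timeout error */
};

struct commit_progress {
	unsigned int num_pending;  /* node replies still expected */
};

/* Handle one update_record reply from a node. */
static enum reply_action handle_node_reply(enum recovery_mode *mode,
					   struct commit_progress *p,
					   int status)
{
	if (*mode != RECOVERY_NORMAL) {
		/* Recovery is running: ignore the reply entirely. */
		return ACTION_DEFER;
	}
	if (status != 0) {
		/* A node failed: trigger a recovery and stop processing. */
		*mode = RECOVERY_ACTIVE;
		return ACTION_DEFER;
	}
	p->num_pending--;
	return p->num_pending == 0 ? ACTION_REPLY_OK : ACTION_WAIT;
}

/* Handle expiry of the commit timer. */
static enum reply_action handle_timeout(enum recovery_mode mode)
{
	if (mode != RECOVERY_NORMAL) {
		/* Do not time out the commit while recovery is active. */
		return ACTION_DEFER;
	}
	return ACTION_REPLY_TIMEOUT;
}
```

In every ACTION_DEFER case the client gets no answer from these handlers; the reply is expected to be sent once the recovery has finished.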

-----------------------------------------------------------------------

Summary of changes:
 include/ctdb_private.h   |    3 +
 server/ctdb_call.c       |   27 ++++++++-
 server/ctdb_daemon.c     |    9 +++
 server/ctdb_persistent.c |  137 +++++++++++++++++++++++++++++++++++++++------
 server/ctdb_recover.c    |    2 +
 5 files changed, 156 insertions(+), 22 deletions(-)


Changeset truncated at 500 lines:

diff --git a/include/ctdb_private.h b/include/ctdb_private.h
index 3f36870..db5594d 100644
--- a/include/ctdb_private.h
+++ b/include/ctdb_private.h
@@ -509,6 +509,7 @@ struct ctdb_db_context {
 	int pending_requests;
 	struct lockwait_handle *lockwait_active;
 	struct lockwait_handle *lockwait_overflow;
+	struct ctdb_persistent_state *persistent_state;
 };
 
 
@@ -1215,6 +1216,8 @@ int32_t ctdb_control_trans3_commit(struct ctdb_context *ctdb,
 				   struct ctdb_req_control *c,
 				   TDB_DATA recdata, bool *async_reply);
 
+void ctdb_persistent_finish_trans3_commits(struct ctdb_context *ctdb);
+
 int32_t ctdb_control_transaction_start(struct ctdb_context *ctdb, uint32_t id);
 int32_t ctdb_control_transaction_commit(struct ctdb_context *ctdb, uint32_t id);
 int32_t ctdb_control_transaction_cancel(struct ctdb_context *ctdb);
diff --git a/server/ctdb_call.c b/server/ctdb_call.c
index be6e8f9..e188fcf 100644
--- a/server/ctdb_call.c
+++ b/server/ctdb_call.c
@@ -99,9 +99,30 @@ static void ctdb_send_error(struct ctdb_context *ctdb,
 }
 
 
-/*
-  send a redirect reply
-*/
+/**
+ * send a redirect reply
+ *
+ * The logic behind this function is this:
+ *
+ * A client wants to grab a record and sends a CTDB_REQ_CALL packet
+ * to its local ctdb (ctdb_request_call). If the node is not itself
+ * the record's DMASTER, it first redirects the packet to the
+ * record's LMASTER. The LMASTER then redirects the call packet to
+ * the current DMASTER. But there is a race: The record may have
+ * been migrated off the DMASTER while the redirected packet is
+ * on the wire (or in the local queue). So in case the record has
+ * migrated off the new destination of the call packet, instead of
+ * going back to the LMASTER to get the new DMASTER, we try to
+ * reduce roundtrips by first chasing the record a couple of times
+ * before giving up the direct chase and finally going back to the
+ * LMASTER (again). Note that this works because of this: When
+ * a record is migrated off a node, then the new DMASTER is stored
+ * in the record's copy on the former DMASTER.
+ *
+ * The maximum number of attempts for direct chase to make before
+ * going back to the LMASTER is configurable by the tunable
+ * "MaxRedirectCount".
+ */
 static void ctdb_call_send_redirect(struct ctdb_context *ctdb, 
 				    TDB_DATA key,
 				    struct ctdb_req_call *c, 
diff --git a/server/ctdb_daemon.c b/server/ctdb_daemon.c
index 362f1ce..9c650a0 100644
--- a/server/ctdb_daemon.c
+++ b/server/ctdb_daemon.c
@@ -225,7 +225,16 @@ static int ctdb_client_destructor(struct ctdb_client *client)
 		DEBUG(DEBUG_ERR, (__location__ " client exit while transaction "
 				  "commit active. Forcing recovery.\n"));
 		client->ctdb->recovery_mode = CTDB_RECOVERY_ACTIVE;
+
+		/* legacy trans2 transaction state: */
 		ctdb_db->transaction_active = false;
+
+		/*
+		 * trans3 transaction state:
+		 *
+		 * The destructor sets the pointer to NULL.
+		 */
+		talloc_free(ctdb_db->persistent_state);
 	}
 
 	return 0;
diff --git a/server/ctdb_persistent.c b/server/ctdb_persistent.c
index f9a2051..b95f456 100644
--- a/server/ctdb_persistent.c
+++ b/server/ctdb_persistent.c
@@ -28,6 +28,8 @@
 
 struct ctdb_persistent_state {
 	struct ctdb_context *ctdb;
+	struct ctdb_db_context *ctdb_db; /* used by trans3_commit */
+	struct ctdb_client *client; /* used by trans3_commit */
 	struct ctdb_req_control *c;
 	const char *errormsg;
 	uint32_t num_pending;
@@ -52,27 +54,48 @@ static void ctdb_persistent_callback(struct ctdb_context *ctdb,
 {
 	struct ctdb_persistent_state *state = talloc_get_type(private_data, 
 							      struct ctdb_persistent_state);
+	enum ctdb_trans2_commit_error etype;
+
+	if (ctdb->recovery_mode != CTDB_RECOVERY_NORMAL) {
+		DEBUG(DEBUG_INFO, ("ctdb_persistent_callback: ignoring reply "
+				   "during recovery\n"));
+		return;
+	}
 
 	if (status != 0) {
 		DEBUG(DEBUG_ERR,("ctdb_persistent_callback failed with status %d (%s)\n",
-			 status, errormsg));
+			 status, errormsg?errormsg:"no error message given"));
 		state->status = status;
 		state->errormsg = errormsg;
 		state->num_failed++;
+
+		/*
+		 * If a node failed to complete the update_record control,
+		 * then either a recovery is already running or something
+		 * bad is going on. So trigger a recovery and let the
+		 * recovery finish the transaction, sending back the reply
+		 * for the trans3_commit control to the client.
+		 */
+		ctdb->recovery_mode = CTDB_RECOVERY_ACTIVE;
+		return;
 	}
+
 	state->num_pending--;
-	if (state->num_pending == 0) {
-		enum ctdb_trans2_commit_error etype;
-		if (state->num_failed == state->num_sent) {
-			etype = CTDB_TRANS2_COMMIT_ALLFAIL;
-		} else if (state->num_failed != 0) {
-			etype = CTDB_TRANS2_COMMIT_SOMEFAIL;
-		} else {
-			etype = CTDB_TRANS2_COMMIT_SUCCESS;
-		}
-		ctdb_request_control_reply(state->ctdb, state->c, NULL, etype, state->errormsg);
-		talloc_free(state);
+
+	if (state->num_pending != 0) {
+		return;
+	}
+
+	if (state->num_failed == state->num_sent) {
+		etype = CTDB_TRANS2_COMMIT_ALLFAIL;
+	} else if (state->num_failed != 0) {
+		etype = CTDB_TRANS2_COMMIT_SOMEFAIL;
+	} else {
+		etype = CTDB_TRANS2_COMMIT_SUCCESS;
 	}
+
+	ctdb_request_control_reply(state->ctdb, state->c, NULL, etype, state->errormsg);
+	talloc_free(state);
 }
 
 /*
@@ -82,13 +105,53 @@ static void ctdb_persistent_store_timeout(struct event_context *ev, struct timed
 					 struct timeval t, void *private_data)
 {
 	struct ctdb_persistent_state *state = talloc_get_type(private_data, struct ctdb_persistent_state);
-	
+
+	if (state->ctdb->recovery_mode != CTDB_RECOVERY_NORMAL) {
+		DEBUG(DEBUG_INFO, ("ctdb_persistent_store_timeout: ignoring "
+				   "timeout during recovery\n"));
+		return;
+	}
+
 	ctdb_request_control_reply(state->ctdb, state->c, NULL, CTDB_TRANS2_COMMIT_TIMEOUT, 
 				   "timeout in ctdb_persistent_state");
 
 	talloc_free(state);
 }
 
+/**
+ * Finish pending trans3 commit controls, i.e. send
+ * reply to the client. This is called by the end-recovery
+ * control to fix the situation when a recovery interrupts
+ * the usual progress of a transaction.
+ */
+void ctdb_persistent_finish_trans3_commits(struct ctdb_context *ctdb)
+{
+	struct ctdb_db_context *ctdb_db;
+
+	if (ctdb->recovery_mode != CTDB_RECOVERY_NORMAL) {
+		DEBUG(DEBUG_INFO, ("ctdb_persistent_finish_trans3_commits: "
+				   "skipping, recovery still active\n"));
+		return;
+	}
+
+	for (ctdb_db = ctdb->db_list; ctdb_db; ctdb_db = ctdb_db->next) {
+		struct ctdb_persistent_state *state;
+
+		if (ctdb_db->persistent_state == NULL) {
+			continue;
+		}
+
+		state = ctdb_db->persistent_state;
+
+		ctdb_request_control_reply(ctdb, state->c, NULL,
+					   CTDB_TRANS2_COMMIT_SOMEFAIL,
+					   "trans3 commit ended by recovery");
+
+		/* The destructor sets ctdb_db->persistent_state to NULL. */
+		talloc_free(state);
+	}
+}
+
 /*
   store a set of persistent records - called from a ctdb client when it has updated
   some records in a persistent database. The client will have the record
@@ -247,6 +310,18 @@ int32_t ctdb_control_trans2_commit(struct ctdb_context *ctdb,
 	return 0;
 }
 
+static int ctdb_persistent_state_destructor(struct ctdb_persistent_state *state)
+{
+	if (state->client != NULL) {
+		state->client->db_id = 0;
+	}
+
+	if (state->ctdb_db != NULL) {
+		state->ctdb_db->persistent_state = NULL;
+	}
+
+	return 0;
+}
 
 /*
  * Store a set of persistent records.
@@ -267,6 +342,21 @@ int32_t ctdb_control_trans3_commit(struct ctdb_context *ctdb,
 		return -1;
 	}
 
+	client = ctdb_reqid_find(ctdb, c->client_id, struct ctdb_client);
+	if (client == NULL) {
+		DEBUG(DEBUG_ERR,(__location__ " can not match persistent_store "
+				 "to a client. Returning error\n"));
+		return -1;
+	}
+
+	if (client->db_id != 0) {
+		DEBUG(DEBUG_ERR,(__location__ " ERROR: trans3_commit: "
+				 "client-db_id[0x%08x] != 0 "
+				 "(client_id[0x%08x]): trans3_commit active?\n",
+				 client->db_id, client->client_id));
+		return -1;
+	}
+
 	ctdb_db = find_ctdb_db(ctdb, m->db_id);
 	if (ctdb_db == NULL) {
 		DEBUG(DEBUG_ERR,(__location__ " ctdb_control_trans3_commit: "
@@ -274,18 +364,27 @@ int32_t ctdb_control_trans3_commit(struct ctdb_context *ctdb,
 		return -1;
 	}
 
-	client = ctdb_reqid_find(ctdb, c->client_id, struct ctdb_client);
-	if (client == NULL) {
-		DEBUG(DEBUG_ERR,(__location__ " can not match persistent_store "
-				 "to a client. Returning error\n"));
+	if (ctdb_db->persistent_state != NULL) {
+		DEBUG(DEBUG_ERR, (__location__ " Error: "
+				  "ctdb_control_trans3_commit "
+				  "called while a transaction commit is "
+				  "active. db_id[0x%08x]\n", m->db_id));
 		return -1;
 	}
 
-	state = talloc_zero(ctdb, struct ctdb_persistent_state);
-	CTDB_NO_MEMORY(ctdb, state);
+	ctdb_db->persistent_state = talloc_zero(ctdb_db,
+						struct ctdb_persistent_state);
+	CTDB_NO_MEMORY(ctdb, ctdb_db->persistent_state);
 
+	client->db_id = m->db_id;
+
+	state = ctdb_db->persistent_state;
 	state->ctdb = ctdb;
+	state->ctdb_db = ctdb_db;
 	state->c    = c;
+	state->client = client;
+
+	talloc_set_destructor(state, ctdb_persistent_state_destructor);
 
 	for (i = 0; i < ctdb->vnn_map->size; i++) {
 		struct ctdb_node *node = ctdb->nodes[ctdb->vnn_map->map[i]];
diff --git a/server/ctdb_recover.c b/server/ctdb_recover.c
index 4db4d97..2dbfbd4 100644
--- a/server/ctdb_recover.c
+++ b/server/ctdb_recover.c
@@ -988,6 +988,8 @@ int32_t ctdb_control_end_recovery(struct ctdb_context *ctdb,
 
 	DEBUG(DEBUG_NOTICE,("Recovery has finished\n"));
 
+	ctdb_persistent_finish_trans3_commits(ctdb);
+
 	state = talloc(ctdb, struct recovery_callback_state);
 	CTDB_NO_MEMORY(ctdb, state);
 


-- 
CTDB repository

