[SCM] CTDB repository - branch master updated - ctdb-2.1-190-gc50eca6

Amitay Isaacs amitay at samba.org
Thu May 23 23:49:03 MDT 2013


The branch, master has been updated
       via  c50eca6fbf49a6c7bf50905334704f8d2d3237d7 (commit)
       via  39a43feae7c7de07ddaf2d6cb962f923d47d0c19 (commit)
       via  ed6814ff0a59ddbb1c1b3128b505380f60d7aeb7 (commit)
       via  4a2effcc455be67ff4a779a59ca81ba584312cd6 (commit)
       via  bf20c3ab090f75f59097b36186347cedb1c445d4 (commit)
       via  dc4220e6f618cc688b3ca8e52bcb3eec6cb55bb1 (commit)
       via  f43fe3a560d5915c1a9893256f4e7bfe3d7e290a (commit)
       via  c31feb27dcdb748b5333321c85fe54852dfa1bcf (commit)
       via  8076773a9924dcf8aff16f7d96b2b9ac383ecc28 (commit)
       via  9e7b7cd04adc5e66e2ffa4edf463a682aaea379b (commit)
      from  dbb7c550133c92292a7212bdcaaa79f399b0919b (commit)

http://gitweb.samba.org/?p=ctdb.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit c50eca6fbf49a6c7bf50905334704f8d2d3237d7
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jan 11 18:02:51 2013 +1100

    ctdbd: When the "setup" event fails log an error and exit, don't abort
    
    The "setup" event can fail when one of the eventscripts fails to run
    its "setup" event.  If this occurs then the eventscript should log an
    error.  The stack trace and core file generated when we abort provide
    no useful information.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 39a43feae7c7de07ddaf2d6cb962f923d47d0c19
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jan 11 16:02:31 2013 +1100

    eventscripts: 11.natgw should not call ctdb tool in "init" event
    
    The current code calls "ctdb setnatgwstate ..." on every event.
    However, calling the ctdb tool in the "init" event is not permitted.
    
    Instead, update the capability when it is needed and at regular
    intervals via the "monitor" event.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit ed6814ff0a59ddbb1c1b3128b505380f60d7aeb7
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Apr 18 20:30:14 2013 +1000

    ctdbd: Add new runstate CTDB_RUNSTATE_FIRST_RECOVERY
    
    This adds more serialisation to the startup, ensuring that the
    "startup" event runs after everything to do with the first recovery
    (including the "recovered" event).
    
    Given that it now takes longer to get to the "startup" state, the
    initscript needs to wait until ctdbd gets to "first_recovery".
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 4a2effcc455be67ff4a779a59ca81ba584312cd6
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jan 11 14:09:14 2013 +1100

    tools/ctdb: "ctdb runstate" now accepts optional expected run state arguments
    
    If one or more run states are specified then "ctdb runstate" succeeds
    only if ctdbd is in one of those run states.
    
    At the moment, if the "setup" event fails then the initscript succeeds
    but ctdbd exits almost immediately.  This behaviour isn't very
    friendly.
    
    The initscript now waits until ctdbd is in "startup" or "running" run
    state via the use of "ctdb runstate startup running", meaning that ctdbd
    has successfully passed the "setup" event.
    
    The "setup" event code in 00.ctdb now waits until ctdbd is in the
    "setup" run state before proceeding via the use of "ctdb runstate setup".
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>
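
The matching rule described in this commit message (succeed only if ctdbd is in one of the listed run states) can be sketched in standalone C. The tools/ctdb.c part of the diff is cut off by the 500-line truncation, so this is not the actual implementation. The names `label_to_runstate()` and `check_runstate()` are hypothetical, the enum only mirrors the one added to include/ctdb_private.h, and treating an unrecognised label as "never matches" is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>	/* strcasecmp() (POSIX) */

/* Mirrors enum ctdb_runstate from this changeset; names shortened. */
enum runstate {
	RUNSTATE_UNKNOWN,
	RUNSTATE_INIT,
	RUNSTATE_SETUP,
	RUNSTATE_FIRST_RECOVERY,
	RUNSTATE_STARTUP,
	RUNSTATE_RUNNING,
	RUNSTATE_SHUTDOWN,
};

/* Labels indexed by enum value, in the same order as the enum. */
static const char *const runstate_labels[] = {
	"UNKNOWN", "INIT", "SETUP", "FIRST_RECOVERY",
	"STARTUP", "RUNNING", "SHUTDOWN",
};

/* Hypothetical helper: case-insensitive lookup of a label.
 * Returns RUNSTATE_UNKNOWN when the label is not recognised. */
static enum runstate label_to_runstate(const char *label)
{
	size_t i;
	size_t n = sizeof(runstate_labels) / sizeof(runstate_labels[0]);

	assert(label != NULL);
	for (i = 0; i < n; i++) {
		if (strcasecmp(runstate_labels[i], label) == 0) {
			return (enum runstate)i;
		}
	}
	return RUNSTATE_UNKNOWN;
}

/* Hypothetical helper: returns 0 (shell-style success) when no
 * expected states are given, or when the current state matches one
 * of them; returns 1 otherwise.  Assumption: an unrecognised label
 * never matches. */
static int check_runstate(enum runstate current,
			  int nexpected, const char **expected)
{
	int i;

	if (nexpected == 0) {
		return 0;
	}
	for (i = 0; i < nexpected; i++) {
		enum runstate want = label_to_runstate(expected[i]);
		if (want != RUNSTATE_UNKNOWN && want == current) {
			return 0;
		}
	}
	return 1;
}
```

With the expected list { "startup", "running" }, this sketch returns 0 for a daemon in RUNSTATE_RUNNING and 1 for one still in RUNSTATE_SETUP, which is the exit status a polling loop such as wait_until_ready() would rely on.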

commit bf20c3ab090f75f59097b36186347cedb1c445d4
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jan 11 14:07:12 2013 +1100

    tools/ctdb: New command runstate to print current runstate
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit dc4220e6f618cc688b3ca8e52bcb3eec6cb55bb1
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue May 21 16:18:28 2013 +1000

    ctdbd: New control CTDB_CONTROL_GET_RUNSTATE
    
    Also new client function ctdb_ctrl_get_runstate().
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit f43fe3a560d5915c1a9893256f4e7bfe3d7e290a
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jan 10 16:48:39 2013 +1100

    ctdbd: Start logging process earlier
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit c31feb27dcdb748b5333321c85fe54852dfa1bcf
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jan 10 16:33:36 2013 +1100

    ctdbd: Only start recovery daemon and timed events after setup event
    
    This deconstructs ctdb_start_transport(), which did much more than
    starting the transport.
    
    This removes a very unlikely race and adds some clarity.  The setup
    event is supposed to set the tunables before the first recovery.
    However, there was nothing stopping the first recovery from starting
    before the setup event had completed.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit 8076773a9924dcf8aff16f7d96b2b9ac383ecc28
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jan 10 16:06:25 2013 +1100

    ctdbd: Replace ctdb->done_startup with ctdb->runstate
    
    This allows states, including startup and shutdown states, to be
    clearly tracked.  This doesn't include regular runtime "states", which
    are handled by node flags.
    
    Introduce new functions ctdb_set_runstate(), runstate_to_string() and
    runstate_from_string().
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>
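
As a rough illustration of why an ordered enum carries more information than the old boolean, the following standalone sketch models a run state that may only move forward. It is not the ctdbd code: `struct daemon_ctx`, `set_runstate()` and `startup_done()` are hypothetical names, and the sketch reports a rejected transition to the caller, whereas ctdb_set_runstate() in this changeset calls ctdb_fatal().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors enum ctdb_runstate from this changeset; names shortened. */
enum runstate {
	RUNSTATE_UNKNOWN,
	RUNSTATE_INIT,
	RUNSTATE_SETUP,
	RUNSTATE_FIRST_RECOVERY,
	RUNSTATE_STARTUP,
	RUNSTATE_RUNNING,
	RUNSTATE_SHUTDOWN,
};

/* Hypothetical stand-in for the relevant part of struct ctdb_context. */
struct daemon_ctx {
	enum runstate runstate;
};

/* Accept a transition only if it is a strict increase.  Returns
 * false and leaves the state untouched otherwise. */
static bool set_runstate(struct daemon_ctx *ctx, enum runstate next)
{
	assert(ctx != NULL);
	if (next <= ctx->runstate) {
		return false;
	}
	ctx->runstate = next;
	return true;
}

/* Hypothetical equivalent of the old "done_startup" flag: true once
 * the daemon has reached RUNNING (and it stays true in SHUTDOWN). */
static bool startup_done(const struct daemon_ctx *ctx)
{
	assert(ctx != NULL);
	return ctx->runstate >= RUNSTATE_RUNNING;
}
```

Because the states are ordered, callers can also ask questions the boolean could not answer, for example whether the daemon is still before STARTUP, using a plain `<` comparison.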

commit 9e7b7cd04adc5e66e2ffa4edf463a682aaea379b
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu May 23 16:06:47 2013 +1000

    tools/ctdb: Remove duplicate command definition for "sync"
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

-----------------------------------------------------------------------

Summary of changes:
 client/ctdb_client.c      |   30 +++++++++++++++++++++
 common/ctdb_util.c        |   50 +++++++++++++++++++++++++++++++++++
 config/ctdb.init          |    2 +-
 config/events.d/00.ctdb   |    2 +-
 config/events.d/11.natgw  |   17 +++++++++---
 doc/ctdb.1.xml            |   22 +++++++++++++++
 include/ctdb_client.h     |    5 +++
 include/ctdb_private.h    |   16 ++++++++++-
 include/ctdb_protocol.h   |    1 +
 server/ctdb_control.c     |    8 +++++
 server/ctdb_daemon.c      |   63 ++++++++++++++++++++++++--------------------
 server/ctdb_ltdb_server.c |   10 +++---
 server/ctdb_monitor.c     |   19 ++++++++++---
 server/ctdb_recover.c     |    4 +++
 server/ctdb_takeover.c    |    2 +-
 tools/ctdb.c              |   47 ++++++++++++++++++++++++++++++++-
 16 files changed, 249 insertions(+), 49 deletions(-)


Changeset truncated at 500 lines:

diff --git a/client/ctdb_client.c b/client/ctdb_client.c
index 2ae8958..e930bff 100644
--- a/client/ctdb_client.c
+++ b/client/ctdb_client.c
@@ -1654,6 +1654,36 @@ int ctdb_ctrl_ping(struct ctdb_context *ctdb, uint32_t destnode)
 	return res;
 }
 
+int ctdb_ctrl_get_runstate(struct ctdb_context *ctdb, 
+			   struct timeval timeout, 
+			   uint32_t destnode,
+			   uint32_t *runstate)
+{
+	TDB_DATA outdata;
+	int32_t res;
+	int ret;
+
+	ret = ctdb_control(ctdb, destnode, 0, CTDB_CONTROL_GET_RUNSTATE, 0,
+			   tdb_null, ctdb, &outdata, &res, &timeout, NULL);
+	if (ret != 0 || res != 0) {
+		DEBUG(DEBUG_ERR,("ctdb_control for get_runstate failed\n"));
+		return ret != 0 ? ret : res;
+	}
+
+	if (outdata.dsize != sizeof(uint32_t)) {
+		DEBUG(DEBUG_ERR,("Invalid return data in get_runstate\n"));
+		talloc_free(outdata.dptr);
+		return -1;
+	}
+
+	if (runstate != NULL) {
+		*runstate = *(uint32_t *)outdata.dptr;
+	}
+	talloc_free(outdata.dptr);
+
+	return 0;
+}
+
 /*
   find the real path to a ltdb 
  */
diff --git a/common/ctdb_util.c b/common/ctdb_util.c
index 71dee2b..a910a0c 100644
--- a/common/ctdb_util.c
+++ b/common/ctdb_util.c
@@ -702,3 +702,53 @@ const char *ctdb_eventscript_call_names[] = {
 	"updateip",
 	"ipreallocated"
 };
+
+/* Runstate handling */
+static struct {
+	enum ctdb_runstate runstate;
+	const char * label;
+} runstate_map[] = {
+	{ CTDB_RUNSTATE_UNKNOWN, "UNKNOWN" },
+	{ CTDB_RUNSTATE_INIT, "INIT" },
+	{ CTDB_RUNSTATE_SETUP, "SETUP" },
+	{ CTDB_RUNSTATE_FIRST_RECOVERY, "FIRST_RECOVERY" },
+	{ CTDB_RUNSTATE_STARTUP, "STARTUP" },
+	{ CTDB_RUNSTATE_RUNNING, "RUNNING" },
+	{ CTDB_RUNSTATE_SHUTDOWN, "SHUTDOWN" },
+	{ -1, NULL },
+};
+
+const char *runstate_to_string(enum ctdb_runstate runstate)
+{
+	int i;
+	for (i=0; runstate_map[i].label != NULL ; i++) {
+		if (runstate_map[i].runstate == runstate) {
+			return runstate_map[i].label;
+		}
+	}
+
+	return runstate_map[0].label;
+}
+
+enum ctdb_runstate runstate_from_string(const char *label)
+{
+	int i;
+	for (i=0; runstate_map[i].label != NULL; i++) {
+		if (strcasecmp(runstate_map[i].label, label) == 0) {
+			return runstate_map[i].runstate;
+		}
+	}
+
+	return CTDB_RUNSTATE_UNKNOWN;
+}
+
+void ctdb_set_runstate(struct ctdb_context *ctdb, enum ctdb_runstate runstate)
+{
+	if (runstate <= ctdb->runstate) {
+		ctdb_fatal(ctdb, "runstate must always increase");
+	}
+
+	DEBUG(DEBUG_NOTICE,("Set runstate to %s (%d)\n",
+			    runstate_to_string(runstate), runstate));
+	ctdb->runstate = runstate;
+}
diff --git a/config/ctdb.init b/config/ctdb.init
index 6c4e16d..2ceb45f 100755
--- a/config/ctdb.init
+++ b/config/ctdb.init
@@ -220,7 +220,7 @@ wait_until_ready () {
     _timeout="${1:-10}" # default is 10 seconds
 
     _count=0
-    while ! ctdb ping >/dev/null 2>&1 ; do
+    while ! ctdb runstate first_recovery startup running >/dev/null 2>&1 ; do
 	if [ $_count -ge $_timeout ] ; then
 	    return 1
 	fi
diff --git a/config/events.d/00.ctdb b/config/events.d/00.ctdb
index c1ac11a..02d1569 100755
--- a/config/events.d/00.ctdb
+++ b/config/events.d/00.ctdb
@@ -53,7 +53,7 @@ wait_until_ready () {
     _timeout="${1:-10}" # default is 10 seconds
 
     _count=0
-    while ! ctdb ping >/dev/null 2>&1 ; do
+    while ! ctdb runstate setup >/dev/null 2>&1 ; do
 	if [ $_count -ge $_timeout ] ; then
 	    return 1
 	fi
diff --git a/config/events.d/11.natgw b/config/events.d/11.natgw
index a6e0523..c6c45ca 100755
--- a/config/events.d/11.natgw
+++ b/config/events.d/11.natgw
@@ -13,12 +13,15 @@ loadconfig
 
 [ -z "$CTDB_NATGW_NODES" ] && exit 0
 
-# Update capabilities to show whether we support teh NATGW capability or not
-if [ "$CTDB_NATGW_SLAVE_ONLY" = "yes" ] ; then
+set_natgw_capability ()
+{
+    # Set NATGW capability depending on configuration
+    if [ "$CTDB_NATGW_SLAVE_ONLY" = "yes" ] ; then
 	ctdb setnatgwstate off
-else
+    else
 	ctdb setnatgwstate on
-fi
+    fi
+}
 
 delete_all() {
 	_ip="${CTDB_NATGW_PUBLIC_IP%/*}"
@@ -58,6 +61,10 @@ ensure_natgwmaster ()
 }
 
 case "$1" in 
+    setup)
+	set_natgw_capability
+	;;
+
     startup)
 	# Error if CTDB_NATGW_PUBLIC_IP is listed in public addresses
 	grep -q "^$CTDB_NATGW_PUBLIC_IP[[:space:]]" "${CTDB_PUBLIC_ADDRESSES:-/etc/ctdb/public_addresses}" && \
@@ -70,6 +77,7 @@ case "$1" in
     recovered|updatenatgw|ipreallocated)
 	mypnn=$(ctdb pnn | cut -d: -f2)
 
+	set_natgw_capability
 	ensure_natgwmaster "$1"
 
 	delete_all
@@ -103,6 +111,7 @@ case "$1" in
 	;;
 
     monitor)
+	set_natgw_capability
 	ensure_natgwmaster "$1"
 	;;
 
diff --git a/doc/ctdb.1.xml b/doc/ctdb.1.xml
index 83d0ac0..ce83a3e 100644
--- a/doc/ctdb.1.xml
+++ b/doc/ctdb.1.xml
@@ -382,6 +382,28 @@ response from 3 time=0.000114 sec  (2 clients)
       </screen>
     </refsect2>
 
+    <refsect2><title>runstate [setup|first_recovery|startup|running]</title>
+      <para>
+        Print the runstate of the specified node.  Runstates are used
+        to serialise important state transitions in CTDB, particularly
+        during startup.
+      </para>
+      <para>
+        If one or more optional runstate arguments are specified then
+        the node must be in one of these runstates for the command to
+        succeed.
+      </para>
+      <para>
+	Example: ctdb runstate
+      </para>
+      <para>
+	Example output:
+      </para>
+      <screen format="linespecific">
+RUNNING
+      </screen>
+    </refsect2>
+
     <refsect2><title>ifaces</title>
       <para>
 	This command will display the list of network interfaces, which could
diff --git a/include/ctdb_client.h b/include/ctdb_client.h
index 564c563..8739923 100644
--- a/include/ctdb_client.h
+++ b/include/ctdb_client.h
@@ -295,6 +295,11 @@ int ctdb_ctrl_process_exists(struct ctdb_context *ctdb, uint32_t destnode, pid_t
 
 int ctdb_ctrl_ping(struct ctdb_context *ctdb, uint32_t destnode);
 
+int ctdb_ctrl_get_runstate(struct ctdb_context *ctdb, 
+			   struct timeval timeout, 
+			   uint32_t destnode,
+			   uint32_t *runstate);
+
 int ctdb_ctrl_get_config(struct ctdb_context *ctdb);
 
 int ctdb_ctrl_get_debuglevel(struct ctdb_context *ctdb, uint32_t destnode, int32_t *level);
diff --git a/include/ctdb_private.h b/include/ctdb_private.h
index c47210e..eadd963 100644
--- a/include/ctdb_private.h
+++ b/include/ctdb_private.h
@@ -436,6 +436,20 @@ struct ctdb_write_record {
 
 enum ctdb_freeze_mode {CTDB_FREEZE_NONE, CTDB_FREEZE_PENDING, CTDB_FREEZE_FROZEN};
 
+enum ctdb_runstate {
+	CTDB_RUNSTATE_UNKNOWN,
+	CTDB_RUNSTATE_INIT,
+	CTDB_RUNSTATE_SETUP,
+	CTDB_RUNSTATE_FIRST_RECOVERY,
+	CTDB_RUNSTATE_STARTUP,
+	CTDB_RUNSTATE_RUNNING,
+	CTDB_RUNSTATE_SHUTDOWN,
+};
+
+const char *runstate_to_string(enum ctdb_runstate runstate);
+enum ctdb_runstate runstate_from_string(const char *label);
+void ctdb_set_runstate(struct ctdb_context *ctdb, enum ctdb_runstate runstate);
+
 #define CTDB_MONITORING_ACTIVE		0
 #define CTDB_MONITORING_DISABLED	1
 
@@ -505,7 +519,7 @@ struct ctdb_context {
 	pid_t ctdbd_pid;
 	pid_t recoverd_pid;
 	pid_t syslogd_pid;
-	bool done_startup;
+	enum ctdb_runstate runstate;
 	struct ctdb_monitor_state *monitor;
 	struct ctdb_log_state *log;
 	int start_as_disabled;
diff --git a/include/ctdb_protocol.h b/include/ctdb_protocol.h
index 09ce01a..10f643b 100644
--- a/include/ctdb_protocol.h
+++ b/include/ctdb_protocol.h
@@ -405,6 +405,7 @@ enum ctdb_controls {CTDB_CONTROL_PROCESS_EXISTS          = 0,
 		    CTDB_CONTROL_TRAVERSE_ALL_EXT	 = 135,
 		    CTDB_CONTROL_RECEIVE_RECORDS	 = 136,
 		    CTDB_CONTROL_IPREALLOCATED		 = 137,
+		    CTDB_CONTROL_GET_RUNSTATE		 = 138,
 };
 
 /*
diff --git a/server/ctdb_control.c b/server/ctdb_control.c
index 72a602d..bf4a20d 100644
--- a/server/ctdb_control.c
+++ b/server/ctdb_control.c
@@ -207,6 +207,13 @@ static int32_t ctdb_control_dispatch(struct ctdb_context *ctdb,
 		CHECK_CONTROL_DATA_SIZE(0);
 		return ctdb->num_clients;
 
+	case CTDB_CONTROL_GET_RUNSTATE:
+		CHECK_CONTROL_DATA_SIZE(0);
+		outdata->dptr = (uint8_t *)&ctdb->runstate;
+		outdata->dsize = sizeof(uint32_t);
+		return 0;
+
+
 	case CTDB_CONTROL_SET_DB_READONLY: {
 		uint32_t db_id;
 		struct ctdb_db_context *ctdb_db;
@@ -325,6 +332,7 @@ static int32_t ctdb_control_dispatch(struct ctdb_context *ctdb,
 
 	case CTDB_CONTROL_SHUTDOWN:
 		DEBUG(DEBUG_NOTICE,("Received SHUTDOWN command. Stopping CTDB daemon.\n"));
+		ctdb_set_runstate(ctdb, CTDB_RUNSTATE_SHUTDOWN);
 		ctdb_stop_recoverd(ctdb);
 		ctdb_stop_keepalive(ctdb);
 		ctdb_stop_monitoring(ctdb);
diff --git a/server/ctdb_daemon.c b/server/ctdb_daemon.c
index d2df115..cedee09 100644
--- a/server/ctdb_daemon.c
+++ b/server/ctdb_daemon.c
@@ -76,25 +76,8 @@ static void ctdb_start_time_tickd(struct ctdb_context *ctdb)
 			ctdb_time_tick, ctdb);
 }
 
-
-/* called when CTDB is ready to process requests */
-static void ctdb_start_transport(struct ctdb_context *ctdb)
+static void ctdb_start_periodic_events(struct ctdb_context *ctdb)
 {
-	/* start the transport running */
-	if (ctdb->methods->start(ctdb) != 0) {
-		DEBUG(DEBUG_ALERT,("transport failed to start!\n"));
-		ctdb_fatal(ctdb, "transport failed to start");
-	}
-
-	/* start the recovery daemon process */
-	if (ctdb_start_recoverd(ctdb) != 0) {
-		DEBUG(DEBUG_ALERT,("Failed to start recovery daemon\n"));
-		exit(11);
-	}
-
-	/* Make sure we log something when the daemon terminates */
-	atexit(print_exit_message);
-
 	/* start monitoring for connected/disconnected nodes */
 	ctdb_start_keepalive(ctdb);
 
@@ -1049,16 +1032,26 @@ static void ctdb_setup_event_callback(struct ctdb_context *ctdb, int status,
 				      void *private_data)
 {
 	if (status != 0) {
-		ctdb_fatal(ctdb, "Failed to run setup event\n");
-		return;
+		DEBUG(DEBUG_ALERT,("Failed to run setup event - exiting\n"));
+		exit(1);
 	}
 	ctdb_run_notification_script(ctdb, "setup");
 
+	ctdb_set_runstate(ctdb, CTDB_RUNSTATE_FIRST_RECOVERY);
+
 	/* tell all other nodes we've just started up */
 	ctdb_daemon_send_control(ctdb, CTDB_BROADCAST_ALL,
 				 0, CTDB_CONTROL_STARTUP, 0,
 				 CTDB_CTRL_FLAG_NOREPLY,
 				 tdb_null, NULL, NULL);
+
+	/* Start the recovery daemon */
+	if (ctdb_start_recoverd(ctdb) != 0) {
+		DEBUG(DEBUG_ALERT,("Failed to start recovery daemon\n"));
+		exit(11);
+	}
+
+	ctdb_start_periodic_events(ctdb);
 }
 
 static struct timeval tevent_before_wait_ts;
@@ -1207,6 +1200,12 @@ int ctdb_start_daemon(struct ctdb_context *ctdb, bool do_fork, bool use_syslog,
 	}
 
 	ctdb_set_child_logging(ctdb);
+	if (use_syslog) {
+		if (start_syslog_daemon(ctdb)) {
+			DEBUG(DEBUG_CRIT, ("Failed to start syslog daemon\n"));
+			exit(10);
+		}
+	}
 
 	/* initialize statistics collection */
 	ctdb_statistics_init(ctdb);
@@ -1259,6 +1258,7 @@ int ctdb_start_daemon(struct ctdb_context *ctdb, bool do_fork, bool use_syslog,
 		ctdb_fatal(ctdb, "Failed to attach to databases\n");
 	}
 
+	ctdb_set_runstate(ctdb, CTDB_RUNSTATE_INIT);
 	ret = ctdb_event_script(ctdb, CTDB_EVENT_INIT);
 	if (ret != 0) {
 		ctdb_fatal(ctdb, "Failed to run init event\n");
@@ -1281,9 +1281,21 @@ int ctdb_start_daemon(struct ctdb_context *ctdb, bool do_fork, bool use_syslog,
 		ctdb_release_all_ips(ctdb);
 	}
 
-	/* start the transport going */
-	ctdb_start_transport(ctdb);
 
+	/* Make sure we log something when the daemon terminates */
+	atexit(print_exit_message);
+
+	/* Start the transport */
+	if (ctdb->methods->start(ctdb) != 0) {
+		DEBUG(DEBUG_ALERT,("transport failed to start!\n"));
+		ctdb_fatal(ctdb, "transport failed to start");
+	}
+
+	/* Recovery daemon and timed events are started from the
+	 * callback, only after the setup event completes
+	 * successfully.
+	 */
+	ctdb_set_runstate(ctdb, CTDB_RUNSTATE_SETUP);
 	ret = ctdb_event_script_callback(ctdb,
 					 ctdb,
 					 ctdb_setup_event_callback,
@@ -1297,13 +1309,6 @@ int ctdb_start_daemon(struct ctdb_context *ctdb, bool do_fork, bool use_syslog,
 		exit(1);
 	}
 
-	if (use_syslog) {
-		if (start_syslog_daemon(ctdb)) {
-			DEBUG(DEBUG_CRIT, ("Failed to start syslog daemon\n"));
-			exit(10);
-		}
-	}
-
 	ctdb_lockdown_memory(ctdb);
 	  
 	/* go into a wait loop to allow other nodes to complete */
diff --git a/server/ctdb_ltdb_server.c b/server/ctdb_ltdb_server.c
index 8b06703..0426d96 100644
--- a/server/ctdb_ltdb_server.c
+++ b/server/ctdb_ltdb_server.c
@@ -656,7 +656,7 @@ int32_t ctdb_control_db_set_healthy(struct ctdb_context *ctdb, TDB_DATA indata)
 		return -1;
 	}
 
-	if (may_recover && !ctdb->done_startup) {
+	if (may_recover && ctdb->runstate == CTDB_RUNSTATE_STARTUP) {
 		DEBUG(DEBUG_ERR, (__location__ " db %s become healthy  - force recovery for startup\n",
 				  ctdb_db->db_name));
 		ctdb->recovery_mode = CTDB_RECOVERY_ACTIVE;
@@ -794,7 +794,7 @@ static int ctdb_local_attach(struct ctdb_context *ctdb, const char *db_name,
 		if (ctdb->max_persistent_check_errors > 0) {
 			remaining_tries = 1;
 		}
-		if (ctdb->done_startup) {
+		if (ctdb->runstate == CTDB_RUNSTATE_RUNNING) {
 			remaining_tries = 0;
 		}
 
@@ -1086,9 +1086,9 @@ int32_t ctdb_control_db_attach(struct ctdb_context *ctdb, TDB_DATA indata,
 			return -1;
 		}
 
-		if (ctdb->recovery_mode == CTDB_RECOVERY_ACTIVE
-		 && client->pid != ctdb->recoverd_pid
-		 && !ctdb->done_startup) {
+		if (ctdb->recovery_mode == CTDB_RECOVERY_ACTIVE &&
+		    client->pid != ctdb->recoverd_pid &&
+		    ctdb->runstate < CTDB_RUNSTATE_RUNNING) {
 			struct ctdb_deferred_attach_context *da_ctx = talloc(client, struct ctdb_deferred_attach_context);
 
 			if (da_ctx == NULL) {
diff --git a/server/ctdb_monitor.c b/server/ctdb_monitor.c
index 984f947..1608804 100644
--- a/server/ctdb_monitor.c
+++ b/server/ctdb_monitor.c
@@ -204,7 +204,7 @@ static void ctdb_startup_callback(struct ctdb_context *ctdb, int status, void *p
 		DEBUG(DEBUG_ERR,("startup event failed\n"));
 	} else if (status == 0) {
 		DEBUG(DEBUG_NOTICE,("startup event OK - enabling monitoring\n"));
-		ctdb->done_startup = true;
+		ctdb_set_runstate(ctdb, CTDB_RUNSTATE_RUNNING);
 		ctdb->monitor->next_interval = 2;
 		ctdb_run_notification_script(ctdb, "startup");
 	}
@@ -307,7 +307,6 @@ static void ctdb_wait_until_recovered(struct event_context *ev, struct timed_eve
 	}
 	ctdb->db_persistent_check_errors = 0;
 
-	DEBUG(DEBUG_NOTICE,(__location__ " Recoveries finished. Running the \"startup\" event.\n"));
 	event_add_timed(ctdb->ev, ctdb->monitor->monitor_context,
 			     timeval_current(),
 			     ctdb_check_health, ctdb);
@@ -323,15 +322,25 @@ static void ctdb_check_health(struct event_context *ev, struct timed_event *te,
 	struct ctdb_context *ctdb = talloc_get_type(private_data, struct ctdb_context);
 	int ret = 0;
 
+	if (ctdb->runstate < CTDB_RUNSTATE_STARTUP) {
+		DEBUG(DEBUG_NOTICE,("Not yet in startup runstate. Wait one more second\n"));
+		event_add_timed(ctdb->ev, ctdb->monitor->monitor_context,
+				timeval_current_ofs(1, 0), 
+				ctdb_check_health, ctdb);
+		return;
+	}
+	
 	if (ctdb->recovery_mode != CTDB_RECOVERY_NORMAL ||
-	    (ctdb->monitor->monitoring_mode == CTDB_MONITORING_DISABLED && ctdb->done_startup)) {
+	    (ctdb->monitor->monitoring_mode == CTDB_MONITORING_DISABLED &&
+	     ctdb->runstate == CTDB_RUNSTATE_RUNNING)) {
 		event_add_timed(ctdb->ev, ctdb->monitor->monitor_context,
 				timeval_current_ofs(ctdb->monitor->next_interval, 0), 
 				ctdb_check_health, ctdb);
 		return;
 	}
 	
-	if (!ctdb->done_startup) {
+	if (ctdb->runstate == CTDB_RUNSTATE_STARTUP) {
+		DEBUG(DEBUG_NOTICE,("Recoveries finished. Running the \"startup\" event.\n"));
 		ret = ctdb_event_script_callback(ctdb, 
 						 ctdb->monitor->monitor_context, ctdb_startup_callback, 
 						 ctdb, false,


-- 
CTDB repository

