[SCM] CTDB repository - branch status-test updated - ctdb-1.0.104-19-g66b2298

Mon Nov 23 19:44:32 MST 2009

The branch, status-test has been updated
       via  66b22980b14601f29fe8cc64bd8f29883c7ca1c0 (commit)
       via  a6d353519932eee48f9241ad8887b692882906c9 (commit)
       via  b0648c7f08eba87ec3c9714e2525c9b621bfb4ef (commit)
       via  834c93b3e1b8f4151b8a2cd82c2dd8bacc17f66c (commit)
       via  470822b329f9d3ca9bef518b56e9ce28d5fedda2 (commit)
       via  fe8027309c1f7b987cd368fa98f9b28741baa786 (commit)
       via  e7d57d7ae678b24dab3364a348838c6a3398942c (commit)
       via  20b15de068d042b292725945927ceda1b01d07c0 (commit)
       via  c715746c2f40eb9b21dbf011d16f1f1b0b53fdf9 (commit)
      from  c5f798116bf3b7954e23c7267b056ee1f5560f45 (commit)

http://gitweb.samba.org/?p=sahlberg/ctdb.git;a=shortlog;h=status-test


- Log -----------------------------------------------------------------
commit 66b22980b14601f29fe8cc64bd8f29883c7ca1c0
Author: Rusty Russell <rusty at rustcorp.com.au>
Date:   Tue Nov 24 11:24:22 2009 +1030

    eventscript: check that ctdb forced script events correct
    
    Now we're doing checking, we might as well make sure the commands from
    "ctdb eventscripts" are valid.
    
    This gets rid of the "UNKNOWN" event type.
    
    Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>

commit a6d353519932eee48f9241ad8887b692882906c9
Author: Rusty Russell <rusty at rustcorp.com.au>
Date:   Tue Nov 24 11:23:13 2009 +1030

    eventscript: check that internal script events are being invoked correctly
    
    This is not as good as a compile-time check, but at least we count the
    number of arguments are correct.
    
    Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>

commit b0648c7f08eba87ec3c9714e2525c9b621bfb4ef
Author: Rusty Russell <rusty at rustcorp.com.au>
Date:   Tue Nov 24 11:22:46 2009 +1030

    eventscript: remove call name from state->options
    
    Finally, we remove the call name (eg. "monitor" or "start") from the
    options field of the struct: it now contains only extra options.
    
    This is clearer, and mainly involves adding some %s to debug statements.
    
    Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>

commit 834c93b3e1b8f4151b8a2cd82c2dd8bacc17f66c
Author: Rusty Russell <rusty at rustcorp.com.au>
Date:   Tue Nov 24 11:19:58 2009 +1030

    eventscript: put call type into state struct.
    
    This means we can get rid of more strcmp; they can simply use the
    state->call value instead.
    
    Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>

commit 470822b329f9d3ca9bef518b56e9ce28d5fedda2
Author: Rusty Russell <rusty at rustcorp.com.au>
Date:   Tue Nov 24 11:16:49 2009 +1030

    eventscript: introduce enum for different event script calls.
    
    Rather than doing strcmp everywhere, pass an explicit enum around.  This
    also subtly documents what options are available.  The "options" arg
    is now used for extra arguments only.
    
    Unfortunately, gcc complains on empty format strings, so we make
    ctdb_event_script() take no varargs, and add ctdb_event_script_args().  We
    leave ctdb_event_script_callback() taking varargs, which means callers
    have to do "%s", "".
    
    For the moment, we have CTDB_EVENT_UNKNOWN for handling forced scripts
    from the ctdb tool.
    
    Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>

commit fe8027309c1f7b987cd368fa98f9b28741baa786
Author: Rusty Russell <rusty at rustcorp.com.au>
Date:   Tue Nov 24 11:09:46 2009 +1030

    eventscript: put timeout inside ctdb_event_script_callback_v
    
    Everyone uses the same timeout value, so just remove it from the API.
    If we ever need variable timeouts, that might as well be central too.
    
    Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>

commit e7d57d7ae678b24dab3364a348838c6a3398942c
Author: Rusty Russell <rusty at rustcorp.com.au>
Date:   Tue Nov 24 11:08:39 2009 +1030

    eventscript: typo cleanups
    
    1) ctdb_event_script_v doesn't take varargs.  ctdb_run_event_script is
       a better name, and fix comment.
    2) Fix indentation on allowed_scripts.
    3) Comment on run_eventscripts_callback is wrong; it's the callback
       for any ctdb forced event.
    
    Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>

commit 20b15de068d042b292725945927ceda1b01d07c0
Author: Rusty Russell <rusty at rustcorp.com.au>
Date:   Tue Nov 24 11:06:53 2009 +1030

    eventscript: fix bug in timeouts on forced eventscripts.  Again.
    
    In 15bc66ae801b0c69, Ronnie fixed a double-free race.  The problem was that
    ctdb_run_eventscripts() hands a context to ctdb_event_script_callback() to
    hang its data off, which gets freed in the callback.  This particularly
    hurt in ctdb_event_script_timeout.
    
    There's nothing wrong with this, but obviously we should make the callback
    call last of all.  At the time, ctdb_event_script_timeout() carefully
    extracted everything from the struct ctdb_event_script_state before
    calling ->callback.
    
    This was cleaned up in 64da4402c6ad485f (Ronnie again), and now state
    was referred to after the callback again.  But the same change introduced
    a direct use-after-free bug which caused an occasional oops.
    
    So in our last episode (eda052101728cf92) Volker fixed this, and Michael
    committed it.
    
    But we still have the double free bug which 15bc66ae801b0c69 was supposed
    to fix!  Let's try to fix this in a more permanent way, but always doing
    the callback from the destructor.  This means we need to hold the status,
    and don't send the KILL signal if ->child is set to 0.
    
    Finally, add a comment about freeing ourselves in run_eventscripts_callback
    and the structure definition.
    
    Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>

commit c715746c2f40eb9b21dbf011d16f1f1b0b53fdf9
Author: Rusty Russell <rusty at rustcorp.com.au>
Date:   Tue Nov 24 11:00:13 2009 +1030

    eventscript: clean up forked handler event code
    
    Write the whole int through the pipe, rather than quietly cutting it
    off.  Also, use -2 as the result if the read fails; -1 comes from many
    paths if the child fails before running the script.
    
    Add a comment about why we don't need to check the write.
    
    Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>

-----------------------------------------------------------------------

Summary of changes:
 include/ctdb_private.h |   18 +++-
 server/ctdb_control.c  |    2 +-
 server/ctdb_monitor.c  |    6 +-
 server/ctdb_recover.c  |   21 ++---
 server/ctdb_recoverd.c |    2 +-
 server/ctdb_takeover.c |   14 ++--
 server/eventscript.c   |  245 ++++++++++++++++++++++++++++++++----------------
 7 files changed, 198 insertions(+), 110 deletions(-)


Changeset truncated at 500 lines:

diff --git a/include/ctdb_private.h b/include/ctdb_private.h
index bee921d..4136bd5 100644
--- a/include/ctdb_private.h
+++ b/include/ctdb_private.h
@@ -858,6 +858,18 @@ enum ctdb_trans2_commit_error {
 	CTDB_TRANS2_COMMIT_SOMEFAIL=3 /* some nodes failed the commit, some allowed it */
 };
 
+/* different calls to event scripts. */
+enum ctdb_eventscript_call {
+	CTDB_EVENT_STARTUP,		/* CTDB starting up: no args. */
+	CTDB_EVENT_START_RECOVERY,	/* CTDB recovery starting: no args. */
+	CTDB_EVENT_RECOVERED,		/* CTDB recovery finished: no args. */
+	CTDB_EVENT_TAKE_IP,		/* IP taken: interface, IP address, netmask bits. */
+	CTDB_EVENT_RELEASE_IP,		/* IP released: interface, IP address, netmask bits. */
+	CTDB_EVENT_STOPPED,		/* This node is stopped: no args. */
+	CTDB_EVENT_MONITOR,		/* Please check if service is healthy: no args. */
+	CTDB_EVENT_STATUS,		/* Report service status: no args. */
+	CTDB_EVENT_SHUTDOWN		/* CTDB shutting down: no args. */
+};
 
 /* internal prototypes */
 void ctdb_set_error(struct ctdb_context *ctdb, const char *fmt, ...) PRINTF_ATTRIBUTE(2,3);
@@ -1326,12 +1338,14 @@ int32_t ctdb_control_get_tcp_tickle_list(struct ctdb_context *ctdb, TDB_DATA ind
 int32_t ctdb_control_set_tcp_tickle_list(struct ctdb_context *ctdb, TDB_DATA indata);
 
 void ctdb_takeover_client_destructor_hook(struct ctdb_client *client);
-int ctdb_event_script(struct ctdb_context *ctdb, const char *fmt, ...) PRINTF_ATTRIBUTE(2,3);
+int ctdb_event_script(struct ctdb_context *ctdb, enum ctdb_eventscript_call call);
+int ctdb_event_script_args(struct ctdb_context *ctdb, enum ctdb_eventscript_call call,
+			   const char *fmt, ...) PRINTF_ATTRIBUTE(3,4);
 int ctdb_event_script_callback(struct ctdb_context *ctdb, 
-			       struct timeval timeout,
 			       TALLOC_CTX *mem_ctx,
 			       void (*callback)(struct ctdb_context *, int, void *),
 			       void *private_data,
+			       enum ctdb_eventscript_call call,
 			       const char *fmt, ...) PRINTF_ATTRIBUTE(6,7);
 void ctdb_release_all_ips(struct ctdb_context *ctdb);
 
diff --git a/server/ctdb_control.c b/server/ctdb_control.c
index bf82e51..970e005 100644
--- a/server/ctdb_control.c
+++ b/server/ctdb_control.c
@@ -286,7 +286,7 @@ static int32_t ctdb_control_dispatch(struct ctdb_context *ctdb,
 		if (ctdb->methods != NULL) {
 			ctdb->methods->shutdown(ctdb);
 		}
-		ctdb_event_script(ctdb, "shutdown");
+		ctdb_event_script(ctdb, CTDB_EVENT_SHUTDOWN);
 		DEBUG(DEBUG_NOTICE,("Received SHUTDOWN command. Stopping CTDB daemon.\n"));
 		exit(0);
 
diff --git a/server/ctdb_monitor.c b/server/ctdb_monitor.c
index efba8f9..efb767a 100644
--- a/server/ctdb_monitor.c
+++ b/server/ctdb_monitor.c
@@ -223,9 +223,8 @@ static void ctdb_check_health(struct event_context *ev, struct timed_event *te,
 	
 	if (!ctdb->done_startup) {
 		ret = ctdb_event_script_callback(ctdb, 
-						 timeval_set(ctdb->tunable.script_timeout, 0),
 						 ctdb->monitor->monitor_context, ctdb_startup_callback, 
-						 ctdb, "startup");
+						 ctdb, CTDB_EVENT_STARTUP, "%s", "");
 	} else {
 		int i;
 		int skip_monitoring = 0;
@@ -248,9 +247,8 @@ static void ctdb_check_health(struct event_context *ev, struct timed_event *te,
 			return;
 		} else {
 			ret = ctdb_event_script_callback(ctdb, 
-					timeval_set(ctdb->tunable.script_timeout, 0),
 					ctdb->monitor->monitor_context, ctdb_health_callback,
-					ctdb, "monitor");
+					ctdb, CTDB_EVENT_MONITOR, "%s", "");
 		}
 	}
 
diff --git a/server/ctdb_recover.c b/server/ctdb_recover.c
index ccac8e6..76b0a9b 100644
--- a/server/ctdb_recover.c
+++ b/server/ctdb_recover.c
@@ -962,11 +962,9 @@ int32_t ctdb_control_end_recovery(struct ctdb_context *ctdb,
 
 	ctdb_disable_monitoring(ctdb);
 
-	ret = ctdb_event_script_callback(ctdb, 
-					 timeval_set(ctdb->tunable.script_timeout, 0),
-					 state, 
+	ret = ctdb_event_script_callback(ctdb, state,
 					 ctdb_end_recovery_callback, 
-					 state, "recovered");
+					 state, CTDB_EVENT_RECOVERED, "%s", "");
 
 	if (ret != 0) {
 		ctdb_enable_monitoring(ctdb);
@@ -1016,11 +1014,10 @@ int32_t ctdb_control_start_recovery(struct ctdb_context *ctdb,
 
 	ctdb_disable_monitoring(ctdb);
 
-	ret = ctdb_event_script_callback(ctdb, 
-					 timeval_set(ctdb->tunable.script_timeout, 0),
-					 state, 
+	ret = ctdb_event_script_callback(ctdb, state,
 					 ctdb_start_recovery_callback, 
-					 state, "startrecovery");
+					 state, CTDB_EVENT_START_RECOVERY,
+					 "%s", "");
 
 	if (ret != 0) {
 		DEBUG(DEBUG_ERR,(__location__ " Failed to start recovery\n"));
@@ -1160,7 +1157,7 @@ static void ctdb_recd_ping_timeout(struct event_context *ev, struct timed_event
 	if (ctdb->methods != NULL) {
 		ctdb->methods->shutdown(ctdb);
 	}
-	ctdb_event_script(ctdb, "shutdown");
+	ctdb_event_script(ctdb, CTDB_EVENT_SHUTDOWN);
 	DEBUG(DEBUG_ERR, ("Recovery daemon ping timeout. Daemon has been shut down.\n"));
 	exit(0);
 }
@@ -1230,11 +1227,9 @@ int32_t ctdb_control_stop_node(struct ctdb_context *ctdb, struct ctdb_req_contro
 
 	ctdb_disable_monitoring(ctdb);
 
-	ret = ctdb_event_script_callback(ctdb, 
-					 timeval_set(ctdb->tunable.script_timeout, 0),
-					 state, 
+	ret = ctdb_event_script_callback(ctdb, state,
 					 ctdb_stop_node_callback, 
-					 state, "stopped");
+					 state, CTDB_EVENT_STOPPED, "%s", "");
 
 	if (ret != 0) {
 		ctdb_enable_monitoring(ctdb);
diff --git a/server/ctdb_recoverd.c b/server/ctdb_recoverd.c
index ecdcd99..d304e24 100644
--- a/server/ctdb_recoverd.c
+++ b/server/ctdb_recoverd.c
@@ -3288,7 +3288,7 @@ static void ctdb_check_recd(struct event_context *ev, struct timed_event *te,
 		if (ctdb->methods != NULL) {
 			ctdb->methods->shutdown(ctdb);
 		}
-		ctdb_event_script(ctdb, "shutdown");
+		ctdb_event_script(ctdb, CTDB_EVENT_SHUTDOWN);
 
 		exit(10);	
 	}
diff --git a/server/ctdb_takeover.c b/server/ctdb_takeover.c
index a140578..6bc4e37 100644
--- a/server/ctdb_takeover.c
+++ b/server/ctdb_takeover.c
@@ -235,9 +235,9 @@ int32_t ctdb_control_takeover_ip(struct ctdb_context *ctdb,
 		vnn->iface));
 
 	ret = ctdb_event_script_callback(ctdb, 
-					 timeval_set(ctdb->tunable.script_timeout, 0),
 					 state, takeover_ip_callback, state,
-					 "takeip %s %s %u",
+					 CTDB_EVENT_TAKE_IP,
+					 "%s %s %u",
 					 vnn->iface, 
 					 talloc_strdup(state, ctdb_addr_to_str(&pip->addr)),
 					 vnn->public_netmask_bits);
@@ -391,9 +391,9 @@ int32_t ctdb_control_release_ip(struct ctdb_context *ctdb,
 	state->vnn   = vnn;
 
 	ret = ctdb_event_script_callback(ctdb, 
-					 timeval_set(ctdb->tunable.script_timeout, 0),
 					 state, release_ip_callback, state,
-					 "releaseip %s %s %u",
+					 CTDB_EVENT_RELEASE_IP,
+					 "%s %s %u",
 					 vnn->iface, 
 					 talloc_strdup(state, ctdb_addr_to_str(&pip->addr)),
 					 vnn->public_netmask_bits);
@@ -1354,7 +1354,7 @@ void ctdb_release_all_ips(struct ctdb_context *ctdb)
 		if (vnn->pnn == ctdb->pnn) {
 			vnn->pnn = -1;
 		}
-		ctdb_event_script(ctdb, "releaseip %s %s %u",
+		ctdb_event_script_args(ctdb, CTDB_EVENT_RELEASE_IP, "%s %s %u",
 				  vnn->iface, 
 				  talloc_strdup(ctdb, ctdb_addr_to_str(&vnn->public_address)),
 				  vnn->public_netmask_bits);
@@ -2094,9 +2094,9 @@ int32_t ctdb_control_del_public_address(struct ctdb_context *ctdb, TDB_DATA inda
 			DLIST_REMOVE(ctdb->vnn, vnn);
 
 			ret = ctdb_event_script_callback(ctdb, 
-					 timeval_set(ctdb->tunable.script_timeout, 0),
 					 mem_ctx, delete_ip_callback, mem_ctx,
-					 "releaseip %s %s %u",
+					 CTDB_EVENT_RELEASE_IP,
+					 "%s %s %u",
 					 vnn->iface, 
 					 talloc_strdup(mem_ctx, ctdb_addr_to_str(&vnn->public_address)),
 					 vnn->public_netmask_bits);
diff --git a/server/eventscript.c b/server/eventscript.c
index 1e76cd6..24e08f0 100644
--- a/server/eventscript.c
+++ b/server/eventscript.c
@@ -32,6 +32,18 @@ static struct {
 	const char *script_running;
 } child_state;
 
+static const char *call_names[] = {
+	"startup",
+	"startrecovery",
+	"recovered",
+	"takeip",
+	"releaseip",
+	"stopped",
+	"monitor",
+	"status",
+	"shutdown",
+};
+
 static void ctdb_event_script_timeout(struct event_context *ev, struct timed_event *te, struct timeval t, void *p);
 
 /*
@@ -61,9 +73,12 @@ static void sigterm(int sig)
 struct ctdb_event_script_state {
 	struct ctdb_context *ctdb;
 	pid_t child;
+	/* Warning: this can free us! */
 	void (*callback)(struct ctdb_context *, int, void *);
+	int cb_status;
 	int fd[2];
 	void *private_data;
+	enum ctdb_eventscript_call call;
 	const char *options;
 	struct timeval timeout;
 };
@@ -441,23 +456,20 @@ static struct ctdb_script_list *ctdb_get_script_list(struct ctdb_context *ctdb,
 
 
 /*
-  run the event script - varargs version
+  Actually run the event script
   this function is called and run in the context of a forked child
   which allows it to do blocking calls such as system()
  */
-static int ctdb_event_script_v(struct ctdb_context *ctdb, const char *options)
+static int ctdb_run_event_script(struct ctdb_context *ctdb,
+				 enum ctdb_eventscript_call call,
+				 const char *options)
 {
 	char *cmdstr;
 	int ret;
 	TALLOC_CTX *tmp_ctx = talloc_new(ctdb);
 	struct ctdb_script_list *scripts, *current;
-	int is_monitor = 0;
-
-	if (!strcmp(options, "monitor")) {
-		is_monitor = 1;
-	}
 
-	if (is_monitor == 1) {
+	if (call == CTDB_EVENT_MONITOR) {
 		/* This is running in the forked child process. At this stage
 		 * we want to switch from being a ctdb daemon into being a
 		 * client and connect to the real local daemon.
@@ -477,14 +489,15 @@ static int ctdb_event_script_v(struct ctdb_context *ctdb, const char *options)
 	if (ctdb->recovery_mode != CTDB_RECOVERY_NORMAL) {
 		/* we guarantee that only some specifically allowed event scripts are run
 		   while in recovery */
-		const char *allowed_scripts[] = {"startrecovery", "shutdown", "releaseip", "stopped" };
+		const enum ctdb_eventscript_call allowed_calls[] = {
+			CTDB_EVENT_START_RECOVERY, CTDB_EVENT_SHUTDOWN, CTDB_EVENT_RELEASE_IP, CTDB_EVENT_STOPPED };
 		int i;
-		for (i=0;i<ARRAY_SIZE(allowed_scripts);i++) {
-			if (strncmp(options, allowed_scripts[i], strlen(allowed_scripts[i])) == 0) break;
+		for (i=0;i<ARRAY_SIZE(allowed_calls);i++) {
+			if (call == allowed_calls[i]) break;
 		}
-		if (i == ARRAY_SIZE(allowed_scripts)) {
-			DEBUG(DEBUG_ERR,("Refusing to run event scripts with option '%s' while in recovery\n",
-				 options));
+		if (i == ARRAY_SIZE(allowed_calls)) {
+			DEBUG(DEBUG_ERR,("Refusing to run event scripts call '%s' while in recovery\n",
+				 call_names[call]));
 			talloc_free(tmp_ctx);
 			return -1;
 		}
@@ -514,14 +527,14 @@ static int ctdb_event_script_v(struct ctdb_context *ctdb, const char *options)
 		   status of the event asynchronously.
 		*/
 		if ((ctdb->tunable.use_status_events_for_monitoring != 0) 
-		&& (!strcmp(options, "monitor"))) {
+		    && call == CTDB_EVENT_MONITOR) {
 			cmdstr = talloc_asprintf(tmp_ctx, "%s/%s %s", 
 					ctdb->event_script_dir,
 					current->name, "status");
 		} else {
-			cmdstr = talloc_asprintf(tmp_ctx, "%s/%s %s", 
+			cmdstr = talloc_asprintf(tmp_ctx, "%s/%s %s %s",
 					ctdb->event_script_dir,
-					current->name, options);
+					current->name, call_names[call], options);
 		}
 		CTDB_NO_MEMORY(ctdb, cmdstr);
 
@@ -530,7 +543,7 @@ static int ctdb_event_script_v(struct ctdb_context *ctdb, const char *options)
 		child_state.start = timeval_current();
 		child_state.script_running = cmdstr;
 
-		if (is_monitor == 1) {
+		if (call == CTDB_EVENT_MONITOR) {
 			if (ctdb_ctrl_event_script_start(ctdb, current->name) != 0) {
 				DEBUG(DEBUG_ERR,(__location__ " Failed to start event script monitoring\n"));
 				talloc_free(tmp_ctx);
@@ -563,7 +576,7 @@ static int ctdb_event_script_v(struct ctdb_context *ctdb, const char *options)
 			DEBUG(DEBUG_ERR,("Script %s returned status 127. Someone just deleted it?\n", cmdstr));
 		}
  
-		if (is_monitor == 1) {
+		if (call == CTDB_EVENT_MONITOR) {
 			if (ctdb_ctrl_event_script_stop(ctdb, ret) != 0) {
 				DEBUG(DEBUG_ERR,(__location__ " Failed to stop event script monitoring\n"));
 				talloc_free(tmp_ctx);
@@ -574,7 +587,7 @@ static int ctdb_event_script_v(struct ctdb_context *ctdb, const char *options)
 		/* return an error if the script failed */
 		if (ret != 0) {
 			DEBUG(DEBUG_ERR,("Event script %s failed with error %d\n", cmdstr, ret));
-			if (is_monitor == 1) {
+			if (call == CTDB_EVENT_MONITOR) {
 				if (ctdb_ctrl_event_script_finished(ctdb) != 0) {
 					DEBUG(DEBUG_ERR,(__location__ " Failed to finish event script monitoring\n"));
 					talloc_free(tmp_ctx);
@@ -590,7 +603,7 @@ static int ctdb_event_script_v(struct ctdb_context *ctdb, const char *options)
 	child_state.start = timeval_current();
 	child_state.script_running = "finished";
 	
-	if (is_monitor == 1) {
+	if (call == CTDB_EVENT_MONITOR) {
 		if (ctdb_ctrl_event_script_finished(ctdb) != 0) {
 			DEBUG(DEBUG_ERR,(__location__ " Failed to finish event script monitoring\n"));
 			talloc_free(tmp_ctx);
@@ -609,20 +622,17 @@ static void ctdb_event_script_handler(struct event_context *ev, struct fd_event
 	struct ctdb_event_script_state *state = 
 		talloc_get_type(p, struct ctdb_event_script_state);
 	struct ctdb_context *ctdb = state->ctdb;
-	signed char rt = 0;
-
-	read(state->fd[0], &rt, sizeof(rt));
 
-	DEBUG(DEBUG_INFO,(__location__ " Eventscript %s finished with state %d\n", state->options, rt));
-
-	if (state->callback) {
-		state->callback(ctdb, rt, state->private_data);
-		state->callback = NULL;
+	if (read(state->fd[0], &state->cb_status, sizeof(state->cb_status)) !=
+	    sizeof(state->cb_status)) {
+		state->cb_status = -2;
 	}
 
-	ctdb->event_script_timeouts = 0;
+	DEBUG(DEBUG_INFO,(__location__ " Eventscript %s %s finished with state %d\n",
+			  call_names[state->call], state->options, state->cb_status));
 
-	talloc_set_destructor(state, NULL);
+	state->child = 0;
+	ctdb->event_script_timeouts = 0;
 	talloc_free(state);
 }
 
@@ -646,23 +656,19 @@ static void ctdb_event_script_timeout(struct event_context *ev, struct timed_eve
 				      struct timeval t, void *p)
 {
 	struct ctdb_event_script_state *state = talloc_get_type(p, struct ctdb_event_script_state);
-	void *private_data = state->private_data;
 	struct ctdb_context *ctdb = state->ctdb;
 
-	DEBUG(DEBUG_ERR,("Event script timed out : %s count : %u  pid : %d\n", state->options, ctdb->event_script_timeouts, state->child));
+	DEBUG(DEBUG_ERR,("Event script timed out : %s %s count : %u  pid : %d\n",
+			 call_names[state->call], state->options, ctdb->event_script_timeouts, state->child));
 
 	if (kill(state->child, 0) != 0) {
 		DEBUG(DEBUG_ERR,("Event script child process already dead, errno %s(%d)\n", strerror(errno), errno));
-		if (state->callback) {
-			state->callback(ctdb, 0, private_data);
-			state->callback = NULL;
-		}
-		talloc_set_destructor(state, NULL);
+		state->child = 0;
 		talloc_free(state);
 		return;
 	}
 
-	if (!strcmp(state->options, "monitor")) {
+	if (state->call == CTDB_EVENT_MONITOR) {
 		/* if it is a monitor event, we allow it to "hang" a few times
 		   before we declare it a failure and ban ourself (and make
 		   ourself unhealthy)
@@ -673,22 +679,13 @@ static void ctdb_event_script_timeout(struct event_context *ev, struct timed_eve
 
 		if (ctdb->event_script_timeouts > ctdb->tunable.script_ban_count) {
 			DEBUG(DEBUG_ERR, ("Maximum timeout count %u reached for eventscript. Making node unhealthy\n", ctdb->tunable.script_ban_count));
-			if (state->callback) {
-				state->callback(ctdb, -ETIME, private_data);
-				state->callback = NULL;
-			}
+			state->cb_status = -ETIME;
 		} else {
-			if (state->callback) {
-			  	state->callback(ctdb, 0, private_data);
-				state->callback = NULL;
-			}
+			state->cb_status = 0;
 		}
-	} else if (!strcmp(state->options, "startup")) {
+	} else if (state->call == CTDB_EVENT_STARTUP) {
 		DEBUG(DEBUG_ERR, (__location__ " eventscript for startup event timedout.\n"));
-		if (state->callback) {
-			state->callback(ctdb, -1, private_data);
-			state->callback = NULL;
-		}
+		state->cb_status = -1;
 	} else {
 		/* if it is not a monitor or a startup event we ban ourself
 		   immediately
@@ -697,13 +694,10 @@ static void ctdb_event_script_timeout(struct event_context *ev, struct timed_eve
 
 		ctdb_ban_self(ctdb, ctdb->tunable.recovery_ban_period);
 
-		if (state->callback) {
-			state->callback(ctdb, -1, private_data);
-			state->callback = NULL;
-		}
+		state->cb_status = -1;
 	}
 
-	if (!strcmp(state->options, "monitor") || !strcmp(state->options, "status")) {
+	if (state->call == CTDB_EVENT_MONITOR || state->call == CTDB_EVENT_STATUS) {
 		struct ctdb_monitor_script_status *script;
 
 		if (ctdb->current_monitor_status_ctx == NULL) {
@@ -728,39 +722,78 @@ static void ctdb_event_script_timeout(struct event_context *ev, struct timed_eve
 }
 
 /*
-  destroy a running event script
+  destroy an event script: kill it if ->child != 0.
  */
 static int event_script_destructor(struct ctdb_event_script_state *state)
 {
-	DEBUG(DEBUG_ERR,(__location__ " Sending SIGTERM to child pid:%d\n", state->child));
+	if (state->child) {
+		DEBUG(DEBUG_ERR,(__location__ " Sending SIGTERM to child pid:%d\n", state->child));
 
-	if (state->callback) {
-		state->callback(state->ctdb, 0, state->private_data);
-		state->callback = NULL;
+		if (kill(state->child, SIGTERM) != 0) {
+			DEBUG(DEBUG_ERR,("Failed to kill child process for eventscript, errno %s(%d)\n", strerror(errno), errno));
+		}
 	}
 
-	if (kill(state->child, SIGTERM) != 0) {
-		DEBUG(DEBUG_ERR,("Failed to kill child process for eventscript, errno %s(%d)\n", strerror(errno), errno));
+	/* This is allowed to free us; talloc will prevent double free anyway,
+	 * but beware if you call this outside the destructor! */
+	if (state->callback) {
+		state->callback(state->ctdb, state->cb_status, state->private_data);
 	}
 
 	return 0;
 }
 
+static unsigned int count_words(const char *options)
+{
+	unsigned int words = 0;
+
+	options += strspn(options, " \t");
+	while (*options) {
+		words++;
+		options += strcspn(options, " \t");
+		options += strspn(options, " \t");
+	}
+	return words;
+}
+
+static bool check_options(enum ctdb_eventscript_call call, const char *options)
+{
+	switch (call) {
+	/* These all take no arguments. */
+	case CTDB_EVENT_STARTUP:
+	case CTDB_EVENT_START_RECOVERY:
+	case CTDB_EVENT_RECOVERED:
+	case CTDB_EVENT_STOPPED:
+	case CTDB_EVENT_MONITOR:
+	case CTDB_EVENT_STATUS:
+	case CTDB_EVENT_SHUTDOWN:
+		return count_words(options) == 0;
+
+	case CTDB_EVENT_TAKE_IP: /* interface, IP address, netmask bits. */
+	case CTDB_EVENT_RELEASE_IP:
+		return count_words(options) == 3;
+
+	default:


-- 
CTDB repository