[SCM] Samba Shared Repository - branch master updated

Sat May 9 22:11:04 MDT 2015

The branch, master has been updated
       via  c75f297 ctdb-daemon: Fix typo in debug message
       via  d30b529 ctdb-daemon: Initialise eventscript status earlier
       via  070964d ctdb-daemon: Make ctdb_event_script_args() terminate if no scripts
       via  6808b0a ctdb-daemon: Drop interface monitoring
       via  a2c64a4 ctdb-common: Reimplement external tracing using ctdb_set_helper()
       via  c927ec9 ctdb-scripts: Drop update of public address configuration from config.tdb
       via  7ee57b8 ctdb-recoverd: Short circuit takeover run if no nodes are RUNNING
       via  91f99dd ctdb-recoverd: Remove redundant condition when checking recovery lock
       via  a45ab7d ctdb-recoverd: Simplify using TALLOC_FREE()
       via  2c72c9d ctdb-recoverd: Drop redundant condition in election handler
       via  c75fdf2 ctdb-recoverd: Remove unused memory context variable
       via  e6f99fc ctdb-daemon: Broadcast IP reallocation request from monitor code
       via  4b4ba77f ctdb-recoverd: Replace unnecessary use of ctdb->recovery_master
       via  6415edf ctdb-recoverd: Rename some local variables to avoid conflict with convention
       via  36fc620 ctdb_recoverd: Move num_lmasters calculation to near where it is used
       via  1fd2d38 ctdb-recoverd: Make num_lmasters a local variable
       via  385e932 ctdb-recoverd: Remove unused struct members num_active and num_connected
       via  7fb84fc ctdb-tests: Test stub for ctdb_get_capabilities()
       via  eb206f5 ctdb-daemon: Remove unused capabilities field from struct ctdb_node
       via  c3d6678 ctdb-recoverd: Use capabilities API
       via  7a42bca ctdb-client: Add API for retrieving and checking capabilities
      from  fe93f7d vfs_fruit: comment fix: the options are documented

https://git.samba.org/?p=samba.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit c75f297ac3bd2e734ce0b2e794f481e81963518b
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Apr 13 07:52:04 2015 +1000

    ctdb-daemon: Fix typo in debug message
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    
    Autobuild-User(master): Amitay Isaacs <amitay at samba.org>
    Autobuild-Date(master): Sun May 10 06:10:21 CEST 2015 on sn-devel-104

commit d30b529cccf68e97b83287f993a368af066161c9
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Mar 18 20:46:46 2015 +1100

    ctdb-daemon: Initialise eventscript status earlier
    
    Don't initialise it after ctdb_event_script_callback_v() may have
    short-circuited.  This can stop ctdb_event_script_args() from ever
    terminating.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 070964dbcfd23e319eb8a5a9912ddd61ac99cf8c
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Mar 18 20:27:45 2015 +1100

    ctdb-daemon: Make ctdb_event_script_args() terminate if no scripts
    
    status.done is never set to true unless event_script_callback() is
    invoked.  The short-circuit in ctdb_event_script_callback_v() means
    that this doesn't happen.  CTDB can't work very well without 00.ctdb
    (for tunable initialisation and the like) but it shouldn't get stuck.
    
    So call the callback when there are no scripts in
    event_script_callback().
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
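
The hang described in the two eventscript commits above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the CTDB code: struct event_status, event_done(), run_event() and run_event_and_wait() are hypothetical names, and no real scripts are executed. It shows the two points the messages make: the status is initialised before the event is started, and the completion callback is invoked even when there are no scripts to run.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the status that the waiting caller polls. */
struct event_status {
	bool done;
	int result;
};

typedef void (*event_done_fn)(int result, void *private_data);

/* Completion callback: records the result and releases the waiter. */
static void event_done(int result, void *private_data)
{
	struct event_status *status = private_data;

	assert(status != NULL);
	status->result = result;
	status->done = true;
}

/* Pretend to run num_scripts scripts.  With no scripts the function
 * short-circuits, but it still reports completion via the callback. */
static int run_event(size_t num_scripts, event_done_fn callback,
		     void *private_data)
{
	if (num_scripts == 0) {
		callback(0, private_data);
		return 0;
	}

	/* A real implementation would fork and run each script here. */
	callback(0, private_data);
	return 0;
}

/* Synchronous wrapper: initialise the status before starting the event
 * so that an early callback is not overwritten afterwards. */
static int run_event_and_wait(size_t num_scripts)
{
	struct event_status status = { .done = false, .result = -1 };

	if (run_event(num_scripts, event_done, &status) != 0) {
		return -1;
	}

	/* A real caller would spin its event loop until status.done is set;
	 * this sketch only reports whether completion was signalled. */
	return status.done ? status.result : -1;
}
```

If the zero-script branch returned without calling the callback, status.done would stay false and a caller looping on it would never finish.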

commit 6808b0aa6a40e76e22070d8bde805f88a4bc899c
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Mar 17 21:42:23 2015 +1100

    ctdb-daemon: Drop interface monitoring
    
    This is already done by 10.interface, where the monitor event fails
    when an interface is missing.  The in-daemon interface checking adds
    no value.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit a2c64a4810df6d70ec65a1fc773a5175a298788d
Author: Martin Schwenke <martin at meltin.net>
Date:   Sat Mar 7 16:15:01 2015 +1100

    ctdb-common: Reimplement external tracing using ctdb_set_helper()
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit c927ec928cce1ee4cf9ffcf4aa3d6c8ef6ad4144
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Dec 30 16:17:19 2014 +1100

    ctdb-scripts: Drop update of public address configuration from config.tdb
    
    This isn't used or documented anywhere.
    
    2 differing points of view:
    
    * This is a very good idea but it should probably be generalised to
      cover more configuration items.  This would end up like the Samba
      registry configuration and would use a tool to support setting
      configuration values.
    
    * If people really want to update configuration while a node is down
      then they should fix the configuration before bringing up that node.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 7ee57b8d7c0b882227dab1f83187e44dd4639ad3
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Dec 19 15:08:40 2014 +1100

    ctdb-recoverd: Short circuit takeover run if no nodes are RUNNING
    
    If all nodes are still in, say, FIRST_RECOVERY runstate, then the logs
    contain unfortunate noise like:
    
      recoverd:Failed to find node to cover ip 10.0.2.131
    
    This avoids that by adding an early exit that skips
    takeover_run_core() when no nodes are in the CTDB_RUNSTATE_RUNNING
    runstate.
    
    To support this add the runstate to the ipflags structure.  There are
    clearly other ways of hacking this but this seems the simplest.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
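
The early exit described in the commit above might look roughly like the following standalone sketch. All names here (enum node_runstate, struct node_ipflags, any_node_running(), takeover_run()) are hypothetical simplifications chosen for illustration; the actual structure layout and the allocation algorithm are not reproduced.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, reduced set of runstates. */
enum node_runstate {
	RUNSTATE_FIRST_RECOVERY,
	RUNSTATE_STARTUP,
	RUNSTATE_RUNNING
};

/* Hypothetical per-node flags, extended with the node's runstate. */
struct node_ipflags {
	bool no_ips;
	enum node_runstate runstate;
};

/* True if at least one node has reached the RUNNING runstate. */
static bool any_node_running(const struct node_ipflags *ipflags,
			     size_t num_nodes)
{
	size_t i;

	assert(ipflags != NULL || num_nodes == 0);

	for (i = 0; i < num_nodes; i++) {
		if (ipflags[i].runstate == RUNSTATE_RUNNING) {
			return true;
		}
	}

	return false;
}

/* Returns true if the allocation step would run, false if it is
 * skipped because no node can host addresses yet. */
static bool takeover_run(const struct node_ipflags *ipflags,
			 size_t num_nodes)
{
	if (!any_node_running(ipflags, num_nodes)) {
		/* Early exit: nothing to allocate, nothing to log. */
		return false;
	}

	/* The IP allocation algorithm would run here. */
	return true;
}
```

Keeping the runstate next to the other per-node flags means the check needs no extra lookups, which matches the message's remark that this seemed the simplest option.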

commit 91f99ddfb3ba1cfb98223863ac3a474a5fbe4ea1
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Mar 31 13:59:49 2015 +1100

    ctdb-recoverd: Remove redundant condition when checking recovery lock
    
    It isn't possible to hold the recovery lock without having a lock file
    set.
    
    This is part of a goal to generalise the recovery lock mechanism to
    just use a helper program, which may use a lock file or may use
    something else.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit a45ab7d1fe148e9f03b4a1e4b37388e50a91cf0c
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Mar 31 13:59:02 2015 +1100

    ctdb-recoverd: Simplify using TALLOC_FREE()
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 2c72c9de48545d31878413b9ab656916679c1f14
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Mar 30 22:01:52 2015 +1100

    ctdb-recoverd: Drop redundant condition in election handler
    
    Election packets from the current node are ignored at the beginning of
    the function, so this does not need to be checked.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit c75fdf208fdb0c68fca6298abc87705c9cd8137a
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Mar 30 21:52:45 2015 +1100

    ctdb-recoverd: Remove unused memory context variable
    
    It is set and memory is allocated, but it is never used.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit e6f99fcba3cd91fbf664a23975db058849e39f6a
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Mar 30 20:51:51 2015 +1100

    ctdb-daemon: Broadcast IP reallocation request from monitor code
    
    No need to just send it to the recovery master.
    
    This reduces the need for main daemon code to know which node is the
    recovery master.  The end goal is for the main daemon to not need to
    know which node is the recovery master - this information would be
    stored in the recovery daemon (and subsequently a separate cluster
    management daemon).
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 4b4ba77f4a773899f41fa3b7c5e98f8e608568d6
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Mar 31 14:03:43 2015 +1100

    ctdb-recoverd: Replace unnecessary use of ctdb->recovery_master
    
    Databases are only pulled by the recovery master, so it can compare
    with current node PNN.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 6415edfa26fde8d25069e9ee867b0c76e90e9188
Author: Martin Schwenke <martin at meltin.net>
Date:   Sun Mar 29 19:20:55 2015 +1100

    ctdb-recoverd: Rename some local variables to avoid conflict with convention
    
    rec is always a (struct ctdb_recoverd *)
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 36fc620898d1cf6f696d7b7b6cb13c6a8d1dc636
Author: Martin Schwenke <martin at meltin.net>
Date:   Sun Mar 29 20:00:17 2015 +1100

    ctdb_recoverd: Move num_lmasters calculation to near where it is used
    
    This is only needed if this node is the recovery master.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 1fd2d3886c41b0dd12cd1bebc3add75bd71b05bc
Author: Martin Schwenke <martin at meltin.net>
Date:   Sun Mar 29 17:49:02 2015 +1100

    ctdb-recoverd: Make num_lmasters a local variable
    
    It isn't used anywhere else and is always re-initialised to 0.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 385e9326eaff9ee32d02101bc8aafc69e334c4d3
Author: Martin Schwenke <martin at meltin.net>
Date:   Sun Mar 29 17:28:57 2015 +1100

    ctdb-recoverd: Remove unused struct members num_active and num_connected
    
    They are initialised and updated but the values are never used.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 7fb84fc19bf390cc25179f24845b93362968de35
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Feb 24 16:11:17 2015 +1100

    ctdb-tests: Test stub for ctdb_get_capabilities()
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit eb206f5d30014f8f74a5db9930e372a0c1a83822
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jul 31 15:28:52 2014 +1000

    ctdb-daemon: Remove unused capabilities field from struct ctdb_node
    
    Update the ctdb tool test stub code to cope.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit c3d6678dbc3e26dfb7f4714c9d171c2e82d9af7c
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jul 31 15:26:03 2014 +1000

    ctdb-recoverd: Use capabilities API
    
    Simplify update_capabilities() using the capabilities API and store
    the capabilities in new field rec->caps rather than scattered around
    ctdb->nodes.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 7a42bcaeaee7f79eea4a7749479f7fadd3e1ae0a
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jul 31 15:06:19 2014 +1000

    ctdb-client: Add API for retrieving and checking capabilities
    
    ctdb_get_capabilities() gets capabilities from all connected nodes
    into an array.  ctdb_get_node_capabilities() gets capabilities for a
    particular node from the array.  ctdb_node_has_capabilities() returns
    true if the given node has all of the given capabilities.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
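
The lookup and the "has all capabilities" test described in the commit above can be tried in isolation with a sketch like this one. It is a simplified model, not the CTDB API: struct node_caps, get_node_caps() and node_has_caps() are hypothetical names, the array length is passed explicitly instead of being derived with talloc_array_length(), and the CAP_* values are placeholders chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder capability bits, assumed for this example only. */
#define CAP_RECMASTER 0x1u
#define CAP_LMASTER   0x2u

/* One entry per node, indexed by PNN. */
struct node_caps {
	bool retrieved;        /* false if the node did not answer */
	uint32_t capabilities; /* bitmask, valid only if retrieved */
};

/* Return a pointer to the capabilities of node pnn, or NULL if the
 * PNN is out of range or nothing was retrieved for that node. */
static const uint32_t *get_node_caps(const struct node_caps *caps,
				     size_t len, uint32_t pnn)
{
	assert(caps != NULL || len == 0);

	if (pnn < len && caps[pnn].retrieved) {
		return &caps[pnn].capabilities;
	}

	return NULL;
}

/* True only if every bit in required is set for node pnn. */
static bool node_has_caps(const struct node_caps *caps, size_t len,
			  uint32_t pnn, uint32_t required)
{
	const uint32_t *capp = get_node_caps(caps, len, pnn);

	return (capp != NULL) && ((*capp & required) == required);
}
```

Comparing the masked value against the full required mask, rather than testing for non-zero, is what makes the check mean "all of" instead of "any of" the requested capabilities.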

-----------------------------------------------------------------------

Summary of changes:
 ctdb/client/ctdb_client.c        |  78 +++++++++++++++++
 ctdb/common/ctdb_util.c          |   8 +-
 ctdb/config/events.d/00.ctdb     |  22 -----
 ctdb/include/ctdb_client.h       |  22 +++++
 ctdb/include/ctdb_private.h      |   5 --
 ctdb/server/ctdb_daemon.c        |   3 -
 ctdb/server/ctdb_ltdb_server.c   |   2 +-
 ctdb/server/ctdb_monitor.c       |  11 ++-
 ctdb/server/ctdb_recoverd.c      | 177 ++++++++++++++++++---------------------
 ctdb/server/ctdb_takeover.c      |  62 ++++----------
 ctdb/server/eventscript.c        |   9 +-
 ctdb/tests/src/ctdb_test.c       |   8 ++
 ctdb/tests/src/ctdb_test_stubs.c |  28 ++++++-
 13 files changed, 252 insertions(+), 183 deletions(-)


Changeset truncated at 500 lines:

diff --git a/ctdb/client/ctdb_client.c b/ctdb/client/ctdb_client.c
index 6e18269..7526362 100644
--- a/ctdb/client/ctdb_client.c
+++ b/ctdb/client/ctdb_client.c
@@ -3825,6 +3825,84 @@ int ctdb_ctrl_getcapabilities(struct ctdb_context *ctdb, struct timeval timeout,
 	return ret;
 }
 
+static void get_capabilities_callback(struct ctdb_context *ctdb,
+				      uint32_t node_pnn, int32_t res,
+				      TDB_DATA outdata, void *callback_data)
+{
+	struct ctdb_node_capabilities *caps =
+		talloc_get_type(callback_data,
+				struct ctdb_node_capabilities);
+
+	if ( (outdata.dsize != sizeof(uint32_t)) || (outdata.dptr == NULL) ) {
+		DEBUG(DEBUG_ERR, (__location__ " Invalid length/pointer for getcap callback : %u %p\n",  (unsigned)outdata.dsize, outdata.dptr));
+		return;
+	}
+
+	if (node_pnn >= talloc_array_length(caps)) {
+		DEBUG(DEBUG_ERR,
+		      (__location__ " unexpected PNN %u\n", node_pnn));
+		return;
+	}
+
+	caps[node_pnn].retrieved = true;
+	caps[node_pnn].capabilities = *((uint32_t *)outdata.dptr);
+}
+
+struct ctdb_node_capabilities *
+ctdb_get_capabilities(struct ctdb_context *ctdb,
+		      TALLOC_CTX *mem_ctx,
+		      struct timeval timeout,
+		      struct ctdb_node_map *nodemap)
+{
+	uint32_t *nodes;
+	uint32_t i, res;
+	struct ctdb_node_capabilities *ret;
+
+	nodes = list_of_connected_nodes(ctdb, nodemap, mem_ctx, true);
+
+	ret = talloc_array(mem_ctx, struct ctdb_node_capabilities,
+			   nodemap->num);
+	CTDB_NO_MEMORY_NULL(ctdb, ret);
+	/* Prepopulate the expected PNNs */
+	for (i = 0; i < talloc_array_length(ret); i++) {
+		ret[i].retrieved = false;
+	}
+
+	res = ctdb_client_async_control(ctdb, CTDB_CONTROL_GET_CAPABILITIES,
+					nodes, 0, timeout,
+					false, tdb_null,
+					get_capabilities_callback, NULL,
+					ret);
+	if (res != 0) {
+		DEBUG(DEBUG_ERR,
+		      (__location__ " Failed to read node capabilities.\n"));
+		TALLOC_FREE(ret);
+	}
+
+	return ret;
+}
+
+uint32_t *
+ctdb_get_node_capabilities(struct ctdb_node_capabilities *caps,
+			   uint32_t pnn)
+{
+	if (pnn < talloc_array_length(caps) && caps[pnn].retrieved) {
+		return &caps[pnn].capabilities;
+	}
+
+	return NULL;
+}
+
+bool ctdb_node_has_capabilities(struct ctdb_node_capabilities *caps,
+				uint32_t pnn,
+				uint32_t capabilities_required)
+{
+	uint32_t *capp = ctdb_get_node_capabilities(caps, pnn);
+	return (capp != NULL) &&
+		((*capp & capabilities_required) == capabilities_required);
+}
+
+
 struct server_id {
 	uint64_t pid;
 	uint32_t task_id;
diff --git a/ctdb/common/ctdb_util.c b/ctdb/common/ctdb_util.c
index 8e2e430..5d63c27 100644
--- a/ctdb/common/ctdb_util.c
+++ b/ctdb/common/ctdb_util.c
@@ -134,14 +134,16 @@ bool ctdb_set_helper(const char *type, char *helper, size_t size,
 void ctdb_external_trace(void)
 {
 	int ret;
-	const char * t = getenv("CTDB_EXTERNAL_TRACE");
+	static char external_trace[PATH_MAX+1] = "";
 	char * cmd;
 
-	if (t == NULL) {
+	if (!ctdb_set_helper("external trace handler",
+			     external_trace, sizeof(external_trace),
+			     "CTDB_EXTERNAL_TRACE", NULL, NULL)) {
 		return;
 	}
 
-	cmd = talloc_asprintf(NULL, "%s %lu", t, (unsigned long) getpid());
+	cmd = talloc_asprintf(NULL, "%s %lu", external_trace, (unsigned long) getpid());
 	DEBUG(DEBUG_WARNING,("begin external trace: %s\n", cmd));
 	ret = system(cmd);
 	if (ret == -1) {
diff --git a/ctdb/config/events.d/00.ctdb b/ctdb/config/events.d/00.ctdb
index d8096ee..5e8af4c 100755
--- a/ctdb/config/events.d/00.ctdb
+++ b/ctdb/config/events.d/00.ctdb
@@ -100,27 +100,6 @@ EOF
     done
 }
 
-update_config_from_tdb() {
-
-    # Pull optional ctdb configuration data out of config.tdb
-    ctdb_get_pnn
-    _key="public_addresses:node#${pnn}"
-    _t="$service_state_dir/public_addresses"
-    rm -f "$_t"
-
-    if ctdb pfetch config.tdb "$_key" "$_t" 2>/dev/null && \
-	[ -s "$_t" -a -n "$CTDB_PUBLIC_ADDRESSES"] && \
-	! cmp -s "$_t" "$CTDB_PUBLIC_ADDRESSES" ; then
-
-	echo "CTDB public address configuration has changed."
-	echo "Extracting new configuration from database."
-	diff "$_t" "$CTDB_PUBLIC_ADDRESSES"
-	cp "$_t" "$CTDB_PUBLIC_ADDRESSES"
-	echo "Restarting CTDB"
-	service ctdb restart &
-    fi
-}
-
 set_ctdb_variables ()
 {
     # set any tunables from the config file
@@ -211,7 +190,6 @@ case "$1" in
 
     startup)
 	ctdb attach ctdb.tdb persistent
-	update_config_from_tdb &
 	;;
     monitor)
 	monitor_system_memory
diff --git a/ctdb/include/ctdb_client.h b/ctdb/include/ctdb_client.h
index 3051596..57f4917 100644
--- a/ctdb/include/ctdb_client.h
+++ b/ctdb/include/ctdb_client.h
@@ -506,6 +506,28 @@ int ctdb_ctrl_setreclock(struct ctdb_context *ctdb,
 	struct timeval timeout, uint32_t destnode,
 	const char *reclock);
 
+struct ctdb_node_capabilities {
+	bool retrieved;
+	uint32_t capabilities;
+};
+
+/* Retrieve capabilities for all connected nodes.  The length of the
+ * returned array can be calculated using talloc_array_length(). */
+struct ctdb_node_capabilities *
+ctdb_get_capabilities(struct ctdb_context *ctdb,
+		      TALLOC_CTX *mem_ctx,
+		      struct timeval timeout,
+		      struct ctdb_node_map *nodemap);
+
+/* Get capabilities for specified node, NULL if not found */
+uint32_t *
+ctdb_get_node_capabilities(struct ctdb_node_capabilities *caps,
+			   uint32_t pnn);
+
+/* True if the given node has all of the required capabilities */
+bool ctdb_node_has_capabilities(struct ctdb_node_capabilities *caps,
+				uint32_t pnn,
+				uint32_t capabilities_required);
 
 uint32_t *list_of_nodes(struct ctdb_context *ctdb,
 			struct ctdb_node_map *node_map,
diff --git a/ctdb/include/ctdb_private.h b/ctdb/include/ctdb_private.h
index 532f859..3391560 100644
--- a/ctdb/include/ctdb_private.h
+++ b/ctdb/include/ctdb_private.h
@@ -222,11 +222,6 @@ struct ctdb_node {
 	uint32_t rx_cnt;
 	uint32_t tx_cnt;
 
-	/* used to track node capabilities, is only valid/tracked inside the
-	   recovery daemon.
-	*/
-	uint32_t capabilities;
-
 	/* a list of controls pending to this node, so we can time them out quickly
 	   if the node becomes disconnected */
 	struct daemon_control_state *pending_controls;
diff --git a/ctdb/server/ctdb_daemon.c b/ctdb/server/ctdb_daemon.c
index f5876a6..1dd0a6d 100644
--- a/ctdb/server/ctdb_daemon.c
+++ b/ctdb/server/ctdb_daemon.c
@@ -1311,9 +1311,6 @@ int ctdb_start_daemon(struct ctdb_context *ctdb, bool do_fork)
 			DEBUG(DEBUG_ALERT,("Unable to setup public address list\n"));
 			exit(1);
 		}
-		if (ctdb->do_checkpublicip) {
-			ctdb_start_monitoring_interfaces(ctdb);
-		}
 	}
 
 	ctdb_initialise_vnn_map(ctdb);
diff --git a/ctdb/server/ctdb_ltdb_server.c b/ctdb/server/ctdb_ltdb_server.c
index 174a460..8cf7180 100644
--- a/ctdb/server/ctdb_ltdb_server.c
+++ b/ctdb/server/ctdb_ltdb_server.c
@@ -622,7 +622,7 @@ int ctdb_recheck_persistent_health(struct ctdb_context *ctdb)
 				   ctdb_db->unhealthy_reason));
 	}
 	DEBUG((fail!=0)?DEBUG_ALERT:DEBUG_NOTICE,
-	      ("ctdb_recheck_presistent_health: OK[%d] FAIL[%d]\n",
+	      ("ctdb_recheck_persistent_health: OK[%d] FAIL[%d]\n",
 	       ok, fail));
 
 	if (fail != 0) {
diff --git a/ctdb/server/ctdb_monitor.c b/ctdb/server/ctdb_monitor.c
index 5c0c055..c502f0e 100644
--- a/ctdb/server/ctdb_monitor.c
+++ b/ctdb/server/ctdb_monitor.c
@@ -187,11 +187,14 @@ after_change_status:
 	}
 
 	/* ask the recmaster to reallocate all addresses */
-	DEBUG(DEBUG_ERR,("Node became %s. Ask recovery master %u to perform ip reallocation\n",
-			 state_str, ctdb->recovery_master));
-	ret = ctdb_daemon_send_message(ctdb, ctdb->recovery_master, CTDB_SRVID_TAKEOVER_RUN, rddata);
+	DEBUG(DEBUG_ERR,
+	      ("Node became %s. Ask recovery master to reallocate IPs\n",
+	       state_str));
+	ret = ctdb_daemon_send_message(ctdb, CTDB_BROADCAST_CONNECTED, CTDB_SRVID_TAKEOVER_RUN, rddata);
 	if (ret != 0) {
-		DEBUG(DEBUG_ERR,(__location__ " Failed to send ip takeover run request message to %u\n", ctdb->recovery_master));
+		DEBUG(DEBUG_ERR,
+		      (__location__
+		       " Failed to send IP takeover run request\n"));
 	}
 }
 
diff --git a/ctdb/server/ctdb_recoverd.c b/ctdb/server/ctdb_recoverd.c
index 673075a..e76a0d0 100644
--- a/ctdb/server/ctdb_recoverd.c
+++ b/ctdb/server/ctdb_recoverd.c
@@ -225,9 +225,6 @@ struct ctdb_banning_state {
 struct ctdb_recoverd {
 	struct ctdb_context *ctdb;
 	uint32_t recmaster;
-	uint32_t num_active;
-	uint32_t num_lmasters;
-	uint32_t num_connected;
 	uint32_t last_culprit_node;
 	struct ctdb_node_map *nodemap;
 	struct timeval priority_time;
@@ -242,6 +239,7 @@ struct ctdb_recoverd {
 	struct ctdb_op_state *recovery;
 	struct ctdb_control_get_ifaces *ifaces;
 	uint32_t *force_rebalance_nodes;
+	struct ctdb_node_capabilities *caps;
 };
 
 #define CONTROL_TIMEOUT() timeval_current_ofs(ctdb->tunable.recover_timeout, 0)
@@ -406,44 +404,43 @@ static int run_startrecovery_eventscript(struct ctdb_recoverd *rec, struct ctdb_
 	return 0;
 }
 
-static void async_getcap_callback(struct ctdb_context *ctdb, uint32_t node_pnn, int32_t res, TDB_DATA outdata, void *callback_data)
-{
-	if ( (outdata.dsize != sizeof(uint32_t)) || (outdata.dptr == NULL) ) {
-		DEBUG(DEBUG_ERR, (__location__ " Invalid length/pointer for getcap callback : %u %p\n",  (unsigned)outdata.dsize, outdata.dptr));
-		return;
-	}
-	if (node_pnn < ctdb->num_nodes) {
-		ctdb->nodes[node_pnn]->capabilities = *((uint32_t *)outdata.dptr);
-	}
-
-	if (node_pnn == ctdb->pnn) {
-		ctdb->capabilities = ctdb->nodes[node_pnn]->capabilities;
-	}
-}
-
 /*
   update the node capabilities for all connected nodes
  */
-static int update_capabilities(struct ctdb_context *ctdb, struct ctdb_node_map *nodemap)
+static int update_capabilities(struct ctdb_recoverd *rec,
+			       struct ctdb_node_map *nodemap)
 {
-	uint32_t *nodes;
+	uint32_t *capp;
 	TALLOC_CTX *tmp_ctx;
+	struct ctdb_node_capabilities *caps;
+	struct ctdb_context *ctdb = rec->ctdb;
 
-	tmp_ctx = talloc_new(ctdb);
+	tmp_ctx = talloc_new(rec);
 	CTDB_NO_MEMORY(ctdb, tmp_ctx);
 
-	nodes = list_of_connected_nodes(ctdb, nodemap, tmp_ctx, true);
-	if (ctdb_client_async_control(ctdb, CTDB_CONTROL_GET_CAPABILITIES,
-					nodes, 0,
-					CONTROL_TIMEOUT(),
-					false, tdb_null,
-					async_getcap_callback, NULL,
-					NULL) != 0) {
-		DEBUG(DEBUG_ERR, (__location__ " Failed to read node capabilities.\n"));
+	caps = ctdb_get_capabilities(ctdb, tmp_ctx,
+				     CONTROL_TIMEOUT(), nodemap);
+
+	if (caps == NULL) {
+		DEBUG(DEBUG_ERR,
+		      (__location__ " Failed to get node capabilities\n"));
 		talloc_free(tmp_ctx);
 		return -1;
 	}
 
+	capp = ctdb_get_node_capabilities(caps, ctdb_get_pnn(ctdb));
+	if (capp == NULL) {
+		DEBUG(DEBUG_ERR,
+		      (__location__
+		       " Capabilities don't include current node.\n"));
+		talloc_free(tmp_ctx);
+		return -1;
+	}
+	ctdb->capabilities = *capp;
+
+	TALLOC_FREE(rec->caps);
+	rec->caps = talloc_steal(rec, caps);
+
 	talloc_free(tmp_ctx);
 	return 0;
 }
@@ -716,13 +713,13 @@ static int create_missing_local_databases(struct ctdb_context *ctdb, struct ctdb
 /*
   pull the remote database contents from one node into the recdb
  */
-static int pull_one_remote_database(struct ctdb_context *ctdb, uint32_t srcnode, 
+static int pull_one_remote_database(struct ctdb_context *ctdb, uint32_t srcnode,
 				    struct tdb_wrap *recdb, uint32_t dbid)
 {
 	int ret;
 	TDB_DATA outdata;
 	struct ctdb_marshall_buffer *reply;
-	struct ctdb_rec_data *rec;
+	struct ctdb_rec_data *recdata;
 	int i;
 	TALLOC_CTX *tmp_ctx = talloc_new(recdb);
 
@@ -741,21 +738,21 @@ static int pull_one_remote_database(struct ctdb_context *ctdb, uint32_t srcnode,
 		talloc_free(tmp_ctx);
 		return -1;
 	}
-	
-	rec = (struct ctdb_rec_data *)&reply->data[0];
-	
+
+	recdata = (struct ctdb_rec_data *)&reply->data[0];
+
 	for (i=0;
 	     i<reply->count;
-	     rec = (struct ctdb_rec_data *)(rec->length + (uint8_t *)rec), i++) {
+	     recdata = (struct ctdb_rec_data *)(recdata->length + (uint8_t *)recdata), i++) {
 		TDB_DATA key, data;
 		struct ctdb_ltdb_header *hdr;
 		TDB_DATA existing;
-		
-		key.dptr = &rec->data[0];
-		key.dsize = rec->keylen;
-		data.dptr = &rec->data[key.dsize];
-		data.dsize = rec->datalen;
-		
+
+		key.dptr = &recdata->data[0];
+		key.dsize = recdata->keylen;
+		data.dptr = &recdata->data[key.dsize];
+		data.dsize = recdata->datalen;
+
 		hdr = (struct ctdb_ltdb_header *)data.dptr;
 
 		if (data.dsize < sizeof(struct ctdb_ltdb_header)) {
@@ -766,11 +763,11 @@ static int pull_one_remote_database(struct ctdb_context *ctdb, uint32_t srcnode,
 
 		/* fetch the existing record, if any */
 		existing = tdb_fetch(recdb->tdb, key);
-		
+
 		if (existing.dptr != NULL) {
 			struct ctdb_ltdb_header header;
 			if (existing.dsize < sizeof(struct ctdb_ltdb_header)) {
-				DEBUG(DEBUG_CRIT,(__location__ " Bad record size %u from node %u\n", 
+				DEBUG(DEBUG_CRIT,(__location__ " Bad record size %u from node %u\n",
 					 (unsigned)existing.dsize, srcnode));
 				free(existing.dptr);
 				talloc_free(tmp_ctx);
@@ -779,15 +776,16 @@ static int pull_one_remote_database(struct ctdb_context *ctdb, uint32_t srcnode,
 			header = *(struct ctdb_ltdb_header *)existing.dptr;
 			free(existing.dptr);
 			if (!(header.rsn < hdr->rsn ||
-			      (header.dmaster != ctdb->recovery_master && header.rsn == hdr->rsn))) {
+			      (header.dmaster != ctdb_get_pnn(ctdb) &&
+			       header.rsn == hdr->rsn))) {
 				continue;
 			}
 		}
-		
+
 		if (tdb_store(recdb->tdb, key, data, TDB_REPLACE) != 0) {
 			DEBUG(DEBUG_CRIT,(__location__ " Failed to store record\n"));
 			talloc_free(tmp_ctx);
-			return -1;				
+			return -1;
 		}
 	}
 
@@ -1409,7 +1407,7 @@ struct recdb_data {
 static int traverse_recdb(struct tdb_context *tdb, TDB_DATA key, TDB_DATA data, void *p)
 {
 	struct recdb_data *params = (struct recdb_data *)p;
-	struct ctdb_rec_data *rec;
+	struct ctdb_rec_data *recdata;
 	struct ctdb_ltdb_header *hdr;
 
 	/*
@@ -1454,25 +1452,25 @@ static int traverse_recdb(struct tdb_context *tdb, TDB_DATA key, TDB_DATA data,
 	}
 
 	/* add the record to the blob ready to send to the nodes */
-	rec = ctdb_marshall_record(params->recdata, 0, key, NULL, data);
-	if (rec == NULL) {
+	recdata = ctdb_marshall_record(params->recdata, 0, key, NULL, data);
+	if (recdata == NULL) {
 		params->failed = true;
 		return -1;
 	}
-	if (params->len + rec->length >= params->allocated_len) {
-		params->allocated_len = rec->length + params->len + params->ctdb->tunable.pulldb_preallocation_size;
+	if (params->len + recdata->length >= params->allocated_len) {
+		params->allocated_len = recdata->length + params->len + params->ctdb->tunable.pulldb_preallocation_size;
 		params->recdata = talloc_realloc_size(NULL, params->recdata, params->allocated_len);
 	}
 	if (params->recdata == NULL) {
 		DEBUG(DEBUG_CRIT,(__location__ " Failed to expand recdata to %u\n",
-			 rec->length + params->len));
+			 recdata->length + params->len));
 		params->failed = true;
 		return -1;
 	}
 	params->recdata->count++;
-	memcpy(params->len+(uint8_t *)params->recdata, rec, rec->length);
-	params->len += rec->length;
-	talloc_free(rec);
+	memcpy(params->len+(uint8_t *)params->recdata, recdata, recdata->length);
+	params->len += recdata->length;
+	talloc_free(recdata);
 
 	return 0;
 }
@@ -2082,7 +2080,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
 	
 
 	/* update the capabilities for all nodes */
-	ret = update_capabilities(ctdb, nodemap);
+	ret = update_capabilities(rec, nodemap);
 	if (ret!=0) {
 		DEBUG(DEBUG_ERR, (__location__ " Unable to update node capabilities.\n"));
 		goto fail;
@@ -2101,7 +2099,9 @@ static int do_recovery(struct ctdb_recoverd *rec,
 		if (nodemap->nodes[i].flags & NODE_FLAGS_INACTIVE) {
 			continue;
 		}
-		if (!(ctdb->nodes[i]->capabilities & CTDB_CAP_LMASTER)) {
+		if (!ctdb_node_has_capabilities(rec->caps,
+						ctdb->nodes[i]->pnn,
+						CTDB_CAP_LMASTER)) {
 			/* this node can not be an lmaster */
 			DEBUG(DEBUG_DEBUG, ("Node %d cant be a LMASTER, skipping it\n", i));
 			continue;
@@ -2714,7 +2714,6 @@ static void election_handler(struct ctdb_context *ctdb, uint64_t srvid,
 	struct ctdb_recoverd *rec = talloc_get_type(private_data, struct ctdb_recoverd);
 	int ret;
 	struct election_message *em = (struct election_message *)data.dptr;
-	TALLOC_CTX *mem_ctx;
 
 	/* Ignore election packets from ourself */
 	if (ctdb->pnn == em->pnn) {
@@ -2729,8 +2728,6 @@ static void election_handler(struct ctdb_context *ctdb, uint64_t srvid,
 						timeval_current_ofs(ctdb->tunable.election_timeout, 0), 
 						ctdb_election_timeout, rec);


-- 
Samba Shared Repository