[SCM] Samba Shared Repository - branch master updated
Amitay Isaacs
amitay at samba.org
Tue Apr 7 02:21:03 MDT 2015
The branch, master has been updated
via 0858b11 ctdb-tests: Use ctdb_node_list_to_map() in tool stubs
via 1ef1cfd ctdb-common: Move ctdb_node_list_to_map() to utilities
via dd52d82 ctdb-daemon: Factor out new function ctdb_node_list_to_map()
via ffbe0a6 ctdb-tools: Drop the recovery from "reloadnodes"
via d340f30 ctdb-daemon: Don't delay reloading the nodes file
via 85bd9a3 ctdb-recoverd: Avoid nodemap-related checks when recoveries are disabled
via 13dc4a9 ctdb-tool: Update "reloadnodes" to disable recoveries
via ee9619c ctdb-recoverd: New message ID CTDB_SRVID_DISABLE_RECOVERIES
via 2ca484c ctdb-recoverd: Simplify disable_ip_check_handler() using ctdb_op_disable()
via 108db33 ctdb-recoverd: Add slightly more abstraction for disabling takeover runs
via ec32d9b ctdb-recoverd: Reimplement ReRecoveryTimeout using ctdb_op_disable()
via 281f7e8 ctdb-recoverd: Use a goto for do_recovery() failures
via a2044c6 ctdb-recoverd: Reimplement disabling takeover runs using ctdb_op_disable()
via 55b2461 ctdb-recoverd: Add a new abstraction ctdb_op_disable()
via ae9cd037 ctdb-daemon: Pass on consistent flag information to recovery daemon
via 4b972bb ctdb-tests: Add "ctdb reloadnodes" test for "node remains deleted"
via 181658f ctdb-tools: Fix spurious messages about deleted nodes being disconnected
from b57c778 rpc_server: Coverity fix for CID 1273079
https://git.samba.org/?p=samba.git;a=shortlog;h=master
- Log -----------------------------------------------------------------
commit 0858b11ff735b535bfeded346c87a0c245d902c7
Author: Martin Schwenke <martin at meltin.net>
Date: Sun Feb 22 06:37:41 2015 +1100
ctdb-tests: Use ctdb_node_list_to_map() in tool stubs
Drop copy of old ctdb_control_nodemap().
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay at samba.org>
Autobuild-Date(master): Tue Apr 7 10:20:41 CEST 2015 on sn-devel-104
commit 1ef1cfdc4d6b923357630451177fdcde1d616e87
Author: Martin Schwenke <martin at meltin.net>
Date: Fri Feb 20 12:34:25 2015 +1100
ctdb-common: Move ctdb_node_list_to_map() to utilities
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit dd52d82c73b26a3fed6dfd4aaf7d51f576d019d9
Author: Martin Schwenke <martin at meltin.net>
Date: Fri Feb 20 12:31:37 2015 +1100
ctdb-daemon: Factor out new function ctdb_node_list_to_map()
Change ctdb_control_getnodemap() to use this.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
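The allocation pattern behind ctdb_node_list_to_map() can be sketched in isolation. The following is a minimal illustration only, assuming simplified stand-in structures (struct node, struct node_and_flags, struct node_map and node_list_to_map() are hypothetical names), plain calloc() in place of talloc_zero_size(), and no address field:

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins for the ctdb structures. */
struct node {
    uint32_t pnn;
    uint32_t flags;
};

struct node_and_flags {
    uint32_t pnn;
    uint32_t flags;
};

struct node_map {
    uint32_t num;
    struct node_and_flags nodes[];  /* flexible array member */
};

/* Flatten an array of node pointers into one contiguous map.
 * The size is the fixed header plus one entry per node.
 * Returns NULL on allocation failure; the caller frees the result. */
struct node_map *node_list_to_map(struct node **nodes, uint32_t num_nodes)
{
    size_t size = offsetof(struct node_map, nodes) +
                  (size_t)num_nodes * sizeof(struct node_and_flags);
    struct node_map *map = calloc(1, size);
    uint32_t i;

    if (map == NULL) {
        return NULL;
    }

    map->num = num_nodes;
    for (i = 0; i < num_nodes; i++) {
        map->nodes[i].pnn = nodes[i]->pnn;
        map->nodes[i].flags = nodes[i]->flags;
    }
    return map;
}
```

Returning NULL instead of exiting leaves the error handling to the caller, which is how the reworked ctdb_control_getnodemap() in the diff uses the real function.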
commit ffbe0a6def236f5d0b03d089a7fc3f060eb0e392
Author: Martin Schwenke <martin at meltin.net>
Date: Wed Feb 4 12:06:56 2015 +1100
ctdb-tools: Drop the recovery from "reloadnodes"
A recovery is not required: when deleting a node it should already be
disconnected and when adding a node it will also be disconnected. The
new sanity checks in "reloadnodes" ensure that these assumptions are
met.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit d340f308e76af53b04ae9b5432c4f6c84315303a
Author: Martin Schwenke <martin at meltin.net>
Date: Tue Feb 10 15:43:03 2015 +1100
ctdb-daemon: Don't delay reloading the nodes file
Presumably this was done to minimise the chance of a recovery
occurring while the nodemaps are inconsistent across nodes.
Another possible explanation is that the forced recovery in
ctdb.c:control_reload_nodes_file() stops another recovery from
occurring for ReRecoveryTimeout seconds, so this delay causes the
reloads to occur during that period.
This is no longer necessary because recoveries are now explicitly
disabled while node files are reloaded.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 85bd9a33eb65d6fd03ad85aeedf141a2813c2bb8
Author: Martin Schwenke <martin at meltin.net>
Date: Fri Feb 6 20:59:11 2015 +1100
ctdb-recoverd: Avoid nodemap-related checks when recoveries are disabled
Any recovery triggered by these checks won't run anyway. Also,
recoveries may have been disabled by "reloadnodes"; if the nodemaps
are then inconsistent between nodes, avoid triggering an unnecessary
recovery.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 13dc4a98426b30e7226015b1d8a86ec2e80d6228
Author: Martin Schwenke <martin at meltin.net>
Date: Mon Feb 9 20:20:44 2015 +1100
ctdb-tool: Update "reloadnodes" to disable recoveries
If a recovery occurs when some nodes have reloaded and others haven't
then the nodemaps will be inconsistent, so bad things will happen.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit ee9619c28b594b7fec8093b522ac205e5d4eb0ea
Author: Martin Schwenke <martin at meltin.net>
Date: Fri Feb 6 15:06:44 2015 +1100
ctdb-recoverd: New message ID CTDB_SRVID_DISABLE_RECOVERIES
Also add test stub support.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 2ca484cd50c2655c59802cae6c81982b42bf61eb
Author: Martin Schwenke <martin at meltin.net>
Date: Fri Feb 6 15:03:03 2015 +1100
ctdb-recoverd: Simplify disable_ip_check_handler() using ctdb_op_disable()
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 108db3396f71a35ef1690a5b483d2728223803df
Author: Martin Schwenke <martin at meltin.net>
Date: Fri Feb 6 13:05:12 2015 +1100
ctdb-recoverd: Add slightly more abstraction for disabling takeover runs
Factor out new function srvid_disable_and_reply(), which can be
re-used.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit ec32d9bea8993778cd6b0fc63bfde492ee21d830
Author: Martin Schwenke <martin at meltin.net>
Date: Fri Feb 6 14:47:33 2015 +1100
ctdb-recoverd: Reimplement ReRecoveryTimeout using ctdb_op_disable()
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 281f7e8152e01a15e9df946ee293156ded8b2857
Author: Martin Schwenke <martin at meltin.net>
Date: Fri Feb 6 14:32:08 2015 +1100
ctdb-recoverd: Use a goto for do_recovery() failures
This will allow extra things to be done on failure.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
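The single-exit pattern this commit introduces can be shown with a small standalone sketch. It is not the do_recovery() code: struct task_state, run_steps() and the failure counter are hypothetical, and the counter merely stands in for whatever "extra things" a failure path might need to do.

```c
#include <stdbool.h>

/* Hypothetical state; "failures" is an illustrative piece of
 * failure-only bookkeeping. */
struct task_state {
    bool in_progress;
    int failures;
};

/* Run nsteps steps; step_results[i] != 0 means step i failed.
 * Every failure leaves through the single "fail" label, so the
 * failure-only work is written once rather than at each return. */
int run_steps(struct task_state *st, const int *step_results, int nsteps)
{
    int i;

    st->in_progress = true;

    for (i = 0; i < nsteps; i++) {
        if (step_results[i] != 0) {
            goto fail;
        }
    }

    st->in_progress = false;
    return 0;

fail:
    st->failures++;
    st->in_progress = false;
    return -1;
}
```

With scattered "return -1" statements, each of them would need the same cleanup added by hand; with one label, a later commit only has to touch one place.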
commit a2044c65bc669e7240bd4ffc4b6935f57f493535
Author: Martin Schwenke <martin at meltin.net>
Date: Sun Feb 8 20:52:12 2015 +1100
ctdb-recoverd: Reimplement disabling takeover runs using ctdb_op_disable()
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 55b246195b282175022ea2ae239ebcd5d4970d3f
Author: Martin Schwenke <martin at meltin.net>
Date: Sun Feb 8 20:50:38 2015 +1100
ctdb-recoverd: Add a new abstraction ctdb_op_disable()
This can be used to disable and re-enable an operation, and do all the
relevant sanity checking.
Most of this is from existing functions
disable_takeover_runs_handler(), clear_takeover_runs_disable() and
reenable_takeover_runs().
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
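The disable/re-enable logic described above can be approximated without talloc or tevent. The sketch below is an assumption-laden simplification: instead of a timer it stores the time at which the operation becomes enabled again, the caller passes the current time explicitly, and all names (struct op_gate, op_disable() and friends) are hypothetical rather than the ones in ctdb_recoverd.c.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, timer-free variant of the operation state. */
struct op_gate {
    uint64_t disabled_until;  /* 0 means "not disabled" */
    bool in_progress;
};

bool op_is_disabled(const struct op_gate *g, uint64_t now)
{
    return g->disabled_until != 0 && now < g->disabled_until;
}

/* Disable the operation for "timeout" seconds.
 * A timeout of 0 re-enables it immediately.
 * Disabling is refused while the operation is in progress. */
int op_disable(struct op_gate *g, uint64_t now, uint32_t timeout)
{
    if (timeout == 0) {
        g->disabled_until = 0;
        return 0;
    }
    if (g->in_progress) {
        return -EAGAIN;
    }
    g->disabled_until = now + timeout;
    return 0;
}

/* Mark the operation as started, unless it is currently disabled. */
bool op_begin(struct op_gate *g, uint64_t now)
{
    if (op_is_disabled(g, now)) {
        return false;
    }
    g->in_progress = true;
    return true;
}

void op_end(struct op_gate *g)
{
    g->in_progress = false;
}
```

A caller would bracket the operation with op_begin() and op_end(), and a message handler would call op_disable() with the requested timeout, treating -EAGAIN as "try again once the current run has finished".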
commit ae9cd037ee96c000b11aaa7d171463b00fe4850c
Author: Martin Schwenke <martin at meltin.net>
Date: Wed Feb 4 17:18:12 2015 +1100
ctdb-daemon: Pass on consistent flag information to recovery daemon
Signed-off-by: Martin Schwenke <martin at meltin.net>
Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 4b972bbdb3e2d3f35fad3c47dc6e84f0fee513c4
Author: Martin Schwenke <martin at meltin.net>
Date: Wed Apr 1 18:00:04 2015 +1100
ctdb-tests: Add "ctdb reloadnodes" test for "node remains deleted"
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 181658f5bb180c48f88504a703ed3a3758ac3b5b
Author: Martin Schwenke <martin at meltin.net>
Date: Wed Apr 1 17:10:46 2015 +1100
ctdb-tools: Fix spurious messages about deleted nodes being disconnected
The code was too "clever". The 4 different cases should be separate.
The "node remains deleted" case doesn't need the IP address comparison
(always 0.0.0.0) or the disconnected check.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
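Keeping the cases separate can be illustrated as follows. This sketch assumes that the four cases are the combinations of "deleted" and "not deleted" in the old and new node lists; the commit message does not spell them out, so the enum, classify_node() and skips_checks() are hypothetical and not taken from ctdb.c.

```c
#include <stdbool.h>

/* Hypothetical classification of one node slot across a reload. */
enum node_change {
    NODE_REMAINS_DELETED,
    NODE_BEING_DELETED,
    NODE_BEING_ADDED,
    NODE_REMAINS_PRESENT
};

/* Classify a slot from whether it is marked deleted in the old
 * and in the new node list. Each case gets its own branch. */
enum node_change classify_node(bool old_deleted, bool new_deleted)
{
    if (old_deleted && new_deleted) {
        return NODE_REMAINS_DELETED;
    }
    if (!old_deleted && new_deleted) {
        return NODE_BEING_DELETED;
    }
    if (old_deleted && !new_deleted) {
        return NODE_BEING_ADDED;
    }
    return NODE_REMAINS_PRESENT;
}

/* A node that remains deleted needs neither the IP address
 * comparison nor the disconnected check. */
bool skips_checks(enum node_change change)
{
    return change == NODE_REMAINS_DELETED;
}
```

Handling each case in its own branch makes it obvious which checks apply where, which is the point of the fix: the spurious message came from applying a check to a case that did not need it.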
-----------------------------------------------------------------------
Summary of changes:
ctdb/common/ctdb_util.c | 27 ++
ctdb/include/ctdb_private.h | 3 +
ctdb/include/ctdb_protocol.h | 3 +
ctdb/server/ctdb_monitor.c | 1 +
ctdb/server/ctdb_recover.c | 47 +---
ctdb/server/ctdb_recoverd.c | 293 +++++++++++++--------
ctdb/tests/src/ctdb_test_stubs.c | 50 +---
...eloadnodes.001.sh => stubby.reloadnodes.024.sh} | 9 +-
ctdb/tools/ctdb.c | 31 ++-
9 files changed, 263 insertions(+), 201 deletions(-)
copy ctdb/tests/tool/{stubby.reloadnodes.001.sh => stubby.reloadnodes.024.sh} (72%)
Changeset truncated at 500 lines:
diff --git a/ctdb/common/ctdb_util.c b/ctdb/common/ctdb_util.c
index 76fb06d..8e2e430 100644
--- a/ctdb/common/ctdb_util.c
+++ b/ctdb/common/ctdb_util.c
@@ -579,6 +579,33 @@ struct ctdb_node_map *ctdb_read_nodes_file(TALLOC_CTX *mem_ctx,
return ret;
}
+struct ctdb_node_map *
+ctdb_node_list_to_map(struct ctdb_node **nodes, uint32_t num_nodes,
+ TALLOC_CTX *mem_ctx)
+{
+ uint32_t i;
+ size_t size;
+ struct ctdb_node_map *node_map;
+
+ size = offsetof(struct ctdb_node_map, nodes) +
+ num_nodes * sizeof(struct ctdb_node_and_flags);
+ node_map = (struct ctdb_node_map *)talloc_zero_size(mem_ctx, size);
+ if (node_map == NULL) {
+ DEBUG(DEBUG_ERR,
+ (__location__ " Failed to allocate nodemap array\n"));
+ return NULL;
+ }
+
+ node_map->num = num_nodes;
+ for (i=0; i<num_nodes; i++) {
+ node_map->nodes[i].addr = nodes[i]->address;
+ node_map->nodes[i].pnn = nodes[i]->pnn;
+ node_map->nodes[i].flags = nodes[i]->flags;
+ }
+
+ return node_map;
+}
+
const char *ctdb_eventscript_call_names[] = {
"init",
"setup",
diff --git a/ctdb/include/ctdb_private.h b/ctdb/include/ctdb_private.h
index b37d5bb..532f859 100644
--- a/ctdb/include/ctdb_private.h
+++ b/ctdb/include/ctdb_private.h
@@ -1388,6 +1388,9 @@ int ctdb_client_async_control(struct ctdb_context *ctdb,
client_async_callback fail_callback,
void *callback_data);
+struct ctdb_node_map *
+ctdb_node_list_to_map(struct ctdb_node **nodes, uint32_t num_nodes,
+ TALLOC_CTX *mem_ctx);
struct ctdb_node_map *ctdb_read_nodes_file(TALLOC_CTX *mem_ctx,
const char *nlist);
void ctdb_load_nodes_file(struct ctdb_context *ctdb);
diff --git a/ctdb/include/ctdb_protocol.h b/ctdb/include/ctdb_protocol.h
index c828c01..4dea56b 100644
--- a/ctdb/include/ctdb_protocol.h
+++ b/ctdb/include/ctdb_protocol.h
@@ -156,6 +156,9 @@ struct ctdb_call_info {
/* A message handler ID to stop takeover runs from occurring */
#define CTDB_SRVID_DISABLE_TAKEOVER_RUNS 0xFB03000000000000LL
+/* A message handler ID to stop recoveries from occurring */
+#define CTDB_SRVID_DISABLE_RECOVERIES 0xFB04000000000000LL
+
/* A message id to ask the recovery daemon to temporarily disable the
public ip checks
*/
diff --git a/ctdb/server/ctdb_monitor.c b/ctdb/server/ctdb_monitor.c
index 9b8df6d..5c0c055 100644
--- a/ctdb/server/ctdb_monitor.c
+++ b/ctdb/server/ctdb_monitor.c
@@ -497,6 +497,7 @@ int32_t ctdb_control_modflags(struct ctdb_context *ctdb, TDB_DATA indata)
}
/* tell the recovery daemon something has changed */
+ c->new_flags = node->flags;
ctdb_daemon_send_message(ctdb, ctdb->pnn,
CTDB_SRVID_SET_NODE_FLAGS, indata);
diff --git a/ctdb/server/ctdb_recover.c b/ctdb/server/ctdb_recover.c
index eb3f46d..7a684d5 100644
--- a/ctdb/server/ctdb_recover.c
+++ b/ctdb/server/ctdb_recover.c
@@ -118,30 +118,19 @@ ctdb_control_getdbmap(struct ctdb_context *ctdb, uint32_t opcode, TDB_DATA indat
return 0;
}
-int
+int
ctdb_control_getnodemap(struct ctdb_context *ctdb, uint32_t opcode, TDB_DATA indata, TDB_DATA *outdata)
{
- uint32_t i, num_nodes;
- struct ctdb_node_map *node_map;
-
CHECK_CONTROL_DATA_SIZE(0);
- num_nodes = ctdb->num_nodes;
-
- outdata->dsize = offsetof(struct ctdb_node_map, nodes) + num_nodes*sizeof(struct ctdb_node_and_flags);
- outdata->dptr = (unsigned char *)talloc_zero_size(outdata, outdata->dsize);
- if (!outdata->dptr) {
- DEBUG(DEBUG_ALERT, (__location__ " Failed to allocate nodemap array\n"));
- exit(1);
+ outdata->dptr = (unsigned char *)ctdb_node_list_to_map(ctdb->nodes,
+ ctdb->num_nodes,
+ outdata);
+ if (outdata->dptr == NULL) {
+ return -1;
}
- node_map = (struct ctdb_node_map *)outdata->dptr;
- node_map->num = num_nodes;
- for (i=0; i<num_nodes; i++) {
- node_map->nodes[i].addr = ctdb->nodes[i]->address;
- node_map->nodes[i].pnn = ctdb->nodes[i]->pnn;
- node_map->nodes[i].flags = ctdb->nodes[i]->flags;
- }
+ outdata->dsize = talloc_get_size(outdata->dptr);
return 0;
}
@@ -177,14 +166,15 @@ ctdb_control_getnodemapv4(struct ctdb_context *ctdb, uint32_t opcode, TDB_DATA i
return 0;
}
-static void
-ctdb_reload_nodes_event(struct event_context *ev, struct timed_event *te,
- struct timeval t, void *private_data)
+/*
+ reload the nodes file
+*/
+int
+ctdb_control_reload_nodes_file(struct ctdb_context *ctdb, uint32_t opcode)
{
int i, num_nodes;
- struct ctdb_context *ctdb = talloc_get_type(private_data, struct ctdb_context);
TALLOC_CTX *tmp_ctx;
- struct ctdb_node **nodes;
+ struct ctdb_node **nodes;
tmp_ctx = talloc_new(ctdb);
@@ -225,17 +215,6 @@ ctdb_reload_nodes_event(struct event_context *ev, struct timed_event *te,
ctdb_daemon_send_message(ctdb, ctdb->pnn, CTDB_SRVID_RELOAD_NODES, tdb_null);
talloc_free(tmp_ctx);
- return;
-}
-
-/*
- reload the nodes file after a short delay (so that we can send the response
- back first
-*/
-int
-ctdb_control_reload_nodes_file(struct ctdb_context *ctdb, uint32_t opcode)
-{
- event_add_timed(ctdb->ev, ctdb, timeval_current_ofs(1,0), ctdb_reload_nodes_event, ctdb);
return 0;
}
diff --git a/ctdb/server/ctdb_recoverd.c b/ctdb/server/ctdb_recoverd.c
index 99018be..673075a 100644
--- a/ctdb/server/ctdb_recoverd.c
+++ b/ctdb/server/ctdb_recoverd.c
@@ -117,6 +117,103 @@ nomem:
srvid_request_reply(ctdb, request, result);
}
+/* An abstraction to allow an operation (takeover runs, recoveries,
+ * ...) to be disabled for a given timeout */
+struct ctdb_op_state {
+ struct tevent_timer *timer;
+ bool in_progress;
+ const char *name;
+};
+
+static struct ctdb_op_state *ctdb_op_init(TALLOC_CTX *mem_ctx, const char *name)
+{
+ struct ctdb_op_state *state = talloc_zero(mem_ctx, struct ctdb_op_state);
+
+ if (state != NULL) {
+ state->in_progress = false;
+ state->name = name;
+ }
+
+ return state;
+}
+
+static bool ctdb_op_is_disabled(struct ctdb_op_state *state)
+{
+ return state->timer != NULL;
+}
+
+static bool ctdb_op_begin(struct ctdb_op_state *state)
+{
+ if (ctdb_op_is_disabled(state)) {
+ DEBUG(DEBUG_NOTICE,
+ ("Unable to begin - %s are disabled\n", state->name));
+ return false;
+ }
+
+ state->in_progress = true;
+ return true;
+}
+
+static bool ctdb_op_end(struct ctdb_op_state *state)
+{
+ return state->in_progress = false;
+}
+
+static bool ctdb_op_is_in_progress(struct ctdb_op_state *state)
+{
+ return state->in_progress;
+}
+
+static void ctdb_op_enable(struct ctdb_op_state *state)
+{
+ TALLOC_FREE(state->timer);
+}
+
+static void ctdb_op_timeout_handler(struct event_context *ev,
+ struct timed_event *te,
+ struct timeval yt, void *p)
+{
+ struct ctdb_op_state *state =
+ talloc_get_type(p, struct ctdb_op_state);
+
+ DEBUG(DEBUG_NOTICE,("Reenabling %s after timeout\n", state->name));
+ ctdb_op_enable(state);
+}
+
+static int ctdb_op_disable(struct ctdb_op_state *state,
+ struct tevent_context *ev,
+ uint32_t timeout)
+{
+ if (timeout == 0) {
+ DEBUG(DEBUG_NOTICE,("Reenabling %s\n", state->name));
+ ctdb_op_enable(state);
+ return 0;
+ }
+
+ if (state->in_progress) {
+ DEBUG(DEBUG_ERR,
+ ("Unable to disable %s - in progress\n", state->name));
+ return -EAGAIN;
+ }
+
+ DEBUG(DEBUG_NOTICE,("Disabling %s for %u seconds\n",
+ state->name, timeout));
+
+ /* Clear any old timers */
+ talloc_free(state->timer);
+
+ /* Arrange for the timeout to occur */
+ state->timer = tevent_add_timer(ev, state,
+ timeval_current_ofs(timeout, 0),
+ ctdb_op_timeout_handler, state);
+ if (state->timer == NULL) {
+ DEBUG(DEBUG_ERR,(__location__ " Unable to setup timer\n"));
+ return -ENOMEM;
+ }
+
+ return 0;
+}
+
struct ctdb_banning_state {
uint32_t count;
struct timeval last_reported_time;
@@ -141,8 +238,8 @@ struct ctdb_recoverd {
struct timed_event *election_timeout;
struct vacuum_info *vacuum_info;
struct srvid_requests *reallocate_requests;
- bool takeover_run_in_progress;
- TALLOC_CTX *takeover_runs_disable_ctx;
+ struct ctdb_op_state *takeover_run;
+ struct ctdb_op_state *recovery;
struct ctdb_control_get_ifaces *ifaces;
uint32_t *force_rebalance_nodes;
};
@@ -1566,7 +1663,7 @@ static int ctdb_reload_remote_public_ips(struct ctdb_context *ctdb,
}
if (ctdb->do_checkpublicip &&
- rec->takeover_runs_disable_ctx == NULL &&
+ !ctdb_op_is_disabled(rec->takeover_run) &&
verify_remote_ip_allocation(ctdb,
node->known_public_ips,
node->pnn)) {
@@ -1691,19 +1788,14 @@ static bool do_takeover_run(struct ctdb_recoverd *rec,
DEBUG(DEBUG_NOTICE, ("Takeover run starting\n"));
- if (rec->takeover_run_in_progress) {
+ if (ctdb_op_is_in_progress(rec->takeover_run)) {
DEBUG(DEBUG_ERR, (__location__
" takeover run already in progress \n"));
ok = false;
goto done;
}
- rec->takeover_run_in_progress = true;
-
- /* If takeover runs are in disabled then fail... */
- if (rec->takeover_runs_disable_ctx != NULL) {
- DEBUG(DEBUG_ERR,
- ("Takeover runs are disabled so refusing to run one\n"));
+ if (!ctdb_op_begin(rec->takeover_run)) {
ok = false;
goto done;
}
@@ -1767,7 +1859,7 @@ static bool do_takeover_run(struct ctdb_recoverd *rec,
done:
rec->need_takeover_run = !ok;
talloc_free(nodes);
- rec->takeover_run_in_progress = false;
+ ctdb_op_end(rec->takeover_run);
DEBUG(DEBUG_NOTICE, ("Takeover run %s\n", ok ? "completed successfully" : "unsuccessful"));
return ok;
@@ -1796,16 +1888,20 @@ static int do_recovery(struct ctdb_recoverd *rec,
/* if recovery fails, force it again */
rec->need_recovery = true;
+ if (!ctdb_op_begin(rec->recovery)) {
+ return -1;
+ }
+
if (rec->election_timeout) {
/* an election is in progress */
DEBUG(DEBUG_ERR, ("do_recovery called while election in progress - try again later\n"));
- return -1;
+ goto fail;
}
ban_misbehaving_nodes(rec, &self_ban);
if (self_ban) {
DEBUG(DEBUG_NOTICE, ("This node was banned, aborting recovery\n"));
- return -1;
+ goto fail;
}
if (ctdb->recovery_lock_file != NULL) {
@@ -1823,14 +1919,14 @@ static int do_recovery(struct ctdb_recoverd *rec,
*/
DEBUG(DEBUG_ERR, ("Unable to get recovery lock"
" - retrying recovery\n"));
- return -1;
+ goto fail;
}
DEBUG(DEBUG_ERR,("Unable to get recovery lock - aborting recovery "
"and ban ourself for %u seconds\n",
ctdb->tunable.recovery_ban_period));
ctdb_ban_node(rec, pnn, ctdb->tunable.recovery_ban_period);
- return -1;
+ goto fail;
}
ctdb_ctrl_report_recd_lock_latency(ctdb,
CONTROL_TIMEOUT(),
@@ -1846,7 +1942,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
ret = ctdb_ctrl_getdbmap(ctdb, CONTROL_TIMEOUT(), pnn, mem_ctx, &dbmap);
if (ret != 0) {
DEBUG(DEBUG_ERR, (__location__ " Unable to get dbids from node :%u\n", pnn));
- return -1;
+ goto fail;
}
/* we do the db creation before we set the recovery mode, so the freeze happens
@@ -1856,14 +1952,14 @@ static int do_recovery(struct ctdb_recoverd *rec,
ret = create_missing_local_databases(ctdb, nodemap, pnn, &dbmap, mem_ctx);
if (ret != 0) {
DEBUG(DEBUG_ERR, (__location__ " Unable to create missing local databases\n"));
- return -1;
+ goto fail;
}
/* verify that all other nodes have all our databases */
ret = create_missing_remote_databases(ctdb, nodemap, pnn, dbmap, mem_ctx);
if (ret != 0) {
DEBUG(DEBUG_ERR, (__location__ " Unable to create missing remote databases\n"));
- return -1;
+ goto fail;
}
DEBUG(DEBUG_NOTICE, (__location__ " Recovery - created remote databases\n"));
@@ -1884,14 +1980,14 @@ static int do_recovery(struct ctdb_recoverd *rec,
ret = set_recovery_mode(ctdb, rec, nodemap, CTDB_RECOVERY_ACTIVE);
if (ret != 0) {
DEBUG(DEBUG_ERR, (__location__ " Unable to set recovery mode to active on cluster\n"));
- return -1;
+ goto fail;
}
/* execute the "startrecovery" event script on all nodes */
ret = run_startrecovery_eventscript(rec, nodemap);
if (ret!=0) {
DEBUG(DEBUG_ERR, (__location__ " Unable to run the 'startrecovery' event on cluster\n"));
- return -1;
+ goto fail;
}
/*
@@ -1908,7 +2004,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
DEBUG(DEBUG_WARNING, (__location__ "Unable to update flags on inactive node %d\n", i));
} else {
DEBUG(DEBUG_ERR, (__location__ " Unable to update flags on all nodes for node %d\n", i));
- return -1;
+ goto fail;
}
}
}
@@ -1932,7 +2028,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
ret = ctdb_ctrl_setvnnmap(ctdb, CONTROL_TIMEOUT(), pnn, mem_ctx, vnnmap);
if (ret != 0) {
DEBUG(DEBUG_ERR, (__location__ " Unable to set vnnmap for node %u\n", pnn));
- return -1;
+ goto fail;
}
data.dptr = (void *)&generation;
@@ -1954,7 +2050,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
NULL) != 0) {
DEBUG(DEBUG_ERR,("Failed to cancel recovery transaction\n"));
}
- return -1;
+ goto fail;
}
DEBUG(DEBUG_NOTICE,(__location__ " started transactions on all nodes\n"));
@@ -1966,7 +2062,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
pnn, nodemap, generation);
if (ret != 0) {
DEBUG(DEBUG_ERR, (__location__ " Failed to recover database 0x%x\n", dbmap->dbs[i].dbid));
- return -1;
+ goto fail;
}
}
@@ -1979,7 +2075,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
NULL, NULL,
NULL) != 0) {
DEBUG(DEBUG_ERR, (__location__ " Unable to commit recovery changes. Recovery failed.\n"));
- return -1;
+ goto fail;
}
DEBUG(DEBUG_NOTICE, (__location__ " Recovery - committed databases\n"));
@@ -1989,7 +2085,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
ret = update_capabilities(ctdb, nodemap);
if (ret!=0) {
DEBUG(DEBUG_ERR, (__location__ " Unable to update node capabilities.\n"));
- return -1;
+ goto fail;
}
/* build a new vnn map with all the currently active and
@@ -2029,7 +2125,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
ret = update_vnnmap_on_all_nodes(ctdb, nodemap, pnn, vnnmap, mem_ctx);
if (ret != 0) {
DEBUG(DEBUG_ERR, (__location__ " Unable to update vnnmap on all nodes\n"));
- return -1;
+ goto fail;
}
DEBUG(DEBUG_NOTICE, (__location__ " Recovery - updated vnnmap\n"));
@@ -2038,7 +2134,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
ret = set_recovery_master(ctdb, nodemap, pnn);
if (ret!=0) {
DEBUG(DEBUG_ERR, (__location__ " Unable to set recovery master\n"));
- return -1;
+ goto fail;
}
DEBUG(DEBUG_NOTICE, (__location__ " Recovery - updated recmaster\n"));
@@ -2047,7 +2143,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
ret = set_recovery_mode(ctdb, rec, nodemap, CTDB_RECOVERY_NORMAL);
if (ret != 0) {
DEBUG(DEBUG_ERR, (__location__ " Unable to set recovery mode to normal on cluster\n"));
- return -1;
+ goto fail;
}
DEBUG(DEBUG_NOTICE, (__location__ " Recovery - disabled recovery mode\n"));
@@ -2058,7 +2154,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
DEBUG(DEBUG_ERR,("Failed to read public ips from remote node %d\n",
culprit));
rec->need_takeover_run = true;
- return -1;
+ goto fail;
}
do_takeover_run(rec, nodemap, false);
@@ -2067,7 +2163,7 @@ static int do_recovery(struct ctdb_recoverd *rec,
ret = run_recovered_eventscript(rec, nodemap, "do_recovery");
if (ret!=0) {
DEBUG(DEBUG_ERR, (__location__ " Unable to run the 'recovered' event on cluster. Recovery process failed.\n"));
- return -1;
+ goto fail;
}
DEBUG(DEBUG_NOTICE, (__location__ " Recovery - finished the recovered event\n"));
--
Samba Shared Repository