[SCM] Samba Shared Repository - branch master updated
Amitay Isaacs
amitay at samba.org
Tue Aug 18 06:25:02 UTC 2020
The branch, master has been updated
via 8bb6a6607da ctdb-recoverd: Broadcast takeover run message when verifying IPs
via 4aa8e72d60e ctdb-recoverd: Rename update_local_flags() -> update_flags()
via 702c7c4934e ctdb-recoverd: Change update_local_flags() to use already retrieved nodemaps
via 910a0b3b747 ctdb-recoverd: Get remote nodemaps earlier
via d50919b0cb2 ctdb-recoverd: Do not fetch the nodemap from the recovery master
via 762d1d8a960 ctdb-recoverd: Change get_remote_nodemaps() to use connected nodes
via 368c83bfe3b ctdb-recoverd: Fix node_pnn check and assignment of nodemap into array
via 10ce0dbf1c1 ctdb-recoverd: Add fail callback to assign banning credits
via a079ee31690 ctdb-recoverd: Add an intermediate state struct for nodemap fetching
via 2eaa0af6160 ctdb-recoverd: Move memory allocation into get_remote_nodemaps()
via 3324dd272c7 ctdb-recoverd: Change signature of get_remote_nodemaps()
via d2d90f25021 ctdb-recoverd: Fix a local memory leak
via 52f520d39cd ctdb-recoverd: Basic cleanups for get_remote_nodemaps()
from 20606fd0a4c WHATSNEW: list deprecated parameters
https://git.samba.org/?p=samba.git;a=shortlog;h=master
- Log -----------------------------------------------------------------
commit 8bb6a6607da22903be2e8f5ed2360d01a08555e9
Author: Martin Schwenke <martin at meltin.net>
Date: Wed Jul 29 07:02:45 2020 +1000
ctdb-recoverd: Broadcast takeover run message when verifying IPs
This makes it consistent with the monitoring code. If the master has
changed then this means the master will always get the message.
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
Autobuild-User(master): Amitay Isaacs <amitay at samba.org>
Autobuild-Date(master): Tue Aug 18 06:24:11 UTC 2020 on sn-devel-184
commit 4aa8e72d60e92951b35190d2ffcfdb1bfb756609
Author: Martin Schwenke <martin at meltin.net>
Date: Wed Jan 24 10:21:37 2018 +1100
ctdb-recoverd: Rename update_local_flags() -> update_flags()
This also updates remote flags so the name is misleading.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 702c7c4934e79a9161fdc59df70df30ae492d89f
Author: Martin Schwenke <martin at meltin.net>
Date: Thu Jan 18 20:35:55 2018 +1100
ctdb-recoverd: Change update_local_flags() to use already retrieved nodemaps
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 910a0b3b747a987ba69b6a0b6256e964b7d85dfe
Author: Martin Schwenke <martin at meltin.net>
Date: Fri Jun 14 03:51:01 2019 +1000
ctdb-recoverd: Get remote nodemaps earlier
update_local_flags() will be changed to use these nodemaps.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit d50919b0cb28f299c9b6985271b29d4f27c5f619
Author: Martin Schwenke <martin at meltin.net>
Date: Fri Jun 14 00:23:22 2019 +1000
ctdb-recoverd: Do not fetch the nodemap from the recovery master
The nodemap has already been fetched from the local node and is
actually passed to this function. Care must be taken to avoid
referencing the "remote" nodemap for the recovery master. It also
isn't useful to do so, since it would be the same nodemap.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 762d1d8a9605f97973a2c1176de5d29fcc61d15a
Author: Martin Schwenke <martin at meltin.net>
Date: Thu Jan 18 20:02:42 2018 +1100
ctdb-recoverd: Change get_remote_nodemaps() to use connected nodes
The plan here is to use the nodemaps retrieved by get_remote_nodes()
in update_local_flags(). This will improve efficiency, since
get_remote_nodes() fetches flags from nodes in parallel. It also
means that get_remote_nodes() can be used exactly once early on in
main_loop() to retrieve remote nodemaps. Retrieving nodemaps multiple
times is unnecessary and racy - a single monitoring iteration should
not fetch flags multiple times and compare them.
This introduces a temporary behaviour change but it will be of no
consequence when the above changes are made.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 368c83bfe3bbfff568d14f65e7b1ffa41d5349ac
Author: Martin Schwenke <martin at meltin.net>
Date: Thu Jul 30 11:57:51 2020 +1000
ctdb-recoverd: Fix node_pnn check and assignment of nodemap into array
This array is indexed by the same index as nodemap, not the PNN.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 10ce0dbf1c11eaaab7b28b6bbd014235a36d1962
Author: Martin Schwenke <martin at meltin.net>
Date: Thu Jan 18 19:58:15 2018 +1100
ctdb-recoverd: Add fail callback to assign banning credits
Also drop error handling in main_loop() that is replaced by this
change.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit a079ee31690cf7110f46b41989ffcfb83b7626d6
Author: Martin Schwenke <martin at meltin.net>
Date: Thu Jan 18 19:52:22 2018 +1100
ctdb-recoverd: Add an intermediate state struct for nodemap fetching
This will allow an error callback to be added.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 2eaa0af6160588b6e3364b181d0976477d12b51b
Author: Martin Schwenke <martin at meltin.net>
Date: Thu Jan 18 16:31:39 2018 +1100
ctdb-recoverd: Move memory allocation into get_remote_nodemaps()
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 3324dd272c7dafa92cd9c3fd0af8f50084bcdaaa
Author: Martin Schwenke <martin at meltin.net>
Date: Thu Jan 18 16:41:19 2018 +1100
ctdb-recoverd: Change signature of get_remote_nodemaps()
Change 1st argument to a rec context, since this will be needed later.
Drop the nodemap argument and access it via rec->nodemap instead.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit d2d90f250214582d7124b8137aa2cf5032b2f285
Author: Martin Schwenke <martin at meltin.net>
Date: Mon Aug 17 20:27:18 2020 +1000
ctdb-recoverd: Fix a local memory leak
The memory is allocated off the memory context used by the current
iteration of main loop. It is freed when main loop completes the fix
doesn't require backporting to stable branches. However, it is sloppy
so it is worth fixing.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
commit 52f520d39cd92e1cf2413fd7e0dd362debd6f463
Author: Martin Schwenke <martin at meltin.net>
Date: Thu Jan 18 16:19:36 2018 +1100
ctdb-recoverd: Basic cleanups for get_remote_nodemaps()
Don't log an error on failure - let the caller can do this. Apart
from this: fix up coding style and modernise the remaining error
message.
BUG: https://bugzilla.samba.org/show_bug.cgi?id=14466
Signed-off-by: Martin Schwenke <martin at meltin.net>
Reviewed-by: Amitay Isaacs <amitay at gmail.com>
-----------------------------------------------------------------------
Summary of changes:
ctdb/server/ctdb_recoverd.c | 165 ++++++++++++++++++++++++++++----------------
1 file changed, 105 insertions(+), 60 deletions(-)
Changeset truncated at 500 lines:
diff --git a/ctdb/server/ctdb_recoverd.c b/ctdb/server/ctdb_recoverd.c
index 3f72619127a..aeb23276fe7 100644
--- a/ctdb/server/ctdb_recoverd.c
+++ b/ctdb/server/ctdb_recoverd.c
@@ -537,18 +537,19 @@ static void ctdb_wait_election(struct ctdb_recoverd *rec)
}
/*
- Update our local flags from all remote connected nodes.
- This is only run when we are or we belive we are the recovery master
+ * Update local flags from all remote connected nodes and push out
+ * flags changes to all nodes. This is only run by the recovery
+ * master.
*/
-static int update_local_flags(struct ctdb_recoverd *rec, struct ctdb_node_map_old *nodemap)
+static int update_flags(struct ctdb_recoverd *rec,
+ struct ctdb_node_map_old *nodemap,
+ struct ctdb_node_map_old **remote_nodemaps)
{
unsigned int j;
struct ctdb_context *ctdb = rec->ctdb;
TALLOC_CTX *mem_ctx = talloc_new(ctdb);
- /* get the nodemap for all active remote nodes and verify
- they are the same as for this node
- */
+ /* Check flags from remote nodes */
for (j=0; j<nodemap->num; j++) {
struct ctdb_node_map_old *remote_nodemap=NULL;
uint32_t local_flags = nodemap->nodes[j].flags;
@@ -562,15 +563,7 @@ static int update_local_flags(struct ctdb_recoverd *rec, struct ctdb_node_map_ol
continue;
}
- ret = ctdb_ctrl_getnodemap(ctdb, CONTROL_TIMEOUT(), nodemap->nodes[j].pnn,
- mem_ctx, &remote_nodemap);
- if (ret != 0) {
- DEBUG(DEBUG_ERR, (__location__ " Unable to get nodemap from remote node %u\n",
- nodemap->nodes[j].pnn));
- ctdb_set_culprit(rec, nodemap->nodes[j].pnn);
- talloc_free(mem_ctx);
- return -1;
- }
+ remote_nodemap = remote_nodemaps[j];
remote_flags = remote_nodemap->nodes[j].flags;
if (local_flags != remote_flags) {
@@ -595,7 +588,6 @@ static int update_local_flags(struct ctdb_recoverd *rec, struct ctdb_node_map_ol
local_flags);
nodemap->nodes[j].flags = remote_flags;
}
- talloc_free(remote_nodemap);
}
talloc_free(mem_ctx);
return 0;
@@ -2212,10 +2204,12 @@ done:
data.dptr = (uint8_t *)&rd;
data.dsize = sizeof(rd);
- ret = ctdb_client_send_message(ctdb, rec->recmaster, CTDB_SRVID_TAKEOVER_RUN, data);
+ ret = ctdb_client_send_message(ctdb,
+ CTDB_BROADCAST_CONNECTED,
+ CTDB_SRVID_TAKEOVER_RUN,
+ data);
if (ret != 0) {
- DEBUG(DEBUG_ERR,
- ("Failed to send takeover run request\n"));
+ D_ERR("Failed to send takeover run request\n");
}
}
talloc_free(mem_ctx);
@@ -2223,37 +2217,94 @@ done:
}
-static void async_getnodemap_callback(struct ctdb_context *ctdb, uint32_t node_pnn, int32_t res, TDB_DATA outdata, void *callback_data)
+struct remote_nodemaps_state {
+ struct ctdb_node_map_old **remote_nodemaps;
+ struct ctdb_recoverd *rec;
+};
+
+static void async_getnodemap_callback(struct ctdb_context *ctdb,
+ uint32_t node_pnn,
+ int32_t res,
+ TDB_DATA outdata,
+ void *callback_data)
{
- struct ctdb_node_map_old **remote_nodemaps = callback_data;
+ struct remote_nodemaps_state *state =
+ (struct remote_nodemaps_state *)callback_data;
+ struct ctdb_node_map_old **remote_nodemaps = state->remote_nodemaps;
+ struct ctdb_node_map_old *nodemap = state->rec->nodemap;
+ size_t i;
+
+ for (i = 0; i < nodemap->num; i++) {
+ if (nodemap->nodes[i].pnn == node_pnn) {
+ break;
+ }
+ }
- if (node_pnn >= ctdb->num_nodes) {
- DEBUG(DEBUG_ERR,(__location__ " pnn from invalid node\n"));
+ if (i >= nodemap->num) {
+ DBG_ERR("Invalid PNN %"PRIu32"\n", node_pnn);
return;
}
- remote_nodemaps[node_pnn] = (struct ctdb_node_map_old *)talloc_steal(remote_nodemaps, outdata.dptr);
+ remote_nodemaps[i] = (struct ctdb_node_map_old *)talloc_steal(
+ remote_nodemaps, outdata.dptr);
+
+}
+
+static void async_getnodemap_error(struct ctdb_context *ctdb,
+ uint32_t node_pnn,
+ int32_t res,
+ TDB_DATA outdata,
+ void *callback_data)
+{
+ struct remote_nodemaps_state *state =
+ (struct remote_nodemaps_state *)callback_data;
+ struct ctdb_recoverd *rec = state->rec;
+ DBG_ERR("Failed to retrieve nodemap from node %u\n", node_pnn);
+ ctdb_set_culprit(rec, node_pnn);
}
-static int get_remote_nodemaps(struct ctdb_context *ctdb, TALLOC_CTX *mem_ctx,
- struct ctdb_node_map_old *nodemap,
- struct ctdb_node_map_old **remote_nodemaps)
+static int get_remote_nodemaps(struct ctdb_recoverd *rec,
+ TALLOC_CTX *mem_ctx,
+ struct ctdb_node_map_old ***remote_nodemaps)
{
+ struct ctdb_context *ctdb = rec->ctdb;
+ struct ctdb_node_map_old **t;
uint32_t *nodes;
+ struct remote_nodemaps_state state;
+ int ret;
- nodes = list_of_active_nodes(ctdb, nodemap, mem_ctx, true);
- if (ctdb_client_async_control(ctdb, CTDB_CONTROL_GET_NODEMAP,
- nodes, 0,
- CONTROL_TIMEOUT(), false, tdb_null,
+ t = talloc_zero_array(mem_ctx,
+ struct ctdb_node_map_old *,
+ rec->nodemap->num);
+ if (t == NULL) {
+ DBG_ERR("Memory allocation error\n");
+ return -1;
+ }
+
+ nodes = list_of_connected_nodes(ctdb, rec->nodemap, mem_ctx, false);
+
+ state.remote_nodemaps = t;
+ state.rec = rec;
+
+ ret = ctdb_client_async_control(ctdb,
+ CTDB_CONTROL_GET_NODEMAP,
+ nodes,
+ 0,
+ CONTROL_TIMEOUT(),
+ false,
+ tdb_null,
async_getnodemap_callback,
- NULL,
- remote_nodemaps) != 0) {
- DEBUG(DEBUG_ERR, (__location__ " Unable to pull all remote nodemaps\n"));
+ async_getnodemap_error,
+ &state);
+ talloc_free(nodes);
- return -1;
+ if (ret != 0) {
+ talloc_free(t);
+ return ret;
}
+ *remote_nodemaps = t;
return 0;
}
@@ -2498,10 +2549,17 @@ static void main_loop(struct ctdb_context *ctdb, struct ctdb_recoverd *rec,
}
- /* ensure our local copies of flags are right */
- ret = update_local_flags(rec, nodemap);
+ /* Get the nodemaps for all connected remote nodes */
+ ret = get_remote_nodemaps(rec, mem_ctx, &remote_nodemaps);
if (ret != 0) {
- DEBUG(DEBUG_ERR,("Unable to update local flags\n"));
+ DBG_ERR("Failed to read remote nodemaps\n");
+ return;
+ }
+
+ /* Ensure our local and remote flags are correct */
+ ret = update_flags(rec, nodemap, remote_nodemaps);
+ if (ret != 0) {
+ D_ERR("Unable to update flags\n");
return;
}
@@ -2574,33 +2632,14 @@ static void main_loop(struct ctdb_context *ctdb, struct ctdb_recoverd *rec,
goto takeover_run_checks;
}
- /* get the nodemap for all active remote nodes
- */
- remote_nodemaps = talloc_array(mem_ctx, struct ctdb_node_map_old *, nodemap->num);
- if (remote_nodemaps == NULL) {
- DEBUG(DEBUG_ERR, (__location__ " failed to allocate remote nodemap array\n"));
- return;
- }
- for(i=0; i<nodemap->num; i++) {
- remote_nodemaps[i] = NULL;
- }
- if (get_remote_nodemaps(ctdb, mem_ctx, nodemap, remote_nodemaps) != 0) {
- DEBUG(DEBUG_ERR,(__location__ " Failed to read remote nodemaps\n"));
- return;
- }
-
/* verify that all other nodes have the same nodemap as we have
*/
for (j=0; j<nodemap->num; j++) {
- if (nodemap->nodes[j].flags & NODE_FLAGS_INACTIVE) {
+ if (nodemap->nodes[j].pnn == ctdb->pnn) {
continue;
}
-
- if (remote_nodemaps[j] == NULL) {
- DEBUG(DEBUG_ERR,(__location__ " Did not get a remote nodemap for node %d, restarting monitoring\n", j));
- ctdb_set_culprit(rec, j);
-
- return;
+ if (nodemap->nodes[j].flags & NODE_FLAGS_INACTIVE) {
+ continue;
}
/* if the nodes disagree on how many nodes there are
@@ -2635,6 +2674,9 @@ static void main_loop(struct ctdb_context *ctdb, struct ctdb_recoverd *rec,
* up-to-date information for all the nodes.
*/
for (j=0; j<nodemap->num; j++) {
+ if (nodemap->nodes[j].pnn == ctdb->pnn) {
+ continue;
+ }
if (nodemap->nodes[j].flags & NODE_FLAGS_INACTIVE) {
continue;
}
@@ -2642,6 +2684,9 @@ static void main_loop(struct ctdb_context *ctdb, struct ctdb_recoverd *rec,
}
for (j=0; j<nodemap->num; j++) {
+ if (nodemap->nodes[j].pnn == ctdb->pnn) {
+ continue;
+ }
if (nodemap->nodes[j].flags & NODE_FLAGS_INACTIVE) {
continue;
}
--
Samba Shared Repository
More information about the samba-cvs
mailing list