[SCM] Samba Shared Repository - branch v4-13-test updated

Jule Anger janger at samba.org
Mon Sep 13 14:13:01 UTC 2021


The branch, v4-13-test has been updated
       via  cea68cbf537 ctdb-daemon: Don't mark a node as unhealthy when connecting to it
       via  479fc4fee0c ctdb-daemon: Ignore flag changes for disconnected nodes
       via  cc3ce341ee1 ctdb-daemon: Simplify ctdb_control_modflags()
       via  3ab6be4f7bc ctdb-recoverd: Mark CTDB_SRVID_SET_NODE_FLAGS obsolete
       via  7c4daa7ffa0 ctdb-daemon: Don't bother sending CTDB_SRVID_SET_NODE_FLAGS
       via  c4d7ed5eac4 ctdb-daemon: Modernise remaining debug macro in this function
       via  3d2313dc906 ctdb-daemon: Update logging for flag changes
       via  85372296a7e ctdb-daemon: Correct the condition for logging unchanged flags
       via  c89f30810d3 ctdb-tools: Use disable and enable controls in tool
       via  75b8b5de3e8 ctdb-client: Add client code for disable/enable controls
       via  ce58aefb4ee ctdb_daemon: Implement controls DISABLE_NODE/ENABLE_NODE
       via  7aac8fd9e5e ctdb-daemon: Start as disabled means PERMANENTLY_DISABLED
       via  65f9b5520d2 ctdb-daemon: Factor out a function to get node structure from PNN
       via  e3578ea22cb ctdb-daemon: Add a helper variable
       via  3d797b570b0 ctdb-protocol: Add marshalling for controls DISABLE_NODE/ENABLE_NODE
       via  ac8bbe2d0ae ctdb-protocol: Add new controls to disable and enable nodes
       via  74aa5b204e2 ctdb-recoverd: Push flags for a node if any remote node disagrees
       via  e93c885426d ctdb-recoverd: Update the local node map before pushing out flags
       via  76f8dffb527 ctdb-recoverd: Add a helper variable
      from  4ada6c24a5c selftest: Add prefix to new schema attributes to avoid flapping dsdb_schema_attributes

https://git.samba.org/?p=samba.git;a=shortlog;h=v4-13-test


- Log -----------------------------------------------------------------
commit cea68cbf537b6d44eb199126dc2ccf97fd3fff55
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 9 17:25:32 2021 +1000

    ctdb-daemon: Don't mark a node as unhealthy when connecting to it
    
    Remote nodes are already initialised as UNHEALTHY when the node list
    is initialised at startup (ctdb_load_nodes_file() calls
    convert_node_map_to_list()) and when disconnected (ctdb_node_dead()).
    So, drop this code.
    
    RN: Fix CTDB flag/status update race conditions
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    
    Autobuild-User(master): Amitay Isaacs <amitay at samba.org>
    Autobuild-Date(master): Thu Sep  9 02:38:34 UTC 2021 on sn-devel-184
    
    (cherry picked from commit 9e7d2d9794af7251c42cb22f23ee9f86c6ea05c1)
    
    Autobuild-User(v4-13-test): Jule Anger <janger at samba.org>
    Autobuild-Date(v4-13-test): Mon Sep 13 14:13:00 UTC 2021 on sn-devel-184

commit 479fc4fee0c78dd8e6fcab929480d08ec5ccfba2
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jul 27 15:50:54 2021 +1000

    ctdb-daemon: Ignore flag changes for disconnected nodes
    
    If this node is not connected to a node then we shouldn't know
    anything about it.  The state will be pushed later by the recovery
    master.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit 7f697b1938efb3972f03f25546bf807d5af9a26c)

commit cc3ce341ee17d46bc8461b8628641d9f7c0c033c
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jul 8 11:11:11 2021 +1000

    ctdb-daemon: Simplify ctdb_control_modflags()
    
    Now that there are separate disable/enable controls used by the ctdb
    tool this control can ignore any flag updates for the current node.
    These only come from the recovery master, which depends on being able
    to fetch flags for all nodes.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit ae10a8a4b70e53ea3be6257d1f86f2d9a56aa62a)
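
The filtering described in this commit and the one above it can be sketched roughly as follows. This is a minimal illustration loosely modelled on the ctdb_control_modflags() hunk in this changeset, not the daemon code itself: the sk_ names are hypothetical and the value of SK_DISCONNECTED is an assumption made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit value, for illustration only. */
#define SK_DISCONNECTED 0x1u

enum sk_result { SK_IGNORED, SK_UNCHANGED, SK_CHANGED };

/*
 * Hypothetical stand-in for the flag update decision:
 *  - updates that target the receiving node itself are ignored;
 *  - updates for a node this node is not connected to are ignored;
 *  - otherwise the new flags are taken, except that the local
 *    DISCONNECTED bit is preserved.
 */
static enum sk_result sk_modflags(uint32_t self_pnn,
				  uint32_t target_pnn,
				  uint32_t *flags,
				  uint32_t new_flags)
{
	uint32_t old_flags;

	assert(flags != NULL);

	if (target_pnn == self_pnn) {
		return SK_IGNORED;
	}
	if (*flags & SK_DISCONNECTED) {
		return SK_IGNORED;
	}

	old_flags = *flags;
	*flags = (old_flags & SK_DISCONNECTED) |
		 (new_flags & ~SK_DISCONNECTED);

	return (*flags == old_flags) ? SK_UNCHANGED : SK_CHANGED;
}
```

Returning a three-way result is only a convenience for the sketch; it makes the "ignored" and "unchanged" cases easy to tell apart when experimenting.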

commit 3ab6be4f7bc672c719ea6891736ecc6448bab1be
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jan 17 19:04:34 2018 +1100

    ctdb-recoverd: Mark CTDB_SRVID_SET_NODE_FLAGS obsolete
    
    CTDB_SRVID_SET_NODE_FLAGS is no longer sent so drop monitor_handler()
    and replace with srvid_not_implemented().  Mark the SRVID obsolete in
    its comment.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit 916c5ee131dc5c7f1d9c3540147d1f915c8302ad)

commit 7c4daa7ffa05c2fb6ef710ba107cdb47a0e57811
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jul 8 11:32:20 2021 +1000

    ctdb-daemon: Don't bother sending CTDB_SRVID_SET_NODE_FLAGS
    
    The code that handles this message is
    ctdb_recoverd.c:monitor_handler().  Although it appears to do
    something potentially useful, it only logs the flags changes.  All
    changes made are to local structures - there are no actual
    side-effects.
    
    It used to trigger a takeover run when the DISABLED flag changed.
    This was dropped back in commit
    662f06de9fdce7b1bc1772a4fbe43de271564917.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit e75256767fffc6a7ac0b97e58737a39c63c8b187)

commit c4d7ed5eac4ddd971181af13f5ca32c443f0a79a
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jul 8 11:34:49 2021 +1000

    ctdb-daemon: Modernise remaining debug macro in this function
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit 0132bd5a2233193256af434a37506f86ed62c075)

commit 3d2313dc906b1794d1cc3235fcd10c7b6ea5d874
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jul 8 11:29:38 2021 +1000

    ctdb-daemon: Update logging for flag changes
    
    When flags change, promote the message to NOTICE level and switch the
    message to the style that is currently generated by
    ctdb-recoverd.c:monitor_handler().  This will allow monitor_handler()
    to go away in future.
    
    Drop logging when flags do not change.  The recovery master now logs
    when it pushes flags for a node, so the lack of a corresponding
    "changed flags" message here indicates that no update was required.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit b6d25d079e30919457cacbfbbfd670bf88295a9c)

commit 85372296a7ed90f9873261a4e4ad5c6fb518c502
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 9 15:13:49 2021 +1000

    ctdb-daemon: Correct the condition for logging unchanged flags
    
    Don't trust the old flags from the recovery master.
    
    Surrounding code will change in future commits, including the use of
    old-style debug macros, so just make this change clear.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit eec44e286250a6ee7b5c42d85d632bdc300a409f)

commit c89f30810d3c036bbe8a0acc28b0d741ee2408be
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 9 14:37:19 2021 +1000

    ctdb-tools: Use disable and enable controls in tool
    
    Note that there is a change from broadcast to a directed control here.
    This is OK because the recovery master will push flags if any nodes
    disagree with the canonical flags fetched from a node.
    
    Static function ctdb_ctrl_modflags() is no longer used so drop it.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit 5914054698dab934fd4db5efb9d211b2fdc40bb9)

commit 75b8b5de3e835b5eeaeca7cc6100b1e538c88d9c
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 9 14:32:12 2021 +1000

    ctdb-client: Add client code for disable/enable controls
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit 6fe6a54e7f32e650be6ab36041159081dbde5165)

commit ce58aefb4ee23df9a1d8461e1ca3c55f43aa5889
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 9 14:12:59 2021 +1000

    ctdb_daemon: Implement controls DISABLE_NODE/ENABLE_NODE
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit 15a6489c288b3adb635a728cb2049621ab1a07f7)

commit 7aac8fd9e5e6ebd404f8eb7d568e5b3d7e11fa8b
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 9 14:02:28 2021 +1000

    ctdb-daemon: Start as disabled means PERMANENTLY_DISABLED
    
    DISABLED is UNHEALTHY | PERMANENTLY_DISABLED, which is not what is
    intended here.  Luckily, it doesn't do any harm because nodes are
    marked unhealthy at startup anyway.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit 60c1ef146538d90f97b7823459f7548ca5fa6dd3)
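
The difference between the two flags can be illustrated with a small bitmask sketch. The SK_ names and the numeric values are assumptions chosen for the example; the authoritative definitions live in ctdb/protocol/protocol.h.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, for illustration only. */
#define SK_UNHEALTHY            0x2u
#define SK_PERMANENTLY_DISABLED 0x4u
/* The composite flag described in the commit message. */
#define SK_DISABLED (SK_UNHEALTHY | SK_PERMANENTLY_DISABLED)

/* Old behaviour: starting disabled also sets UNHEALTHY. */
static uint32_t sk_start_disabled_old(uint32_t flags)
{
	return flags | SK_DISABLED;
}

/* New behaviour: only the administrative bit is set. */
static uint32_t sk_start_disabled_new(uint32_t flags)
{
	return flags | SK_PERMANENTLY_DISABLED;
}

/*
 * Nodes are marked unhealthy at startup anyway, so with UNHEALTHY
 * already set both variants give the same result.
 */
static void sk_check_startup_equivalence(void)
{
	assert(sk_start_disabled_old(SK_UNHEALTHY) ==
	       sk_start_disabled_new(SK_UNHEALTHY));
}
```

Starting from flags of 0 the two variants differ by the UNHEALTHY bit, which is the unintended part the commit removes.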

commit 65f9b5520d20ee404ffca87b282773fe171fe3d8
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 9 14:01:33 2021 +1000

    ctdb-daemon: Factor out a function to get node structure from PNN
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit 1ac7bc7532b2fad791d0e53effa7c64cdc73c4eb)

commit e3578ea22cb5dcd2bbba3d96fb9eeac52da55be9
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jul 28 10:27:42 2021 +1000

    ctdb-daemon: Add a helper variable
    
    Simplifies a subsequent change.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit e0a7b5a9e866452b1faaed86a105492fe7b237e2)

commit 3d797b570b024c5b490664fff3580bd54e39270d
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 9 12:10:12 2021 +1000

    ctdb-protocol: Add marshalling for controls DISABLE_NODE/ENABLE_NODE
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit 6845dca87e6ffc5e449fb78d23eb9c7a22698b80)

commit ac8bbe2d0aeb5ed18816c0fabc125bef5ff609b0
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Jul 8 17:28:20 2021 +1000

    ctdb-protocol: Add new controls to disable and enable nodes
    
    These are CTDB_CONTROL_DISABLE_NODE and CTDB_CONTROL_ENABLE_NODE.
    
    For consistency these match CTDB_CONTROL_STOP_NODE and
    CTDB_CONTROL_CONTINUE_NODE.  It would be possible to add a single
    control but it would need to take data.
    
    The aim is to finally fix races in flag handling.  Previous fixes have
    improved the situation but they have only narrowed the race window.
    The problem is that the recovery daemon on the master node pushes
    flags to nodes the same way that disable and enable are implemented.
    So the following sequence is still racy:
    
    1. Node A is disabled
    2. Recovery master pulls flags from all nodes including A
    3. Node A is enabled
    4. Recovery master notices A is disabled and pushes a flag update to
       all nodes including node A
    5. Node A is erroneously marked disabled
    
    Node A can not tell if the MODIFY_FLAGS control is from a "ctdb
    disable" command or a flag update from the recovery master.
    
    The solution is to use a different mechanism for disable/enable and
    for a node to ignore MODIFY_FLAGS controls for their own flags.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit 49dc5d8cd2d3767044ac69cbd25c8210d11cadf7)
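
The five-step sequence above can be replayed in a toy model. This is only a sketch under simplifying assumptions: a single node, a single hypothetical flag bit (SKETCH_FLAG_DISABLED), and invented sketch_ helper names. It does not use any real CTDB control.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bit for the sketch; not taken from protocol.h. */
#define SKETCH_FLAG_DISABLED 0x4u

struct sketch_node {
	uint32_t pnn;   /* this node's number */
	uint32_t flags; /* this node's view of its own flags */
};

/* Old behaviour: a pushed update overwrites the node's own flags. */
static void sketch_push_old(struct sketch_node *n,
			    uint32_t target_pnn,
			    uint32_t new_flags)
{
	if (target_pnn == n->pnn) {
		n->flags = new_flags;
	}
}

/* New behaviour: a pushed update for the node's own flags is ignored. */
static void sketch_push_new(struct sketch_node *n,
			    uint32_t target_pnn,
			    uint32_t new_flags)
{
	if (target_pnn == n->pnn) {
		return;
	}
	/* Flags of other nodes would be updated here. */
	(void)new_flags;
}

/* Replay steps 1-5 and return node A's final flags. */
static uint32_t sketch_replay(bool ignore_own)
{
	struct sketch_node a = { .pnn = 0, .flags = 0 };
	uint32_t snapshot;

	a.flags |= SKETCH_FLAG_DISABLED;        /* 1. A is disabled */
	snapshot = a.flags;                     /* 2. master pulls flags */
	assert(snapshot & SKETCH_FLAG_DISABLED);
	a.flags &= ~SKETCH_FLAG_DISABLED;       /* 3. A is enabled */
	if (ignore_own) {                       /* 4. stale snapshot pushed */
		sketch_push_new(&a, a.pnn, snapshot);
	} else {
		sketch_push_old(&a, a.pnn, snapshot);
	}
	return a.flags;                         /* 5. final state of A */
}
```

With ignore_own set to false the replay ends with the disabled bit set again, which is the erroneous outcome in step 5; with ignore_own set to true node A stays enabled.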

commit 74aa5b204e2e20b594b093342578151ab7cc3f9f
Author: Martin Schwenke <martin at meltin.net>
Date:   Sun Jul 11 22:17:08 2021 +1000

    ctdb-recoverd: Push flags for a node if any remote node disagrees
    
    This will usually happen if flags on the node in question change, so
    keeping the code simple and pushing to all nodes won't hurt.  When all
    nodes come up there might be differences in connected nodes, causing
    such "fix ups".  Receiving nodes will ignore no-op pushes.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit 8305f6a7f132f03b0bbdb26692b7491fd3f6c24f)
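
The "push if any remote node disagrees" rule amounts to a simple comparison loop. The sketch below assumes the canonical flags for a node have already been fetched and that each remote node's view is available in an array; sk_needs_push and its parameters are hypothetical names, not functions from ctdb_recoverd.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * canonical: flags fetched from the node in question.
 * views[i]:  what remote node i believes those flags to be.
 * Returns true if at least one remote view differs, in which case
 * the flags would be pushed to all nodes.
 */
static bool sk_needs_push(uint32_t canonical,
			  const uint32_t *views,
			  size_t num_views)
{
	size_t i;

	assert(views != NULL || num_views == 0);

	for (i = 0; i < num_views; i++) {
		if (views[i] != canonical) {
			return true;
		}
	}

	return false;
}
```

Pushing to every node rather than only to the ones that disagree keeps the logic simple, and the commit message notes that receiving nodes ignore no-op pushes.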

commit e93c885426dd1ad3e13750deda634c90e08bb2e5
Author: Martin Schwenke <martin at meltin.net>
Date:   Sun Jul 11 21:28:43 2021 +1000

    ctdb-recoverd: Update the local node map before pushing out flags
    
    The resulting code structure looks a little weird.  However, there is
    another condition that requires the flags to be pushed that will be
    inserted before the continue statement in a subsequent commit.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit 620d07871420cdbfa055c1ace75ec1ac4c32721d)

commit 76f8dffb527caa5e12a9a4922f4315bf8a5d2ac5
Author: Martin Schwenke <martin at meltin.net>
Date:   Sun Jul 11 20:40:10 2021 +1000

    ctdb-recoverd: Add a helper variable
    
    Improves readability and simplifies subsequent changes.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14784
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit 82a075d4d734588a42fca7ebaf529892d1eba853)

-----------------------------------------------------------------------

Summary of changes:
 ctdb/client/client_control_sync.c          |  68 ++++++++++++++++
 ctdb/client/client_sync.h                  |  12 +++
 ctdb/include/ctdb_private.h                |   2 +
 ctdb/protocol/protocol.h                   |   4 +-
 ctdb/protocol/protocol_api.h               |   6 ++
 ctdb/protocol/protocol_client.c            |  36 +++++++++
 ctdb/protocol/protocol_control.c           |  12 +++
 ctdb/protocol/protocol_debug.c             |   2 +
 ctdb/server/ctdb_control.c                 |  42 ++++++++++
 ctdb/server/ctdb_daemon.c                  |  35 +++++++--
 ctdb/server/ctdb_monitor.c                 |  67 ++++++++--------
 ctdb/server/ctdb_recoverd.c                | 120 +++++++++++++++--------------
 ctdb/server/ctdb_server.c                  |   1 -
 ctdb/tests/UNIT/cunit/protocol_test_101.sh |   2 +-
 ctdb/tests/src/fake_ctdbd.c                |  54 +++++++++++++
 ctdb/tests/src/protocol_common_ctdb.c      |  24 ++++++
 ctdb/tests/src/protocol_ctdb_test.c        |   2 +-
 ctdb/tools/ctdb.c                          |  57 +++-----------
 18 files changed, 400 insertions(+), 146 deletions(-)


Changeset truncated at 500 lines:

diff --git a/ctdb/client/client_control_sync.c b/ctdb/client/client_control_sync.c
index e56a2b2f18d..29e0249198c 100644
--- a/ctdb/client/client_control_sync.c
+++ b/ctdb/client/client_control_sync.c
@@ -2718,3 +2718,71 @@ int ctdb_ctrl_tunnel_deregister(TALLOC_CTX *mem_ctx, struct tevent_context *ev,
 
 	return 0;
 }
+
+int ctdb_ctrl_disable_node(TALLOC_CTX *mem_ctx,
+			   struct tevent_context *ev,
+			   struct ctdb_client_context *client,
+			   int destnode,
+			   struct timeval timeout)
+{
+	struct ctdb_req_control request;
+	struct ctdb_reply_control *reply;
+	int ret;
+
+	ctdb_req_control_disable_node(&request);
+	ret = ctdb_client_control(mem_ctx,
+				  ev,
+				  client,
+				  destnode,
+				  timeout,
+				  &request,
+				  &reply);
+	if (ret != 0) {
+		D_ERR("Control DISABLE_NODE failed to node %u, ret=%d\n",
+		      destnode,
+		      ret);
+		return ret;
+	}
+
+	ret = ctdb_reply_control_disable_node(reply);
+	if (ret != 0) {
+		D_ERR("Control DISABLE_NODE failed, ret=%d\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
+
+int ctdb_ctrl_enable_node(TALLOC_CTX *mem_ctx,
+			  struct tevent_context *ev,
+			  struct ctdb_client_context *client,
+			  int destnode,
+			  struct timeval timeout)
+{
+	struct ctdb_req_control request;
+	struct ctdb_reply_control *reply;
+	int ret;
+
+	ctdb_req_control_enable_node(&request);
+	ret = ctdb_client_control(mem_ctx,
+				  ev,
+				  client,
+				  destnode,
+				  timeout,
+				  &request,
+				  &reply);
+	if (ret != 0) {
+		D_ERR("Control ENABLE_NODE failed to node %u, ret=%d\n",
+		      destnode,
+		      ret);
+		return ret;
+	}
+
+	ret = ctdb_reply_control_enable_node(reply);
+	if (ret != 0) {
+		D_ERR("Control ENABLE_NODE failed, ret=%d\n", ret);
+		return ret;
+	}
+
+	return 0;
+}
diff --git a/ctdb/client/client_sync.h b/ctdb/client/client_sync.h
index b29e669fba4..25a9615098c 100644
--- a/ctdb/client/client_sync.h
+++ b/ctdb/client/client_sync.h
@@ -491,6 +491,18 @@ int ctdb_ctrl_tunnel_deregister(TALLOC_CTX *mem_ctx, struct tevent_context *ev,
 				int destnode, struct timeval timeout,
 				uint64_t tunnel_id);
 
+int ctdb_ctrl_disable_node(TALLOC_CTX *mem_ctx,
+			   struct tevent_context *ev,
+			   struct ctdb_client_context *client,
+			   int destnode,
+			   struct timeval timeout);
+
+int ctdb_ctrl_enable_node(TALLOC_CTX *mem_ctx,
+			  struct tevent_context *ev,
+			  struct ctdb_client_context *client,
+			  int destnode,
+			  struct timeval timeout);
+
 /* from client/client_message_sync.c */
 
 int ctdb_message_recd_update_ip(TALLOC_CTX *mem_ctx, struct tevent_context *ev,
diff --git a/ctdb/include/ctdb_private.h b/ctdb/include/ctdb_private.h
index 9ca87332d61..6f4111f1a18 100644
--- a/ctdb/include/ctdb_private.h
+++ b/ctdb/include/ctdb_private.h
@@ -565,6 +565,8 @@ int daemon_deregister_message_handler(struct ctdb_context *ctdb,
 void daemon_tunnel_handler(uint64_t tunnel_id, TDB_DATA data,
 			   void *private_data);
 
+struct ctdb_node *ctdb_find_node(struct ctdb_context *ctdb, uint32_t pnn);
+
 int ctdb_start_daemon(struct ctdb_context *ctdb,
 		      bool interactive,
 		      bool test_mode_enabled);
diff --git a/ctdb/protocol/protocol.h b/ctdb/protocol/protocol.h
index 35543a67cf9..403d66c3972 100644
--- a/ctdb/protocol/protocol.h
+++ b/ctdb/protocol/protocol.h
@@ -137,7 +137,7 @@ struct ctdb_call {
 /* SRVID to inform clients that an IP address has been taken over */
 #define CTDB_SRVID_TAKE_IP 0xF301000000000000LL
 
-/* SRVID to inform recovery daemon of the node flags */
+/* SRVID to inform recovery daemon of the node flags - OBSOLETE */
 #define CTDB_SRVID_SET_NODE_FLAGS 0xF400000000000000LL
 
 /* SRVID to inform recovery daemon to update public ip assignment */
@@ -376,6 +376,8 @@ enum ctdb_controls {CTDB_CONTROL_PROCESS_EXISTS          = 0,
 		    CTDB_CONTROL_VACUUM_FETCH            = 154,
 		    CTDB_CONTROL_DB_VACUUM               = 155,
 		    CTDB_CONTROL_ECHO_DATA               = 156,
+		    CTDB_CONTROL_DISABLE_NODE            = 157,
+		    CTDB_CONTROL_ENABLE_NODE             = 158,
 };
 
 #define MAX_COUNT_BUCKETS 16
diff --git a/ctdb/protocol/protocol_api.h b/ctdb/protocol/protocol_api.h
index bdb4bc0e2ea..b7fcc53dd68 100644
--- a/ctdb/protocol/protocol_api.h
+++ b/ctdb/protocol/protocol_api.h
@@ -615,6 +615,12 @@ void ctdb_req_control_echo_data(struct ctdb_req_control *request,
 				struct ctdb_echo_data *echo_data);
 int ctdb_reply_control_echo_data(struct ctdb_reply_control *reply);
 
+void ctdb_req_control_disable_node(struct ctdb_req_control *request);
+int ctdb_reply_control_disable_node(struct ctdb_reply_control *reply);
+
+void ctdb_req_control_enable_node(struct ctdb_req_control *request);
+int ctdb_reply_control_enable_node(struct ctdb_reply_control *reply);
+
 /* From protocol/protocol_debug.c */
 
 void ctdb_packet_print(uint8_t *buf, size_t buflen, FILE *fp);
diff --git a/ctdb/protocol/protocol_client.c b/ctdb/protocol/protocol_client.c
index cde544feb52..71d2f0144b3 100644
--- a/ctdb/protocol/protocol_client.c
+++ b/ctdb/protocol/protocol_client.c
@@ -2409,3 +2409,39 @@ int ctdb_reply_control_echo_data(struct ctdb_reply_control *reply)
 
 	return reply->status;
 }
+
+/* CTDB_CONTROL_DISABLE_NODE */
+
+void ctdb_req_control_disable_node(struct ctdb_req_control *request)
+{
+	request->opcode = CTDB_CONTROL_DISABLE_NODE;
+	request->pad = 0;
+	request->srvid = 0;
+	request->client_id = 0;
+	request->flags = 0;
+
+	request->rdata.opcode = CTDB_CONTROL_DISABLE_NODE;
+}
+
+int ctdb_reply_control_disable_node(struct ctdb_reply_control *reply)
+{
+	return ctdb_reply_control_generic(reply, CTDB_CONTROL_DISABLE_NODE);
+}
+
+/* CTDB_CONTROL_ENABLE_NODE */
+
+void ctdb_req_control_enable_node(struct ctdb_req_control *request)
+{
+	request->opcode = CTDB_CONTROL_ENABLE_NODE;
+	request->pad = 0;
+	request->srvid = 0;
+	request->client_id = 0;
+	request->flags = 0;
+
+	request->rdata.opcode = CTDB_CONTROL_ENABLE_NODE;
+}
+
+int ctdb_reply_control_enable_node(struct ctdb_reply_control *reply)
+{
+	return ctdb_reply_control_generic(reply, CTDB_CONTROL_ENABLE_NODE);
+}
diff --git a/ctdb/protocol/protocol_control.c b/ctdb/protocol/protocol_control.c
index 4fd5a5a7d4d..076863278a3 100644
--- a/ctdb/protocol/protocol_control.c
+++ b/ctdb/protocol/protocol_control.c
@@ -419,6 +419,12 @@ static size_t ctdb_req_control_data_len(struct ctdb_req_control_data *cd)
 	case CTDB_CONTROL_ECHO_DATA:
 		len = ctdb_echo_data_len(cd->data.echo_data);
 		break;
+
+	case CTDB_CONTROL_DISABLE_NODE:
+		break;
+
+	case CTDB_CONTROL_ENABLE_NODE:
+		break;
 	}
 
 	return len;
@@ -1418,6 +1424,12 @@ static size_t ctdb_reply_control_data_len(struct ctdb_reply_control_data *cd)
 	case CTDB_CONTROL_ECHO_DATA:
 		len = ctdb_echo_data_len(cd->data.echo_data);
 		break;
+
+	case CTDB_CONTROL_DISABLE_NODE:
+		break;
+
+	case CTDB_CONTROL_ENABLE_NODE:
+		break;
 	}
 
 	return len;
diff --git a/ctdb/protocol/protocol_debug.c b/ctdb/protocol/protocol_debug.c
index 56f14e32b09..2e5ed9f0ced 100644
--- a/ctdb/protocol/protocol_debug.c
+++ b/ctdb/protocol/protocol_debug.c
@@ -245,6 +245,8 @@ static void ctdb_opcode_print(uint32_t opcode, FILE *fp)
 		{ CTDB_CONTROL_VACUUM_FETCH, "VACUUM_FETCH" },
 		{ CTDB_CONTROL_DB_VACUUM, "DB_VACUUM" },
 		{ CTDB_CONTROL_ECHO_DATA, "ECHO_DATA" },
+		{ CTDB_CONTROL_DISABLE_NODE, "DISABLE_NODE" },
+		{ CTDB_CONTROL_ENABLE_NODE, "ENABLE_NODE" },
 		{ MAP_END, "" },
 	};
 
diff --git a/ctdb/server/ctdb_control.c b/ctdb/server/ctdb_control.c
index 95f3b175934..a9d1aa1b438 100644
--- a/ctdb/server/ctdb_control.c
+++ b/ctdb/server/ctdb_control.c
@@ -173,6 +173,40 @@ done:
 	TALLOC_FREE(state);
 }
 
+static int ctdb_control_disable_node(struct ctdb_context *ctdb)
+{
+	struct ctdb_node *node;
+
+	node = ctdb_find_node(ctdb, CTDB_CURRENT_NODE);
+	if (node == NULL) {
+		/* Can't happen */
+		DBG_ERR("Unable to find current node\n");
+		return -1;
+	}
+
+	D_ERR("Disable node\n");
+	node->flags |= NODE_FLAGS_PERMANENTLY_DISABLED;
+
+	return 0;
+}
+
+static int ctdb_control_enable_node(struct ctdb_context *ctdb)
+{
+	struct ctdb_node *node;
+
+	node = ctdb_find_node(ctdb, CTDB_CURRENT_NODE);
+	if (node == NULL) {
+		/* Can't happen */
+		DBG_ERR("Unable to find current node\n");
+		return -1;
+	}
+
+	D_ERR("Enable node\n");
+	node->flags &= ~NODE_FLAGS_PERMANENTLY_DISABLED;
+
+	return 0;
+}
+
 /*
   process a control request
  */
@@ -828,6 +862,14 @@ static int32_t ctdb_control_dispatch(struct ctdb_context *ctdb,
 		return ctdb_control_echo_data(ctdb, c, indata, async_reply);
 	}
 
+	case CTDB_CONTROL_DISABLE_NODE:
+		CHECK_CONTROL_DATA_SIZE(0);
+		return ctdb_control_disable_node(ctdb);
+
+	case CTDB_CONTROL_ENABLE_NODE:
+		CHECK_CONTROL_DATA_SIZE(0);
+		return ctdb_control_enable_node(ctdb);
+
 	default:
 		DEBUG(DEBUG_CRIT,(__location__ " Unknown CTDB control opcode %u\n", opcode));
 		return -1;
diff --git a/ctdb/server/ctdb_daemon.c b/ctdb/server/ctdb_daemon.c
index 7ebb419bc1f..f64a0475348 100644
--- a/ctdb/server/ctdb_daemon.c
+++ b/ctdb/server/ctdb_daemon.c
@@ -1225,28 +1225,51 @@ failed:
 	return -1;	
 }
 
-static void initialise_node_flags (struct ctdb_context *ctdb)
+struct ctdb_node *ctdb_find_node(struct ctdb_context *ctdb, uint32_t pnn)
 {
+	struct ctdb_node *node = NULL;
 	unsigned int i;
 
+	if (pnn == CTDB_CURRENT_NODE) {
+		pnn = ctdb->pnn;
+	}
+
 	/* Always found: PNN correctly set just before this is called */
 	for (i = 0; i < ctdb->num_nodes; i++) {
-		if (ctdb->pnn == ctdb->nodes[i]->pnn) {
-			break;
+		node = ctdb->nodes[i];
+		if (pnn == node->pnn) {
+			return node;
 		}
 	}
 
-	ctdb->nodes[i]->flags &= ~NODE_FLAGS_DISCONNECTED;
+	return NULL;
+}
+
+static void initialise_node_flags (struct ctdb_context *ctdb)
+{
+	struct ctdb_node *node = NULL;
+
+	node = ctdb_find_node(ctdb, CTDB_CURRENT_NODE);
+	/*
+	 * PNN correctly set just before this is called so always
+	 * found but keep static analysers happy...
+	 */
+	if (node == NULL) {
+		DBG_ERR("Unable to find current node\n");
+		return;
+	}
+
+	node->flags &= ~NODE_FLAGS_DISCONNECTED;
 
 	/* do we start out in DISABLED mode? */
 	if (ctdb->start_as_disabled != 0) {
 		D_ERR("This node is configured to start in DISABLED state\n");
-		ctdb->nodes[i]->flags |= NODE_FLAGS_DISABLED;
+		node->flags |= NODE_FLAGS_PERMANENTLY_DISABLED;
 	}
 	/* do we start out in STOPPED mode? */
 	if (ctdb->start_as_stopped != 0) {
 		D_ERR("This node is configured to start in STOPPED state\n");
-		ctdb->nodes[i]->flags |= NODE_FLAGS_STOPPED;
+		node->flags |= NODE_FLAGS_STOPPED;
 	}
 }
 
diff --git a/ctdb/server/ctdb_monitor.c b/ctdb/server/ctdb_monitor.c
index 5c694bde969..ab58ec485fe 100644
--- a/ctdb/server/ctdb_monitor.c
+++ b/ctdb/server/ctdb_monitor.c
@@ -455,52 +455,55 @@ int32_t ctdb_control_modflags(struct ctdb_context *ctdb, TDB_DATA indata)
 	struct ctdb_node *node;
 	uint32_t old_flags;
 
-	if (c->pnn >= ctdb->num_nodes) {
-		DEBUG(DEBUG_ERR,(__location__ " Node %d is invalid, num_nodes :%d\n", c->pnn, ctdb->num_nodes));
-		return -1;
+	/*
+	 * Don't let other nodes override the current node's flags.
+	 * The recovery master fetches flags from this node so there's
+	 * no need to push them back.  Doing so is racy.
+	 */
+	if (c->pnn == ctdb->pnn) {
+		DBG_DEBUG("Ignoring flag changes for current node\n");
+		return 0;
 	}
 
-	node         = ctdb->nodes[c->pnn];
-	old_flags    = node->flags;
-	if (c->pnn != ctdb->pnn) {
-		c->old_flags  = node->flags;
+	node = ctdb_find_node(ctdb, c->pnn);
+	if (node == NULL) {
+		DBG_ERR("Node %u is invalid\n", c->pnn);
+		return -1;
 	}
-	node->flags   = c->new_flags & ~NODE_FLAGS_DISCONNECTED;
-	node->flags  |= (c->old_flags & NODE_FLAGS_DISCONNECTED);
 
-	/* we don't let other nodes modify our STOPPED status */
-	if (c->pnn == ctdb->pnn) {
-		node->flags &= ~NODE_FLAGS_STOPPED;
-		if (old_flags & NODE_FLAGS_STOPPED) {
-			node->flags |= NODE_FLAGS_STOPPED;
-		}
+	if (node->flags & NODE_FLAGS_DISCONNECTED) {
+		DBG_DEBUG("Ignoring flag changes for disconnected node\n");
+		return 0;
 	}
 
-	/* we don't let other nodes modify our BANNED status */
-	if (c->pnn == ctdb->pnn) {
-		node->flags &= ~NODE_FLAGS_BANNED;
-		if (old_flags & NODE_FLAGS_BANNED) {
-			node->flags |= NODE_FLAGS_BANNED;
-		}
-	}
+	/*
+	 * Remember the old flags.  We don't care what some other node
+	 * thought the old flags were - that's irrelevant.
+	 */
+	old_flags = node->flags;
 
-	if (node->flags == c->old_flags) {
-		DEBUG(DEBUG_INFO, ("Control modflags on node %u - Unchanged - flags 0x%x\n", c->pnn, node->flags));
+	/*
+	 * This node tracks nodes it is connected to, so don't let
+	 * another node override this
+	 */
+	node->flags =
+		(old_flags & NODE_FLAGS_DISCONNECTED) |
+		(c->new_flags & ~NODE_FLAGS_DISCONNECTED);
+
+	if (node->flags == old_flags) {
 		return 0;
 	}
 
-	DEBUG(DEBUG_INFO, ("Control modflags on node %u - flags now 0x%x\n", c->pnn, node->flags));
+	D_NOTICE("Node %u has changed flags - 0x%x -> 0x%x\n",
+		 c->pnn,
+		 old_flags,
+		 node->flags);
 
 	if (node->flags == 0 && ctdb->runstate <= CTDB_RUNSTATE_STARTUP) {
-		DEBUG(DEBUG_ERR, (__location__ " Node %u became healthy - force recovery for startup\n",
-				  c->pnn));
+		DBG_ERR("Node %u became healthy - force recovery for startup\n",
+			c->pnn);
 		ctdb->recovery_mode = CTDB_RECOVERY_ACTIVE;
 	}
 
-	/* tell the recovery daemon something has changed */
-	c->new_flags = node->flags;
-	ctdb_daemon_send_message(ctdb, ctdb->pnn,
-				 CTDB_SRVID_SET_NODE_FLAGS, indata);
-
 	return 0;
 }
diff --git a/ctdb/server/ctdb_recoverd.c b/ctdb/server/ctdb_recoverd.c
index 856fcbb72c8..28ef9468cd4 100644
--- a/ctdb/server/ctdb_recoverd.c
+++ b/ctdb/server/ctdb_recoverd.c
@@ -553,40 +553,73 @@ static int update_flags(struct ctdb_recoverd *rec,
 	for (j=0; j<nodemap->num; j++) {
 		struct ctdb_node_map_old *remote_nodemap=NULL;
 		uint32_t local_flags = nodemap->nodes[j].flags;
+		uint32_t remote_pnn = nodemap->nodes[j].pnn;
 		uint32_t remote_flags;
+		unsigned int i;
 		int ret;
 
 		if (local_flags & NODE_FLAGS_DISCONNECTED) {
 			continue;
 		}
-		if (nodemap->nodes[j].pnn == ctdb->pnn) {
-			continue;
+		if (remote_pnn == ctdb->pnn) {
+			/*
+			 * No remote nodemap for this node since this
+			 * is the local nodemap.  However, still need
+			 * to check this against the remote nodes and
+			 * push it if they are out-of-date.
+			 */
+			goto compare_remotes;
 		}
 
 		remote_nodemap = remote_nodemaps[j];
 		remote_flags = remote_nodemap->nodes[j].flags;
 
 		if (local_flags != remote_flags) {
-			ret = update_flags_on_all_nodes(rec,
-							nodemap->nodes[j].pnn,
-							remote_flags);
-			if (ret != 0) {
-				DBG_ERR(
-				    "Unable to update flags on remote nodes\n");
-				talloc_free(mem_ctx);
-				return -1;
-			}
-
 			/*
 			 * Update the local copy of the flags in the
 			 * recovery daemon.
 			 */
 			D_NOTICE("Remote node %u had flags 0x%x, "
 				 "local had 0x%x - updating local\n",


-- 
Samba Shared Repository


