[SCM] Samba Shared Repository - branch v4-10-test updated

Karolin Seeger kseeger at samba.org
Wed Aug 28 12:08:02 UTC 2019


The branch, v4-10-test has been updated
       via  040a483956a ctdb-daemon: Make node inactive in the NODE_STOP control
       via  7dd839c7f2a ctdb-daemon: Drop unused function ctdb_local_node_got_banned()
       via  d14e656f21b ctdb-daemon: Switch banning code to use ctdb_node_become_inactive()
       via  916f0db0d1b ctdb-daemon: Factor out new function ctdb_node_become_inactive()
       via  e224ff934e1 ctdb-tcp: Mark node as disconnected if incoming connection goes away
       via  7f0af1f925f ctdb-tcp: Only mark a node connected if both directions are up
       via  cd0d85bb4e4 ctdb-tcp: Create outbound queue when the connection becomes writable
       via  e41e2feba0a ctdb-tcp: Use TALLOC_FREE()
       via  b31d8dc286c ctdb-tcp: Move incoming fd and queue into struct ctdb_tcp_node
       via  bf08a2d958b ctdb-tcp: Rename fd -> out_fd
       via  611610cff8d ctdb-daemon: Add function ctdb_ip_to_node()
       via  5684a9b8ab9 ctdb-daemon: Replace function ctdb_ip_to_nodeid() with ctdb_ip_to_pnn()
      from  52f6e7cd578 vfs_glusterfs: Enable profiling for file system operations

https://git.samba.org/?p=samba.git;a=shortlog;h=v4-10-test


- Log -----------------------------------------------------------------
commit 040a483956a5fb5f1cf175e91e02d47c000cd793
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Aug 19 21:48:04 2019 +1000

    ctdb-daemon: Make node inactive in the NODE_STOP control
    
    Currently some of this is supported by a periodic check in the
    recovery daemon's main_loop(), which notices the flag change, sets
    recovery mode active and freezes databases.  If STOP_NODE returns
    immediately then the associated recovery can complete and the node can
    be continued before databases are actually frozen.
    
    Instead, immediately do all of the things that make a node inactive.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14087
    RN: Stop "ctdb stop" from completing before freezing databases
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    
    Autobuild-User(master): Amitay Isaacs <amitay at samba.org>
    Autobuild-Date(master): Tue Aug 20 08:32:27 UTC 2019 on sn-devel-184
    
    (cherry picked from commit e9f2e205ee89f4f3d6302cc11b4d0eb2efaf0f53)
    
    Autobuild-User(v4-10-test): Karolin Seeger <kseeger at samba.org>
    Autobuild-Date(v4-10-test): Wed Aug 28 12:07:00 UTC 2019 on sn-devel-144
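
The ordering change described in the commit above can be illustrated with a
minimal sketch. It is not the ctdb code: struct sketch_ctx and the sketch_*
functions are hypothetical stand-ins, and the real freeze is requested
through a control rather than a boolean. The point is only that the stop
handler does the "become inactive" work itself before returning, instead of
leaving it to a later periodic check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical recovery modes, modelled on the names used in the commit */
enum sketch_recmode {
	SKETCH_RECOVERY_NORMAL,
	SKETCH_RECOVERY_ACTIVE
};

/* Hypothetical, heavily reduced daemon state (an assumption) */
struct sketch_ctx {
	bool stopped;                      /* stopped flag is set */
	enum sketch_recmode recovery_mode;
	bool generation_valid;             /* false: ignore database calls */
	bool freeze_requested;             /* database freeze initiated */
	unsigned int public_ips;           /* public IPs currently held */
};

/* Do everything that makes a node inactive, in one place */
static void sketch_node_become_inactive(struct sketch_ctx *c)
{
	assert(c != NULL);
	c->generation_valid = false;
	c->recovery_mode = SKETCH_RECOVERY_ACTIVE;
	c->freeze_requested = true;
	c->public_ips = 0;
}

/*
 * Stop handler: by the time this returns, the freeze has already
 * been requested, so a caller cannot observe "stopped but not yet
 * frozen" and continue the node too early.
 */
static int sketch_control_stop_node(struct sketch_ctx *c)
{
	assert(c != NULL);
	c->stopped = true;
	sketch_node_become_inactive(c);
	return 0;
}
```

In this model a caller that sees the stop handler return 0 can rely on
recovery mode being active and the freeze having been requested.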

commit 7dd839c7f2a2c01fec5a390fefe3aac554d5f871
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Aug 20 11:29:42 2019 +1000

    ctdb-daemon: Drop unused function ctdb_local_node_got_banned()
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14087
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit 91ac4c13d8472955d1f04bd775ec4b3ff8bf1b61)

commit d14e656f21b1df092f9697d26d53ecf5a075d492
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Aug 19 21:52:57 2019 +1000

    ctdb-daemon: Switch banning code to use ctdb_node_become_inactive()
    
    There's no reason to avoid immediately setting recovery mode to active
    and initiating freeze of databases.
    
    This effectively reverts the following commits:
    
      d8f3b490bbb691c9916eed0df5b980c1aef23c85
      b4357a79d916b1f8ade8fa78563fbef0ce670aa9
    
    The latter is now implemented using a control, resulting in looser
    coupling.
    
    See also the following commit:
    
      f8141e91a693912ea1107a49320e83702a80757a
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14087
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit 0f5f7b7cf4e970f3f36c5e0b3d09e710fe90801a)

commit 916f0db0d1ba4a1dd2f5a78c0b832f228b15066f
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Aug 19 21:47:03 2019 +1000

    ctdb-daemon: Factor out new function ctdb_node_become_inactive()
    
    This is a superset of ctdb_local_node_got_banned() so will replace
    that function, and will also be used in the NODE_STOP control.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14087
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit a42bcaabb63722411bee52b80cbfc795593defbc)

commit e224ff934e1f26201cb2f85e8e8ab1b382587979
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Aug 13 17:08:43 2019 +1000

    ctdb-tcp: Mark node as disconnected if incoming connection goes away
    
    To make it easy to pass the node data to the upcall, the private data
    for ctdb_tcp_read_cb() needs to be changed from tnode to node.
    
    RN: Avoid marking a node as connected before it can receive packets
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14084
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    
    Autobuild-User(master): Martin Schwenke <martins at samba.org>
    Autobuild-Date(master): Fri Aug 16 22:50:35 UTC 2019 on sn-devel-184
    
    (cherry picked from commit 73c850eda4209b688a169aeeb20c453b738cbb35)

commit 7f0af1f925f2efa866eb10d7a344f5f6808273fb
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Aug 9 15:33:05 2019 +1000

    ctdb-tcp: Only mark a node connected if both directions are up
    
    Nodes are currently marked as up if the outgoing connection is
    established.  However, if the incoming connection is not yet
    established then this node could send a request where the replying
    node can not queue its reply.  Wait until both directions are up
    before marking a node as connected.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14084
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit 8c98c10f242bc722beffc711e85c0e4f2e74cd57)
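
The rule in the commit above (report a node as connected only when both the
outgoing and the incoming connection are up, and as disconnected when either
goes away) can be sketched as a small state model. This is an illustration
under assumptions, not the ctdb-tcp implementation: struct sketch_tnode and
the sketch_* helpers are hypothetical, and booleans stand in for the real
file descriptors and queues.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-node transport state (not struct ctdb_tcp_node) */
struct sketch_tnode {
	bool out_up;    /* outgoing connection is established */
	bool in_up;     /* incoming connection has been accepted */
	bool connected; /* state last reported to the upper layer */
};

/*
 * Recompute the reported state after a direction changes.
 * Returns true if the reported state flipped, which is the point
 * where a transport would notify the upper layer.
 */
static bool sketch_update(struct sketch_tnode *t)
{
	bool now;

	assert(t != NULL);
	now = t->out_up && t->in_up;
	if (now == t->connected) {
		return false;
	}
	t->connected = now;
	return true;
}

/* Record a change of the outgoing direction */
static bool sketch_set_outgoing(struct sketch_tnode *t, bool up)
{
	assert(t != NULL);
	t->out_up = up;
	return sketch_update(t);
}

/* Record a change of the incoming direction */
static bool sketch_set_incoming(struct sketch_tnode *t, bool up)
{
	assert(t != NULL);
	t->in_up = up;
	return sketch_update(t);
}
```

With this model, bringing up only the outgoing side reports nothing; the
node becomes connected when the incoming side also comes up, and losing
either side reports it as disconnected again.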

commit cd0d85bb4e4506fa6e930c055b1750f52a44b85d
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Aug 15 15:57:31 2019 +1000

    ctdb-tcp: Create outbound queue when the connection becomes writable
    
    Since commit ddd97553f0a8bfaada178ec4a7460d76fa21f079
    ctdb_queue_send() doesn't queue a packet if the connection isn't yet
    established (i.e. when fd == -1).  So, don't bother creating the
    outbound queue during initialisation but create it when the connection
    becomes writable.
    
    Now the presence of the queue indicates that the outbound connection
    is up.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14084
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit 7f4854d9643a096a6d8a354fcd27b7c6ed24a75e)

commit e41e2feba0a92121ff975a62d42dfaa8998f11d5
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Aug 15 15:45:16 2019 +1000

    ctdb-tcp: Use TALLOC_FREE()
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14084
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit d80d9edb4dc107b15a35a39e5c966a3eaed6453a)

commit b31d8dc286c557ab03feef19e135d307a9889e22
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Aug 9 15:29:36 2019 +1000

    ctdb-tcp: Move incoming fd and queue into struct ctdb_tcp_node
    
    This makes it easy to track both incoming and outgoing connectivity
    states.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14084
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit c68b6f96f26664459187ab2fbd56767fb31767e0)

commit bf08a2d958bd2eb389882cc8fbe39790114c87a8
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Aug 9 15:06:34 2019 +1000

    ctdb-tcp: Rename fd -> out_fd
    
    in_fd is coming soon.
    
    Fix coding style violations in the affected and adjacent lines.
    Modernise some debug macros and make them more consistent (e.g. drop
    logging of errno when strerror(errno) is already logged).
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14084
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit c06620169fc178ea6db2631f03edf008285d8cf2)

commit 611610cff8d9505621fb4b462ebac8222d45dc46
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Aug 8 16:20:44 2019 +1000

    ctdb-daemon: Add function ctdb_ip_to_node()
    
    This is the core logic from ctdb_ip_to_pnn(), so re-implement that
    function using ctdb_ip_to_node().
    
    Something similar (ctdb_ip_to_nodeid()) was recently removed in commit
    010c1d77cd7e192b1fff39b7b91fccbdbbf4a786 because it wasn't required.
    Now there is a use case.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=14084
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit 3acb8e9d1c854b577d6be282257269df83055d31)

commit 5684a9b8ab93e56017dd95b2b6d4f5d8a3565121
Author: Martin Schwenke <martin at meltin.net>
Date:   Sat Jun 22 05:53:15 2019 +1000

    ctdb-daemon: Replace function ctdb_ip_to_nodeid() with ctdb_ip_to_pnn()
    
    Node ID is a poorly defined concept, indicating the slot in the node
    map where the IP address was found.  This signed value also ends up
    compared to num_nodes, which is unsigned, producing unwanted warnings.
    
    Just return the PNN because this is what both callers really want.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit 010c1d77cd7e192b1fff39b7b91fccbdbbf4a786)
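
The lookup pattern described in the two commits above can be sketched as
follows: one function returns the node structure (or NULL), and a thin
wrapper turns that into an unsigned PNN with a sentinel for "unknown", so
no signed index is ever compared against an unsigned node count. The types
and names here are hypothetical (struct sketch_node, SKETCH_UNKNOWN_PNN,
string addresses compared with strcmp()); the real code uses
ctdb_sock_addr, ctdb_same_ip() and CTDB_UNKNOWN_PNN.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sentinel standing in for CTDB_UNKNOWN_PNN */
#define SKETCH_UNKNOWN_PNN UINT32_MAX

/* Hypothetical node entry; addresses are plain strings here */
struct sketch_node {
	const char *address;
	uint32_t pnn;
	bool deleted;
};

/* Return the node for ip, or NULL if ip is unknown or the node is deleted */
static const struct sketch_node *
sketch_ip_to_node(const struct sketch_node *nodes,
		  unsigned int num_nodes,
		  const char *ip)
{
	unsigned int i; /* unsigned, matching the type of num_nodes */

	assert(nodes != NULL && ip != NULL);
	for (i = 0; i < num_nodes; i++) {
		if (nodes[i].deleted) {
			continue;
		}
		if (strcmp(nodes[i].address, ip) == 0) {
			return &nodes[i];
		}
	}

	return NULL;
}

/* Return the PNN for ip, or SKETCH_UNKNOWN_PNN if there is no such node */
static uint32_t sketch_ip_to_pnn(const struct sketch_node *nodes,
				 unsigned int num_nodes,
				 const char *ip)
{
	const struct sketch_node *node;

	node = sketch_ip_to_node(nodes, num_nodes, ip);
	if (node == NULL) {
		return SKETCH_UNKNOWN_PNN;
	}

	return node->pnn;
}
```

Callers that only need the PNN use the wrapper; callers that need more
per-node state (as the TCP transport does later in this series) use the
node lookup directly.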

-----------------------------------------------------------------------

Summary of changes:
 ctdb/include/ctdb_private.h |   8 +-
 ctdb/server/ctdb_banning.c  |  26 +-----
 ctdb/server/ctdb_daemon.c   |  11 +--
 ctdb/server/ctdb_recover.c  |  45 ++++++++++
 ctdb/server/ctdb_server.c   |  28 ++++--
 ctdb/tcp/ctdb_tcp.h         |  16 ++--
 ctdb/tcp/tcp_connect.c      | 212 +++++++++++++++++++++++++++++---------------
 ctdb/tcp/tcp_init.c         |  21 +++--
 ctdb/tcp/tcp_io.c           |  17 +++-
 9 files changed, 249 insertions(+), 135 deletions(-)


Changeset truncated at 500 lines:

diff --git a/ctdb/include/ctdb_private.h b/ctdb/include/ctdb_private.h
index ea00bb12128..0c66725d36c 100644
--- a/ctdb/include/ctdb_private.h
+++ b/ctdb/include/ctdb_private.h
@@ -481,7 +481,6 @@ int ctdb_ibw_init(struct ctdb_context *ctdb);
 
 /* from ctdb_banning.c */
 
-void ctdb_local_node_got_banned(struct ctdb_context *ctdb);
 int32_t ctdb_control_set_ban_state(struct ctdb_context *ctdb, TDB_DATA indata);
 int32_t ctdb_control_get_ban_state(struct ctdb_context *ctdb, TDB_DATA *outdata);
 void ctdb_ban_self(struct ctdb_context *ctdb);
@@ -829,6 +828,8 @@ int32_t ctdb_control_recd_ping(struct ctdb_context *ctdb);
 int32_t ctdb_control_set_recmaster(struct ctdb_context *ctdb,
 				   uint32_t opcode, TDB_DATA indata);
 
+void ctdb_node_become_inactive(struct ctdb_context *ctdb);
+
 int32_t ctdb_control_stop_node(struct ctdb_context *ctdb);
 int32_t ctdb_control_continue_node(struct ctdb_context *ctdb);
 
@@ -841,7 +842,10 @@ void ctdb_stop_recoverd(struct ctdb_context *ctdb);
 
 int ctdb_set_transport(struct ctdb_context *ctdb, const char *transport);
 
-int ctdb_ip_to_nodeid(struct ctdb_context *ctdb, const ctdb_sock_addr *nodeip);
+struct ctdb_node *ctdb_ip_to_node(struct ctdb_context *ctdb,
+				  const ctdb_sock_addr *nodeip);
+uint32_t ctdb_ip_to_pnn(struct ctdb_context *ctdb,
+			const ctdb_sock_addr *nodeip);
 
 void ctdb_load_nodes_file(struct ctdb_context *ctdb);
 
diff --git a/ctdb/server/ctdb_banning.c b/ctdb/server/ctdb_banning.c
index 9cd163645a1..3c711575e8c 100644
--- a/ctdb/server/ctdb_banning.c
+++ b/ctdb/server/ctdb_banning.c
@@ -57,30 +57,6 @@ static void ctdb_ban_node_event(struct tevent_context *ev,
 	}
 }
 
-void ctdb_local_node_got_banned(struct ctdb_context *ctdb)
-{
-	struct ctdb_db_context *ctdb_db;
-
-	DEBUG(DEBUG_NOTICE, ("This node has been banned - releasing all public "
-			     "IPs and setting the generation to INVALID.\n"));
-
-	/* Reset the generation id to 1 to make us ignore any
-	   REQ/REPLY CALL/DMASTER someone sends to us.
-	   We are now banned so we shouldnt service database calls
-	   anymore.
-	*/
-	ctdb->vnn_map->generation = INVALID_GENERATION;
-	for (ctdb_db = ctdb->db_list; ctdb_db != NULL; ctdb_db = ctdb_db->next) {
-		ctdb_db->generation = INVALID_GENERATION;
-	}
-
-	/* Recovery daemon will set the recovery mode ACTIVE and freeze
-	 * databases.
-	 */
-
-	ctdb_release_all_ips(ctdb);
-}
-
 int32_t ctdb_control_set_ban_state(struct ctdb_context *ctdb, TDB_DATA indata)
 {
 	struct ctdb_ban_state *bantime = (struct ctdb_ban_state *)indata.dptr;
@@ -129,7 +105,7 @@ int32_t ctdb_control_set_ban_state(struct ctdb_context *ctdb, TDB_DATA indata)
 			 ctdb_ban_node_event, ctdb);
 
 	if (!already_banned) {
-		ctdb_local_node_got_banned(ctdb);
+		ctdb_node_become_inactive(ctdb);
 	}
 	return 0;
 }
diff --git a/ctdb/server/ctdb_daemon.c b/ctdb/server/ctdb_daemon.c
index aa0694548f8..95b5b6381de 100644
--- a/ctdb/server/ctdb_daemon.c
+++ b/ctdb/server/ctdb_daemon.c
@@ -1251,21 +1251,18 @@ static void ctdb_initialise_vnn_map(struct ctdb_context *ctdb)
 
 static void ctdb_set_my_pnn(struct ctdb_context *ctdb)
 {
-	int nodeid;
-
 	if (ctdb->address == NULL) {
 		ctdb_fatal(ctdb,
 			   "Can not determine PNN - node address is not set\n");
 	}
 
-	nodeid = ctdb_ip_to_nodeid(ctdb, ctdb->address);
-	if (nodeid == -1) {
+	ctdb->pnn = ctdb_ip_to_pnn(ctdb, ctdb->address);
+	if (ctdb->pnn == CTDB_UNKNOWN_PNN) {
 		ctdb_fatal(ctdb,
-			   "Can not determine PNN - node address not found in node list\n");
+			   "Can not determine PNN - unknown node address\n");
 	}
 
-	ctdb->pnn = ctdb->nodes[nodeid]->pnn;
-	DEBUG(DEBUG_NOTICE, ("PNN is %u\n", ctdb->pnn));
+	D_NOTICE("PNN is %u\n", ctdb->pnn);
 }
 
 /*
diff --git a/ctdb/server/ctdb_recover.c b/ctdb/server/ctdb_recover.c
index cfe77f643a6..f7a73982a71 100644
--- a/ctdb/server/ctdb_recover.c
+++ b/ctdb/server/ctdb_recover.c
@@ -1418,12 +1418,57 @@ int32_t ctdb_control_set_recmaster(struct ctdb_context *ctdb, uint32_t opcode, T
 	return 0;
 }
 
+void ctdb_node_become_inactive(struct ctdb_context *ctdb)
+{
+	struct ctdb_db_context *ctdb_db;
+
+	D_WARNING("Making node INACTIVE\n");
+
+	/*
+	 * Do not service database calls - reset generation to invalid
+	 * so this node ignores any REQ/REPLY CALL/DMASTER
+	 */
+	ctdb->vnn_map->generation = INVALID_GENERATION;
+	for (ctdb_db = ctdb->db_list; ctdb_db != NULL; ctdb_db = ctdb_db->next) {
+		ctdb_db->generation = INVALID_GENERATION;
+	}
+
+	/*
+	 * Although this bypasses the control, the only thing missing
+	 * is the deferred drop of all public IPs, which isn't
+	 * necessary because they are dropped below
+	 */
+	if (ctdb->recovery_mode != CTDB_RECOVERY_ACTIVE) {
+		D_NOTICE("Recovery mode set to ACTIVE\n");
+		ctdb->recovery_mode = CTDB_RECOVERY_ACTIVE;
+	}
+
+	/*
+	 * Initiate database freeze - this will be scheduled for
+	 * immediate execution and will be in progress long before the
+	 * calling control returns
+	 */
+	ctdb_daemon_send_control(ctdb,
+				 ctdb->pnn,
+				 0,
+				 CTDB_CONTROL_FREEZE,
+				 0,
+				 CTDB_CTRL_FLAG_NOREPLY,
+				 tdb_null,
+				 NULL,
+				 NULL);
+
+	D_NOTICE("Dropping all public IP addresses\n");
+	ctdb_release_all_ips(ctdb);
+}
 
 int32_t ctdb_control_stop_node(struct ctdb_context *ctdb)
 {
 	DEBUG(DEBUG_ERR, ("Stopping node\n"));
 	ctdb->nodes[ctdb->pnn]->flags |= NODE_FLAGS_STOPPED;
 
+	ctdb_node_become_inactive(ctdb);
+
 	return 0;
 }
 
diff --git a/ctdb/server/ctdb_server.c b/ctdb/server/ctdb_server.c
index c991b85d99b..ddff85b81c5 100644
--- a/ctdb/server/ctdb_server.c
+++ b/ctdb/server/ctdb_server.c
@@ -45,24 +45,36 @@ int ctdb_set_transport(struct ctdb_context *ctdb, const char *transport)
 	return 0;
 }
 
-/*
-  Check whether an ip is a valid node ip
-  Returns the node id for this ip address or -1
-*/
-int ctdb_ip_to_nodeid(struct ctdb_context *ctdb, const ctdb_sock_addr *nodeip)
+/* Return the node structure for nodeip, NULL if nodeip is invalid */
+struct ctdb_node *ctdb_ip_to_node(struct ctdb_context *ctdb,
+				  const ctdb_sock_addr *nodeip)
 {
-	int nodeid;
+	unsigned int nodeid;
 
 	for (nodeid=0;nodeid<ctdb->num_nodes;nodeid++) {
 		if (ctdb->nodes[nodeid]->flags & NODE_FLAGS_DELETED) {
 			continue;
 		}
 		if (ctdb_same_ip(&ctdb->nodes[nodeid]->address, nodeip)) {
-			return nodeid;
+			return ctdb->nodes[nodeid];
 		}
 	}
 
-	return -1;
+	return NULL;
+}
+
+/* Return the PNN for nodeip, CTDB_UNKNOWN_PNN if nodeip is invalid */
+uint32_t ctdb_ip_to_pnn(struct ctdb_context *ctdb,
+			const ctdb_sock_addr *nodeip)
+{
+	struct ctdb_node *node;
+
+	node = ctdb_ip_to_node(ctdb, nodeip);
+	if (node == NULL) {
+		return CTDB_UNKNOWN_PNN;
+	}
+
+	return node->pnn;
 }
 
 /* Load a nodes list file into a nodes array */
diff --git a/ctdb/tcp/ctdb_tcp.h b/ctdb/tcp/ctdb_tcp.h
index 0a998c94da4..9a615fc6393 100644
--- a/ctdb/tcp/ctdb_tcp.h
+++ b/ctdb/tcp/ctdb_tcp.h
@@ -26,23 +26,19 @@ struct ctdb_tcp {
 	int listen_fd;
 };
 
-/*
-  state associated with an incoming connection
-*/
-struct ctdb_incoming {
-	struct ctdb_context *ctdb;
-	int fd;
-	struct ctdb_queue *queue;
-};
-
 /*
   state associated with one tcp node
 */
 struct ctdb_tcp_node {
-	int fd;
+	int out_fd;
 	struct ctdb_queue *out_queue;
+
 	struct tevent_fd *connect_fde;
 	struct tevent_timer *connect_te;
+
+	struct ctdb_context *ctdb;
+	int in_fd;
+	struct ctdb_queue *in_queue;
 };
 
 
diff --git a/ctdb/tcp/tcp_connect.c b/ctdb/tcp/tcp_connect.c
index 385547e0e78..f02340c1789 100644
--- a/ctdb/tcp/tcp_connect.c
+++ b/ctdb/tcp/tcp_connect.c
@@ -44,15 +44,13 @@ void ctdb_tcp_stop_connection(struct ctdb_node *node)
 {
 	struct ctdb_tcp_node *tnode = talloc_get_type(
 		node->private_data, struct ctdb_tcp_node);
-	
-	ctdb_queue_set_fd(tnode->out_queue, -1);
-	talloc_free(tnode->connect_te);
-	talloc_free(tnode->connect_fde);
-	tnode->connect_fde = NULL;
-	tnode->connect_te = NULL;
-	if (tnode->fd != -1) {
-		close(tnode->fd);
-		tnode->fd = -1;
+
+	TALLOC_FREE(tnode->out_queue);
+	TALLOC_FREE(tnode->connect_te);
+	TALLOC_FREE(tnode->connect_fde);
+	if (tnode->out_fd != -1) {
+		close(tnode->out_fd);
+		tnode->out_fd = -1;
 	}
 }
 
@@ -93,12 +91,13 @@ static void ctdb_node_connect_write(struct tevent_context *ev,
 	int error = 0;
 	socklen_t len = sizeof(error);
 	int one = 1;
+	int ret;
 
 	talloc_free(tnode->connect_te);
 	tnode->connect_te = NULL;
 
-	if (getsockopt(tnode->fd, SOL_SOCKET, SO_ERROR, &error, &len) != 0 ||
-	    error != 0) {
+	ret = getsockopt(tnode->out_fd, SOL_SOCKET, SO_ERROR, &error, &len);
+	if (ret != 0 || error != 0) {
 		ctdb_tcp_stop_connection(node);
 		tnode->connect_te = tevent_add_timer(ctdb->ev, tnode,
 						    timeval_current_ofs(1, 0),
@@ -109,22 +108,54 @@ static void ctdb_node_connect_write(struct tevent_context *ev,
 	talloc_free(tnode->connect_fde);
 	tnode->connect_fde = NULL;
 
-        if (setsockopt(tnode->fd,IPPROTO_TCP,TCP_NODELAY,(char *)&one,sizeof(one)) == -1) {
-		DEBUG(DEBUG_WARNING, ("Failed to set TCP_NODELAY on fd - %s\n",
-				      strerror(errno)));
+	ret = setsockopt(tnode->out_fd,
+			 IPPROTO_TCP,
+			 TCP_NODELAY,
+			 (char *)&one,
+			 sizeof(one));
+	if (ret == -1) {
+		DBG_WARNING("Failed to set TCP_NODELAY on fd - %s\n",
+			  strerror(errno));
 	}
-        if (setsockopt(tnode->fd,SOL_SOCKET,SO_KEEPALIVE,(char *)&one,sizeof(one)) == -1) {
-		DEBUG(DEBUG_WARNING, ("Failed to set KEEPALIVE on fd - %s\n",
-				      strerror(errno)));
+	ret = setsockopt(tnode->out_fd,
+			 SOL_SOCKET,
+			 SO_KEEPALIVE,(char *)&one,
+			 sizeof(one));
+	if (ret == -1) {
+		DBG_WARNING("Failed to set KEEPALIVE on fd - %s\n",
+			    strerror(errno));
 	}
 
-	ctdb_queue_set_fd(tnode->out_queue, tnode->fd);
+	tnode->out_queue = ctdb_queue_setup(node->ctdb,
+					    tnode,
+					    tnode->out_fd,
+					    CTDB_TCP_ALIGNMENT,
+					    ctdb_tcp_tnode_cb,
+					    node,
+					    "to-node-%s",
+					    node->name);
+	if (tnode->out_queue == NULL) {
+		DBG_ERR("Failed to set up outgoing queue\n");
+		ctdb_tcp_stop_connection(node);
+		tnode->connect_te = tevent_add_timer(ctdb->ev,
+						     tnode,
+						     timeval_current_ofs(1, 0),
+						     ctdb_tcp_node_connect,
+						     node);
+		return;
+	}
 
 	/* the queue subsystem now owns this fd */
-	tnode->fd = -1;
+	tnode->out_fd = -1;
 
-	/* tell the ctdb layer we are connected */
-	node->ctdb->upcalls->node_connected(node);
+	/*
+	 * Mark the node to which this connection has been established
+	 * as connected, but only if the corresponding listening
+	 * socket is also connected
+	 */
+	if (tnode->in_fd != -1) {
+		node->ctdb->upcalls->node_connected(node);
+	}
 }
 
 
@@ -149,26 +180,24 @@ void ctdb_tcp_node_connect(struct tevent_context *ev, struct tevent_timer *te,
 
 	sock_out = node->address;
 
-	tnode->fd = socket(sock_out.sa.sa_family, SOCK_STREAM, IPPROTO_TCP);
-	if (tnode->fd == -1) {
-		DEBUG(DEBUG_ERR, (__location__ " Failed to create socket\n"));
+	tnode->out_fd = socket(sock_out.sa.sa_family, SOCK_STREAM, IPPROTO_TCP);
+	if (tnode->out_fd == -1) {
+		DBG_ERR("Failed to create socket\n");
 		return;
 	}
 
-	ret = set_blocking(tnode->fd, false);
+	ret = set_blocking(tnode->out_fd, false);
 	if (ret != 0) {
-		DEBUG(DEBUG_ERR,
-		      (__location__
-		       " failed to set socket non-blocking (%s)\n",
-		       strerror(errno)));
-		close(tnode->fd);
-		tnode->fd = -1;
+		DBG_ERR("Failed to set socket non-blocking (%s)\n",
+			strerror(errno));
+		close(tnode->out_fd);
+		tnode->out_fd = -1;
 		return;
 	}
 
-	set_close_on_exec(tnode->fd);
+	set_close_on_exec(tnode->out_fd);
 
-	DEBUG(DEBUG_DEBUG, (__location__ " Created TCP SOCKET FD:%d\n", tnode->fd));
+	DBG_DEBUG("Created TCP SOCKET FD:%d\n", tnode->out_fd);
 
 	/* Bind our side of the socketpair to the same address we use to listen
 	 * on incoming CTDB traffic.
@@ -197,39 +226,48 @@ void ctdb_tcp_node_connect(struct tevent_context *ev, struct tevent_timer *te,
 	default:
 		DEBUG(DEBUG_ERR, (__location__ " unknown family %u\n",
 			sock_in.sa.sa_family));
-		close(tnode->fd);
-		tnode->fd = -1;
+		close(tnode->out_fd);
+		tnode->out_fd = -1;
 		return;
 	}
 
-	if (bind(tnode->fd, (struct sockaddr *)&sock_in, sockin_size) == -1) {
-		DEBUG(DEBUG_ERR, (__location__ " Failed to bind socket %s(%d)\n",
-				  strerror(errno), errno));
-		close(tnode->fd);
-		tnode->fd = -1;
+	ret = bind(tnode->out_fd, (struct sockaddr *)&sock_in, sockin_size);
+	if (ret == -1) {
+		DBG_ERR("Failed to bind socket (%s)\n", strerror(errno));
+		close(tnode->out_fd);
+		tnode->out_fd = -1;
 		return;
 	}
 
-	if (connect(tnode->fd, (struct sockaddr *)&sock_out, sockout_size) != 0 &&
-	    errno != EINPROGRESS) {
+	ret = connect(tnode->out_fd,
+		      (struct sockaddr *)&sock_out,
+		      sockout_size);
+	if (ret != 0 && errno != EINPROGRESS) {
 		ctdb_tcp_stop_connection(node);
-		tnode->connect_te = tevent_add_timer(ctdb->ev, tnode,
+		tnode->connect_te = tevent_add_timer(ctdb->ev,
+						     tnode,
 						     timeval_current_ofs(1, 0),
-						     ctdb_tcp_node_connect, node);
+						     ctdb_tcp_node_connect,
+						     node);
 		return;
 	}
 
 	/* non-blocking connect - wait for write event */
-	tnode->connect_fde = tevent_add_fd(node->ctdb->ev, tnode, tnode->fd,
+	tnode->connect_fde = tevent_add_fd(node->ctdb->ev,
+					   tnode,
+					   tnode->out_fd,
 					   TEVENT_FD_WRITE|TEVENT_FD_READ,
-					   ctdb_node_connect_write, node);
+					   ctdb_node_connect_write,
+					   node);
 
 	/* don't give it long to connect - retry in one second. This ensures
 	   that we find a node is up quickly (tcp normally backs off a syn reply
 	   delay by quite a lot) */
-	tnode->connect_te = tevent_add_timer(ctdb->ev, tnode,
+	tnode->connect_te = tevent_add_timer(ctdb->ev,
+					     tnode,
 					     timeval_current_ofs(1, 0),
-					     ctdb_tcp_node_connect, node);
+					     ctdb_tcp_node_connect,
+					     node);
 }
 
 /*
@@ -244,8 +282,9 @@ static void ctdb_listen_event(struct tevent_context *ev, struct tevent_fd *fde,
 	struct ctdb_tcp *ctcp = talloc_get_type(ctdb->private_data, struct ctdb_tcp);
 	ctdb_sock_addr addr;
 	socklen_t len;
-	int fd, nodeid;
-	struct ctdb_incoming *in;
+	int fd;
+	struct ctdb_node *node;
+	struct ctdb_tcp_node *tnode;
 	int one = 1;
 	int ret;
 
@@ -255,41 +294,70 @@ static void ctdb_listen_event(struct tevent_context *ev, struct tevent_fd *fde,
 	if (fd == -1) return;
 	smb_set_close_on_exec(fd);
 
-	nodeid = ctdb_ip_to_nodeid(ctdb, &addr);
-
-	if (nodeid == -1) {
-		DEBUG(DEBUG_ERR, ("Refused connection from unknown node %s\n", ctdb_addr_to_str(&addr)));
+	node = ctdb_ip_to_node(ctdb, &addr);
+	if (node == NULL) {
+		D_ERR("Refused connection from unknown node %s\n",
+		      ctdb_addr_to_str(&addr));
 		close(fd);
 		return;
 	}
 
-	in = talloc_zero(ctcp, struct ctdb_incoming);
-	in->fd = fd;
-	in->ctdb = ctdb;
+	tnode = talloc_get_type_abort(node->private_data,
+				      struct ctdb_tcp_node);
+	if (tnode == NULL) {
+		/* This can't happen - see ctdb_tcp_initialise() */
+		DBG_ERR("INTERNAL ERROR setting up connection from node %s\n",
+			ctdb_addr_to_str(&addr));
+		close(fd);
+		return;


-- 
Samba Shared Repository


