[PATCH] ctdb: try to fix ctdb endless banning loop

Michael Adam obnox at samba.org
Wed Jun 1 09:49:02 UTC 2016


On 2016-06-01 at 15:58 +1000, Amitay Isaacs wrote:
> On Wed, Jun 1, 2016 at 12:28 PM, Michael Adam <obnox at samba.org> wrote:
> 
> > On 2016-06-01 at 12:15 +1000, Amitay Isaacs wrote:
> > > Here's a slightly better patch.
> >
> > That looks good! Thanks... I was still looking for the right
> > place to store the 'already-did-freeze' info. :-)
> >
> > Feel free to push with my r-b.
> > Will also run tests and provide feed-back.
> >
> > > Also, I have reverted d8f3b490bbb691c9916eed0df5b980c1aef23c85.
> >
> > As discussed I disagree with the revert (for now).
> > This is a possible further (re)optimization but
> > breaks separation that the patches had introduced.
> >
> >
> Agreed.  I like the clean separation of banning code too. :-)

I see you pushed it with the remaining dbgmsg patches,
but it did not land. I repushed it with a slightly
amended commit message that references the
new bug I just created for this:

https://bugzilla.samba.org/show_bug.cgi?id=11945

Find the updated patch attached for reference.

Thanks - Michael
-------------- next part --------------
From cab372e56cdd94f4cb97c51698a2ad1b96168399 Mon Sep 17 00:00:00 2001
From: Amitay Isaacs <amitay at gmail.com>
Date: Wed, 1 Jun 2016 12:10:46 +1000
Subject: [PATCH] ctdb-recoverd: Freeze databases whenever the node is INACTIVE

If the node becomes stopped or banned after recovery is marked
active, then it will never freeze the databases, and hence the
node will keep banning itself indefinitely, until ctdbd is restarted.

This is a regression from 4.3, introduced with

b4357a79d916b1f8ade8fa78563fbef0ce670aa9

and

d8f3b490bbb691c9916eed0df5b980c1aef23c85

BUG: https://bugzilla.samba.org/show_bug.cgi?id=11945

Signed-off-by: Amitay Isaacs <amitay at gmail.com>
Reviewed-by: Michael Adam <obnox at samba.org>
Reviewed-by: Martin Schwenke <martin at meltin.net>
---
 ctdb/server/ctdb_recoverd.c | 15 +++++++++++++--
 1 file changed, 13 insertions(+), 2 deletions(-)

diff --git a/ctdb/server/ctdb_recoverd.c b/ctdb/server/ctdb_recoverd.c
index 09940dc..cb5f8a3 100644
--- a/ctdb/server/ctdb_recoverd.c
+++ b/ctdb/server/ctdb_recoverd.c
@@ -257,6 +257,7 @@ struct ctdb_recoverd {
 	struct ctdb_iface_list_old *ifaces;
 	uint32_t *force_rebalance_nodes;
 	struct ctdb_node_capabilities *caps;
+	bool frozen_on_inactive;
 };
 
 #define CONTROL_TIMEOUT() timeval_current_ofs(ctdb->tunable.recover_timeout, 0)
@@ -3550,11 +3551,18 @@ static void main_loop(struct ctdb_context *ctdb, struct ctdb_recoverd *rec,
 
 				return;
 			}
-			ret = ctdb_ctrl_freeze(ctdb, CONTROL_TIMEOUT(), CTDB_CURRENT_NODE);
+		}
+		if (! rec->frozen_on_inactive) {
+			ret = ctdb_ctrl_freeze(ctdb, CONTROL_TIMEOUT(),
+					       CTDB_CURRENT_NODE);
 			if (ret != 0) {
-				DEBUG(DEBUG_ERR,(__location__ " Failed to freeze node in STOPPED or BANNED state\n"));
+				DEBUG(DEBUG_ERR,
+				      (__location__ " Failed to freeze node "
+				       "in STOPPED or BANNED state\n"));
 				return;
 			}
+
+			rec->frozen_on_inactive = true;
 		}
 
 		/* If this node is stopped or banned then it is not the recovery
@@ -3564,6 +3572,8 @@ static void main_loop(struct ctdb_context *ctdb, struct ctdb_recoverd *rec,
 		return;
 	}
 
+	rec->frozen_on_inactive = false;
+
 	/* Retrieve capabilities from all connected nodes */
 	ret = update_capabilities(rec, nodemap);
 	if (ret != 0) {
@@ -3901,6 +3911,7 @@ static void monitor_cluster(struct ctdb_context *ctdb)
 	CTDB_NO_MEMORY_FATAL(ctdb, rec->recovery);
 
 	rec->priority_time = timeval_current();
+	rec->frozen_on_inactive = false;
 
 	/* register a message port for sending memory dumps */
 	ctdb_client_set_message_handler(ctdb, CTDB_SRVID_MEM_DUMP, mem_dump_handler, rec);
-- 
2.5.5
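
For reference, the latch this patch adds can be sketched in isolation.
The following is only a minimal model of the control flow, not the real
recovery daemon code: struct recoverd_sketch, fake_freeze() and
main_loop_sketch() are hypothetical stand-ins for struct ctdb_recoverd,
ctdb_ctrl_freeze() and main_loop(), and the freeze is assumed to always
succeed.

```c
#include <stdbool.h>

/* Hypothetical stand-in for struct ctdb_recoverd: only the new flag
 * and a counter for observing freeze attempts are modelled. */
struct recoverd_sketch {
	bool frozen_on_inactive;
	int freeze_calls;
};

/* Hypothetical stand-in for ctdb_ctrl_freeze(); assumed to succeed. */
static int fake_freeze(struct recoverd_sketch *rec)
{
	rec->freeze_calls++;
	return 0;
}

/* One pass of a simplified main loop. node_inactive models the node
 * being STOPPED or BANNED. */
static void main_loop_sketch(struct recoverd_sketch *rec, bool node_inactive)
{
	if (node_inactive) {
		/* Freeze once per inactive period, regardless of the
		 * recovery mode the node was in when it went inactive. */
		if (!rec->frozen_on_inactive) {
			if (fake_freeze(rec) != 0) {
				/* Flag stays false, so the next pass retries. */
				return;
			}
			rec->frozen_on_inactive = true;
		}
		/* An inactive node does no further recovery work. */
		return;
	}

	/* Node is active again: re-arm the latch for the next
	 * inactive period. */
	rec->frozen_on_inactive = false;
}
```

Calling main_loop_sketch() repeatedly with node_inactive set freezes
only on the first pass; a pass with the node active resets the flag, so
a later stop or ban triggers a fresh freeze.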
