[SCM] CTDB repository - branch master updated - ctdb-2.2-59-gef1c4e9

Mon Jul 1 22:41:19 MDT 2013

The branch, master has been updated
       via  ef1c4e99ca66e7a990bc557f34abb624c315e6ba (commit)
       via  fcd5e1f04c5fe6c98399429b8f0918b8779acba6 (commit)
       via  932360992b08a5483d90c0590218ba0fd756119e (commit)
       via  741944f118e98f178b860194eecb215180949d18 (commit)
       via  ac06c46e4a80c635f6094b5ac6f0bf3e3a02db95 (commit)
       via  df30c0a05ed908fc2a997c56ff5484736b23b70f (commit)
       via  14399de1dd0bd8dabf1f48b1457e3ccb37589d8a (commit)
       via  aea12dce83ef385e9fb3bc03ac7ace0874a0e3fe (commit)
       via  ae1693905036ecdbc4594fde1f12500faae4a554 (commit)
       via  593a17678fbd3109e118154b034d43b852659518 (commit)
       via  93bcb6617e1024f810533e12390a572f51703ca0 (commit)
       via  815ddd3341b7e9db39e05a3a3fcd9a1420f053bc (commit)
       via  2396981c4bcf30530aeb7f4395093cc202105b50 (commit)
       via  38304f88e0c634e97d4687c25adef975f71537b8 (commit)
       via  a60f228f8380f222f838eb619d2ab55f96f11ac2 (commit)
       via  297d93cecc3c0655e72ecac38508e113bdbeab9c (commit)
       via  bb178338658b4ae32382a1f62f7c21cee1d4878f (commit)
       via  6a9dbb8fb0f1f6e8c206189cdc2d33bb371ea2a8 (commit)
       via  8d622660a14c929e365d306147b378ea6ab92175 (commit)
       via  34af2cdf686d5d77854cbaa7bbcd8f878e9171c7 (commit)
       via  c6f8407648abb37f2ed781afa5171dad8c9f59e9 (commit)
       via  46efe7a886f8c4c56f19536adc98a73c22db906a (commit)
       via  87716e8f504d659515d3dbcf93badbf106873bc8 (commit)
       via  478e24bceda3fedfba54ccb48faa115df726b819 (commit)
       via  4be8dff3a4451192f838497b4747273685959bed (commit)
       via  7eb2f89979360b6cc98ca9b17c48310277fa89fc (commit)
       via  4f87925a287f612a6ab3b5da1a387a31c7bea28f (commit)
      from  733fc909425860f6a02c205c2d8f34a731853922 (commit)

http://gitweb.samba.org/?p=ctdb.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit ef1c4e99ca66e7a990bc557f34abb624c315e6ba
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Tue Jul 2 12:40:37 2013 +1000

    ctdbd: Don't ban self if init or shutdown event fails
    
    There is no point in banning the node if init or shutdown event times
    out since it's going to quit anyway.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit fcd5e1f04c5fe6c98399429b8f0918b8779acba6
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jun 27 17:46:43 2013 +1000

    doc: The second half of monitoring is only for recovery master
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 932360992b08a5483d90c0590218ba0fd756119e
Author: Michael Adam <obnox at samba.org>
Date:   Wed Jun 26 09:23:22 2013 +0200

    recoverd: when the recmaster is banned, use that information when forcing an election
    
    When we trigger an election because the recmaster considers itself inactive,
    update our local nodemap with the recmaster's flags before calling
    force_election(). This way, we don't send the inactive node commands
    (e.g. freeze) that may fail and then lead to ourselves getting banned.
    
    The theory is that this should help avoid banning loops.
    
    Signed-off-by: Michael Adam <obnox at samba.org>

commit 741944f118e98f178b860194eecb215180949d18
Author: Michael Adam <obnox at samba.org>
Date:   Wed Jun 26 07:11:51 2013 +0200

    recoverd: fix a comment typo
    
    Signed-off-by: Michael Adam <obnox at samba.org>

commit ac06c46e4a80c635f6094b5ac6f0bf3e3a02db95
Author: Michael Adam <obnox at samba.org>
Date:   Fri Jun 21 17:57:37 2013 +0200

    recoverd: fix a comment in main_loop
    
    Signed-off-by: Michael Adam <obnox at samba.org>

commit df30c0a05ed908fc2a997c56ff5484736b23b70f
Author: Michael Adam <obnox at samba.org>
Date:   Fri Jun 21 14:06:22 2013 +0200

    recoverd: eliminate some trailing spaces from ctdb_election_win()
    
    Signed-off-by: Michael Adam <obnox at samba.org>

commit 14399de1dd0bd8dabf1f48b1457e3ccb37589d8a
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jun 28 16:31:07 2013 +1000

    recoverd: Don't continue if the current node gets banned
    
    A banned node cannot continue with recovery or cluster monitoring.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Pair-programmed-with: Amitay Isaacs <amitay at gmail.com>

commit aea12dce83ef385e9fb3bc03ac7ace0874a0e3fe
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Jun 28 14:31:02 2013 +1000

    recoverd: Refactor code to ban misbehaving nodes
    
    Since we have nodemap information, there is no need to hardcode the
    limit of 20.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
    Pair-Programmed-With: Martin Schwenke <martin at meltin.net>
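
A minimal sketch of the threshold change, assuming (from the diff further down) that the refactored check bans a node once its banning credits reach twice the number of nodes. The name `sketch_should_ban` is hypothetical and not part of CTDB:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true when a node's banning credits reach the
 * limit derived from the cluster size (2 * num_nodes), instead of the
 * old fixed limit of 20. */
static bool sketch_should_ban(uint32_t credits, uint32_t num_nodes)
{
	return credits >= 2 * num_nodes;
}
```

Under this assumption a 4-node cluster bans at 8 credits and a 16-node cluster at 32, so the limit follows the cluster size rather than a constant.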

commit ae1693905036ecdbc4594fde1f12500faae4a554
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jun 27 16:01:16 2013 +1000

    recoverd: Move code to ban other nodes after we get local node flags
    
    If a node gets banned first, then it should not ban other nodes.
    
    This code was moved up in main_loop to avoid waiting for nodemap
    from other nodes (commit 83b0261f2cb453195b86f547d360400103a8b795).
    
    To prevent a banned node from banning other nodes, we need to first get
    nodemap information from local node, so trying to ban other nodes can
    fail if we are already banned.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 593a17678fbd3109e118154b034d43b852659518
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jun 27 15:44:27 2013 +1000

    recoverd: Delay the initial election if node is started in stopped state
    
    Since there is an early exit if a node is stopped or banned, we can wait till
    the node becomes active to start initial election.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 93bcb6617e1024f810533e12390a572f51703ca0
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jun 27 15:33:49 2013 +1000

    recoverd: Update capabilities only if the current node is active
    
    Since we do an early return if a node is stopped or banned, move update
    capabilities code below the early return and just before we check the
    capabilities of current recovery master.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 815ddd3341b7e9db39e05a3a3fcd9a1420f053bc
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jun 27 15:46:04 2013 +1000

    recoverd: No need to check if node is recovery master when inactive
    
    If a node is stopped or banned, it will cause early return from the
    main_loop, so this check is redundant.  The election will be called by an
    active node.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 2396981c4bcf30530aeb7f4395093cc202105b50
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Thu Jun 27 15:39:15 2013 +1000

    recoverd: Always do an early exit from main_loop if node is stopped or banned
    
    A stopped or banned node cannot do anything useful.  So do not participate
    in any cluster activity and do not cause any unnecessary network traffic.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
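
The early exit described above can be sketched as a single gate on the local node's flags. This is an illustration only: the flag values and every `sketch_`/`SKETCH_` name are assumptions, not CTDB definitions.

```c
#include <stdint.h>

/* Illustrative flag bits; the real values are defined in CTDB's headers. */
#define SKETCH_FLAG_BANNED  0x1u
#define SKETCH_FLAG_STOPPED 0x2u

enum sketch_action { SKETCH_RETURN_EARLY, SKETCH_CONTINUE };

/* Decide whether one pass of the monitoring loop should proceed.
 * A stopped or banned node returns before doing any cluster-wide work. */
static enum sketch_action sketch_main_loop_gate(uint32_t node_flags)
{
	if (node_flags & (SKETCH_FLAG_STOPPED | SKETCH_FLAG_BANNED)) {
		return SKETCH_RETURN_EARLY;
	}
	return SKETCH_CONTINUE;
}
```

Placing the gate right after the local flags are read means later steps (capability updates, elections) never run on an inactive node.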

commit 38304f88e0c634e97d4687c25adef975f71537b8
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Jun 28 14:10:47 2013 +1000

    recoverd: Do not set banning credits on a node if current node is inactive
    
    If the current node is banned or stopped, then it should not assign banning
    credits to other nodes since the current node will not have up-to-date flags
    of other nodes.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit a60f228f8380f222f838eb619d2ab55f96f11ac2
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jul 1 17:40:36 2013 +1000

    banning: Do not come out of ban if databases are not frozen
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 297d93cecc3c0655e72ecac38508e113bdbeab9c
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jun 24 14:33:32 2013 +1000

    banning: No need to check if banned pnn is for local node
    
    If the banned pnn is not the local node, the function returns early.
    So no need for additional check.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit bb178338658b4ae32382a1f62f7c21cee1d4878f
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Jun 28 14:04:18 2013 +1000

    banning: Make ctdb_local_node_got_banned() a void function
    
    When this function is called, we are already committed to banning
    and there is no point in failing this function.  If freezing of
    databases fails, it will be fixed from the recovery daemon.

commit 6a9dbb8fb0f1f6e8c206189cdc2d33bb371ea2a8
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Jun 28 14:02:44 2013 +1000

    recoverd: Also check if current node is in recovery when it is banned
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 8d622660a14c929e365d306147b378ea6ab92175
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Fri Jun 28 14:09:35 2013 +1000

    recoverd: Set node_flags information as soon as we get nodemap
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 34af2cdf686d5d77854cbaa7bbcd8f878e9171c7
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Jun 26 16:02:23 2013 +1000

    recoverd: Remove old comment as the code corresponding to that has gone away
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit c6f8407648abb37f2ed781afa5171dad8c9f59e9
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jun 24 14:31:50 2013 +1000

    banning: Log ban state changes for other nodes at higher debug level
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 46efe7a886f8c4c56f19536adc98a73c22db906a
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jul 1 16:28:04 2013 +1000

    freeze: Make ctdb_start_freeze() a void function
    
    If this function fails due to memory errors, there is no way to recover.
    The best course of action is to abort.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 87716e8f504d659515d3dbcf93badbf106873bc8
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jul 1 16:21:00 2013 +1000

    freeze: If priority is invalid here, it's time to abort
    
    ctdb_start_freeze() is called from ctdb_control_freeze(), which fixes the
    priority if it's 0 and returns an error if it's invalid.  Other callers of
    ctdb_start_freeze() are internal to CTDB.  So if the priority is invalid in
    ctdb_start_freeze(), something is seriously wrong.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
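
A rough sketch of the split described above, with the control entry point sanitising external input and the internal function treating a bad priority as a programming error. `SKETCH_NUM_PRIORITIES` and both function names are hypothetical, and the value 3 is an assumption:

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NUM_PRIORITIES 3u /* assumed number of database priorities */

/* Control path: remap priority 0 to 1, reject anything out of range.
 * Returns the usable priority, or -1 for an invalid request. */
static int sketch_control_priority(uint32_t requested)
{
	if (requested == 0) {
		requested = 1;
	}
	if (requested > SKETCH_NUM_PRIORITIES) {
		return -1;
	}
	return (int)requested;
}

/* Internal path: no remapping. A caller would abort when this is false,
 * since only trusted code reaches it. */
static bool sketch_internal_priority_ok(uint32_t priority)
{
	return priority >= 1 && priority <= SKETCH_NUM_PRIORITIES;
}
```

Keeping the remapping only at the control boundary means priority 0 reaching the internal function signals a bug rather than a tolerated input.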

commit 478e24bceda3fedfba54ccb48faa115df726b819
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jul 1 13:26:33 2013 +1000

    freeze: Log message from ctdb_start_freeze() and ctdb_control_freeze()
    
    This ensures that whenever databases are frozen either via sending
    control or by calling ctdb_start_freeze(), the action is logged.
    Since ctdb_control_freeze() calls ctdb_start_freeze(), move logging of
    message in early return condition if databases are already frozen.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 4be8dff3a4451192f838497b4747273685959bed
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Mon Jun 24 14:18:58 2013 +1000

    recoverd: Print banning message only after verifying pnn
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>

commit 7eb2f89979360b6cc98ca9b17c48310277fa89fc
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Jun 26 15:22:46 2013 +1000

    recoverd: When updating flags on nodes, send updated flags and not old flags
    
    This was broken by commit a9a1156ea4e10483a4bf4265b8e9203f0af033aa.
    Instead of a SRVID_SET_NODE_FLAGS message to recovery daemon, a control
    was sent to the local daemon which in turn informed the recovery daemon.
    In making that change, the old flags were sent via CONTROL_MODIFY_FLAGS.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
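
To see why the choice of flags matters, here is a small sketch assuming the control applies a "set" mask and then a "clear" mask to the receiver's flags. That order is an assumption, and both function names are hypothetical:

```c
#include <stdint.h>

/* Assumed semantics of a modify-flags request: set bits, then clear bits. */
static uint32_t sketch_apply_modflags(uint32_t current, uint32_t set, uint32_t clear)
{
	return (current | set) & ~clear;
}

/* Passing (wanted, ~wanted) forces the receiver's flags to exactly 'wanted',
 * whatever they were before. */
static uint32_t sketch_force_flags(uint32_t current, uint32_t wanted)
{
	return sketch_apply_modflags(current, wanted, ~wanted);
}
```

Under this model, passing the receiver's stale flags as `wanted` leaves them unchanged, while passing the freshly observed flags overwrites them.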

commit 4f87925a287f612a6ab3b5da1a387a31c7bea28f
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jun 26 14:34:47 2013 +1000

    tools/ctdb: Add "force" option to "recover" command
    
    At the moment there is no easy way to force a recovery when attempting
    to reproduce certain classes of bugs.  This option is added without
    documentation because it is dangerous until the bugs are fixed!  :-)
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

-----------------------------------------------------------------------

Summary of changes:
 doc/recovery-process.txt |    4 +-
 include/ctdb_private.h   |    4 +-
 server/ctdb_banning.c    |   37 ++++++---
 server/ctdb_freeze.c     |   32 +++-----
 server/ctdb_monitor.c    |    2 +-
 server/ctdb_recoverd.c   |  201 +++++++++++++++++++++++-----------------------
 server/eventscript.c     |    6 +-
 tools/ctdb.c             |   11 ++-
 8 files changed, 156 insertions(+), 141 deletions(-)


Changeset truncated at 500 lines:

diff --git a/doc/recovery-process.txt b/doc/recovery-process.txt
index 7780d84..7cfc678 100644
--- a/doc/recovery-process.txt
+++ b/doc/recovery-process.txt
@@ -112,8 +112,8 @@ These tests are performed on all nodes in the cluster which is why it is optimiz
 as few network calls to other nodes as possible.
 Each node only performs 1 call to the recovery master in each loop and to no other nodes.
 
-NORMAL NODE CLUSTER MONITORING
-------------------------------
+RECOVERY MASTER CLUSTER MONITORING
+-----------------------------------
 The recovery master performs a much more extensive test. In addition to tests 1-9 above
 the recovery master also performs the following tests:
 
diff --git a/include/ctdb_private.h b/include/ctdb_private.h
index bf5b5ec..05109ac 100644
--- a/include/ctdb_private.h
+++ b/include/ctdb_private.h
@@ -1285,7 +1285,7 @@ int ctdb_ctrl_get_all_tunables(struct ctdb_context *ctdb,
 			       uint32_t destnode,
 			       struct ctdb_tunable *tunables);
 
-int ctdb_start_freeze(struct ctdb_context *ctdb, uint32_t priority);
+void ctdb_start_freeze(struct ctdb_context *ctdb, uint32_t priority);
 
 bool parse_ip_mask(const char *s, const char *iface, ctdb_sock_addr *addr, unsigned *mask);
 bool parse_ip_port(const char *s, ctdb_sock_addr *addr);
@@ -1440,7 +1440,7 @@ int ctdb_vacuum_init(struct ctdb_db_context *ctdb_db);
 int32_t ctdb_control_enable_script(struct ctdb_context *ctdb, TDB_DATA indata);
 int32_t ctdb_control_disable_script(struct ctdb_context *ctdb, TDB_DATA indata);
 
-int32_t ctdb_local_node_got_banned(struct ctdb_context *ctdb);
+void ctdb_local_node_got_banned(struct ctdb_context *ctdb);
 int32_t ctdb_control_set_ban_state(struct ctdb_context *ctdb, TDB_DATA indata);
 int32_t ctdb_control_get_ban_state(struct ctdb_context *ctdb, TDB_DATA *outdata);
 int32_t ctdb_control_set_db_priority(struct ctdb_context *ctdb, TDB_DATA indata);
diff --git a/server/ctdb_banning.c b/server/ctdb_banning.c
index bb3facc..e6df4b9 100644
--- a/server/ctdb_banning.c
+++ b/server/ctdb_banning.c
@@ -31,6 +31,21 @@ ctdb_ban_node_event(struct event_context *ev, struct timed_event *te,
 			       struct timeval t, void *private_data)
 {
 	struct ctdb_context *ctdb = talloc_get_type(private_data, struct ctdb_context);
+	bool freeze_failed = false;
+	int i;
+
+	/* Make sure we were able to freeze databases during banning */
+	for (i=1; i<=NUM_DB_PRIORITIES; i++) {
+		if (ctdb->freeze_mode[i] != CTDB_FREEZE_FROZEN) {
+			freeze_failed = true;
+			break;
+		}
+	}
+	if (freeze_failed) {
+		DEBUG(DEBUG_ERR, ("Banning timedout, but still unable to freeze databases\n"));
+		ctdb_ban_self(ctdb);
+		return;
+	}
 
 	DEBUG(DEBUG_ERR,("Banning timedout\n"));
 	ctdb->nodes[ctdb->pnn]->flags &= ~NODE_FLAGS_BANNED;
@@ -41,7 +56,7 @@ ctdb_ban_node_event(struct event_context *ev, struct timed_event *te,
 	}
 }
 
-int32_t ctdb_local_node_got_banned(struct ctdb_context *ctdb)
+void ctdb_local_node_got_banned(struct ctdb_context *ctdb)
 {
 	uint32_t i;
 
@@ -56,14 +71,10 @@ int32_t ctdb_local_node_got_banned(struct ctdb_context *ctdb)
 	ctdb->vnn_map->generation = INVALID_GENERATION;
 
 	for (i=1; i<=NUM_DB_PRIORITIES; i++) {
-		if (ctdb_start_freeze(ctdb, i) != 0) {
-			DEBUG(DEBUG_ERR,(__location__ " Failed to freeze db priority %u\n", i));
-		}
+		ctdb_start_freeze(ctdb, i);
 	}
 	ctdb_release_all_ips(ctdb);
 	ctdb->recovery_mode = CTDB_RECOVERY_ACTIVE;
-
-	return 0;
 }
 
 int32_t ctdb_control_set_ban_state(struct ctdb_context *ctdb, TDB_DATA indata)
@@ -78,12 +89,16 @@ int32_t ctdb_control_set_ban_state(struct ctdb_context *ctdb, TDB_DATA indata)
 			return -1;
 		}
 		if (bantime->time == 0) {
-			DEBUG(DEBUG_INFO,("unbanning node %d\n", bantime->pnn));
+			DEBUG(DEBUG_NOTICE,("unbanning node %d\n", bantime->pnn));
 			ctdb->nodes[bantime->pnn]->flags &= ~NODE_FLAGS_BANNED;
 		} else {
-			DEBUG(DEBUG_INFO,("banning node %d\n", bantime->pnn));
+			DEBUG(DEBUG_NOTICE,("banning node %d\n", bantime->pnn));
 			if (ctdb->tunable.enable_bans == 0) {
-				DEBUG(DEBUG_INFO,("Bans are disabled - ignoring ban of node %u\n", bantime->pnn));
+				/* FIXME: This is bogus. We really should be
+				 * taking decision based on the tunables on
+				 * the banned node and not local node.
+				 */
+				DEBUG(DEBUG_WARNING,("Bans are disabled - ignoring ban of node %u\n", bantime->pnn));
 				return 0;
 			}
 
@@ -120,10 +135,8 @@ int32_t ctdb_control_set_ban_state(struct ctdb_context *ctdb, TDB_DATA indata)
 	ctdb->nodes[bantime->pnn]->flags |= NODE_FLAGS_BANNED;
 
 	event_add_timed(ctdb->ev, ctdb->banning_ctx, timeval_current_ofs(bantime->time,0), ctdb_ban_node_event, ctdb);
-	if (bantime->pnn == ctdb->pnn) {
-		return ctdb_local_node_got_banned(ctdb);
-	}
 
+	ctdb_local_node_got_banned(ctdb);
 	return 0;
 }
 
diff --git a/server/ctdb_freeze.c b/server/ctdb_freeze.c
index e65415c..fee44d4 100644
--- a/server/ctdb_freeze.c
+++ b/server/ctdb_freeze.c
@@ -126,43 +126,38 @@ static int ctdb_freeze_waiter_destructor(struct ctdb_freeze_waiter *w)
 /*
   start the freeze process for a certain priority
  */
-int ctdb_start_freeze(struct ctdb_context *ctdb, uint32_t priority)
+void ctdb_start_freeze(struct ctdb_context *ctdb, uint32_t priority)
 {
 	struct ctdb_freeze_handle *h;
 
-	if (priority == 0) {
-		DEBUG(DEBUG_ERR,("Freeze priority 0 requested, remapping to priority 1\n"));
-		priority = 1;
-	}
-
 	if ((priority < 1) || (priority > NUM_DB_PRIORITIES)) {
 		DEBUG(DEBUG_ERR,(__location__ " Invalid db priority : %u\n", priority));
-		return -1;
+		ctdb_fatal(ctdb, "Internal error");
 	}
 
 	if (ctdb->freeze_mode[priority] == CTDB_FREEZE_FROZEN) {
 		/* we're already frozen */
-		return 0;
+		return;
 	}
 
+	DEBUG(DEBUG_ERR, ("Freeze priority %u\n", priority));
+
 	/* Stop any vacuuming going on: we don't want to wait. */
 	ctdb_stop_vacuuming(ctdb);
 
 	/* if there isn't a freeze lock child then create one */
 	if (ctdb->freeze_handles[priority] == NULL) {
 		h = talloc_zero(ctdb, struct ctdb_freeze_handle);
-		CTDB_NO_MEMORY(ctdb, h);
+		CTDB_NO_MEMORY_FATAL(ctdb, h);
 		h->ctdb = ctdb;
 		h->priority = priority;
 		talloc_set_destructor(h, ctdb_freeze_handle_destructor);
 
 		h->lreq = ctdb_lock_alldb_prio(ctdb, priority, false, ctdb_freeze_lock_handler, h);
-		CTDB_NO_MEMORY(ctdb, h->lreq);
+		CTDB_NO_MEMORY_FATAL(ctdb, h->lreq);
 		ctdb->freeze_handles[priority] = h;
 		ctdb->freeze_mode[priority] = CTDB_FREEZE_PENDING;
 	}
-
-	return 0;
 }
 
 /*
@@ -175,8 +170,6 @@ int32_t ctdb_control_freeze(struct ctdb_context *ctdb, struct ctdb_req_control *
 
 	priority = (uint32_t)c->srvid;
 
-	DEBUG(DEBUG_ERR, ("Freeze priority %u\n", priority));
-
 	if (priority == 0) {
 		DEBUG(DEBUG_ERR,("Freeze priority 0 requested, remapping to priority 1\n"));
 		priority = 1;
@@ -188,14 +181,12 @@ int32_t ctdb_control_freeze(struct ctdb_context *ctdb, struct ctdb_req_control *
 	}
 
 	if (ctdb->freeze_mode[priority] == CTDB_FREEZE_FROZEN) {
+		DEBUG(DEBUG_ERR, ("Freeze priority %u\n", priority));
 		/* we're already frozen */
 		return 0;
 	}
 
-	if (ctdb_start_freeze(ctdb, priority) != 0) {
-		DEBUG(DEBUG_ERR,(__location__ " Failed to start freezing databases with priority %u\n", priority));
-		return -1;
-	}
+	ctdb_start_freeze(ctdb, priority);
 
 	/* add ourselves to list of waiters */
 	if (ctdb->freeze_handles[priority] == NULL) {
@@ -226,10 +217,7 @@ bool ctdb_blocking_freeze(struct ctdb_context *ctdb)
 	int i;
 
 	for (i=1; i<=NUM_DB_PRIORITIES; i++) {
-		if (ctdb_start_freeze(ctdb, i)) {
-			DEBUG(DEBUG_ERR,(__location__ " Failed to freeze databases of prio %u\n", i));
-			continue;
-		}
+		ctdb_start_freeze(ctdb, i);
 
 		/* block until frozen */
 		while (ctdb->freeze_mode[i] == CTDB_FREEZE_PENDING) {
diff --git a/server/ctdb_monitor.c b/server/ctdb_monitor.c
index 106a44f..8d28fff 100644
--- a/server/ctdb_monitor.c
+++ b/server/ctdb_monitor.c
@@ -491,7 +491,7 @@ int32_t ctdb_control_modflags(struct ctdb_context *ctdb, TDB_DATA indata)
 
 	/* if we have become banned, we should go into recovery mode */
 	if ((node->flags & NODE_FLAGS_BANNED) && !(c->old_flags & NODE_FLAGS_BANNED) && (node->pnn == ctdb->pnn)) {
-		return ctdb_local_node_got_banned(ctdb);
+		ctdb_local_node_got_banned(ctdb);
 	}
 	
 	return 0;
diff --git a/server/ctdb_recoverd.c b/server/ctdb_recoverd.c
index f18cdf4..b6b2f6b 100644
--- a/server/ctdb_recoverd.c
+++ b/server/ctdb_recoverd.c
@@ -86,13 +86,13 @@ static void ctdb_ban_node(struct ctdb_recoverd *rec, uint32_t pnn, uint32_t ban_
 	struct ctdb_context *ctdb = rec->ctdb;
 	struct ctdb_ban_time bantime;
        
-	DEBUG(DEBUG_NOTICE,("Banning node %u for %u seconds\n", pnn, ban_time));
-
 	if (!ctdb_validate_pnn(ctdb, pnn)) {
 		DEBUG(DEBUG_ERR,("Bad pnn %u in ctdb_ban_node\n", pnn));
 		return;
 	}
 
+	DEBUG(DEBUG_NOTICE,("Banning node %u for %u seconds\n", pnn, ban_time));
+
 	bantime.pnn  = pnn;
 	bantime.time = ban_time;
 
@@ -120,6 +120,12 @@ static void ctdb_set_culprit_count(struct ctdb_recoverd *rec, uint32_t culprit,
 		return;
 	}
 
+	/* If we are banned or stopped, do not set other nodes as culprits */
+	if (rec->node_flags & NODE_FLAGS_INACTIVE) {
+		DEBUG(DEBUG_NOTICE, ("This node is INACTIVE, cannot set culprit node %d\n", culprit));
+		return;
+	}
+
 	if (ctdb->nodes[culprit]->ban_state == NULL) {
 		ctdb->nodes[culprit]->ban_state = talloc_zero(ctdb->nodes[culprit], struct ctdb_banning_state);
 		CTDB_NO_MEMORY_VOID(ctdb, ctdb->nodes[culprit]->ban_state);
@@ -1108,7 +1114,7 @@ static int update_local_flags(struct ctdb_recoverd *rec, struct ctdb_node_map *n
 			   Since we are the recovery master we can just as
 			   well update the flags on all nodes.
 			*/
-			ret = ctdb_ctrl_modflags(ctdb, CONTROL_TIMEOUT(), nodemap->nodes[j].pnn, nodemap->nodes[j].flags, ~nodemap->nodes[j].flags);
+			ret = ctdb_ctrl_modflags(ctdb, CONTROL_TIMEOUT(), nodemap->nodes[j].pnn, remote_nodemap->nodes[j].flags, ~remote_nodemap->nodes[j].flags);
 			if (ret != 0) {
 				DEBUG(DEBUG_ERR, (__location__ " Unable to update nodeflags on remote nodes\n"));
 				return -1;
@@ -1540,6 +1546,36 @@ static void takeover_fail_callback(struct ctdb_context *ctdb, uint32_t node_pnn,
 }
 
 
+static void ban_misbehaving_nodes(struct ctdb_recoverd *rec, bool *self_ban)
+{
+	struct ctdb_context *ctdb = rec->ctdb;
+	int i;
+	struct ctdb_banning_state *ban_state;
+
+	*self_ban = false;
+	for (i=0; i<ctdb->num_nodes; i++) {
+		if (ctdb->nodes[i]->ban_state == NULL) {
+			continue;
+		}
+		ban_state = (struct ctdb_banning_state *)ctdb->nodes[i]->ban_state;
+		if (ban_state->count < 2*ctdb->num_nodes) {
+			continue;
+		}
+
+		DEBUG(DEBUG_NOTICE,("Node %u reached %u banning credits - banning it for %u seconds\n",
+			ctdb->nodes[i]->pnn, ban_state->count,
+			ctdb->tunable.recovery_ban_period));
+		ctdb_ban_node(rec, ctdb->nodes[i]->pnn, ctdb->tunable.recovery_ban_period);
+		ban_state->count = 0;
+
+		/* Banning ourself? */
+		if (ctdb->nodes[i]->pnn == rec->ctdb->pnn) {
+			*self_ban = true;
+		}
+	}
+}
+
+
 /*
   we are the recmaster, and recovery is needed - start a recovery run
  */
@@ -1555,30 +1591,19 @@ static int do_recovery(struct ctdb_recoverd *rec,
 	uint32_t *nodes;
 	struct timeval start_time;
 	uint32_t culprit = (uint32_t)-1;
+	bool self_ban;
 
 	DEBUG(DEBUG_NOTICE, (__location__ " Starting do_recovery\n"));
 
 	/* if recovery fails, force it again */
 	rec->need_recovery = true;
 
-	for (i=0; i<ctdb->num_nodes; i++) {
-		struct ctdb_banning_state *ban_state;
-
-		if (ctdb->nodes[i]->ban_state == NULL) {
-			continue;
-		}
-		ban_state = (struct ctdb_banning_state *)ctdb->nodes[i]->ban_state;
-		if (ban_state->count < 2*ctdb->num_nodes) {
-			continue;
-		}
-		DEBUG(DEBUG_NOTICE,("Node %u has caused %u recoveries recently - banning it for %u seconds\n",
-			ctdb->nodes[i]->pnn, ban_state->count,
-			ctdb->tunable.recovery_ban_period));
-		ctdb_ban_node(rec, ctdb->nodes[i]->pnn, ctdb->tunable.recovery_ban_period);
-		ban_state->count = 0;
+	ban_misbehaving_nodes(rec, &self_ban);
+	if (self_ban) {
+		DEBUG(DEBUG_NOTICE, ("This node was banned, aborting recovery\n"));
+		return -1;
 	}
 
-
         if (ctdb->tunable.verify_recovery_lock != 0) {
 		DEBUG(DEBUG_ERR,("Taking out recovery lock from recovery daemon\n"));
 		start_time = timeval_current();
@@ -1952,12 +1977,12 @@ static bool ctdb_election_win(struct ctdb_recoverd *rec, struct election_message
 	/* we cant win if we are banned */
 	if (rec->node_flags & NODE_FLAGS_BANNED) {
 		return false;
-	}	
+	}
 
 	/* we cant win if we are stopped */
 	if (rec->node_flags & NODE_FLAGS_STOPPED) {
 		return false;
-	}	
+	}
 
 	/* we will automatically win if the other node is banned */
 	if (em->node_flags & NODE_FLAGS_BANNED) {
@@ -3319,7 +3344,7 @@ static void main_loop(struct ctdb_context *ctdb, struct ctdb_recoverd *rec,
 	struct ctdb_vnn_map *remote_vnnmap=NULL;
 	int32_t debug_level;
 	int i, j, ret;
-
+	bool self_ban;
 
 
 	/* verify that the main daemon is still running */
@@ -3344,28 +3369,6 @@ static void main_loop(struct ctdb_context *ctdb, struct ctdb_recoverd *rec,
 	}
 	LogLevel = debug_level;
 
-
-	/* We must check if we need to ban a node here but we want to do this
-	   as early as possible so we dont wait until we have pulled the node
-	   map from the local node. thats why we have the hardcoded value 20
-	*/
-	for (i=0; i<ctdb->num_nodes; i++) {
-		struct ctdb_banning_state *ban_state;
-
-		if (ctdb->nodes[i]->ban_state == NULL) {
-			continue;
-		}
-		ban_state = (struct ctdb_banning_state *)ctdb->nodes[i]->ban_state;
-		if (ban_state->count < 20) {
-			continue;
-		}
-		DEBUG(DEBUG_NOTICE,("Node %u has caused %u recoveries recently - banning it for %u seconds\n",
-			ctdb->nodes[i]->pnn, ban_state->count,
-			ctdb->tunable.recovery_ban_period));
-		ctdb_ban_node(rec, ctdb->nodes[i]->pnn, ctdb->tunable.recovery_ban_period);
-		ban_state->count = 0;
-	}
-
 	/* get relevant tunables */
 	ret = ctdb_ctrl_get_all_tunables(ctdb, CONTROL_TIMEOUT(), CTDB_CURRENT_NODE, &ctdb->tunable);
 	if (ret != 0) {
@@ -3416,10 +3419,43 @@ static void main_loop(struct ctdb_context *ctdb, struct ctdb_recoverd *rec,
 	}
 	nodemap = rec->nodemap;
 
-	/* update the capabilities for all nodes */
-	ret = update_capabilities(ctdb, nodemap);
-	if (ret != 0) {
-		DEBUG(DEBUG_ERR, (__location__ " Unable to update node capabilities.\n"));
+	/* remember our own node flags */
+	rec->node_flags = nodemap->nodes[pnn].flags;
+
+	ban_misbehaving_nodes(rec, &self_ban);
+	if (self_ban) {
+		DEBUG(DEBUG_NOTICE, ("This node was banned, restart main_loop\n"));
+		return;
+	}
+
+	/* if the local daemon is STOPPED or BANNED, we verify that the databases are
+	   also frozen and that the recmode is set to active.
+	*/
+	if (rec->node_flags & (NODE_FLAGS_STOPPED | NODE_FLAGS_BANNED)) {
+		ret = ctdb_ctrl_getrecmode(ctdb, mem_ctx, CONTROL_TIMEOUT(), CTDB_CURRENT_NODE, &ctdb->recovery_mode);
+		if (ret != 0) {
+			DEBUG(DEBUG_ERR,(__location__ " Failed to read recmode from local node\n"));
+		}
+		if (ctdb->recovery_mode == CTDB_RECOVERY_NORMAL) {
+			DEBUG(DEBUG_ERR,("Node is stopped or banned but recovery mode is not active. Activate recovery mode and lock databases\n"));
+
+			ret = ctdb_ctrl_freeze_priority(ctdb, CONTROL_TIMEOUT(), CTDB_CURRENT_NODE, 1);
+			if (ret != 0) {
+				DEBUG(DEBUG_ERR,(__location__ " Failed to freeze node in STOPPED or BANNED state\n"));
+				return;
+			}
+			ret = ctdb_ctrl_setrecmode(ctdb, CONTROL_TIMEOUT(), CTDB_CURRENT_NODE, CTDB_RECOVERY_ACTIVE);
+			if (ret != 0) {
+				DEBUG(DEBUG_ERR,(__location__ " Failed to activate recovery mode in STOPPED or BANNED state\n"));
+
+				return;
+			}
+		}
+
+		/* If this node is stopped or banned then it is not the recovery
+		 * master, so don't do anything. This prevents stopped or banned
+		 * node from starting election and sending unnecessary controls.
+		 */
 		return;
 	}
 
@@ -3439,50 +3475,27 @@ static void main_loop(struct ctdb_context *ctdb, struct ctdb_recoverd *rec,
 		}
 	}
 
+	/* This is a special case.  When recovery daemon is started, recmaster
+	 * is set to -1.  If a node is not started in stopped state, then
+	 * start election to decide recovery master
+	 */
 	if (rec->recmaster == (uint32_t)-1) {
 		DEBUG(DEBUG_NOTICE,(__location__ " Initial recovery master set - forcing election\n"));
 		force_election(rec, pnn, nodemap);
 		return;
 	}
 
-	/* if the local daemon is STOPPED, we verify that the databases are
-	   also frozen and thet the recmode is set to active 
-	*/
-	if (nodemap->nodes[pnn].flags & NODE_FLAGS_STOPPED) {
-		ret = ctdb_ctrl_getrecmode(ctdb, mem_ctx, CONTROL_TIMEOUT(), CTDB_CURRENT_NODE, &ctdb->recovery_mode);
-		if (ret != 0) {
-			DEBUG(DEBUG_ERR,(__location__ " Failed to read recmode from local node\n"));
-		}
-		if (ctdb->recovery_mode == CTDB_RECOVERY_NORMAL) {
-			DEBUG(DEBUG_ERR,("Node is stopped but recovery mode is not active. Activate recovery mode and lock databases\n"));
-
-			ret = ctdb_ctrl_freeze_priority(ctdb, CONTROL_TIMEOUT(), CTDB_CURRENT_NODE, 1);
-			if (ret != 0) {
-				DEBUG(DEBUG_ERR,(__location__ " Failed to freeze node in STOPPED state\n"));
-				return;
-			}
-			ret = ctdb_ctrl_setrecmode(ctdb, CONTROL_TIMEOUT(), CTDB_CURRENT_NODE, CTDB_RECOVERY_ACTIVE);
-			if (ret != 0) {
-				DEBUG(DEBUG_ERR,(__location__ " Failed to activate recovery mode in STOPPED state\n"));
-
-				return;
-			}
-			return;
-		}
-	}
-	/* If the local node is stopped, verify we are not the recmaster 
-	   and yield this role if so
-	*/
-	if ((nodemap->nodes[pnn].flags & NODE_FLAGS_INACTIVE) && (rec->recmaster == pnn)) {
-		DEBUG(DEBUG_ERR,("Local node is INACTIVE. Yielding recmaster role\n"));
-		force_election(rec, pnn, nodemap);
+	/* update the capabilities for all nodes */
+	ret = update_capabilities(ctdb, nodemap);
+	if (ret != 0) {
+		DEBUG(DEBUG_ERR, (__location__ " Unable to update node capabilities.\n"));
 		return;
 	}
-	
+
 	/*
-	 * if the current recmaster do not have CTDB_CAP_RECMASTER,
-	 * but we have force an election and try to become the new
-	 * recmaster
+	 * If the current recmaster does not have CTDB_CAP_RECMASTER,


-- 
CTDB repository