[PATCH] Fix error handling when recovery fails (bug 12857)

Martin Schwenke martin at meltin.net
Fri Jun 23 07:23:06 UTC 2017


On Fri, 23 Jun 2017 15:40:49 +1000, Amitay Isaacs via samba-technical
<samba-technical at lists.samba.org> wrote:

> If freezing a database during recovery fails, then ctdb will keep trying to
> recover endlessly without banning the culprit node.  This can be avoided by
> assigning banning credits when freezing a database fails.
> 
> CTDB is supposed to release all public IP addresses if it stays in recovery
> for a long time (default 120 seconds).  This is to avoid clients getting
> stuck on a faulty node.  However, the current logic to set the timer is
> broken.  The timer is reset every time a SET_RECMODE control is sent.  The
> attached patches simplify the code in ctdb_control_set_recmode() and make
> it idempotent.  If the required recovery mode is already set, then there is
> no reason to do anything.
> 
> Please review and push.
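
A minimal sketch of the idempotent check described in the quoted text, under
stated assumptions: the enum, struct and function names below are hypothetical
and are not taken from the CTDB source or the attached patches.  A counter
stands in for the real timer, so the sketch only shows that a repeated request
for the current mode leaves the timer untouched.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names throughout; this is an illustration, not CTDB code. */
enum recmode { RECMODE_NORMAL = 0, RECMODE_ACTIVE = 1 };

struct node_state {
	enum recmode mode;
	bool release_timer_armed;     /* stands in for the "release IPs" timeout */
	unsigned int timer_arm_count; /* how many times the timer was armed */
};

/*
 * Set the recovery mode.  Returns true if the state changed, false if the
 * requested mode was already set (in which case nothing is touched).
 */
static bool set_recmode(struct node_state *st, enum recmode mode)
{
	assert(mode == RECMODE_NORMAL || mode == RECMODE_ACTIVE);

	if (st->mode == mode) {
		/* Idempotent: a repeated request must not re-arm the timer. */
		return false;
	}

	st->mode = mode;
	if (mode == RECMODE_ACTIVE) {
		/* Entering recovery: arm the timer exactly once. */
		st->release_timer_armed = true;
		st->timer_arm_count++;
	} else {
		/* Leaving recovery: the timeout is no longer needed. */
		st->release_timer_armed = false;
	}
	return true;
}
```

With this shape, sending the same mode twice in a row arms the timer only
once, so the timeout measured from the start of recovery is not pushed back
by later controls.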

Reviewed-by: Martin Schwenke <martin at meltin.net>

Will push shortly...

peace & happiness,
martin


