CTDB: Split brain and banning

Thu Nov 1 01:17:42 UTC 2018

Hi David,

On Wed, 31 Oct 2018 17:32:20 +0100, David Disseldorp via
samba-technical <samba-technical at lists.samba.org> wrote:

> On Wed, 31 Oct 2018 16:55:30 +0100, Michel Buijsman via samba-technical wrote:
> 
> > On Wed, Oct 31, 2018 at 04:37:14PM +0100, David Disseldorp via samba-technical wrote:  
> > > It appears that ctdbd doesn't gracefully handle cases where the recovery
> > > master goes down holding the reclock and standby nodes can't immediately
> > > obtain the reclock following election. Your reclock helper lock_duration
> > > setting of "30" means that the standby nodes may need to wait up to 30
> > > seconds before obtaining the recovery lock.    
> > 
> > Yeah I'd just found that out as well, was just about to mail you.
> >   
> > > If you specify a lock_duration of "5" and set RecoveryBanPeriod=5, does
> > > your cluster return to OK ~5 seconds after master outage?    
> > 
> > How does ElectionTimeout play into this?  
> 
> I don't think ElectionTimeout is having an influence in this case, as
> the elections are proceeding without delay, it's the newly elected
> recmaster that runs into problems when it can't immediately obtain the
> recovery lock.

If a node is still within ElectionTimeout seconds of receiving the last
election packet (i.e. rec->election_timeout is non-NULL) then a
recovery won't start, so there'll be no attempt to take the
reclock.

So, although the election itself can be decided very quickly,
ElectionTimeout should be able to help with the problem being seen
here.  Even with the keep-alive set quite low, if ElectionTimeout is
large enough to stall the start of the next recovery until after the
ctdb_mutex_ceph_rados_helper lock has timed out, then the new recovery
master should be able to take the reclock.
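
To illustrate the timing argument, here is a rough sketch in Python.
It is only a simplified model and not the actual ctdbd logic: the
function and parameter names are made up for illustration, and it
assumes the worst case where the old master renewed its lock just
before going down, so the lock is held for a full lock_duration after
the outage.

```python
# Simplified timing model of the argument above.
# NOT the real ctdbd code: all names here are hypothetical.

def recovery_start_time(last_election_packet, election_timeout):
    """Earliest time a recovery can start: it is held off until
    election_timeout seconds after the last election packet."""
    return last_election_packet + election_timeout


def reclock_available(now, master_died_at, lock_duration):
    """Assumption: the dead master's lock expires lock_duration
    seconds after the outage (worst case)."""
    return now - master_died_at >= lock_duration


def first_recovery_gets_reclock(master_died_at, last_election_packet,
                                election_timeout, lock_duration):
    """True if the first recovery starts late enough that the stale
    lock has already expired."""
    start = recovery_start_time(last_election_packet, election_timeout)
    return reclock_available(start, master_died_at, lock_duration)


if __name__ == "__main__":
    # Master dies at t=0, last election packet seen at t=2.
    for election_timeout, lock_duration in [(3, 30), (30, 30), (3, 5)]:
        ok = first_recovery_gets_reclock(0, 2, election_timeout,
                                         lock_duration)
        print(election_timeout, lock_duration, ok)
```

In this model a short ElectionTimeout with a lock_duration of 30 lets
the recovery start while the stale lock is still held, whereas either
raising ElectionTimeout or lowering lock_duration avoids that.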

peace & happiness,
martin