[Samba] CTDB Question: external locking tool

Martin Schwenke martin at meltin.net
Tue Oct 27 04:09:34 UTC 2020


Hi Bob,

On Sun, 25 Oct 2020 20:44:07 -0400, Robert Buck <robert.buck at som.com>
wrote:

> We use a Golang-based lock tool that we wrote for CTDB. That tool interacts
> with our 3.4 etcd cluster, and follows the requirements specified in the
> project.
> 
> Question, does the external command line tool get called when LMASTER and
> RECMASTER are false? Given a scenario where we have a set of processes that
> have it set to false, then others that have it set to true, does the
> locking tool get called when they're set to false?

Indeed it does.  Two things currently conspire against you:

* At the start of each recovery a recovery lock consistency check is
  done. Unfortunately, this means the recovery lock can't be left unset
  on nodes that do not have the recmaster capability because then the
  consistency check would fail.

* At the end of recovery, if the recovery lock is set, all nodes will
  attempt to take the recovery lock and will expect to fail.  This
  applies on the leader/master too, since the attempt there is made
  from a different process than the one holding the lock.

  This is meant to be a sanity check but, to be honest, I'm not sure
  whether it really adds any value.  A better option might be to only
  accept recovery-related controls from the current leader/master node,
  banning any other node that is stupid enough to send such a control.

I need to think about this more...

One of the problems is that the ideas of recovery master and recovery
lock are historical and they are somewhat dated compared to current
clustering concepts. Recovery master should really be "cluster leader"
and the lock should be "cluster lock".  If we clearly change our
approach in that direction then it makes no sense to check a cluster
lock at recovery time.

I have a branch that does the technical (but not documentation) parts
of switching to cluster leader and lock... but more work is needed
before this is ready to merge.

> IF you say the lock tool still gets called in both cases, then the docs
> need to be updated, and we on our end need to add a special config file
> option to reject lock acquisitions from those nodes that have the CTDB
> options set to false, permitting only those nodes set to true to acquire
> etcd locks.
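
For illustration only, a gate of the kind described in the quoted text
might look roughly like the Go sketch below.  Everything here is an
assumption: the locker interface, the fakeLocker, the decide function
and the LOCK_TOOL_MAY_ACQUIRE variable are hypothetical, and the
single-character statuses ('0' holding, '1' contention, '2' timeout,
'3' error) are assumed from the cluster mutex helper convention and
should be checked against the CTDB version in use.  Reporting
contention rather than an error is a design guess, chosen because the
end-of-recovery check expects the attempt to fail.

```go
package main

import (
	"fmt"
	"os"
)

// Status characters the helper writes to stdout.  These values are an
// assumption; verify them against the CTDB documentation you run against.
const (
	statusHolding    byte = '0'
	statusContention byte = '1'
	statusTimeout    byte = '2' // listed for completeness, unused here
	statusError      byte = '3'
)

// locker abstracts the etcd-backed lock attempt (hypothetical interface).
type locker interface {
	TryLock() (bool, error)
}

// decide returns the status to report.  When mayAcquire is false the
// backend is never contacted and contention is reported, so a node that
// must never hold the lock still "fails" the way a sanity check expects.
func decide(mayAcquire bool, l locker) byte {
	if !mayAcquire {
		return statusContention
	}
	held, err := l.TryLock()
	switch {
	case err != nil:
		return statusError
	case held:
		return statusHolding
	default:
		return statusContention
	}
}

// fakeLocker is an in-memory stand-in for etcd, used only for this sketch.
type fakeLocker struct {
	free  bool
	calls int
}

func (f *fakeLocker) TryLock() (bool, error) {
	f.calls++
	if f.free {
		f.free = false
		return true, nil
	}
	return false, nil
}

func main() {
	// Hypothetical switch; a real tool would read its own config file.
	mayAcquire := os.Getenv("LOCK_TOOL_MAY_ACQUIRE") != "false"
	status := decide(mayAcquire, &fakeLocker{free: true})
	fmt.Printf("%c\n", status)
}
```

Keeping the switch in the tool's own configuration, rather than in the
recovery lock setting, leaves that setting identical on every node,
which the start-of-recovery consistency check appears to require.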

Well, the documentation (ctdb(7) manual page) does say:

  CTDB does sanity checks to ensure that the recovery lock is held as
  expected.

;-)

OK, that's pretty weak!

I'll try to get some of Amitay's time to discuss what we should do
here...

peace & happiness,
martin


