[Samba] CTDB RecLockLatencyMs vs RecoverInterval

Martin Schwenke martin at meltin.net
Tue Jun 30 22:20:52 UTC 2020

Hi Bob,

On Tue, 30 Jun 2020 17:00:11 -0400, Robert Buck via samba
<samba at lists.samba.org> wrote:

> I have a question regarding CTDB RecLockLatencyMs tunable parameter. Is
> there any relationship between the RecLockLatencyMs property and
> the RecoverInterval property? Does one need to be larger than the other? Or
> if RecLockLatencyMs were increased to 5000ms, should some other setting be
> changed in proportion?
> We're using a geo-distributed etcd cluster for the CTDB recovery lock and I
> noticed a "High RECLOCK latency" (of 4s) message in syslog, and just
> wanted to see if we could safely squelch the warning, and if so, how?

RecoverInterval sets how often nodes check for conditions that indicate
a database recovery is needed.  I would suggest leaving this at the
default of 1 second.  In the future we might hard-code this anyway.

Many years ago CTDB used to release the recovery lock after each
recovery.  This meant that the recovery lock had to be taken before
each recovery, so the recovery lock latency mattered more.

We changed that so the recovery lock is taken before the first recovery
after a node is elected leader (currently called recovery master), so
it is now more of a cluster lock.  We also made some changes so that
the leader is more likely to be stable across elections.  Both of these
changes make the recovery lock latency matter a lot less.
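
To make the difference concrete, here is a toy model (not CTDB code) that
counts how often the lock would be taken under each behaviour.  The event
names and the function are hypothetical, and it assumes for simplicity
that every election ends with the leader having to take the lock again.
The 4 second figure is the latency from the warning you reported.

```python
def lock_acquisitions(events, per_recovery):
    """Count recovery lock acquisitions for a sequence of events.

    events: iterable of "election" or "recovery" strings (hypothetical model).
    per_recovery: True models the old behaviour (lock released after each
    recovery), False models the current behaviour (lock kept by the leader).
    """
    count = 0
    holding = False
    for event in events:
        if event == "election":
            # Simplification: assume the elected leader must take the lock.
            holding = False
        elif event == "recovery":
            if per_recovery or not holding:
                count += 1
                holding = True
    return count


events = ["election", "recovery", "recovery", "recovery",
          "election", "recovery"]
latency_s = 4  # latency from the reported warning

old = lock_acquisitions(events, per_recovery=True) * latency_s
new = lock_acquisitions(events, per_recovery=False) * latency_s
print(old, new)  # total seconds spent waiting on the lock: 16 8
```

In this model the old behaviour pays the latency on every recovery, while
the current behaviour pays it only on the first recovery after an
election, so stable leadership keeps the total small.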

So, I don't think that warnings about recovery lock latency are as
important as they used to be.  You could safely increase
RecLockLatencyMs to 5000.
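
As a sketch of how that could be applied, assuming a Samba version (4.9
or later) where tunables are loaded at startup from ctdb.tunables: the
path shown in the comment is an assumption and depends on how CTDB was
installed, so check ctdb-tunables(7) for your version.

```
# ctdb.tunables (often /etc/ctdb/ctdb.tunables; path varies by install)
RecLockLatencyMs=5000
```

For a running node, "ctdb setvar RecLockLatencyMs 5000" changes the value
without a restart and "ctdb getvar RecLockLatencyMs" shows the current
value.  setvar only affects the node it is run on and does not persist
across restarts, so it would need to be run on each node.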

However... (and there is always a "however" ;-)

The presence of recovery lock latency warnings made one of the race
conditions in the following bug pretty obvious to me:


so, while they matter less, they still have value.

If you're using a CTDB recovery lock with high latency then you should
make sure you are using a version that contains a fix for the above bug.

Please let us know if you have more questions...

peace & happiness,

