[Samba] CTDB over WAN Link with LMASTER/RECMASTER Disabled

Martin Schwenke martin at meltin.net
Mon Jun 4 05:24:26 UTC 2018


Hi Mike,

On Sat, 2 Jun 2018 20:32:00 -0500 (CDT), Mike Ruebner via samba
<samba at lists.samba.org> wrote:

> I came across the 'CTDB_CAPABILITY_LMASTER=no' and
> 'CTDB_CAPABILITY_RECMASTER=no' options in my quest to salvage a
> rather poorly performing CTDB cluster over Ceph(fs). Unfortunately,
> the docs provide not enough information for a clustering noop like
> myself. Would there be any benefit to disabling those options for a
> branch office node on a high-latency WAN connection?

> Throughput maxes out at 20 Mbit/sec, with latency in the 20 - 30 ms
> range. I am mostly concerned about SMB read/list performance which
> drops significantly for folders with an object count >1000. Share
> mount is Cephfs over Ceph Kraken/Jewel.

> Any pointers greatly appreciated!

I think that the first thing to do would be to measure cephfs
performance over the WAN.  If the bottleneck is there then tweaking
CTDB options won't help you.  If the cephfs performance is good then
look at CTDB...
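
As a rough starting point for that measurement, something like the
Python sketch below could be run on the branch office node directly
against the cephfs mount (not through Samba).  It times a directory
listing plus a stat() of every entry, which is roughly the work a large
SMB listing causes on the server.  The function name time_listing is
made up for illustration, and the sketch ignores caching effects, so
treat the numbers as indicative only.

```python
import os
import time


def time_listing(path):
    """Return (entry_count, seconds) for listing path and stat()ing each entry."""
    start = time.perf_counter()
    count = 0
    with os.scandir(path) as entries:
        for entry in entries:
            # On Unix the first stat() on an entry costs a system call,
            # i.e. a metadata lookup against the underlying filesystem.
            entry.stat(follow_symlinks=False)
            count += 1
    return count, time.perf_counter() - start
```

Calling time_listing() on one of the >1000 object folders from a main
office node and then from the branch office node should show whether
the slowdown is already there without Samba and CTDB in the picture.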

The potential CTDB problem is that volatile databases such as
locking.tdb are distributed.  In this case the databases are being
distributed across the WAN.  Records will still be distributed in this
way with CTDB_CAPABILITY_LMASTER=no set on the branch office node.  The
advantage would be that files accessed only (or primarily?) in the main
office would not need a trip over the WAN to locate records.  However,
accesses from the branch office would always need to go over the WAN to
locate records.
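
To make that trade-off concrete, here is a toy model in Python.  It is
not CTDB's real code: the node names, the use of crc32 as the hash and
the helpers lmaster_for and crosses_wan are all made up for
illustration.  The only assumption is that the location master for a
record is chosen by hashing the key over the nodes that have the
LMASTER capability.  It models only the lookup of where a record is,
not where the record data itself currently lives.

```python
import zlib

# Hypothetical cluster: two nodes in the main office, one in the branch office.
SITE = {"main0": "main", "main1": "main", "branch0": "branch"}


def lmaster_for(key, lmaster_nodes):
    """Pick a location master by hashing the key (stand-in for CTDB's hash)."""
    return lmaster_nodes[zlib.crc32(key) % len(lmaster_nodes)]


def crosses_wan(client_node, key, lmaster_nodes):
    """True if asking the location master about key leaves the client's site."""
    return SITE[client_node] != SITE[lmaster_for(key, lmaster_nodes)]


keys = [f"record-{i}".encode() for i in range(1000)]
all_nodes = ["main0", "main1", "branch0"]
main_only = ["main0", "main1"]  # models branch0 with CTDB_CAPABILITY_LMASTER=no

for label, lmasters in (("all lmasters", all_nodes), ("branch0 not lmaster", main_only)):
    for client in ("main0", "branch0"):
        hops = sum(crosses_wan(client, k, lmasters) for k in keys)
        print(f"{label}: client {client}: {hops}/{len(keys)} lookups cross the WAN")
```

With branch0 removed from the set of location masters, none of the
lookups from main0 cross the WAN and all of the lookups from branch0
do, which is the behaviour described above.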

If the poor performance is only being seen on the branch office node
then I don't think CTDB_CAPABILITY_LMASTER=no would help...  but I
could be wrong.

I think that with a WAN in the mix you'll always get poor performance
when there is contention for the same files between the two sides of
the WAN, or when lots of small files are being created via the branch
office node.

Your best option is to limit contention for records.  One possibility,
depending on what is happening, might be the:

  fileid:algorithm = fsname_norootdir

Samba share option.  However, you really need to find out whether your
problem is contention at the root of a share, and you must understand
the implications of that option.

I doubt that many people have experience with
CTDB_CAPABILITY_LMASTER=no.  I've been a CTDB developer for over 10
years and I haven't seen anyone discuss how useful this option is, nor
have I done any related performance analysis.

Note that I have ignored CTDB_CAPABILITY_RECMASTER=no.  It should only
come into play during database recoveries.  If you have a lot of active
records (due to high client activity) then having the branch office
node coordinate recovery could be very slow.  However, this shouldn't
affect ongoing performance.

peace & happiness,
martin


