CTDB scaling?

Partha Sarathi parthasarathi.bl at gmail.com
Tue Nov 15 20:12:12 UTC 2016


Hi,

Since this mail thread was already here, I want to continue with the same
question, i.e. CTDB linear scaling and its performance in an HA environment.

As Ronnie mentioned, does CTDB only scale well up to ~30 nodes, or is there
a fixed limit on the number of nodes in a cluster beyond which CTDB becomes
a performance bottleneck?

Since our cluster is organised into multiple HA groups, we thought of
running a separate CTDB instance per HA group, so that locking state does
not have to be synced to the other HA groups (shares are isolated to each
HA group). But this idea prevents secrets.tdb from being synced across the
whole cluster. A rough sketch of what we have in mind is below.
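
To illustrate, here is a minimal sketch of the per-HA-group layout we are
considering, using the legacy ctdbd.conf style options; the paths, node
lists and group names are hypothetical and only meant to show the idea:

    # /etc/ctdb/ctdbd.conf on the nodes of HA group A
    # (group A runs its own, independent CTDB cluster)
    CTDB_RECOVERY_LOCK=/clusterfs/ha-group-a/.ctdb/reclock
    CTDB_NODES=/etc/ctdb/nodes                # lists only group A node IPs
    CTDB_PUBLIC_ADDRESSES=/etc/ctdb/public_addresses
    CTDB_MANAGES_SAMBA=yes

    # HA group B is configured the same way against its own recovery lock
    # and its own nodes file, so the two groups never exchange locking
    # traffic -- but each group then also keeps its own secrets.tdb,
    # which is exactly the cross-group sync problem described above.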

I would like to know others opinions/experience on ctdb linear scaling and
its performance in the field or if there any good practice to solve this
linear scaling issue.


Regards,
--Partha




On Thu, Nov 20, 2014 at 10:19 AM, ronnie sahlberg <ronniesahlberg at gmail.com>
wrote:

> On Thu, Nov 20, 2014 at 9:33 AM, Richard Sharpe
> <realrichardsharpe at gmail.com> wrote:
> > On Wed, Nov 19, 2014 at 5:44 PM, ronnie sahlberg
> > <ronniesahlberg at gmail.com> wrote:
> >> On Wed, Nov 19, 2014 at 3:49 PM, Richard Sharpe
> >> <realrichardsharpe at gmail.com> wrote:
> >>>
> >>> Hi folks,
> >>>
> >>> In Tridge's 2007 paper:
> >>>
> >>> https://www.samba.org/~tridge/sambaxp-07/ctdb.pdf
> >>>
> >>> he claims the following performance scaling:
> >>>
> >>> NEW (CTDB) approach
> >>> 1 node 42 MBytes/sec
> >>> 2 nodes 168 MBytes/sec
> >>> 3 nodes 211 MBytes/sec
> >>> 4 nodes 243 MBytes/sec
> >>>
> >>> This seems counterintuitive. Two nodes get four times what one node
> >>> gets, and four nodes get almost six times what one node does?
> >>>
> >>> What is the explanation for that?
> >>>
> >>
> >> The superlinear scaling is likely due to the increase in memory available
> >> for caching.
> >> This is, I recall, the uncontended case where you have little cross-node
> >> traffic.
> >
> > Hmm, it is still not obvious. There seem to be several things going on
> > here.
> >
> > Is it possible that the same NBENCH load was offered across all four
> > configurations?
> >
> > That would make more sense. Then, in the one-node case we were hitting
> > the single-node limit and, as you say, with two nodes and the load
> > divided between them, more memory was available for caching, so we see
> > a big boost there. After that, we seem to be hitting the I/O limit of
> > the cluster, because more memory does not seem to help that much ... By
> > the time we hit six nodes, it looks like we would probably be seeing
> > only another 20 MB/s or less for the additional nodes.
>
> Probably.
> I think I recall that tridge might have done these very early tests on
> virtual machines running under QEMU/KVM,
> which makes the actual numbers even more difficult to interpret, if not
> meaningless.
>
> I think the only real takeaway from these numbers is that "scaling is
> good for that kind of workload".
>
> Later tests on real hardware showed, for the uncontended case, pretty much
> linear scaling up to ~30 nodes, with each node saturating 10GbE in CIFS
> traffic.
>
> The heavily contended case often did not scale well at all. Later
> additions to the protocol, such as sticky records and read-only record
> delegations, helped for some (but by no means all) workloads, but I do
> not know whether any performance numbers were ever collected on that.
>
>
>
> >
> > --
> > Regards,
> > Richard Sharpe
> > (何以解憂?唯有杜康。--曹操)
>
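
As a reference for anyone wanting to try the sticky records and read-only
delegations mentioned above, here is a rough sketch of how they can be
enabled per database with the ctdb tool, assuming a CTDB version that
supports them; locking.tdb and the tunable values are only examples, not
a recommendation:

    # allow read-only delegations and make hot records sticky on one database
    ctdb setdbreadonly locking.tdb
    ctdb setdbsticky locking.tdb

    # related tunables (see ctdb-tunables; the values here are illustrative)
    ctdb setvar HopcountMakeSticky 50
    ctdb setvar StickyDuration 600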



-- 
Thanks & Regards
-Partha

