Patch to support Scalable CTDB

Partha Sarathi parthasarathi.bl at gmail.com
Mon Apr 30 23:12:01 UTC 2018


On Mon, Apr 30, 2018 at 7:52 AM, Volker Lendecke <Volker.Lendecke at sernet.de>
wrote:

> On Mon, Apr 30, 2018 at 02:36:14PM +0000, Partha Sarathi via
> samba-technical wrote:
> > Ok. My concern is that when you have a common CTDB running across a
> > cluster with different file spaces, keeping locking.tdb replicated for
> > all file opens doesn't seem to be worthwhile.
>
> locking.tdb is never actively replicated. The records are only ever
> moved on demand to the nodes that actively request it. That's
> different from secrets.tdb for example.
>
> Volker

Thanks, Volker and Ralph, but I see different behavior.

I had a three-node cluster running a common CTDB, with a different
filespace on each of the nodes, as below:

pnn:1 fe80::ec4:7aff:fe34:ac0b OK
pnn:2 fe80::ec4:7aff:fe34:ee47 OK
pnn:3 fe80::ec4:7aff:fe34:b923 OK (THIS NODE)
Generation:1740147135
Size:3
hash:0 lmaster:1
hash:1 lmaster:2
hash:2 lmaster:3
Recovery mode:NORMAL (0)
Recovery master:2


1) Opened "1.pdf" on node 1 and noticed a couple of records updated in
"locking.tdb.1" and also in "locking.tdb.2" on node 2.
2) Opened "3.pdf" on node 3 and noticed a couple of records updated in
"locking.tdb.3" and also in both "locking.tdb.2" and "locking.tdb.1".

Per your statement, I was expecting that a node would not get those records
unless it specifically requested them. But in the above example, all the
records were available on all the nodes without any node asking for them.
One more thing I learned is that every node in the cluster tries to send
its open/close file records to the Recovery master. In a large cluster with
different filespaces, the Recovery master may be overwhelmed with
unnecessary record updates.

Below are the locking.tdb dumps taken after different file opens/closes on
the three nodes; locking.tdb had all the records on all the nodes. The open
records for file "3.pdf" were not necessary on node 1, but the Recovery
master had those records, so it updated/replicated them to the rest of the
nodes in the cluster.

So this kind of cluster-wide replication may slow down overall performance
when you are opening a large number of files with different file spaces in
subclusters.
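
A rough way to check which records actually show up on which node is to
compare the key sets of the per-node dumps. The following is a minimal
Python sketch, not a Samba tool: parse_dump and compare are hypothetical
helper names, and it assumes tdbdump output in the key(N) = "..." /
data(N) = "..." form, possibly with line breaks added by a mail client.

```python
import re

# Matches one tdbdump record. [^"]* also spans line breaks that a mail
# client may have inserted inside a quoted string.
RECORD_RE = re.compile(
    r'key\((\d+)\)\s*=\s*"([^"]*)"\s*data\((\d+)\)\s*=\s*"([^"]*)"'
)


def parse_dump(text):
    """Return {escaped key string: declared data length} for one dump."""
    records = {}
    for match in RECORD_RE.finditer(text):
        # Undo mail line wrapping inside the key string.
        key = match.group(2).replace("\n", "")
        records[key] = int(match.group(3))
    return records


def compare(dumps):
    """Map every key to the data length seen on each node (None if absent).

    dumps maps a node label (hypothetical, e.g. "node1") to the result of
    parse_dump() for that node.
    """
    all_keys = set()
    for records in dumps.values():
        all_keys.update(records)
    return {
        key: {node: records.get(key) for node, records in dumps.items()}
        for key in all_keys
    }
```

Feeding it the text of "tdbdump locking.tdb.1", "tdbdump locking.tdb.2" and
"tdbdump locking.tdb.3" would show, per key, which nodes hold a record and
how large its value is there, so 24-byte values can be told apart from
larger ones.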

root@oneblox40274:/var/tmp/samba/ctdb# tdbdump locking.tdb.1
{
key(24) =
"\16\00\00\00\00\00\00\00c\0C\01\00\00\00\00\00\00\00\00\00\00\00\00\00"
data(24) =
"\01\00\00\00\00\00\00\00\03\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
}
{
key(24) =
"\16\00\00\00\00\00\00\00\BA\EF\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
data(24) =
"\01\00\00\00\00\00\00\00\03\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
}
{
key(24) =
"%\00\00\00\00\00\00\00\E8\03\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
data(480) =
"\02\00\00\00\00\00\00\00\01\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\A7\02\DDg\AD\C3\E0\A4\00\00\02\00\04\00\02\00\00\00\00\00\02\00\00\00\02\00\00\00\00\00\00\00\ACt\00\00\00\00\00\00\00\00\00\00\01\00\00\00\B1\0CZ\C4\E2\F92X\C6\00\00\00\00\00\00\00\00\00\00\00\FF\FF\FF\FF\81\00\10\00\07\00\00\00\00\00\00\00\00\00\00\00;\81\E7Z\00\00\00\00e\22\03\00%\00\00\00\00\00\00\00\E8\03\00\00\00\00\00\00\00\00\00\00\00\00\00\00\CA\F23\F9\00\00\00\00\FE\FF\00\00\00\00\00\00\A6\EB\F4'\00\00\00\00\00\00\00\00\ACt\00\00\00\00\00\00\00\00\00\00\01\00\00\00\B1\0CZ\C4\E2\F92Xm\13\00\00\00\00\00\00\00\00\00\00\FF\FF\FF\FF\81\00\10\00\07\00\00\00\00\00\00\00\00\00\00\00\C3\84\E7Z\00\00\00\00\F8\10\06\00%\00\00\00\00\00\00\00\E8\03\00\00\00\00\00\00\00\00\00\00\00\00\00\00C\B5\DB\0B\00\00\00\00\FE\FF\00\00\00\00\00\00\A6\EB\F4'\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00%\00\00\00\00\00\00\00\E8\03\00\00\00\00\00\00\00\00\00\00\00\00\00\00\10\00\00\00\00\00\00\00\10\00\00\00/exports/Public\00\02\00\00\00\00\00\00\00\02\00\00\00.\00\00\00\00\00\00\00;\81\E7Z\00\00\00\00e\22\03\00\00\00\00\00\C3\84\E7Z\00\00\00\00\F8\10\06\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
}
{
key(24) =
"\16\00\00\00\00\00\00\00b\0C\01\00\00\00\00\00\00\00\00\00\00\00\00\00"
data(24) =
"\01\00\00\00\00\00\00\00\03\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
}
{
key(24) =
"\16\00\00\00\00\00\00\00~\04\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
data(24) =
"\01\00\00\00\00\00\00\00\03\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
}
{
key(24) =
"\16\00\00\00\00\00\00\00\EAL\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
data(24) =
"\01\00\00\00\00\00\00\00\03\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
}
{
key(24) =
"\16\00\00\00\00\00\00\00\87@\02\00\00\00\00\00\00\00\00\00\00\00\00\00"
data(24) =
"\01\00\00\00\00\00\00\00\03\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
}
{
key(24) =
"\16\00\00\00\00\00\00\00\B1\F9\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
data(24) =
"\01\00\00\00\00\00\00\00\03\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
}
{
key(24) =
"\16\00\00\00\00\00\00\00\92>\02\00\00\00\00\00\00\00\00\00\00\00\00\00"
data(24) =
"\01\00\00\00\00\00\00\00\03\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
}
{
key(24) =
"\16\00\00\00\00\00\00\00\1B5\02\00\00\00\00\00\00\00\00\00\00\00\00\00"
data(24) =
"\01\00\00\00\00\00\00\00\03\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
}
{
key(24) =
"\16\00\00\00\00\00\00\00\CB\08\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
data(24) =
"\01\00\00\00\00\00\00\00\03\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
}
{
key(24) =
"\16\00\00\00\00\00\00\00t\FE\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
data(24) =
"\01\00\00\00\00\00\00\00\03\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
}
{
key(24) =
"\16\00\00\00\00\00\00\00z\11\02\00\00\00\00\00\00\00\00\00\00\00\00\00"
data(24) =
"\01\00\00\00\00\00\00\00\03\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
}
{
key(24) =
"\16\00\00\00\00\00\00\00\D6Y\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
data(24) =
"\01\00\00\00\00\00\00\00\03\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
}
{
key(24) =
"\16\00\00\00\00\00\00\00\9E\06\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
data(24) =
"\01\00\00\00\00\00\00\00\03\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
}
{
key(24) =
"\16\00\00\00\00\00\00\00r\09\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
data(24) =
"\01\00\00\00\00\00\00\00\03\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
}
{
key(24) =
"#\00\00\00\00\00\00\00\B1d\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
data(24) =
"\01\00\00\00\00\00\00\00\03\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
}
{
key(24) =
"\16\00\00\00\00\00\00\00\B0!\02\00\00\00\00\00\00\00\00\00\00\00\00\00"
data(24) =
"\01\00\00\00\00\00\00\00\03\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
}
{
key(24) =
"\16\00\00\00\00\00\00\00=\0C\01\00\00\00\00\00\00\00\00\00\00\00\00\00"
data(24) =
"\01\00\00\00\00\00\00\00\03\00\00\00\00\00\00\00\00\00\00\00\00\00\00\00"
}
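
To read the dump above more easily, the escaped strings can be turned back
into bytes and unpacked. This is a minimal Python sketch under two
assumptions that are not stated anywhere in this thread: that a locking.tdb
key is three little-endian 64-bit integers (device, inode, extended id), and
that each value starts with a CTDB record header made of a 64-bit sequence
number followed by 32-bit dmaster, reserved and flags fields, padded to 24
bytes. The function names are illustrative only.

```python
import struct

# ASSUMPTION: size of the per-record CTDB header, including padding.
HEADER_SIZE = 24


def unescape(text):
    """Convert a tdbdump string (backslash + two hex digits per
    non-printable byte) back into raw bytes."""
    text = text.replace("\n", "")  # undo mail line wrapping
    out = bytearray()
    i = 0
    while i < len(text):
        if text[i] == "\\":
            out.append(int(text[i + 1:i + 3], 16))
            i += 3
        else:
            out.append(ord(text[i]))
            i += 1
    return bytes(out)


def decode_key(raw):
    # ASSUMPTION: key = devid, inode, extid as little-endian uint64.
    devid, inode, extid = struct.unpack("<QQQ", raw)
    return {"devid": devid, "inode": inode, "extid": extid}


def decode_header(raw):
    # ASSUMPTION: header = rsn (u64), dmaster (u32), reserved (u32),
    # flags (u32), followed by 4 bytes of padding.
    rsn, dmaster, _reserved, flags = struct.unpack_from("<QIII", raw)
    return {
        "rsn": rsn,
        "dmaster": dmaster,
        "flags": flags,
        "payload_len": len(raw) - HEADER_SIZE,
    }
```

If those layout assumptions hold, the first record in the dump decodes to
device 22, inode 68707, with rsn 1, dmaster 3 and no payload after the
header, while the 480-byte record would carry 456 bytes of payload.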


-- 
Thanks & Regards
-Partha

