RFC: Readonly record support in ctdb and later samba

ronnie sahlberg ronniesahlberg at gmail.com
Wed Aug 31 20:26:34 MDT 2011


CTDB has traditionally only supported exclusive record locks, where
only a single node may be the DMASTER for a record and hold an
authoritative version of the record and its data.
This works well for most CIFS workloads, but not when you have multiple
clients accessing the same record at the same time.

For example, consider multiple clients reading from the same file at
the same time. Each ReadAndX operation requires that the most
up-to-date record is re-fetched across the cluster network every time
to check whether it conflicts with a lock.
This adds high cluster latency, since there is an additional
roundtrip across the nodes for every single ReadAndX, and it also
limits scalability and performance.

To address this common case where many clients just need to
read a record like this, we have implemented shared readonly locks
for ctdb records.
The initial implementation is currently not in master but only in the
master-readonly-records branch, together with a text document
describing the design.

This adds a delegation mechanism where the DMASTER can hand out
special readonly copies of a record to other nodes on request, and
later revoke these delegations in case we need to take out an
exclusive writelock.
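
To make the delegation idea concrete, here is a minimal toy sketch in
Python. It is not ctdb code: the Record class, grant_readonly(), write()
and the revoke callback are all hypothetical names, and the real
mechanism involves messages between nodes that this model ignores. It
only shows the ordering: outstanding readonly copies are revoked before
the exclusive write goes ahead.

```python
class Record:
    """Toy model of one record as seen by its DMASTER (hypothetical)."""

    def __init__(self, data):
        self.data = data
        self.delegations = set()  # ids of nodes holding a readonly copy

    def grant_readonly(self, node_id):
        """Hand a readonly copy to node_id and remember the delegation."""
        self.delegations.add(node_id)
        return self.data

    def write(self, new_data, revoke):
        """Revoke every outstanding delegation, then update the record."""
        for node_id in sorted(self.delegations):
            revoke(node_id)  # callback: tell that node to drop its copy
        self.delegations.clear()
        self.data = new_data


revoked = []                      # records which nodes were told to revoke
rec = Record(b"v1")
copy_a = rec.grant_readonly(1)    # node 1 gets a readonly copy
copy_b = rec.grant_readonly(2)    # node 2 gets a readonly copy
rec.write(b"v2", revoked.append)  # both copies are revoked before the write
```

After the write, no delegations remain, so any node that wants to read
the record again has to ask the DMASTER for a fresh readonly copy.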

This will require some work in samba db_wrap to make it aware of the
new type of cached readonly records in the database and to utilize
them, so that it avoids a roundtrip to either ctdbd itself or
across the network to a different node every time it just needs to
read a common hot record.
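
As a rough illustration of the read path this is aiming for, the
following Python sketch serves repeated reads of a hot record from a
local readonly copy and only does the expensive fetch on a miss. The
CachedReader class and the fetch_remote callable are assumptions made
up for this example; they do not correspond to any existing db_wrap or
ctdb interface.

```python
class CachedReader:
    """Toy client-side cache of readonly record copies (hypothetical)."""

    def __init__(self, fetch_remote):
        # fetch_remote(key) stands in for the roundtrip to ctdbd or to
        # another node; it is an assumed callable, not a real API.
        self.fetch_remote = fetch_remote
        self.cache = {}
        self.roundtrips = 0

    def read(self, key):
        """Return the record, using the local readonly copy if present."""
        if key in self.cache:
            return self.cache[key]
        self.roundtrips += 1
        value = self.fetch_remote(key)
        self.cache[key] = value
        return value

    def revoke(self, key):
        """Drop the local copy when the delegation is revoked."""
        self.cache.pop(key, None)


store = {"hot-record": b"no conflicting locks"}
reader = CachedReader(store.__getitem__)

for _ in range(3):
    reader.read("hot-record")     # only the first read does a roundtrip
reads_before_revoke = reader.roundtrips

reader.revoke("hot-record")       # delegation revoked, copy dropped
reader.read("hot-record")         # next read must fetch again
reads_after_revoke = reader.roundtrips
```

The point of the sketch is only the counting: three reads cost one
roundtrip while the readonly copy is valid, and a revoke forces exactly
one more.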

Please review the doc/readonlyrecords.txt for the initial design.

ronnie sahlberg
