CTDB internals

Christopher R. Hertel crh at ubiqx.mn.org
Thu Nov 1 23:23:44 MDT 2012

On 11/02/2012 12:00 AM, Amitay Isaacs wrote:
> Hi Chris,
> On Fri, Nov 2, 2012 at 3:28 PM, Christopher R. Hertel <crh at ubiqx.mn.org> wrote:
>> Amitay, Obnox, et al.,
>> I just want to make sure that I've got this right...
>> Reviewing Michael's tutorial, given in 2009 at SambaXP, here's what I get:
>> * The underlying tables are all TDB tables.
>> * These TDB tables are of three types:
>>   1) Persistent
>>   2) Normal ("volatile")
>>   3) Recovery
> There are only two types of databases: persistent and normal. The recovery
> file is just a regular file, not a TDB database.


...but access is still arbitrated using fcntl byte-range locks.  Is that
correct?

>> I think I generally understand how these work.  I have some questions about
>> the sequence of events when writing to a Persistent TDB, but those can wait.
>> My immediate questions are:
>> Q: Is the CTDB_RECOVERY_LOCK file the only tdb file that will be stored on
>>    shared disk and concurrently accessed by multiple nodes?
> Yes, the CTDB_RECOVERY_LOCK file is the only file that is stored on the
> shared storage for concurrent access, to resolve split-brain situations
> and to perform recoveries.
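
To make the arbitration concrete, here is a minimal sketch of how a single
regular file can act as a cluster-wide mutex by way of an fcntl lock. It is
not CTDB's code. The function names and the choice of locking byte 0 are
assumptions made for illustration, and the approach only works across nodes
if the shared filesystem provides coherent fcntl locking.

```python
import fcntl
import os
import tempfile


def try_take_lock(path):
    """Try to take an exclusive, non-blocking fcntl lock on byte 0 of path.

    Returns the open file descriptor on success (the lock is held for as
    long as the descriptor stays open), or None if another process
    already holds the lock.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        # Lock 1 byte starting at offset 0 from the start of the file.
        fcntl.lockf(fd, fcntl.LOCK_EX | fcntl.LOCK_NB, 1, 0, os.SEEK_SET)
    except OSError:
        os.close(fd)
        return None
    return fd


def contender_sees_lock_held(path):
    """Fork a child that tries for the same lock; True if it was refused.

    A separate process is needed because fcntl locks are owned per
    process: a second attempt from the same process would simply succeed.
    """
    pid = os.fork()
    if pid == 0:
        fd = try_take_lock(path)
        os._exit(0 if fd is None else 1)
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0


if __name__ == "__main__":
    # Hypothetical path standing in for the file on shared storage.
    path = os.path.join(tempfile.mkdtemp(), "reclock")
    fd = try_take_lock(path)
    print("holder got lock:", fd is not None)
    print("contender refused:", contender_sees_lock_held(path))
```

Whoever holds the lock is the one allowed to proceed, and everyone else is
refused until the holder closes the file or exits. That is the property
needed to keep two halves of a split cluster from both acting at once.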


In our test case, we have a couple of other files in there.  For example,
/etc/sysconfig/ctdb is symlinked to a shared file so that we only have to
edit the file once.

>> Q: For the other two types (Persistent and Normal), is the ctdbd daemon
>>    the only reader/writer to the local TDBs?  For Normal LTDBs in
>>    particular, is fcntl byte-range locking used to manage access in any
>>    way?
> For non-persistent databases, both smbd and ctdbd can read and write the
> local TDBs. Access is ordered by fcntl byte-range locks. smbd accesses a
> record from the local TDBs only when the local CTDB node is the data master
> for that record.
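
As an illustration of what "byte-range" buys over a whole-file lock, the
sketch below locks a single byte per hash chain, so two processes only
contend when they touch the same chain. The names chain_lock and LOCK_BASE
and the offset layout are hypothetical and are not taken from the TDB
source.

```python
import fcntl
import os
from contextlib import contextmanager

LOCK_BASE = 0  # hypothetical offset of the first per-chain lock byte


@contextmanager
def chain_lock(fd, chain, blocking=True):
    """Hold an exclusive fcntl lock on the one byte that stands for `chain`."""
    cmd = fcntl.LOCK_EX if blocking else fcntl.LOCK_EX | fcntl.LOCK_NB
    fcntl.lockf(fd, cmd, 1, LOCK_BASE + chain, os.SEEK_SET)
    try:
        yield
    finally:
        # Release only this chain's byte; other ranges are untouched.
        fcntl.lockf(fd, fcntl.LOCK_UN, 1, LOCK_BASE + chain, os.SEEK_SET)


def other_process_can_lock(path, chain):
    """Fork a child and report whether it could lock `chain` without waiting."""
    pid = os.fork()
    if pid == 0:
        fd = os.open(path, os.O_RDWR)
        try:
            with chain_lock(fd, chain, blocking=False):
                os._exit(0)  # got the lock; it is dropped when the child exits
        except OSError:
            os._exit(1)      # someone else holds this byte
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) == 0
```

With one process inside chain_lock(fd, 3), a second process is refused on
chain 3 but can still lock chain 4. That is how two daemons can share one
file without serializing every access.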

Q: To do that, smbd would have to go through CTDB somehow, because only
   the ctdbd would know if it were master.  Is that correct?
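
One possibility, sketched here under an assumption that is not confirmed in
this thread: if each local record carried a field naming its current data
master, a client could test that field itself and fall back to the daemon
only when the local node is not the master. Record, dmaster, fetch and
ask_daemon are hypothetical names, and the locking around the read is
omitted to keep the sketch short.

```python
from dataclasses import dataclass


@dataclass
class Record:
    dmaster: int   # assumed field: node number that is data master
    data: bytes


def fetch(local_db, key, local_node, ask_daemon):
    """Return the data for key, using the local copy only when it is safe.

    local_db is a dict standing in for the local TDB; ask_daemon is a
    callable standing in for a request to the daemon.
    """
    rec = local_db.get(key)
    if rec is not None and rec.dmaster == local_node:
        # Fast path: this node is data master, so the local copy is current.
        return rec.data
    # Slow path: record is missing or mastered elsewhere; ask the daemon.
    return ask_daemon(key)
```

Under that assumption the daemon is only involved on the slow path, which
would explain why a client is allowed to open the local TDBs at all.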

> For persistent databases, the CTDB transaction API is used to write data to the TDBs.

I have questions on how that works but they can wait.


Chris -)-----
"Implementing CIFS - the Common Internet FileSystem" ISBN: 013047116X
Samba Team -- http://www.samba.org/     -)-----   Christopher R. Hertel
jCIFS Team -- http://jcifs.samba.org/   -)-----   ubiqx development, uninq.
ubiqx Team -- http://www.ubiqx.org/     -)-----   crh at ubiqx.mn.org
OnLineBook -- http://ubiqx.org/cifs/    -)-----   crh at ubiqx.org
