Using TDB for glib/gobject applications

Tue Jun 23 16:06:44 MDT 2009

Hi Simo,

 > Concurrent transactions are basically impossible with TDB, how would you
 > handle 2 separate transaction create an object with the same key ?

I wouldn't go as far as saying that they are impossible, but they are
quite tricky. What you need is essentially what Ronnie and I added for
the cluster-wide transactions in ctdb (see ctdb_replay_transaction()).

I've CC'd Rusty as he is currently working on a possible new format and
transaction system for tdb, so he may have some comments.

The way it works is this:

  - during a transaction, record in some data structure (e.g. linked
    list) all of the operations of the transaction. The key thing to
    record is all the keys of the records that are read and the state
    of the record. If the records had a sequence number that would be
    ideal, otherwise a moderately strong hash of the record would do.

  - during the main transaction phase the locking would prevent
    non-transaction writes from happening, but would not prevent other
    transactions from starting.

  - during the commit phase an exclusive lock would be taken (so only
    one commit happens at a time), and the commit logic would re-read
    all the records that were read during the transaction. If any of
    them have changed then we would have to fail the transaction. (For
    ctdb we had to do something a bit more complex, where we had the
    possibility of replaying the whole transaction. That was needed to
    cope with network disconnects, which we don't have to worry about
    for local tdb.)
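
As a rough illustration of the read-set idea above, here is a minimal
sketch in C. It is not tdb code: the in-memory store, struct txn,
txn_fetch() and txn_validate() are all hypothetical names made up for
the example, and FNV-1a is only a stand-in for the "moderately strong
hash" (a real implementation would want something stronger, or a
sequence number). It also assumes that a fetch of a missing key is
recorded as "absent", so that two transactions creating the same key
conflict at commit time.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_RECORDS 8
#define MAX_READS   8

/* Toy in-memory store standing in for a tdb file (not the tdb API). */
struct record { char key[16]; char value[32]; };
static struct record store[MAX_RECORDS];
static int store_len;

/* FNV-1a, used here only as a placeholder record hash. */
static uint32_t hash_value(const char *s)
{
    uint32_t h = 2166136261u;
    for (; *s; s++) {
        h ^= (unsigned char)*s;
        h *= 16777619u;
    }
    return h;
}

static struct record *store_find(const char *key)
{
    for (int i = 0; i < store_len; i++)
        if (strcmp(store[i].key, key) == 0)
            return &store[i];
    return NULL;
}

/* Create or overwrite a record (stands in for a committed write). */
static void store_put(const char *key, const char *value)
{
    struct record *r = store_find(key);
    if (r == NULL) {
        assert(store_len < MAX_RECORDS);
        r = &store[store_len++];
        snprintf(r->key, sizeof(r->key), "%s", key);
    }
    snprintf(r->value, sizeof(r->value), "%s", value);
}

/* One entry per record read: its key and the state we saw. */
struct read_entry { char key[16]; int existed; uint32_t hash; };
struct txn { struct read_entry reads[MAX_READS]; int nreads; };

/* Fetch inside a transaction, remembering what was observed. */
static const char *txn_fetch(struct txn *t, const char *key)
{
    struct record *r = store_find(key);
    assert(t->nreads < MAX_READS);
    struct read_entry *e = &t->reads[t->nreads++];
    snprintf(e->key, sizeof(e->key), "%s", key);
    e->existed = (r != NULL);
    e->hash = r ? hash_value(r->value) : 0;
    return r ? r->value : NULL;
}

/*
 * Commit-time check. The caller is assumed to hold the exclusive
 * commit lock. Returns 0 if every record read is unchanged, or -1
 * if any of them changed, appeared or disappeared.
 */
static int txn_validate(const struct txn *t)
{
    for (int i = 0; i < t->nreads; i++) {
        const struct read_entry *e = &t->reads[i];
        struct record *r = store_find(e->key);
        if ((r != NULL) != e->existed)
            return -1;
        if (r != NULL && hash_value(r->value) != e->hash)
            return -1;
    }
    return 0;
}
```

A commit would call txn_validate() under the exclusive lock and only
apply its writes if it returns 0.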

The big change would be that code that uses transactions would need to
cope with the possibility of a transaction commit failing due to a
conflict with another transaction. The usual way to cope with that is
a retry loop. That was why I didn't put concurrent transactions in tdb
- I wanted to keep the use of transactions simple from the programmer's
point of view.
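
For what it's worth, the retry loop a caller would need looks roughly
like the sketch below. Everything in it is hypothetical (the
txn_result codes, run_with_retry() and the flaky_txn() test body are
invented for illustration and are not part of tdb); the only point is
that the whole transaction body is re-run on a conflict, with a cap on
the number of attempts.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes for one attempt at a transaction. */
enum txn_result { TXN_OK = 0, TXN_CONFLICT = 1, TXN_ERROR = 2 };

/* One full attempt: start, do the work, try to commit. */
typedef enum txn_result (*txn_fn)(void *arg);

/*
 * Re-run the whole transaction body while it fails with a conflict,
 * up to max_attempts times. Any other result (success or a hard
 * error) is returned immediately. The number of attempts made is
 * stored in *attempts_used if that pointer is not NULL.
 */
static enum txn_result run_with_retry(txn_fn fn, void *arg,
                                      int max_attempts,
                                      int *attempts_used)
{
    enum txn_result res = TXN_ERROR;
    int n = 0;

    assert(fn != NULL);
    while (n < max_attempts) {
        n++;
        res = fn(arg);
        if (res != TXN_CONFLICT)
            break;
    }
    if (attempts_used != NULL)
        *attempts_used = n;
    return res;
}

/* Example body: conflicts a given number of times, then succeeds. */
static enum txn_result flaky_txn(void *arg)
{
    int *conflicts_left = arg;

    if (*conflicts_left > 0) {
        (*conflicts_left)--;
        return TXN_CONFLICT;
    }
    return TXN_OK;
}
```

The body has to be safe to run more than once, which is exactly the
extra burden on the programmer mentioned above.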

One thing we could consider is having a tdb flag
TDB_CONCURRENT_TRANSACTIONS to say that we are happy to cope with
concurrent transactions. So existing code would get reliable
transactions (reliable in the sense that you don't need to retry
them), whereas places that need concurrent transactions could ask for
them.

The main place I could see us using concurrent transactions in Samba
would be ldb, where it might solve a problem we have at the moment,
where the single-process model in Samba4 can have multiple ldap
'transactions' at the ldb level mapped to a single tdb transaction. To
solve that I think that we'd also have to introduce the concept of a
"transaction handle" to identify which transaction we're talking
about, or find some way to have the same tdb open twice in the same
process (perhaps similar to what Howard did for the split of tdb into
per-thread and global parts of the tdb structure).

Christian, do you want concurrent transactions within a process or
between processes?

Cheers, Tridge