tdb transaction nesting and ctdb

tridge at samba.org
Tue Apr 28 01:34:39 GMT 2009


Hi Ronnie,

I'm looking at your commit in the tdb code for ctdb:

  http://git.samba.org/?p=sahlberg/ctdb.git;a=commitdiff;h=459e4ee135bd1cd24c15e5325906eb4ecfd550ec;hp=70f21428c9eec96bcc787be191e7478ad68956dc

As we discussed last week, I think that adding a flag that disables
nested tdb transactions is a good idea, but I think your patch goes
about it the wrong way.

The reason we should have a flag to disable nested tdb transactions is
that two pieces of code in the same application that both use
transactions can easily step on each other's toes, which is what
happened with ctdb. When one piece of code is running a transaction,
and a second piece of code starts a new transaction, then cancels it,
the first transaction is currently put into an error state, which
causes operations to be lost. That is not good, but at least the
application is told that operations have been lost.
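
To make that failure mode concrete, here is a minimal toy model in C. It
is only a sketch of the behaviour described above, not the tdb
implementation: the nest_* names and the struct are hypothetical, and a
"write" is reduced to a counter.

```c
/* Toy model only: the nest_* names are hypothetical, not the tdb API. */
struct nest_txn {
    int active;   /* 1 while an outermost transaction is open */
    int nesting;  /* number of nested starts still open */
    int error;    /* set once a nested transaction has been cancelled */
    int pending;  /* writes made inside the transaction */
    int stored;   /* writes that actually reached the "database" */
};

int nest_start(struct nest_txn *t)
{
    if (t->active) {
        t->nesting++;          /* a nested start only bumps a counter */
        return 0;
    }
    t->active = 1;
    t->nesting = 0;
    t->error = 0;
    t->pending = 0;
    return 0;
}

void nest_write(struct nest_txn *t)
{
    t->pending++;
}

int nest_cancel(struct nest_txn *t)
{
    if (!t->active)
        return -1;
    if (t->nesting > 0) {
        t->error = 1;          /* the outer transaction is now poisoned */
        t->nesting--;
        return 0;
    }
    t->active = 0;
    t->pending = 0;
    return 0;
}

int nest_commit(struct nest_txn *t)
{
    if (!t->active)
        return -1;
    if (t->error) {
        /* Everything is dropped, but the caller is told about it. */
        t->active = 0;
        t->nesting = 0;
        t->pending = 0;
        return -1;
    }
    if (t->nesting > 0) {
        t->nesting--;          /* inner commit: nothing is stored yet */
        return 0;
    }
    t->stored += t->pending;
    t->pending = 0;
    t->active = 0;
    return 0;
}

/* Two pieces of code sharing one handle; returns the outer commit result. */
int nest_demo_nested_cancel(struct nest_txn *t)
{
    nest_start(t);             /* first piece of code */
    nest_write(t);
    nest_start(t);             /* second piece of code, nested */
    nest_write(t);
    nest_cancel(t);            /* second piece cancels its transaction */
    return nest_commit(t);     /* first piece only finds out here */
}
```

In this model the outer commit fails and nothing is stored, so the
operations are lost, but the first caller does get an error back.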

With your change the situation is now worse, as operations can be
silently lost. You've added a TDB_NO_NESTING flag for tdb, which when
set means that a new transaction auto-cancels any currently
outstanding transaction. That means that new transactions will undo
any previous operations, without the caller of the previous
transaction having any way to know that this has happened. It may work
for the specific problem you are addressing in ctdb, but I think it is
a very poor API.

What I'd suggest is that we have a TDB_NO_NESTED_TRANSACTIONS flag,
which causes any attempt to create a nested transaction to fail, with
a new TDB_ERR_NESTING tdb error code. That way the operations in the
existing transaction are not lost.
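
The difference between the two designs can be sketched with a toy model
in C. This is an illustration under simplifying assumptions, not tdb
code: the pol_* functions, the POL_* constants and the counter-based
"writes" are all hypothetical stand-ins for the auto-cancel behaviour
and for the proposed fail-on-nesting behaviour.

```c
/* Toy model only: all pol_* / POL_* names are hypothetical, not the tdb API. */
enum pol_policy { POL_AUTO_CANCEL, POL_REJECT_NESTED };
enum pol_error { POL_OK = 0, POL_ERR_NESTING = 1 };

struct pol_db {
    enum pol_policy policy;
    enum pol_error last_error;
    int active;   /* 1 while a transaction is open */
    int pending;  /* writes made inside the open transaction */
    int stored;   /* writes that reached the "database" */
};

int pol_start(struct pol_db *db)
{
    db->last_error = POL_OK;
    if (db->active) {
        if (db->policy == POL_REJECT_NESTED) {
            /* Proposed behaviour: refuse, leave the old transaction alone. */
            db->last_error = POL_ERR_NESTING;
            return -1;
        }
        /* Auto-cancel behaviour: earlier writes are silently discarded. */
        db->pending = 0;
    }
    db->active = 1;
    return 0;
}

void pol_write(struct pol_db *db)
{
    if (db->active)
        db->pending++;
}

int pol_commit(struct pol_db *db)
{
    if (!db->active)
        return -1;
    db->stored += db->pending;
    db->pending = 0;
    db->active = 0;
    return 0;
}

/*
 * The first caller writes twice, a second caller then tries to start a
 * transaction, and the first caller writes once more and commits.
 * Returns how many of the first caller's three writes were stored.
 */
int pol_demo(enum pol_policy policy, int *nested_start_ret,
             int *outer_commit_ret)
{
    struct pol_db db = { .policy = policy };

    pol_start(&db);
    pol_write(&db);
    pol_write(&db);
    *nested_start_ret = pol_start(&db);   /* second caller */
    pol_write(&db);
    *outer_commit_ret = pol_commit(&db);  /* first caller */
    return db.stored;
}
```

With POL_AUTO_CANCEL the first caller's commit reports success even
though only one of its three writes is stored. With POL_REJECT_NESTED
the second caller gets -1 and all three writes survive.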

The behaviour you want for ctdb can then be achieved like this:

    ret = tdb_transaction_start(tdb);
    if (ret == -1 && tdb_error(tdb) == TDB_ERR_NESTING) {
        DEBUG(0,(__location__ " Cancelling old transaction\n"));
        tdb_transaction_cancel(tdb);
        ret = tdb_transaction_start(tdb);
    }

Does that sound ok?

I also wonder why you set/unset TDB_NO_NESTING in the ctdb code? What
situations are there in ctdb where you think nested transactions are
desirable?

The tdb code in ctdb is also starting to diverge a bit from the
mainline tdb code. I think we should try to keep the two copies of
tdb in sync as far as possible.

Cheers, Tridge

