tdb transaction nesting and ctdb

ronnie sahlberg ronniesahlberg at
Tue Apr 28 03:32:15 GMT 2009


Ill change tdb_transaction_start() to return an error instead of cancelling
the previous transaction.
This does not matter to CTDB.

Actually this change will only change the semantics off which transaction
will succeed and which will fail.
Currently the innermost transaction will succeed while the outermost
transaction will fail. I personally think this is more logical semantics but
I dont feel strongly enough about it to argue.
I will change it so that instead the innermost transaction will fail and the
outermost transaction will succeed.

I consider nested transactions to be unsafe for callers that are not aware
of the nesting since they can trigger behaviour that are not following the
path of least surprise, for example :

1, tdb_transaction_start()
2, tdb_transaction_start()  -- nested
3, tdb_transaction_commit() -- successful
4, ...   error ocucrs
5, tdb_transaction_commit() -- failure

For the unsuspecting callers this means that tdb responded to the caller in
3 saying success == your transaction was successfully comitted.

However, since this was a nested transaction this didnt commit anything at
all but just decrease the ensting counter.
A short while later the outmost transaction failed, and implicitely this
caused the innermost transaction 2,3 to fail as well.
At that stage it has already been reploied to 3 that the commit was
successful and it would be a bit late to come back and say "remember that
transaction i said was comitted a while ago?  well thing is I changed my
mind, it actually failed".

Thus for safe use of transactions a caller would actually need to know
whether or not the transaction created was top-level or not, because if it
was not a top-level transaction then the return code from
tdb_transaction_commit() is basically undefined and does not tell whether
the transaction completed successfully or not.

For that case I think it would actually be better to invert the flag to be
and have tdb_transaction_start() fail if the flag is not explicitely set by
the application and if there was already a transaction in flight.

>   ret = tdb_transaction_start(tdb);
>   if (ret == -1 && tdb_error(tdb) == TDB_ERR_NESTING) {
>        DEBUG(0,(__location__ " Cancelling old transaction\n"));
>        tdb_transaction_cancel(tdb);
>        ret = tdb_transaction_start(tdb);
>   }
>Does that sound ok?

That sounds ok.

I was thinking about making even simpler like:

while(tdb_transaction_cancel() == 0) ;
ret = tdb_transaction_start();

But that would not be as clear to just why the tdb_transaction_cancel() is

ronnie sahlberg

On Tue, Apr 28, 2009 at 11:34 AM, <tridge at> wrote:

> Hi Ronnie,
> I'm looking at your commit in the tdb code for ctdb:
> As we discussed last week, I think that adding a flag that disables
> nested tdb transactions is a good idea, but I think your patch goes
> about it the wrong way.
> The reason we should have a flag to disable nested tdb transactions is
> that two pieces of code in the same application that both use
> transactions can easily step on each others toes, which is what
> happened with ctdb. When one piece of code is running a transaction,
> and a second piece of code starts a new transaction, then cancels it,
> the first transaction is currently put into an error state, which
> causes operations to be lost. That is not good, but at least the
> application is told that operations have been lost.
> With your change the situation is now worse, as operations can be
> silently lost. You've added a TDB_NO_NESTING flag for tdb, which when
> set means that a new transactions auto-cancels any currently
> outstanding transaction. That means that new transactions will undo
> any previous operations, without the caller of the previous
> transaction having any way to know that this has happened. It may work
> for the specific problem you are addressing in ctdb, but I think it is
> a very poor API.
> What I'd suggest is that we have a TDB_NO_NESTED_TRANSACTIONS flag,
> which causes any attempt to create a nested transaction to fail, with
> a new TDB_ERR_NESTING tdb error code. This means that existing
> transaction operations are not lost.
> The behaviour you want for ctdb can then be achieved like this:
>    ret = tdb_transaction_start(tdb);
>    if (ret == -1 && tdb_error(tdb) == TDB_ERR_NESTING) {
>         DEBUG(0,(__location__ " Cancelling old transaction\n"));
>         tdb_transaction_cancel(tdb);
>         ret = tdb_transaction_start(tdb);
>    }
> Does that sound ok?
> I also wonder why you set/unset TDB_NO_NESTING in the ctdb code? What
> situations are there in ctdb where you think nested transactions are
> desirable?
> The tdb code in ctdb is also starting to diverge a bit from the
> mainline tdb code. I think we should try and keep the two copies of
> tdb in sync as far as possible.
> Cheers, Tridge

More information about the samba-technical mailing list