[PATCH] tdb traverse transaction unlocking bug

Michael Adam obnox at samba.org
Fri Jul 17 17:14:49 MDT 2009


Hi Rusty,

thanks for bringing this up (again)!

Rusty Russell wrote:
> Recent playing with tdb revealed that the code which takes the transaction
> lock for a traversal is buggy.  This has two causes:
> 
> 1) Any (non-read) traversal inside a transaction causes the transaction lock
>     to be released when the traversal finishes, and
> 2) Traversals inside traversals drop the transaction lock early (when the
>     inner traverse finishes).
> 
> Fix is to do proper recursion with the transaction lock, like other locks.
> 
> The samba version of tdb has a different workaround in traverse.c (Volker's
> patch of  May 20 2008), which solves the first problem, but not the second (and
> this patch still applies, but then the code in traverse.c should be removed).

Initially I thought that a rather elegant alternative fix would be
to replace the transaction_(un)locks in the transaction_methods
by a no-op, but this would also only fix #1. :-)

So the transaction-lock-counter solution (that had also already
been mentioned in Volker's commit) is definitively the right way.

> Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
> 
> diff --git a/lib/tdb/common/lock.c b/lib/tdb/common/lock.c
> index f156c0f..6f75787 100644
> --- a/lib/tdb/common/lock.c
> +++ b/lib/tdb/common/lock.c
> @@ -301,7 +301,11 @@ int tdb_unlock(struct tdb_context *tdb, int list, int ltype)
>   */
>  int tdb_transaction_lock(struct tdb_context *tdb, int ltype)
>  {
> -	if (tdb->have_transaction_lock || tdb->global_lock.count) {
> +	if (tdb->global_lock.count) {
> +		return 0;
> +	}
> +	if (tdb->have_transaction_lock) {
> +		tdb->have_transaction_lock++;
>  		return 0;
>  	}
>  	if (tdb->methods->tdb_brlock(tdb, TRANSACTION_LOCK, ltype, 
> @@ -310,7 +314,7 @@ int tdb_transaction_lock(struct tdb_context *tdb, int ltype)
>  		tdb->ecode = TDB_ERR_LOCK;
>  		return -1;
>  	}
> -	tdb->have_transaction_lock = 1;
> +	tdb->have_transaction_lock++;
>  	return 0;
>  }
>  
> @@ -319,15 +323,13 @@ int tdb_transaction_lock(struct tdb_context *tdb, int ltype)
>   */
>  int tdb_transaction_unlock(struct tdb_context *tdb)
>  {
> -	int ret;
> -	if (!tdb->have_transaction_lock) {
> +	if (tdb->global_lock.count) {
>  		return 0;
>  	}

Hmm, I don't quite get why you remove the check
if (!tdb->have_transaction_lock):

Either you rely on the caller to only call transaction_unlock()
when transaction_lock() has been called previously, and then
you don't need to check for tdb->global_lock.count either,
or else you should also check for tdb->have_transaction_lock
or you risk decrementing have_transaction_lock when it is == 0.

> -	ret = tdb->methods->tdb_brlock(tdb, TRANSACTION_LOCK, F_UNLCK, F_SETLKW, 0, 1);
> -	if (ret == 0) {
> -		tdb->have_transaction_lock = 0;
> +	if (--tdb->have_transaction_lock) {
> +		return 0;
>  	}
> -	return ret;
> +	return tdb->methods->tdb_brlock(tdb, TRANSACTION_LOCK, F_UNLCK, F_SETLKW, 0, 1);
>  }
>  
>  
> diff --git a/lib/tdb/common/tdb_private.h b/lib/tdb/common/tdb_private.h
> index ffac89f..c2f51db 100644
> --- a/lib/tdb/common/tdb_private.h
> +++ b/lib/tdb/common/tdb_private.h
> @@ -166,7 +168,7 @@ struct tdb_context {
>  	struct tdb_transaction *transaction;
>  	int page_size;
>  	int max_dead_records;
> -	bool have_transaction_lock;
> +	int have_transaction_lock;
>  	volatile sig_atomic_t *interrupt_sig_ptr;
>  };

Maybe we should also rename have_transaction_lock to
transaction_lock_count (or so).

Cheers - Michael

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 206 bytes
Desc: not available
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20090718/b1757b2a/attachment.pgp>


More information about the samba-technical mailing list