Bugfix for tdb transactions

Michael Adam obnox at samba.org
Sun Jan 31 11:24:29 MST 2010


Volker Lendecke wrote:
> On Sun, Jan 31, 2010 at 08:10:20PM +1100, tridge at samba.org wrote:
> > I think if you leave the tdb->transaction->prepared boolean alone, and
> > instead add a new bool to the transaction structure that gets set when
> > the GLOBAL_LOCK is taken then the patch will be good.
> 
> Thanks for the review! New patch attached.

Oh yes, much better! In the quick fix I was attempting,
I was trying to avoid introduction of another global state
and rely on the stacking of the actions instead.
This patch is much clearer and simpler, apart from being
correct. =)

Cheers - Michael

> This time I did
> run tdbtorture, seems to survive it fine. I'll run the
> original test that found the bug tomorrow (need to VPN into
> the customer network).
> 
> > Thanks for finding the bug! I can imagine that this was not an easy
> > one to find :-)
> 
> The main problem why it took so long was that I trusted the
> tdb code... I looked everywhere but in lib/tdb, but once I had
> instrumented ctdb to not use mmap so that I could see the
> actual db access, it was pretty obvious what was going on.
> 
> Volker

> From 155f0d40253a0df850231298ee2edbd8dac229e0 Mon Sep 17 00:00:00 2001
> From: Volker Lendecke <vl at samba.org>
> Date: Fri, 29 Jan 2010 18:21:09 +0100
> Subject: [PATCH] tdb: fix an early release of the global lock that can cause data corruption
> 
> There was a bug in tdb where the
> 
>                 tdb_brlock(tdb, GLOBAL_LOCK, F_UNLCK, F_SETLKW, 0, 1);
> 
> (ending the transaction-"mutex") was done before the
> 
>                         /* remove the recovery marker */
> 
> This means that when a transaction is committed there is a window where another
> opener of the file sees the transaction marker while the transaction committer
> is still fully functional and working on it. This led to transaction being
> rolled back by that second opener of the file while transaction_commit() gave
> no error to the caller.
> 
> This patch moves the F_UNLCK to after the recovery marker was removed, closing
> this window.
> ---
>  lib/tdb/common/transaction.c |   15 ++++++++++-----
>  1 files changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/lib/tdb/common/transaction.c b/lib/tdb/common/transaction.c
> index 20f2bfc..b8988ea 100644
> --- a/lib/tdb/common/transaction.c
> +++ b/lib/tdb/common/transaction.c
> @@ -135,6 +135,9 @@ struct tdb_transaction {
>  	bool prepared;
>  	tdb_off_t magic_offset;
>  
> +	/* set when the GLOBAL_LOCK has been taken */
> +	bool global_lock_taken;
> +
>  	/* old file size before transaction */
>  	tdb_len_t old_map_size;
>  
> @@ -603,6 +606,11 @@ int _tdb_transaction_cancel(struct tdb_context *tdb)
>  		}
>  	}
>  
> +	if (tdb->transaction->global_lock_taken) {
> +		tdb_brlock(tdb, GLOBAL_LOCK, F_UNLCK, F_SETLKW, 0, 1);
> +		tdb->transaction->global_lock_taken = false;
> +	}
> +
>  	/* remove any global lock created during the transaction */
>  	if (tdb->global_lock.count != 0) {
>  		tdb_brlock(tdb, FREELIST_TOP, F_UNLCK, F_SETLKW, 0, 4*tdb->header.hash_size);
> @@ -947,11 +955,12 @@ static int _tdb_transaction_prepare_commit(struct tdb_context *tdb)
>  		return -1;
>  	}
>  
> +	tdb->transaction->global_lock_taken = true;
> +
>  	if (!(tdb->flags & TDB_NOSYNC)) {
>  		/* write the recovery data to the end of the file */
>  		if (transaction_setup_recovery(tdb, &tdb->transaction->magic_offset) == -1) {
>  			TDB_LOG((tdb, TDB_DEBUG_FATAL, "tdb_transaction_prepare_commit: failed to setup recovery data\n"));
> -			tdb_brlock(tdb, GLOBAL_LOCK, F_UNLCK, F_SETLKW, 0, 1);
>  			_tdb_transaction_cancel(tdb);
>  			return -1;
>  		}
> @@ -966,7 +975,6 @@ static int _tdb_transaction_prepare_commit(struct tdb_context *tdb)
>  					     tdb->transaction->old_map_size) == -1) {
>  			tdb->ecode = TDB_ERR_IO;
>  			TDB_LOG((tdb, TDB_DEBUG_FATAL, "tdb_transaction_prepare_commit: expansion failed\n"));
> -			tdb_brlock(tdb, GLOBAL_LOCK, F_UNLCK, F_SETLKW, 0, 1);
>  			_tdb_transaction_cancel(tdb);
>  			return -1;
>  		}
> @@ -1056,7 +1064,6 @@ int tdb_transaction_commit(struct tdb_context *tdb)
>  			tdb_transaction_recover(tdb); 
>  
>  			_tdb_transaction_cancel(tdb);
> -			tdb_brlock(tdb, GLOBAL_LOCK, F_UNLCK, F_SETLKW, 0, 1);
>  
>  			TDB_LOG((tdb, TDB_DEBUG_FATAL, "tdb_transaction_commit: write failed\n"));
>  			return -1;
> @@ -1072,8 +1079,6 @@ int tdb_transaction_commit(struct tdb_context *tdb)
>  		return -1;
>  	}
>  
> -	tdb_brlock(tdb, GLOBAL_LOCK, F_UNLCK, F_SETLKW, 0, 1);
> -
>  	/*
>  	  TODO: maybe write to some dummy hdr field, or write to magic
>  	  offset without mmap, before the last sync, instead of the
> -- 
> 1.6.0.4
> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 206 bytes
Desc: not available
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20100131/834865eb/attachment.pgp>


More information about the samba-technical mailing list