Bugfix for tdb transactions

Rusty Russell rusty at rustcorp.com.au
Mon Feb 1 02:54:44 MST 2010


On Mon, 1 Feb 2010 06:49:27 pm Rusty Russell wrote:
> On Mon, 1 Feb 2010 06:01:12 pm Volker Lendecke wrote:
> > Before that is fixed, should we commit my patch to fix the
> > problem that happens without the kill -9?
> 
> I prefer that.  I like it from a simplicity point of view, even though the
> larger fix will revert it.

Actually, I changed my mind.  Here is the simplest fix:

tdb: grab entire db lock during recovery

Volker tracked down an open vs commit race which was causing corruption,
as the commit code was dropping the GLOBAL_LOCK before finishing.

And despite the comment above tdb_transaction_recover(), the open path calls
it with only GLOBAL_LOCK rather than the entire db locked.

The simple fix is to lock all the records before invoking recovery: for
this we introduce a new internal function tdb_check_recovery() which
optimistically checks the recovery area before obtaining the lock and
rechecking, and if necessary, performs recovery.

(This method avoids adding a slowdown to tdb_open, as well as allowing us
to call it in future on all write operations).

Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>

diff --git a/lib/tdb/common/open.c b/lib/tdb/common/open.c
index 4d4f95a..cffdf55 100644
--- a/lib/tdb/common/open.c
+++ b/lib/tdb/common/open.c
@@ -333,7 +333,7 @@ struct tdb_context *tdb_open_ex(const char *name, int hash_size, int tdb_flags,
 	}
 
 	/* if needed, run recovery */
-	if (tdb_transaction_recover(tdb) == -1) {
+	if (tdb_check_recovery(tdb) == -1) {
 		goto fail;
 	}
 
diff --git a/lib/tdb/common/tdb_private.h b/lib/tdb/common/tdb_private.h
index be9be72..c33a066 100644
--- a/lib/tdb/common/tdb_private.h
+++ b/lib/tdb/common/tdb_private.h
@@ -230,6 +230,7 @@ int tdb_ofs_write(struct tdb_context *tdb, tdb_off_t offset, tdb_off_t *d);
 int tdb_lock_record(struct tdb_context *tdb, tdb_off_t off);
 int tdb_unlock_record(struct tdb_context *tdb, tdb_off_t off);
 int _tdb_transaction_cancel(struct tdb_context *tdb);
+int tdb_check_recovery(struct tdb_context *tdb);
 int tdb_rec_read(struct tdb_context *tdb, tdb_off_t offset, struct tdb_record *rec);
 int tdb_rec_write(struct tdb_context *tdb, tdb_off_t offset, struct tdb_record *rec);
 int tdb_do_delete(struct tdb_context *tdb, tdb_off_t rec_ptr, struct tdb_record *rec);
diff --git a/lib/tdb/common/transaction.c b/lib/tdb/common/transaction.c
index 20f2bfc..11e5321 100644
--- a/lib/tdb/common/transaction.c
+++ b/lib/tdb/common/transaction.c
@@ -1229,3 +1229,45 @@ int tdb_transaction_recover(struct tdb_context *tdb)
 	/* all done */
 	return 0;
 }
+
+/* 0 = no, -1 = error. */
+int tdb_check_recovery(struct tdb_context *tdb)
+{
+	tdb_off_t recovery_head;
+	struct tdb_record rec;
+	int ret;
+
+	/* find the recovery area */
+	if (tdb_ofs_read(tdb, TDB_RECOVERY_HEAD, &recovery_head) == -1) {
+		TDB_LOG((tdb, TDB_DEBUG_FATAL, "tdb_need_recovery: failed to read recovery head\n"));
+		tdb->ecode = TDB_ERR_IO;
+		return -1;
+	}
+
+	if (recovery_head == 0) {
+		/* we have never allocated a recovery record */
+		return 0;
+	}
+
+	/* read the recovery record */
+	if (tdb->methods->tdb_read(tdb, recovery_head, &rec,
+				   sizeof(rec), DOCONV()) == -1) {
+		TDB_LOG((tdb, TDB_DEBUG_FATAL, "tdb_need_recovery: failed to read recovery record\n"));
+		tdb->ecode = TDB_ERR_IO;
+		return -1;
+	}
+
+	if (rec.magic != TDB_RECOVERY_MAGIC) {
+		return 0;
+	}
+
+	/* Grab lock and check again. */
+	if (tdb->methods->tdb_brlock(tdb, FREELIST_TOP, F_WRLCK,
+				     F_SETLKW, 0, 0) == -1) {
+		return -1;
+	}
+
+	ret = tdb_transaction_recover(tdb);
+	tdb->methods->tdb_brlock(tdb, FREELIST_TOP, F_UNLCKW, F_SETLK, 0, 0);
+	return ret;
+}


More information about the samba-technical mailing list