[SCM] Samba Shared Repository - branch master updated

Jeremy Allison jra at samba.org
Tue Oct 30 01:49:03 UTC 2018


The branch, master has been updated
       via  500a729 tdb: Make record deletion circular-chain safe
       via  0521265 tdb: Do early RDONLY error check for tdb_delete
       via  4ed2a67 tdb: Purge dead records whenever we block the freelist
       via  96f6768 tdb: Don't delete dead records in traverse
       via  c18e2ed tdb: Align an integer type
       via  14abead tdb: Fix a typo
       via  5d565f6 vfs_fruit: remove check for number of xattrs from ad_convert_xattr
      from  b37f8f8 python: do not use "is" for string equality

https://git.samba.org/?p=samba.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit 500a729a5566a1c8afc36e6f3e73910b13c8b9ad
Author: Volker Lendecke <vl at samba.org>
Date:   Wed Oct 24 14:31:34 2018 +0200

    tdb: Make record deletion circular-chain safe
    
    Before this patch we had 3 loops walking a hash chain to delete
    records:
    
    tdb_do_delete() to find the predecessor of the record that was to be
    deleted. tdb_count_dead(), the name says it all and tdb_purge_dead()
    to give back all dead records from a chain to the freelist.
    
    This patch introduces tdb_trim_dead that walks a hash chain just
    once. While it does so it counts the number of dead records, and all
    records beyond tdb->max_dead_records are moved to the freelist.
    
    Normal record deletion now works by always marking a record as dead in
    step 1 and then calling tdb_trim_dead. This is made safe against
    circular chains by doing the slow chain walk only in the case when we
    did not delete a dead record during our walk.
    
    It changes our dynamics a bit:
    
    When deleting a record with non-zero max_dead_records, now we always
    leave that number of records around when deleting, doing a blocking
    lock on the freelist when we found too many dead records.
    
    Previously when exceeding max_dead_records we wiped all dead records
    to start accumulating them from scratch, assuming we could lock the
    freelist in a nonblocking fashion.
    
    The net effect for an uncontended freelist is the same: In
    tdb_allocate() we still completely hand over all dead records to the
    freelist when we could lock it, it just happens later than without
    this patch.
    
    This means for a lightly loaded system we will potentially leave more
    dead records around in databases like locking.tdb. However, on a
    heavily loaded system we become more predictable: If the freelist is
    so heavily contended that across many deletes we can't get hold of it,
    previously we accumulated more dead records than max_dead_records
    would allow. This is a really lowlevel tradeoff that is likely hard to
    measure, but to me becoming more deterministic without sacrificing too
    much parallelism (we keep more dead records around) is worth trying.
    
    Signed-off-by: Volker Lendecke <vl at samba.org>
    Reviewed-by: Jeremy Allison <jra at samba.org>
    
    Autobuild-User(master): Jeremy Allison <jra at samba.org>
    Autobuild-Date(master): Tue Oct 30 02:48:38 CET 2018 on sn-devel-144

commit 05212658baef695448a22f43250c1268fe3310e3
Author: Volker Lendecke <vl at samba.org>
Date:   Thu Oct 25 15:59:48 2018 +0200

    tdb: Do early RDONLY error check for tdb_delete
    
    Signed-off-by: Volker Lendecke <vl at samba.org>
    Reviewed-by: Jeremy Allison <jra at samba.org>

commit 4ed2a67a59d826292e7d4d6991942b4cc15d72dc
Author: Volker Lendecke <vl at samba.org>
Date:   Thu Oct 25 15:55:29 2018 +0200

    tdb: Purge dead records whenever we block the freelist
    
    Signed-off-by: Volker Lendecke <vl at samba.org>
    Reviewed-by: Jeremy Allison <jra at samba.org>

commit 96f6768e05493bc47431fb954335349f56af822e
Author: Volker Lendecke <vl at samba.org>
Date:   Tue Oct 23 13:40:34 2018 +0200

    tdb: Don't delete dead records in traverse
    
    The next commit will change the handling of dead records, removing the
    "tdb_do_delete" function. As traverses should not happen in normal
    operations, dead records from them should be rare, and relying on
    traverses to remove them is a very bad idea IMHO.
    
    Signed-off-by: Volker Lendecke <vl at samba.org>
    Reviewed-by: Jeremy Allison <jra at samba.org>

commit c18e2ed00f199a6c48d3e9c9dd7213b723e80f5b
Author: Volker Lendecke <vl at samba.org>
Date:   Wed Oct 24 07:26:49 2018 +0200

    tdb: Align an integer type
    
    Signed-off-by: Volker Lendecke <vl at samba.org>
    Reviewed-by: Jeremy Allison <jra at samba.org>

commit 14abead3ca13923c6ec7951afb18b97fabeee9f1
Author: Volker Lendecke <vl at samba.org>
Date:   Thu Oct 25 20:53:52 2018 +0200

    tdb: Fix a typo
    
    Signed-off-by: Volker Lendecke <vl at samba.org>
    Reviewed-by: Jeremy Allison <jra at samba.org>

commit 5d565f636fcf49fc1bbbfbc24ef730d2f7cc2cf0
Author: Ralph Boehme <slow at samba.org>
Date:   Fri Oct 19 12:15:42 2018 +0200

    vfs_fruit: remove check for number of xattrs from ad_convert_xattr
    
    Turns out that there exist AppleDouble files with an extended FinderInfo
    entry that includes the xattr marshall buffer, but the count of xattrs
    in the buffer is just zero.
    
    We do want to discard this extended FinderInfo entry and convert it to a
    simple fixed size FinderInfo entry, so remove the check.
    
    Bug: https://bugzilla.samba.org/show_bug.cgi?id=13649
    
    Signed-off-by: Ralph Boehme <slow at samba.org>
    Reviewed-by: Jeremy Allison <jra at samba.org>

-----------------------------------------------------------------------

Summary of changes:
 lib/tdb/common/dump.c        |   2 +-
 lib/tdb/common/freelist.c    |  17 ++++
 lib/tdb/common/lock.c        |   4 +-
 lib/tdb/common/tdb.c         | 222 ++++++++++++++++++++++++-------------------
 lib/tdb/common/tdb_private.h |   3 +-
 lib/tdb/common/traverse.c    |   6 --
 source3/modules/vfs_fruit.c  |   4 -
 7 files changed, 143 insertions(+), 115 deletions(-)


Changeset truncated at 500 lines:

diff --git a/lib/tdb/common/dump.c b/lib/tdb/common/dump.c
index d4e3478..adcf591 100644
--- a/lib/tdb/common/dump.c
+++ b/lib/tdb/common/dump.c
@@ -95,7 +95,7 @@ static int tdb_dump_chain(struct tdb_context *tdb, int i)
 
 _PUBLIC_ void tdb_dump_all(struct tdb_context *tdb)
 {
-	int i;
+	uint32_t i;
 	for (i=0;i<tdb->hash_size;i++) {
 		tdb_dump_chain(tdb, i);
 	}
diff --git a/lib/tdb/common/freelist.c b/lib/tdb/common/freelist.c
index 5afd89b..9064386 100644
--- a/lib/tdb/common/freelist.c
+++ b/lib/tdb/common/freelist.c
@@ -555,6 +555,17 @@ static bool tdb_alloc_dead(
 	return (tdb_ofs_write(tdb, last_ptr, &rec->next) == 0);
 }
 
+static void tdb_purge_dead(struct tdb_context *tdb, uint32_t hash)
+{
+	uint32_t max_dead_records = tdb->max_dead_records;
+
+	tdb->max_dead_records = 0;
+
+	tdb_trim_dead(tdb, hash);
+
+	tdb->max_dead_records = max_dead_records;
+}
+
 /*
  * Chain "hash" is assumed to be locked
  */
@@ -619,6 +630,12 @@ blocking_freelist_allocate:
 	if (tdb_lock(tdb, -1, F_WRLCK) == -1) {
 		return 0;
 	}
+	/*
+	 * Dead records can happen even if max_dead_records==0, they
+	 * are older than the max_dead_records concept: They happen if
+	 * tdb_delete happens concurrently with a traverse.
+	 */
+	tdb_purge_dead(tdb, hash);
 	ret = tdb_allocate_from_freelist(tdb, length, rec);
 	tdb_unlock(tdb, -1, F_WRLCK);
 	return ret;
diff --git a/lib/tdb/common/lock.c b/lib/tdb/common/lock.c
index 9f30c7a..f55184d 100644
--- a/lib/tdb/common/lock.c
+++ b/lib/tdb/common/lock.c
@@ -149,8 +149,8 @@ static int fcntl_unlock(struct tdb_context *tdb, int rw, off_t off, off_t len)
  * This is the memory layout of the hashchain array:
  *
  * FREELIST_TOP + 0 = freelist
- * FREELIST_TOP + 4 = hashtbale list 0
- * FREELIST_TOP + 8 = hashtbale list 1
+ * FREELIST_TOP + 4 = hashtable list 0
+ * FREELIST_TOP + 8 = hashtable list 1
  * ...
  *
  * Otoh lock_offset computes:
diff --git a/lib/tdb/common/tdb.c b/lib/tdb/common/tdb.c
index 4e433c8..9c80a36 100644
--- a/lib/tdb/common/tdb.c
+++ b/lib/tdb/common/tdb.c
@@ -357,103 +357,139 @@ _PUBLIC_ int tdb_exists(struct tdb_context *tdb, TDB_DATA key)
 	return ret;
 }
 
-/* actually delete an entry in the database given the offset */
-int tdb_do_delete(struct tdb_context *tdb, tdb_off_t rec_ptr, struct tdb_record *rec)
+/*
+ * Move a dead record to the freelist. The hash chain and freelist
+ * must be locked.
+ */
+static int tdb_del_dead(struct tdb_context *tdb,
+			uint32_t last_ptr,
+			uint32_t rec_ptr,
+			struct tdb_record *rec,
+			bool *deleted)
 {
-	tdb_off_t last_ptr, i;
-	struct tdb_record lastrec;
-
-	if (tdb->read_only || tdb->traverse_read) return -1;
+	int ret;
 
-	if (((tdb->traverse_write != 0) && (!TDB_DEAD(rec))) ||
-	    tdb_write_lock_record(tdb, rec_ptr) == -1) {
-		/* Someone traversing here: mark it as dead */
-		rec->magic = TDB_DEAD_MAGIC;
-		return tdb_rec_write(tdb, rec_ptr, rec);
+	ret = tdb_write_lock_record(tdb, rec_ptr);
+	if (ret == -1) {
+		/* Someone traversing here: Just leave it dead */
+		return 0;
 	}
-	if (tdb_write_unlock_record(tdb, rec_ptr) != 0)
-		return -1;
-
-	/* find previous record in hash chain */
-	if (tdb_ofs_read(tdb, TDB_HASH_TOP(rec->full_hash), &i) == -1)
-		return -1;
-	for (last_ptr = 0; i != rec_ptr; last_ptr = i, i = lastrec.next)
-		if (tdb_rec_read(tdb, i, &lastrec) == -1)
-			return -1;
-
-	/* unlink it: next ptr is at start of record. */
-	if (last_ptr == 0)
-		last_ptr = TDB_HASH_TOP(rec->full_hash);
-	if (tdb_ofs_write(tdb, last_ptr, &rec->next) == -1)
+	ret = tdb_write_unlock_record(tdb, rec_ptr);
+	if (ret == -1) {
 		return -1;
-
-	/* recover the space */
-	if (tdb_free(tdb, rec_ptr, rec) == -1)
+	}
+	ret = tdb_ofs_write(tdb, last_ptr, &rec->next);
+	if (ret == -1) {
 		return -1;
-	return 0;
-}
-
-static int tdb_count_dead(struct tdb_context *tdb, uint32_t hash)
-{
-	int res = 0;
-	tdb_off_t rec_ptr;
-	struct tdb_record rec;
-
-	/* read in the hash top */
-	if (tdb_ofs_read(tdb, TDB_HASH_TOP(hash), &rec_ptr) == -1)
-		return 0;
+	}
 
-	while (rec_ptr) {
-		if (tdb_rec_read(tdb, rec_ptr, &rec) == -1)
-			return 0;
+	*deleted = true;
 
-		if (rec.magic == TDB_DEAD_MAGIC) {
-			res += 1;
-		}
-		rec_ptr = rec.next;
-	}
-	return res;
+	ret = tdb_free(tdb, rec_ptr, rec);
+	return ret;
 }
 
 /*
- * Purge all DEAD records from a hash chain
+ * Walk the hash chain and leave tdb->max_dead_records around. Move
+ * the rest of dead records to the freelist.
  */
-int tdb_purge_dead(struct tdb_context *tdb, uint32_t hash)
+int tdb_trim_dead(struct tdb_context *tdb, uint32_t hash)
 {
-	int res = -1;
+	struct tdb_chainwalk_ctx chainwalk;
 	struct tdb_record rec;
-	tdb_off_t rec_ptr;
+	tdb_off_t last_ptr, rec_ptr;
+	bool locked_freelist = false;
+	int num_dead = 0;
+	int ret;
 
-	if (tdb_lock_nonblock(tdb, -1, F_WRLCK) == -1) {
-		/*
-		 * Don't block the freelist if not strictly necessary
-		 */
+	last_ptr = TDB_HASH_TOP(hash);
+
+	/*
+	 * Init chainwalk with the pointer to the hash top. It might
+	 * be that the very first record in the chain is a dead one
+	 * that we have to delete.
+	 */
+	tdb_chainwalk_init(&chainwalk, last_ptr);
+
+	ret = tdb_ofs_read(tdb, last_ptr, &rec_ptr);
+	if (ret == -1) {
 		return -1;
 	}
 
-	/* read in the hash top */
-	if (tdb_ofs_read(tdb, TDB_HASH_TOP(hash), &rec_ptr) == -1)
-		goto fail;
-
-	while (rec_ptr) {
-		tdb_off_t next;
+	while (rec_ptr != 0) {
+		bool deleted = false;
+		uint32_t next;
 
-		if (tdb_rec_read(tdb, rec_ptr, &rec) == -1) {
+		ret = tdb_rec_read(tdb, rec_ptr, &rec);
+		if (ret == -1) {
 			goto fail;
 		}
 
+		/*
+		 * Make a copy of rec.next: Further down we might
+		 * delete and put the record on the freelist. Make
+		 * sure that modifications in that code path can't
+		 * break the chainwalk here.
+		 */
 		next = rec.next;
 
-		if (rec.magic == TDB_DEAD_MAGIC
-		    && tdb_do_delete(tdb, rec_ptr, &rec) == -1) {
-			goto fail;
+		if (rec.magic == TDB_DEAD_MAGIC) {
+			num_dead += 1;
+
+			if (num_dead > tdb->max_dead_records) {
+
+				if (!locked_freelist) {
+					/*
+					 * Lock the freelist only if
+					 * it's really required.
+					 */
+					ret = tdb_lock(tdb, -1, F_WRLCK);
+					if (ret == -1) {
+						goto fail;
+					};
+					locked_freelist = true;
+				}
+
+				ret = tdb_del_dead(
+					tdb,
+					last_ptr,
+					rec_ptr,
+					&rec,
+					&deleted);
+
+				if (ret == -1) {
+					goto fail;
+				}
+			}
+		}
+
+		/*
+		 * Don't do the chainwalk check if "rec_ptr" was
+		 * deleted. We reduced the chain, and the chainwalk
+		 * check might catch up early. Imagine a valid chain
+		 * with just dead records: We never can bump the
+		 * "slow" pointer in chainwalk_check, as there isn't
+		 * anything left to jump to and compare.
+		 */
+		if (!deleted) {
+			bool ok;
+
+			last_ptr = rec_ptr;
+
+			ok = tdb_chainwalk_check(tdb, &chainwalk, next);
+			if (!ok) {
+				ret = -1;
+				goto fail;
+			}
 		}
 		rec_ptr = next;
 	}
-	res = 0;
- fail:
-	tdb_unlock(tdb, -1, F_WRLCK);
-	return res;
+	ret = 0;
+fail:
+	if (locked_freelist) {
+		tdb_unlock(tdb, -1, F_WRLCK);
+	}
+	return ret;
 }
 
 /* delete an entry in the database given a key */
@@ -463,43 +499,29 @@ static int tdb_delete_hash(struct tdb_context *tdb, TDB_DATA key, uint32_t hash)
 	struct tdb_record rec;
 	int ret;
 
+	if (tdb->read_only || tdb->traverse_read) {
+		tdb->ecode = TDB_ERR_RDONLY;
+		return -1;
+	}
+
 	rec_ptr = tdb_find_lock_hash(tdb, key, hash, F_WRLCK, &rec);
 	if (rec_ptr == 0) {
 		return -1;
 	}
 
-	if (tdb->max_dead_records != 0) {
-
-		uint32_t magic = TDB_DEAD_MAGIC;
-
-		/*
-		 * Allow for some dead records per hash chain, mainly for
-		 * tdb's with a very high create/delete rate like locking.tdb.
-		 */
-
-		if (tdb_count_dead(tdb, hash) >= tdb->max_dead_records) {
-			/*
-			 * Don't let the per-chain freelist grow too large,
-			 * delete all existing dead records
-			 */
-			tdb_purge_dead(tdb, hash);
-		}
-
-		/*
-		 * Just mark the record as dead.
-		 */
-		ret = tdb_ofs_write(
-			tdb, rec_ptr + offsetof(struct tdb_record, magic),
-			&magic);
-	}
-	else {
-		ret = tdb_do_delete(tdb, rec_ptr, &rec);
+	/*
+	 * Mark the record dead
+	 */
+	rec.magic = TDB_DEAD_MAGIC;
+	ret = tdb_rec_write(tdb, rec_ptr, &rec);
+	if (ret == -1) {
+		goto done;
 	}
 
-	if (ret == 0) {
-		tdb_increment_seqnum(tdb);
-	}
+	tdb_increment_seqnum(tdb);
 
+	ret = tdb_trim_dead(tdb, hash);
+done:
 	if (tdb_unlock(tdb, BUCKET(hash), F_WRLCK) != 0)
 		TDB_LOG((tdb, TDB_DEBUG_WARNING, "tdb_delete: WARNING tdb_unlock failed!\n"));
 	return ret;
diff --git a/lib/tdb/common/tdb_private.h b/lib/tdb/common/tdb_private.h
index 307cad9..42aaac6 100644
--- a/lib/tdb/common/tdb_private.h
+++ b/lib/tdb/common/tdb_private.h
@@ -311,7 +311,6 @@ int tdb_unlock_record(struct tdb_context *tdb, tdb_off_t off);
 bool tdb_needs_recovery(struct tdb_context *tdb);
 int tdb_rec_read(struct tdb_context *tdb, tdb_off_t offset, struct tdb_record *rec);
 int tdb_rec_write(struct tdb_context *tdb, tdb_off_t offset, struct tdb_record *rec);
-int tdb_do_delete(struct tdb_context *tdb, tdb_off_t rec_ptr, struct tdb_record *rec);
 unsigned char *tdb_alloc_read(struct tdb_context *tdb, tdb_off_t offset, tdb_len_t len);
 int tdb_parse_data(struct tdb_context *tdb, TDB_DATA key,
 		   tdb_off_t offset, tdb_len_t len,
@@ -323,7 +322,7 @@ tdb_off_t tdb_find_lock_hash(struct tdb_context *tdb, TDB_DATA key, uint32_t has
 tdb_off_t tdb_find_dead(struct tdb_context *tdb, uint32_t hash,
 			struct tdb_record *r, tdb_len_t length,
 			tdb_off_t *p_last_ptr);
-int tdb_purge_dead(struct tdb_context *tdb, uint32_t hash);
+int tdb_trim_dead(struct tdb_context *tdb, uint32_t hash);
 void tdb_io_init(struct tdb_context *tdb);
 int tdb_expand(struct tdb_context *tdb, tdb_off_t size);
 tdb_off_t tdb_expand_adjust(tdb_off_t map_size, tdb_off_t size, int page_size);
diff --git a/lib/tdb/common/traverse.c b/lib/tdb/common/traverse.c
index 7a1d567..a9af1d4 100644
--- a/lib/tdb/common/traverse.c
+++ b/lib/tdb/common/traverse.c
@@ -96,7 +96,6 @@ static tdb_off_t tdb_next_lock(struct tdb_context *tdb, struct tdb_traverse_lock
 
 		/* Iterate through chain */
 		while( tlock->off) {
-			tdb_off_t current;
 			if (tdb_rec_read(tdb, tlock->off, rec) == -1)
 				goto fail;
 
@@ -114,12 +113,7 @@ static tdb_off_t tdb_next_lock(struct tdb_context *tdb, struct tdb_traverse_lock
 				return tlock->off;
 			}
 
-			/* Try to clean dead ones from old traverses */
-			current = tlock->off;
 			tlock->off = rec->next;
-			if (!(tdb->read_only || tdb->traverse_read) &&
-			    tdb_do_delete(tdb, current, rec) != 0)
-				goto fail;
 		}
 		tdb_unlock(tdb, tlock->list, tlock->lock_rw);
 		want_next = 0;
diff --git a/source3/modules/vfs_fruit.c b/source3/modules/vfs_fruit.c
index 30fab85..a3dabcd 100644
--- a/source3/modules/vfs_fruit.c
+++ b/source3/modules/vfs_fruit.c
@@ -1014,10 +1014,6 @@ static bool ad_convert_xattr(struct adouble *ad,
 		return true;
 	}
 
-	if (ad->adx_header.adx_num_attrs == 0) {
-		return true;
-	}
-
 	if (string_replace_cmaps == NULL) {
 		const char **mappings = NULL;
 


-- 
Samba Shared Repository



More information about the samba-cvs mailing list