[SCM] Samba Shared Repository - branch master updated

Volker Lendecke vlendec at samba.org
Thu Jun 26 04:17:05 MDT 2014


The branch, master has been updated
       via  55ff3a3 tdb: defragment the freelist in tdb_allocate_from_freelist()
       via  6502df5 tdb: add "freelist_size" sub-command to tdbtool
       via  56f9231 tdb: use tdb_freelist_merge_adjacent in tdb_freelist_size()
       via  843a8a5 tdb: add tdb_freelist_merge_adjacent()
       via  73c439f tdb: add utility function check_merge_ptr_with_left_record()
       via  4bec28b tdb: simplify tdb_free() using check_merge_with_left_record()
       via  117807c tdb: add utility function check_merge_with_left_record()
       via  66f3330 tdb: improve comments for tdb_free().
       via  8be5c8a tdb: factor merge_with_left_record() out of tdb_free()
       via  63673ae tdb: fix debug message in tdb_free()
       via  08a76aa tdb: reduce indentation in tdb_free() for merging left
       via  87ac4ac tdb: increase readability of read_record_on_left()
       via  f5a777a tdb: factor read_record_on_left() out of tdb_free()
      from  8fa0cde torture4: Add a little test that truncate actually works :-)

http://gitweb.samba.org/?p=samba.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit 55ff3a3b912a911cb33b47f55be2966fbbbde2d8
Author: Michael Adam <obnox at samba.org>
Date:   Wed Jun 11 12:05:57 2014 +0200

    tdb: defragment the freelist in tdb_allocate_from_freelist()
    
    While we are traversing the freelist anyways, merge a record
    with the left if it is also a free list record.
    
    That partially makes up for the fragmentation introduced by
    the lack of merging with right records in tdb_free().
    
    Note there is a potential slight downside:
    If the left record we merge the current record into was earlier
    in the chain and has hence already been met in traverse,
    then we can not use the enlarged record even if it might be
    a new best fit.
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-by: Volker Lendecke <vl at samba.org>
    
    Autobuild-User(master): Volker Lendecke <vl at samba.org>
    Autobuild-Date(master): Thu Jun 26 12:16:03 CEST 2014 on sn-devel-104

commit 6502df58391cc8d3477d1c5cce4a6aa027d439bd
Author: Michael Adam <obnox at samba.org>
Date:   Mon Feb 10 07:39:31 2014 +0100

    tdb: add "freelist_size" sub-command to tdbtool
    
    With the new code, this has the side effect of
    merging adjacent records in the freelist if the
    database is not read-only.
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-by: Volker Lendecke <vl at samba.org>

commit 56f9231c8ee3afb7ba7a1f56deee5181522ef5f4
Author: Michael Adam <obnox at samba.org>
Date:   Wed Jun 11 17:26:51 2014 +0200

    tdb: use tdb_freelist_merge_adjacent in tdb_freelist_size()
    
    So that we automatically defragment the free list when freelist_size is called
    (unless the database is read only).
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-by: Volker Lendecke <vl at samba.org>

commit 843a8a5c7b82459fae2eee490da005a68e4ced91
Author: Michael Adam <obnox at samba.org>
Date:   Mon Feb 10 02:39:09 2014 +0100

    tdb: add tdb_freelist_merge_adjacent()
    
    This is intended to be called to reduce the fragmentation in the
    freelist. This is to make up the deficiency of the freelist
    to be not doubly linked. If the freelist were doubly linked,
    we could easily avoid the creation of adjacent freelist entries.
    But with the current singly linked list, it is only possible
    to cheaply merge a new free record into a freelist entry on the left,
    not on the right...
    
    This can be called periodically, e.g. in the vacuuming process
    of a ctdb cluster.
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-by: Volker Lendecke <vl at samba.org>

commit 73c439f5818c40582892c1170ab8b21fb4f59c6d
Author: Michael Adam <obnox at samba.org>
Date:   Wed Jun 11 12:04:01 2014 +0200

    tdb: add utility function check_merge_ptr_with_left_record()
    
    Variant of check_merge_with_left_record() that reads the record
    itself if necessary.
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-by: Volker Lendecke <vl at samba.org>

commit 4bec28bfa9c383327f6c9767868997e4b239cafd
Author: Michael Adam <obnox at samba.org>
Date:   Wed Jun 11 12:03:01 2014 +0200

    tdb: simplify tdb_free() using check_merge_with_left_record()
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-by: Volker Lendecke <vl at samba.org>

commit 117807cd2dbb7cf2276b84e2b20f858cd6ec30e9
Author: Michael Adam <obnox at samba.org>
Date:   Wed Jun 11 12:00:48 2014 +0200

    tdb: add utility function check_merge_with_left_record()
    
    Check whether the record left of a given freelist record is
    also a freelist record, and if so, merge the two records.
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-by: Volker Lendecke <vl at samba.org>

commit 66f3330be82a382898bb22c3f9563cc873c75488
Author: Michael Adam <obnox at samba.org>
Date:   Sat Apr 19 01:49:44 2014 +0200

    tdb: improve comments for tdb_free().
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-by: Volker Lendecke <vl at samba.org>

commit 8be5c8a6dbce9076d7443a0cedf482b3ff754d3b
Author: Michael Adam <obnox at samba.org>
Date:   Mon Feb 10 02:01:38 2014 +0100

    tdb: factor merge_with_left_record() out of tdb_free()
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-by: Volker Lendecke <vl at samba.org>

commit 63673aea9ffb29cb28329c2982fccbbefa24242d
Author: Michael Adam <obnox at samba.org>
Date:   Mon Feb 10 01:51:39 2014 +0100

    tdb: fix debug message in tdb_free()
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-by: Volker Lendecke <vl at samba.org>

commit 08a76aabe931fdb39eec9ae51ce659302268db86
Author: Michael Adam <obnox at samba.org>
Date:   Mon Feb 10 01:46:42 2014 +0100

    tdb: reduce indentation in tdb_free() for merging left
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-by: Volker Lendecke <vl at samba.org>

commit 87ac4ac523b68776c98fb1656d89d1eb28419fd4
Author: Michael Adam <obnox at samba.org>
Date:   Mon Feb 10 01:39:15 2014 +0100

    tdb: increase readability of read_record_on_left()
    
    by using early returns and better variable names,
    and reducing indentation.
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-by: Volker Lendecke <vl at samba.org>

commit f5a777a36ca1767df70087dfd084e86d4e97d242
Author: Michael Adam <obnox at samba.org>
Date:   Mon Feb 10 01:31:50 2014 +0100

    tdb: factor read_record_on_left() out of tdb_free()
    
    Signed-off-by: Michael Adam <obnox at samba.org>
    Reviewed-by: Volker Lendecke <vl at samba.org>

-----------------------------------------------------------------------

Summary of changes:
 lib/tdb/common/freelist.c |  416 +++++++++++++++++++++++++++++++++++++++------
 lib/tdb/tools/tdbtool.c   |   15 ++
 2 files changed, 376 insertions(+), 55 deletions(-)


Changeset truncated at 500 lines:

diff --git a/lib/tdb/common/freelist.c b/lib/tdb/common/freelist.c
index 2aeeb1c..86fac2f 100644
--- a/lib/tdb/common/freelist.c
+++ b/lib/tdb/common/freelist.c
@@ -97,10 +97,217 @@ static int update_tailer(struct tdb_context *tdb, tdb_off_t offset,
 			 &totalsize);
 }
 
-/* Add an element into the freelist. Merge adjacent records if
-   necessary. */
+/**
+ * Read the record directly on the left.
+ * Fail if there is no record on the left.
+ */
+static int read_record_on_left(struct tdb_context *tdb, tdb_off_t rec_ptr,
+			       tdb_off_t *left_p,
+			       struct tdb_record *left_r)
+{
+	tdb_off_t left_ptr;
+	tdb_off_t left_size;
+	struct tdb_record left_rec;
+	int ret;
+
+	left_ptr = rec_ptr - sizeof(tdb_off_t);
+
+	if (left_ptr <= TDB_DATA_START(tdb->hash_size)) {
+		/* no record on the left */
+		return -1;
+	}
+
+	/* Read in tailer and jump back to header */
+	ret = tdb_ofs_read(tdb, left_ptr, &left_size);
+	if (ret == -1) {
+		TDB_LOG((tdb, TDB_DEBUG_FATAL,
+			"tdb_free: left offset read failed at %u\n", left_ptr));
+		return -1;
+	}
+
+	/* it could be uninitialised data */
+	if (left_size == 0 || left_size == TDB_PAD_U32) {
+		return -1;
+	}
+
+	if (left_size > rec_ptr) {
+		return -1;
+	}
+
+	left_ptr = rec_ptr - left_size;
+
+	if (left_ptr < TDB_DATA_START(tdb->hash_size)) {
+		return -1;
+	}
+
+	/* Now read in the left record */
+	ret = tdb->methods->tdb_read(tdb, left_ptr, &left_rec,
+				     sizeof(left_rec), DOCONV());
+	if (ret == -1) {
+		TDB_LOG((tdb, TDB_DEBUG_FATAL,
+			 "tdb_free: left read failed at %u (%u)\n",
+			 left_ptr, left_size));
+		return -1;
+	}
+
+	*left_p = left_ptr;
+	*left_r = left_rec;
+
+	return 0;
+}
+
+/**
+ * Merge new freelist record with the direct left neighbour.
+ * This assumes that left_rec represents the record
+ * directly to the left of right_rec and that this is
+ * a freelist record.
+ */
+static int merge_with_left_record(struct tdb_context *tdb,
+				  tdb_off_t left_ptr,
+				  struct tdb_record *left_rec,
+				  struct tdb_record *right_rec)
+{
+	int ret;
+
+	left_rec->rec_len += sizeof(*right_rec) + right_rec->rec_len;
+
+	ret = tdb_rec_write(tdb, left_ptr, left_rec);
+	if (ret == -1) {
+		TDB_LOG((tdb, TDB_DEBUG_FATAL,
+			 "merge_with_left_record: update_left failed at %u\n",
+			 left_ptr));
+		return -1;
+	}
+
+	ret = update_tailer(tdb, left_ptr, left_rec);
+	if (ret == -1) {
+		TDB_LOG((tdb, TDB_DEBUG_FATAL,
+			 "merge_with_left_record: update_tailer failed at %u\n",
+			 left_ptr));
+		return -1;
+	}
+
+	return 0;
+}
+
+/**
+ * Check whether the record left of a given freelist record is
+ * also a freelist record, and if so, merge the two records.
+ *
+ * Return code:
+ *  -1 upon error
+ *   0 if left was not a free record
+ *   1 if left was free and successfully merged.
+ *
+ * The currend record is handed in with pointer and fully read record.
+ *
+ * The left record pointer and struct can be retrieved as result
+ * in lp and lr;
+ */
+static int check_merge_with_left_record(struct tdb_context *tdb,
+					tdb_off_t rec_ptr,
+					struct tdb_record *rec,
+					tdb_off_t *lp,
+					struct tdb_record *lr)
+{
+	tdb_off_t left_ptr;
+	struct tdb_record left_rec;
+	int ret;
+
+	ret = read_record_on_left(tdb, rec_ptr, &left_ptr, &left_rec);
+	if (ret != 0) {
+		return 0;
+	}
+
+	if (left_rec.magic != TDB_FREE_MAGIC) {
+		return 0;
+	}
+
+	/* It's free - expand to include it. */
+	ret = merge_with_left_record(tdb, left_ptr, &left_rec, rec);
+	if (ret != 0) {
+		return -1;
+	}
+
+	if (lp != NULL) {
+		*lp = left_ptr;
+	}
+
+	if (lr != NULL) {
+		*lr = left_rec;
+	}
+
+	return 1;
+}
+
+/**
+ * Check whether the record left of a given freelist record is
+ * also a freelist record, and if so, merge the two records.
+ *
+ * Return code:
+ *  -1 upon error
+ *   0 if left was not a free record
+ *   1 if left was free and successfully merged.
+ *
+ * In this variant, the input record is specified just as the pointer
+ * and is read from the database if needed.
+ *
+ * next_ptr will contain the original record's next pointer after
+ * successful merging (which will be lost after merging), so that
+ * the caller can update the last pointer.
+ */
+static int check_merge_ptr_with_left_record(struct tdb_context *tdb,
+					    tdb_off_t rec_ptr,
+					    tdb_off_t *next_ptr)
+{
+	tdb_off_t left_ptr;
+	struct tdb_record rec, left_rec;
+	int ret;
+
+	ret = read_record_on_left(tdb, rec_ptr, &left_ptr, &left_rec);
+	if (ret != 0) {
+		return 0;
+	}
+
+	if (left_rec.magic != TDB_FREE_MAGIC) {
+		return 0;
+	}
+
+	/* It's free - expand to include it. */
+
+	ret = tdb->methods->tdb_read(tdb, rec_ptr, &rec,
+				     sizeof(rec), DOCONV());
+	if (ret != 0) {
+		return -1;
+	}
+
+	ret = merge_with_left_record(tdb, left_ptr, &left_rec, &rec);
+	if (ret != 0) {
+		return -1;
+	}
+
+	if (next_ptr != NULL) {
+		*next_ptr = rec.next;
+	}
+
+	return 1;
+}
+
+/**
+ * Add an element into the freelist.
+ *
+ * We merge the new record into the left record if it is also a
+ * free record, but not with the right one. This makes the
+ * operation O(1) instead of O(n): merging with the right record
+ * requires a traverse of the freelist to find the previous
+ * record in the free list.
+ *
+ * This prevents db traverses from being O(n^2) after a lot of deletes.
+ */
 int tdb_free(struct tdb_context *tdb, tdb_off_t offset, struct tdb_record *rec)
 {
+	int ret;
+
 	/* Allocation and tailer lock */
 	if (tdb_lock(tdb, -1, F_WRLCK) != 0)
 		return -1;
@@ -138,58 +345,17 @@ int tdb_free(struct tdb_context *tdb, tdb_off_t offset, struct tdb_record *rec)
 left:
 #endif
 
-	/* Look left */
-	if (offset - sizeof(tdb_off_t) > TDB_DATA_START(tdb->hash_size)) {
-		tdb_off_t left = offset - sizeof(tdb_off_t);
-		struct tdb_record l;
-		tdb_off_t leftsize;
-
-		/* Read in tailer and jump back to header */
-		if (tdb_ofs_read(tdb, left, &leftsize) == -1) {
-			TDB_LOG((tdb, TDB_DEBUG_FATAL, "tdb_free: left offset read failed at %u\n", left));
-			goto update;
-		}
-
-		/* it could be uninitialised data */
-		if (leftsize == 0 || leftsize == TDB_PAD_U32) {
-			goto update;
-		}
-
-		left = offset - leftsize;
-
-		if (leftsize > offset ||
-		    left < TDB_DATA_START(tdb->hash_size)) {
-			goto update;
-		}
-
-		/* Now read in the left record */
-		if (tdb->methods->tdb_read(tdb, left, &l, sizeof(l), DOCONV()) == -1) {
-			TDB_LOG((tdb, TDB_DEBUG_FATAL, "tdb_free: left read failed at %u (%u)\n", left, leftsize));
-			goto update;
-		}
-
-		/* If it's free, expand to include it. */
-		if (l.magic == TDB_FREE_MAGIC) {
-			/* we now merge the new record into the left record, rather than the other
-			   way around. This makes the operation O(1) instead of O(n). This change
-			   prevents traverse from being O(n^2) after a lot of deletes */
-			l.rec_len += sizeof(*rec) + rec->rec_len;
-			if (tdb_rec_write(tdb, left, &l) == -1) {
-				TDB_LOG((tdb, TDB_DEBUG_FATAL, "tdb_free: update_left failed at %u\n", left));
-				goto fail;
-			}
-			if (update_tailer(tdb, left, &l) == -1) {
-				TDB_LOG((tdb, TDB_DEBUG_FATAL, "tdb_free: update_tailer failed at %u\n", offset));
-				goto fail;
-			}
-			tdb_unlock(tdb, -1, F_WRLCK);
-			return 0;
-		}
+	ret = check_merge_with_left_record(tdb, offset, rec, NULL, NULL);
+	if (ret == -1) {
+		goto fail;
+	}
+	if (ret == 1) {
+		/* merged */
+		goto done;
 	}
 
-update:
+	/* Nothing to merge, prepend to free list */
 
-	/* Now, prepend to free list */
 	rec->magic = TDB_FREE_MAGIC;
 
 	if (tdb_ofs_read(tdb, FREELIST_TOP, &rec->next) == -1 ||
@@ -199,6 +365,7 @@ update:
 		goto fail;
 	}
 
+done:
 	/* And we're done. */
 	tdb_unlock(tdb, -1, F_WRLCK);
 	return 0;
@@ -282,6 +449,7 @@ static tdb_off_t tdb_allocate_from_freelist(
 		tdb_len_t rec_len;
 	} bestfit;
 	float multiplier = 1.0;
+	bool merge_created_candidate;
 
 	/* over-allocate to reduce fragmentation */
 	length *= 1.25;
@@ -291,6 +459,7 @@ static tdb_off_t tdb_allocate_from_freelist(
 	length = TDB_ALIGN(length, TDB_ALIGNMENT);
 
  again:
+	merge_created_candidate = false;
 	last_ptr = FREELIST_TOP;
 
 	/* read in the freelist top */
@@ -307,10 +476,59 @@ static tdb_off_t tdb_allocate_from_freelist(
 	   issues when faced with a slowly increasing record size.
 	 */
 	while (rec_ptr) {
+		int ret;
+		tdb_off_t left_ptr;
+		struct tdb_record left_rec;
+
 		if (tdb_rec_free_read(tdb, rec_ptr, rec) == -1) {
 			return 0;
 		}
 
+		ret = check_merge_with_left_record(tdb, rec_ptr, rec,
+						   &left_ptr, &left_rec);
+		if (ret == -1) {
+			return 0;
+		}
+		if (ret == 1) {
+			/* merged */
+			rec_ptr = rec->next;
+			ret = tdb_ofs_write(tdb, last_ptr, &rec->next);
+			if (ret == -1) {
+				return 0;
+			}
+
+			/*
+			 * We have merged the current record into the left
+			 * neighbour. So our traverse of the freelist will
+			 * skip it and consider the next record in the chain.
+			 *
+			 * But the enlarged left neighbour may be a candidate.
+			 * If it is, we can not directly use it, though.
+			 * The only thing we can do and have to do here is to
+			 * update the current best fit size in the chain if the
+			 * current best fit is the left record. (By that we may
+			 * worsen the best fit we already had, bit this is not a
+			 * problem.)
+			 *
+			 * If the current best fit is not the left record,
+			 * all we can do is remember the fact that a merge
+			 * created a new candidate so that we can trigger
+			 * a second walk of the freelist if at the end of
+			 * the first walk we have not found any fit.
+			 * This way we can avoid expanding the database.
+			 */
+
+			if (bestfit.rec_ptr == left_ptr) {
+				bestfit.rec_len = left_rec.rec_len;
+			}
+
+			if (left_rec.rec_len > length) {
+				merge_created_candidate = true;
+			}
+
+			continue;
+		}
+
 		if (rec->rec_len >= length) {
 			if (bestfit.rec_ptr == 0 ||
 			    rec->rec_len < bestfit.rec_len) {
@@ -350,6 +568,10 @@ static tdb_off_t tdb_allocate_from_freelist(
 		return newrec_ptr;
 	}
 
+	if (merge_created_candidate) {
+		goto again;
+	}
+
 	/* we didn't find enough space. See if we can expand the
 	   database and if we can then try again */
 	if (tdb_expand(tdb, length + sizeof(*rec)) == 0)
@@ -444,10 +666,69 @@ blocking_freelist_allocate:
 	return ret;
 }
 
-/*
-   return the size of the freelist - used to decide if we should repack
-*/
-_PUBLIC_ int tdb_freelist_size(struct tdb_context *tdb)
+/**
+ * Merge adjacent records in the freelist.
+ */
+static int tdb_freelist_merge_adjacent(struct tdb_context *tdb,
+				       int *count_records, int *count_merged)
+{
+	tdb_off_t cur, next;
+	int count = 0;
+	int merged = 0;
+	int ret;
+
+	ret = tdb_lock(tdb, -1, F_RDLCK);
+	if (ret == -1) {
+		return -1;
+	}
+
+	cur = FREELIST_TOP;
+	while (tdb_ofs_read(tdb, cur, &next) == 0 && next != 0) {
+		tdb_off_t next2;
+
+		count++;
+
+		ret = check_merge_ptr_with_left_record(tdb, next, &next2);
+		if (ret == -1) {
+			goto done;
+		}
+		if (ret == 1) {
+			/*
+			 * merged:
+			 * now let cur->next point to next2 instead of next
+			 */
+
+			ret = tdb_ofs_write(tdb, cur, &next2);
+			if (ret != 0) {
+				goto done;
+			}
+
+			next = next2;
+			merged++;
+		}
+
+		cur = next;
+	}
+
+	if (count_records != NULL) {
+		*count_records = count;
+	}
+
+	if (count_merged != NULL) {
+		*count_merged = merged;
+	}
+
+	ret = 0;
+
+done:
+	tdb_unlock(tdb, -1, F_RDLCK);
+	return ret;
+}
+
+/**
+ * return the size of the freelist - no merging done
+ */
+static int tdb_freelist_size_no_merge(struct tdb_context *tdb)
 {
 	tdb_off_t ptr;
 	int count=0;
@@ -464,3 +745,28 @@ _PUBLIC_ int tdb_freelist_size(struct tdb_context *tdb)
 	tdb_unlock(tdb, -1, F_RDLCK);
 	return count;
 }
+
+/**
+ * return the size of the freelist - used to decide if we should repack
+ *
+ * As a side effect, adjacent records are merged unless the
+ * database is read-only, in order to reduce the fragmentation
+ * without repacking.
+ */
+_PUBLIC_ int tdb_freelist_size(struct tdb_context *tdb)
+{
+
+	int count = 0;
+
+	if (tdb->read_only) {
+		count = tdb_freelist_size_no_merge(tdb);
+	} else {
+		int ret;
+		ret = tdb_freelist_merge_adjacent(tdb, &count, NULL);
+		if (ret != 0) {
+			return -1;
+		}
+	}
+
+	return count;
+}
diff --git a/lib/tdb/tools/tdbtool.c b/lib/tdb/tools/tdbtool.c
index 2f93e33..780782b 100644
--- a/lib/tdb/tools/tdbtool.c
+++ b/lib/tdb/tools/tdbtool.c
@@ -55,6 +55,7 @@ enum commands {
 	CMD_DELETE,
 	CMD_LIST_HASH_FREE,
 	CMD_LIST_FREE,
+	CMD_FREELIST_SIZE,
 	CMD_INFO,
 	CMD_MMAP,


-- 
Samba Shared Repository


More information about the samba-cvs mailing list