[SCM] Samba Shared Repository - branch master updated
Jeremy Allison
jra at samba.org
Tue Oct 30 01:49:03 UTC 2018
The branch, master has been updated
via 500a729 tdb: Make record deletion circular-chain safe
via 0521265 tdb: Do early RDONLY error check for tdb_delete
via 4ed2a67 tdb: Purge dead records whenever we block the freelist
via 96f6768 tdb: Don't delete dead records in traverse
via c18e2ed tdb: Align an integer type
via 14abead tdb: Fix a typo
via 5d565f6 vfs_fruit: remove check for number of xattrs from ad_convert_xattr
from b37f8f8 python: do not use "is" for string equality
https://git.samba.org/?p=samba.git;a=shortlog;h=master
- Log -----------------------------------------------------------------
commit 500a729a5566a1c8afc36e6f3e73910b13c8b9ad
Author: Volker Lendecke <vl at samba.org>
Date: Wed Oct 24 14:31:34 2018 +0200
tdb: Make record deletion circular-chain safe
Before this patch we had 3 loops walking a hash chain to delete
records:
tdb_do_delete() to find the predecessor of the record that was to be
deleted. tdb_count_dead(), the name says it all and tdb_purge_dead()
to give back all dead records from a chain to the freelist.
This patch introduces tdb_trim_dead that walks a hash chain just
once. While it does so it counts the number of dead records, and all
records beyond tdb->max_dead_records are moved to the freelist.
Normal record deletion now works by always marking a record as dead in
step 1 and then calling tdb_trim_dead. This is made safe against
circular chains by doing the slow chain walk only in the case when we
did not delete a dead record during our walk.
It changes our dynamics a bit:
When deleting a record with non-zero max_dead_records, now we always
leave that number of records around when deleting, doing a blocking
lock on the freelist when we found too many dead records.
Previously when exceeding max_dead_records we wiped all dead records
to start accumulating them from scratch, assuming we could lock the
freelist in a nonblocking fashion.
The net effect for an uncontended freelist is the same: In
tdb_allocate() we still completely hand over all dead records to the
freelist when we could lock it, it just happens later than without
this patch.
This means for a lightly loaded system we will potentially leave more
dead records around in databases like locking.tdb. However, on a
heavily loaded system we become more predictable: If the freelist is
so heavily contended that across many deletes we can't get hold of it,
previously we accumulated more dead records than max_dead_records
would allow. This is a really lowlevel tradeoff that is likely hard to
measure, but to me becoming more deterministic without sacrificing too
much parallelism (we keep more dead records around) is worth trying.
Signed-off-by: Volker Lendecke <vl at samba.org>
Reviewed-by: Jeremy Allison <jra at samba.org>
Autobuild-User(master): Jeremy Allison <jra at samba.org>
Autobuild-Date(master): Tue Oct 30 02:48:38 CET 2018 on sn-devel-144
commit 05212658baef695448a22f43250c1268fe3310e3
Author: Volker Lendecke <vl at samba.org>
Date: Thu Oct 25 15:59:48 2018 +0200
tdb: Do early RDONLY error check for tdb_delete
Signed-off-by: Volker Lendecke <vl at samba.org>
Reviewed-by: Jeremy Allison <jra at samba.org>
commit 4ed2a67a59d826292e7d4d6991942b4cc15d72dc
Author: Volker Lendecke <vl at samba.org>
Date: Thu Oct 25 15:55:29 2018 +0200
tdb: Purge dead records whenever we block the freelist
Signed-off-by: Volker Lendecke <vl at samba.org>
Reviewed-by: Jeremy Allison <jra at samba.org>
commit 96f6768e05493bc47431fb954335349f56af822e
Author: Volker Lendecke <vl at samba.org>
Date: Tue Oct 23 13:40:34 2018 +0200
tdb: Don't delete dead records in traverse
The next commit will change the handling of dead records, removing the
"tdb_do_delete" function. As traverses should not happen in normal
operations, dead records from them should be rare, and relying on
traverses to remove them is a very bad idea IMHO.
Signed-off-by: Volker Lendecke <vl at samba.org>
Reviewed-by: Jeremy Allison <jra at samba.org>
commit c18e2ed00f199a6c48d3e9c9dd7213b723e80f5b
Author: Volker Lendecke <vl at samba.org>
Date: Wed Oct 24 07:26:49 2018 +0200
tdb: Align an integer type
Signed-off-by: Volker Lendecke <vl at samba.org>
Reviewed-by: Jeremy Allison <jra at samba.org>
commit 14abead3ca13923c6ec7951afb18b97fabeee9f1
Author: Volker Lendecke <vl at samba.org>
Date: Thu Oct 25 20:53:52 2018 +0200
tdb: Fix a typo
Signed-off-by: Volker Lendecke <vl at samba.org>
Reviewed-by: Jeremy Allison <jra at samba.org>
commit 5d565f636fcf49fc1bbbfbc24ef730d2f7cc2cf0
Author: Ralph Boehme <slow at samba.org>
Date: Fri Oct 19 12:15:42 2018 +0200
vfs_fruit: remove check for number of xattrs from ad_convert_xattr
Turns out that there exist AppleDouble files with an extended FinderInfo
entry that includes the xattr marshall buffer, but the count of xattrs
in the buffer is just zero.
We do want to discard this extended FinderInfo entry and convert it to a
simple fixed size FinderInfo entry, so remove the check.
Bug: https://bugzilla.samba.org/show_bug.cgi?id=13649
Signed-off-by: Ralph Boehme <slow at samba.org>
Reviewed-by: Jeremy Allison <jra at samba.org>
-----------------------------------------------------------------------
Summary of changes:
lib/tdb/common/dump.c | 2 +-
lib/tdb/common/freelist.c | 17 ++++
lib/tdb/common/lock.c | 4 +-
lib/tdb/common/tdb.c | 222 ++++++++++++++++++++++++-------------------
lib/tdb/common/tdb_private.h | 3 +-
lib/tdb/common/traverse.c | 6 --
source3/modules/vfs_fruit.c | 4 -
7 files changed, 143 insertions(+), 115 deletions(-)
Changeset truncated at 500 lines:
diff --git a/lib/tdb/common/dump.c b/lib/tdb/common/dump.c
index d4e3478..adcf591 100644
--- a/lib/tdb/common/dump.c
+++ b/lib/tdb/common/dump.c
@@ -95,7 +95,7 @@ static int tdb_dump_chain(struct tdb_context *tdb, int i)
_PUBLIC_ void tdb_dump_all(struct tdb_context *tdb)
{
- int i;
+ uint32_t i;
for (i=0;i<tdb->hash_size;i++) {
tdb_dump_chain(tdb, i);
}
diff --git a/lib/tdb/common/freelist.c b/lib/tdb/common/freelist.c
index 5afd89b..9064386 100644
--- a/lib/tdb/common/freelist.c
+++ b/lib/tdb/common/freelist.c
@@ -555,6 +555,17 @@ static bool tdb_alloc_dead(
return (tdb_ofs_write(tdb, last_ptr, &rec->next) == 0);
}
+static void tdb_purge_dead(struct tdb_context *tdb, uint32_t hash)
+{
+ uint32_t max_dead_records = tdb->max_dead_records;
+
+ tdb->max_dead_records = 0;
+
+ tdb_trim_dead(tdb, hash);
+
+ tdb->max_dead_records = max_dead_records;
+}
+
/*
* Chain "hash" is assumed to be locked
*/
@@ -619,6 +630,12 @@ blocking_freelist_allocate:
if (tdb_lock(tdb, -1, F_WRLCK) == -1) {
return 0;
}
+ /*
+ * Dead records can happen even if max_dead_records==0, they
+ * are older than the max_dead_records concept: They happen if
+ * tdb_delete happens concurrently with a traverse.
+ */
+ tdb_purge_dead(tdb, hash);
ret = tdb_allocate_from_freelist(tdb, length, rec);
tdb_unlock(tdb, -1, F_WRLCK);
return ret;
diff --git a/lib/tdb/common/lock.c b/lib/tdb/common/lock.c
index 9f30c7a..f55184d 100644
--- a/lib/tdb/common/lock.c
+++ b/lib/tdb/common/lock.c
@@ -149,8 +149,8 @@ static int fcntl_unlock(struct tdb_context *tdb, int rw, off_t off, off_t len)
* This is the memory layout of the hashchain array:
*
* FREELIST_TOP + 0 = freelist
- * FREELIST_TOP + 4 = hashtbale list 0
- * FREELIST_TOP + 8 = hashtbale list 1
+ * FREELIST_TOP + 4 = hashtable list 0
+ * FREELIST_TOP + 8 = hashtable list 1
* ...
*
* Otoh lock_offset computes:
diff --git a/lib/tdb/common/tdb.c b/lib/tdb/common/tdb.c
index 4e433c8..9c80a36 100644
--- a/lib/tdb/common/tdb.c
+++ b/lib/tdb/common/tdb.c
@@ -357,103 +357,139 @@ _PUBLIC_ int tdb_exists(struct tdb_context *tdb, TDB_DATA key)
return ret;
}
-/* actually delete an entry in the database given the offset */
-int tdb_do_delete(struct tdb_context *tdb, tdb_off_t rec_ptr, struct tdb_record *rec)
+/*
+ * Move a dead record to the freelist. The hash chain and freelist
+ * must be locked.
+ */
+static int tdb_del_dead(struct tdb_context *tdb,
+ uint32_t last_ptr,
+ uint32_t rec_ptr,
+ struct tdb_record *rec,
+ bool *deleted)
{
- tdb_off_t last_ptr, i;
- struct tdb_record lastrec;
-
- if (tdb->read_only || tdb->traverse_read) return -1;
+ int ret;
- if (((tdb->traverse_write != 0) && (!TDB_DEAD(rec))) ||
- tdb_write_lock_record(tdb, rec_ptr) == -1) {
- /* Someone traversing here: mark it as dead */
- rec->magic = TDB_DEAD_MAGIC;
- return tdb_rec_write(tdb, rec_ptr, rec);
+ ret = tdb_write_lock_record(tdb, rec_ptr);
+ if (ret == -1) {
+ /* Someone traversing here: Just leave it dead */
+ return 0;
}
- if (tdb_write_unlock_record(tdb, rec_ptr) != 0)
- return -1;
-
- /* find previous record in hash chain */
- if (tdb_ofs_read(tdb, TDB_HASH_TOP(rec->full_hash), &i) == -1)
- return -1;
- for (last_ptr = 0; i != rec_ptr; last_ptr = i, i = lastrec.next)
- if (tdb_rec_read(tdb, i, &lastrec) == -1)
- return -1;
-
- /* unlink it: next ptr is at start of record. */
- if (last_ptr == 0)
- last_ptr = TDB_HASH_TOP(rec->full_hash);
- if (tdb_ofs_write(tdb, last_ptr, &rec->next) == -1)
+ ret = tdb_write_unlock_record(tdb, rec_ptr);
+ if (ret == -1) {
return -1;
-
- /* recover the space */
- if (tdb_free(tdb, rec_ptr, rec) == -1)
+ }
+ ret = tdb_ofs_write(tdb, last_ptr, &rec->next);
+ if (ret == -1) {
return -1;
- return 0;
-}
-
-static int tdb_count_dead(struct tdb_context *tdb, uint32_t hash)
-{
- int res = 0;
- tdb_off_t rec_ptr;
- struct tdb_record rec;
-
- /* read in the hash top */
- if (tdb_ofs_read(tdb, TDB_HASH_TOP(hash), &rec_ptr) == -1)
- return 0;
+ }
- while (rec_ptr) {
- if (tdb_rec_read(tdb, rec_ptr, &rec) == -1)
- return 0;
+ *deleted = true;
- if (rec.magic == TDB_DEAD_MAGIC) {
- res += 1;
- }
- rec_ptr = rec.next;
- }
- return res;
+ ret = tdb_free(tdb, rec_ptr, rec);
+ return ret;
}
/*
- * Purge all DEAD records from a hash chain
+ * Walk the hash chain and leave tdb->max_dead_records around. Move
+ * the rest of dead records to the freelist.
*/
-int tdb_purge_dead(struct tdb_context *tdb, uint32_t hash)
+int tdb_trim_dead(struct tdb_context *tdb, uint32_t hash)
{
- int res = -1;
+ struct tdb_chainwalk_ctx chainwalk;
struct tdb_record rec;
- tdb_off_t rec_ptr;
+ tdb_off_t last_ptr, rec_ptr;
+ bool locked_freelist = false;
+ int num_dead = 0;
+ int ret;
- if (tdb_lock_nonblock(tdb, -1, F_WRLCK) == -1) {
- /*
- * Don't block the freelist if not strictly necessary
- */
+ last_ptr = TDB_HASH_TOP(hash);
+
+ /*
+ * Init chainwalk with the pointer to the hash top. It might
+ * be that the very first record in the chain is a dead one
+ * that we have to delete.
+ */
+ tdb_chainwalk_init(&chainwalk, last_ptr);
+
+ ret = tdb_ofs_read(tdb, last_ptr, &rec_ptr);
+ if (ret == -1) {
return -1;
}
- /* read in the hash top */
- if (tdb_ofs_read(tdb, TDB_HASH_TOP(hash), &rec_ptr) == -1)
- goto fail;
-
- while (rec_ptr) {
- tdb_off_t next;
+ while (rec_ptr != 0) {
+ bool deleted = false;
+ uint32_t next;
- if (tdb_rec_read(tdb, rec_ptr, &rec) == -1) {
+ ret = tdb_rec_read(tdb, rec_ptr, &rec);
+ if (ret == -1) {
goto fail;
}
+ /*
+ * Make a copy of rec.next: Further down we might
+ * delete and put the record on the freelist. Make
+ * sure that modifications in that code path can't
+ * break the chainwalk here.
+ */
next = rec.next;
- if (rec.magic == TDB_DEAD_MAGIC
- && tdb_do_delete(tdb, rec_ptr, &rec) == -1) {
- goto fail;
+ if (rec.magic == TDB_DEAD_MAGIC) {
+ num_dead += 1;
+
+ if (num_dead > tdb->max_dead_records) {
+
+ if (!locked_freelist) {
+ /*
+ * Lock the freelist only if
+ * it's really required.
+ */
+ ret = tdb_lock(tdb, -1, F_WRLCK);
+ if (ret == -1) {
+ goto fail;
+ };
+ locked_freelist = true;
+ }
+
+ ret = tdb_del_dead(
+ tdb,
+ last_ptr,
+ rec_ptr,
+ &rec,
+ &deleted);
+
+ if (ret == -1) {
+ goto fail;
+ }
+ }
+ }
+
+ /*
+ * Don't do the chainwalk check if "rec_ptr" was
+ * deleted. We reduced the chain, and the chainwalk
+ * check might catch up early. Imagine a valid chain
+ * with just dead records: We never can bump the
+ * "slow" pointer in chainwalk_check, as there isn't
+ * anything left to jump to and compare.
+ */
+ if (!deleted) {
+ bool ok;
+
+ last_ptr = rec_ptr;
+
+ ok = tdb_chainwalk_check(tdb, &chainwalk, next);
+ if (!ok) {
+ ret = -1;
+ goto fail;
+ }
}
rec_ptr = next;
}
- res = 0;
- fail:
- tdb_unlock(tdb, -1, F_WRLCK);
- return res;
+ ret = 0;
+fail:
+ if (locked_freelist) {
+ tdb_unlock(tdb, -1, F_WRLCK);
+ }
+ return ret;
}
/* delete an entry in the database given a key */
@@ -463,43 +499,29 @@ static int tdb_delete_hash(struct tdb_context *tdb, TDB_DATA key, uint32_t hash)
struct tdb_record rec;
int ret;
+ if (tdb->read_only || tdb->traverse_read) {
+ tdb->ecode = TDB_ERR_RDONLY;
+ return -1;
+ }
+
rec_ptr = tdb_find_lock_hash(tdb, key, hash, F_WRLCK, &rec);
if (rec_ptr == 0) {
return -1;
}
- if (tdb->max_dead_records != 0) {
-
- uint32_t magic = TDB_DEAD_MAGIC;
-
- /*
- * Allow for some dead records per hash chain, mainly for
- * tdb's with a very high create/delete rate like locking.tdb.
- */
-
- if (tdb_count_dead(tdb, hash) >= tdb->max_dead_records) {
- /*
- * Don't let the per-chain freelist grow too large,
- * delete all existing dead records
- */
- tdb_purge_dead(tdb, hash);
- }
-
- /*
- * Just mark the record as dead.
- */
- ret = tdb_ofs_write(
- tdb, rec_ptr + offsetof(struct tdb_record, magic),
- &magic);
- }
- else {
- ret = tdb_do_delete(tdb, rec_ptr, &rec);
+ /*
+ * Mark the record dead
+ */
+ rec.magic = TDB_DEAD_MAGIC;
+ ret = tdb_rec_write(tdb, rec_ptr, &rec);
+ if (ret == -1) {
+ goto done;
}
- if (ret == 0) {
- tdb_increment_seqnum(tdb);
- }
+ tdb_increment_seqnum(tdb);
+ ret = tdb_trim_dead(tdb, hash);
+done:
if (tdb_unlock(tdb, BUCKET(hash), F_WRLCK) != 0)
TDB_LOG((tdb, TDB_DEBUG_WARNING, "tdb_delete: WARNING tdb_unlock failed!\n"));
return ret;
diff --git a/lib/tdb/common/tdb_private.h b/lib/tdb/common/tdb_private.h
index 307cad9..42aaac6 100644
--- a/lib/tdb/common/tdb_private.h
+++ b/lib/tdb/common/tdb_private.h
@@ -311,7 +311,6 @@ int tdb_unlock_record(struct tdb_context *tdb, tdb_off_t off);
bool tdb_needs_recovery(struct tdb_context *tdb);
int tdb_rec_read(struct tdb_context *tdb, tdb_off_t offset, struct tdb_record *rec);
int tdb_rec_write(struct tdb_context *tdb, tdb_off_t offset, struct tdb_record *rec);
-int tdb_do_delete(struct tdb_context *tdb, tdb_off_t rec_ptr, struct tdb_record *rec);
unsigned char *tdb_alloc_read(struct tdb_context *tdb, tdb_off_t offset, tdb_len_t len);
int tdb_parse_data(struct tdb_context *tdb, TDB_DATA key,
tdb_off_t offset, tdb_len_t len,
@@ -323,7 +322,7 @@ tdb_off_t tdb_find_lock_hash(struct tdb_context *tdb, TDB_DATA key, uint32_t has
tdb_off_t tdb_find_dead(struct tdb_context *tdb, uint32_t hash,
struct tdb_record *r, tdb_len_t length,
tdb_off_t *p_last_ptr);
-int tdb_purge_dead(struct tdb_context *tdb, uint32_t hash);
+int tdb_trim_dead(struct tdb_context *tdb, uint32_t hash);
void tdb_io_init(struct tdb_context *tdb);
int tdb_expand(struct tdb_context *tdb, tdb_off_t size);
tdb_off_t tdb_expand_adjust(tdb_off_t map_size, tdb_off_t size, int page_size);
diff --git a/lib/tdb/common/traverse.c b/lib/tdb/common/traverse.c
index 7a1d567..a9af1d4 100644
--- a/lib/tdb/common/traverse.c
+++ b/lib/tdb/common/traverse.c
@@ -96,7 +96,6 @@ static tdb_off_t tdb_next_lock(struct tdb_context *tdb, struct tdb_traverse_lock
/* Iterate through chain */
while( tlock->off) {
- tdb_off_t current;
if (tdb_rec_read(tdb, tlock->off, rec) == -1)
goto fail;
@@ -114,12 +113,7 @@ static tdb_off_t tdb_next_lock(struct tdb_context *tdb, struct tdb_traverse_lock
return tlock->off;
}
- /* Try to clean dead ones from old traverses */
- current = tlock->off;
tlock->off = rec->next;
- if (!(tdb->read_only || tdb->traverse_read) &&
- tdb_do_delete(tdb, current, rec) != 0)
- goto fail;
}
tdb_unlock(tdb, tlock->list, tlock->lock_rw);
want_next = 0;
diff --git a/source3/modules/vfs_fruit.c b/source3/modules/vfs_fruit.c
index 30fab85..a3dabcd 100644
--- a/source3/modules/vfs_fruit.c
+++ b/source3/modules/vfs_fruit.c
@@ -1014,10 +1014,6 @@ static bool ad_convert_xattr(struct adouble *ad,
return true;
}
- if (ad->adx_header.adx_num_attrs == 0) {
- return true;
- }
-
if (string_replace_cmaps == NULL) {
const char **mappings = NULL;
--
Samba Shared Repository
More information about the samba-cvs
mailing list