[SCM] Samba Shared Repository - branch master updated
Rusty Russell
rusty at samba.org
Tue Feb 23 22:57:52 MST 2010
The branch, master has been updated
via ec96ea6... tdb: handle processes dying during transaction commit.
via 1bf482b... patch tdb-refactor-tdb_lock-and-tdb_lock_nonblock.patch
via ececeff... tdb: add -k option to tdbtorture
via 8c3fda4... tdb: don't truncate tdb on recovery
via 9f295ee... tdb: remove lock ops
via a84222b... tdb: rename tdb_release_extra_locks() to tdb_release_transaction_locks()
via dd1b508... tdb: cleanup: remove ltype argument from _tdb_transaction_cancel.
via fca1621... tdb: tdb_allrecord_lock/tdb_allrecord_unlock/tdb_allrecord_upgrade
via caaf5c6... tdb: suppress record write locks when allrecord lock is taken.
via 9341f23... tdb: cleanup: always grab allrecord lock to infinity.
via 1ab8776... tdb: remove num_locks
via d48c3e4... tdb: use tdb_nest_lock() for seqnum lock.
via 4738d47... tdb: use tdb_nest_lock() for active lock.
via 9136818... tdb: use tdb_nest_lock() for open lock.
via e8fa70a... tdb: use tdb_nest_lock() for transaction lock.
via ce41411... tdb: cleanup: find_nestlock() helper.
via db27073... tdb: cleanup: tdb_release_extra_locks() helper
via fba42f1... tdb: cleanup: tdb_have_extra_locks() helper
via b754f61... tdb: don't suppress the transaction lock because of the allrecord lock.
via 5d9de60... tdb: cleanup: tdb_nest_lock/tdb_nest_unlock
via e9114a7... tdb: cleanup: rename global_lock to allrecord_lock.
via 7ab422d... tdb: cleanup: rename GLOBAL_LOCK to OPEN_LOCK.
via a6e0ef8... tdb: make _tdb_transaction_cancel static.
via 452b4a5... tdb: cleanup: split brlock and brunlock methods.
from fffdce6... s4/schema: Move msDS-IntId implementation to samldb.c module
http://gitweb.samba.org/?p=samba.git;a=shortlog;h=master
- Log -----------------------------------------------------------------
commit ec96ea690edbe3398d690b4a953d487ca1773f1c
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Wed Feb 24 13:23:58 2010 +1030
tdb: handle processes dying during transaction commit.
tdb transactions were designed to be robust against the machine
powering off, but interestingly were never designed to handle the case
where an administrator kill -9's a process during commit. Because
recovery is only done on tdb_open, processes with the tdb already
mapped will simply use it despite it being corrupt and needing
recovery.
The solution to this is to check for recovery every time we grab a
data lock: we could have gained the lock because a process just died.
This has no measurable cost: here is the time for tdbtorture -s 0 -n 1
-l 10000:
Before:
2.75 2.50 2.81 3.19 2.91 2.53 2.72 2.50 2.78 2.77 = Avg 2.75
After:
2.81 2.57 3.42 2.49 3.02 2.49 2.84 2.48 2.80 2.43 = Avg 2.74
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
commit 1bf482b9ef9ec73dd7ee4387d7087aa3955503dd
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Wed Feb 24 13:18:06 2010 +1030
patch tdb-refactor-tdb_lock-and-tdb_lock_nonblock.patch
commit ececeffd85db1b27c07cdf91a921fd203006daf6
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Wed Feb 24 10:53:05 2010 +1030
tdb: add -k option to tdbtorture
To test the case of death of a process during transaction commit, add
a -k (kill random) option to tdbtorture. The easiest way to do this
is to make every worker a child (unless there's only one child), which
is why this patch is bigger than you might expect.
Using -k without -t (always transactions) you expect corruption, though
it doesn't happen every time. With -t, we currently get corruption but
the next patch fixes that.
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
commit 8c3fda4318adc71899bc41486d5616da3a91a688
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Wed Feb 24 10:50:41 2010 +1030
tdb: don't truncate tdb on recovery
The current recovery code truncates the tdb file on recovery. This is
fine if recovery is only done on first open, but is a really bad idea
as we move to allowing recovery on "live" databases.
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
commit 9f295eecffd92e55584fc36539cd85cd32c832de
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Wed Feb 24 10:49:22 2010 +1030
tdb: remove lock ops
Now that the transaction code uses the standard allrecord lock, which
stops us from trying to grab any per-record locks anyway, we don't need
special no-op lock ops for transactions.
This is a nice simplification: if you see brlock, you know it's really
going to grab a lock.
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
commit a84222bbaf9ed2c7b9c61b8157b2e3c85f17fa32
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Wed Feb 24 11:02:55 2010 +1030
tdb: rename tdb_release_extra_locks() to tdb_release_transaction_locks()
tdb_release_extra_locks() is too general: it carefully skips over the
transaction lock, even though the only caller then drops it. Change
this, and rename it to show it's clearly transaction-specific.
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
commit dd1b508c63034452673dbfee9956f52a1b6c90a5
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Wed Feb 24 12:42:24 2010 +1030
tdb: cleanup: remove ltype argument from _tdb_transaction_cancel.
Now that the transaction allrecord lock is the standard one, and is thus
cleaned up in tdb_release_extra_locks(), _tdb_transaction_cancel() doesn't
need to know what type it is.
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
commit fca1621965c547e2d076eca2a2599e9629f91266
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Wed Feb 17 15:42:15 2010 +1030
tdb: tdb_allrecord_lock/tdb_allrecord_unlock/tdb_allrecord_upgrade
Centralize locking of all chains of the tdb; rename _tdb_lockall to
tdb_allrecord_lock and _tdb_unlockall to tdb_allrecord_unlock, and
tdb_brlock_upgrade to tdb_allrecord_upgrade.
Then we use this in the transaction code. Unfortunately, if the transaction
code records that it has grabbed the allrecord lock read-only, write locks
will fail, so we treat this upgradable lock as a write lock, and mark it
as upgradable using the otherwise-unused offset field.
One subtlety: now the transaction code is using the allrecord_lock, the
tdb_release_extra_locks() function drops it for us, so we no longer need
to do it manually in _tdb_transaction_cancel.
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
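
The "upgradable" marking described above can be sketched as follows.
This is a minimal model under stated assumptions, not the tdb
implementation: struct sketch_allrecord_lock and the sketch_* functions
are hypothetical, the off == 1 convention is taken from the checks
visible in tdb_allrecord_upgrade() in the diff below, and the real fcntl
calls are omitted.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>

/* Hypothetical mirror of the allrecord lock bookkeeping. */
struct sketch_allrecord_lock {
	unsigned int off;    /* 1 = upgradable, 0 = plain */
	unsigned int count;  /* nesting count */
	int ltype;           /* recorded lock type */
};

/* Record the lock. An upgradable lock is recorded as a write lock
 * (so later write locks are not refused) and flagged through off. */
static void sketch_take(struct sketch_allrecord_lock *l, int ltype,
			int upgradable)
{
	assert(l != NULL);
	l->count = 1;
	l->ltype = upgradable ? F_WRLCK : ltype;
	l->off = upgradable ? 1 : 0;
}

/* Upgrade: only valid once, on a non-nested upgradable lock. */
static int sketch_upgrade(struct sketch_allrecord_lock *l)
{
	if (l->count != 1)
		return -1;   /* nested: refuse */
	if (l->off != 1)
		return -1;   /* not upgradable, or already upgraded */
	/* The real code would convert the fcntl lock here. */
	l->ltype = F_WRLCK;
	l->off = 0;
	return 0;
}
```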
commit caaf5c6baa1a4f340c1f38edd99b3a8b56621b8b
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Wed Feb 24 10:45:26 2010 +1030
tdb: suppress record write locks when allrecord lock is taken.
Records themselves get (read) locked by the traversal code against delete.
Interestingly, this locking isn't done when the allrecord lock has been
taken, though the allrecord lock until recently didn't cover the actual
records (it now goes to end of file).
The write record lock, grabbed by the delete code, is not suppressed
by the allrecord lock. This is now bad: it causes us to punch a hole
in the allrecord lock when we release the write record lock. Make this
consistent: *no* record locks of any kind when the allrecord lock is
taken.
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
commit 9341f230f8968b4b18e451d15dda5ccbe7787768
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Wed Feb 24 10:45:14 2010 +1030
tdb: cleanup: always grab allrecord lock to infinity.
We were previously inconsistent with our "global" lock: the
transaction code grabbed it from FREELIST_TOP to end of file, and the
rest of the code grabbed it from FREELIST_TOP to end of the hash
chains. Change it to always grab to end of file for simplicity and
so we can merge the two.
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
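
For readers unfamiliar with the convention: with POSIX fcntl locks, a
length of zero means "from the start offset to end of file, however far
the file grows". A small sketch of building such a request follows. The
helper name sketch_range_to_eof() is hypothetical, and the start offset
is a parameter because the value of FREELIST_TOP is not shown in this
message.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Build a lock request covering [start, end of file). */
static struct flock sketch_range_to_eof(short type, off_t start)
{
	struct flock fl;

	assert(start >= 0);
	memset(&fl, 0, sizeof(fl));
	fl.l_type = type;         /* F_RDLCK or F_WRLCK */
	fl.l_whence = SEEK_SET;   /* start is an absolute offset */
	fl.l_start = start;
	fl.l_len = 0;             /* 0 = lock to end of file */
	return fl;
}
```

The result would be passed to fcntl(fd, F_SETLKW, &fl) for a blocking
lock or fcntl(fd, F_SETLK, &fl) for a non-blocking one.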
commit 1ab8776247f89b143b6e58f4b038ab4bcea20d3a
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Wed Feb 17 15:01:07 2010 +1030
tdb: remove num_locks
This was redundant before this patch series: it mirrored num_lockrecs
exactly. It still does.
Also, skip useless branch when locks == 1: unconditional assignment is
cheaper anyway.
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
commit d48c3e4982a38fb6b568ed3903e55e07a0fe5ca6
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Wed Feb 17 12:40:57 2010 +1030
tdb: use tdb_nest_lock() for seqnum lock.
This is pure overhead, but it centralizes the locking. Realloc (esp. as
most implementations are lazy) is fast compared to the fcntl anyway.
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
commit 4738d474c412cc59d26fcea64007e99094e8b675
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Wed Feb 24 10:44:40 2010 +1030
tdb: use tdb_nest_lock() for active lock.
Use our newly-generic nested lock tracking for the active lock.
Note that the tdb_have_extra_locks() and tdb_release_extra_locks()
functions have to skip over this lock now it is tracked.
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
commit 9136818df30c7179e1cffa18201cdfc990ebd7b7
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Mon Feb 22 13:58:07 2010 +1030
tdb: use tdb_nest_lock() for open lock.
This never nests, so it's overkill, but it centralizes the locking into
lock.c and removes the ugly flag in the transaction code to track whether
we have the lock or not.
Note that we have a temporary hack so this places a real lock, despite
the fact that we are in a transaction.
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
commit e8fa70a321d489b454b07bd65e9b0d95084168de
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Wed Feb 17 12:37:34 2010 +1030
tdb: use tdb_nest_lock() for transaction lock.
Rather than a boutique lock and a separate nest count, use our
newly-generic nested lock tracking for the transaction lock.
Note that the tdb_have_extra_locks() and tdb_release_extra_locks()
functions have to skip over this lock now it is tracked.
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
commit ce41411c84760684ce539b6a302a0623a6a78a72
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Wed Feb 17 12:35:54 2010 +1030
tdb: cleanup: find_nestlock() helper.
Factor out two loops which find locks; we are going to introduce a couple
more so a helper makes sense.
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
commit db270734d8b4208e00ce9de5af1af7ee11823f6d
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Wed Feb 24 10:41:15 2010 +1030
tdb: cleanup: tdb_release_extra_locks() helper
Move locking intelligence back into lock.c, rather than open-coding the
lock release in transaction.c.
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
commit fba42f1fb4f81b8913cce5a23ca5350ba45f40e1
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Wed Feb 17 12:34:26 2010 +1030
tdb: cleanup: tdb_have_extra_locks() helper
In many places we check whether locks are held: add a helper to do this.
The _tdb_lockall() case has already checked for the allrecord lock, so
the extra work done by tdb_have_extra_locks() is merely redundant.
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
commit b754f61d235bdc3e410b60014d6be4072645e16f
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Wed Feb 17 12:31:49 2010 +1030
tdb: don't suppress the transaction lock because of the allrecord lock.
tdb_transaction_lock() and tdb_transaction_unlock() do nothing if we
hold the allrecord lock. However, the two locks don't overlap, so
this is wrong.
This simplification makes the transaction lock a straightforward nested
lock.
There are two callers for these functions:
1) The transaction code, which already makes sure the allrecord_lock
isn't held.
2) The traverse code, which wants to stop transactions whether it has the
allrecord lock or not. There have been deadlocks here before, however
this should not bring them back (I hope!)
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
commit 5d9de604d92d227899e9b861c6beafb2e4fa61e0
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Wed Feb 17 12:26:13 2010 +1030
tdb: cleanup: tdb_nest_lock/tdb_nest_unlock
Because fcntl locks don't nest, we track them in the tdb->lockrecs array
and only place/release them when the count goes to 1/0. We only do this
for record locks, so we simply place the list number (or -1 for the free
list) in the structure.
To generalize this:
1) Put the offset rather than list number in struct tdb_lock_type.
2) Rename _tdb_lock() to tdb_nest_lock(), make it non-static and move the
allrecord check out to the callers (except the mark case which doesn't
care).
3) Rename _tdb_unlock() to tdb_nest_unlock(), make it non-static and
move the allrecord check out to the callers (except mark again).
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
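
The nesting scheme described above (place the real lock when the count
goes to 1, release it when the count goes to 0) can be sketched without
any file I/O. This is an illustrative model only: the names below are
hypothetical, a fixed-size array replaces the realloc'd tdb->lockrecs,
and counters stand in for the fcntl calls.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_LOCKRECS 8   /* fixed size for the sketch only */

struct sketch_lockrec {
	unsigned int off;    /* byte offset the lock covers */
	unsigned int count;  /* nesting depth in this process */
	int ltype;           /* type requested by the first holder */
};

struct sketch_ctx {
	struct sketch_lockrec recs[SKETCH_MAX_LOCKRECS];
	size_t num;
	unsigned int real_locks;    /* stand-in for fcntl lock calls */
	unsigned int real_unlocks;  /* stand-in for fcntl unlock calls */
};

static struct sketch_lockrec *sketch_find(struct sketch_ctx *c,
					  unsigned int off)
{
	size_t i;

	for (i = 0; i < c->num; i++) {
		if (c->recs[i].off == off)
			return &c->recs[i];
	}
	return NULL;
}

static int sketch_nest_lock(struct sketch_ctx *c, unsigned int off,
			    int ltype)
{
	struct sketch_lockrec *r = sketch_find(c, off);

	if (r) {
		r->count++;  /* already held: fcntl locks don't stack */
		return 0;
	}
	if (c->num == SKETCH_MAX_LOCKRECS)
		return -1;
	c->real_locks++;     /* first holder places the real lock */
	c->recs[c->num].off = off;
	c->recs[c->num].count = 1;
	c->recs[c->num].ltype = ltype;
	c->num++;
	return 0;
}

static int sketch_nest_unlock(struct sketch_ctx *c, unsigned int off)
{
	struct sketch_lockrec *r = sketch_find(c, off);

	if (r == NULL)
		return -1;
	assert(r->count > 0);
	if (--r->count > 0)
		return 0;    /* still nested: keep the real lock */
	c->real_unlocks++;   /* last holder releases the real lock */
	*r = c->recs[--c->num];  /* overwrite with the last entry */
	return 0;
}
```

Keying the records by offset rather than list number is what lets the
same bookkeeping serve locks that are not hash chains.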
commit e9114a758538d460d4f9deae5ce631bf44b1eff8
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Wed Feb 17 12:19:47 2010 +1030
tdb: cleanup: rename global_lock to allrecord_lock.
The word global is overloaded in tdb. The global_lock inside struct
tdb_context is used to indicate we hold a lock across all the chains.
Rename it to allrecord_lock.
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
commit 7ab422d6fbd4f8be02838089a41f872d538ee7a7
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Wed Feb 17 12:18:33 2010 +1030
tdb: cleanup: rename GLOBAL_LOCK to OPEN_LOCK.
The word global is overloaded in tdb. The GLOBAL_LOCK offset is used at
open time to serialize initialization (and by the transaction code to block
open).
Rename it to OPEN_LOCK.
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
commit a6e0ef87d25734760fe77b87a9fd11db56760955
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Wed Feb 24 10:39:59 2010 +1030
tdb: make _tdb_transaction_cancel static.
Now that tdb_open() calls tdb_transaction_cancel() instead of
_tdb_transaction_cancel(), we can make the latter static.
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
commit 452b4a5a6efeecfb5c83475f1375ddc25bcddfbe
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Wed Feb 17 12:17:19 2010 +1030
tdb: cleanup: split brlock and brunlock methods.
This is taken from the CCAN code base: rather than using tdb_brlock for
locking and unlocking, we split it into brlock and brunlock functions.
For extra debugging information, brunlock says what kind of lock it is
unlocking (even though fcntl locks don't need this). This requires an
extra argument to tdb_transaction_unlock() so we know whether the
lock was upgraded to a write lock or not.
We also add a "flags" argument to tdb_brlock:
1) TDB_LOCK_NOWAIT replaces lck_type = F_SETLK (vs F_SETLKW).
2) TDB_LOCK_MARK_ONLY replaces setting TDB_MARK_LOCK bit in ltype.
3) TDB_LOCK_PROBE replaces the "probe" argument.
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
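
To make the three replacements listed above concrete, here is a hedged
sketch of how such a flags word might be interpreted. The SKETCH_* names
and their numeric values are assumptions for illustration (the real
definitions live in tdb_private.h, which the truncated diff below does
not reach); the logging rule mirrors the errno != EAGAIN test visible in
tdb_brlock() in the diff.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>

/* Assumed flag values, for illustration only. */
enum sketch_lock_flags {
	SKETCH_LOCK_NOWAIT    = 0,  /* non-blocking: F_SETLK */
	SKETCH_LOCK_WAIT      = 1,  /* blocking: F_SETLKW */
	SKETCH_LOCK_PROBE     = 2,  /* don't log on failure */
	SKETCH_LOCK_MARK_ONLY = 4   /* bookkeeping only, no fcntl */
};

/* Which fcntl command a flags word selects. */
static int sketch_fcntl_cmd(int flags)
{
	/* Reject bits this sketch does not define. */
	assert((flags & ~(SKETCH_LOCK_WAIT | SKETCH_LOCK_PROBE |
			  SKETCH_LOCK_MARK_ONLY)) == 0);
	return (flags & SKETCH_LOCK_WAIT) ? F_SETLKW : F_SETLK;
}

/* Whether the call should reach fcntl at all. */
static bool sketch_touches_file(int flags)
{
	return !(flags & SKETCH_LOCK_MARK_ONLY);
}

/* Whether a failure with the given errno should be logged. */
static bool sketch_should_log(int flags, int err)
{
	return !(flags & SKETCH_LOCK_PROBE) && err != EAGAIN;
}
```

Folding the three former arguments into one word keeps call sites
short and leaves room to add further behaviours without changing the
signature again.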
-----------------------------------------------------------------------
Summary of changes:
lib/tdb/common/io.c | 1 -
lib/tdb/common/lock.c | 578 +++++++++++++++++++++++++++++-------------
lib/tdb/common/open.c | 32 ++-
lib/tdb/common/tdb.c | 7 +-
lib/tdb/common/tdb_private.h | 39 ++-
lib/tdb/common/transaction.c | 107 +++-----
lib/tdb/common/traverse.c | 4 +-
lib/tdb/tools/tdbtorture.c | 199 +++++++++++----
8 files changed, 636 insertions(+), 331 deletions(-)
Changeset truncated at 500 lines:
diff --git a/lib/tdb/common/io.c b/lib/tdb/common/io.c
index d549715..5b20fa1 100644
--- a/lib/tdb/common/io.c
+++ b/lib/tdb/common/io.c
@@ -461,7 +461,6 @@ static const struct tdb_methods io_methods = {
tdb_next_hash_chain,
tdb_oob,
tdb_expand_file,
- tdb_brlock
};
/*
diff --git a/lib/tdb/common/lock.c b/lib/tdb/common/lock.c
index 0984e51..65d6843 100644
--- a/lib/tdb/common/lock.c
+++ b/lib/tdb/common/lock.c
@@ -27,13 +27,104 @@
#include "tdb_private.h"
-#define TDB_MARK_LOCK 0x80000000
-
void tdb_setalarm_sigptr(struct tdb_context *tdb, volatile sig_atomic_t *ptr)
{
tdb->interrupt_sig_ptr = ptr;
}
+static int fcntl_lock(struct tdb_context *tdb,
+ int rw, off_t off, off_t len, bool waitflag)
+{
+ struct flock fl;
+
+ fl.l_type = rw;
+ fl.l_whence = SEEK_SET;
+ fl.l_start = off;
+ fl.l_len = len;
+ fl.l_pid = 0;
+
+ if (waitflag)
+ return fcntl(tdb->fd, F_SETLKW, &fl);
+ else
+ return fcntl(tdb->fd, F_SETLK, &fl);
+}
+
+static int fcntl_unlock(struct tdb_context *tdb, int rw, off_t off, off_t len)
+{
+ struct flock fl;
+#if 0 /* Check they matched up locks and unlocks correctly. */
+ char line[80];
+ FILE *locks;
+ bool found = false;
+
+ locks = fopen("/proc/locks", "r");
+
+ while (fgets(line, 80, locks)) {
+ char *p;
+ int type, start, l;
+
+ /* eg. 1: FLOCK ADVISORY WRITE 2440 08:01:2180826 0 EOF */
+ p = strchr(line, ':') + 1;
+ if (strncmp(p, " POSIX ADVISORY ", strlen(" POSIX ADVISORY ")))
+ continue;
+ p += strlen(" FLOCK ADVISORY ");
+ if (strncmp(p, "READ ", strlen("READ ")) == 0)
+ type = F_RDLCK;
+ else if (strncmp(p, "WRITE ", strlen("WRITE ")) == 0)
+ type = F_WRLCK;
+ else
+ abort();
+ p += 6;
+ if (atoi(p) != getpid())
+ continue;
+ p = strchr(strchr(p, ' ') + 1, ' ') + 1;
+ start = atoi(p);
+ p = strchr(p, ' ') + 1;
+ if (strncmp(p, "EOF", 3) == 0)
+ l = 0;
+ else
+ l = atoi(p) - start + 1;
+
+ if (off == start) {
+ if (len != l) {
+ fprintf(stderr, "Len %u should be %u: %s",
+ (int)len, l, line);
+ abort();
+ }
+ if (type != rw) {
+ fprintf(stderr, "Type %s wrong: %s",
+ rw == F_RDLCK ? "READ" : "WRITE", line);
+ abort();
+ }
+ found = true;
+ break;
+ }
+ }
+
+ if (!found) {
+ fprintf(stderr, "Unlock on %u@%u not found!\n",
+ (int)off, (int)len);
+ abort();
+ }
+
+ fclose(locks);
+#endif
+
+ fl.l_type = F_UNLCK;
+ fl.l_whence = SEEK_SET;
+ fl.l_start = off;
+ fl.l_len = len;
+ fl.l_pid = 0;
+
+ return fcntl(tdb->fd, F_SETLKW, &fl);
+}
+
+/* list -1 is the alloc list, otherwise a hash chain. */
+static tdb_off_t lock_offset(int list)
+{
+ return FREELIST_TOP + 4*list;
+}
+
/* a byte range locking function - return 0 on success
this functions locks/unlocks 1 byte at the specified offset.
@@ -42,30 +133,36 @@ void tdb_setalarm_sigptr(struct tdb_context *tdb, volatile sig_atomic_t *ptr)
note that a len of zero means lock to end of file
*/
-int tdb_brlock(struct tdb_context *tdb, tdb_off_t offset,
- int rw_type, int lck_type, int probe, size_t len)
+int tdb_brlock(struct tdb_context *tdb,
+ int rw_type, tdb_off_t offset, size_t len,
+ enum tdb_lock_flags flags)
{
- struct flock fl;
int ret;
if (tdb->flags & TDB_NOLOCK) {
return 0;
}
+ if (flags & TDB_LOCK_MARK_ONLY) {
+ return 0;
+ }
+
if ((rw_type == F_WRLCK) && (tdb->read_only || tdb->traverse_read)) {
tdb->ecode = TDB_ERR_RDONLY;
return -1;
}
- fl.l_type = rw_type;
- fl.l_whence = SEEK_SET;
- fl.l_start = offset;
- fl.l_len = len;
- fl.l_pid = 0;
+ /* Sanity check */
+ if (tdb->transaction && offset >= lock_offset(-1) && len != 0) {
+ tdb->ecode = TDB_ERR_RDONLY;
+ TDB_LOG((tdb, TDB_DEBUG_TRACE, "tdb_brlock attempted in transaction at offset %d rw_type=%d flags=%d len=%d\n",
+ offset, rw_type, flags, (int)len));
+ return -1;
+ }
do {
- ret = fcntl(tdb->fd,lck_type,&fl);
-
+ ret = fcntl_lock(tdb, rw_type, offset, len,
+ flags & TDB_LOCK_WAIT);
/* Check for a sigalarm break. */
if (ret == -1 && errno == EINTR &&
tdb->interrupt_sig_ptr &&
@@ -79,15 +176,34 @@ int tdb_brlock(struct tdb_context *tdb, tdb_off_t offset,
/* Generic lock error. errno set by fcntl.
* EAGAIN is an expected return from non-blocking
* locks. */
- if (!probe && lck_type != F_SETLK) {
- TDB_LOG((tdb, TDB_DEBUG_TRACE,"tdb_brlock failed (fd=%d) at offset %d rw_type=%d lck_type=%d len=%d\n",
- tdb->fd, offset, rw_type, lck_type, (int)len));
+ if (!(flags & TDB_LOCK_PROBE) && errno != EAGAIN) {
+ TDB_LOG((tdb, TDB_DEBUG_TRACE,"tdb_brlock failed (fd=%d) at offset %d rw_type=%d flags=%d len=%d\n",
+ tdb->fd, offset, rw_type, flags, (int)len));
}
return -1;
}
return 0;
}
+int tdb_brunlock(struct tdb_context *tdb,
+ int rw_type, tdb_off_t offset, size_t len)
+{
+ int ret;
+
+ if (tdb->flags & TDB_NOLOCK) {
+ return 0;
+ }
+
+ do {
+ ret = fcntl_unlock(tdb, rw_type, offset, len);
+ } while (ret == -1 && errno == EINTR);
+
+ if (ret == -1) {
+ TDB_LOG((tdb, TDB_DEBUG_TRACE,"tdb_brunlock failed (fd=%d) at offset %d rw_type=%d len=%d\n",
+ tdb->fd, offset, rw_type, (int)len));
+ }
+ return ret;
+}
/*
upgrade a read lock to a write lock. This needs to be handled in a
@@ -95,12 +211,29 @@ int tdb_brlock(struct tdb_context *tdb, tdb_off_t offset,
deadlock detection and claim a deadlock when progress can be
made. For those OSes we may loop for a while.
*/
-int tdb_brlock_upgrade(struct tdb_context *tdb, tdb_off_t offset, size_t len)
+int tdb_allrecord_upgrade(struct tdb_context *tdb)
{
int count = 1000;
+
+ if (tdb->allrecord_lock.count != 1) {
+ TDB_LOG((tdb, TDB_DEBUG_ERROR,
+ "tdb_allrecord_upgrade failed: count %u too high\n",
+ tdb->allrecord_lock.count));
+ return -1;
+ }
+
+ if (tdb->allrecord_lock.off != 1) {
+ TDB_LOG((tdb, TDB_DEBUG_ERROR,
+ "tdb_allrecord_upgrade failed: already upgraded?\n"));
+ return -1;
+ }
+
while (count--) {
struct timeval tv;
- if (tdb_brlock(tdb, offset, F_WRLCK, F_SETLKW, 1, len) == 0) {
+ if (tdb_brlock(tdb, F_WRLCK, FREELIST_TOP, 0,
+ TDB_LOCK_WAIT|TDB_LOCK_PROBE) == 0) {
+ tdb->allrecord_lock.ltype = F_WRLCK;
+ tdb->allrecord_lock.off = 0;
return 0;
}
if (errno != EDEADLK) {
@@ -111,57 +244,46 @@ int tdb_brlock_upgrade(struct tdb_context *tdb, tdb_off_t offset, size_t len)
tv.tv_usec = 1;
select(0, NULL, NULL, NULL, &tv);
}
- TDB_LOG((tdb, TDB_DEBUG_TRACE,"tdb_brlock_upgrade failed at offset %d\n", offset));
+ TDB_LOG((tdb, TDB_DEBUG_TRACE,"tdb_allrecord_upgrade failed\n"));
return -1;
}
-
-/* lock a list in the database. list -1 is the alloc list */
-static int _tdb_lock(struct tdb_context *tdb, int list, int ltype, int op)
+static struct tdb_lock_type *find_nestlock(struct tdb_context *tdb,
+ tdb_off_t offset)
{
- struct tdb_lock_type *new_lck;
- int i;
- bool mark_lock = ((ltype & TDB_MARK_LOCK) == TDB_MARK_LOCK);
-
- ltype &= ~TDB_MARK_LOCK;
+ unsigned int i;
- /* a global lock allows us to avoid per chain locks */
- if (tdb->global_lock.count &&
- (ltype == tdb->global_lock.ltype || ltype == F_RDLCK)) {
- return 0;
+ for (i=0; i<tdb->num_lockrecs; i++) {
+ if (tdb->lockrecs[i].off == offset) {
+ return &tdb->lockrecs[i];
+ }
}
+ return NULL;
+}
- if (tdb->global_lock.count) {
- tdb->ecode = TDB_ERR_LOCK;
- return -1;
- }
+/* lock an offset in the database. */
+int tdb_nest_lock(struct tdb_context *tdb, uint32_t offset, int ltype,
+ enum tdb_lock_flags flags)
+{
+ struct tdb_lock_type *new_lck;
- if (list < -1 || list >= (int)tdb->header.hash_size) {
+ if (offset >= lock_offset(tdb->header.hash_size)) {
tdb->ecode = TDB_ERR_LOCK;
- TDB_LOG((tdb, TDB_DEBUG_ERROR,"tdb_lock: invalid list %d for ltype=%d\n",
- list, ltype));
+ TDB_LOG((tdb, TDB_DEBUG_ERROR,"tdb_lock: invalid offset %u for ltype=%d\n",
+ offset, ltype));
return -1;
}
if (tdb->flags & TDB_NOLOCK)
return 0;
- for (i=0; i<tdb->num_lockrecs; i++) {
- if (tdb->lockrecs[i].list == list) {
- if (tdb->lockrecs[i].count == 0) {
- /*
- * Can't happen, see tdb_unlock(). It should
- * be an assert.
- */
- TDB_LOG((tdb, TDB_DEBUG_ERROR, "tdb_lock: "
- "lck->count == 0 for list %d", list));
- }
- /*
- * Just increment the in-memory struct, posix locks
- * don't stack.
- */
- tdb->lockrecs[i].count++;
- return 0;
- }
+ new_lck = find_nestlock(tdb, offset);
+ if (new_lck) {
+ /*
+ * Just increment the in-memory struct, posix locks
+ * don't stack.
+ */
+ new_lck->count++;
+ return 0;
}
new_lck = (struct tdb_lock_type *)realloc(
@@ -175,27 +297,89 @@ static int _tdb_lock(struct tdb_context *tdb, int list, int ltype, int op)
/* Since fcntl locks don't nest, we do a lock for the first one,
and simply bump the count for future ones */
- if (!mark_lock &&
- tdb->methods->tdb_brlock(tdb,FREELIST_TOP+4*list, ltype, op,
- 0, 1)) {
+ if (tdb_brlock(tdb, ltype, offset, 1, flags)) {
return -1;
}
- tdb->num_locks++;
-
- tdb->lockrecs[tdb->num_lockrecs].list = list;
+ tdb->lockrecs[tdb->num_lockrecs].off = offset;
tdb->lockrecs[tdb->num_lockrecs].count = 1;
tdb->lockrecs[tdb->num_lockrecs].ltype = ltype;
- tdb->num_lockrecs += 1;
+ tdb->num_lockrecs++;
return 0;
}
+static int tdb_lock_and_recover(struct tdb_context *tdb)
+{
+ int ret;
+
+ /* We need to match locking order in transaction commit. */
+ if (tdb_brlock(tdb, F_WRLCK, FREELIST_TOP, 0, TDB_LOCK_WAIT)) {
+ return -1;
+ }
+
+ if (tdb_brlock(tdb, F_WRLCK, OPEN_LOCK, 1, TDB_LOCK_WAIT)) {
+ tdb_brunlock(tdb, F_WRLCK, FREELIST_TOP, 0);
+ return -1;
+ }
+
+ ret = tdb_transaction_recover(tdb);
+
+ tdb_brunlock(tdb, F_WRLCK, OPEN_LOCK, 1);
+ tdb_brunlock(tdb, F_WRLCK, FREELIST_TOP, 0);
+
+ return ret;
+}
+
+static bool have_data_locks(const struct tdb_context *tdb)
+{
+ unsigned int i;
+
+ for (i = 0; i < tdb->num_lockrecs; i++) {
+ if (tdb->lockrecs[i].off >= lock_offset(-1))
+ return true;
+ }
+ return false;
+}
+
+static int tdb_lock_list(struct tdb_context *tdb, int list, int ltype,
+ enum tdb_lock_flags waitflag)
+{
+ int ret;
+ bool check = false;
+
+ /* a allrecord lock allows us to avoid per chain locks */
+ if (tdb->allrecord_lock.count &&
+ (ltype == tdb->allrecord_lock.ltype || ltype == F_RDLCK)) {
+ return 0;
+ }
+
+ if (tdb->allrecord_lock.count) {
+ tdb->ecode = TDB_ERR_LOCK;
+ ret = -1;
+ } else {
+ /* Only check when we grab first data lock. */
+ check = !have_data_locks(tdb);
+ ret = tdb_nest_lock(tdb, lock_offset(list), ltype, waitflag);
+
+ if (ret == 0 && check && tdb_needs_recovery(tdb)) {
+ tdb_nest_unlock(tdb, lock_offset(list), ltype, false);
+
+ if (tdb_lock_and_recover(tdb) == -1) {
+ return -1;
+ }
+ return tdb_lock_list(tdb, list, ltype, waitflag);
+ }
+ }
+ return ret;
+}
+
/* lock a list in the database. list -1 is the alloc list */
int tdb_lock(struct tdb_context *tdb, int list, int ltype)
{
int ret;
- ret = _tdb_lock(tdb, list, ltype, F_SETLKW);
+
+ ret = tdb_lock_list(tdb, list, ltype, TDB_LOCK_WAIT);
if (ret) {
TDB_LOG((tdb, TDB_DEBUG_ERROR, "tdb_lock failed on list %d "
"ltype=%d (%s)\n", list, ltype, strerror(errno)));
@@ -206,49 +390,26 @@ int tdb_lock(struct tdb_context *tdb, int list, int ltype)
/* lock a list in the database. list -1 is the alloc list. non-blocking lock */
int tdb_lock_nonblock(struct tdb_context *tdb, int list, int ltype)
{
- return _tdb_lock(tdb, list, ltype, F_SETLK);
+ return tdb_lock_list(tdb, list, ltype, TDB_LOCK_NOWAIT);
}
-/* unlock the database: returns void because it's too late for errors. */
- /* changed to return int it may be interesting to know there
- has been an error --simo */
-int tdb_unlock(struct tdb_context *tdb, int list, int ltype)
+int tdb_nest_unlock(struct tdb_context *tdb, uint32_t offset, int ltype,
+ bool mark_lock)
{
int ret = -1;
- int i;
- struct tdb_lock_type *lck = NULL;
- bool mark_lock = ((ltype & TDB_MARK_LOCK) == TDB_MARK_LOCK);
-
- ltype &= ~TDB_MARK_LOCK;
-
- /* a global lock allows us to avoid per chain locks */
- if (tdb->global_lock.count &&
- (ltype == tdb->global_lock.ltype || ltype == F_RDLCK)) {
- return 0;
- }
-
- if (tdb->global_lock.count) {
- tdb->ecode = TDB_ERR_LOCK;
- return -1;
- }
+ struct tdb_lock_type *lck;
if (tdb->flags & TDB_NOLOCK)
return 0;
/* Sanity checks */
- if (list < -1 || list >= (int)tdb->header.hash_size) {
- TDB_LOG((tdb, TDB_DEBUG_ERROR, "tdb_unlock: list %d invalid (%d)\n", list, tdb->header.hash_size));
+ if (offset >= lock_offset(tdb->header.hash_size)) {
+ TDB_LOG((tdb, TDB_DEBUG_ERROR, "tdb_unlock: offset %u invalid (%d)\n", offset, tdb->header.hash_size));
return ret;
}
- for (i=0; i<tdb->num_lockrecs; i++) {
- if (tdb->lockrecs[i].list == list) {
- lck = &tdb->lockrecs[i];
- break;
- }
- }
-
+ lck = find_nestlock(tdb, offset);
if ((lck == NULL) || (lck->count == 0)) {
TDB_LOG((tdb, TDB_DEBUG_ERROR, "tdb_unlock: count is 0\n"));
return -1;
@@ -269,20 +430,14 @@ int tdb_unlock(struct tdb_context *tdb, int list, int ltype)
if (mark_lock) {
ret = 0;
} else {
- ret = tdb->methods->tdb_brlock(tdb, FREELIST_TOP+4*list, F_UNLCK,
- F_SETLKW, 0, 1);
+ ret = tdb_brunlock(tdb, ltype, offset, 1);
}
- tdb->num_locks--;
/*
* Shrink the array by overwriting the element just unlocked with the
* last array element.
*/
-
- if (tdb->num_lockrecs > 1) {
- *lck = tdb->lockrecs[tdb->num_lockrecs-1];
- }
- tdb->num_lockrecs -= 1;
+ *lck = tdb->lockrecs[--tdb->num_lockrecs];
--
Samba Shared Repository