[SCM] CTDB repository - branch master updated - ctdb-1.10-87-ge075670
Ronnie Sahlberg
sahlberg at samba.org
Thu Feb 17 17:06:02 MST 2011
The branch, master has been updated
via e075670dee8e6ecaba54986f87a85be3d0528b6b (commit)
via 7db5a4832a9555be53c301f198f72b9e075a8ae7 (commit)
via 0c030c9384500f340d8382c20e1e91b11aa377e9 (commit)
via 155dd1f4885fe142c6f8bd09430f65daf8a17e51 (commit)
via b86feb6fe463dfdb67b2798491df18a4c434a430 (commit)
from 307e5e95548155a31682dfcb0956834d0c85838e (commit)
http://gitweb.samba.org/?p=sahlberg/ctdb.git;a=shortlog;h=master
- Log -----------------------------------------------------------------
commit e075670dee8e6ecaba54986f87a85be3d0528b6b
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date: Mon Dec 6 13:08:53 2010 +1100
Add two new flags for the ltdb header.
One of them signals that the record has never been migrated to/from a node
while containing data.
This property "has never been migrated while non-zero" is important later
for heuristics on which records we might be able to purge
from the tdb files cheaply, i.e. without having to rely on the full-blown
database vacuum.
These records are believed to be very common and the pattern looks like
this:
1, no record exists at all.
2, client opens a file
3, samba requests the record for this file
4, an empty record is created on the LMASTER
5, the empty record is migrated to the DMASTER
6, samba writes a <sharemode> to the record locally and the record grows
7, client finishes working with the file and closes the file
8, samba removes the sharemode and the record becomes empty again.
9, much later: vacuuming will delete the record
At stage 8, since the record has never been migrated onto a node while being
non-zero, it would be safe, and much more efficient, to just delete the record
completely from the database and hand it back to the LMASTER.
The flags occupy the same uint32_t that was previously used for
laccessor/lacount in the header. For now, make sure the flags only define/use
the top 16 bits of this field, so that we are sure we don't collide with bits
set to one by previous generations of the ctdb cluster database, prior to this
change in the semantics of this word.
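As an illustration (not code from this commit), the cheap-delete heuristic
described above can be sketched like this. The flag values are the ones the
patch adds to include/ctdb_protocol.h; the struct here is a simplified
stand-in for ctdb_ltdb_header, not the real definition:

```c
#include <stdint.h>
#include <stddef.h>

/* Flag values as defined by this change; only the top 16 bits of the
 * old laccessor/lacount word are used, to avoid colliding with bits
 * written by older cluster generations. */
#define CTDB_REC_FLAG_DEFAULT            0x00000000
#define CTDB_REC_FLAG_MIGRATED_WITH_DATA 0x00010000

/* Simplified stand-in for the relevant parts of ctdb_ltdb_header. */
struct rec_header {
	uint32_t flags;
	size_t   dsize;	/* current size of the record's data */
};

/* Heuristic from the commit message: an empty record that was never
 * migrated while carrying data can be deleted cheaply and handed back
 * to the LMASTER, without a full-blown database vacuum. */
static int can_purge_cheaply(const struct rec_header *h)
{
	return h->dsize == 0 &&
	       !(h->flags & CTDB_REC_FLAG_MIGRATED_WITH_DATA);
}
```

In the sharemode lifecycle above, the record at stage 8 is empty and never
carried data across a migration, so it satisfies this check.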
This is a rework of Michael's patch:
commit 2af1a47cbe1a608496c8caf3eb0c990eb7259a0d
Author: Michael Adam <obnox at samba.org>
Date: Tue Nov 30 17:00:54 2010 +0100
add a DEFAULT record flag and a MIGRATED_WITH_DATA record flag.
commit 7db5a4832a9555be53c301f198f72b9e075a8ae7
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date: Mon Oct 18 11:57:38 2010 +1100
Remove the filesystem and filesystem-health checking from the cnfs script.
Remove the gpfsmount and gpfsumount entry points.
commit 0c030c9384500f340d8382c20e1e91b11aa377e9
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date: Fri Jan 21 10:56:56 2011 +1100
60.nfs
Don't update the statd settings that often.
When we have very many nodes and very many IPs, this generates
a lot of unnecessary load on the system.
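The interval check the 60.nfs diff below implements in shell can be sketched
in C for clarity (the function name and the idea of passing the trigger path
are illustrative, not part of the patch): run the expensive statd update only
when the trigger file's mtime is at least the interval in the past, or when
the trigger does not exist yet.

```c
#include <sys/stat.h>
#include <time.h>

/* Sketch of the 60.nfs throttle logic: return nonzero if the
 * expensive statd update should run now.  A missing trigger file
 * means we have never updated, so run immediately; otherwise run
 * only once per `interval` seconds. */
static int should_update(const char *trigger, time_t now, time_t interval)
{
	struct stat st;

	if (stat(trigger, &st) != 0) {
		return 1;	/* no trigger yet: first run */
	}
	return now >= st.st_mtime + interval;
}
```

After a run, the script touches the trigger file, which resets the window.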
commit 155dd1f4885fe142c6f8bd09430f65daf8a17e51
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date: Mon Nov 29 13:07:59 2010 +1100
Remove LACOUNT and LACCESSOR and migrate the records immediately.
This concept didn't work out, and it is really just as expensive as a full
migration anyway, without the benefit of caching the data for subsequent
accesses.
Now, migrate the records immediately on first access.
This will be combined with a "cheap vacuum-lite" for special empty records to
prevent growth of databases.
Later extensions to mimic read-only behaviour of records will include proper
shared read-only locking of database records, making the laccessor/lacount
read-only access to the data obsolete anyway.
Removing this special case and the lacount/laccessor handling makes the
codepath where shared read-only locking will be implemented simpler, and
frees up space in the ctdb_ltdb header for vacuuming flags as well as
read-only locking flags.
commit b86feb6fe463dfdb67b2798491df18a4c434a430
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date: Fri Oct 8 13:14:14 2010 +1100
change the hash function to use the much better Jenkins hash
from the tdb library
cq S1020233
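For background (illustrative only): tdb_jenkins_hash() is based on Bob
Jenkins' lookup3 hash. The sketch below shows the simpler one-at-a-time hash
from the same family, just to illustrate the mixing structure that makes
these hashes distribute keys well; it does NOT produce the same values as
the tdb function replacing ctdb_hash() in the diff below.

```c
#include <stdint.h>
#include <stddef.h>

/* Jenkins' classic one-at-a-time hash: each input byte is folded in
 * and mixed immediately, with a final avalanche pass.  Shown only to
 * illustrate the Jenkins hash family; tdb_jenkins_hash() uses the
 * later lookup3 variant and yields different values. */
static uint32_t jenkins_oaat(const unsigned char *key, size_t len)
{
	uint32_t h = 0;
	size_t i;

	for (i = 0; i < len; i++) {
		h += key[i];
		h += h << 10;
		h ^= h >> 6;
	}
	h += h << 3;
	h ^= h >> 11;
	h += h << 15;
	return h;
}
```

Compared with the old multiply-and-shift scheme being removed from
ctdb_hash(), this family mixes every byte into all output bits, so similar
keys no longer cluster into the same hash chains.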
-----------------------------------------------------------------------
Summary of changes:
client/ctdb_client.c | 15 ++-------
common/ctdb_ltdb.c | 1 -
common/ctdb_util.c | 9 +-----
config/events.d/60.nfs | 4 +-
config/events.d/62.cnfs | 74 -----------------------------------------------
include/ctdb_client.h | 5 ---
include/ctdb_private.h | 5 +--
include/ctdb_protocol.h | 6 ++-
server/ctdb_call.c | 19 +++++-------
server/ctdb_tunables.c | 1 -
10 files changed, 21 insertions(+), 118 deletions(-)
Changeset truncated at 500 lines:
diff --git a/client/ctdb_client.c b/client/ctdb_client.c
index 5a07a85..99ff72d 100644
--- a/client/ctdb_client.c
+++ b/client/ctdb_client.c
@@ -72,7 +72,7 @@ struct ctdb_req_header *_ctdbd_allocate_pkt(struct ctdb_context *ctdb,
*/
int ctdb_call_local(struct ctdb_db_context *ctdb_db, struct ctdb_call *call,
struct ctdb_ltdb_header *header, TALLOC_CTX *mem_ctx,
- TDB_DATA *data, uint32_t caller)
+ TDB_DATA *data)
{
struct ctdb_call_info *c;
struct ctdb_registered_call *fn;
@@ -105,15 +105,8 @@ int ctdb_call_local(struct ctdb_db_context *ctdb_db, struct ctdb_call *call,
return -1;
}
- if (header->laccessor != caller) {
- header->lacount = 0;
- }
- header->laccessor = caller;
- header->lacount++;
-
- /* we need to force the record to be written out if this was a remote access,
- so that the lacount is updated */
- if (c->new_data == NULL && header->laccessor != ctdb->pnn) {
+ /* we need to force the record to be written out if this was a remote access */
+ if (c->new_data == NULL) {
c->new_data = &c->record_data;
}
@@ -368,7 +361,7 @@ static struct ctdb_client_call_state *ctdb_client_call_local_send(struct ctdb_db
*(state->call) = *call;
state->ctdb_db = ctdb_db;
- ret = ctdb_call_local(ctdb_db, state->call, header, state, data, ctdb->pnn);
+ ret = ctdb_call_local(ctdb_db, state->call, header, state, data);
return state;
}
diff --git a/common/ctdb_ltdb.c b/common/ctdb_ltdb.c
index 3572371..200cca4 100644
--- a/common/ctdb_ltdb.c
+++ b/common/ctdb_ltdb.c
@@ -65,7 +65,6 @@ static void ltdb_initial_header(struct ctdb_db_context *ctdb_db,
ZERO_STRUCTP(header);
/* initial dmaster is the lmaster */
header->dmaster = ctdb_lmaster(ctdb_db->ctdb, &key);
- header->laccessor = header->dmaster;
}
diff --git a/common/ctdb_util.c b/common/ctdb_util.c
index 88741e3..1ff4c1f 100644
--- a/common/ctdb_util.c
+++ b/common/ctdb_util.c
@@ -99,14 +99,7 @@ bool ctdb_same_address(struct ctdb_address *a1, struct ctdb_address *a2)
*/
uint32_t ctdb_hash(const TDB_DATA *key)
{
- uint32_t value; /* Used to compute the hash value. */
- uint32_t i; /* Used to cycle through random values. */
-
- /* Set the initial value from the key size. */
- for (value = 0x238F13AF * key->dsize, i=0; i < key->dsize; i++)
- value = (value + (key->dptr[i] << (i*5 % 24)));
-
- return (1103515243 * value + 12345);
+ return tdb_jenkins_hash(discard_const(key));
}
/*
diff --git a/config/events.d/60.nfs b/config/events.d/60.nfs
index 79a071b..0cea531 100755
--- a/config/events.d/60.nfs
+++ b/config/events.d/60.nfs
@@ -179,11 +179,11 @@ case "$1" in
$cmd &
}
- # once every 60 seconds, update the statd state database for which
+ # once every 600 seconds, update the statd state database for which
# clients need notifications
LAST_UPDATE=`stat --printf="%Y" $CTDB_VARDIR/state/statd/update-trigger 2>/dev/null`
CURRENT_TIME=`date +"%s"`
- [ $CURRENT_TIME -ge $(($LAST_UPDATE + 60)) ] && {
+ [ $CURRENT_TIME -ge $(($LAST_UPDATE + 600)) ] && {
mkdir -p $CTDB_VARDIR/state/statd
touch $CTDB_VARDIR/state/statd/update-trigger
$CTDB_BASE/statd-callout updatelocal &
diff --git a/config/events.d/62.cnfs b/config/events.d/62.cnfs
index e0af722..af4ecc3 100755
--- a/config/events.d/62.cnfs
+++ b/config/events.d/62.cnfs
@@ -8,20 +8,8 @@ loadconfig
STATEDIR=$CTDB_VARDIR/state/gpfs
-# filesystems needed by nfs
-NFS_FSS=`cat /etc/exports | egrep -v "^#" | sed -e "s/[ \t]*[^ \t]*$//" -e "s/\"//g"`
-
-
-
check_if_healthy() {
mkdir -p $STATEDIR/fs
- FS=`(cd $STATEDIR/fs ; ls )`
- [ -z "$FS" ] || {
- MISSING=`echo $FS | sed -e "s/@/\//g"`
- logger Filesystems required for NFS are missing. Node is UNHEALTHY. [$MISSING]
- $CTDB_BASE/events.d/62.cnfs unhealthy "GPFS filesystems required for NFS are not mounted : [$MISSING]"
- exit 0
- }
[ -f "$STATEDIR/gpfsnoquorum" ] && {
logger No GPFS quorum. Node is UNHEALTHY
@@ -40,64 +28,6 @@ case "$1" in
;;
- # This event is called from the GPFS callbacks when a filesystem is
- # unmounted
- gpfsumount)
- # is this a filesystem we need for nfs?
- echo "$NFS_FSS" | egrep "^$2" >/dev/null || {
- # no
- exit 0
- }
-
- logger "GPFS unmounted filesystem $2 used by NFS. Mark node as UNHEALTHY"
-
- MFS=`echo $2 | sed -e "s/\//@/g"`
- mkdir -p $STATEDIR/fs
- touch "$STATEDIR/fs/$MFS"
- $CTDB_BASE/events.d/62.cnfs unhealthy "GPFS unmounted filesystem $2 used by NFS"
- ;;
-
- # This event is called from the GPFS callbacks when a filesystem is
- # mounted
- gpfsmount)
- # is this a filesystem we need for nfs?
- echo "$NFS_FSS" | egrep "^$2" >/dev/null || {
- # no
- exit 0
- }
-
- logger "GPFS mounted filesystem $2 used by NFS."
-
- MFS=`echo $2 | sed -e "s/\//@/g"`
- mkdir -p $STATEDIR/fs
- rm -f "$STATEDIR/fs/$MFS"
-
- check_if_healthy
- ;;
-
-
-
- # This event is called from the gpfs callback when GPFS is being shutdown.
- gpfsshutdown)
- logger "GPFS is shutting down. Marking node as UNHEALTHY and trigger a CTDB failover"
- $CTDB_BASE/events.d/62.cnfs unhealthy "GPFS was shut down!"
- ;;
-
-
- # This event is called from the gpfs callback when GPFS has started.
- # It checks that all required NFS filesystems are mounted
- # and flags the node healthy if so.
- gpfsstartup)
- # assume we always have quorum when starting
- # we are only interested in the case when we explicitely
- # lost quorum in an otherwise happy cluster
- mkdir -p $STATEDIR
- rm -f "$STATEDIR/gpfsnoquorum"
- logger "GPFS is is started."
- check_if_healthy
- ;;
-
-
gpfsquorumreached)
mkdir -p $STATEDIR
rm -f "$STATEDIR/gpfsnoquorum"
@@ -112,10 +42,6 @@ case "$1" in
$CTDB_BASE/events.d/62.cnfs unhealthy "GPFS quorum was lost! Marking node as UNHEALTHY."
;;
-
-
-
-
unhealthy)
# Mark the node as UNHEALTHY which means all public addresses
# will be migrated off the node.
diff --git a/include/ctdb_client.h b/include/ctdb_client.h
index aa9b2c0..3dc115f 100644
--- a/include/ctdb_client.h
+++ b/include/ctdb_client.h
@@ -77,11 +77,6 @@ int ctdb_set_tdb_dir_state(struct ctdb_context *ctdb, const char *dir);
void ctdb_set_flags(struct ctdb_context *ctdb, unsigned flags);
/*
- set max acess count before a dmaster migration
-*/
-void ctdb_set_max_lacount(struct ctdb_context *ctdb, unsigned count);
-
-/*
tell ctdb what address to listen on, in transport specific format
*/
int ctdb_set_address(struct ctdb_context *ctdb, const char *address);
diff --git a/include/ctdb_private.h b/include/ctdb_private.h
index c189a5f..3f36870 100644
--- a/include/ctdb_private.h
+++ b/include/ctdb_private.h
@@ -82,7 +82,6 @@ struct ctdb_tunable {
uint32_t traverse_timeout;
uint32_t keepalive_interval;
uint32_t keepalive_limit;
- uint32_t max_lacount;
uint32_t recover_timeout;
uint32_t recover_interval;
uint32_t election_timeout;
@@ -776,8 +775,8 @@ struct ctdb_call_state *ctdb_daemon_call_send_remote(struct ctdb_db_context *ctd
struct ctdb_ltdb_header *header);
int ctdb_call_local(struct ctdb_db_context *ctdb_db, struct ctdb_call *call,
- struct ctdb_ltdb_header *header, TALLOC_CTX *mem_ctx, TDB_DATA *data,
- uint32_t caller);
+ struct ctdb_ltdb_header *header, TALLOC_CTX *mem_ctx,
+ TDB_DATA *data);
#define ctdb_reqid_find(ctdb, reqid, type) (type *)_ctdb_reqid_find(ctdb, reqid, #type, __location__)
diff --git a/include/ctdb_protocol.h b/include/ctdb_protocol.h
index baf1790..b6b753c 100644
--- a/include/ctdb_protocol.h
+++ b/include/ctdb_protocol.h
@@ -479,8 +479,10 @@ enum ctdb_trans2_commit_error {
struct ctdb_ltdb_header {
uint64_t rsn;
uint32_t dmaster;
- uint32_t laccessor;
- uint32_t lacount;
+ uint32_t reserved1;
+#define CTDB_REC_FLAG_DEFAULT 0x00000000
+#define CTDB_REC_FLAG_MIGRATED_WITH_DATA 0x00010000
+ uint32_t flags;
};
diff --git a/server/ctdb_call.c b/server/ctdb_call.c
index c5f7e7d..d6c0866 100644
--- a/server/ctdb_call.c
+++ b/server/ctdb_call.c
@@ -297,7 +297,7 @@ static void ctdb_become_dmaster(struct ctdb_db_context *ctdb_db,
return;
}
- ctdb_call_local(ctdb_db, state->call, &header, state, &data, ctdb->pnn);
+ ctdb_call_local(ctdb_db, state->call, &header, state, &data);
ret = ctdb_ltdb_unlock(ctdb_db, state->call->key);
if (ret != 0) {
@@ -465,14 +465,11 @@ void ctdb_request_call(struct ctdb_context *ctdb, struct ctdb_req_header *hdr)
CTDB_UPDATE_STAT(ctdb, max_hop_count, c->hopcount);
- /* if this nodes has done enough consecutive calls on the same record
- then give them the record
- or if the node requested an immediate migration
- */
- if ( c->hdr.srcnode != ctdb->pnn &&
- ((header.laccessor == c->hdr.srcnode
- && header.lacount >= ctdb->tunable.max_lacount)
- || (c->flags & CTDB_IMMEDIATE_MIGRATION)) ) {
+ /* Try if possible to migrate the record off to the caller node.
+ * From the clients perspective a fetch of the data is just as
+ * expensive as a migration.
+ */
+ if (c->hdr.srcnode != ctdb->pnn) {
if (ctdb_db->transaction_active) {
DEBUG(DEBUG_INFO, (__location__ " refusing migration"
" of key %s while transaction is active\n",
@@ -491,7 +488,7 @@ void ctdb_request_call(struct ctdb_context *ctdb, struct ctdb_req_header *hdr)
}
}
- ctdb_call_local(ctdb_db, call, &header, hdr, &data, c->hdr.srcnode);
+ ctdb_call_local(ctdb_db, call, &header, hdr, &data);
ret = ctdb_ltdb_unlock(ctdb_db, call->key);
if (ret != 0) {
@@ -707,7 +704,7 @@ struct ctdb_call_state *ctdb_call_local_send(struct ctdb_db_context *ctdb_db,
*(state->call) = *call;
state->ctdb_db = ctdb_db;
- ret = ctdb_call_local(ctdb_db, state->call, header, state, data, ctdb->pnn);
+ ret = ctdb_call_local(ctdb_db, state->call, header, state, data);
event_add_timed(ctdb->ev, state, timeval_zero(), call_local_trigger, state);
diff --git a/server/ctdb_tunables.c b/server/ctdb_tunables.c
index 47694b7..4cd1b45 100644
--- a/server/ctdb_tunables.c
+++ b/server/ctdb_tunables.c
@@ -30,7 +30,6 @@ static const struct {
{ "TraverseTimeout", 20, offsetof(struct ctdb_tunable, traverse_timeout) },
{ "KeepaliveInterval", 5, offsetof(struct ctdb_tunable, keepalive_interval) },
{ "KeepaliveLimit", 5, offsetof(struct ctdb_tunable, keepalive_limit) },
- { "MaxLACount", 7, offsetof(struct ctdb_tunable, max_lacount) },
{ "RecoverTimeout", 20, offsetof(struct ctdb_tunable, recover_timeout) },
{ "RecoverInterval", 1, offsetof(struct ctdb_tunable, recover_interval) },
{ "ElectionTimeout", 3, offsetof(struct ctdb_tunable, election_timeout) },
--
CTDB repository
More information about the samba-cvs
mailing list