[SCM] CTDB repository - branch master updated - b1fed105ad780e89a128a611ef0bd659818eeebf

Andrew Tridgell tridge at samba.org
Wed Jul 23 05:37:11 GMT 2008


The branch, master has been updated
       via  b1fed105ad780e89a128a611ef0bd659818eeebf (commit)
       via  8fed021d11160b137f4140ea02947347250e2959 (commit)
       via  e8ef9891aa31c374921b23cc74e1eda1f8218bf0 (commit)
       via  0de79352c9b36c118e36905f08ebbe38ecbb957e (commit)
       via  b08a988fbdad0da850c9b79791c1a8970555147f (commit)
       via  eca73bcaa33f88c683b79d57d85b590659018ad8 (commit)
       via  c5035657606283d2e35bea40992505e84ca8e7be (commit)
       via  60e2cb175c449ae65793a3e1ffb60cf030a3a0d5 (commit)
       via  3d58f9b524a40c7b43a2a855212db090e9becefa (commit)
       via  554dcf16d37c8b9e4704df11d21fb272f30f5cec (commit)
       via  52716d26eb84104d65828bed38e69f214a5fa824 (commit)
       via  52a38487f981fd5981c02a7a063ad2c598591c10 (commit)
      from  e24152fbd06ba4c2b6cfd473751c7f00a676b9ae (commit)

http://gitweb.samba.org/?p=tridge/ctdb.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit b1fed105ad780e89a128a611ef0bd659818eeebf
Author: Andrew Tridgell <tridge at samba.org>
Date:   Wed Jul 23 15:36:23 2008 +1000

    run the testparm commands in 50.samba in the background, only running
    in the foreground if something fails

commit 8fed021d11160b137f4140ea02947347250e2959
Author: Andrew Tridgell <tridge at samba.org>
Date:   Wed Jul 23 15:35:46 2008 +1000

    allow for probing of directories without raising an error

commit e8ef9891aa31c374921b23cc74e1eda1f8218bf0
Author: Andrew Tridgell <tridge at samba.org>
Date:   Wed Jul 23 15:25:52 2008 +1000

    fixed buffering in ctdb logging code to handle multiple lines
    correctly
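
The buffering fix can be sketched on its own as follows. This is a minimal illustration and not the ctdb source: struct log_buf, emit_line() and log_feed() are hypothetical names, and emit_line() stands in for do_debug(). The point is that every complete line already in the buffer is emitted in a loop, where the old code emitted at most one line per read.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the log buffer kept in the ctdb context. */
struct log_buf {
	char buf[1024];
	size_t used;
};

/* Hypothetical sink; the real code calls do_debug(). */
static int lines_emitted;
static char last_line[1025];

static void emit_line(const char *s, int len)
{
	snprintf(last_line, sizeof(last_line), "%.*s", len, s);
	printf("%s\n", last_line);
	lines_emitted++;
}

/* Append freshly read bytes, then emit every complete line in the buffer. */
static void log_feed(struct log_buf *lb, const char *data, size_t len)
{
	size_t room = sizeof(lb->buf) - lb->used;
	char *p;

	if (len > room) {
		len = room;	/* sketch only: excess input is dropped */
	}
	memcpy(lb->buf + lb->used, data, len);
	lb->used += len;

	while (lb->used > 0 &&
	       (p = memchr(lb->buf, '\n', lb->used)) != NULL) {
		size_t consumed = (size_t)(p - lb->buf) + 1;
		int text_len = (int)consumed - 1;

		/* swallow a trailing \r from child processes */
		if (text_len > 0 && lb->buf[text_len - 1] == '\r') {
			text_len--;
		}
		emit_line(lb->buf, text_len);
		/* shift the unconsumed tail to the front of the buffer */
		memmove(lb->buf, p + 1, lb->used - consumed);
		lb->used -= consumed;
	}

	/* buffer full with no newline: flush it as a single line */
	if (lb->used == sizeof(lb->buf)) {
		emit_line(lb->buf, (int)lb->used);
		lb->used = 0;
	}
}
```

Feeding "one\r\ntwo\npar" in one call emits two lines and leaves "par" buffered until a later call supplies the rest of that line.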

commit 0de79352c9b36c118e36905f08ebbe38ecbb957e
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Tue Jul 22 09:07:42 2008 +1000

    From Michael Adams,
    change one element from private to private_data
    
    Signed-off-by: Ronnie Sahlberg <ronniesahlberg at gmail.com>

commit b08a988fbdad0da850c9b79791c1a8970555147f
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Fri Jul 18 13:49:05 2008 +1000

    new version 1.0.50

commit eca73bcaa33f88c683b79d57d85b590659018ad8
Merge: c5035657606283d2e35bea40992505e84ca8e7be e24152fbd06ba4c2b6cfd473751c7f00a676b9ae
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Fri Jul 18 13:42:39 2008 +1000

    Merge git://git.samba.org/tridge/ctdb

commit c5035657606283d2e35bea40992505e84ca8e7be
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Fri Jul 18 12:07:25 2008 +1000

    if a new node enters the cluster, that node will already be frozen at start
    but the rest of the nodes are not frozen.
    
    at this stage an election is called by the new node.
    
    Since in this case the nodes are not frozen, we can not modify the recmaster
    of the nodes, so it is expected that this control would fail.
    
    Add a boolean to send_election_request() to make it not
    try to set the recmaster locally for the case where we are in an election phase
    while not frozen.
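
A reduced model of the new flag, assuming a toy struct cluster in place of the real recovery daemon state. set_recmaster() is a hypothetical stub that fails on an unfrozen node, which is the behaviour the message above describes for the real control.

```c
#include <stdbool.h>

/* Toy model of one node's view; not the ctdb_recoverd structure. */
struct cluster {
	bool frozen;
	int recmaster;
	int broadcasts;
};

/* Hypothetical stub: setting the recmaster only works on a frozen node. */
static int set_recmaster(struct cluster *c, int pnn)
{
	if (!c->frozen) {
		return -1;
	}
	c->recmaster = pnn;
	return 0;
}

/* Broadcast the election message; only touch the local recmaster
   when the caller says it is safe to do so. */
static int send_election_request(struct cluster *c, int pnn,
				 bool update_recmaster)
{
	c->broadcasts++;	/* stands in for ctdb_send_message() */

	if (update_recmaster) {
		if (set_recmaster(c, pnn) != 0) {
			return -1;
		}
	}
	return 0;
}
```

With update_recmaster set to false the request succeeds on an unfrozen node because the failing control is never attempted.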

commit 60e2cb175c449ae65793a3e1ffb60cf030a3a0d5
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Fri Jul 18 10:59:34 2008 +1000

    We can not assume that just because we could complete a TCP handshake
    to the remote node that
    1. we are in fact talking to a CTDB daemon, or
    2. if we are talking to a CTDB daemon, that it is operational.
    
    So we can not blindly mark the node as CONNECTED just because
    we can open a TCP connection.
    
    Instead we rely on the rule "if we did get a KEEPALIVE from the remote
    node, it is connected".
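
The rule can be illustrated with a small state sketch. Everything here is hypothetical apart from the NODE_FLAGS_DISCONNECTED name, and its numeric value is assumed for the example. A completed TCP handshake is recorded but does not change the node flags; only a received keepalive clears the disconnected bit.

```c
#include <stdbool.h>
#include <stdint.h>

#define NODE_FLAGS_DISCONNECTED 0x1u	/* value assumed for illustration */

/* Hypothetical per-node state. */
struct node {
	uint32_t flags;
	bool tcp_established;
};

static void node_init(struct node *n)
{
	n->flags = NODE_FLAGS_DISCONNECTED;
	n->tcp_established = false;
}

/* TCP handshake finished: note it, but do not mark the node connected. */
static void on_tcp_connect_complete(struct node *n)
{
	n->tcp_established = true;
}

/* A keepalive proves a working daemon is on the other end. */
static void on_keepalive_received(struct node *n)
{
	n->flags &= ~NODE_FLAGS_DISCONNECTED;
}

static bool node_is_connected(const struct node *n)
{
	return (n->flags & NODE_FLAGS_DISCONNECTED) == 0;
}
```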

commit 3d58f9b524a40c7b43a2a855212db090e9becefa
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Fri Jul 18 10:41:18 2008 +1000

    lower a debug statement

commit 554dcf16d37c8b9e4704df11d21fb272f30f5cec
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Fri Jul 18 10:38:51 2008 +1000

    lower a debug message

commit 52716d26eb84104d65828bed38e69f214a5fa824
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Thu Jul 17 18:53:54 2008 +1000

    Allow the fix-to-make-persistent-writes-safer to work with unpatched samba versions

commit 52a38487f981fd5981c02a7a063ad2c598591c10
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Thu Jul 17 18:47:20 2008 +1000

    Only decrement the "number of persistent writes in flight" counter when
    it is >0, or we will break if used against an unpatched samba server
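
A sketch of the guard, assuming an unsigned counter in a hypothetical struct client (the real field type is not shown in this mail). An unsigned counter decremented at zero would wrap to a very large value, and a signed one would go negative. Either way the "writes in flight" count would be wrong.

```c
#include <stdint.h>

/* Hypothetical client record; the counter type is an assumption. */
struct client {
	uint32_t num_persistent_updates;
};

/* Called when a persistent write completes or is cancelled. */
static void persistent_update_done(struct client *c)
{
	/* never go below zero, even if no matching "start" was counted */
	if (c->num_persistent_updates > 0) {
		c->num_persistent_updates--;
	}
}
```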

-----------------------------------------------------------------------

Summary of changes:
 config/events.d/50.samba |   96 +++++++++++++++++++++++++++++++++++++++++-----
 config/functions         |   25 +++++++++---
 include/ctdb_private.h   |    2 +-
 packaging/RPM/ctdb.spec  |   15 +++++++-
 server/ctdb_daemon.c     |    2 +-
 server/ctdb_logging.c    |   39 ++++++++++---------
 server/ctdb_persistent.c |    8 +++-
 server/ctdb_recoverd.c   |   39 +++++++++++--------
 tcp/tcp_connect.c        |    3 -
 9 files changed, 170 insertions(+), 59 deletions(-)


Changeset truncated at 500 lines:

diff --git a/config/events.d/50.samba b/config/events.d/50.samba
index 9aa21e2..498aa17 100755
--- a/config/events.d/50.samba
+++ b/config/events.d/50.samba
@@ -17,10 +17,81 @@ shift
     SAMBA_CLEANUP_PERIOD=10
 }
 
+# we keep a cached copy of smb.conf here
+smbconf_cache="$CTDB_BASE/state/samba/smb.conf.cache"
+
+
+#############################################
+# update the smb.conf cache in the foreground
+testparm_foreground_update() {
+    mkdir -p "$CTDB_BASE/state/samba" || exit 1
+    testparm -s 2> /dev/null | egrep -v 'registry.shares.=|include.=' > "$smbconf_cache"
+}
+
+#############################################
+# update the smb.conf cache in the background
+testparm_background_update() {
+    # if the cache doesn't exist, then update in the foreground
+    [ -f $smbconf_cache ] || {
+	testparm_foreground_update
+    }
+    # otherwise do a background update
+    (
+	tmpfile="${smbconf_cache}.$$"
+	testparm -s > $tmpfile 2> /dev/null &
+	# remember the pid of the testparm process
+	pid="$!"
+	# give it 10 seconds to run
+	timeleft=10
+	while [ $timeleft -gt 0 ]; do
+	    timeleft=$(($timeleft - 1))
+	    # see if the process still exists
+	    kill -0 $pid > /dev/null 2>&1 || {
+		# it doesn't exist, grab its exit status
+		wait $pid; status=$?
+		[ $status = 0 ] || {
+		    echo "50.samba: smb.conf background update exited with status $status"
+		    rm -f "${tmpfile}"
+		    exit 1
+		}		
+		# put the new smb.conf contents in the cache (atomic rename)
+		# make sure we remove references to the registry while doing 
+		# this to ensure that running testparm on the cache does
+		# not use the registry
+		egrep -v 'registry.shares.=|include.=' < "$tmpfile" > "${tmpfile}.2"
+		rm -f "$tmpfile"
+		mv -f "${tmpfile}.2" "$smbconf_cache" || {
+		    echo "50.samba: failed to update background cache"
+		    rm -f "${tmpfile}.2"
+		    exit 1
+		}
+		exit 0
+	    }
+	    # keep waiting for testparm to finish
+	    sleep 1
+	done
+	# it took more than 10 seconds - kill it off
+	rm -f "${tmpfile}"
+	kill -9 "$pid" > /dev/null 2>&1
+	echo "50.samba: timed out updating smbconf cache in background"
+	exit 1
+    ) &
+}
+
+##################################################
+# show the testparm output using a cached smb.conf 
+# to avoid registry access
+testparm_cat() {
+    [ -f $smbconf_cache ] || {
+	testparm_foreground_update
+    }
+    testparm -s "$smbconf_cache" "$@" 2>/dev/null
+}
+
 # function to see if ctdb manages winbind
 check_ctdb_manages_winbind() {
   [ -z "$CTDB_MANAGES_WINBIND" ] && {
-    secmode=`testparm -s --parameter-name=security 2> /dev/null`
+    secmode=`testparm_cat --parameter-name=security`
     case $secmode in
 	ADS|DOMAIN)
 	    CTDB_MANAGES_WINBIND="yes";
@@ -108,21 +179,26 @@ case $cmd in
 		touch $CTDB_BASE/state/samba/periodic_cleanup
 	}
 
-	[ "$CTDB_SAMBA_SKIP_CONF_CHECK" != "yes" ] && {
-		testparm -s 2>&1 | egrep '^WARNING|^ERROR|^Unknown' && {
-			echo "ERROR: testparm shows smb.conf is not clean"
-			exit 1
-		}
+	testparm_background_update
+
+	testparm_cat | egrep '^WARNING|^ERROR|^Unknown' && {
+	    testparm_foreground_update
+	    testparm_cat | egrep '^WARNING|^ERROR|^Unknown' && {
+		echo "ERROR: testparm shows smb.conf is not clean"
+		exit 1
+	    }
 	}
 
-	[ "$CTDB_SAMBA_SKIP_SHARE_CHECK" != "yes" ] && {
-		smb_dirs=`testparm -s 2> /dev/null | egrep '^[[:space:]]*path = '  | cut -d= -f2`
-		ctdb_check_directories "Samba" $smb_dirs	
+	smb_dirs=`testparm_cat | egrep '^[[:space:]]*path = ' | cut -d= -f2`
+	ctdb_check_directories_probe "Samba" $smb_dirs || {
+	    testparm_foreground_update
+	    smb_dirs=`testparm_cat | egrep '^[[:space:]]*path = ' | cut -d= -f2`
+	    ctdb_check_directories "Samba" $smb_dirs
 	}
 
 	smb_ports="$CTDB_SAMBA_CHECK_PORTS"
 	[ -z "$smb_ports" ] && {
-		smb_ports=`testparm -s --parameter-name="smb ports" 2> /dev/null`
+		smb_ports=`testparm_cat --parameter-name="smb ports"`
 	}
 	ctdb_check_tcp_ports "Samba" $smb_ports
 
diff --git a/config/functions b/config/functions
index d15c4b5..20325b1 100644
--- a/config/functions
+++ b/config/functions
@@ -145,19 +145,32 @@ ctdb_check_rpc() {
 
 ######################################################
 # check a set of directories is available
-# usage: ctdb_check_directories SERVICE_NAME <directories...>
+# return 1 on a missing directory, 0 otherwise
+# usage: ctdb_check_directories_probe SERVICE_NAME <directories...>
 ######################################################
-ctdb_check_directories() {
+ctdb_check_directories_probe() {
   service_name="$1"
   shift
   wait_dirs="$*"
   [ -z "$wait_dirs" ] && return;
   for d in $wait_dirs; do
-      [ -d $d ] || {
-	  echo "ERROR: $service_name directory $d not available"
-	  exit 1
-      }
+      [ -d $d ] || return 1
   done
+  return 0
+}
+
+######################################################
+# check a set of directories is available
+# usage: ctdb_check_directories SERVICE_NAME <directories...>
+######################################################
+ctdb_check_directories() {
+  service_name="$1"
+  shift
+  wait_dirs="$*"
+  ctdb_check_directories_probe "$service_name" $wait_dirs || {
+      echo "ERROR: $service_name directory $d not available"
+      exit 1
+  }
 }
 
 ######################################################
diff --git a/include/ctdb_private.h b/include/ctdb_private.h
index 66e7709..77d1092 100644
--- a/include/ctdb_private.h
+++ b/include/ctdb_private.h
@@ -1098,7 +1098,7 @@ struct ctdb_client_call_state {
 	struct ctdb_call *call;
 	struct {
 		void (*fn)(struct ctdb_client_call_state *);
-		void *private;
+		void *private_data;
 	} async;
 };
 
diff --git a/packaging/RPM/ctdb.spec b/packaging/RPM/ctdb.spec
index 84f9cf6..1ac4129 100644
--- a/packaging/RPM/ctdb.spec
+++ b/packaging/RPM/ctdb.spec
@@ -5,7 +5,7 @@ Vendor: Samba Team
 Packager: Samba Team <samba at samba.org>
 Name: ctdb
 Version: 1.0
-Release: 48
+Release: 50
 Epoch: 0
 License: GNU GPL version 3
 Group: System Environment/Daemons
@@ -118,6 +118,19 @@ fi
 %{_includedir}/ctdb_private.h
 
 %changelog
+* Fri Jul 18 2008 : Version 1.0.50
+ - Dont assume that just because we can establish a TCP connection
+   that we are actually talking to a functioning ctdb daemon.
+   So dont mark the node as CONNECTED just because the tcp handshake
+   was successful.
+ - Dont try to set the recmaster to ourself during elections for those
+   cases we know this will fail. To remove some annoying benign but scary
+   looking entries from the log.
+ - Bugfix for eventsystem for signal handling that could cause a node to
+   hang.
+* Thu Jul 17 2008 : Version 1.0.49
+ - Update the safe persistent update fix to work with unpatched samba
+   servers.
 * Thu Jul 17 2008 : Version 1.0.48
  - Update the spec file.
  - Do not start new user-triggered eventscripts if we are already
diff --git a/server/ctdb_daemon.c b/server/ctdb_daemon.c
index aeb0cbd..3978e28 100644
--- a/server/ctdb_daemon.c
+++ b/server/ctdb_daemon.c
@@ -53,7 +53,7 @@ static void flag_change_handler(struct ctdb_context *ctdb, uint64_t srvid,
 	ctdb->nodes[c->pnn]->flags = 
 		(ctdb->nodes[c->pnn]->flags&NODE_FLAGS_DISCONNECTED) 
 		| (c->new_flags & ~NODE_FLAGS_DISCONNECTED);	
-	DEBUG(DEBUG_INFO,("Node flags for node %u are now 0x%x\n", c->pnn, ctdb->nodes[c->pnn]->flags));
+	DEBUG(DEBUG_DEBUG,("Node flags for node %u are now 0x%x\n", c->pnn, ctdb->nodes[c->pnn]->flags));
 
 	/* make sure we don't hold any IPs when we shouldn't */
 	if (c->pnn == ctdb->pnn &&
diff --git a/server/ctdb_logging.c b/server/ctdb_logging.c
index 6ebc8c1..f551088 100644
--- a/server/ctdb_logging.c
+++ b/server/ctdb_logging.c
@@ -138,38 +138,39 @@ static void ctdb_log_handler(struct event_context *ev, struct fd_event *fde,
 			     uint16_t flags, void *private)
 {
 	struct ctdb_context *ctdb = talloc_get_type(private, struct ctdb_context);
-	int n1, n2;
 	char *p;
+	int n;
 
 	if (!(flags & EVENT_FD_READ)) {
 		return;
 	}
 	
-	n1 = read(ctdb->log->pfd, &ctdb->log->buf[ctdb->log->buf_used],
+	n = read(ctdb->log->pfd, &ctdb->log->buf[ctdb->log->buf_used],
 		 sizeof(ctdb->log->buf) - ctdb->log->buf_used);
-	if (n1 > 0) {
-		ctdb->log->buf_used += n1;
+	if (n > 0) {
+		ctdb->log->buf_used += n;
 	}
 
-	p = memchr(ctdb->log->buf, '\n', ctdb->log->buf_used);
-	if (!p) {
-		if (ctdb->log->buf_used == sizeof(ctdb->log->buf)) {
-			do_debug("%*.*s\n", 
-				 (int)ctdb->log->buf_used, (int)ctdb->log->buf_used, ctdb->log->buf);
-			ctdb->log->buf_used = 0;
+	while (ctdb->log->buf_used > 0 &&
+	       (p = memchr(ctdb->log->buf, '\n', ctdb->log->buf_used)) != NULL) {
+		int n1 = (p - ctdb->log->buf)+1;
+		int n2 = n1 - 1;
+		/* swallow \r from child processes */
+		if (n2 > 0 && ctdb->log->buf[n2-1] == '\r') {
+			n2--;
 		}
-		return;
+		do_debug("%*.*s\n", n2, n2, ctdb->log->buf);
+		memmove(ctdb->log->buf, p+1, sizeof(ctdb->log->buf) - n1);
+		ctdb->log->buf_used -= n1;
 	}
 
-	n1 = (p - ctdb->log->buf)+1;
-	n2 = n1 - 1;
-	/* swallow \r from child processes */
-	if (n2 > 0 && ctdb->log->buf[n2-1] == '\r') {
-		n2--;
+	/* the buffer could have completely filled - unfortunately we have
+	   no choice but to dump it out straight away */
+	if (ctdb->log->buf_used == sizeof(ctdb->log->buf)) {
+		do_debug("%*.*s\n", 
+			 (int)ctdb->log->buf_used, (int)ctdb->log->buf_used, ctdb->log->buf);
+		ctdb->log->buf_used = 0;
 	}
-	do_debug("%*.*s\n", n2, n2, ctdb->log->buf);
-	memmove(ctdb->log->buf, p+1, sizeof(ctdb->log->buf) - n1);
-	ctdb->log->buf_used -= n1;
 }
 
 
diff --git a/server/ctdb_persistent.c b/server/ctdb_persistent.c
index 66311a9..455ccba 100644
--- a/server/ctdb_persistent.c
+++ b/server/ctdb_persistent.c
@@ -89,7 +89,9 @@ int32_t ctdb_control_persistent_store(struct ctdb_context *ctdb,
 		DEBUG(DEBUG_ERR,(__location__ " can not match persistent_store to a client. Returning error\n"));
 		return -1;
 	}
-	client->num_persistent_updates--;
+	if (client->num_persistent_updates > 0) {
+		client->num_persistent_updates--;
+	}
 
 	state = talloc_zero(ctdb, struct ctdb_persistent_state);
 	CTDB_NO_MEMORY(ctdb, state);
@@ -454,7 +456,9 @@ int32_t ctdb_control_cancel_persistent_update(struct ctdb_context *ctdb,
 		return -1;
 	}
 
-	client->num_persistent_updates--;
+	if (client->num_persistent_updates > 0) {
+		client->num_persistent_updates--;
+	}
 
 	return 0;
 }
diff --git a/server/ctdb_recoverd.c b/server/ctdb_recoverd.c
index 69d867a..64a05a7 100644
--- a/server/ctdb_recoverd.c
+++ b/server/ctdb_recoverd.c
@@ -775,7 +775,7 @@ static void vacuum_fetch_next(struct vacuum_info *v);
  */
 static void vacuum_fetch_callback(struct ctdb_client_call_state *state)
 {
-	struct vacuum_info *v = talloc_get_type(state->async.private, struct vacuum_info);
+	struct vacuum_info *v = talloc_get_type(state->async.private_data, struct vacuum_info);
 	talloc_free(state);
 	vacuum_fetch_next(v);
 }
@@ -841,7 +841,7 @@ static void vacuum_fetch_next(struct vacuum_info *v)
 			return;
 		}
 		state->async.fn = vacuum_fetch_callback;
-		state->async.private = v;
+		state->async.private_data = v;
 		return;
 	}
 
@@ -1654,7 +1654,7 @@ static bool ctdb_election_win(struct ctdb_recoverd *rec, struct election_message
 /*
   send out an election request
  */
-static int send_election_request(struct ctdb_recoverd *rec, uint32_t pnn)
+static int send_election_request(struct ctdb_recoverd *rec, uint32_t pnn, bool update_recmaster)
 {
 	int ret;
 	TDB_DATA election_data;
@@ -1670,19 +1670,26 @@ static int send_election_request(struct ctdb_recoverd *rec, uint32_t pnn)
 	election_data.dptr  = (unsigned char *)&emsg;
 
 
-	/* first we assume we will win the election and set 
-	   recoverymaster to be ourself on the current node
-	 */
-	ret = ctdb_ctrl_setrecmaster(ctdb, CONTROL_TIMEOUT(), pnn, pnn);
-	if (ret != 0) {
-		DEBUG(DEBUG_ERR, (__location__ " failed to send recmaster election request\n"));
-		return -1;
-	}
-
-
 	/* send an election message to all active nodes */
 	ctdb_send_message(ctdb, CTDB_BROADCAST_ALL, srvid, election_data);
 
+
+	/* A new node that is already frozen has entered the cluster.
+	   The existing nodes are not frozen and dont need to be frozen
+	   until the election has ended and we start the actual recovery
+	*/
+	if (update_recmaster == true) {
+		/* first we assume we will win the election and set 
+		   recoverymaster to be ourself on the current node
+		 */
+		ret = ctdb_ctrl_setrecmaster(ctdb, CONTROL_TIMEOUT(), pnn, pnn);
+		if (ret != 0) {
+			DEBUG(DEBUG_ERR, (__location__ " failed to send recmaster election request\n"));
+			return -1;
+		}
+	}
+
+
 	return 0;
 }
 
@@ -1720,7 +1727,7 @@ static void election_send_request(struct event_context *ev, struct timed_event *
 	struct ctdb_recoverd *rec = talloc_get_type(p, struct ctdb_recoverd);
 	int ret;
 
-	ret = send_election_request(rec, ctdb_get_pnn(rec->ctdb));
+	ret = send_election_request(rec, ctdb_get_pnn(rec->ctdb), false);
 	if (ret != 0) {
 		DEBUG(DEBUG_ERR,("Failed to send election request!\n"));
 	}
@@ -1856,7 +1863,7 @@ static void force_election(struct ctdb_recoverd *rec, uint32_t pnn,
 						timeval_current_ofs(ctdb->tunable.election_timeout, 0), 
 						ctdb_election_timeout, rec);
 
-	ret = send_election_request(rec, pnn);
+	ret = send_election_request(rec, pnn, true);
 	if (ret!=0) {
 		DEBUG(DEBUG_ERR, (__location__ " failed to initiate recmaster election"));
 		return;
@@ -2901,7 +2908,7 @@ again:
 	}
 
 
-	DEBUG(DEBUG_INFO, (__location__ " Update flags on all nodes\n"));
+	DEBUG(DEBUG_DEBUG, (__location__ " Update flags on all nodes\n"));
 	/*
 	  update all nodes to have the same flags that we have
 	 */
diff --git a/tcp/tcp_connect.c b/tcp/tcp_connect.c
index f3b4f7d..906a665 100644
--- a/tcp/tcp_connect.c
+++ b/tcp/tcp_connect.c
@@ -100,9 +100,6 @@ static void ctdb_node_connect_write(struct event_context *ev, struct fd_event *f
 
 	/* the queue subsystem now owns this fd */
 	tnode->fd = -1;
-       
-	/* tell the ctdb layer we are connected */
-	node->ctdb->upcalls->node_connected(node);
 }
 
 


-- 
CTDB repository

