[SCM] CTDB repository - branch master updated - ctdb-1.11-75-gdd9f82d

Ronnie Sahlberg sahlberg at samba.org
Sun Oct 16 22:05:10 MDT 2011


The branch, master has been updated
       via  dd9f82dbe2346c7143b0229e3611c402ee8c4025 (commit)
       via  8c3b6427dbaade87e1a0f5590f0894c2e69b31a3 (commit)
      from  1198df0fd2c90cbca86d0499b43562fac4f25731 (commit)

http://gitweb.samba.org/?p=ctdb.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit dd9f82dbe2346c7143b0229e3611c402ee8c4025
Merge: 8c3b6427dbaade87e1a0f5590f0894c2e69b31a3 1198df0fd2c90cbca86d0499b43562fac4f25731
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Mon Oct 17 15:08:39 2011 +1100

    Merge branch 'master' of ssh://git.samba.org/data/git/ctdb

commit 8c3b6427dbaade87e1a0f5590f0894c2e69b31a3
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Oct 7 15:00:42 2011 +1100

    Make ctdb_diagnostics more resilient to uncontactable nodes.
    
    Current behaviour is for onnode to time out (after about 20s) on each
    attempted ssh to a down node.  With 40 or 50 invocations of onnode
    this takes a long time.
    
    2 changes to work around this:
    
    * If EXTRA_SSH_OPTS (which is passed to ssh by onnode) does not
      contain a ConnectTimeout= setting then add a setting for a 5 second
      timeout.
    
    * Filter the nodes before starting any diagnosis, taking out any "bad
      nodes" that are uncontactable via onnode.
    
      In the nodes summary at the beginning of the output, print
      information about any "bad nodes".
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

-----------------------------------------------------------------------
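
The first change in the log above (defaulting the ssh timeout) can be
tried in isolation with a sketch like the one below.  The function name
default_ssh_timeout is hypothetical and does not exist in
ctdb_diagnostics; the real script modifies and exports EXTRA_SSH_OPTS
directly rather than printing a value.

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: print the ssh options that
# result from applying the 5 second default to the string given as $1.
default_ssh_timeout ()
{
    case "$1" in
        # The caller already chose a timeout, so leave the options alone.
        *ConnectTimeout=*) echo "$1" ;;
        # No timeout present, so append the 5 second default.
        *) echo "$1 -o ConnectTimeout=5" ;;
    esac
}

default_ssh_timeout "-o ConnectTimeout=30"
default_ssh_timeout "-o BatchMode=yes"
```

The pattern match is a plain substring test, so any existing
ConnectTimeout= value is respected, whatever it is.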

Summary of changes:
 tools/ctdb_diagnostics |   34 +++++++++++++++++++++++++++++++++-
 1 files changed, 33 insertions(+), 1 deletions(-)


Changeset truncated at 500 lines:

diff --git a/tools/ctdb_diagnostics b/tools/ctdb_diagnostics
index cf166ec..117def8 100755
--- a/tools/ctdb_diagnostics
+++ b/tools/ctdb_diagnostics
@@ -18,6 +18,7 @@ EOF
 }
 
 nodes=$(ctdb listnodes -Y | cut -d: -f2)
+bad_nodes=""
 diff_opts=
 no_ads=false
 
@@ -45,6 +46,25 @@ parse_options ()
 
 parse_options "$@"
 
+# Use 5s ssh timeout if EXTRA_SSH_OPTS doesn't set a timeout.
+case "$EXTRA_SSH_OPTS" in
+    *ConnectTimeout=*) : ;;
+    *)
+	export EXTRA_SSH_OPTS="${EXTRA_SSH_OPTS} -o ConnectTimeout=5"
+esac
+
+# Filter nodes.  Remove any nodes we can't contact from $nodes and add
+# them to $bad_nodes.
+_nodes=""
+for _i in $nodes ; do
+    if onnode $_i true >/dev/null 2>&1 ; then
+	_nodes="${_nodes}${_nodes:+ }${_i}"
+    else
+	bad_nodes="${bad_nodes}${bad_nodes:+,}${_i}"
+    fi
+done
+nodes="$_nodes"
+
 nodes_comma=$(echo $nodes | sed -e 's@[[:space:]]@,@g')
 
 PATH="$PATH:/sbin:/usr/sbin:/usr/lpp/mmfs/bin"
@@ -138,11 +158,23 @@ NUM_ERRORS=0
 cat <<EOF
 Diagnosis started on these nodes:
 $nodes_comma
+EOF
+
+if [ -n "$bad_nodes" ] ; then
+    cat <<EOF
+
+NOT RUNNING DIAGNOSTICS on these uncontactable nodes:
+$bad_nodes
+EOF
+
+fi
+
+cat <<EOF
 
 For reference, here is the nodes file on the current node...
 EOF
-show_file /etc/ctdb/nodes
 
+show_file /etc/ctdb/nodes
 
 cat <<EOF
 --------------------------------------------------------------------
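
The node filtering loop from the patch can be exercised without a
cluster by replacing onnode with a stub.  In the sketch below the stub
and the node numbers are assumptions made for illustration: node 2 is
treated as uncontactable and every other node as reachable.

```shell
#!/bin/sh
# Stand-in for the real onnode command (an assumption for this sketch):
# it fails for node 2 and succeeds for every other node.
onnode ()
{
    [ "$1" != "2" ]
}

nodes="0 1 2 3"
bad_nodes=""
_nodes=""
for _i in $nodes ; do
    if onnode "$_i" true >/dev/null 2>&1 ; then
        # ${_nodes:+ } expands to a space only when _nodes is non-empty,
        # so the list never starts with a separator.
        _nodes="${_nodes}${_nodes:+ }${_i}"
    else
        # Same idiom, with a comma separating the uncontactable nodes.
        bad_nodes="${bad_nodes}${bad_nodes:+,}${_i}"
    fi
done
nodes="$_nodes"

echo "good: $nodes"
echo "bad: $bad_nodes"
```

Building both lists in a single pass means each node is probed only
once, which matters when every failed probe costs a full ssh timeout.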


-- 
CTDB repository