[SCM] CTDB repository - branch master updated - ctdb-1.0.86-52-g2ff6ee0

Wed Jul 8 04:28:11 GMT 2009

The branch, master has been updated
       via  2ff6ee042080ba1c2bea76bbef3742997d84c9a8 (commit)
       via  823019870c0831258b96654646f71e9dd69317ec (commit)
       via  de0b58e18fcc0f90075fca74077ab62ae8dab5da (commit)
       via  ee7caae3a55a64fb50cd28fa2fd4663c5dd83b4f (commit)
       via  1cac8a0ad429f29d1508158c7f7c42a2f1a22945 (commit)
       via  bdb856ee22816ae1f6b8d15856555f488054f489 (commit)
       via  92011cc05bbdb517ec6a4573f5cb9f6f21c3059e (commit)
       via  8e2a89935a969340bfead8ed040d74703947cb81 (commit)
       via  c2bdb77d91761c003e2f0e6918a27c54150f6030 (commit)
       via  e309cb3f95efcf6cff7d7c19713d7b161a138383 (commit)
       via  b6fa044a1364cbb3008085041453ee4885f7ced1 (commit)
       via  c97d56d93d9c1007a4e85affb19ed0c2d0e11b6d (commit)
       via  d440e83bb4f0c19c085915d0f0e87cc0dabbc569 (commit)
       via  8ddd5165f573fc6beaae589b86a6afa4bc17f32a (commit)
       via  10531b50e2d306a5e62b8d488a1acc9e75b0ad4b (commit)
       via  31cc46eb157ca1301312f14879e4fb4da7d81088 (commit)
       via  d5ca4ab325fce1f81361a4d79810cb543979ce57 (commit)
       via  d6e6909ac629212b3028e13b958e1a17c64bee8c (commit)
       via  92be87b5bfed7882b48f4034c82dfdb031f3afdc (commit)
       via  135b72828fc76856fa8f6d7f9c820120de05596b (commit)
       via  951dbcb29fd53cf51a08958efe185db4954d24f3 (commit)
       via  1ea6af7007fe3b5a48d48440a0924c71d7a6000a (commit)
       via  ee5d49324155e3e51371f6f8e5ed9eef4179f08d (commit)
      from  b67946a6f6b185a7920bf1e560988417c8c4d87d (commit)

http://gitweb.samba.org/?p=sahlberg/ctdb.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit 2ff6ee042080ba1c2bea76bbef3742997d84c9a8
Merge: de0b58e18fcc0f90075fca74077ab62ae8dab5da 823019870c0831258b96654646f71e9dd69317ec
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jul 8 14:21:36 2009 +1000

    Merge branch 'ronnie_merge'

commit 823019870c0831258b96654646f71e9dd69317ec
Merge: ee7caae3a55a64fb50cd28fa2fd4663c5dd83b4f b67946a6f6b185a7920bf1e560988417c8c4d87d
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jul 8 14:21:05 2009 +1000

    Merge commit 'origin/master' into ronnie_merge
    
    Conflicts:
    	config/ctdb.init
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit de0b58e18fcc0f90075fca74077ab62ae8dab5da
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jul 8 13:37:52 2009 +1000

    Test suite: new tests and code factoring.
    
    * 2 new tests for NFS failover.
    
    * Factor repeated code from tests into new functions
      select_test_node_and_ips(), gratarp_sniff_start() and
      gratarp_sniff_wait_show().  Use these new functions in existing and
      new tests.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit ee7caae3a55a64fb50cd28fa2fd4663c5dd83b4f
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 6 17:52:11 2009 +1000

    Test suite: better debug info when the cluster is unexpectedly unhealthy.
    
    cluster_is_healthy() is now run locally in tests and internally causes
    _cluster_is_healthy() to be run on node 0.  When it detects that the
    cluster is unhealthy and $ctdb_test_restart_scheduled is not true,
    debug information is printed.  This replaces the previous use of
    $CTDB_TEST_CLEANING_UP.
    
    To avoid spurious debug on expected restarts, added scheduled
    restarts to several tests.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 1cac8a0ad429f29d1508158c7f7c42a2f1a22945
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 6 16:40:31 2009 +1000

    Make ctdbd restarts in tests more reliable.
    
    This works around potential race conditions in the init script where
    the restart operation is not necessarily reliable.  It just wraps the
    actual restart in a loop and tries for a successful restart up to 5
    times.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
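
The retry wrapper described in this commit could look roughly like the sketch below. It is an illustration only, not the code from the commit: the function name retry_up_to and the RETRY_DELAY variable are hypothetical, and the real test function may differ in detail.

```shell
# Hypothetical helper: run a command until it succeeds, giving up
# after a fixed number of attempts.
retry_up_to ()
{
    local max="$1"
    shift
    local i=1
    while [ "$i" -le "$max" ] ; do
        if "$@" ; then
            echo "Succeeded on attempt ${i}: $*"
            return 0
        fi
        echo "Attempt ${i} of ${max} failed: $*"
        i=$((i + 1))
        # RETRY_DELAY is an assumed knob; default to a 1 second pause.
        sleep "${RETRY_DELAY:-1}"
    done
    return 1
}

# Assumed usage, with the init script invocation as a placeholder:
#   retry_up_to 5 service ctdb restart
```

Wrapping the whole restart, rather than patching the init script, keeps the workaround local to the test suite.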

commit bdb856ee22816ae1f6b8d15856555f488054f489
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jul 6 16:39:08 2009 +1000

    When testing make the time taken for some operations more obvious.
    
    If wait_until() does not timeout, print the time taken for the command
    to succeed.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
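
As a rough sketch of the idea, a polling function can record its start time and report the elapsed time on success. The name timed_wait_until and the message format are assumptions made for illustration; the real wait_until() in ctdb_test_functions.bash is not shown in this changeset.

```shell
# Hypothetical variant of a wait_until-style helper: poll a command
# once per second until it succeeds or the timeout (seconds) expires.
timed_wait_until ()
{
    local timeout="$1"
    shift
    local start now
    start=$(date +%s)
    while : ; do
        if "$@" ; then
            now=$(date +%s)
            # On success, make the time taken obvious.
            echo "OK after $((now - start))s: $*"
            return 0
        fi
        now=$(date +%s)
        if [ $((now - start)) -ge "$timeout" ] ; then
            echo "TIMEOUT after ${timeout}s: $*"
            return 1
        fi
        sleep 1
    done
}
```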

commit 92011cc05bbdb517ec6a4573f5cb9f6f21c3059e
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 3 20:55:02 2009 +1000

    New tests for different aspects of failover.
    
    3 separate tests:
    
    * Check that gratuitous ARPs are received and take effect.
    
    * Check that ping still works after failover.
    
    * Check, via SSH, that the hostname changes after failover.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 8e2a89935a969340bfead8ed040d74703947cb81
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 3 20:44:55 2009 +1000

    Updates to TCP tickle tests and supporting functions.
    
    * Removed a race from tcpdump_start().  It seems impossible to tell
      when tcpdump is actually ready to capture packets.  So this function
      now generates some dummy ping packets and waits until it sees them
      in the output file.
    
    * tcpdump_start() sets $tcpdump_filter.  This is the default filter
      for tcpdump_wait() and tcpdump_show(), but other filters may be
      passed to those functions.
    
    * New functions tcptickle_sniff_start() and
      tcptickle_sniff_wait_show() handle capturing TCP tickle packets.
      These are used by complex/31_nfs_tickle.sh and
      complex/32_cifs_tickle.sh.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit c2bdb77d91761c003e2f0e6918a27c54150f6030
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 3 18:01:29 2009 +1000

    Add an extra ctdb recovery to test function restart_ctdb().
    
    There are still very rare cases where IPs haven't been reallocated
    before the beginning of the next test, so this adds a sleep and an
    extra call to "ctdb recover" to restart_ctdb().
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit e309cb3f95efcf6cff7d7c19713d7b161a138383
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 3 17:58:38 2009 +1000

    Fix the run_tests script so that the number of columns is never 0.
    
    Sometimes "stty size" reports 0, for example when running in a shell
    under Emacs.  In this case, we just change it to 80.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
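
A minimal sketch of this kind of fallback follows. The function name sane_columns is hypothetical, and the sketch also treats empty or non-numeric values as 80, which goes slightly beyond what the commit message describes.

```shell
# Hypothetical helper: turn a reported column count into a usable one.
sane_columns ()
{
    case "$1" in
        ""|*[!0-9]*)
            # Nothing usable reported (assumed extra case).
            echo 80
            ;;
        *)
            if [ "$1" -eq 0 ] ; then
                # "stty size" can report 0, e.g. in a shell under Emacs.
                echo 80
            else
                echo "$1"
            fi
            ;;
    esac
}

# "stty size" prints "rows columns"; it fails if stdin is not a terminal.
size=$(stty size 2>/dev/null) || size=""
columns=$(sane_columns "${size#* }")
```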

commit b6fa044a1364cbb3008085041453ee4885f7ced1
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jul 3 17:40:16 2009 +1000

    Separate test cleanup code in output and clean up ctdb restart code.
    
    * ctdb_restart_when_done() now schedules a restart by setting an
      explicit variable that is respected in ctdb_test_exit(), rather than
      adding a restart to $ctdb_test_exit_hook.  This means that restarts
      are all done in one place.
    
    * ctdb_test_exit() turns off "set -e" to make sure that all cleanup
      happens.
    
    * ctdb_test_exit() now prints a clear message indicating where the
      test ends and the cleanup begins.  This message also includes the
      return code of the test.
    
    * Add debug in cluster_is_healthy to try to capture information about
      unexpected unhealthiness when a test starts.
    
    * Simplify simple/07_ctdb_process_exists.sh so that the exit code is
      generated more obviously.
    
    * Remove redundant calls to ctdb_test_exit at the end of tests, since
      they're done automatically via a trap.  Also remove any preceding
      warnings of restarts or final hints about test success/failure.
    
    * Allow multi-digit debug levels in simple/12_ctdb_getdebug.sh and
      simple/13_ctdb_setdebug.sh.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit c97d56d93d9c1007a4e85affb19ed0c2d0e11b6d
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jun 19 12:12:39 2009 +1000

    Fix minor onnode bugs relating to local daemons.
    
    Commit a0f5148ac749758e2dfbd6099e829c5bf1d900e6 caused a subtle
    regression.  Due to the subtlety, this description is much longer than
    the 1 line patch that fixes it!  The regression, where a process that
    invokes onnode is unexpectedly blocked, is only apparent if the
    following conditions are met:
    
    1. $CTDB_NODES_SOCKETS is set;
    2. The command passed to onnode attempts to background a process; and
    3. onnode is run in certain types of subshell (e.g. foo=$(onnode ...)).
    
    In particular, when testing against local daemons (i.e. condition (1)
    is met), tests/simple/07_ctdb_process_exists.sh would fail (because it
    does both (2), (3)).
    
    The problem is caused by the use of file descriptor 3 in the code that
    allows separate filtering of stdout and stderr.  A backgrounded
    process will have this descriptor open and the $(...) construct
    appears to wait for all file descriptors to be closed.  This only
    happens with local daemons because SSH is replaced by a shell and file
    descriptor 3 leaks into that shell.  It does not occur when SSH is
    used because the file descriptor does not leak into the remote shell
    where the process is backgrounded.
    
    The fix is simply to redirect file descriptor 3 to /dev/null in the
    fakessh function, which is used when $CTDB_NODES_SOCKETS is set.
    
    Also fixed is another minor bug when the -o option and
    $CTDB_NODES_SOCKETS are used in combination.  The code uses the node
    name as a suffix for the output filename(s).  Usually this is an IP
    address.  However, when $CTDB_NODES_SOCKETS is in use the node name is
    the socket name, which might be a path several directories deep.
    Each output file is created via a simple redirection and this would
    fail if unexpected directories appear in the filename.  3 possible
    fixes were considered:
    
    1. Replace all '/'s in the node name by '_'s.  Nice and simple.
    2. Use the basename of the node name.  However, sockets may be in
       different directories but have the same basename.
    3. Create all required directories before redirecting.  This is a
       little more complex and probably doesn't meet the user's
       expectations.
    
    Option (1) is implemented here.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
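
Option (1) can be sketched as below. The function name output_file_for_node and the example paths are hypothetical; onnode's actual variable names are not shown in this changeset.

```shell
# Hypothetical helper: build an output filename from a prefix and a
# node name, replacing every '/' in the node name with '_' so that a
# socket path cannot introduce unexpected directories.
output_file_for_node ()
{
    local prefix="$1" node="$2"
    local suffix
    suffix=$(printf '%s' "$node" | tr '/' '_')
    echo "${prefix}.${suffix}"
}

# An IP address passes through unchanged:
#   output_file_for_node out 10.0.0.1         gives out.10.0.0.1
# A socket path is flattened into a single filename component:
#   output_file_for_node out /tmp/n1/ctdb.sock gives out._tmp_n1_ctdb.sock
```

Unlike the basename approach in option (2), two sockets with the same basename in different directories still map to distinct output files.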

commit d440e83bb4f0c19c085915d0f0e87cc0dabbc569
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jun 19 11:40:09 2009 +1000

    Clean up the handling of CTDB restarts in testcases.
    
    Glitches during restarts of the CTDB cluster have been causing some
    tests to fail.  This is because restarts are initiated in the body of
    many tests.  This adds a simple function ctdb_restart_when_done, which
    schedules a restart using an existing hook in the test exit code.
    This function is now used in tests that need to restart CTDB.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 8ddd5165f573fc6beaae589b86a6afa4bc17f32a
Merge: 10531b50e2d306a5e62b8d488a1acc9e75b0ad4b 46e8c3737e6ff54fc80de8e962e922924c27bc35
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jun 16 12:56:55 2009 +1000

    Merge commit 'origin/master'

commit 10531b50e2d306a5e62b8d488a1acc9e75b0ad4b
Merge: d5ca4ab325fce1f81361a4d79810cb543979ce57 31cc46eb157ca1301312f14879e4fb4da7d81088
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jun 16 12:52:10 2009 +1000

    Merge branch 'new_tests'

commit 31cc46eb157ca1301312f14879e4fb4da7d81088
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jun 16 12:47:59 2009 +1000

    New tests for NFS and CIFS tickles.
    
    New tests/complex/ subdirectory contains 2 new tests to ensure that
    NFS and CIFS connections are tracked by CTDB and that tickle resets
    are sent when a node is disabled.
    
    Changes to ctdb_test_functions.bash to support these tests.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit d5ca4ab325fce1f81361a4d79810cb543979ce57
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jun 16 12:42:29 2009 +1000

    Increase threshold in 51_ctdb_bench from 2% to 5%.
    
    The threshold for the difference in the number of messages sent in
    either direction around the ring of nodes was set to 2%.  Something
    environmental is causing this difference to sometimes be as high as 3%.
    We're confident it isn't a CTDB issue so we're increasing the
    threshold to 5%.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
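
One way to express such a percentage check with integer arithmetic is sketched below. The function name within_threshold is hypothetical, and measuring the difference relative to the larger of the two counts is an assumption; 51_ctdb_bench.sh may compute it differently.

```shell
# Hypothetical check: succeed if two message counts differ by no more
# than the given percentage of the larger count.
within_threshold ()
{
    local a="$1" b="$2" percent="$3"
    local diff=$((a - b))
    if [ "$diff" -lt 0 ] ; then
        diff=$((0 - diff))
    fi
    local max="$a"
    if [ "$b" -gt "$max" ] ; then
        max="$b"
    fi
    # diff/max <= percent/100, rearranged to avoid floating point.
    [ $((diff * 100)) -le $((max * percent)) ]
}

# With counts of 1000 and 970 (a 3% difference):
#   within_threshold 1000 970 2   fails
#   within_threshold 1000 970 5   succeeds
```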

commit d6e6909ac629212b3028e13b958e1a17c64bee8c
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Jun 10 16:15:09 2009 +1000

    Make 51_ctdb_bench.sh more tolerant.
    
    Limit the allowable difference in message counts in either direction
    around the ring to 5% (up from 2%).  There is something environmental
    making this blow out to 3% very occasionally when there's no obvious
    problem with ctdb.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 92be87b5bfed7882b48f4034c82dfdb031f3afdc
Merge: 135b72828fc76856fa8f6d7f9c820120de05596b 951dbcb29fd53cf51a08958efe185db4954d24f3
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jun 2 16:40:01 2009 +1000

    Merge branch 'init_rewrite'
    
    Conflicts:
    	config/ctdb.init
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 135b72828fc76856fa8f6d7f9c820120de05596b
Merge: 1ea6af7007fe3b5a48d48440a0924c71d7a6000a f236fa289f3115b1f4eb108eb668392dc520f61a
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jun 2 16:29:25 2009 +1000

    Merge commit 'origin/master'

commit 951dbcb29fd53cf51a08958efe185db4954d24f3
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jun 2 16:07:08 2009 +1000

    Initscript cleanups.
    
    * Move building of CTDB_OPTIONS to new function build_ctdb_options()
      and have it use a helper function for readability.
    
    * New functions check_persistent_databases() and set_ctdb_variables().
    
    * Remove valgrind-specific stop code, since the general pkill should
      kill ctdbd when running under valgrind.
    
    * Remove some bash-isms (e.g. >& /dev/null) since the script is /bin/sh.
    
    * Make indentation consistent.
    
    * Minor clean-ups.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 1ea6af7007fe3b5a48d48440a0924c71d7a6000a
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jun 2 15:54:04 2009 +1000

    Fix minor problem in previous initscript commit.
    
    The valgrind start case should not use daemon, since this is specific
    to Red Hat.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit ee5d49324155e3e51371f6f8e5ed9eef4179f08d
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Jun 2 10:01:50 2009 +1000

    Initscript fixes, mostly for "stop" action.
    
    Use a local variable $ctdbd so that we always run ctdbd from the
    same place and so that we know what to kill.  This variable respects
    the $CTDBD environment variable, which may be used to specify an
    alternative location for the daemon.
    
    In the important cases use "pkill -0 -f" to check if ctdbd is
    running.  Also, remove the special case for killing ctdbd when running
    under valgrind.  The regular case will handle this just fine.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

-----------------------------------------------------------------------

Summary of changes:
 tests/complex/31_nfs_tickle.sh             |   17 +----
 tests/complex/32_cifs_tickle.sh            |   17 +----
 tests/complex/33_gratuitous_arp.sh         |   26 +------
 tests/complex/41_failover_ping_discrete.sh |   26 +------
 tests/complex/42_failover_ssh_hostname.sh  |   26 +------
 tests/complex/43_failover_nfs_basic.sh     |   86 ++++++++++++++++++++++
 tests/complex/44_failover_nfs_oneway.sh    |  106 ++++++++++++++++++++++++++++
 tests/scripts/ctdb_test_functions.bash     |   34 +++++++++
 tests/simple/31_ctdb_disable.sh            |   18 +----
 tests/simple/32_ctdb_enable.sh             |   20 +-----
 10 files changed, 242 insertions(+), 134 deletions(-)
 create mode 100755 tests/complex/43_failover_nfs_basic.sh
 create mode 100755 tests/complex/44_failover_nfs_oneway.sh


Changeset truncated at 500 lines:

diff --git a/tests/complex/31_nfs_tickle.sh b/tests/complex/31_nfs_tickle.sh
index bbea663..45734cc 100755
--- a/tests/complex/31_nfs_tickle.sh
+++ b/tests/complex/31_nfs_tickle.sh
@@ -57,23 +57,8 @@ try_command_on_node 0 $CTDB getvar MonitorInterval
 monitor_interval="${out#*= }"
 #echo "Monitor interval on node $test_node is $monitor_interval seconds."
 
-echo "Getting list of public IPs..."
-try_command_on_node 0 "$CTDB ip -n all | sed -e '1d'"
+select_test_node_and_ips
 
-# When selecting test_node we just want a node that has public IPs.
-# This will work and is economically semi-randomly.  :-)
-read x test_node <<<"$out"
-
-ips=""
-while read ip pnn ; do
-    if [ "$pnn" = "$test_node" ] ; then
-	ips="${ips}${ips:+ }${ip}"
-    fi
-done <<<"$out" # bashism to avoid problem setting variable in pipeline.
-
-echo "Selected node ${test_node} with IPs: $ips"
-
-test_ip="${ips%% *}"
 test_port=2049
 
 echo "Connecting to node ${test_node} on IP ${test_ip}:${test_port} with netcat..."
diff --git a/tests/complex/32_cifs_tickle.sh b/tests/complex/32_cifs_tickle.sh
index d024e7f..94b2861 100755
--- a/tests/complex/32_cifs_tickle.sh
+++ b/tests/complex/32_cifs_tickle.sh
@@ -56,23 +56,8 @@ try_command_on_node 0 $CTDB getvar MonitorInterval
 monitor_interval="${out#*= }"
 #echo "Monitor interval on node $test_node is $monitor_interval seconds."
 
-echo "Getting list of public IPs..."
-try_command_on_node 0 "$CTDB ip -n all | sed -e '1d'"
+select_test_node_and_ips
 
-# When selecting test_node we just want a node that has public IPs.
-# This will work and is economically semi-randomly.  :-)
-read x test_node <<<"$out"
-
-ips=""
-while read ip pnn ; do
-    if [ "$pnn" = "$test_node" ] ; then
-	ips="${ips}${ips:+ }${ip}"
-    fi
-done <<<"$out" # bashism to avoid problem setting variable in pipeline.
-
-echo "Selected node ${test_node} with IPs: $ips"
-
-test_ip="${ips%% *}"
 test_port=445
 
 echo "Connecting to node ${test_node} on IP ${test_ip}:${test_port} with netcat..."
diff --git a/tests/complex/33_gratuitous_arp.sh b/tests/complex/33_gratuitous_arp.sh
index c5e8b81..e94a914 100755
--- a/tests/complex/33_gratuitous_arp.sh
+++ b/tests/complex/33_gratuitous_arp.sh
@@ -51,23 +51,7 @@ cluster_is_healthy
 # Reset configuration
 ctdb_restart_when_done
 
-echo "Getting list of public IPs..."
-try_command_on_node 0 "$CTDB ip -n all | sed -e '1d'"
-
-# When selecting test_node we just want a node that has public IPs.
-# This will work and is economically semi-randomly.  :-)
-read x test_node <<<"$out"
-
-ips=""
-while read ip pnn ; do
-    if [ "$pnn" = "$test_node" ] ; then
-	ips="${ips}${ips:+ }${ip}"
-    fi
-done <<<"$out" # bashism to avoid problem setting variable in pipeline.
-
-echo "Selected node ${test_node} with IPs: $ips"
-
-test_ip="${ips%% *}"
+select_test_node_and_ips
 
 echo "Removing ${test_ip} from the local ARP table..."
 arp -d $test_ip >/dev/null 2>&1 || true
@@ -81,17 +65,13 @@ original_mac=$(arp -n $test_ip | awk '$2 == "ether" {print $3}')
 
 echo "MAC address is: ${original_mac}"
 
-filter="arp net ${test_ip}"
-tcpdump_start "$filter"
+gratarp_sniff_start
 
 echo "Disabling node $test_node"
 try_command_on_node 1 $CTDB disable -n $test_node
 onnode 0 $CTDB_TEST_WRAPPER wait_until_node_has_status $test_node disabled
 
-tcpdump_wait 2
-
-echo "GOOD: this should be the gratuitous ARP and the reply:"
-tcpdump_show
+gratarp_sniff_wait_show
 
 echo "Getting MAC address associated with ${test_ip} again..."
 new_mac=$(arp -n $test_ip | awk '$2 == "ether" {print $3}')
diff --git a/tests/complex/41_failover_ping_discrete.sh b/tests/complex/41_failover_ping_discrete.sh
index f9351e4..32841c5 100755
--- a/tests/complex/41_failover_ping_discrete.sh
+++ b/tests/complex/41_failover_ping_discrete.sh
@@ -45,23 +45,7 @@ cluster_is_healthy
 # Reset configuration
 ctdb_restart_when_done
 
-echo "Getting list of public IPs..."
-try_command_on_node 0 "$CTDB ip -n all | sed -e '1d'"
-
-# When selecting test_node we just want a node that has public IPs.
-# This will work and is economically semi-randomly.  :-)
-read x test_node <<<"$out"
-
-ips=""
-while read ip pnn ; do
-    if [ "$pnn" = "$test_node" ] ; then
-	ips="${ips}${ips:+ }${ip}"
-    fi
-done <<<"$out" # bashism to avoid problem setting variable in pipeline.
-
-echo "Selected node ${test_node} with IPs: $ips"
-
-test_ip="${ips%% *}"
+select_test_node_and_ips
 
 echo "Removing ${test_ip} from the local ARP table..."
 arp -d $test_ip >/dev/null 2>&1 || true
@@ -69,17 +53,13 @@ arp -d $test_ip >/dev/null 2>&1 || true
 echo "Pinging ${test_ip}..."
 ping -q -n -c 1 $test_ip
 
-filter="arp net ${test_ip}"
-tcpdump_start "$filter"
+gratarp_sniff_start
 
 echo "Disabling node $test_node"
 try_command_on_node 1 $CTDB disable -n $test_node
 onnode 0 $CTDB_TEST_WRAPPER wait_until_node_has_status $test_node disabled
 
-tcpdump_wait 2
-
-echo "GOOD: this should be the gratuitous ARP and the reply:"
-tcpdump_show
+gratarp_sniff_wait_show
 
 echo "Removing ${test_ip} from the local ARP table again..."
 arp -d $test_ip >/dev/null 2>&1 || true
diff --git a/tests/complex/42_failover_ssh_hostname.sh b/tests/complex/42_failover_ssh_hostname.sh
index 7aa9cd8..1965248 100755
--- a/tests/complex/42_failover_ssh_hostname.sh
+++ b/tests/complex/42_failover_ssh_hostname.sh
@@ -45,23 +45,7 @@ cluster_is_healthy
 # Reset configuration
 ctdb_restart_when_done
 
-echo "Getting list of public IPs..."
-try_command_on_node 0 "$CTDB ip -n all | sed -e '1d'"
-
-# When selecting test_node we just want a node that has public IPs.
-# This will work and is economically semi-randomly.  :-)
-read x test_node <<<"$out"
-
-ips=""
-while read ip pnn ; do
-    if [ "$pnn" = "$test_node" ] ; then
-	ips="${ips}${ips:+ }${ip}"
-    fi
-done <<<"$out" # bashism to avoid problem setting variable in pipeline.
-
-echo "Selected node ${test_node} with IPs: $ips"
-
-test_ip="${ips%% *}"
+select_test_node_and_ips
 
 echo "Removing ${test_ip} from the local ARP table..."
 arp -d $test_ip >/dev/null 2>&1 || true
@@ -72,17 +56,13 @@ original_hostname=$(ssh $test_ip hostname)
 
 echo "Hostname is: ${original_hostname}"
 
-filter="arp net ${test_ip}"
-tcpdump_start "$filter"
+gratarp_sniff_start
 
 echo "Disabling node $test_node"
 try_command_on_node 1 $CTDB disable -n $test_node
 onnode 0 $CTDB_TEST_WRAPPER wait_until_node_has_status $test_node disabled
 
-tcpdump_wait 2
-
-echo "GOOD: this should be the gratuitous ARP and the reply:"
-tcpdump_show
+gratarp_sniff_wait_show
 
 echo "SSHing to ${test_ip} and running hostname (again)..."
 new_hostname=$(ssh $test_ip hostname)
diff --git a/tests/complex/43_failover_nfs_basic.sh b/tests/complex/43_failover_nfs_basic.sh
new file mode 100755
index 0000000..aa3a65c
--- /dev/null
+++ b/tests/complex/43_failover_nfs_basic.sh
@@ -0,0 +1,86 @@
+#!/bin/bash
+
+test_info()
+{
+    cat <<EOF
+Verify that a mounted NFS share is still operational after failover.
+
+We mount an NFS share from a node, write a file via NFS and then
+confirm that we can correctly read the file after a failover.
+
+Prerequisites:
+
+* An active CTDB cluster with at least 2 nodes with public addresses.
+
+* Test must be run on a real or virtual cluster rather than against
+  local daemons.
+
+* Test must not be run from a cluster node.
+
+Steps:
+
+1. Verify that the cluster is healthy.
+2. Select a public address and its corresponding node.
+3. Select the 1st NFS share exported on the node.
+4. Mount the selected NFS share.
+5. Create a file in the NFS mount and calculate its checksum.
+6. Disable the selected node.
+7. Read the file and calculate its checksum.
+8. Compare the checksums.
+
+Expected results:
+
+* When a node is disabled the public address fails over and it is
+  possible to correctly read a file over NFS.  The checksums should be
+  the same before and after.
+EOF
+}
+
+. ctdb_test_functions.bash
+
+set -e
+
+ctdb_test_init "$@"
+
+ctdb_test_check_real_cluster
+
+cluster_is_healthy
+
+# Reset configuration
+ctdb_restart_when_done
+
+select_test_node_and_ips
+
+first_export=$(showmount -e $test_ip | sed -n -e '2s/ .*//p')
+mnt_d=$(mktemp -d)
+test_file="${mnt_d}/$RANDOM"
+
+ctdb_test_exit_hook_add rm -f "$test_file"
+ctdb_test_exit_hook_add umount -f "$mnt_d"
+ctdb_test_exit_hook_add rmdir "$mnt_d"
+
+echo "Mounting ${test_ip}:${first_export} on ${mnt_d} ..."
+mount -o timeo=1,hard,intr,vers=3 ${test_ip}:${first_export} ${mnt_d}
+
+echo "Create file containing random data..."
+dd if=/dev/urandom of=$test_file bs=1k count=1
+original_sum=$(sum $test_file)
+[ $? -eq 0 ]
+
+gratarp_sniff_start
+
+echo "Disabling node $test_node"
+try_command_on_node 0 $CTDB disable -n $test_node
+onnode 0 $CTDB_TEST_WRAPPER wait_until_node_has_status $test_node disabled
+
+gratarp_sniff_wait_show
+
+new_sum=$(sum $test_file)
+[ $? -eq 0 ]
+
+if [ "$original_sum" = "$new_sum" ] ; then
+    echo "GOOD: file contents unchanged after failover"
+else
+    echo "BAD: file contents are different after failover"
+    testfailures=1
+fi
diff --git a/tests/complex/44_failover_nfs_oneway.sh b/tests/complex/44_failover_nfs_oneway.sh
new file mode 100755
index 0000000..77503d4
--- /dev/null
+++ b/tests/complex/44_failover_nfs_oneway.sh
@@ -0,0 +1,106 @@
+#!/bin/bash
+
+test_info()
+{
+    cat <<EOF
+Verify that a file created on a node is readable via NFS after a failover.
+
+We write a file into an exported directory on a node, mount the NFS
+share from a node, verify that we can read the file via NFS and that
+we can still read it after a failover.
+
+Prerequisites:
+
+* An active CTDB cluster with at least 2 nodes with public addresses.
+
+* Test must be run on a real or virtual cluster rather than against
+  local daemons.
+
+* Test must not be run from a cluster node.
+
+Steps:
+
+1.  Verify that the cluster is healthy.
+2.  Select a public address and its corresponding node.
+3.  Select the 1st NFS share exported on the node.
+4.  Write a file into exported directory on the node and calculate its
+    checksum.
+5.  Mount the selected NFS share.
+6.  Read the file via the NFS mount and calculate its checksum.
+7.  Compare checksums.
+8.  Disable the selected node.
+9.  Read the file via NFS and calculate its checksum.
+10. Compare the checksums.
+
+Expected results:
+
+* Checksums for the file on all 3 occasions should be the same.
+EOF
+}
+
+. ctdb_test_functions.bash
+
+set -e
+
+ctdb_test_init "$@"
+
+ctdb_test_check_real_cluster
+
+cluster_is_healthy
+
+# Reset configuration
+ctdb_restart_when_done
+
+select_test_node_and_ips
+
+first_export=$(showmount -e $test_ip | sed -n -e '2s/ .*//p')
+local_f=$(mktemp)
+mnt_d=$(mktemp -d)
+nfs_f="${mnt_d}/$RANDOM"
+remote_f="${test_ip}:${first_export}/$(basename $nfs_f)"
+
+ctdb_test_exit_hook_add rm -f "$local_f"
+ctdb_test_exit_hook_add rm -f "$nfs_f"
+ctdb_test_exit_hook_add umount -f "$mnt_d"
+ctdb_test_exit_hook_add rmdir "$mnt_d"
+
+echo "Create file containing random data..."
+dd if=/dev/urandom of=$local_f bs=1k count=1
+local_sum=$(sum $local_f)
+[ $? -eq 0 ]
+
+scp "$local_f" "$remote_f"
+
+echo "Mounting ${test_ip}:${first_export} on ${mnt_d} ..."
+mount -o timeo=1,hard,intr,vers=3 ${test_ip}:${first_export} ${mnt_d}
+
+nfs_sum=$(sum $nfs_f)
+
+if [ "$local_sum" = "$nfs_sum" ] ; then
+    echo "GOOD: file contents read correctly via NFS"
+else
+    echo "BAD: file contents are different over NFS"
+    echo "  original file: $local_sum"
+    echo "       NFS file: $nfs_sum"
+    exit 1
+fi
+
+gratarp_sniff_start
+
+echo "Disabling node $test_node"
+try_command_on_node 0 $CTDB disable -n $test_node
+onnode 0 $CTDB_TEST_WRAPPER wait_until_node_has_status $test_node disabled
+
+gratarp_sniff_wait_show
+
+new_sum=$(sum $nfs_f)
+[ $? -eq 0 ]
+
+if [ "$nfs_sum" = "$new_sum" ] ; then
+    echo "GOOD: file contents unchanged after failover"
+else
+    echo "BAD: file contents are different after failover"
+    echo "  original file: $nfs_sum"
+    echo "       NFS file: $new_sum"
+    exit 1
+fi
diff --git a/tests/scripts/ctdb_test_functions.bash b/tests/scripts/ctdb_test_functions.bash
index cc82d28..b84fe72 100644
--- a/tests/scripts/ctdb_test_functions.bash
+++ b/tests/scripts/ctdb_test_functions.bash
@@ -258,6 +258,27 @@ sanity_check_ips ()
     return 1
 }
 
+select_test_node_and_ips ()
+{
+    try_command_on_node 0 "$CTDB ip -n all | sed -e '1d'"
+
+    # When selecting test_node we just want a node that has public
+    # IPs.  This will work and is economically semi-random.  :-)
+    local x
+    read x test_node <<<"$out"
+
+    test_node_ips=""
+    local ip pnn
+    while read ip pnn ; do
+	if [ "$pnn" = "$test_node" ] ; then
+            test_node_ips="${test_node_ips}${test_node_ips:+ }${ip}"
+	fi
+    done <<<"$out" # bashism to avoid problem setting variable in pipeline.
+
+    echo "Selected node ${test_node} with IPs: ${test_node_ips}."
+    test_ip="${test_node_ips%% *}"
+}
+
 #######################################
 
 # Wait until either timeout expires or command succeeds.  The command
@@ -544,6 +565,19 @@ tcptickle_sniff_wait_show ()
     tcpdump_show
 }
 
+gratarp_sniff_start ()
+{
+    tcpdump_start "arp host ${test_ip}"
+}
+
+gratarp_sniff_wait_show ()
+{
+    tcpdump_wait 2
+
+    echo "GOOD: these should be some gratuitous ARPs:"
+    tcpdump_show
+}
+
 
 #######################################
 
diff --git a/tests/simple/31_ctdb_disable.sh b/tests/simple/31_ctdb_disable.sh
index c513c51..52334f9 100755
--- a/tests/simple/31_ctdb_disable.sh
+++ b/tests/simple/31_ctdb_disable.sh
@@ -40,21 +40,7 @@ cluster_is_healthy
 # Reset configuration
 ctdb_restart_when_done
 
-echo "Getting list of public IPs..."
-try_command_on_node 0 "$CTDB ip -n all | sed -e '1d'"
-
-# When selecting test_node we just want a node that has public IPs.
-# This will work and is economically semi-randomly.  :-)
-read x test_node <<<"$out"
-
-ips=""
-while read ip pnn ; do
-    if [ "$pnn" = "$test_node" ] ; then
-	ips="${ips}${ips:+ }${ip}"
-    fi
-done <<<"$out" # bashism to avoid problem setting variable in pipeline.
-
-echo "Selected node ${test_node} with IPs: $ips"
+select_test_node_and_ips
 
 echo "Disabling node $test_node"
 
@@ -63,7 +49,7 @@ try_command_on_node 1 $CTDB disable -n $test_node
 # Avoid a potential race condition...
 onnode 0 $CTDB_TEST_WRAPPER wait_until_node_has_status $test_node disabled
 
-if wait_until_ips_are_on_nodeglob "[!${test_node}]" $ips ; then
+if wait_until_ips_are_on_nodeglob "[!${test_node}]" $test_node_ips ; then
     echo "All IPs moved."
 else
     echo "Some IPs didn't move."
diff --git a/tests/simple/32_ctdb_enable.sh b/tests/simple/32_ctdb_enable.sh
index a6e60d8..cf0abe8 100755
--- a/tests/simple/32_ctdb_enable.sh


-- 
CTDB repository