[SCM] Samba Shared Repository - branch master updated

Tue May 7 06:57:11 UTC 2019

The branch, master has been updated
       via  5a9e338330f ctdb-tests: Don't clean up test var directory in autotest target
       via  a2ab6485e02 ctdb-tests: Fix usage message
       via  3cb53a7a054 ctdb-tests: Wait to allow database attach/detach to take effect
       via  066cc5b0c56 ctdb-tests: Avoid bulk output in $out, prefer $outfile
       via  9d02452a246 ctdb-tests: Make try_command_on_node less error-prone
       via  7c3819d1ac2 ctdb-tests: Change sanity_check_output() to internally use $out
       via  b80967f5dcc ctdb-scripts: Drop script configuration variable CTDB_MONITOR_SWAP_USAGE
       via  8108b3134c0 ctdb-tests: Extend test to cover ctdb rddumpmemory
       via  f78d9388fb4 ctdb-tools: Fix ctdb dumpmemory to avoid printing trailing NUL
       via  95477e69e3e ctdb-daemon: Log when ctdbd CPU utilisation exceeds a threshold
       via  87032ccebdd ctdb-build: Add check for getrusage()
      from  3d42e257a61 s4 dns_server Bind9: Log opertion durations

https://git.samba.org/?p=samba.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit 5a9e338330fe136908a3a17a5df81c054c5cc5b0
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed May 1 15:17:14 2019 +1000

    ctdb-tests: Don't clean up test var directory in autotest target
    
    If the directory is always cleaned up then it is not possible to look
    at daemon logs to debug test failures.
    
    This target is only really used by autobuild.py, which (optionally)
    cleans up the parent directory anyway.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
    
    Autobuild-User(master): Amitay Isaacs <amitay at samba.org>
    Autobuild-Date(master): Tue May  7 06:56:01 UTC 2019 on sn-devel-184

commit a2ab6485e027ebb13871c7d83b7626ac5c9b98c0
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed May 1 15:10:28 2019 +1000

    ctdb-tests: Fix usage message
    
    Since commit 0e9ead8f28fced3ebfa888786a1dc5bb59e734a3 daemons have
    been shut down after each test, so this option no longer has anything
    to do with killing daemons.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 3cb53a7a05409925024d6a67bcfaeb962d896e0b
Author: Martin Schwenke <martin at meltin.net>
Date:   Sat Apr 27 14:54:09 2019 +1000

    ctdb-tests: Wait to allow database attach/detach to take effect
    
    Sometimes the detach test fails:
    
      Check detaching single test database detach_test1.tdb
      BAD: database detach_test1.tdb is still attached
      Number of databases:4
      dbid:0x5ae995ee name:detach_test4.tdb path:tests/var/simple/node.0/db/volatile/detach_test4.tdb.0
      dbid:0xd84cc13c name:detach_test3.tdb path:tests/var/simple/node.0/db/volatile/detach_test3.tdb.0
      dbid:0x8e8e8cef name:detach_test2.tdb path:tests/var/simple/node.0/db/volatile/detach_test2.tdb.0
      dbid:0xc62491f4 name:detach_test1.tdb path:tests/var/simple/node.0/db/volatile/detach_test1.tdb.0
      Number of databases:3
      dbid:0x5ae995ee name:detach_test4.tdb path:tests/var/simple/node.1/db/volatile/detach_test4.tdb.1
      dbid:0xd84cc13c name:detach_test3.tdb path:tests/var/simple/node.1/db/volatile/detach_test3.tdb.1
      dbid:0x8e8e8cef name:detach_test2.tdb path:tests/var/simple/node.1/db/volatile/detach_test2.tdb.1
      Number of databases:4
      dbid:0x5ae995ee name:detach_test4.tdb path:tests/var/simple/node.2/db/volatile/detach_test4.tdb.2
      dbid:0xd84cc13c name:detach_test3.tdb path:tests/var/simple/node.2/db/volatile/detach_test3.tdb.2
      dbid:0x8e8e8cef name:detach_test2.tdb path:tests/var/simple/node.2/db/volatile/detach_test2.tdb.2
      dbid:0xc62491f4 name:detach_test1.tdb path:tests/var/simple/node.2/db/volatile/detach_test1.tdb.2
      *** TEST COMPLETED (RC=1) AT 2019-04-27 03:35:40, CLEANING UP...
    
    When issued from a client, the detach control re-broadcasts itself
    asynchronously to all nodes and then returns success.  The controls to
    some nodes to do the actual detach may still be in flight when success
    is returned to the client.  Therefore, the test should wait for a few
    seconds to allow the asynchronous controls to complete.
    
    The same is true for the attach control, so work around the problem in
    the attach test too.
    
    An alternative is to make the attach and detach controls synchronous
    by avoiding the broadcast and waiting for the results of the
    individual controls sent to the nodes.  However, a simple
    implementation would involve adding new nested event loops.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
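
The wait described above can also be written as a bounded poll rather than a fixed sleep. The following is a minimal sketch, not the change made by this commit: `retry_until` and `db_is_detached` are hypothetical helpers, and the listing file is assumed to hold output in the `dbid:... name:... path:...` format shown in the failure log.

```shell
#!/bin/sh
# Hypothetical helper (not part of the CTDB test suite): retry a
# command once per second until it succeeds or the timeout (in
# seconds) expires.
retry_until ()
{
    _timeout="$1"
    shift

    _elapsed=0
    while [ "$_elapsed" -lt "$_timeout" ] ; do
        if "$@" ; then
            return 0
        fi
        sleep 1
        _elapsed=$((_elapsed + 1))
    done

    # One last attempt once the timeout has been reached
    "$@"
}

# Hypothetical check: succeeds once database $1 no longer appears in
# the saved database listing in file $2.
db_is_detached ()
{
    ! grep -Fq "name:$1 " "$2"
}
```

A test could then call something like `retry_until 10 db_is_detached detach_test1.tdb "$listing"`, refreshing the listing inside the check, so that it passes as soon as the in-flight controls have completed.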

commit 066cc5b0c561464ed08890d9aa1a1a55b545e9cc
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Apr 11 20:55:20 2019 +1000

    ctdb-tests: Avoid bulk output in $out, prefer $outfile
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 9d02452a24625df5f62fd6d45a16effe2fa45fbe
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Mar 28 14:26:52 2019 +1100

    ctdb-tests: Make try_command_on_node less error-prone
    
    This sometimes fails, apparently due to a cat process in onnode
    getting EAGAIN.  The conclusion is that tests that process large
    amounts of output should not depend on a sub-shell delivering that
    output into a shell variable.
    
    Change try_command_on_node() to leave all of the output in file
    $outfile and just put the first 1KB into $out.  $outfile is removed
    after each test completes.
    
    Change the implementation of sanity_check_output() to use $outfile
    instead of $out.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
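
The capture pattern described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual body of try_command_on_node(): `run_and_capture` is a hypothetical name, and the command is run locally rather than via onnode.

```shell
#!/bin/sh
# Hypothetical stand-in for the pattern described above: keep the
# full output of a command in $outfile and only the first 1KB in $out.
outfile=$(mktemp)

run_and_capture ()
{
    # All output (stdout and stderr) goes to the file
    "$@" >"$outfile" 2>&1
    _status=$?

    # Only a bounded prefix is delivered into a shell variable
    out=$(dd if="$outfile" bs=1024 count=1 2>/dev/null)

    return $_status
}
```

Callers that need the whole output can then read "$outfile" directly, for example with awk or grep, instead of piping "$out" through echo.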

commit 7c3819d1ac264acf998f426e0cef7f6211e0ddee
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Apr 30 12:09:26 2019 +1000

    ctdb-tests: Change sanity_check_output() to internally use $out
    
    All callers currently pass $out.  Global variable $out is used
    in many other places so use it here to simplify the interface and make
    future changes simpler.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=13924
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit b80967f5dcc6b58db0c38ec3e5cf0cbe46dbeb4b
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Mar 29 11:19:55 2019 +1100

    ctdb-scripts: Drop script configuration variable CTDB_MONITOR_SWAP_USAGE
    
    CTDB's system memory monitoring in 05.system.script monitors both main
    memory and swap.  The swap monitoring was originally based on
    the (possibly incorrect, see below) idea that swap space stacks on top
    of main memory, so that when a system starts filling swap space then
    this is supposed to be a good sign that the system is running out of
    memory.  Additionally, performance on a Linux system tends to be
    destroyed by the I/O associated with a lot of swapping to spinning
    disks.
    
    However, some platforms default to creating only 4GB of swap space
    even when there is 128GB of main memory.  With such a small swap to
    main memory ratio, memory pressure can force swap to be nearly full
    even when a significant amount of main memory is still available and
    the system is performing well.  This suggests that checking swap
    utilisation might be less than useful in many circumstances.
    
    So, remove the separate swap space checking and change the memory
    check to cover the total of main memory and swap space.
    
    Test function set_mem_usage() still takes an argument for each of main
    memory and swap space utilisation.  For simplicity, the same number is
    now passed twice to make the intended results comprehensible.  This
    could be changed later.
    
    A couple of tests are cleaned up to no longer use hard-coded
    /proc/meminfo and ps output.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
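
As an illustration with made-up numbers: on a system with 128GB of main memory and 4GB of swap, where half of main memory is still available but swap is 90% full, the combined utilisation is (64 + 3.6) / 132, or about 51%, which is below the default 80% warning threshold. The sketch below follows the awk logic of the updated script. The function name `combined_mem_usage` is hypothetical, and the MemAvailable, MemFree and MemTotal patterns are assumed, since the diff only shows the swap patterns and the END block.

```shell
#!/bin/sh
# Print combined main memory + swap utilisation (percent) for
# /proc/meminfo-style input on stdin.  The function name is
# hypothetical; the first three awk patterns are assumptions.
combined_mem_usage ()
{
    awk '
$1 == "MemAvailable:" { memavail  = $2 }
$1 == "MemFree:"      { memfree   = $2 }
$1 == "MemTotal:"     { memtotal  = $2 }
$1 == "SwapFree:"     { swapfree  = $2 }
$1 == "SwapTotal:"    { swaptotal = $2 }
END {
    # Prefer MemAvailable over MemFree when the kernel reports it
    if (memavail != 0) { memfree = memavail }
    if (memtotal + swaptotal != 0) {
        usedtotal = memtotal - memfree + swaptotal - swapfree
        print int(usedtotal / (memtotal + swaptotal) * 100)
    } else {
        print 0
    }
}'
}
```

On a Linux system this could be tried with `combined_mem_usage </proc/meminfo`.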

commit 8108b3134c017c22d245fc5b2207a88d44ab0dd2
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Apr 11 16:58:10 2019 +1000

    ctdb-tests: Extend test to cover ctdb rddumpmemory
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=13923
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit f78d9388fb459dc83fafb4da6e683e3137ad40e1
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Apr 11 16:56:32 2019 +1000

    ctdb-tools: Fix ctdb dumpmemory to avoid printing trailing NUL
    
    Fix ctdb rddumpmemory too.
    
    BUG: https://bugzilla.samba.org/show_bug.cgi?id=13923
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

commit 95477e69e3e865cb4ee93f947074eef5c873750f
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jan 18 17:46:37 2019 +1100

    ctdb-daemon: Log when ctdbd CPU utilisation exceeds a threshold
    
    This is to help us notice when ctdbd is using the full capacity of a
    CPU, so is saturated.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>
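
The calculation behind this check is the CPU time consumed between two samples divided by the wall-clock time elapsed between them. The daemon does this in C using getrusage(), sampling every 60 seconds against a default threshold of 90%. The shell sketch below only illustrates the arithmetic: `cpu_percent` is a hypothetical helper and its four arguments (previous and current CPU time, previous and current wall-clock time, all in microseconds) are assumed inputs.

```shell
#!/bin/sh
# Hypothetical helper: CPU utilisation (percent) between two samples.
# Arguments: prev_cpu curr_cpu prev_wall curr_wall, in microseconds.
cpu_percent ()
{
    _cpu_diff=$(($2 - $1))
    _wall_diff=$(($4 - $3))

    # Time went backwards or did not progress: no sane result
    if [ "$_wall_diff" -le 0 ] ; then
        return 1
    fi

    echo $((_cpu_diff * 100 / _wall_diff))
}
```

For example, 54 seconds of CPU time consumed over a 60 second interval gives 90, which would reach the default threshold.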

commit 87032ccebdd13feef13d9da8d8958d928f36b75a
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Jan 18 17:43:44 2019 +1100

    ctdb-build: Add check for getrusage()
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Amitay Isaacs <amitay at gmail.com>

-----------------------------------------------------------------------

Summary of changes:
 ctdb/config/events/legacy/05.system.script         |  17 +--
 ctdb/doc/ctdb-script.options.5.xml                 |  21 ----
 ctdb/doc/examples/config_migrate.sh                |   2 +-
 ctdb/server/ctdb_daemon.c                          | 123 +++++++++++++++++++++
 ctdb/tests/complex/11_ctdb_delip_removes_ip.sh     |  10 +-
 ctdb/tests/complex/18_ctdb_reloadips.sh            |   8 +-
 ctdb/tests/complex/32_cifs_tickle.sh               |   7 --
 ctdb/tests/complex/36_smb_reset_server.sh          |  12 +-
 ctdb/tests/complex/37_nfs_reset_server.sh          |   4 +-
 ctdb/tests/complex/60_rogueip_releaseip.sh         |   2 +-
 ctdb/tests/complex/scripts/local.bash              |   5 +-
 ctdb/tests/eventscripts/05.system.monitor.011.sh   |   3 +-
 ctdb/tests/eventscripts/05.system.monitor.012.sh   |   3 +-
 ctdb/tests/eventscripts/05.system.monitor.013.sh   |  21 ----
 ctdb/tests/eventscripts/05.system.monitor.014.sh   |   4 +-
 ctdb/tests/eventscripts/05.system.monitor.015.sh   |   4 +-
 ctdb/tests/eventscripts/05.system.monitor.016.sh   |  19 ----
 ctdb/tests/eventscripts/05.system.monitor.017.sh   |  30 +----
 ctdb/tests/eventscripts/05.system.monitor.018.sh   |  81 +++-----------
 ctdb/tests/run_tests.sh                            |   2 +-
 ctdb/tests/scripts/integration.bash                |  71 ++++++------
 ctdb/tests/simple/02_ctdb_tunables.sh              |   6 +-
 ctdb/tests/simple/05_ctdb_listnodes.sh             |   5 +-
 ctdb/tests/simple/08_ctdb_isnotrecmaster.sh        |  10 +-
 ctdb/tests/simple/09_ctdb_ping.sh                  |   6 +-
 ctdb/tests/simple/11_ctdb_ip.sh                    |  14 ++-
 ctdb/tests/simple/12_ctdb_getdebug.sh              |   3 +-
 ctdb/tests/simple/14_ctdb_statistics.sh            |   2 +-
 ctdb/tests/simple/15_ctdb_statisticsreset.sh       |  21 ++--
 ctdb/tests/simple/19_ip_takeover_noop.sh           |   4 +-
 ctdb/tests/simple/20_delip_iface_gc.sh             |  10 +-
 ctdb/tests/simple/21_ctdb_attach.sh                |  49 ++++----
 ctdb/tests/simple/23_ctdb_moveip.sh                |  25 ++++-
 ctdb/tests/simple/24_ctdb_getdbmap.sh              |  10 +-
 ctdb/tests/simple/25_dumpmemory.sh                 |   9 +-
 ..._ctdb_config_check_error_on_unreachable_ctdb.sh |   6 +-
 ctdb/tests/simple/27_ctdb_detach.sh                |  71 +++++++-----
 ctdb/tests/simple/35_ctdb_getreclock.sh            |   2 +-
 ctdb/tests/simple/51_message_ring.sh               |  14 +--
 ctdb/tests/simple/52_fetch_ring.sh                 |  14 +--
 ctdb/tests/simple/53_transaction_loop.sh           |   4 +-
 ctdb/tests/simple/54_transaction_loop_recovery.sh  |   4 +-
 ctdb/tests/simple/55_ctdb_ptrans.sh                |  12 +-
 .../simple/56_replicated_transaction_recovery.sh   |   4 +-
 ctdb/tests/simple/58_ctdb_restoredb.sh             |   8 +-
 ctdb/tests/simple/69_recovery_resurrect_deleted.sh |  10 +-
 ctdb/tests/simple/70_recoverpdbbyseqnum.sh         |   4 +-
 ctdb/tests/simple/71_ctdb_wipedb.sh                |   4 +-
 ctdb/tests/simple/72_update_record_persistent.sh   |   4 +-
 ctdb/tests/simple/75_readonly_records_basic.sh     |  24 ++--
 ctdb/tests/simple/77_ctdb_db_recovery.sh           |   6 +-
 ctdb/tests/simple/79_volatile_db_traverse.sh       |   4 +-
 ctdb/tests/simple/80_ctdb_traverse.sh              |   2 +-
 ctdb/tests/simple/81_tunnel_ring.sh                |  14 +--
 ctdb/tests/simple/90_debug_hung_script.sh          |   6 +-
 ctdb/tools/ctdb.c                                  |  10 +-
 ctdb/wscript                                       |   3 +-
 57 files changed, 428 insertions(+), 425 deletions(-)
 delete mode 100755 ctdb/tests/eventscripts/05.system.monitor.013.sh
 delete mode 100755 ctdb/tests/eventscripts/05.system.monitor.016.sh


Changeset truncated at 500 lines:

diff --git a/ctdb/config/events/legacy/05.system.script b/ctdb/config/events/legacy/05.system.script
index e2ffeac715a..08e401a9e73 100755
--- a/ctdb/config/events/legacy/05.system.script
+++ b/ctdb/config/events/legacy/05.system.script
@@ -132,9 +132,6 @@ monitor_memory_usage ()
     if [ -z "$CTDB_MONITOR_MEMORY_USAGE" ] ; then
 	CTDB_MONITOR_MEMORY_USAGE=80
     fi
-    if [ -z "$CTDB_MONITOR_SWAP_USAGE" ] ; then
-	CTDB_MONITOR_SWAP_USAGE=25
-    fi
 
     _meminfo=$(get_proc "meminfo")
     # Intentional word splitting here
@@ -149,21 +146,19 @@ $1 == "SwapFree:"     { swapfree  = $2 }
 $1 == "SwapTotal:"    { swaptotal = $2 }
 END {
     if (memavail != 0) { memfree = memavail ; }
-    if (memtotal != 0) { print int((memtotal - memfree) / memtotal * 100) ; } else { print 0 ; }
-    if (swaptotal != 0) { print int((swaptotal - swapfree) / swaptotal * 100) ; } else { print 0 ; }
+    if (memtotal + swaptotal != 0) {
+	usedtotal = memtotal - memfree + swaptotal - swapfree
+	print int(usedtotal / (memtotal + swaptotal) * 100)
+    } else {
+	print 0
+    }
 }')
     _mem_usage="$1"
-    _swap_usage="$2"
 
     check_thresholds "System memory" \
 		     "$CTDB_MONITOR_MEMORY_USAGE" \
 		     "$_mem_usage" \
 		     dump_memory_info
-
-    check_thresholds "System swap" \
-		     "$CTDB_MONITOR_SWAP_USAGE" \
-		     "$_swap_usage" \
-		     dump_memory_info
 }
 
 
diff --git a/ctdb/doc/ctdb-script.options.5.xml b/ctdb/doc/ctdb-script.options.5.xml
index 9d545b5cc0d..6b2efb27ac2 100644
--- a/ctdb/doc/ctdb-script.options.5.xml
+++ b/ctdb/doc/ctdb-script.options.5.xml
@@ -964,27 +964,6 @@ CTDB_PER_IP_ROUTING_TABLE_ID_HIGH=9000
 	  </listitem>
 	</varlistentry>
 
-	<varlistentry>
-	  <term>
-	    CTDB_MONITOR_SWAP_USAGE=<parameter>SWAP-LIMITS</parameter>
-	  </term>
-	  <listitem>
-	    <para>
-	      SWAP-LIMITS takes the form
-	      <parameter>WARN_LIMIT</parameter><optional>:<parameter>UNHEALTHY_LIMIT</parameter></optional>
-	       indicating that warnings should be logged if
-	      swap usage reaches WARN_LIMIT%.  If usage reaches
-	      UNHEALTHY_LIMIT then the node should be flagged
-	      unhealthy.  Either WARN_LIMIT or UNHEALTHY_LIMIT may be
-	      left blank, meaning that check will be omitted.
-	    </para>
-	    <para>
-	      Default is 25, so warnings will be logged when swap
-	      usage reaches 25%.
-	    </para>
-	  </listitem>
-	</varlistentry>
-
       </variablelist>
     </refsect2>
 
diff --git a/ctdb/doc/examples/config_migrate.sh b/ctdb/doc/examples/config_migrate.sh
index 8479aeb39f3..e0d01e77057 100755
--- a/ctdb/doc/examples/config_migrate.sh
+++ b/ctdb/doc/examples/config_migrate.sh
@@ -209,6 +209,7 @@ CTDB_NOTIFY_SCRIPT
 CTDB_PUBLIC_INTERFACE
 CTDB_MAX_PERSISTENT_CHECK_ERRORS
 CTDB_SHUTDOWN_TIMEOUT
+CTDB_MONITOR_SWAP_USAGE
 EOF
 }
 
@@ -262,7 +263,6 @@ CTDB_MAX_CORRUPT_DB_BACKUPS
 # 05.system
 CTDB_MONITOR_FILESYSTEM_USAGE
 CTDB_MONITOR_MEMORY_USAGE
-CTDB_MONITOR_SWAP_USAGE
 # debug_hung_scripts.sh
 CTDB_DEBUG_HUNG_SCRIPT_STACKPAT
 EOF
diff --git a/ctdb/server/ctdb_daemon.c b/ctdb/server/ctdb_daemon.c
index a8691388d4a..c5733bb2592 100644
--- a/ctdb/server/ctdb_daemon.c
+++ b/ctdb/server/ctdb_daemon.c
@@ -72,7 +72,126 @@ static void print_exit_message(void)
 	}
 }
 
+#ifdef HAVE_GETRUSAGE
 
+struct cpu_check_threshold_data {
+	unsigned short percent;
+	struct timeval timeofday;
+	struct timeval ru_time;
+};
+
+static void ctdb_cpu_check_threshold(struct tevent_context *ev,
+				     struct tevent_timer *te,
+				     struct timeval tv,
+				     void *private_data)
+{
+	struct ctdb_context *ctdb = talloc_get_type_abort(
+		private_data, struct ctdb_context);
+	uint32_t interval = 60;
+
+	static unsigned short threshold = 0;
+	static struct cpu_check_threshold_data prev = {
+		.percent = 0,
+		.timeofday = { .tv_sec = 0 },
+		.ru_time = { .tv_sec = 0 },
+	};
+
+	struct rusage usage;
+	struct cpu_check_threshold_data curr = {
+		.percent = 0,
+	};
+	int64_t ru_time_diff, timeofday_diff;
+	bool first;
+	int ret;
+
+	/*
+	 * Cache the threshold so that we don't waste time checking
+	 * the environment variable every time
+	 */
+	if (threshold == 0) {
+		const char *t;
+
+		threshold = 90;
+
+		t = getenv("CTDB_TEST_CPU_USAGE_THRESHOLD");
+		if (t != NULL) {
+			int th;
+
+			th = atoi(t);
+			if (th <= 0 || th > 100) {
+				DBG_WARNING("Failed to parse env var: %s\n", t);
+			} else {
+				threshold = th;
+			}
+		}
+	}
+
+	ret = getrusage(RUSAGE_SELF, &usage);
+	if (ret != 0) {
+		DBG_WARNING("rusage() failed: %d\n", ret);
+		goto next;
+	}
+
+	/* Sum the system and user CPU usage */
+	curr.ru_time = timeval_sum(&usage.ru_utime, &usage.ru_stime);
+
+	curr.timeofday = tv;
+
+	first = timeval_is_zero(&prev.timeofday);
+	if (first) {
+		/* No previous values recorded so no calculation to do */
+		goto done;
+	}
+
+	timeofday_diff = usec_time_diff(&curr.timeofday, &prev.timeofday);
+	if (timeofday_diff <= 0) {
+		/*
+		 * Time went backwards or didn't progress so no (sane)
+		 * calculation can be done
+		 */
+		goto done;
+	}
+
+	ru_time_diff = usec_time_diff(&curr.ru_time, &prev.ru_time);
+
+	curr.percent = ru_time_diff * 100 / timeofday_diff;
+
+	if (curr.percent >= threshold) {
+		/* Log only if the utilisation changes */
+		if (curr.percent != prev.percent) {
+			D_WARNING("WARNING: CPU utilisation %hu%% >= "
+				  "threshold (%hu%%)\n",
+				  curr.percent,
+				  threshold);
+		}
+	} else {
+		/* Log if the utilisation falls below the threshold */
+		if (prev.percent >= threshold) {
+			D_WARNING("WARNING: CPU utilisation %hu%% < "
+				  "threshold (%hu%%)\n",
+				  curr.percent,
+				  threshold);
+		}
+	}
+
+done:
+	prev = curr;
+
+next:
+	tevent_add_timer(ctdb->ev, ctdb,
+			 timeval_current_ofs(interval, 0),
+			 ctdb_cpu_check_threshold,
+			 ctdb);
+}
+
+static void ctdb_start_cpu_check_threshold(struct ctdb_context *ctdb)
+{
+	tevent_add_timer(ctdb->ev, ctdb,
+			 timeval_current(),
+			 ctdb_cpu_check_threshold,
+			 ctdb);
+}
+#endif /* HAVE_GETRUSAGE */
 
 static void ctdb_time_tick(struct tevent_context *ev, struct tevent_timer *te,
 				  struct timeval t, void *private_data)
@@ -111,6 +230,10 @@ static void ctdb_start_periodic_events(struct ctdb_context *ctdb)
 
 	/* start listening to timer ticks */
 	ctdb_start_time_tickd(ctdb);
+
+#ifdef HAVE_GETRUSAGE
+	ctdb_start_cpu_check_threshold(ctdb);
+#endif /* HAVE_GETRUSAGE */
 }
 
 static void ignore_signal(int signum)
diff --git a/ctdb/tests/complex/11_ctdb_delip_removes_ip.sh b/ctdb/tests/complex/11_ctdb_delip_removes_ip.sh
index b5c8866d67a..543472c0f22 100755
--- a/ctdb/tests/complex/11_ctdb_delip_removes_ip.sh
+++ b/ctdb/tests/complex/11_ctdb_delip_removes_ip.sh
@@ -22,8 +22,8 @@ cluster_is_healthy
 select_test_node_and_ips
 get_test_ip_mask_and_iface
 
-echo "Checking that node ${test_node} hosts ${test_ip} on interface ${iface}..."
-try_command_on_node $test_node "ip addr show dev $iface | grep -E 'inet6?[[:space:]]*${test_ip}/'"
+echo "Checking that node ${test_node} hosts ${test_ip}..."
+try_command_on_node $test_node "ip addr show to ${test_ip} | grep -q ."
 
 echo "Attempting to remove ${test_ip} from node ${test_node}."
 try_command_on_node $test_node $CTDB delip $test_ip
@@ -33,10 +33,10 @@ wait_until_ips_are_on_node '!' $test_node $test_ip
 timeout=60
 increment=5
 count=0
-echo "Waiting for ${test_ip} to disappear from ${iface}..."
+echo "Waiting for ${test_ip} to disappear from node ${test_node}..."
 while : ; do
-    try_command_on_node -v $test_node "ip addr show dev $iface"
-    if echo "$out" | grep -E 'inet6?[[:space:]]*${test_ip}/'; then
+    try_command_on_node -v $test_node "ip addr show to ${test_ip}"
+    if [ -n "$out" ] ; then
 	echo "Still there..."
 	if [ $(($count * $increment)) -ge $timeout ] ; then
 	    echo "BAD: Timed out waiting..."
diff --git a/ctdb/tests/complex/18_ctdb_reloadips.sh b/ctdb/tests/complex/18_ctdb_reloadips.sh
index 2beff771874..4ba1b26a8e8 100755
--- a/ctdb/tests/complex/18_ctdb_reloadips.sh
+++ b/ctdb/tests/complex/18_ctdb_reloadips.sh
@@ -48,12 +48,12 @@ select_test_node_and_ips
 
 echo "Getting public IP information from CTDB..."
 try_command_on_node any "$CTDB ip -X -v all"
-ctdb_ip_info=$(echo "$out" | awk -F'|' 'NR > 1 { print $2, $3, $5 }')
+ctdb_ip_info=$(awk -F'|' 'NR > 1 { print $2, $3, $5 }' "$outfile")
 
 echo "Getting IP information from interfaces..."
 try_command_on_node all "ip addr show"
-ip_addr_info=$(echo "$out" | \
-    awk '$1 == "inet" { ip = $2; sub(/\/.*/, "", ip); print ip }')
+ip_addr_info=$(awk '$1 == "inet" { ip = $2; sub(/\/.*/, "", ip); print ip }' \
+		   "$outfile")
 
 prefix=""
 for b in $(seq 0 255) ; do
@@ -168,7 +168,7 @@ check_ips ()
 
     try_command_on_node $test_node "ip addr show dev ${iface}"
     local ip_addrs_file=$(mktemp)
-    echo "$out" | \
+    cat "$outfile" | \
 	sed -n -e "s@.*inet * \(${prefix//./\.}\.[0-9]*\)/.*@\1@p" | \
 	sort >"$ip_addrs_file"
 
diff --git a/ctdb/tests/complex/32_cifs_tickle.sh b/ctdb/tests/complex/32_cifs_tickle.sh
index 4f2cdadbdfc..bfe3df4e82f 100755
--- a/ctdb/tests/complex/32_cifs_tickle.sh
+++ b/ctdb/tests/complex/32_cifs_tickle.sh
@@ -61,13 +61,6 @@ echo "Source socket is $src_socket"
 # we sometimes beat the registration.
 echo "Checking if CIFS connection is tracked by CTDB on test node..."
 wait_until 10 check_tickles $test_node $test_ip $test_port $src_socket
-echo "$out"
-
-if [ "${out/SRC: ${src_socket} /}" != "$out" ] ; then
-    echo "GOOD: CIFS connection tracked OK by CTDB."
-else
-    die "BAD: Socket not tracked by CTDB."
-fi
 
 # This is almost immediate.  However, it is sent between nodes
 # asynchonously, so it is worth checking...
diff --git a/ctdb/tests/complex/36_smb_reset_server.sh b/ctdb/tests/complex/36_smb_reset_server.sh
index 0de77722fc3..870b80661aa 100755
--- a/ctdb/tests/complex/36_smb_reset_server.sh
+++ b/ctdb/tests/complex/36_smb_reset_server.sh
@@ -59,16 +59,8 @@ echo "Source socket is $src_socket"
 
 # This should happen as soon as connection is up... but unless we wait
 # we sometimes beat the registration.
-echo "Checking if CIFS connection is tracked by CTDB on test node..."
+echo "Waiting until SMB connection is tracked by CTDB on test node..."
 wait_until 10 check_tickles $test_node $test_ip $test_port $src_socket
-echo "$out"
-
-if [ "${out/SRC: ${src_socket} /}" != "$out" ] ; then
-    echo "GOOD: CIFS connection tracked OK by CTDB."
-else
-    echo "BAD: Socket not tracked by CTDB."
-    exit 1
-fi
 
 # It would be nice if ss consistently used local/peer instead of src/dst
 ss_filter="src ${test_ip}:${test_port} dst ${src_socket}"
@@ -80,7 +72,7 @@ if [ -z "$out" ] ; then
 	exit 1
 fi
 echo "GOOD: ss lists the socket:"
-echo "$out"
+cat "$outfile"
 
 echo "Disabling node $test_node"
 try_command_on_node 1 $CTDB disable -n $test_node
diff --git a/ctdb/tests/complex/37_nfs_reset_server.sh b/ctdb/tests/complex/37_nfs_reset_server.sh
index 7190af0f552..32ff9295cc6 100755
--- a/ctdb/tests/complex/37_nfs_reset_server.sh
+++ b/ctdb/tests/complex/37_nfs_reset_server.sh
@@ -60,7 +60,7 @@ echo "Source socket is $src_socket"
 echo "Wait until NFS connection is tracked by CTDB on test node ..."
 wait_until $((monitor_interval * 2)) \
 	   check_tickles $test_node $test_ip $test_port $src_socket
-echo "$out"
+cat "$outfile"
 
 # It would be nice if ss consistently used local/peer instead of src/dst
 ss_filter="src ${test_ip}:${test_port} dst ${src_socket}"
@@ -72,7 +72,7 @@ if [ -z "$out" ] ; then
 	exit 1
 fi
 echo "GOOD: ss lists the socket:"
-echo "$out"
+cat "$outfile"
 
 echo "Disabling node $test_node"
 try_command_on_node 1 $CTDB disable -n $test_node
diff --git a/ctdb/tests/complex/60_rogueip_releaseip.sh b/ctdb/tests/complex/60_rogueip_releaseip.sh
index 2fddc06f867..88e4e554c34 100755
--- a/ctdb/tests/complex/60_rogueip_releaseip.sh
+++ b/ctdb/tests/complex/60_rogueip_releaseip.sh
@@ -31,7 +31,7 @@ for i in $all_pnns ; do
 		continue
 	fi
 	try_command_on_node $i "$CTDB ip"
-	n=$(awk -v ip="$test_ip" '$1 == ip { print }' <<<"$out")
+	n=$(awk -v ip="$test_ip" '$1 == ip { print }' "$outfile")
 	if [ -n "$n" ] ; then
 		other_node="$i"
 		break
diff --git a/ctdb/tests/complex/scripts/local.bash b/ctdb/tests/complex/scripts/local.bash
index 7787de8f111..787f597edcc 100644
--- a/ctdb/tests/complex/scripts/local.bash
+++ b/ctdb/tests/complex/scripts/local.bash
@@ -67,7 +67,7 @@ check_tickles ()
     local src_socket="$4"
     try_command_on_node $node ctdb gettickles $test_ip $test_port
     # SRC: 10.0.2.45:49091   DST: 10.0.2.143:445
-    [ "${out/SRC: ${src_socket} /}" != "$out" ]
+    grep -Fq "SRC: ${src_socket} " "$outfile"
 }
 
 check_tickles_all ()
@@ -79,8 +79,7 @@ check_tickles_all ()
 
     try_command_on_node all ctdb gettickles $test_ip $test_port
     # SRC: 10.0.2.45:49091   DST: 10.0.2.143:445
-    local t="${src_socket//./\\.}"
-    local count=$(grep -E -c "SRC: ${t} " <<<"$out" || true)
+    local count=$(grep -Fc "SRC: ${src_socket} " "$outfile" || true)
     [ $count -eq $numnodes ]
 }
 
diff --git a/ctdb/tests/eventscripts/05.system.monitor.011.sh b/ctdb/tests/eventscripts/05.system.monitor.011.sh
index a7d2e99c2b7..6cd1dabbb37 100755
--- a/ctdb/tests/eventscripts/05.system.monitor.011.sh
+++ b/ctdb/tests/eventscripts/05.system.monitor.011.sh
@@ -2,13 +2,12 @@
 
 . "${TEST_SCRIPTS_DIR}/unit.sh"
 
-define_test "Memory check, bad situation, default checks enabled"
+define_test "Memory check (default), warning situation"
 
 setup
 
 set_mem_usage 100 100
 ok <<EOF
 WARNING: System memory utilization 100% >= threshold 80%
-WARNING: System swap utilization 100% >= threshold 25%
 EOF
 simple_test
diff --git a/ctdb/tests/eventscripts/05.system.monitor.012.sh b/ctdb/tests/eventscripts/05.system.monitor.012.sh
index bc517081e42..9e840564f49 100755
--- a/ctdb/tests/eventscripts/05.system.monitor.012.sh
+++ b/ctdb/tests/eventscripts/05.system.monitor.012.sh
@@ -2,13 +2,12 @@
 
 . "${TEST_SCRIPTS_DIR}/unit.sh"
 
-define_test "Memory check, good situation, all memory checks enabled"
+define_test "Memory check (custom, both), good situation"
 
 setup
 
 setup_script_options <<EOF
 CTDB_MONITOR_MEMORY_USAGE="80:90"
-CTDB_MONITOR_SWAP_USAGE="1:50"
 EOF
 
 ok_null
diff --git a/ctdb/tests/eventscripts/05.system.monitor.013.sh b/ctdb/tests/eventscripts/05.system.monitor.013.sh
deleted file mode 100755
index f4ea7ded6d0..00000000000
--- a/ctdb/tests/eventscripts/05.system.monitor.013.sh
+++ /dev/null
@@ -1,21 +0,0 @@
-#!/bin/sh
-
-. "${TEST_SCRIPTS_DIR}/unit.sh"
-
-define_test "Memory check, bad situation, custom swap critical"
-
-setup
-
-setup_script_options <<EOF
-CTDB_MONITOR_SWAP_USAGE=":50"
-EOF
-
-set_mem_usage 100 90
-required_result 1 <<EOF
-WARNING: System memory utilization 100% >= threshold 80%
-ERROR: System swap utilization 90% >= threshold 50%
-$FAKE_PROC_MEMINFO
-$(ps foobar)
-EOF
-
-simple_test
diff --git a/ctdb/tests/eventscripts/05.system.monitor.014.sh b/ctdb/tests/eventscripts/05.system.monitor.014.sh
index 1b6d2155272..9e2b21c9822 100755
--- a/ctdb/tests/eventscripts/05.system.monitor.014.sh
+++ b/ctdb/tests/eventscripts/05.system.monitor.014.sh
@@ -2,7 +2,7 @@
 
 . "${TEST_SCRIPTS_DIR}/unit.sh"
 
-define_test "Memory check, bad memory situation, custom memory warning"
+define_test "Memory check (custom, warning only), warning situation"
 
 setup
 
@@ -10,7 +10,7 @@ setup_script_options <<EOF
 CTDB_MONITOR_MEMORY_USAGE="85:"
 EOF
 
-set_mem_usage 90 10
+set_mem_usage 90 90
 ok <<EOF
 WARNING: System memory utilization 90% >= threshold 85%
 EOF
diff --git a/ctdb/tests/eventscripts/05.system.monitor.015.sh b/ctdb/tests/eventscripts/05.system.monitor.015.sh
index 3f1fe9bfc46..0091c429ac1 100755
--- a/ctdb/tests/eventscripts/05.system.monitor.015.sh
+++ b/ctdb/tests/eventscripts/05.system.monitor.015.sh
@@ -2,7 +2,7 @@
 
 . "${TEST_SCRIPTS_DIR}/unit.sh"
 
-define_test "Memory check, bad situation, custom memory critical"
+define_test "Memory check (custom, error only), error situation"
 
 setup


-- 
Samba Shared Repository