[SCM] Samba Shared Repository - branch master updated

Tue Jan 12 21:20:02 UTC 2016

The branch, master has been updated
       via  3ac2d4b ctdb-tests: Fix some incorrect memory allocations
       via  a7ce00c ctdb-scripts: Use more unique temporary file names
       via  4afe822 ctdb-scripts: Don't remove temporary files before use
       via  2083883 ctdb-scripts: Superficial clean-ups to 10.interface
       via  c23744c ctdb-scripts: Clarify logic for success of interface monitoring
       via  d8e4c5a ctdb-scripts: Fix regression in updateip code
       via  18b0aea ctdb-ipalloc: Fix a memory leak
       via  24160ee ctdb-daemon: Don't leak memory if not using recovery lock
       via  56ce230 ctdb-recoverd: Fix some uninitialised memory issues
       via  8f73ae0 ctdb-daemon: Drop the "schedule for deletion" messages to DEBUG level
      from  9790abd ctdb/web: Fix typo.

https://git.samba.org/?p=samba.git;a=shortlog;h=master


- Log -----------------------------------------------------------------
commit 3ac2d4b59eff58f06af1eef19cef0d444f4257ca
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Dec 1 15:32:07 2015 +1100

    ctdb-tests: Fix some incorrect memory allocations
    
    These allocate enough memory but things get confusing if they're used
    as a guide when updating the code.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Michael Adam <obnox at samba.org>
    
    Autobuild-User(master): Michael Adam <obnox at samba.org>
    Autobuild-Date(master): Tue Jan 12 22:19:16 CET 2016 on sn-devel-144

commit a7ce00cc0392cbbc009e2dbe8d58258d6ba0566f
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Dec 10 15:03:46 2015 +1100

    ctdb-scripts: Use more unique temporary file names
    
    Consider this sequence of events:
    
    1. Instance of script running update_tickles() hangs
    2. Script debugging is launched asynchronously
    3. New instance of script is launched, creates temporary file(s)
    4. Original hung script makes progress before asynchronous script
       debugging kills it, so it removes temporary file(s)
    5. New instance of script produces error due to missing files(s)
    
    This is obviously rare.
    
    Use more unique filenames to avoid step (4) removing the file(s)
    belonging to other instances of the script.
    
    This requires some extra cleanup to avoid too many temporary files
    (which is why unique filenames were not originally usd).  It is
    sufficient to remove files modified at least 10 minutes ago.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Michael Adam <obnox at samba.org>

commit 4afe822397a0fe8f3e528077089577c89c9895e5
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Dec 10 14:58:53 2015 +1100

    ctdb-scripts: Don't remove temporary files before use
    
    They will be clobbered by the redirect anyway.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Michael Adam <obnox at samba.org>

commit 20838833f3fe3053c7f6aed8ac598319a3ecc346
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Dec 18 13:16:27 2015 +1100

    ctdb-scripts: Superficial clean-ups to 10.interface
    
    Whitespace and indentation improvements.
    
    Remove comments describing events, since the README covers that much
    better.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Michael Adam <obnox at samba.org>

commit c23744c61db321a911653d58ed258a3003cf798b
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed Dec 16 19:18:49 2015 +1100

    ctdb-scripts: Clarify logic for success of interface monitoring
    
    The current code uses so many shell idioms that it is difficult to
    follow.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Michael Adam <obnox at samba.org>

commit d8e4c5a468286ecc1c38ecd66a3606e84db02373
Author: Martin Schwenke <martin at meltin.net>
Date:   Fri Dec 18 15:33:38 2015 +1100

    ctdb-scripts: Fix regression in updateip code
    
    Regression introduced in commit
    6471541d6d2bc9f2af0ff92b280abbd1d933cf88.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Michael Adam <obnox at samba.org>

commit 18b0aeaae0ba42549a9542b8c08923211e2976ad
Author: Martin Schwenke <martin at meltin.net>
Date:   Tue Dec 1 14:38:48 2015 +1100

    ctdb-ipalloc: Fix a memory leak
    
    Commit cfa0ffe78073f9e3a014bb127fb9a4b7ad95fceb introduced a memory
    leak.  Never assume...
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Michael Adam <obnox at samba.org>

commit 24160ee6a4a0727840d73955b99aef690450f345
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jan 11 13:41:30 2016 +1100

    ctdb-daemon: Don't leak memory if not using recovery lock
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Michael Adam <obnox at samba.org>

commit 56ce230de72dbf14ccddb3f7b26b8b7f16986dfc
Author: Martin Schwenke <martin at meltin.net>
Date:   Mon Jan 11 17:23:12 2016 +1100

    ctdb-recoverd: Fix some uninitialised memory issues
    
    The first element of these structures is a 32-bit PNN.  On 64-bit
    systems this field can be followed by 32-bits of padding.  When the
    structures are copied this can cause uninitialised memory to be
    copied.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Michael Adam <obnox at samba.org>

commit 8f73ae03cc50f26e85b78e35bf22e40eb1ff7684
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu Dec 17 12:27:58 2015 +1100

    ctdb-daemon: Drop the "schedule for deletion" messages to DEBUG level
    
    Thousands of these can be generated each second, rendering INFO level
    debugging useless.
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Reviewed-by: Michael Adam <obnox at samba.org>

-----------------------------------------------------------------------

Summary of changes:
 ctdb/config/events.d/10.interface    | 47 +++++++++++++-----------------------
 ctdb/config/functions                | 16 +++++++-----
 ctdb/server/ctdb_monitor.c           |  1 +
 ctdb/server/ctdb_recover.c           | 14 +++++------
 ctdb/server/ctdb_recoverd.c          |  2 ++
 ctdb/server/ctdb_takeover.c          |  4 +--
 ctdb/server/ctdb_vacuum.c            | 10 ++++----
 ctdb/tests/src/ctdb_takeover_tests.c | 11 +++++++--
 8 files changed, 53 insertions(+), 52 deletions(-)


Changeset truncated at 500 lines:

diff --git a/ctdb/config/events.d/10.interface b/ctdb/config/events.d/10.interface
index 045a1cf..4b9f31c 100755
--- a/ctdb/config/events.d/10.interface
+++ b/ctdb/config/events.d/10.interface
@@ -152,13 +152,16 @@ monitor_interfaces()
 
 	done
 
-	$fail || return 0
-
-	$up_interfaces_found && \
-	    [ "$CTDB_PARTIALLY_ONLINE_INTERFACES" = "yes" ] && \
+	if $fail ; then
+	    if $up_interfaces_found && \
+		    [ "$CTDB_PARTIALLY_ONLINE_INTERFACES" = "yes" ] ; then
+		return 0
+	    else
+		return 1
+	    fi
+	else
 	    return 0
-
-	return 1
+	fi
 }
 
 # Sets: iface, ip, maskbits, family
@@ -191,10 +194,8 @@ get_iface_ip_maskbits_family ()
 
 ctdb_check_args "$@"
 
-case "$1" in 
-     #############################
-     # called when ctdbd starts up
-     init)
+case "$1" in
+    init)
 	# make sure that we only respond to ARP messages from the NIC where
 	# a particular ip address is associated.
 	get_proc sys/net/ipv4/conf/all/arp_filter >/dev/null 2>&1 && {
@@ -210,17 +211,11 @@ case "$1" in
 	drop_all_public_ips
 	;;
 
-     #############################
-     # called after ctdbd has done its initial recovery
-     # and we start the services to become healthy
-     startup)
+    startup)
 	monitor_interfaces
 	;;
 
-
-     ################################################
-     # called when ctdbd wants to claim an IP address
-     takeip)
+    takeip)
 	iface=$2
 	ip=$3
 	maskbits=$4
@@ -239,10 +234,7 @@ case "$1" in
 	flush_route_cache
 	;;
 
-
-     ##################################################
-     # called when ctdbd wants to release an IP address
-     releaseip)
+    releaseip)
 	# releasing an IP is a bit more complex than it seems. Once the IP
 	# is released, any open tcp connections to that IP on this host will end
 	# up being stuck. Some of them (such as NFS connections) will be unkillable
@@ -272,9 +264,7 @@ case "$1" in
 	flush_route_cache
 	;;
 
-     ##################################################
-     # called when ctdbd wants to update an IP address
-     updateip)
+    updateip)
 	# moving an IP is a bit more complex than it seems.
 	# First we drop all traffic on the old interface.
 	# Then we try to add the ip to the new interface and before
@@ -291,7 +281,7 @@ case "$1" in
 	_ip=$4
 	_maskbits=$5
 
-	get_iface_ip_maskbits_family "$_oiface" "$ip" "$maskbits"
+	get_iface_ip_maskbits_family "$_oiface" "$_ip" "$_maskbits"
 	oiface="$iface"
 
 	# we do an extra delete to cope with the script being killed
@@ -317,12 +307,10 @@ case "$1" in
 
 	# tickle all existing connections, so that dropped packets
 	# are retransmited and the tcp streams work
-
 	tickle_tcp_connections $ip
-
 	;;
 
-     monitor)
+    monitor)
 	monitor_interfaces || exit 1
 	;;
     *)
@@ -331,4 +319,3 @@ case "$1" in
 esac
 
 exit 0
-
diff --git a/ctdb/config/functions b/ctdb/config/functions
index eef8f7e..68e53ab 100755
--- a/ctdb/config/functions
+++ b/ctdb/config/functions
@@ -1082,17 +1082,18 @@ update_tickles ()
 	# IPs as a regexp choice
 	_ipschoice="($(echo $_ips | sed -e 's/ /|/g' -e 's/\./\\\\./g'))"
 
-	# Record connections to our public IPs in a temporary file
-	_my_connections="${tickledir}/${_port}.connections"
-	rm -f "$_my_connections"
+	# Record connections to our public IPs in a temporary file.
+	# This temporary file is in CTDB's private state directory and
+	# $$ is used to avoid a very rare race involving CTDB's script
+	# debugging.  No security issue, nothing to see here...
+	_my_connections="${tickledir}/${_port}.connections.$$"
 	netstat -tn |
 	awk -v destpat="^${_ipschoice}:${_port}\$" \
 	  '$1 == "tcp" && $6 == "ESTABLISHED" && $4 ~ destpat {print $5, $4}' |
 	sort >"$_my_connections"
 
 	# Record our current tickles in a temporary file
-	_my_tickles="${tickledir}/${_port}.tickles"
-	rm -f "$_my_tickles"
+	_my_tickles="${tickledir}/${_port}.tickles.$$"
 	for _i in $_ips ; do
 		ctdb -X gettickles $_i $_port |
 		awk -F'|' 'NR > 1 { printf "%s:%s %s:%s\n", $2, $3, $4, $5 }'
@@ -1111,7 +1112,10 @@ update_tickles ()
 		ctdb deltickle $_src $_dst
 	done
 
-	rm -f "$_my_connections" "$_my_tickles" 
+	rm -f "$_my_connections" "$_my_tickles"
+
+	# Remove stale files from killed scripts
+	find "$tickledir" -type f -mmin +10 | xargs -r rm
 }
 
 ########################################################
diff --git a/ctdb/server/ctdb_monitor.c b/ctdb/server/ctdb_monitor.c
index d8eda2a..0a8273a 100644
--- a/ctdb/server/ctdb_monitor.c
+++ b/ctdb/server/ctdb_monitor.c
@@ -134,6 +134,7 @@ static void ctdb_health_callback(struct ctdb_context *ctdb, int status, void *p)
 	c.pnn = ctdb->pnn;
 	c.old_flags = node->flags;
 
+	ZERO_STRUCT(rd);
 	rd.pnn   = ctdb->pnn;
 	rd.srvid = CTDB_SRVID_TAKEOVER_RUN_RESPONSE;
 
diff --git a/ctdb/server/ctdb_recover.c b/ctdb/server/ctdb_recover.c
index 7b34d7e..127ff6b 100644
--- a/ctdb/server/ctdb_recover.c
+++ b/ctdb/server/ctdb_recover.c
@@ -592,13 +592,6 @@ int32_t ctdb_control_set_recmode(struct ctdb_context *ctdb,
 		}
 	}
 
-	state = talloc(ctdb, struct ctdb_set_recmode_state);
-	CTDB_NO_MEMORY(ctdb, state);
-
-	state->start_time = timeval_current();
-	state->fd[0] = -1;
-	state->fd[1] = -1;
-
 	/* release any deferred attach calls from clients */
 	if (recmode == CTDB_RECOVERY_NORMAL) {
 		ctdb_process_deferred_attach(ctdb);
@@ -610,6 +603,13 @@ int32_t ctdb_control_set_recmode(struct ctdb_context *ctdb,
 		return 0;
 	}
 
+	state = talloc(ctdb, struct ctdb_set_recmode_state);
+	CTDB_NO_MEMORY(ctdb, state);
+
+	state->start_time = timeval_current();
+	state->fd[0] = -1;
+	state->fd[1] = -1;
+
 	/* For the rest of what needs to be done, we need to do this in
 	   a child process since 
 	   1, the call to ctdb_recovery_lock() can block if the cluster
diff --git a/ctdb/server/ctdb_recoverd.c b/ctdb/server/ctdb_recoverd.c
index 1d63526..c89649a 100644
--- a/ctdb/server/ctdb_recoverd.c
+++ b/ctdb/server/ctdb_recoverd.c
@@ -1650,6 +1650,7 @@ static bool do_takeover_run(struct ctdb_recoverd *rec,
 	 * wait for replies since a failure here might cause some
 	 * noise in the logs but will not actually cause a problem.
 	 */
+	ZERO_STRUCT(dtr);
 	dtr.srvid = 0; /* No reply */
 	dtr.pnn = -1;
 
@@ -3202,6 +3203,7 @@ static int verify_local_ip_allocation(struct ctdb_context *ctdb, struct ctdb_rec
 
 		DEBUG(DEBUG_CRIT,("Trigger takeoverrun\n"));
 
+		ZERO_STRUCT(rd);
 		rd.pnn = ctdb->pnn;
 		rd.srvid = 0;
 		data.dptr = (uint8_t *)&rd;
diff --git a/ctdb/server/ctdb_takeover.c b/ctdb/server/ctdb_takeover.c
index 8de1666..227bd16 100644
--- a/ctdb/server/ctdb_takeover.c
+++ b/ctdb/server/ctdb_takeover.c
@@ -1434,7 +1434,7 @@ static int ctdb_reload_remote_public_ips(struct ctdb_context *ctdb,
 		ret = ctdb_ctrl_get_public_ips_flags(ctdb,
 					TAKEOVER_TIMEOUT(),
 					j,
-					ctdb->nodes,
+					ipalloc_state->known_public_ips,
 					0,
 					&ipalloc_state->known_public_ips[j]);
 		if (ret != 0) {
@@ -1454,7 +1454,7 @@ static int ctdb_reload_remote_public_ips(struct ctdb_context *ctdb,
 		ret = ctdb_ctrl_get_public_ips_flags(ctdb,
 					TAKEOVER_TIMEOUT(),
 					j,
-					ctdb->nodes,
+					ipalloc_state->available_public_ips,
 					CTDB_PUBLIC_IP_FLAGS_ONLY_AVAILABLE,
 					&ipalloc_state->available_public_ips[j]);
 		if (ret != 0) {
diff --git a/ctdb/server/ctdb_vacuum.c b/ctdb/server/ctdb_vacuum.c
index 54dfe99..f536129 100644
--- a/ctdb/server/ctdb_vacuum.c
+++ b/ctdb/server/ctdb_vacuum.c
@@ -1659,11 +1659,11 @@ static int insert_record_into_delete_queue(struct ctdb_db_context *ctdb_db,
 
 	hash = (uint32_t)ctdb_hash(&key);
 
-	DEBUG(DEBUG_INFO, (__location__ " schedule for deletion: db[%s] "
-			   "db_id[0x%08x] "
-			   "key_hash[0x%08x] "
-			   "lmaster[%u] "
-			   "migrated_with_data[%s]\n",
+	DEBUG(DEBUG_DEBUG, (__location__ " schedule for deletion: db[%s] "
+			    "db_id[0x%08x] "
+			    "key_hash[0x%08x] "
+			    "lmaster[%u] "
+			    "migrated_with_data[%s]\n",
 			    ctdb_db->db_name, ctdb_db->db_id,
 			    hash,
 			    ctdb_lmaster(ctdb_db->ctdb, &key),
diff --git a/ctdb/tests/src/ctdb_takeover_tests.c b/ctdb/tests/src/ctdb_takeover_tests.c
index 6797de4..bbd004e 100644
--- a/ctdb/tests/src/ctdb_takeover_tests.c
+++ b/ctdb/tests/src/ctdb_takeover_tests.c
@@ -196,7 +196,12 @@ read_ctdb_public_ip_info(TALLOC_CTX *ctx  ,
 		while (t != NULL) {
 			n = (int) strtol(t, (char **) NULL, 10);
 			if ((*known)[n] == NULL) {
-				(*known)[n] = talloc_array(ctx, struct ctdb_public_ip_list_old, CTDB_TEST_MAX_IPS);
+				/* Array size here has to be
+				 * CTDB_TEST_MAX_IPS because total
+				 * number of IPs isn't yet known */
+				(*known)[n] = talloc_size(ctx,
+							  offsetof(struct ctdb_public_ip_list_old, ips) +
+							  CTDB_TEST_MAX_IPS * sizeof(struct ctdb_public_ip));
 				(*known)[n]->num = 0;
 			}
 			curr = (*known)[n]->num;
@@ -210,7 +215,9 @@ read_ctdb_public_ip_info(TALLOC_CTX *ctx  ,
 	}
 
 	/* Build list of all allowed IPs */
-	a = talloc_array(ctx, struct ctdb_public_ip_list_old, CTDB_TEST_MAX_IPS);
+	a = talloc_size(ctx,
+			offsetof(struct ctdb_public_ip_list_old, ips) +
+			numips * sizeof(struct ctdb_public_ip));
 	a->num = numips;
 	for (ta = *all_ips, i=0; ta != NULL && i < numips ; ta = ta->next, i++) {
 		a->ips[i].pnn = ta->pnn;


-- 
Samba Shared Repository