[SCM] CTDB repository - branch 1.2.39 updated - ctdb-1.2.39-8-7-g25ffee1

Amitay Isaacs amitay at samba.org
Wed May 8 01:50:42 MDT 2013


The branch, 1.2.39 has been updated
       via  25ffee12bd06df7c4a46862bb7667d41e42dde7f (commit)
       via  8bca8e1e63c70b959fdee5f680024249b688cc50 (commit)
       via  d22ac9f3f79ea3368fdb49babed2c2252e0d45a5 (commit)
       via  3a01839e650e47663f22a731486b9297030a6524 (commit)
       via  efb72d75201a22905e12cb3b92240054ad36068c (commit)
       via  60f6beca9eb5a2fb410419f555d92434f862a026 (commit)
       via  f5a511b72689557db75353eb5bd6c1ea5724f799 (commit)
      from  a1c04ba95aa21837b8388f91480122e3349d3e92 (commit)

http://gitweb.samba.org/?p=ctdb.git;a=shortlog;h=1.2.39


- Log -----------------------------------------------------------------
commit 25ffee12bd06df7c4a46862bb7667d41e42dde7f
Author: Martin Schwenke <martin at meltin.net>
Date:   Wed May 8 13:06:12 2013 +1000

    New version 1.2.39-9
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>

commit 8bca8e1e63c70b959fdee5f680024249b688cc50
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Dec 5 11:38:42 2012 +1100

    scripts: Add helper script to log locking information using /proc/locks
    
    This finds any processes locking tdb databases used by CTDB and logs
    stack trace for each process.
    
    Includes this fix from 1.2.40 branch:
    
      scripts: Fix the variable name for sed expressions
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit 0e99ca1cdf28a7043554afb78bd439f727ab4f95)
    (cherry picked from commit 9fbd13ea7d3da5e297827e7763f336f484262f47)
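
The /proc/locks parsing that debug_locks.sh does with grep and awk can be illustrated in C. This is a minimal sketch, not code from this commit: `struct lock_entry` and `parse_lock_line()` are hypothetical names, and the line layout is assumed to be the one described in proc(5), with an optional "->" marker on blocked waiters.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one POSIX advisory write lock. */
struct lock_entry {
	int blocked;		/* 1 if the line describes a blocked waiter ("->") */
	long pid;		/* process holding or waiting for the lock */
	unsigned long inode;	/* inode of the locked file */
};

/*
 * Parse one /proc/locks line, assumed to look like:
 *   "1: POSIX  ADVISORY  WRITE 1234 08:01:5678 0 EOF"
 *   "1: -> POSIX  ADVISORY  WRITE 1235 08:01:5678 0 EOF"
 * Returns 0 and fills *out for POSIX advisory write locks, -1 otherwise.
 */
static int parse_lock_line(const char *line, struct lock_entry *out)
{
	char cls[16], kind[16], mode[16];
	unsigned int maj, min;
	const char *p;

	assert(line != NULL && out != NULL);

	/* Skip the leading "<id>:" and any spaces after it. */
	p = strchr(line, ':');
	if (p == NULL) {
		return -1;
	}
	p++;
	p += strspn(p, " ");

	out->blocked = 0;
	if (strncmp(p, "->", 2) == 0) {
		out->blocked = 1;
		p += 2;
	}

	/* Device major and minor are printed in hex, the inode in decimal. */
	if (sscanf(p, "%15s %15s %15s %ld %x:%x:%lu",
		   cls, kind, mode, &out->pid, &maj, &min, &out->inode) != 7) {
		return -1;
	}
	if (strcmp(cls, "POSIX") != 0 || strcmp(kind, "ADVISORY") != 0 ||
	    strcmp(mode, "WRITE") != 0) {
		return -1;
	}
	return 0;
}
```

A caller would read /proc/locks line by line, keep the entries this function accepts, and map each inode back to a tdb file name, which is what the script's generated sed expression does.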

commit d22ac9f3f79ea3368fdb49babed2c2252e0d45a5
Author: Amitay Isaacs <amitay at gmail.com>
Date:   Wed Dec 5 11:37:26 2012 +1100

    daemon: Run an external script if freeze locks were not obtained during recovery
    
    If the freeze child is already created in ctdb_start_freeze(), then it indicates
    that the child process has not yet obtained the locks.  This may be because
    another process has locked the databases and has not yet released the locks.
    
    In this case, invoke a helper script defined by environmental variable
    CTDB_DEBUG_LOCKS, to log information about locks.
    
    Signed-off-by: Amitay Isaacs <amitay at gmail.com>
    (cherry picked from commit e80b2c15bf8c8fb5c3793acfebbe09d3cdd617b7)
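
The hook described in this commit message (run the program named by CTDB_DEBUG_LOCKS in a forked child) can be sketched as below. This is an illustration under assumptions, not the function added to server/ctdb_freeze.c: `run_env_helper()` is a hypothetical name, and unlike the committed code the sketch calls `_exit()` if `execl()` fails, so a failed exec cannot fall through into the caller's code path in the child.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>	/* callers reap the child with waitpid() */
#include <unistd.h>

/*
 * Hypothetical helper: run the program named by environment variable
 * `var`, with no arguments, in a child process.
 * Returns the child's pid, 0 if the variable is unset, -1 if fork() fails.
 */
static pid_t run_env_helper(const char *var)
{
	const char *cmd;
	pid_t pid;

	assert(var != NULL);

	cmd = getenv(var);
	if (cmd == NULL) {
		return 0;	/* hook not configured, nothing to do */
	}

	pid = fork();
	if (pid == 0) {
		/* Child: replace the process image with the helper. */
		execl(cmd, cmd, (char *)NULL);
		/* Only reached if exec failed; exit without running
		 * the parent's atexit handlers. */
		_exit(127);
	}
	return pid;	/* parent: child pid, or -1 on fork failure */
}
```

Returning the pid leaves it to the caller to decide whether to wait for the helper or let it run in the background.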

commit 3a01839e650e47663f22a731486b9297030a6524
Author: Martin Schwenke <martin at meltin.net>
Date:   Thu May 17 10:17:51 2012 +1000

    ctdbd: Backport use of external script to debug hung eventscript
    
    This is a cherry-pick from 6e68797af67bee36f2bad045f94806e7e98f27e9,
    combined with several recent fixes:
    
      8507303b525d20c74e8ec4e7c4f5f275945cd3b6
        scripts: debug-hung-script.sh doesn't need functions/loadconfig
      501461cc3e132d4adee9e91b5d4513a26bae2846
        ctdbd: Remove debug_hung_script_ctx
      0581f9a84e58764d194f4e04064c2c5b393c348b
        ctdbd: Remove command-line option --debug-hung-script
      3400b2ed34b6eb9496eb55f1aab6f89d2952060d
        ctdbd: Complain loudly if CTDB_DEBUG_HUNG_SCRIPT script isn't executable
      9b0d56b16775aa16f33bdfdf831256e085fa3339
        ctdbd: Don't use a fixed length buffer for the hung script command
    
    Signed-off-by: Martin Schwenke <martin at meltin.net>
    Cherry-pick-from: b86270fae7fd9f8a7a718e15d8c7436a918f28c4
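
Two of the fixes listed above concern how the debug command is built: the script path comes from CTDB_DEBUG_HUNG_SCRIPT when set, and the command string is no longer limited by a fixed-length buffer. A minimal sketch of that idea follows. It uses plain `snprintf()` and `malloc()` rather than the `talloc_asprintf()` call in the actual patch, `hung_script_command()` is a hypothetical name, and the default path assumes ETCDIR is /etc.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Assumed default; the patch uses ETCDIR "/ctdb/debug-hung-script.sh". */
#define DEFAULT_DEBUG_SCRIPT "/etc/ctdb/debug-hung-script.sh"

/*
 * Hypothetical helper: build "<script> <pid>" in a buffer sized to fit.
 * An unset or empty CTDB_DEBUG_HUNG_SCRIPT selects the default script.
 * Returns a malloc'd string the caller must free, or NULL on failure.
 */
static char *hung_script_command(pid_t child)
{
	const char *script = getenv("CTDB_DEBUG_HUNG_SCRIPT");
	char *buf;
	int len;

	assert(child > 0);

	if (script == NULL || strlen(script) == 0) {
		script = DEFAULT_DEBUG_SCRIPT;
	}

	/* First pass measures the formatted length, second pass writes it. */
	len = snprintf(NULL, 0, "%s %ld", script, (long)child);
	if (len < 0) {
		return NULL;
	}
	buf = malloc((size_t)len + 1);
	if (buf == NULL) {
		return NULL;
	}
	snprintf(buf, (size_t)len + 1, "%s %ld", script, (long)child);
	return buf;
}
```

The resulting string is what a caller would hand to `system()`, as the patched eventscript.c does with its own buffer.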

commit efb72d75201a22905e12cb3b92240054ad36068c
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Wed Jun 20 15:10:05 2012 +1000

    When we find an ip we shouldn't host, just release it
    
    Don't call a full-blown cluster-wide ipreallocation, just release it locally
    
    (cherry picked from commit 9a806dec8687e2ec08a308853b61af6aed5e5d1e)
    
    Conflicts:
    	server/ctdb_recoverd.c

commit 60f6beca9eb5a2fb410419f555d92434f862a026
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Wed Jun 20 10:08:11 2012 +1000

    When we release an ip, get the interface name from the kernel
    
    instead of using the interface where ctdb thinks the ip is hosted.
    This lets us handle cases where we want to release an ip but ctdbd
    does not know which interface the ip is assigned to (for example, the
    user has run 'ip addr add...' and manually assigned the ip to the
    wrong interface).
    
    (cherry picked from commit c6bf22ba5c01001b7febed73dd16a03bd3fd2bed)
    
    Conflicts:
    	server/ctdb_takeover.c
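
Asking the kernel which interface holds an address, as this commit does, can be illustrated as follows. The sketch swaps in `getifaddrs()` for the SIOCGIFCONF ioctl loop used by `ctdb_sys_find_ifname()` in the patch, handles IPv4 only, and uses a hypothetical name, `find_ifname_v4()`. It assumes a platform that provides `getifaddrs()` (glibc, the BSDs).

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <arpa/inet.h>
#include <ifaddrs.h>
#include <netinet/in.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: return a malloc'd copy of the name of the
 * interface that currently holds IPv4 address `ip` (dotted quad).
 * Returns NULL if the string is malformed, no interface holds the
 * address, or the interface list cannot be read. Caller frees.
 */
static char *find_ifname_v4(const char *ip)
{
	struct in_addr want;
	struct ifaddrs *list, *ifa;
	char *name = NULL;

	assert(ip != NULL);

	if (inet_pton(AF_INET, ip, &want) != 1) {
		return NULL;
	}
	if (getifaddrs(&list) != 0) {
		return NULL;
	}

	for (ifa = list; ifa != NULL; ifa = ifa->ifa_next) {
		const struct sockaddr_in *sin;

		/* Entries without an address or of another family are skipped. */
		if (ifa->ifa_addr == NULL ||
		    ifa->ifa_addr->sa_family != AF_INET) {
			continue;
		}
		sin = (const struct sockaddr_in *)ifa->ifa_addr;
		if (sin->sin_addr.s_addr == want.s_addr) {
			name = strdup(ifa->ifa_name);
			break;
		}
	}

	freeifaddrs(list);
	return name;
}
```

Because the result is allocated, the caller must check it for NULL before printing it and free it afterwards.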

commit f5a511b72689557db75353eb5bd6c1ea5724f799
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date:   Wed Jun 20 13:32:02 2012 +1000

    Add new command to find which interface an ip address is located on
    
    (cherry picked from commit f07376309e70f5ccdb7de8453caacc71b451ab48)
    
    Conflicts:
    	common/system_common.c
    	include/ctdb_private.h
    	tools/ctdb.c

-----------------------------------------------------------------------

Summary of changes:
 Makefile.in                 |    2 +
 common/system_common.c      |   84 +++++++++++++++++++++++++++++++++++++++++++
 config/ctdb.init            |    7 ++++
 config/ctdb.sysconfig       |    3 ++
 config/debug-hung-script.sh |    4 ++
 config/debug_locks.sh       |   38 +++++++++++++++++++
 include/ctdb_private.h      |    1 +
 packaging/RPM/ctdb.spec.in  |    8 ++++-
 server/ctdb_freeze.c        |   24 ++++++++++++
 server/ctdb_recoverd.c      |    8 +++-
 server/ctdb_takeover.c      |   17 +++++----
 server/eventscript.c        |   48 +++++++++++++++----------
 tools/ctdb.c                |   22 +++++++++++
 13 files changed, 236 insertions(+), 30 deletions(-)
 create mode 100644 config/debug-hung-script.sh
 create mode 100755 config/debug_locks.sh


Changeset truncated at 500 lines:

diff --git a/Makefile.in b/Makefile.in
index 5fa9e98..298bd7e 100755
--- a/Makefile.in
+++ b/Makefile.in
@@ -246,6 +246,7 @@ install: all
 	${INSTALLCMD} -m 644 config/functions $(DESTDIR)$(etcdir)/ctdb
 	${INSTALLCMD} -m 755 config/statd-callout $(DESTDIR)$(etcdir)/ctdb
 	${INSTALLCMD} -m 755 config/interface_modify.sh $(DESTDIR)$(etcdir)/ctdb
+	${INSTALLCMD} -m 755 config/debug_locks.sh $(DESTDIR)$(etcdir)/ctdb
 	${INSTALLCMD} -m 644 config/events.d/README $(DESTDIR)$(docdir)/ctdb/README.eventscripts
 	${INSTALLCMD} -m 644 doc/recovery-process.txt $(DESTDIR)$(docdir)/ctdb/recovery-process.txt
 	${INSTALLCMD} -m 755 config/events.d/00.ctdb $(DESTDIR)$(etcdir)/ctdb/events.d
@@ -272,6 +273,7 @@ install: all
 	if [ -f doc/onnode.1 ];then ${INSTALLCMD} -m 644 doc/onnode.1 $(DESTDIR)$(mandir)/man1; fi
 	if [ -f doc/ltdbtool.1 ]; then ${INSTALLCMD} -m 644 doc/ltdbtool.1 $(DESTDIR)$(mandir)/man1; fi
 	if [ ! -f $(DESTDIR)$(etcdir)/ctdb/notify.sh ];then ${INSTALLCMD} -m 755 config/notify.sh $(DESTDIR)$(etcdir)/ctdb; fi
+	if [ ! -f $(DESTDIR)$(etcdir)/ctdb/debug-hung-script.sh ];then ${INSTALLCMD} -m 755 config/debug-hung-script.sh $(DESTDIR)$(etcdir)/ctdb; fi
 	if [ ! -f $(DESTDIR)$(etcdir)/ctdb/ctdb-crash-cleanup.sh ];then ${INSTALLCMD} -m 755 config/ctdb-crash-cleanup.sh $(DESTDIR)$(etcdir)/ctdb; fi
 
 test: all
diff --git a/common/system_common.c b/common/system_common.c
index f28045f..6ee615f 100644
--- a/common/system_common.c
+++ b/common/system_common.c
@@ -73,3 +73,87 @@ bool ctdb_sys_have_ip(ctdb_sock_addr *_addr)
 	close(s);
 	return ret == 0;
 }
+
+
+/* find which interface an ip address is currently assigned to */
+char *ctdb_sys_find_ifname(ctdb_sock_addr *addr)
+{
+	int s;
+	int size;
+	struct ifconf ifc;
+	char *ptr;
+
+	s = socket(AF_INET, SOCK_RAW, htons(IPPROTO_RAW));
+	if (s == -1) {
+		DEBUG(DEBUG_CRIT,(__location__ " failed to open raw socket (%s)\n",
+			 strerror(errno)));
+		return NULL;
+	}
+
+
+	size = sizeof(struct ifreq);
+	ifc.ifc_buf = NULL;
+	ifc.ifc_len = size;
+
+	while(ifc.ifc_len > (size - sizeof(struct ifreq))) {
+		size *= 2;
+
+		free(ifc.ifc_buf);	
+		ifc.ifc_len = size;
+		ifc.ifc_buf = malloc(size);
+		memset(ifc.ifc_buf, 0, size);
+		if (ioctl(s, SIOCGIFCONF, (caddr_t)&ifc) < 0) {
+			DEBUG(DEBUG_CRIT,("Failed to read ifc buffer from socket\n"));
+			free(ifc.ifc_buf);	
+			close(s);
+			return NULL;
+		}
+	}
+
+	for (ptr =(char *)ifc.ifc_buf; ptr < ((char *)ifc.ifc_buf) + ifc.ifc_len; ) {
+		char *ifname;
+		struct ifreq *ifr;
+
+		ifr = (struct ifreq *)ptr;
+
+#ifdef HAVE_SOCKADDR_LEN
+		if (ifr->ifr_addr.sa_len > sizeof(struct sockaddr)) {
+			ptr += sizeof(ifr->ifr_name) + ifr->ifr_addr.sa_len;
+		} else {
+			ptr += sizeof(ifr->ifr_name) + sizeof(struct sockaddr);
+		}
+#else
+		ptr += sizeof(struct ifreq);
+#endif
+
+		if (ifr->ifr_addr.sa_family != addr->sa.sa_family) {
+			continue;
+		}
+
+		switch (addr->sa.sa_family) {
+		case AF_INET:
+
+
+			if (memcmp(&addr->ip.sin_addr, &((struct sockaddr_in *)&ifr->ifr_addr)->sin_addr, sizeof(addr->ip.sin_addr))) {
+				continue;
+			}
+			break;
+		case AF_INET6:
+			if (memcmp(&addr->ip6.sin6_addr, &((struct sockaddr_in6 *)&ifr->ifr_addr)->sin6_addr, sizeof(addr->ip6.sin6_addr))) {
+				continue;
+			}
+			break;
+		}
+
+		ifname = strdup(ifr->ifr_name);
+		free(ifc.ifc_buf);	
+		close(s);
+		return ifname;
+	}
+
+
+	free(ifc.ifc_buf);	
+	close(s);
+
+	return NULL;
+}
diff --git a/config/ctdb.init b/config/ctdb.init
index d6493bd..2b9902b 100755
--- a/config/ctdb.init
+++ b/config/ctdb.init
@@ -111,6 +111,11 @@ build_ctdb_options () {
     maybe_set "--max-persistent-check-errors" "$CTDB_MAX_PERSISTENT_CHECK_ERRORS"
 }
 
+export_debug_variables ()
+{
+    export CTDB_DEBUG_HUNG_SCRIPT
+}
+
 check_tdb () {
 	local PDBASE=$1
 
@@ -239,6 +244,8 @@ start() {
 
     build_ctdb_options
 
+    export_debug_variables
+
     # make sure we drop any ips that might still be held if previous
     # instance of ctdb got killed with -9 or similar
     drop_all_public_ips
diff --git a/config/ctdb.sysconfig b/config/ctdb.sysconfig
index 1f2edc4..08a550f 100644
--- a/config/ctdb.sysconfig
+++ b/config/ctdb.sysconfig
@@ -92,6 +92,9 @@ CTDB_RECOVERY_LOCK="/some/place/on/shared/storage"
 # a script to run when node health changes
 # CTDB_NOTIFY_SCRIPT=/etc/ctdb/notify.sh
 
+# a script to collect data when an eventscript has hung
+# CTDB_DEBUG_HUNG_SCRIPT=/etc/ctdb/debug-hung-script.sh
+
 # the directory to put the local ctdb database files in
 # defaults to /var/ctdb
 # CTDB_DBDIR=/var/ctdb
diff --git a/config/debug-hung-script.sh b/config/debug-hung-script.sh
new file mode 100644
index 0000000..dcf68ba
--- /dev/null
+++ b/config/debug-hung-script.sh
@@ -0,0 +1,4 @@
+#!/bin/sh
+
+echo "Pstree output for the hung script:"
+pstree -p -a $1
diff --git a/config/debug_locks.sh b/config/debug_locks.sh
new file mode 100755
index 0000000..91cb405
--- /dev/null
+++ b/config/debug_locks.sh
@@ -0,0 +1,38 @@
+#!/bin/sh
+
+# Create sed expression to convert inodes to names
+sed_cmd=$( ls -li /var/ctdb/*.tdb.* /var/ctdb/persistent/*.tdb.* |
+	   sed -e "s#/var/ctdb[/persistent]*/\(.*\)#\1#" |
+	   awk '{printf "s#[0-9]*:[0-9]*:%s #%s #\n", $1, $10}' )
+
+# Parse /proc/locks and extract following information
+#    pid process_name tdb_name offsets [W]
+out=$( cat /proc/locks |
+    grep -F "POSIX  ADVISORY  WRITE" |
+    awk '{ if($2 == "->") { print $6, $7, $8, $9, "W" } else { print $5, $6, $7, $8 } }' |
+    while read pid rest ; do
+	pname=$(readlink /proc/$pid/exe)
+	echo $pid $pname $rest
+    done | sed -e "$sed_cmd" | grep "\.tdb" )
+
+if [ -n "$out" ]; then
+    # Log information about locks
+    echo "$out" | logger -t "debug-lock"
+
+    # Find processes that are waiting for locks
+    dbs=$(echo "$out" | grep "W$" | awk '{print $3}')
+    all_pids=""
+    for db in $dbs ; do
+	pids=$(echo "$out" | grep -v "W$" | grep "$db" | grep -v ctdbd | awk '{print $1}')
+	all_pids="$all_pids $pids"
+    done
+    pids=$(echo $all_pids | tr " " "\n" | sort -u)
+
+    # For each process holding a contended lock, log stack trace
+    for pid in $pids ; do
+	gstack $pid | logger -t "debug-lock $pid"
+#	gcore -o /var/log/core-deadlock-ctdb $pid
+    done
+fi
+
+exit 0
diff --git a/include/ctdb_private.h b/include/ctdb_private.h
index 32e8c68..5ff57b6 100644
--- a/include/ctdb_private.h
+++ b/include/ctdb_private.h
@@ -1114,6 +1114,7 @@ int ctdb_ctrl_set_iface_link(struct ctdb_context *ctdb,
 uint32_t uint16_checksum(uint16_t *data, size_t n);
 int ctdb_sys_send_arp(const ctdb_sock_addr *addr, const char *iface);
 bool ctdb_sys_have_ip(ctdb_sock_addr *addr);
+char *ctdb_sys_find_ifname(ctdb_sock_addr *addr);
 int ctdb_sys_send_tcp(const ctdb_sock_addr *dest, 
 		      const ctdb_sock_addr *src,
 		      uint32_t seq, uint32_t ack, int rst);
diff --git a/packaging/RPM/ctdb.spec.in b/packaging/RPM/ctdb.spec.in
index 1d9781d..2873bb9 100644
--- a/packaging/RPM/ctdb.spec.in
+++ b/packaging/RPM/ctdb.spec.in
@@ -4,7 +4,7 @@ Summary: Clustered TDB
 Vendor: Samba Team
 Packager: Samba Team <samba at samba.org>
 Version: 1.2.39
-Release: 8GITHASH
+Release: 9GITHASH
 Epoch: 0
 License: GNU GPL version 3
 Group: System Environment/Daemons
@@ -88,6 +88,7 @@ rm -rf $RPM_BUILD_ROOT
 
 %config(noreplace) %{_sysconfdir}/sysconfig/ctdb
 %config(noreplace) %{_sysconfdir}/ctdb/notify.sh
+%config(noreplace) %{_sysconfdir}/ctdb/debug-hung-script.sh
 %config(noreplace) %{_sysconfdir}/ctdb/ctdb-crash-cleanup.sh
 %config(noreplace) %{_sysconfdir}/ctdb/functions
 %attr(755,root,root) %{initdir}/ctdb
@@ -112,6 +113,7 @@ rm -rf $RPM_BUILD_ROOT
 %{_sysconfdir}/ctdb/events.d/91.lvs
 %{_sysconfdir}/ctdb/statd-callout
 %{_sysconfdir}/ctdb/interface_modify.sh
+%{_sysconfdir}/ctdb/debug_locks.sh
 %{_sbindir}/ctdbd
 %{_bindir}/ctdb
 %{_bindir}/smnotify
@@ -144,6 +146,10 @@ development libraries for ctdb
 %{_libdir}/libctdb.a
 
 %changelog
+* Wed May 8 2013 : version 1.2.39-9
+  - Robust removal of rogue public IPs
+  - Backport use of external script to debug hung eventscript
+  - Call external script to debug locking problems during recovery
 * Wed Feb 20 2013 : version 1.2.39-8
   - Don't send "ipreallocate" events to stopped nodes
 * Fri Nov 30 2012 : version 1.2.39-7
diff --git a/server/ctdb_freeze.c b/server/ctdb_freeze.c
index 0f70fd3..f422e6d 100644
--- a/server/ctdb_freeze.c
+++ b/server/ctdb_freeze.c
@@ -256,6 +256,26 @@ static int ctdb_freeze_waiter_destructor(struct ctdb_freeze_waiter *w)
 }
 
 /*
+ * Run an external script to check if there is a deadlock situation
+ */
+static void ctdb_debug_locks(void)
+{
+	const char *cmd = getenv("CTDB_DEBUG_LOCKS");
+	int pid;
+
+	if (cmd == NULL) {
+		return;
+	}
+
+	pid = fork();
+
+	/* Execute only in child process */
+	if (pid == 0) {
+		execl(cmd, cmd, NULL);
+	}
+}
+
+/*
   start the freeze process for a certain priority
  */
 int ctdb_start_freeze(struct ctdb_context *ctdb, uint32_t priority)
@@ -283,6 +303,10 @@ int ctdb_start_freeze(struct ctdb_context *ctdb, uint32_t priority)
 		ctdb->freeze_handles[priority] = ctdb_freeze_lock(ctdb, priority);
 		CTDB_NO_MEMORY(ctdb, ctdb->freeze_handles[priority]);
 		ctdb->freeze_mode[priority] = CTDB_FREEZE_PENDING;
+	} else {
+		/* The previous freeze lock child has not yet been able to get locks.
+		 * Invoke debugging script */
+		ctdb_debug_locks();
 	}
 
 	return 0;
diff --git a/server/ctdb_recoverd.c b/server/ctdb_recoverd.c
index 336a9a7..8b1b517 100644
--- a/server/ctdb_recoverd.c
+++ b/server/ctdb_recoverd.c
@@ -2621,9 +2621,13 @@ static int verify_local_ip_allocation(struct ctdb_context *ctdb, struct ctdb_rec
 				}
 			} else {
 				if (ctdb_sys_have_ip(&ips->ips[j].addr)) {
-					DEBUG(DEBUG_CRIT,("We are still serving a public address '%s' that we should not be serving.\n", 
+
+					DEBUG(DEBUG_CRIT,("We are still serving a public address '%s' that we should not be serving. Removing it.\n", 
 						ctdb_addr_to_str(&ips->ips[j].addr)));
-					need_takeover_run = true;
+
+					if (ctdb_ctrl_release_ip(ctdb, CONTROL_TIMEOUT(), CTDB_CURRENT_NODE, &ips->ips[j]) != 0) {
+						DEBUG(DEBUG_ERR,("Failed to release local ip address\n"));
+					}
 				}
 			}
 		}
diff --git a/server/ctdb_takeover.c b/server/ctdb_takeover.c
index 1e8dc75..affc3ce 100644
--- a/server/ctdb_takeover.c
+++ b/server/ctdb_takeover.c
@@ -784,6 +784,7 @@ int32_t ctdb_control_release_ip(struct ctdb_context *ctdb,
 	struct takeover_callback_state *state;
 	struct ctdb_public_ip *pip = (struct ctdb_public_ip *)indata.dptr;
 	struct ctdb_vnn *vnn;
+	char *iface;
 
 	/* update our vnn list */
 	vnn = find_public_ip_vnn(ctdb, &pip->addr);
@@ -801,23 +802,22 @@ int32_t ctdb_control_release_ip(struct ctdb_context *ctdb,
 	if (!ctdb_sys_have_ip(&pip->addr)) {
 		DEBUG(DEBUG_DEBUG,("Redundant release of IP %s/%u on interface %s (ip not held)\n", 
 			ctdb_addr_to_str(&pip->addr),
-			vnn->public_netmask_bits, 
+			vnn->public_netmask_bits,
 			ctdb_vnn_iface_string(vnn)));
 		ctdb_vnn_unassign_iface(ctdb, vnn);
 		return 0;
 	}
 
-	if (vnn->iface == NULL) {
-		DEBUG(DEBUG_ERR,(__location__ " release_ip of IP %s is known to the kernel, "
-				 "but we have no interface assigned, has someone manually configured it? Ignore for now.\n",
-				 ctdb_addr_to_str(&vnn->public_address)));
+	iface = ctdb_sys_find_ifname(&pip->addr);
+	if (iface == NULL) {
+		DEBUG(DEBUG_ERR, ("Could not find which interface the ip address is hosted on. can not release it\n"));
 		return 0;
 	}
 
 	DEBUG(DEBUG_NOTICE,("Release of IP %s/%u on interface %s  node:%d\n",
 		ctdb_addr_to_str(&pip->addr),
-		vnn->public_netmask_bits, 
-		ctdb_vnn_iface_string(vnn),
+		vnn->public_netmask_bits,
+		iface,
 		pip->pnn));
 
 	state = talloc(ctdb, struct takeover_callback_state);
@@ -834,9 +834,10 @@ int32_t ctdb_control_release_ip(struct ctdb_context *ctdb,
 					 false,
 					 CTDB_EVENT_RELEASE_IP,
 					 "%s %s %u",
-					 ctdb_vnn_iface_string(vnn),
+					 iface,
 					 ctdb_addr_to_str(&pip->addr),
 					 vnn->public_netmask_bits);
+	free(iface);
 	if (ret != 0) {
 		DEBUG(DEBUG_ERR,(__location__ " Failed to release IP %s on interface %s\n",
 			ctdb_addr_to_str(&pip->addr),
diff --git a/server/eventscript.c b/server/eventscript.c
index a1bcf01..05fb37d 100644
--- a/server/eventscript.c
+++ b/server/eventscript.c
@@ -504,15 +504,14 @@ static void ctdb_event_script_handler(struct event_context *ev, struct fd_event
 	}
 }
 
-static void debug_timeout(struct ctdb_event_script_state *state)
+static void ctdb_run_debug_hung_script(struct ctdb_context *ctdb, struct ctdb_event_script_state *state)
 {
 	struct ctdb_script_wire *current = get_current_script(state);
 	char *cmd;
 	pid_t pid;
-	time_t t;
-	char tbuf[100], buf[200];
+	const char * debug_hung_script = ETCDIR "/ctdb/debug-hung-script.sh";
 
-	cmd = child_command_string(state->ctdb, state,
+	cmd = child_command_string(ctdb, state,
 				   state->from_user, current->name,
 				   state->call, state->options);
 	CTDB_NO_MEMORY_VOID(state->ctdb, cmd);
@@ -521,26 +520,36 @@ static void debug_timeout(struct ctdb_event_script_state *state)
 			 cmd, timeval_elapsed(&current->start), state->child));
 	talloc_free(cmd);
 
-	t = time(NULL);
-	strftime(tbuf, sizeof(tbuf)-1, "%Y%m%d%H%M%S", 	localtime(&t));
-	sprintf(buf, "{ pstree -p; cat /proc/locks; ls -li /var/ctdb/ /var/ctdb/persistent; }"
-			" >/tmp/ctdb.event.%s.%d", tbuf, getpid());
-
-	pid = ctdb_fork(state->ctdb);
-	if (pid == 0) {
-		system(buf);
-		/* Now we can kill the child */
+	if (!ctdb_fork_with_logging(ctdb, ctdb, NULL, NULL, &pid)) {
+		DEBUG(DEBUG_ERR,("Failed to fork a child process with logging to track hung event script\n"));
 		kill(state->child, SIGTERM);
-		exit(0);
+		return;
 	}
 	if (pid == -1) {
 		DEBUG(DEBUG_ERR,("Fork for debug script failed : %s\n",
 				 strerror(errno)));
-	} else {
-		DEBUG(DEBUG_ERR,("Logged timedout eventscript : %s\n", buf));
-		/* Don't kill child until timeout done. */
-		state->child = 0;
+		kill(state->child, SIGTERM);
+		return;
 	}
+	if (pid == 0) {
+		char *buf;
+
+		if (getenv("CTDB_DEBUG_HUNG_SCRIPT") != NULL) {
+			debug_hung_script = getenv("CTDB_DEBUG_HUNG_SCRIPT");
+		}
+
+		buf = talloc_asprintf(NULL, "%s %d",
+				      debug_hung_script, state->child);
+		system(buf);
+		talloc_free(buf);
+
+		/* Now we can kill the child */
+		kill(state->child, SIGTERM);
+		_exit(0);
+	}
+
+	/* Don't kill child until timeout done. */
+	state->child = 0;
 }
 
 /* called when child times out */
@@ -564,10 +573,11 @@ static void ctdb_event_script_timeout(struct event_context *ev, struct timed_eve
 	case CTDB_EVENT_STATUS:
 		state->scripts->scripts[state->current].status = 0;
 		DEBUG(DEBUG_ERR,("Ignoring hung script for %s call %d\n", state->options, state->call));
+		ctdb_run_debug_hung_script(ctdb, state);
 		break;
         default:
 		state->scripts->scripts[state->current].status = -ETIME;
-		debug_timeout(state);
+		ctdb_run_debug_hung_script(ctdb, state);
 	}
 
 	talloc_free(state);
diff --git a/tools/ctdb.c b/tools/ctdb.c
index d49bc8f..69dfc70 100644
--- a/tools/ctdb.c
+++ b/tools/ctdb.c
@@ -1772,6 +1772,27 @@ static int control_addip(struct ctdb_context *ctdb, int argc, const char **argv)
 	return 0;
 }
 
+/*
+  find which interface an ip address is hosted on
+ */
+static int control_ipiface(struct ctdb_context *ctdb, int argc, const char **argv)
+{
+	ctdb_sock_addr addr;
+
+	if (argc != 1) {
+		usage();
+	}
+
+	if (!parse_ip(argv[0], NULL, 0, &addr)) {
+		printf("Badly formed ip : %s\n", argv[0]);
+		return -1;
+	}
+
+	printf("IP on interface %s\n", ctdb_sys_find_ifname(&addr));
+
+	return 0;
+}
+
 static int control_delip(struct ctdb_context *ctdb, int argc, const char **argv);
 
 static int control_delip_all(struct ctdb_context *ctdb, int argc, const char **argv, ctdb_sock_addr *addr)
@@ -5020,6 +5041,7 @@ static const struct {
 	{ "readkey", 	     control_readkey,      	true,	false,  "read the content off a database key", "<tdb-file> <key>" },
 	{ "writekey", 	     control_writekey,      	true,	false,  "write to a database key", "<tdb-file> <key> <value>" },
 	{ "checktcpport",    control_chktcpport,      	false,	true,  "check if a service is bound to a specific tcp port or not", "<port>" },
+	{ "ipiface",         control_ipiface,           true,	true,  "Find which interface an ip address is hosted on", "<ip>" },
 };
 
 /*


-- 
CTDB repository

