[SCM] CTDB repository - branch 1.0.112 updated - ctdb-1.0.111-142-g16a5cad
Ronnie Sahlberg
sahlberg at samba.org
Thu Sep 2 20:01:13 MDT 2010
The branch, 1.0.112 has been updated
via 16a5cad37fa9093beb3ab5e4c24bbd61056c89f8 (commit)
via 35b719c8e2d97ec7014401a132937a01a1f2da7f (commit)
from d0c57b915d225bcf4c924ff57df7abb99b3ebfd1 (commit)
http://gitweb.samba.org/?p=sahlberg/ctdb.git;a=shortlog;h=1.0.112
- Log -----------------------------------------------------------------
commit 16a5cad37fa9093beb3ab5e4c24bbd61056c89f8
Author: Ronnie Sahlberg <ronniesahlberg at gmail.com>
Date: Fri Sep 3 11:58:27 2010 +1000
When a memory allocation for recovery fails,
don't dereference a NULL pointer while trying to print the log message for the failure.
Also shut down ctdb with ctdb_fatal().
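The failure mode described above can be illustrated with a minimal standalone sketch. It uses plain realloc() rather than talloc, and the names pull_buf, pull_state, pull_init and pull_append are hypothetical stand-ins, not CTDB types or functions. The point is only that the error path must not read fields through the pointer that has just been found to be NULL. Unlike the commit, which shuts the daemon down with ctdb_fatal(), this sketch simply returns -1.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a marshalled record buffer (not the CTDB type). */
struct pull_buf {
	uint32_t count;   /* number of records appended so far */
	uint8_t data[];   /* record bytes, packed back to back */
};

/* Hypothetical traversal state. */
struct pull_state {
	struct pull_buf *buf;
	size_t len;       /* bytes in use, including the header */
};

/* Allocate an empty buffer. Returns 0 on success, -1 on failure. */
static int pull_init(struct pull_state *st)
{
	assert(st != NULL);
	st->buf = calloc(1, sizeof(struct pull_buf));
	if (st->buf == NULL) {
		return -1;
	}
	st->len = sizeof(struct pull_buf);
	return 0;
}

/* Append one record. Returns 0 on success, -1 if the buffer cannot grow. */
static int pull_append(struct pull_state *st, const void *rec, size_t rec_len)
{
	assert(st != NULL && rec != NULL);

	size_t new_len = st->len + rec_len;
	struct pull_buf *grown = realloc(st->buf, new_len);

	if (grown == NULL) {
		/* Log only values that do not go through the failed pointer. */
		fprintf(stderr, "failed to expand buffer to %zu bytes\n", new_len);
		return -1;
	}

	st->buf = grown;
	st->buf->count++;
	memcpy((uint8_t *)st->buf + st->len, rec, rec_len);
	st->len = new_len;
	return 0;
}
```

A caller would run pull_init() once and then pull_append() for each record, treating a -1 return as fatal for the operation.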
commit 35b719c8e2d97ec7014401a132937a01a1f2da7f
Author: Rusty Russell <rusty at rustcorp.com.au>
Date: Thu Sep 2 12:44:21 2010 +0930
eventscript: make sure we die when we timeout.
Volker noticed that system() can hang on a futex. We call it inside a
signal handler simply to dump extra diagnostics when we time out, which is
very questionable but usually works.
Add a timeout of 90 seconds: after that, kill the whole process group and exit.
(This is a workaround for this branch: master does this correctly).
Signed-off-by: Rusty Russell <rusty at rustcorp.com.au>
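The watchdog idea in this commit message can be sketched in isolation as follows. This is an illustration under stated assumptions, not the patch itself: the names on_alarm, arm_watchdog and DIAG_TIMEOUT_SECS are hypothetical, and the sketch registers its handler for SIGALRM, the signal that alarm() raises. Only the 90 second value and the kill-the-process-group step come from the commit.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <unistd.h>

/* 90 seconds is the value given in the commit message. */
#define DIAG_TIMEOUT_SECS 90

/* Last-resort handler: kill every process in our process group, then exit.
 * kill() and _exit() are both async-signal-safe. */
static void on_alarm(int sig)
{
	(void)sig;
	kill(-getpgrp(), SIGKILL);
	_exit(1);
}

/* Arm the watchdog before starting work that might hang.
 * secs must be non-zero, because alarm(0) cancels a pending alarm. */
static void arm_watchdog(unsigned int secs)
{
	assert(secs > 0);
	signal(SIGALRM, on_alarm);
	alarm(secs);
}
```

A timeout path would call arm_watchdog(DIAG_TIMEOUT_SECS) as its first step and only then attempt the risky diagnostics, so that a hang in the diagnostics cannot keep the process alive indefinitely.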
-----------------------------------------------------------------------
Summary of changes:
server/ctdb_recover.c | 6 ++----
server/eventscript.c | 13 +++++++++++++
2 files changed, 15 insertions(+), 4 deletions(-)
Changeset truncated at 500 lines:
diff --git a/server/ctdb_recover.c b/server/ctdb_recover.c
index f61b6e7..b48b4e7 100644
--- a/server/ctdb_recover.c
+++ b/server/ctdb_recover.c
@@ -340,10 +340,8 @@ static int traverse_pulldb(struct tdb_context *tdb, TDB_DATA key, TDB_DATA data,
}
params->pulldata = talloc_realloc_size(NULL, params->pulldata, rec->length + params->len);
if (params->pulldata == NULL) {
- DEBUG(DEBUG_ERR,(__location__ " Failed to expand pulldb_data to %u (%u records)\n",
- rec->length + params->len, params->pulldata->count));
- params->failed = true;
- return -1;
+ DEBUG(DEBUG_CRIT,(__location__ " Failed to expand pulldb_data to %u\n", rec->length + params->len));
+ ctdb_fatal(params->ctdb, "failed to allocate memory for recovery. shutting down\n");
}
params->pulldata->count++;
memcpy(params->len+(uint8_t *)params->pulldata, rec, rec->length);
diff --git a/server/eventscript.c b/server/eventscript.c
index c403772..37306db 100644
--- a/server/eventscript.c
+++ b/server/eventscript.c
@@ -34,6 +34,13 @@ static struct {
static void ctdb_event_script_timeout(struct event_context *ev, struct timed_event *te, struct timeval t, void *p);
+static void sigalarm(int sig)
+{
+ /* all the child processes will be running in the same process group */
+ kill(-getpgrp(), SIGKILL);
+ _exit(1);
+}
+
/*
ctdbd sends us a SIGTERM when we should time out the current script
*/
@@ -42,6 +49,12 @@ static void sigterm(int sig)
char tbuf[100], buf[200];
time_t t;
+ /* Calling system() inside a signal handler can do strange things:
+ * it usually works, and that's enough for us: it's only for debugging.
+ * But make sure we terminate. */
+ signal(SIGTERM, sigalarm);
+ alarm(90);
+
DEBUG(DEBUG_ERR,("Timed out running script '%s' after %.1f seconds pid :%d\n",
child_state.script_running, timeval_elapsed(&child_state.start), getpid()));
--
CTDB repository