[PATCH] Restart cleanupd as needed

Ralph Boehme slow at samba.org
Tue Apr 19 11:18:24 UTC 2016


Hi!

On a cluster with "ctdb timeout = x" set, cleanupd may exit if a
recovery starts while cleanupd is running a db traverse.

When recovery kicks in, the traverse freezes, and if recovery takes
longer than the ctdb timeout, cleanupd closes the ctdb connection and
exits.

So smbd should restart cleanupd when it notices that cleanupd has
exited.
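
For illustration only, here is a minimal standalone sketch of the
general pattern (it is not the attached patch and uses none of the smbd
internals): a parent remembers the pid of a helper process and starts a
new one whenever waitpid() reports that this pid has exited. The names
start_helper() and supervise_helper(), and the restart limit, are
hypothetical and exist only to keep the example finite.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for cleanupd: the child exits right away, as if
 * its cluster connection had timed out. Returns the child pid, or -1 if
 * fork() failed.
 */
static pid_t start_helper(void)
{
	pid_t pid = fork();

	if (pid == 0) {
		_exit(1);
	}
	return pid;
}

/*
 * Reap children and restart the helper each time it exits, up to
 * max_restarts times (the limit is an assumption of this sketch).
 * Returns the number of restarts performed, or -1 on error.
 */
static int supervise_helper(int max_restarts)
{
	int restarts = 0;
	pid_t helper;

	assert(max_restarts >= 0);

	helper = start_helper();
	if (helper == -1) {
		return -1;
	}

	for (;;) {
		int status;
		pid_t pid = waitpid(-1, &status, 0);

		if (pid == -1) {
			if (errno == EINTR) {
				continue;
			}
			return -1;
		}
		if (pid != helper) {
			/* Some other child: the normal cleanup path. */
			continue;
		}
		if (restarts == max_restarts) {
			break;
		}

		fprintf(stderr, "Restarting helper\n");
		helper = start_helper();
		if (helper == -1) {
			return -1;
		}
		restarts++;
	}
	return restarts;
}
```

Calling supervise_helper(3) forks the helper once, restarts it three
times as each instance exits, and returns 3 after reaping the last one.
The important detail is comparing the reaped pid against the stored
helper pid before treating the exit as that of an ordinary child.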

Patch attached, please review and push if ok.

Cheerio!
-slow
-------------- next part --------------
From 4a103c1a8f086659306f4e06be95fcd9bf9795ab Mon Sep 17 00:00:00 2001
From: Ralph Boehme <slow at samba.org>
Date: Tue, 19 Apr 2016 12:55:19 +0200
Subject: [PATCH] cleanupd: restart as needed

Bug: https://bugzilla.samba.org/show_bug.cgi?id=11855

Signed-off-by: Ralph Boehme <slow at samba.org>
---
 source3/smbd/server.c | 35 +++++++++++++++++++++++++----------
 1 file changed, 25 insertions(+), 10 deletions(-)

diff --git a/source3/smbd/server.c b/source3/smbd/server.c
index 7e5b5d9..82e686e 100644
--- a/source3/smbd/server.c
+++ b/source3/smbd/server.c
@@ -468,6 +468,10 @@ static bool cleanupd_init(struct messaging_context *msg, bool interactive,
 
 		DBG_DEBUG("Started cleanupd pid=%d\n", (int)pid);
 
+		if (am_parent != NULL) {
+			add_child_pid(am_parent, pid);
+		}
+
 		*ppid = pid_to_procid(pid);
 		return true;
 	}
@@ -557,16 +561,6 @@ static void remove_child_pid(struct smbd_parent_context *parent,
 	struct iovec iov[2];
 	NTSTATUS status;
 
-	iov[0] = (struct iovec) { .iov_base = (uint8_t *)&pid,
-				  .iov_len = sizeof(pid) };
-	iov[1] = (struct iovec) { .iov_base = (uint8_t *)&unclean_shutdown,
-				  .iov_len = sizeof(bool) };
-
-	status = messaging_send_iov(parent->msg_ctx, parent->cleanupd,
-				    MSG_SMB_NOTIFY_CLEANUP,
-				    iov, ARRAY_SIZE(iov), NULL, 0);
-	DEBUG(10, ("messaging_send_iov returned %s\n", nt_errstr(status)));
-
 	for (child = parent->children; child != NULL; child = child->next) {
 		if (child->pid == pid) {
 			struct smbd_child_pid *tmp = child;
@@ -583,6 +577,27 @@ static void remove_child_pid(struct smbd_parent_context *parent,
 		return;
 	}
 
+	if (child->pid == procid_to_pid(&parent->cleanupd)) {
+		bool ok;
+
+		DBG_WARNING("Restarting cleanupd\n");
+		ok = cleanupd_init(parent->msg_ctx, false, &parent->cleanupd);
+		if (!ok) {
+			DBG_ERR("Failed to restart cleanupd\n");
+		}
+		return;
+	}
+
+	iov[0] = (struct iovec) { .iov_base = (uint8_t *)&pid,
+				  .iov_len = sizeof(pid) };
+	iov[1] = (struct iovec) { .iov_base = (uint8_t *)&unclean_shutdown,
+				  .iov_len = sizeof(bool) };
+
+	status = messaging_send_iov(parent->msg_ctx, parent->cleanupd,
+				    MSG_SMB_NOTIFY_CLEANUP,
+				    iov, ARRAY_SIZE(iov), NULL, 0);
+	DEBUG(10, ("messaging_send_iov returned %s\n", nt_errstr(status)));
+
 	if (unclean_shutdown) {
 		/* a child terminated uncleanly so tickle all
 		   processes to see if they can grab any of the
-- 
2.5.0


