Process hierarchy on a DC?

Ralph Böhme slow at samba.org
Fri Jun 30 17:36:47 UTC 2017


Hi

On Fri, Apr 28, 2017 at 01:11:30PM +0200, Stefan Metzmacher wrote:
> Am 28.04.2017 um 12:06 schrieb Ralph Böhme:
> > On Thu, Apr 27, 2017 at 02:39:21PM -0700, Jeremy Allison wrote:
> >> Much though I hate to revert, I think this might be a good case
> >> for it temporarily.
> > 
> > two patchsets attached:
> > 
> > 1) revert the use of tfork in samba_runcmd_send(). This restores the process
> > hierarchy and stopping samba by sending a single SIGTERM
> 
> Please add a large explaination to the commit message, basically a summary
> of this mail thread.
> 
> > If this process hierarchy is a problem then I guess we need two functions in the
> > future:
> > 
> > - apply the revert and rename samba_runcmd_send (and the callers) to something
> >   like samba_runchld_send()
> > 
> > - re-add a samba_runcmd_send() that uses tfork
> > 
> > samba_runchld_send() would be used for starting long lived processes where you
> > care about the process hierarchy, preserving the problematic behaviour that you
> > have to correctly handle SIGCHLD yourself. So just the existing behaviour.
> 
> I'd name it samba_start_child_send().
> 
> Here I'd still like to deliver the exit status of the process via a pipe,
> maybe just doing a double fork, basically skipping the first child that
> directly exists.
> The main process would still call waitpid() after getting the exit status
> from the pipe, to avoid zombies, but the result of waitpid() will be
> ignored.

attached is a patchset to tfork() that implements this.

This version preserves the process hierarchy, it also has an actual test for
this. But it adds an intermediate process between the caller and the final
forked child.

The patchset also let samba_runcmd_send() use tfork() which means any of the
long lived tasks started from the samba process, like smbd or winbindd will have
a resulting process hierarchy like this

samba
 `-waiter
     `-smbd

Any objections?

From the commit message:
---8<---
This function is a solution to the problem of fork() requiring special
preperations in the caller to handle SIGCHLD signals and to reap the
child by wait()ing for it.

Instead, tfork provides a pollable file descriptor. The caller gets the
file descriptor by calling tfork_event_fd() on the handle returned from
tfork_create() and the caller can then get the status of the child
with a call to tfork_status().

tfork avoids raising SIGCHLD signals in the caller by installing a
temporary SIGCHLD handler from inside tfork_create() and tfork_status().

The termination signal of other child processes not created with tfork()
is forwarded to the existing signal handler if any.

There's one thing this thing can't protect us against and that is if a
process installs a SIGCHLD handler from one thread while another thread
is running inside tfork_create() or tfork_status() and the signal
handler doesn't forward signals for exitted childs it didn't fork, ie
our childs.
---8<---

The upside is that it gets rid of the signal handler requirement and doesn't
require calling waitpid to get the exit status.

With this patchset samba_runcmd_send() can then be used from source3 as well
(which is the customer use case I implemented this for...).

The patchset has already two team-member +1, but any additional reviewers are
highly welcome! The code is a little piece of insanity, if you have any
questions feel free to ask... :)

Cheerio!
-slow
-------------- next part --------------
From 6d23e22aff84567d5d372c332b2162478cd7e3bb Mon Sep 17 00:00:00 2001
From: Ralph Boehme <slow at samba.org>
Date: Wed, 26 Apr 2017 00:48:39 +0200
Subject: [PATCH 1/4] lib/util: enhanced tfork()

This function is a solution to the problem of fork() requiring special
preperations in the caller to handle SIGCHLD signals and to reap the
child by wait()ing for it.

Instead, tfork provides a pollable file descriptor. The caller gets the
file descriptor by calling tfork_event_fd() on the handle returned from
tfork_create() and the caller can then get the status of the child
with a call to tfork_status().

tfork avoids raising SIGCHLD signals in the caller by installing a
temporary SIGCHLD handler from inside tfork_create() and tfork_status().

The termination signal of other child processes not created with tfork()
is forwarded to the existing signal handler if any.

There's one thing this thing can't protect us against and that is if a
process installs a SIGCHLD handler from one thread while another thread
is running inside tfork_create() or tfork_status() and the signal
handler doesn't forward signals for exitted childs it didn't fork, ie
our childs.

Pair-Programmed-With: Stefan Metzmacher <metze at samba.org>

Signed-off-by: Ralph Boehme <slow at samba.org>
Signed-off-by: Stefan Metzmacher <metze at samba.org>
---
 lib/util/tests/tfork.c |  63 ++--
 lib/util/tfork.c       | 950 ++++++++++++++++++++++++++++++++++++++-----------
 lib/util/tfork.h       | 102 ++++--
 3 files changed, 855 insertions(+), 260 deletions(-)

diff --git a/lib/util/tests/tfork.c b/lib/util/tests/tfork.c
index bd5809d..2f140dd 100644
--- a/lib/util/tests/tfork.c
+++ b/lib/util/tests/tfork.c
@@ -32,44 +32,49 @@
 
 static bool test_tfork_simple(struct torture_context *tctx)
 {
-	pid_t pid;
-	pid_t parent = getpid();
-	pid_t parent_arg;
-
-	pid = tfork(NULL, &parent_arg);
-	if (pid == 0) {
-		torture_comment(tctx, "my parent pid is %d\n", parent);
-		torture_assert(tctx, parent == parent_arg, "tfork failed\n");
-		_exit(0);
-	}
-	if (pid == -1) {
-		torture_fail(tctx, "tfork failed\n");
-		return false;
-	}
-
-	return true;
+        pid_t parent = getpid();
+        struct tfork *t = NULL;
+        pid_t child;
+        int ret;
+
+        t = tfork_create();
+        if (t == NULL) {
+                torture_fail(tctx, "tfork failed\n");
+                return false;
+        }
+        child = tfork_child_pid(t);
+        if (child == 0) {
+                torture_comment(tctx, "my parent pid is %d\n", parent);
+                torture_assert(tctx, getpid() != parent, "tfork failed\n");
+                _exit(0);
+        }
+
+        ret = tfork_destroy(&t);
+        torture_assert(tctx, ret == 0, "tfork_destroy failed\n");
+
+        return true;
 }
 
 static bool test_tfork_status(struct torture_context *tctx)
 {
-	pid_t child;
+	struct tfork *t = NULL;
 	int status;
-	ssize_t nread;
-	int status_fd = -1;
+	pid_t child;
 	bool ok = true;
 
-	child = tfork(&status_fd, NULL);
-	if (child == 0) {
-		_exit(123);
-	}
-	if (child == -1) {
+	t = tfork_create();
+	if (t == NULL) {
 		torture_fail(tctx, "tfork failed\n");
 		return false;
 	}
+	child = tfork_child_pid(t);
+	if (child == 0) {
+		_exit(123);
+	}
 
-	nread = sys_read(status_fd, &status, sizeof(status));
-	if (nread != sizeof(status)) {
-		torture_fail(tctx, "sys_read failed\n");
+	status = tfork_status(&t, true);
+	if (status == -1) {
+		torture_fail(tctx, "tfork_status failed\n");
 	}
 
 	torture_assert_goto(tctx, WIFEXITED(status) == true, ok, done,
@@ -80,10 +85,6 @@ static bool test_tfork_status(struct torture_context *tctx)
 	torture_comment(tctx, "exit status [%d]\n", WEXITSTATUS(status));
 
 done:
-	if (status_fd != -1) {
-		close(status_fd);
-	}
-
 	return ok;
 }
 
diff --git a/lib/util/tfork.c b/lib/util/tfork.c
index 37c00e6..cc2e0a0 100644
--- a/lib/util/tfork.c
+++ b/lib/util/tfork.c
@@ -21,319 +21,859 @@
 #include "replace.h"
 #include "system/wait.h"
 #include "system/filesys.h"
+#include "system/network.h"
 #include "lib/util/samba_util.h"
 #include "lib/util/sys_rw.h"
 #include "lib/util/tfork.h"
 #include "lib/util/debug.h"
 
-struct tfork_state {
-	void (*old_sig_chld)(int);
-	int status_pipe[2];
-	pid_t *parent;
+#ifdef HAVE_PTHREAD
+#include <pthread.h>
+#endif
+
+#ifdef NDEBUG
+#undef NDEBUG
+#endif
+#include <assert.h>
+
+/*
+ * This is how the process hierarchy looks like:
+ *
+ *   +----------+
+ *   |  caller  |
+ *   +----------+
+ *         |
+ *       fork
+ *         |
+ *         v
+ *   +----------+
+ *   |  waiter  |
+ *   +----------+
+ *         |
+ *       fork
+ *         |
+ *         v
+ *   +----------+
+ *   |  worker  |
+ *   +----------+
+ */
+
+/*
+ * The resulting (private) state per tfork_create() call, returned as a opaque
+ * handle to the caller.
+ */
+struct tfork {
+	/*
+	 * This is returned to the caller with tfork_event_fd()
+	 */
+	int event_fd;
 
-	pid_t level0_pid;
-	int level0_status;
+	/*
+	 * This is used in the caller by tfork_status() to read the worker exit
+	 * status and to tell the waiter to exit by closing the fd.
+	 */
+	int status_fd;
 
-	pid_t level1_pid;
-	int level1_errno;
+	pid_t waiter_pid;
+	pid_t worker_pid;
+};
 
-	pid_t level2_pid;
-	int level2_errno;
+/*
+ * Internal per-thread state maintained while inside tfork.
+ */
+struct tfork_state {
+	pid_t waiter_pid;
+	int waiter_errno;
 
-	pid_t level3_pid;
+	pid_t worker_pid;
 };
 
 /*
- * TODO: We should make this global thread local
+ * A global state that synchronizes access to handling SIGCHLD and waiting for
+ * childs.
  */
-static struct tfork_state *tfork_global;
+struct tfork_signal_state {
+	bool available;
 
-static void tfork_sig_chld(int signum)
+#ifdef HAVE_PTHREAD
+	pthread_cond_t cond;
+	pthread_mutex_t mutex;
+#endif
+
+	/*
+	 * pid of the waiter child. This points at waiter_pid in either struct
+	 * tfork or struct tfork_state, depending on who called
+	 * tfork_install_sigchld_handler().
+	 *
+	 * When tfork_install_sigchld_handler() is called the waiter_pid is
+	 * still -1 and only set later after fork(), that's why this is must be
+	 * a pointer. The signal handler checks this.
+	 */
+	pid_t *pid;
+
+	struct sigaction oldact;
+	sigset_t oldset;
+};
+
+static struct tfork_signal_state signal_state;
+
+#ifdef HAVE_PTHREAD
+static pthread_once_t tfork_global_is_initialized = PTHREAD_ONCE_INIT;
+static pthread_key_t tfork_global_key;
+#else
+static struct tfork_state *global_state;
+#endif
+
+static void tfork_sigchld_handler(int signum, siginfo_t *si, void *p);
+
+#ifdef HAVE_PTHREAD
+static void tfork_global_destructor(void *state)
 {
-	if (tfork_global->level1_pid > 0) {
-		int ret = waitpid(tfork_global->level1_pid,
-			      &tfork_global->level0_status,
-			      WNOHANG);
-		if (ret == tfork_global->level1_pid) {
-			tfork_global->level1_pid = -1;
-			return;
+	anonymous_shared_free(state);
+}
+#endif
+
+static int tfork_acquire_sighandling(void)
+{
+	int ret = 0;
+
+#ifdef HAVE_PTHREAD
+	ret = pthread_mutex_lock(&signal_state.mutex);
+	if (ret != 0) {
+		return ret;
+	}
+
+	while (!signal_state.available) {
+		ret = pthread_cond_wait(&signal_state.cond,
+					&signal_state.mutex);
+		if (ret != 0) {
+			return ret;
 		}
 	}
 
-	/*
-	 * Not our child, forward to old handler
-	 */
+	signal_state.available = false;
 
-	if (tfork_global->old_sig_chld == SIG_IGN) {
-		return;
+	ret = pthread_mutex_unlock(&signal_state.mutex);
+	if (ret != 0) {
+		return ret;
 	}
+#endif
 
-	if (tfork_global->old_sig_chld == SIG_DFL) {
-		return;
+	return ret;
+}
+
+static int tfork_release_sighandling(void)
+{
+	int ret = 0;
+
+#ifdef HAVE_PTHREAD
+	ret = pthread_mutex_lock(&signal_state.mutex);
+	if (ret != 0) {
+		return ret;
+	}
+
+	signal_state.available = true;
+
+	ret = pthread_cond_signal(&signal_state.cond);
+	if (ret != 0) {
+		pthread_mutex_unlock(&signal_state.mutex);
+		return ret;
+	}
+
+	ret = pthread_mutex_unlock(&signal_state.mutex);
+	if (ret != 0) {
+		return ret;
 	}
+#endif
 
-	tfork_global->old_sig_chld(signum);
+	return ret;
 }
 
-static pid_t level2_fork_and_wait(int child_ready_fd)
+#ifdef HAVE_PTHREAD
+static void tfork_atfork_prepare(void)
 {
-	int status;
-	ssize_t written;
-	pid_t pid;
-	int fd;
-	bool wait;
+	int ret;
+
+	ret = pthread_mutex_lock(&signal_state.mutex);
+	assert(ret == 0);
+}
+
+static void tfork_atfork_parent(void)
+{
+	int ret;
+
+	ret = pthread_mutex_unlock(&signal_state.mutex);
+	assert(ret == 0);
+}
+#endif
+
+static void tfork_atfork_child(void)
+{
+	int ret;
+
+#ifdef HAVE_PTHREAD
+	ret = pthread_mutex_unlock(&signal_state.mutex);
+	assert(ret == 0);
+
+	ret = pthread_key_delete(tfork_global_key);
+	assert(ret == 0);
+
+	ret = pthread_key_create(&tfork_global_key, tfork_global_destructor);
+	assert(ret == 0);
 
 	/*
-	 * Child level 2.
-	 *
-	 * Do a final fork and if the tfork() caller passed a status_fd, wait
-	 * for child3 and return its exit status via status_fd.
+	 * There's no way to destroy a condition variable if there are waiters,
+	 * pthread_cond_destroy() will return EBUSY. Just zero out memory and
+	 * then initialize again. This is not backed by POSIX but should be ok.
 	 */
+	ZERO_STRUCT(signal_state.cond);
+	ret = pthread_cond_init(&signal_state.cond, NULL);
+	assert(ret == 0);
+#endif
 
-	pid = fork();
-	if (pid == 0) {
-		/*
-		 * Child level 3, this one finally returns from tfork() as child
-		 * with pid 0.
-		 *
-		 * Cleanup all ressources we allocated before returning.
-		 */
-		close(child_ready_fd);
-		close(tfork_global->status_pipe[1]);
-
-		if (tfork_global->parent != NULL) {
-			/*
-			 * we're in the child and return the level0 parent pid
-			 */
-			*tfork_global->parent = tfork_global->level0_pid;
-		}
+	if (signal_state.pid != NULL) {
 
-		anonymous_shared_free(tfork_global);
-		tfork_global = NULL;
+		ret = sigaction(SIGCHLD, &signal_state.oldact, NULL);
+		assert(ret == 0);
 
-		return 0;
+#ifdef HAVE_PTHREAD
+		ret = pthread_sigmask(SIG_SETMASK, &signal_state.oldset, NULL);
+#else
+		ret = sigprocmask(SIG_SETMASK, &signal_state.oldset, NULL);
+		assert(ret == 0);
+#endif
+
+		signal_state.pid = NULL;
 	}
 
-	tfork_global->level3_pid = pid;
-	if (tfork_global->level3_pid == -1) {
-		tfork_global->level2_errno = errno;
-		_exit(0);
+	signal_state.available = true;
+}
+
+static void tfork_global_initialize(void)
+{
+#ifdef HAVE_PTHREAD
+	int ret;
+
+	pthread_atfork(tfork_atfork_prepare,
+		       tfork_atfork_parent,
+		       tfork_atfork_child);
+
+	ret = pthread_key_create(&tfork_global_key, tfork_global_destructor);
+	assert(ret == 0);
+
+	ret = pthread_mutex_init(&signal_state.mutex, NULL);
+	assert(ret == 0);
+
+	ret = pthread_cond_init(&signal_state.cond, NULL);
+	assert(ret == 0);
+#endif
+
+	signal_state.available = true;
+}
+
+static struct tfork_state *tfork_global_get(void)
+{
+	struct tfork_state *state = NULL;
+#ifdef HAVE_PTHREAD
+	int ret;
+#endif
+
+#ifdef HAVE_PTHREAD
+	state = (struct tfork_state *)pthread_getspecific(tfork_global_key);
+#else
+	state = global_state;
+#endif
+	if (state != NULL) {
+		return state;
 	}
 
-	sys_write(child_ready_fd, &(char){0}, 1);
+	state = (struct tfork_state *)anonymous_shared_allocate(
+		sizeof(struct tfork_state));
+	if (state == NULL) {
+		return NULL;
+	}
 
-	if (tfork_global->status_pipe[1] == -1) {
-		_exit(0);
+#ifdef HAVE_PTHREAD
+	ret = pthread_setspecific(tfork_global_key, state);
+	if (ret != 0) {
+		anonymous_shared_free(state);
+		return NULL;
 	}
-	wait = true;
+#endif
+	return state;
+}
 
-	/*
-	 * We're going to stay around until child3 exits, so lets close all fds
-	 * other then the pipe fd we may have inherited from the caller.
-	 */
-	while (true) {
-		fd = dup2(tfork_global->status_pipe[1], 0);
-		if (fd == -1) {
-			if (errno == EINTR) {
-				continue;
-			}
-			status = errno;
+static void tfork_global_free(void)
+{
+	struct tfork_state *state = NULL;
+#ifdef HAVE_PTHREAD
+	int ret;
+#endif
+
+#ifdef HAVE_PTHREAD
+	state = (struct tfork_state *)pthread_getspecific(tfork_global_key);
+#else
+	state = global_state;
+#endif
+	if (state == NULL) {
+		return;
+	}
 
-			kill(tfork_global->level3_pid, SIGKILL);
+#ifdef HAVE_PTHREAD
+	ret = pthread_setspecific(tfork_global_key, NULL);
+	if (ret != 0) {
+		return;
+	}
+#endif
+	anonymous_shared_free(state);
+}
 
-			written = sys_write(tfork_global->status_pipe[1],
-					    &status, sizeof(status));
-			if (written != sizeof(status)) {
-				abort();
-			}
-			_exit(0);
-		}
-		break;
+/**
+ * Only one thread at a time is allowed to handle SIGCHLD signals
+ **/
+static int tfork_install_sigchld_handler(pid_t *pid)
+{
+	int ret;
+	struct sigaction act;
+	sigset_t set;
+
+	ret = tfork_acquire_sighandling();
+	if (ret != 0) {
+		return -1;
 	}
-	closefrom(1);
 
-	while (wait) {
-		int ret = waitpid(tfork_global->level3_pid, &status, 0);
-		if (ret == -1) {
-			if (errno == EINTR) {
-				continue;
-			}
-			status = errno;
-		}
-		break;
+	assert(signal_state.pid == NULL);
+	signal_state.pid = pid;
+
+	act = (struct sigaction) {
+		.sa_sigaction = tfork_sigchld_handler,
+		.sa_flags = SA_SIGINFO,
+	};
+
+	ret = sigaction(SIGCHLD, &act, &signal_state.oldact);
+	if (ret != 0) {
+		return -1;
 	}
 
-	written = sys_write(fd, &status, sizeof(status));
-	if (written != sizeof(status)) {
-		abort();
+	sigemptyset(&set);
+	sigaddset(&set, SIGCHLD);
+#ifdef HAVE_PTHREAD
+	ret = pthread_sigmask(SIG_UNBLOCK, &set, &signal_state.oldset);
+#else
+	ret = sigprocmask(SIG_UNBLOCK, &set, &signal_state.oldset);
+#endif
+	if (ret != 0) {
+		return -1;
 	}
 
-	_exit(0);
+	return 0;
 }
 
-pid_t tfork(int *status_fd, pid_t *parent)
+static int tfork_uninstall_sigchld_handler(void)
 {
 	int ret;
+
+	signal_state.pid = NULL;
+
+	ret = sigaction(SIGCHLD, &signal_state.oldact, NULL);
+	if (ret != 0) {
+		return -1;
+	}
+
+#ifdef HAVE_PTHREAD
+	ret = pthread_sigmask(SIG_SETMASK, &signal_state.oldset, NULL);
+#else
+	ret = sigprocmask(SIG_SETMASK, &signal_state.oldset, NULL);
+#endif
+	if (ret != 0) {
+		return -1;
+	}
+
+	ret = tfork_release_sighandling();
+	if (ret != 0) {
+		return -1;
+	}
+
+	return 0;
+}
+
+static void tfork_sigchld_handler(int signum, siginfo_t *si, void *p)
+{
+	if ((signal_state.pid != NULL) &&
+	    (*signal_state.pid != -1) &&
+	    (si->si_pid == *signal_state.pid))
+	{
+		return;
+	}
+
+	/*
+	 * Not our child, forward to old handler
+	 */
+	if (signal_state.oldact.sa_flags & SA_SIGINFO) {
+		signal_state.oldact.sa_sigaction(signum, si, p);
+		return;
+	}
+
+	if (signal_state.oldact.sa_handler == SIG_IGN) {
+		return;
+	}
+	if (signal_state.oldact.sa_handler == SIG_DFL) {
+		return;
+	}
+	signal_state.oldact.sa_handler(signum);
+}
+
+static pid_t tfork_start_waiter_and_worker(struct tfork_state *state,
+					   int *_event_fd,
+					   int *_status_fd)
+{
+	int p[2];
+	int status_sp_caller_fd = -1;
+	int status_sp_waiter_fd = -1;
+	int event_pipe_caller_fd = -1;
+	int event_pipe_waiter_fd = -1;
+	int ready_pipe_caller_fd = -1;
+	int ready_pipe_worker_fd = -1;
+	ssize_t nwritten;
+	ssize_t nread;
 	pid_t pid;
-	pid_t child;
+	int status;
+	int fd;
+	char c;
+	int ret;
+
+	*_event_fd = -1;
+	*_status_fd = -1;
 
-	tfork_global = (struct tfork_state *)
-		anonymous_shared_allocate(sizeof(struct tfork_state));
-	if (tfork_global == NULL) {
+	if (state == NULL) {
 		return -1;
 	}
 
-	tfork_global->parent = parent;
-	tfork_global->status_pipe[0] = -1;
-	tfork_global->status_pipe[1] = -1;
+	ret = socketpair(AF_UNIX, SOCK_STREAM, 0, p);
+	if (ret != 0) {
+		return -1;
+	}
+	set_close_on_exec(p[0]);
+	set_close_on_exec(p[1]);
+	status_sp_caller_fd = p[0];
+	status_sp_waiter_fd = p[1];
+
+	ret = pipe(p);
+	if (ret != 0) {
+		close(status_sp_caller_fd);
+		close(status_sp_waiter_fd);
+		return -1;
+	}
+	set_close_on_exec(p[0]);
+	set_close_on_exec(p[1]);
+	event_pipe_caller_fd = p[0];
+	event_pipe_waiter_fd = p[1];
+
+
+	ret = pipe(p);
+	if (ret != 0) {
+		close(status_sp_caller_fd);
+		close(status_sp_waiter_fd);
+		close(event_pipe_caller_fd);
+		close(event_pipe_waiter_fd);
+		return -1;
+	}
+	set_close_on_exec(p[0]);
+	set_close_on_exec(p[1]);
+	ready_pipe_worker_fd = p[0];
+	ready_pipe_caller_fd = p[1];
 
-	tfork_global->level0_pid = getpid();
-	tfork_global->level0_status = -1;
-	tfork_global->level1_pid = -1;
-	tfork_global->level1_errno = ECANCELED;
-	tfork_global->level2_pid = -1;
-	tfork_global->level2_errno = ECANCELED;
-	tfork_global->level3_pid = -1;
+	pid = fork();
+	if (pid == -1) {
+		close(status_sp_caller_fd);
+		close(status_sp_waiter_fd);
+		close(event_pipe_caller_fd);
+		close(event_pipe_waiter_fd);
+		close(ready_pipe_caller_fd);
+		close(ready_pipe_worker_fd);
+		return -1;
+	}
+	if (pid != 0) {
+		/* The caller */
 
-	if (status_fd != NULL) {
-		ret = pipe(&tfork_global->status_pipe[0]);
-		if (ret != 0) {
-			int saved_errno = errno;
+		state->waiter_pid = pid;
+
+		close(status_sp_waiter_fd);
+		close(event_pipe_waiter_fd);
+		close(ready_pipe_worker_fd);
+
+		set_blocking(event_pipe_caller_fd, false);
 
-			anonymous_shared_free(tfork_global);
-			tfork_global = NULL;
-			errno = saved_errno;
+		/*
+		 * wait for the waiter to get ready.
+		 */
+		nread = sys_read(status_sp_caller_fd, &c, sizeof(char));
+		if (nread != sizeof(char)) {
 			return -1;
 		}
 
-		*status_fd = tfork_global->status_pipe[0];
+		/*
+		 * Notify the worker to start.
+		 */
+		nwritten = sys_write(ready_pipe_caller_fd,
+				     &(char){0}, sizeof(char));
+		if (nwritten != sizeof(char)) {
+			close(ready_pipe_caller_fd);
+			return -1;
+		}
+		close(ready_pipe_caller_fd);
+
+		*_event_fd = event_pipe_caller_fd;
+		*_status_fd = status_sp_caller_fd;
+
+		return pid;
 	}
 
+#ifndef HAVE_PTHREAD
+	/* cleanup sigchld_handler */
+	tfork_atfork_child();
+#endif
+
 	/*
-	 * We need to set our own signal handler to prevent any existing signal
-	 * handler from reaping our child.
+	 * The "waiter" child.
 	 */
-	tfork_global->old_sig_chld = CatchSignal(SIGCHLD, tfork_sig_chld);
+	CatchSignal(SIGCHLD, SIG_DFL);
+
+	close(status_sp_caller_fd);
+	close(event_pipe_caller_fd);
+	close(ready_pipe_caller_fd);
 
 	pid = fork();
+	if (pid == -1) {
+		state->waiter_errno = errno;
+		_exit(0);
+	}
 	if (pid == 0) {
-		int level2_pipe[2];
-		char c;
-		ssize_t nread;
-
 		/*
-		 * Child level 1.
-		 *
-		 * Restore SIGCHLD handler
+		 * The worker child.
 		 */
-		CatchSignal(SIGCHLD, SIG_DFL);
 
-		/*
-		 * Close read end of the signal pipe, we don't need it anymore
-		 * and don't want to leak it into childs.
-		 */
-		if (tfork_global->status_pipe[0] != -1) {
-			close(tfork_global->status_pipe[0]);
-			tfork_global->status_pipe[0] = -1;
-		}
+		close(status_sp_waiter_fd);
+		close(event_pipe_waiter_fd);
 
 		/*
-		 * Create a pipe for waiting for the child level 2 to finish
-		 * forking.
+		 * Wait for the caller to give us a go!
 		 */
-		ret = pipe(&level2_pipe[0]);
-		if (ret != 0) {
-			tfork_global->level1_errno = errno;
-			_exit(0);
+		nread = sys_read(ready_pipe_worker_fd, &c, sizeof(char));
+		if (nread != sizeof(char)) {
+			_exit(1);
 		}
+		close(ready_pipe_worker_fd);
 
-		pid = fork();
-		if (pid == 0) {
+		return 0;
+	}
+	state->worker_pid = pid;
+
+	close(ready_pipe_worker_fd);
+
+	/*
+	 * We're going to stay around until child2 exits, so lets close all fds
+	 * other then the pipe fd we may have inherited from the caller.
+	 *
+	 * Dup event_sp_waiter_fd and status_sp_waiter_fd onto fds 0 and 1 so we
+	 * can then call closefrom(2).
+	 */
+	if (event_pipe_waiter_fd > 0) {
+		int dup_fd = 0;
 
-			/*
-			 * Child level 2.
-			 */
+		if (status_sp_waiter_fd == 0) {
+			dup_fd = 1;
+		}
 
-			close(level2_pipe[0]);
-			return level2_fork_and_wait(level2_pipe[1]);
+		do {
+			fd = dup2(event_pipe_waiter_fd, dup_fd);
+		} while ((fd == -1) && (errno == EINTR));
+		if (fd == -1) {
+			state->waiter_errno = errno;
+			kill(state->worker_pid, SIGKILL);
+			state->worker_pid = -1;
+			_exit(1);
 		}
+		event_pipe_waiter_fd = fd;
+	}
 
-		tfork_global->level2_pid = pid;
-		if (tfork_global->level2_pid == -1) {
-			tfork_global->level1_errno = errno;
-			_exit(0);
+	if (status_sp_waiter_fd > 1) {
+		do {
+			fd = dup2(status_sp_waiter_fd, 1);
+		} while ((fd == -1) && (errno == EINTR));
+		if (fd == -1) {
+			state->waiter_errno = errno;
+			kill(state->worker_pid, SIGKILL);
+			state->worker_pid = -1;
+			_exit(1);
 		}
+		status_sp_waiter_fd = fd;
+	}
 
-		close(level2_pipe[1]);
-		level2_pipe[1] = -1;
+	closefrom(2);
 
-		nread = sys_read(level2_pipe[0], &c, 1);
-		if (nread != 1) {
-			abort();
+	/* Tell the caller we're ready */
+	nwritten = sys_write(status_sp_waiter_fd, &(char){0}, sizeof(char));
+	if (nwritten != sizeof(char)) {
+		_exit(1);
+	}
+
+	tfork_global_free();
+	state = NULL;
+
+	do {
+		ret = waitpid(pid, &status, 0);
+	} while ((ret == -1) && (errno == EINTR));
+	if (ret == -1) {
+		status = errno;
+		kill(pid, SIGKILL);
+	}
+
+	/*
+	 * This writes the worker child exit status via our internal socketpair
+	 * so the tfork_status() implementation can read it from its end.
+	 */
+	nwritten = sys_write(status_sp_waiter_fd, &status, sizeof(status));
+	if (nwritten == -1) {
+		if (errno != EPIPE && errno != ECONNRESET) {
+			_exit(errno);
 		}
+		/*
+		 * The caller exitted and didn't call tfork_status().
+		 */
 		_exit(0);
 	}
+	if (nwritten != sizeof(status)) {
+		_exit(1);
+	}
 
-	tfork_global->level1_pid = pid;
-	if (tfork_global->level1_pid == -1) {
-		int saved_errno = errno;
-
-		anonymous_shared_free(tfork_global);
-		tfork_global = NULL;
-		errno = saved_errno;
-		return -1;
+	/*
+	 * This write to the event_fd returned by tfork_event_fd() and notifies
+	 * the caller that the worker child is done and he may now call
+	 * tfork_status().
+	 */
+	nwritten = sys_write(event_pipe_waiter_fd, &(char){0}, sizeof(char));
+	if (nwritten != sizeof(char)) {
+		_exit(1);
 	}
 
 	/*
-	 * By using the helper variable pid we avoid a TOCTOU with the signal
-	 * handler that will set tfork_global->level1_pid to -1 (which would
-	 * cause waitpid() to block waiting for another exitted child).
+	 * Wait for our parent (the process that called tfork_create()) to
+	 * close() the socketpair fd in tfork_status().
 	 *
-	 * We can't avoid the race waiting for pid twice (in the signal handler
-	 * and then again here in the while loop), but we must avoid waiting for
-	 * -1 and this does the trick.
+	 * Again, the caller might have exitted without calling tfork_status().
 	 */
-	pid = tfork_global->level1_pid;
-
-	while (tfork_global->level1_pid != -1) {
-		ret = waitpid(pid, &tfork_global->level0_status, 0);
-		if (ret == -1 && errno == EINTR) {
-			continue;
+	nread = sys_read(status_sp_waiter_fd, &c, 1);
+	if (nread == -1) {
+		if (errno == EPIPE || errno == ECONNRESET) {
+			_exit(0);
 		}
+		_exit(errno);
+	}
+	if (nread != 0) {
+		_exit(255);
+	}
+
+	_exit(0);
+}
 
-		break;
+static int tfork_create_reap_waiter(pid_t waiter_pid)
+{
+	pid_t pid;
+	int waiter_status;
+
+	if (waiter_pid == -1) {
+		return 0;
 	}
 
-	CatchSignal(SIGCHLD, tfork_global->old_sig_chld);
+	kill(waiter_pid, SIGKILL);
 
-	if (tfork_global->level0_status != 0) {
-		anonymous_shared_free(tfork_global);
-		tfork_global = NULL;
-		errno = ECHILD;
-		return -1;
+	do {
+		pid = waitpid(waiter_pid, &waiter_status, 0);
+	} while ((pid == -1) && (errno == EINTR));
+	assert(pid == waiter_pid);
+
+	return 0;
+}
+
+struct tfork *tfork_create(void)
+{
+	struct tfork_state *state = NULL;
+	struct tfork *t = NULL;
+	pid_t pid;
+	int saved_errno;
+	int ret = 0;
+
+#ifdef HAVE_PTHREAD
+	ret = pthread_once(&tfork_global_is_initialized,
+			   tfork_global_initialize);
+	if (ret != 0) {
+		return NULL;
+	}
+#else
+	tfork_global_initialize();
+#endif
+
+	state = tfork_global_get();
+	if (state == NULL) {
+		return NULL;
+	}
+	*state = (struct tfork_state) {
+		.waiter_pid = -1,
+		.waiter_errno = ECANCELED,
+		.worker_pid = -1,
+	};
+
+	t = malloc(sizeof(struct tfork));
+	if (t == NULL) {
+		ret = -1;
+		goto cleanup;
 	}
 
-	if (tfork_global->level2_pid == -1) {
-		int saved_errno = tfork_global->level1_errno;
+	*t = (struct tfork) {
+		.event_fd = -1,
+		.status_fd = -1,
+		.waiter_pid = -1,
+		.worker_pid = -1,
+	};
+
+	ret = tfork_install_sigchld_handler(&state->waiter_pid);
+	if (ret != 0) {
+		goto cleanup;
+	}
+
+	pid = tfork_start_waiter_and_worker(state,
+					    &t->event_fd,
+					    &t->status_fd);
+	if (pid == -1) {
+		ret = -1;
+		goto cleanup;
+	}
+	if (pid == 0) {
+		/* In the worker */
+		tfork_global_free();
+		t->worker_pid = 0;
+		return t;
+	}
+
+	t->waiter_pid = pid;
+	t->worker_pid = state->worker_pid;
+
+cleanup:
+	if (ret == -1) {
+		saved_errno = errno;
+
+		if (t != NULL) {
+			if (t->status_fd != -1) {
+				close(t->status_fd);
+			}
+			if (t->event_fd != -1) {
+				close(t->event_fd);
+			}
+
+			ret = tfork_create_reap_waiter(state->waiter_pid);
+			assert(ret == 0);
+
+			free(t);
+			t = NULL;
+		}
+	}
+
+	ret = tfork_uninstall_sigchld_handler();
+	assert(ret == 0);
+
+	tfork_global_free();
 
-		anonymous_shared_free(tfork_global);
-		tfork_global = NULL;
+	if (ret == -1) {
 		errno = saved_errno;
+	}
+	return t;
+}
+
+pid_t tfork_child_pid(const struct tfork *t)
+{
+	return t->worker_pid;
+}
+
+int tfork_event_fd(const struct tfork *t)
+{
+	return t->event_fd;
+}
+
+int tfork_status(struct tfork **_t, bool wait)
+{
+	struct tfork *t = *_t;
+	int status;
+	ssize_t nread;
+	int waiter_status;
+	pid_t pid;
+	int ret;
+
+	if (t == NULL) {
 		return -1;
 	}
 
-	if (tfork_global->level3_pid == -1) {
-		int saved_errno = tfork_global->level2_errno;
+	if (wait) {
+		set_blocking(t->status_fd, true);
 
-		anonymous_shared_free(tfork_global);
-		tfork_global = NULL;
-		errno = saved_errno;
+		nread = sys_read(t->status_fd, &status, sizeof(int));
+	} else {
+		set_blocking(t->status_fd, false);
+
+		nread = read(t->status_fd, &status, sizeof(int));
+		if ((nread == -1) &&
+		    ((errno == EAGAIN) || (errno == EWOULDBLOCK) || errno == EINTR)) {
+			errno = EAGAIN;
+			return -1;
+		}
+	}
+	if (nread != sizeof(int)) {
 		return -1;
 	}
 
-	child = tfork_global->level3_pid;
-	anonymous_shared_free(tfork_global);
-	tfork_global = NULL;
+	ret = tfork_install_sigchld_handler(&t->waiter_pid);
+	if (ret != 0) {
+		return -1;
+	}
+
+	/*
+	 * This triggers process exit in the waiter.
+	 */
+	close(t->status_fd);
+
+	do {
+		pid = waitpid(t->waiter_pid, &waiter_status, 0);
+	} while ((pid == -1) && (errno == EINTR));
+	assert(pid == t->waiter_pid);
+
+	close(t->event_fd);
+
+	free(t);
+	t = NULL;
+	*_t = NULL;
+
+	ret = tfork_uninstall_sigchld_handler();
+	assert(ret == 0);
+
+	return status;
+}
+
+int tfork_destroy(struct tfork **_t)
+{
+        struct tfork *t = *_t;
+        int ret;
+
+        if (t == NULL) {
+                errno = EINVAL;
+                return -1;
+        }
+
+        kill(t->worker_pid, SIGKILL);
+
+        ret = tfork_status(_t, true);
+        if (ret == -1) {
+                return -1;
+        }
 
-	return child;
+        return 0;
 }
diff --git a/lib/util/tfork.h b/lib/util/tfork.h
index 0c62fc3..1fea2ba 100644
--- a/lib/util/tfork.h
+++ b/lib/util/tfork.h
@@ -21,32 +21,86 @@
 #ifndef LIB_UTIL_TFORK_H
 #define LIB_UTIL_TFORK_H
 
+struct tfork;
+
 /**
  * @brief a fork() that avoids SIGCHLD and waitpid
  *
- * This function is a workaround for the problem of using fork() in
- * library code. In that case the library should avoid to set a global
- * signal handler for SIGCHLD, because the application may wants to use its
- * own handler.
- *
- * The child process will start with SIGCHLD handler set to SIG_DFL, so the
- * child might need to setup its own handler.
- *
- * @param[out] status_fd  If this is not NULL, tfork creates a pipe and returns
- *                        the readable end via this pointer. The caller can
- *                        wait for the process to finish by polling the
- *                        status_fd for readability and can then read the exit
- *                        status (an int).
- *
- * @param[out] parent     The PID of the parent process, if 0 is returned
- *                        otherwise the variable will not be touched at all.
- *                        It is possible to pass NULL.
- *
- * @return                On success, the PID of the child process is returned
- *                        in the parent, and 0 is returned in the child. On
- *                        failure, -1 is returned in the parent, no child
- *                        process is created, and errno is set appropriately.
- */
-int tfork(int *status_fd, int *parent);
+ * This function is a solution to the problem of fork() requiring special
+ * preperations in the caller to handle SIGCHLD signals and to reap the child by
+ * wait()ing for it.
+ *
+ * The advantage over fork() is that the child process termination is signalled
+ * to the caller by making a pipe fd readable returned by tfork_event_fd(), in
+ * which case the exit status of the child can be fetched with tfork_status()
+ * without blocking.
+ *
+ * The child process will start with SIGCHLD handler set to SIG_DFL.
+ *
+ * @return                On success, a struct tfork. NULL on failure.
+ *                        Use tfork_worker_pid() to get the pid of the created
+ *                        child and tfork_event_fd() to get the file descriptor
+ *                        that can be used to poll for process termination and
+ *                        reading the child process exit status.
+ *
+ * @note There's one thing this thing can't protect us against and that is if a
+ * process installs a SIGCHLD handler from one thread while another thread is
+ * running inside tfork_create() or tfork_status() and the signal handler
+ * doesn't forward signals for exitted childs it didn't fork, ie our childs.
+ **/
+struct tfork *tfork_create(void);
+
+/**
+ * @brief Return the child pid from tfork_create()
+ *
+ * @param[in]   t    Pointer to struct tfork returned by tfork_create()
+ *
+ * @return           In the caller this returns the pid of the child,
+ *                   in the child this returns 0.
+ **/
+pid_t tfork_child_pid(const struct tfork *t);
+
+/**
+ * @brief Return an event fd that signals child termination
+ *
+ * @param[in]   t    Pointer to struct tfork returned by tfork_create()
+ *
+ * @return           An fd that becomes readable when the child created with
+ *                   tfork_create() terminates. It is guaranteed that a
+ *                   subsequent call to tfork_status() will not block and return
+ *                   the exit status of the child.
+ **/
+int tfork_event_fd(const struct tfork *t);
+
+/**
+ * @brief Wait for the child to terminate and return its exit status
+ *
+ * @param[in]   t     Pointer-pointer to a struct tfork returned by
+ *                    tfork_create(). Upon successful completion t is freed and
+ *                    set to NULL.
+ *
+ * @param[in]   wait  Whether to wait for the child to change state. If wait is
+ *                    false, and the child hasn't changed state, tfork_status()
+ *                    will return -1 with errno set to EAGAIN. If wait is true,
+ *                    tfork_status() will block waiting for the child to change
+ *                    runstate.
+ *
+ * @return            The exit status of the child, -1 on error.
+ *
+ * @note We overload the return value a bit, but a process exit status is pretty
+ * much guaranteed to be a 16-bit int and can't be -1.
+ **/
+int tfork_status(struct tfork **_t, bool wait);
+
+/**
+ * @brief Terminate the child discarding the exit status
+ *
+ * @param[in]   t     Pointer-pointer to a struct tfork returned by
+ *                    tfork_create(). Upon successful completion t is freed and
+ *                    set to NULL.
+ *
+ * @return            0 on success, -1 on error.
+ **/
+int tfork_destroy(struct tfork **_t);
 
 #endif /* LIB_UTIL_TFORK_H */
-- 
2.9.4


From 0f7a20f9f9bb84713f37f2c4b4c701b067efc62f Mon Sep 17 00:00:00 2001
From: Ralph Boehme <slow at samba.org>
Date: Tue, 16 May 2017 18:36:03 +0200
Subject: [PATCH 2/4] lib/util: make use of tfork in samba_runcmd_send()

This makes it possible to use samba_runcmd_send() in processes like smbd
that install a SIGCHLD handler that reaps all terminated children.

Signed-off-by: Ralph Boehme <slow at samba.org>
Reviewed-by: Stefan Metzmacher <metze at samba.org>
---
 lib/util/util_runcmd.c | 105 ++++++++++++++++++++++++++-----------------------
 lib/util/util_runcmd.h |   4 +-
 2 files changed, 59 insertions(+), 50 deletions(-)

diff --git a/lib/util/util_runcmd.c b/lib/util/util_runcmd.c
index 9264cbb..ac74371 100644
--- a/lib/util/util_runcmd.c
+++ b/lib/util/util_runcmd.c
@@ -29,6 +29,8 @@
 #include "system/filesys.h"
 #include "../lib/util/tevent_unix.h"
 #include "../lib/util/util_runcmd.h"
+#include "../lib/util/tfork.h"
+#include "../lib/util/sys_rw.h"
 
 static int samba_runcmd_state_destructor(struct samba_runcmd_state *state)
 {
@@ -106,8 +108,9 @@ struct tevent_req *samba_runcmd_send(TALLOC_CTX *mem_ctx,
 		return tevent_req_post(req, ev);
 	}
 
-	state->pid = fork();
-	if (state->pid == (pid_t)-1) {
+	state->tfork = tfork_create();
+	if (state->tfork == NULL) {
+		printf("state->tfork == NULL\n");
 		close(p1[0]);
 		close(p1[1]);
 		close(p2[0]);
@@ -117,7 +120,7 @@ struct tevent_req *samba_runcmd_send(TALLOC_CTX *mem_ctx,
 		tevent_req_error(req, errno);
 		return tevent_req_post(req, ev);
 	}
-
+	state->pid = tfork_child_pid(state->tfork);
 	if (state->pid != 0) {
 		/* the parent */
 		close(p1[1]);
@@ -126,14 +129,17 @@ struct tevent_req *samba_runcmd_send(TALLOC_CTX *mem_ctx,
 		state->fd_stdout = p1[0];
 		state->fd_stderr = p2[0];
 		state->fd_stdin  = p3[1];
+		state->fd_status = tfork_event_fd(state->tfork);
 
 		set_blocking(state->fd_stdout, false);
 		set_blocking(state->fd_stderr, false);
 		set_blocking(state->fd_stdin,  false);
+		set_blocking(state->fd_status, false);
 
 		smb_set_close_on_exec(state->fd_stdin);
 		smb_set_close_on_exec(state->fd_stdout);
 		smb_set_close_on_exec(state->fd_stderr);
+		smb_set_close_on_exec(state->fd_status);
 
 		talloc_set_destructor(state, samba_runcmd_state_destructor);
 
@@ -145,6 +151,7 @@ struct tevent_req *samba_runcmd_send(TALLOC_CTX *mem_ctx,
 		if (tevent_req_nomem(state->fde_stdout, req)) {
 			close(state->fd_stdout);
 			close(state->fd_stderr);
+			close(state->fd_status);
 			return tevent_req_post(req, ev);
 		}
 		tevent_fd_set_auto_close(state->fde_stdout);
@@ -155,11 +162,26 @@ struct tevent_req *samba_runcmd_send(TALLOC_CTX *mem_ctx,
 						  samba_runcmd_io_handler,
 						  req);
 		if (tevent_req_nomem(state->fde_stdout, req)) {
+			close(state->fd_stdout);
 			close(state->fd_stderr);
+			close(state->fd_status);
 			return tevent_req_post(req, ev);
 		}
 		tevent_fd_set_auto_close(state->fde_stderr);
 
+		state->fde_status = tevent_add_fd(ev, state,
+						  state->fd_status,
+						  TEVENT_FD_READ,
+						  samba_runcmd_io_handler,
+						  req);
+		if (tevent_req_nomem(state->fde_stdout, req)) {
+			close(state->fd_stdout);
+			close(state->fd_stderr);
+			close(state->fd_status);
+			return tevent_req_post(req, ev);
+		}
+		tevent_fd_set_auto_close(state->fde_status);
+
 		if (!timeval_is_zero(&endtime)) {
 			tevent_req_set_endtime(req, ev, endtime);
 		}
@@ -231,6 +253,10 @@ static void samba_runcmd_io_handler(struct tevent_context *ev,
 	char *p;
 	int n, fd;
 
+	if (!(flags & TEVENT_FD_READ)) {
+		return;
+	}
+
 	if (fde == state->fde_stdout) {
 		level = state->stdout_log_level;
 		fd = state->fd_stdout;
@@ -238,10 +264,33 @@ static void samba_runcmd_io_handler(struct tevent_context *ev,
 		level = state->stderr_log_level;
 		fd = state->fd_stderr;
 	} else {
-		return;
-	}
+		int status;
 
-	if (!(flags & TEVENT_FD_READ)) {
+		status = tfork_status(&state->tfork, false);
+		if (status == -1) {
+			if (errno == EAGAIN || errno == EWOULDBLOCK) {
+				return;
+			}
+			DBG_ERR("Bad read on status pipe\n");
+			tevent_req_error(req, errno);
+			return;
+		}
+
+		if (WIFEXITED(status)) {
+			status = WEXITSTATUS(status);
+		} else if (WIFSIGNALED(status)) {
+			status = WTERMSIG(status);
+		} else {
+			status = ECHILD;
+		}
+
+		DBG_NOTICE("Child %s exited %d\n", state->arg0, status);
+		if (status != 0) {
+			tevent_req_error(req, status);
+			return;
+		}
+
+		tevent_req_done(req);
 		return;
 	}
 
@@ -253,53 +302,11 @@ static void samba_runcmd_io_handler(struct tevent_context *ev,
 		if (fde == state->fde_stdout) {
 			talloc_free(fde);
 			state->fde_stdout = NULL;
+			return;
 		}
 		if (fde == state->fde_stderr) {
 			talloc_free(fde);
 			state->fde_stderr = NULL;
-		}
-		if (state->fde_stdout == NULL &&
-		    state->fde_stderr == NULL) {
-			int status;
-			/* the child has closed both stdout and
-			 * stderr, assume its dead */
-			pid_t pid = waitpid(state->pid, &status, 0);
-			if (pid != state->pid) {
-				if (errno == ECHILD) {
-					/* this happens when the
-					   parent has set SIGCHLD to
-					   SIG_IGN. In that case we
-					   can only get error
-					   information for the child
-					   via its logging. We should
-					   stop using SIG_IGN on
-					   SIGCHLD in the standard
-					   process model.
-					*/
-					DEBUG(0, ("Error in waitpid() unexpectedly got ECHILD "
-						  "for %s child %d - %s, "
-						  "someone has set SIGCHLD to SIG_IGN!\n",
-					state->arg0, (int)state->pid, strerror(errno)));
-					tevent_req_error(req, errno);
-					return;
-				}
-				DEBUG(0,("Error in waitpid() for child %s - %s \n",
-					 state->arg0, strerror(errno)));
-				if (errno == 0) {
-					errno = ECHILD;
-				}
-				tevent_req_error(req, errno);
-				return;
-			}
-			status = WEXITSTATUS(status);
-			DEBUG(3,("Child %s exited with status %d\n",
-				 state->arg0, status));
-			if (status != 0) {
-				tevent_req_error(req, status);
-				return;
-			}
-
-			tevent_req_done(req);
 			return;
 		}
 		return;
diff --git a/lib/util/util_runcmd.h b/lib/util/util_runcmd.h
index 26fdbc6..5532961 100644
--- a/lib/util/util_runcmd.h
+++ b/lib/util/util_runcmd.h
@@ -27,9 +27,11 @@ struct samba_runcmd_state {
 	int stderr_log_level;
 	struct tevent_fd *fde_stdout;
 	struct tevent_fd *fde_stderr;
-	int fd_stdin, fd_stdout, fd_stderr;
+	struct tevent_fd *fde_status;
+	int fd_stdin, fd_stdout, fd_stderr, fd_status;
 	char *arg0;
 	pid_t pid;
+	struct tfork *tfork;
 	char buf[1024];
 	uint16_t buf_used;
 };
-- 
2.9.4


From 566e81ada385f77905eb165da3ed0903e55e88d6 Mon Sep 17 00:00:00 2001
From: Ralph Boehme <slow at samba.org>
Date: Thu, 18 May 2017 12:02:22 +0200
Subject: [PATCH 3/4] lib/util: adjust loglevel in tfork test with
 samba_runcmd_send()

No change in behaviour, this just ensures stdout and stderror are
logged with log level 0.

Signed-off-by: Ralph Boehme <slow at samba.org>
Reviewed-by: Stefan Metzmacher <metze at samba.org>
---
 lib/util/tests/tfork.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/util/tests/tfork.c b/lib/util/tests/tfork.c
index 2f140dd..2bb941d 100644
--- a/lib/util/tests/tfork.c
+++ b/lib/util/tests/tfork.c
@@ -103,7 +103,7 @@ static bool test_tfork_cmd_send(struct torture_context *tctx)
 	torture_assert_goto(tctx, cmd[0] != NULL, ok, done,
 			    "talloc_asprintf failed\n");
 
-	req = samba_runcmd_send(tctx, ev, timeval_zero(), 1, 0,
+	req = samba_runcmd_send(tctx, ev, timeval_zero(), 0, 0,
 				cmd, "foo", NULL);
 	torture_assert_goto(tctx, req != NULL, ok, done,
 			    "samba_runcmd_send failed\n");
-- 
2.9.4


From 7dfbd6a2bf719ee51335d4f3c40a98fbf87f5db6 Mon Sep 17 00:00:00 2001
From: Ralph Boehme <slow at samba.org>
Date: Fri, 26 May 2017 18:10:07 +0200
Subject: [PATCH 4/4] lib/util: add more tfork tests

Signed-off-by: Ralph Boehme <slow at samba.org>
Reviewed-by: Stefan Metzmacher <metze at samba.org>
---
 lib/util/tests/tfork.c | 438 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 438 insertions(+)

diff --git a/lib/util/tests/tfork.c b/lib/util/tests/tfork.c
index 2bb941d..c900502 100644
--- a/lib/util/tests/tfork.c
+++ b/lib/util/tests/tfork.c
@@ -29,6 +29,10 @@
 #include "lib/util/tfork.h"
 #include "lib/util/samba_util.h"
 #include "lib/util/sys_rw.h"
+#ifdef HAVE_PTHREAD
+#include <pthread.h>
+#include <sys/syscall.h>
+#endif
 
 static bool test_tfork_simple(struct torture_context *tctx)
 {
@@ -88,6 +92,416 @@ done:
 	return ok;
 }
 
+static bool test_tfork_sigign(struct torture_context *tctx)
+{
+	struct tfork *t = NULL;
+	struct sigaction act;
+	pid_t child;
+	int status;
+	bool ok = true;
+	int ret;
+
+	act = (struct sigaction) {
+		.sa_flags = SA_NOCLDWAIT,
+		.sa_handler = SIG_IGN,
+	};
+
+	ret = sigaction(SIGCHLD, &act, NULL);
+	torture_assert_goto(tctx, ret == 0, ok, done, "sigaction failed\n");
+
+	t = tfork_create();
+	if (t == NULL) {
+		torture_fail(tctx, "tfork failed\n");
+		return false;
+	}
+	child = tfork_child_pid(t);
+	if (child == 0) {
+		sleep(1);
+		_exit(123);
+	}
+
+	child = fork();
+	if (child == -1) {
+		torture_fail(tctx, "fork failed\n");
+		return false;
+	}
+	if (child == 0) {
+		_exit(0);
+	}
+
+	status = tfork_status(&t, true);
+	if (status == -1) {
+		torture_fail(tctx, "tfork_status failed\n");
+	}
+
+	torture_assert_goto(tctx, WIFEXITED(status) == true, ok, done,
+			    "tfork failed\n");
+	torture_assert_goto(tctx, WEXITSTATUS(status) == 123, ok, done,
+			    "tfork failed\n");
+	torture_comment(tctx, "exit status [%d]\n", WEXITSTATUS(status));
+
+done:
+	return ok;
+}
+
+static void sigchld_handler1(int signum, siginfo_t *si, void *u)
+{
+	pid_t pid;
+	int status;
+
+	if (signum != SIGCHLD) {
+		abort();
+	}
+
+	pid = waitpid(si->si_pid, &status, 0);
+	if (pid != si->si_pid) {
+		abort();
+	}
+}
+
+static bool test_tfork_sighandler(struct torture_context *tctx)
+{
+	struct tfork *t = NULL;
+	struct sigaction act;
+	struct sigaction oldact;
+	pid_t child;
+	int status;
+	bool ok = true;
+	int ret;
+
+	act = (struct sigaction) {
+		.sa_flags = SA_SIGINFO,
+		.sa_sigaction = sigchld_handler1,
+	};
+
+	ret = sigaction(SIGCHLD, &act, &oldact);
+	torture_assert_goto(tctx, ret == 0, ok, done, "sigaction failed\n");
+
+	t = tfork_create();
+	if (t == NULL) {
+		torture_fail(tctx, "tfork failed\n");
+		return false;
+	}
+	child = tfork_child_pid(t);
+	if (child == 0) {
+		sleep(1);
+		_exit(123);
+	}
+
+	child = fork();
+	if (child == -1) {
+		torture_fail(tctx, "fork failed\n");
+		return false;
+	}
+	if (child == 0) {
+		_exit(0);
+	}
+
+	status = tfork_status(&t, true);
+	if (status == -1) {
+		torture_fail(tctx, "tfork_status failed\n");
+	}
+
+	torture_assert_goto(tctx, WIFEXITED(status) == true, ok, done,
+			    "tfork failed\n");
+	torture_assert_goto(tctx, WEXITSTATUS(status) == 123, ok, done,
+			    "tfork failed\n");
+	torture_comment(tctx, "exit status [%d]\n", WEXITSTATUS(status));
+
+done:
+	sigaction(SIGCHLD, &oldact, NULL);
+
+	return ok;
+}
+
+static bool test_tfork_process_hierarchy(struct torture_context *tctx)
+{
+	struct tfork *t = NULL;
+	pid_t pid = getpid();
+	pid_t child;
+	pid_t pgid = getpgid(0);
+	pid_t sid = getsid(0);
+	char *procpath = NULL;
+	int status;
+	struct stat st;
+	int ret;
+	bool ok = true;
+
+	procpath = talloc_asprintf(tctx, "/proc/%d/status", getpid());
+	torture_assert_not_null(tctx, procpath, "talloc_asprintf failed\n");
+
+	ret = stat(procpath, &st);
+	TALLOC_FREE(procpath);
+	if (ret != 0) {
+		if (errno == ENOENT) {
+			torture_skip(tctx, "/proc missing\n");
+		}
+		torture_fail(tctx, "stat failed\n");
+	}
+
+	t = tfork_create();
+	if (t == NULL) {
+		torture_fail(tctx, "tfork failed\n");
+		return false;
+	}
+	child = tfork_child_pid(t);
+	if (child == 0) {
+		char *cmd = NULL;
+		FILE *fp = NULL;
+		char line[64];
+		char *p;
+		pid_t ppid;
+
+		torture_assert_goto(tctx, pgid == getpgid(0), ok, child_fail, "tfork failed\n");
+		torture_assert_goto(tctx, sid == getsid(0), ok, child_fail, "tfork failed\n");
+
+		cmd = talloc_asprintf(tctx, "cat /proc/%d/status | awk '/^PPid:/ {print $2}'", getppid());
+		torture_assert_goto(tctx, cmd != NULL, ok, child_fail, "talloc_asprintf failed\n");
+
+		fp = popen(cmd, "r");
+		torture_assert_goto(tctx, fp != NULL, ok, child_fail, "popen failed\n");
+
+		p = fgets(line, sizeof(line) - 1, fp);
+		pclose(fp);
+		torture_assert_goto(tctx, p != NULL, ok, child_fail, "popen failed\n");
+
+		ret = sscanf(line, "%d", &ppid);
+		torture_assert_goto(tctx, ret == 1, ok, child_fail, "sscanf failed\n");
+		torture_assert_goto(tctx, ppid == pid, ok, child_fail, "process hierachy not rooted at caller\n");
+
+		_exit(0);
+
+	child_fail:
+		_exit(1);
+	}
+
+	status = tfork_status(&t, true);
+	if (status == -1) {
+		torture_fail(tctx, "tfork_status failed\n");
+	}
+
+	torture_assert_goto(tctx, WIFEXITED(status) == true, ok, done,
+			    "tfork failed\n");
+	torture_assert_goto(tctx, WEXITSTATUS(status) == 0, ok, done,
+			    "tfork failed\n");
+	torture_comment(tctx, "exit status [%d]\n", WEXITSTATUS(status));
+
+done:
+	return ok;
+}
+
+static bool test_tfork_pipe(struct torture_context *tctx)
+{
+	struct tfork *t = NULL;
+	int status;
+	pid_t child;
+	int up[2];
+	int down[2];
+	char c;
+	int ret;
+	bool ok = true;
+
+	ret = pipe(&up[0]);
+	torture_assert(tctx, ret == 0, "pipe failed\n");
+
+	ret = pipe(&down[0]);
+	torture_assert(tctx, ret == 0, "pipe failed\n");
+
+	t = tfork_create();
+	if (t == NULL) {
+		torture_fail(tctx, "tfork failed\n");
+		return false;
+	}
+	child = tfork_child_pid(t);
+	if (child == 0) {
+		close(up[0]);
+		close(down[1]);
+
+		ret = read(down[0], &c, 1);
+		torture_assert_goto(tctx, ret == 1, ok, child_fail, "read failed\n");
+		torture_assert_goto(tctx, c == 1, ok, child_fail, "read failed\n");
+
+		ret = write(up[1], &(char){2}, 1);
+		torture_assert_goto(tctx, ret == 1, ok, child_fail, "write failed\n");
+
+		_exit(0);
+
+	child_fail:
+		_exit(1);
+	}
+
+	close(up[1]);
+	close(down[0]);
+
+	ret = write(down[1], &(char){1}, 1);
+	torture_assert(tctx, ret == 1, "read failed\n");
+
+	ret = read(up[0], &c, 1);
+	torture_assert(tctx, ret == 1, "read failed\n");
+	torture_assert(tctx, c == 2, "read failed\n");
+
+	status = tfork_status(&t, true);
+	if (status == -1) {
+		torture_fail(tctx, "tfork_status failed\n");
+	}
+
+	torture_assert_goto(tctx, WIFEXITED(status) == true, ok, done,
+			    "tfork failed\n");
+	torture_assert_goto(tctx, WEXITSTATUS(status) == 0, ok, done,
+			    "tfork failed\n");
+done:
+	return ok;
+}
+
+static bool test_tfork_twice(struct torture_context *tctx)
+{
+	struct tfork *t = NULL;
+	int status;
+	pid_t child;
+	pid_t pid;
+	int up[2];
+	int ret;
+	bool ok = true;
+
+	ret = pipe(&up[0]);
+	torture_assert(tctx, ret == 0, "pipe failed\n");
+
+	t = tfork_create();
+	if (t == NULL) {
+		torture_fail(tctx, "tfork failed\n");
+		return false;
+	}
+	child = tfork_child_pid(t);
+	if (child == 0) {
+		close(up[0]);
+
+		t = tfork_create();
+		if (t == NULL) {
+			torture_fail(tctx, "tfork failed\n");
+			return false;
+		}
+		child = tfork_child_pid(t);
+		if (child == 0) {
+			sleep(1);
+			pid = getpid();
+			ret = write(up[1], &pid, sizeof(pid_t));
+			torture_assert_goto(tctx, ret == sizeof(pid_t), ok, child_fail, "write failed\n");
+
+			_exit(0);
+
+		child_fail:
+			_exit(1);
+		}
+
+		_exit(0);
+	}
+
+	close(up[1]);
+
+	ret = read(up[0], &pid, sizeof(pid_t));
+	torture_assert(tctx, ret == sizeof(pid_t), "read failed\n");
+
+	status = tfork_status(&t, true);
+	torture_assert_goto(tctx, status != -1, ok, done, "tfork_status failed\n");
+
+	torture_assert_goto(tctx, WIFEXITED(status) == true, ok, done,
+			    "tfork failed\n");
+	torture_assert_goto(tctx, WEXITSTATUS(status) == 0, ok, done,
+			    "tfork failed\n");
+done:
+	return ok;
+}
+
+static void *tfork_thread(void *p)
+{
+	struct tfork *t = NULL;
+	int status;
+	pid_t child;
+	pthread_t *ptid = (pthread_t *)p;
+	uint64_t tid;
+	uint64_t *result = NULL;
+	int up[2];
+	ssize_t nread;
+	int ret;
+
+	ret = pipe(up);
+	if (ret != 0) {
+		pthread_exit(NULL);
+	}
+
+	tid = (uint64_t)*ptid;
+
+	t = tfork_create();
+	if (t == NULL) {
+		pthread_exit(NULL);
+	}
+	child = tfork_child_pid(t);
+	if (child == 0) {
+		ssize_t nwritten;
+
+		close(up[0]);
+		tid++;
+		nwritten = sys_write(up[1], &tid, sizeof(uint64_t));
+		if (nwritten != sizeof(uint64_t)) {
+			_exit(1);
+		}
+		_exit(0);
+	}
+	close(up[1]);
+
+	result = malloc(sizeof(uint64_t));
+	if (result == NULL) {
+		pthread_exit(NULL);
+	}
+
+	nread = sys_read(up[0], result, sizeof(uint64_t));
+	if (nread != sizeof(uint64_t)) {
+		pthread_exit(NULL);
+	}
+
+	status = tfork_status(&t, true);
+	if (status == -1) {
+		pthread_exit(NULL);
+	}
+
+	pthread_exit(result);
+}
+
+static bool test_tfork_threads(struct torture_context *tctx)
+{
+	int ret;
+	bool ok = true;
+	const int num_threads = 64;
+	pthread_t threads[num_threads];
+	int i;
+
+#ifndef HAVE_PTHREAD
+	torture_skip(tctx, "no pthread support\n");
+#endif
+
+	for (i = 0; i < num_threads; i++) {
+		ret = pthread_create(&threads[i], NULL, tfork_thread, &threads[i]);
+		torture_assert_goto(tctx, ret == 0, ok, done,
+				    "pthread_create failed\n");
+	}
+
+	for (i = 0; i < num_threads; i++) {
+		void *p;
+		uint64_t *result;
+
+		ret = pthread_join(threads[i], &p);
+		torture_assert_goto(tctx, ret == 0, ok, done,
+				    "pthread_join failed\n");
+		result = (uint64_t *)p;
+		torture_assert_goto(tctx, *result == (uint64_t)threads[i] + 1,
+				    ok, done, "thread failed\n");
+		free(p);
+	}
+
+done:
+	return ok;
+}
+
 static bool test_tfork_cmd_send(struct torture_context *tctx)
 {
 	struct tevent_context *ev = NULL;
@@ -133,6 +547,30 @@ struct torture_suite *torture_local_tfork(TALLOC_CTX *mem_ctx)
 				      test_tfork_status);
 
 	torture_suite_add_simple_test(suite,
+				      "tfork_sigign",
+				      test_tfork_sigign);
+
+	torture_suite_add_simple_test(suite,
+				      "tfork_sighandler",
+				      test_tfork_sighandler);
+
+	torture_suite_add_simple_test(suite,
+				      "tfork_process_hierarchy",
+				      test_tfork_process_hierarchy);
+
+	torture_suite_add_simple_test(suite,
+				      "tfork_pipe",
+				      test_tfork_pipe);
+
+	torture_suite_add_simple_test(suite,
+				      "tfork_twice",
+				      test_tfork_twice);
+
+	torture_suite_add_simple_test(suite,
+				      "tfork_threads",
+				      test_tfork_threads);
+
+	torture_suite_add_simple_test(suite,
 				      "tfork_cmd_send",
 				      test_tfork_cmd_send);
 
-- 
2.9.4



More information about the samba-technical mailing list