[PATCHES][BUG 13170] Fix crash in pthreadpool thread after failure from pthread_create

Volker Lendecke Volker.Lendecke at SerNet.DE
Thu Nov 30 12:40:00 UTC 2017


On Wed, Nov 29, 2017 at 12:23:34PM -0700, Christof Schmitt via samba-technical wrote:
> As discussed in a side-thread, my initial patch lets the error surface
> back to the client. Here is an updated patch that avoids this issue. If
> no thread can be created, but another still exists, let that handle the
> request. If no thread exists, then fallback to processing the job
> synchronously, as the system might be hitting a resource limit.

From a review perspective, this looks good. For the refactoring one of
course RB+. However, pthreadpool being such a fragile piece of code I
would love to see unit tests for this.  This will require explicit
error injects with some DEVELOPER only code, but is it possible to get
this? If you don't have time, I might try to code something up.

On top of your refactoring patch, attached find another pthreadpool
race condition fix. For me it does fix something, but it also
introduces another piece of technical debt. I have to put some more
thought into handling pthread_create failure in the parent atfork
handler. This might require interaction with the users of this code,
don't know yet.

It's running a private autobuild now, running pthreadpooltest manually
gives the expected result: Without the fix it fails, with the fix it
succeeds.

Comments?

Volker

-- 
SerNet GmbH, Bahnhofsallee 1b, 37081 Göttingen
phone: +49-551-370000-0, fax: +49-551-370000-9
AG Göttingen, HRB 2816, GF: Dr. Johannes Loxen
http://www.sernet.de, mailto:kontakt at sernet.de
-------------- next part --------------
From 163df5a06d37eaab665f94a29f22b11bbc17bab6 Mon Sep 17 00:00:00 2001
From: Volker Lendecke <vl at samba.org>
Date: Wed, 29 Nov 2017 16:45:40 +0100
Subject: [PATCH 1/2] pthreadpool: Fix a race condition

The comment says it all

Signed-off-by: Volker Lendecke <vl at samba.org>
---
 lib/pthreadpool/pthreadpool.c | 23 +++++++++++++++++++++++
 1 file changed, 23 insertions(+)

diff --git a/lib/pthreadpool/pthreadpool.c b/lib/pthreadpool/pthreadpool.c
index 874ddd23c1d..abf0e8f6f61 100644
--- a/lib/pthreadpool/pthreadpool.c
+++ b/lib/pthreadpool/pthreadpool.c
@@ -103,6 +103,7 @@ static struct pthreadpool *pthreadpools = NULL;
 static pthread_once_t pthreadpool_atfork_initialized = PTHREAD_ONCE_INIT;
 
 static void pthreadpool_prep_atfork(void);
+static int pthreadpool_create_thread(struct pthreadpool *pool);
 
 /*
  * Initialize a thread pool
@@ -244,6 +245,27 @@ static void pthreadpool_parent(void)
 	     pool = DLIST_PREV(pool)) {
 		ret = pthread_cond_init(&pool->condvar, NULL);
 		assert(ret == 0);
+
+		if ((pool->num_jobs != 0) && (pool->num_threads == 0)) {
+			/*
+			 * We added a job with idle threads and
+			 * immediately did a fork before any of the
+			 * idle threads got scheduled.
+			 *
+			 * pthreadpool_prepare killed all helper
+			 * threads to get them off pool->condvar, so
+			 * we need to create at least one to get
+			 * things going again.
+			 */
+			ret = pthreadpool_create_thread(pool);
+
+			/*
+			 * TODO: Handle temporary thread creation
+			 * failure gracefully.
+			 */
+			assert(ret == 0);
+		}
+
 		ret = pthread_mutex_unlock(&pool->mutex);
 		assert(ret == 0);
 	}
@@ -268,6 +290,7 @@ static void pthreadpool_child(void)
 
 		ret = pthread_cond_init(&pool->condvar, NULL);
 		assert(ret == 0);
+
 		ret = pthread_mutex_unlock(&pool->mutex);
 		assert(ret == 0);
 	}
-- 
2.11.0


From 91ab9b6febdd7fc47cf9d1b26bcd01298ea934ff Mon Sep 17 00:00:00 2001
From: Volker Lendecke <vl at samba.org>
Date: Wed, 29 Nov 2017 18:55:21 +0100
Subject: [PATCH 2/2] pthreadpool: Add a test for the race condition fixed in
 the last commit

Signed-off-by: Volker Lendecke <vl at samba.org>
---
 lib/pthreadpool/tests.c | 82 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 82 insertions(+)

diff --git a/lib/pthreadpool/tests.c b/lib/pthreadpool/tests.c
index 1aab80c2bb4..f0ae0aa4a93 100644
--- a/lib/pthreadpool/tests.c
+++ b/lib/pthreadpool/tests.c
@@ -308,6 +308,82 @@ static int test_busyfork(void)
 	return 0;
 }
 
+static int test_busyfork2(void)
+{
+	struct pthreadpool_pipe *p;
+	pid_t child;
+	int ret, jobnum;
+	struct pollfd pfd;
+
+	ret = pthreadpool_pipe_init(1, &p);
+	if (ret != 0) {
+		fprintf(stderr, "pthreadpool_pipe_init failed: %s\n",
+			strerror(ret));
+		return -1;
+	}
+
+	ret = pthreadpool_pipe_add_job(p, 1, busyfork_job, NULL);
+	if (ret != 0) {
+		fprintf(stderr, "pthreadpool_add_job failed: %s\n",
+			strerror(ret));
+		return -1;
+	}
+
+	ret = pthreadpool_pipe_finished_jobs(p, &jobnum, 1);
+	if (ret != 1) {
+		fprintf(stderr, "pthreadpool_pipe_finished_jobs failed\n");
+		return -1;
+	}
+
+	ret = poll(NULL, 0, 10);
+	if (ret == -1) {
+		perror("poll failed");
+		return -1;
+	}
+
+	ret = pthreadpool_pipe_add_job(p, 1, busyfork_job, NULL);
+	if (ret != 0) {
+		fprintf(stderr, "pthreadpool_add_job failed: %s\n",
+			strerror(ret));
+		return -1;
+	}
+
+	/*
+	 * Do the fork right after the add_job. This tests a race
+	 * where the atfork prepare handler gets all idle threads off
+	 * the condvar. If we are faster doing the fork than the
+	 * existing idle thread could get out of idle and take the
+	 * job, after the fork we end up with no threads to take care
+	 * of the job.
+	 */
+
+	child = fork();
+	if (child < 0) {
+		perror("fork failed");
+		return -1;
+	}
+
+	if (child == 0) {
+		exit(0);
+	}
+
+	pfd = (struct pollfd) {
+		.fd = pthreadpool_pipe_signal_fd(p),
+		.events = POLLIN|POLLERR
+	};
+
+	do {
+		ret = poll(&pfd, 1, 5000);
+	} while ((ret == -1) && (errno == EINTR));
+
+	if (ret == 0) {
+		fprintf(stderr, "job unfinished after 5 seconds\n");
+		return -1;
+	}
+
+	return 0;
+}
+
 static void test_tevent_wait(void *private_data)
 {
 	int *timeout = private_data;
@@ -423,6 +499,12 @@ int main(void)
 		return 1;
 	}
 
+	ret = test_busyfork2();
+	if (ret != 0) {
+		fprintf(stderr, "test_busyfork2 failed\n");
+		return 1;
+	}
+
 	printf("success\n");
 	return 0;
 }
-- 
2.11.0



More information about the samba-technical mailing list