[PATCH] Fix a pthreadpool race

Volker Lendecke Volker.Lendecke at SerNet.DE
Wed Dec 6 17:56:56 UTC 2017


On Wed, Dec 06, 2017 at 05:53:57PM +0100, Volker Lendecke wrote:
> Hi!
> 
> This survives pthreadpooltest, it has not seen a full autobuild yet.
> I'd just like to post it for thorough review. It has gone through
> quite a few internal discussions back and forth bouncing ideas about
> different solutions. If someone has a simpler solution that survives
> the test in the second patch, I'm more than happy to review it.
> 
> Christof, looking at your patches now :-)
> 
> Review appreciated!

Sorry, this missed a prerequisite patch. Attached.

Volker

-- 
SerNet GmbH, Bahnhofsallee 1b, 37081 Göttingen
phone: +49-551-370000-0, fax: +49-551-370000-9
AG Göttingen, HRB 2816, GF: Dr. Johannes Loxen
http://www.sernet.de, mailto:kontakt at sernet.de
-------------- next part --------------
From 20b2b0a926a2119c77cc25127d9f6c6c683d4b13 Mon Sep 17 00:00:00 2001
From: Volker Lendecke <vl at samba.org>
Date: Tue, 5 Dec 2017 15:34:09 +0100
Subject: [PATCH 1/3] lib: Don't use PTHREAD_COND_INITIALIZER for a stack
 variable

susv4 speaks about statically allocated variables. Stack vars are not.

Signed-off-by: Volker Lendecke <vl at samba.org>
---
 lib/pthreadpool/pthreadpool.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/lib/pthreadpool/pthreadpool.c b/lib/pthreadpool/pthreadpool.c
index 23885aa6d11..646708adb62 100644
--- a/lib/pthreadpool/pthreadpool.c
+++ b/lib/pthreadpool/pthreadpool.c
@@ -179,9 +179,12 @@ int pthreadpool_init(unsigned max_threads, struct pthreadpool **presult,
 
 static void pthreadpool_prepare_pool(struct pthreadpool *pool)
 {
-	pthread_cond_t prefork_cond = PTHREAD_COND_INITIALIZER;
+	pthread_cond_t prefork_cond;
 	int ret;
 
+	ret = pthread_cond_init(&prefork_cond, NULL);
+	assert(ret == 0);
+
 	ret = pthread_mutex_lock(&pool->mutex);
 	assert(ret == 0);
 
-- 
2.11.0


From 03934e534d190542e91184ab47a591f68772c6ea Mon Sep 17 00:00:00 2001
From: Volker Lendecke <vl at samba.org>
Date: Wed, 29 Nov 2017 16:45:40 +0100
Subject: [PATCH 2/3] pthreadpool: Fix starvation after fork

After the race is before the race:

1) Create an idle thread
2) Add a job: This won't create a thread anymore
3) Immediately fork

The idle thread will be woken twice before it's actually woken up: Both
pthreadpool_add_job and pthreadpool_prepare_pool call cond_signal, for
different reasons. We must look at pool->prefork_cond first because otherwise
we will end up in a blocking job deep within a fork call, the helper thread
must take its fingers off the condvar as quickly as possible.  This means that
after the fork there's no idle thread around anymore that would pick up the job
submitted in 2). So we must keep the idle threads around across the fork.

The quick solution to re-create one helper thread in pthreadpool_parent has a
fatal flaw: What do we do if that pthread_create call fails? We're deep in an
application calling fork(), and doing fancy signalling from there is really
something we must avoid.

This has one potential performance issue: If we have hundreds of idle threads
(do we ever have that) during the fork, the call to pthread_mutex_lock on the
fork_mutex from pthreadpool_server (the helper thread) will probably cause a
thundering herd when the _parent call unlocks the fork_mutex. The solution for
this to just keep one idle thread around. But this adds code that is not
strictly required functionally for now.

Signed-off-by: Volker Lendecke <vl at samba.org>
---
 lib/pthreadpool/pthreadpool.c | 92 ++++++++++++++++++++++++++++++++++---------
 1 file changed, 73 insertions(+), 19 deletions(-)

diff --git a/lib/pthreadpool/pthreadpool.c b/lib/pthreadpool/pthreadpool.c
index 646708adb62..0bf7109a8f8 100644
--- a/lib/pthreadpool/pthreadpool.c
+++ b/lib/pthreadpool/pthreadpool.c
@@ -91,11 +91,19 @@ struct pthreadpool {
 	int num_idle;
 
 	/*
-	 * Condition variable indicating that we should quickly go
-	 * away making way for fork() without anybody waiting on
-	 * pool->condvar.
+	 * Condition variable indicating that helper threads should
+	 * quickly go away making way for fork() without anybody
+	 * waiting on pool->condvar.
 	 */
 	pthread_cond_t *prefork_cond;
+
+	/*
+	 * Waiting position for helper threads while fork is
+	 * running. The forking thread will have locked it, and all
+	 * idle helper threads will sit here until after the fork,
+	 * where the forking thread will unlock it again.
+	 */
+	pthread_mutex_t fork_mutex;
 };
 
 static pthread_mutex_t pthreadpools_mutex = PTHREAD_MUTEX_INITIALIZER;
@@ -151,6 +159,15 @@ int pthreadpool_init(unsigned max_threads, struct pthreadpool **presult,
 		return ret;
 	}
 
+	ret = pthread_mutex_init(&pool->fork_mutex, NULL);
+	if (ret != 0) {
+		pthread_cond_destroy(&pool->condvar);
+		pthread_mutex_destroy(&pool->mutex);
+		free(pool->jobs);
+		free(pool);
+		return ret;
+	}
+
 	pool->shutdown = false;
 	pool->num_threads = 0;
 	pool->max_threads = max_threads;
@@ -159,6 +176,7 @@ int pthreadpool_init(unsigned max_threads, struct pthreadpool **presult,
 
 	ret = pthread_mutex_lock(&pthreadpools_mutex);
 	if (ret != 0) {
+		pthread_mutex_destroy(&pool->fork_mutex);
 		pthread_cond_destroy(&pool->condvar);
 		pthread_mutex_destroy(&pool->mutex);
 		free(pool->jobs);
@@ -179,21 +197,26 @@ int pthreadpool_init(unsigned max_threads, struct pthreadpool **presult,
 
 static void pthreadpool_prepare_pool(struct pthreadpool *pool)
 {
-	pthread_cond_t prefork_cond;
 	int ret;
 
-	ret = pthread_cond_init(&prefork_cond, NULL);
+	ret = pthread_mutex_lock(&pool->fork_mutex);
 	assert(ret == 0);
 
 	ret = pthread_mutex_lock(&pool->mutex);
 	assert(ret == 0);
 
 	while (pool->num_idle != 0) {
+		int num_idle = pool->num_idle;
+		pthread_cond_t prefork_cond;
+
+		ret = pthread_cond_init(&prefork_cond, NULL);
+		assert(ret == 0);
+
 		/*
-		 * Exit all idle threads, which are all blocked in
-		 * pool->condvar. In the child we can destroy the
-		 * pool, which would result in undefined behaviour in
-		 * the pthread_cond_destroy(pool->condvar). glibc just
+		 * Push all idle threads off pool->condvar. In the
+		 * child we can destroy the pool, which would result
+		 * in undefined behaviour in the
+		 * pthread_cond_destroy(pool->condvar). glibc just
 		 * blocks here.
 		 */
 		pool->prefork_cond = &prefork_cond;
@@ -201,14 +224,16 @@ static void pthreadpool_prepare_pool(struct pthreadpool *pool)
 		ret = pthread_cond_signal(&pool->condvar);
 		assert(ret == 0);
 
-		ret = pthread_cond_wait(&prefork_cond, &pool->mutex);
-		assert(ret == 0);
+		while (pool->num_idle == num_idle) {
+			ret = pthread_cond_wait(&prefork_cond, &pool->mutex);
+			assert(ret == 0);
+		}
 
 		pool->prefork_cond = NULL;
-	}
 
-	ret = pthread_cond_destroy(&prefork_cond);
-	assert(ret == 0);
+		ret = pthread_cond_destroy(&prefork_cond);
+		assert(ret == 0);
+	}
 
 	/*
 	 * Probably it's well-defined somewhere: What happens to
@@ -249,6 +274,8 @@ static void pthreadpool_parent(void)
 		assert(ret == 0);
 		ret = pthread_mutex_unlock(&pool->mutex);
 		assert(ret == 0);
+		ret = pthread_mutex_unlock(&pool->fork_mutex);
+		assert(ret == 0);
 	}
 
 	ret = pthread_mutex_unlock(&pthreadpools_mutex);
@@ -271,8 +298,12 @@ static void pthreadpool_child(void)
 
 		ret = pthread_cond_init(&pool->condvar, NULL);
 		assert(ret == 0);
+
 		ret = pthread_mutex_unlock(&pool->mutex);
 		assert(ret == 0);
+
+		ret = pthread_mutex_unlock(&pool->fork_mutex);
+		assert(ret == 0);
 	}
 
 	ret = pthread_mutex_unlock(&pthreadpools_mutex);
@@ -287,7 +318,7 @@ static void pthreadpool_prep_atfork(void)
 
 static int pthreadpool_free(struct pthreadpool *pool)
 {
-	int ret, ret1;
+	int ret, ret1, ret2;
 
 	ret = pthread_mutex_lock(&pthreadpools_mutex);
 	if (ret != 0) {
@@ -299,6 +330,7 @@ static int pthreadpool_free(struct pthreadpool *pool)
 
 	ret = pthread_mutex_destroy(&pool->mutex);
 	ret1 = pthread_cond_destroy(&pool->condvar);
+	ret2 = pthread_mutex_destroy(&pool->fork_mutex);
 
 	if (ret != 0) {
 		return ret;
@@ -306,6 +338,9 @@ static int pthreadpool_free(struct pthreadpool *pool)
 	if (ret1 != 0) {
 		return ret1;
 	}
+	if (ret2 != 0) {
+		return ret2;
+	}
 
 	free(pool->jobs);
 	free(pool);
@@ -465,11 +500,30 @@ static void *pthreadpool_server(void *arg)
 				/*
 				 * Me must allow fork() to continue
 				 * without anybody waiting on
-				 * &pool->condvar.
+				 * &pool->condvar. Tell
+				 * pthreadpool_prepare_pool that we
+				 * got that message.
 				 */
-				pthread_cond_signal(pool->prefork_cond);
-				pthreadpool_server_exit(pool);
-				return NULL;
+
+				res = pthread_cond_signal(pool->prefork_cond);
+				assert(res == 0);
+
+				res = pthread_mutex_unlock(&pool->mutex);
+				assert(res == 0);
+
+				/*
+				 * pthreadpool_prepare_pool has
+				 * already locked this mutex across
+				 * the fork. This makes us wait
+				 * without sitting in a condvar.
+				 */
+				res = pthread_mutex_lock(&pool->fork_mutex);
+				assert(res == 0);
+				res = pthread_mutex_unlock(&pool->fork_mutex);
+				assert(res == 0);
+
+				res = pthread_mutex_lock(&pool->mutex);
+				assert(res == 0);
 			}
 
 			if (res == ETIMEDOUT) {
-- 
2.11.0


From b4d0c5ae20bce5849470d1acf2e567e6824685c9 Mon Sep 17 00:00:00 2001
From: Volker Lendecke <vl at samba.org>
Date: Wed, 29 Nov 2017 18:55:21 +0100
Subject: [PATCH 3/3] pthreadpool: Add a test for the race condition fixed in
 the last commit

Signed-off-by: Volker Lendecke <vl at samba.org>
---
 lib/pthreadpool/tests.c | 82 +++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 82 insertions(+)

diff --git a/lib/pthreadpool/tests.c b/lib/pthreadpool/tests.c
index 1aab80c2bb4..f0ae0aa4a93 100644
--- a/lib/pthreadpool/tests.c
+++ b/lib/pthreadpool/tests.c
@@ -308,6 +308,82 @@ static int test_busyfork(void)
 	return 0;
 }
 
+static int test_busyfork2(void)
+{
+	struct pthreadpool_pipe *p;
+	pid_t child;
+	int ret, jobnum;
+	struct pollfd pfd;
+
+	ret = pthreadpool_pipe_init(1, &p);
+	if (ret != 0) {
+		fprintf(stderr, "pthreadpool_pipe_init failed: %s\n",
+			strerror(ret));
+		return -1;
+	}
+
+	ret = pthreadpool_pipe_add_job(p, 1, busyfork_job, NULL);
+	if (ret != 0) {
+		fprintf(stderr, "pthreadpool_add_job failed: %s\n",
+			strerror(ret));
+		return -1;
+	}
+
+	ret = pthreadpool_pipe_finished_jobs(p, &jobnum, 1);
+	if (ret != 1) {
+		fprintf(stderr, "pthreadpool_pipe_finished_jobs failed\n");
+		return -1;
+	}
+
+	ret = poll(NULL, 0, 10);
+	if (ret == -1) {
+		perror("poll failed");
+		return -1;
+	}
+
+	ret = pthreadpool_pipe_add_job(p, 1, busyfork_job, NULL);
+	if (ret != 0) {
+		fprintf(stderr, "pthreadpool_add_job failed: %s\n",
+			strerror(ret));
+		return -1;
+	}
+
+	/*
+	 * Do the fork right after the add_job. This tests a race
+	 * where the atfork prepare handler gets all idle threads off
+	 * the condvar. If we are faster doing the fork than the
+	 * existing idle thread could get out of idle and take the
+	 * job, after the fork we end up with no threads to take care
+	 * of the job.
+	 */
+
+	child = fork();
+	if (child < 0) {
+		perror("fork failed");
+		return -1;
+	}
+
+	if (child == 0) {
+		exit(0);
+	}
+
+	pfd = (struct pollfd) {
+		.fd = pthreadpool_pipe_signal_fd(p),
+		.events = POLLIN|POLLERR
+	};
+
+	do {
+		ret = poll(&pfd, 1, 5000);
+	} while ((ret == -1) && (errno == EINTR));
+
+	if (ret == 0) {
+		fprintf(stderr, "job unfinished after 5 seconds\n");
+		return -1;
+	}
+
+	return 0;
+}
+
 static void test_tevent_wait(void *private_data)
 {
 	int *timeout = private_data;
@@ -423,6 +499,12 @@ int main(void)
 		return 1;
 	}
 
+	ret = test_busyfork2();
+	if (ret != 0) {
+		fprintf(stderr, "test_busyfork2 failed\n");
+		return 1;
+	}
+
 	printf("success\n");
 	return 0;
 }
-- 
2.11.0



More information about the samba-technical mailing list