[PATCH] Fix a pthreadpool race

Jeremy Allison jra at samba.org
Thu Dec 7 16:56:41 UTC 2017


On Thu, Dec 07, 2017 at 07:59:35AM +0100, Volker Lendecke wrote:
> On Wed, Dec 06, 2017 at 04:53:15PM -0800, Jeremy Allison wrote:
> > Feel free to push with any of the
> > text of this or the previous message in the
> > commit message if you think it will help others :-).
> 
> Attached patch has the very nice explanation in the commit message. It
> also removes the necessity for the preliminary patch. Runs in a
> private autobuild now. I'll push once this survived and nobody objects
> in the meantime.
> 
> Thanks!

Thanks for adding the extra text Volker, it makes it much
easier for people who don't understand MT-code (looking at
myself in the mirror here :-) to understand the commit
changes. Please push !

Jeremy.

> -- 
> SerNet GmbH, Bahnhofsallee 1b, 37081 Göttingen
> phone: +49-551-370000-0, fax: +49-551-370000-9
> AG Göttingen, HRB 2816, GF: Dr. Johannes Loxen
> http://www.sernet.de, mailto:kontakt at sernet.de

> From 287c0232d2e7ef4d7c6ce278e31606f2b5a470e4 Mon Sep 17 00:00:00 2001
> From: Volker Lendecke <vl at samba.org>
> Date: Wed, 29 Nov 2017 16:45:40 +0100
> Subject: [PATCH 1/2] pthreadpool: Fix starvation after fork
> 
> After the race is before the race:
> 
> 1) Create an idle thread
> 2) Add a job: This won't create a thread anymore
> 3) Immediately fork
> 
> The idle thread will be woken twice before it's actually woken up: Both
> pthreadpool_add_job and pthreadpool_prepare_pool call cond_signal, for
> different reasons. We must look at pool->prefork_cond first because otherwise
> we will end up in a blocking job deep within a fork call, the helper thread
> must take its fingers off the condvar as quickly as possible.  This means that
> after the fork there's no idle thread around anymore that would pick up the job
> submitted in 2). So we must keep the idle threads around across the fork.
> 
> The quick solution to re-create one helper thread in pthreadpool_parent has a
> fatal flaw: What do we do if that pthread_create call fails? We're deep in an
> application calling fork(), and doing fancy signalling from there is really
> something we must avoid.
> 
> This has one potential performance issue: If we have hundreds of idle threads
> (do we ever have that) during the fork, the call to pthread_mutex_lock on the
> fork_mutex from pthreadpool_server (the helper thread) will probably cause a
> thundering herd when the _parent call unlocks the fork_mutex. The solution for
> this to just keep one idle thread around. But this adds code that is not
> strictly required functionally for now.
> 
> More detailed explanation from Jeremy:
> 
> First, understanding the problem the test reproduces:
> 
> add a job (num_jobs = 1) -> creates thread to run it.
> job finishes, thread sticks around (num_idle = 1).
> num_jobs is now zero (initial job finished).
> 
> a) Idle thread is now waiting on pool->condvar inside
> pthreadpool_server() in pthread_cond_timedwait().
> 
> Now, add another job ->
> 
> 	pthreadpool_add_job()
> 		-> pthreadpool_put_job()
> 			This adds the job to the queue.
> 		Oh, there is an idle thread so don't
> 		create one, do:
> 
> 		pthread_cond_signal(&pool->condvar);
> 
> 		and return.
> 
> Now call fork *before* idle thread in (a) wakes from
> the signaling of pool->condvar.
> 
> In the parent (child is irrelevent):
> 
> Go into: pthreadpool_prepare() ->
> 		pthreadpool_prepare_pool()
> 
> 		Set the variable to tell idle threads to exit:
> 
> 		pool->prefork_cond = &prefork_cond;
> 
> 		then wake them up with:
> 
> 		pthread_cond_signal(&pool->condvar);
> 
> 		This does nothing as the idle thread
> 		is already awoken.
> 
> b) Idle thread wakes up and does:
> 
> 		Reduce idle thread count (num_idle = 0)
> 
> 		pool->num_idle -= 1;
> 
> 		Check if we're in the middle of a fork.
> 
> 		if (pool->prefork_cond != NULL) {
> 
> 			Yes we are, tell pthreadpool_prepare()
> 			we are exiting.
> 
> 			pthread_cond_signal(pool->prefork_cond);
> 
> 			And exit.
> 
> 			pthreadpool_server_exit(pool);
> 			return NULL;
> 		}
> 
> So we come back from the fork in the parent with num_jobs = 1,
> a job on the queue but no idle threads - and the code that
> creates a new thread on job submission was skipped because
> an idle thread existed at point (a).
> 
> OK, assuming that the previous explaination is correct, the
> fix is to create a new pthreadpool context mutex:
> 
> pool->fork_mutex
> 
> and in pthreadpool_server(), when an idle thread wakes up and
> notices we're in the prepare fork state, it puts itself to
> sleep by waiting on the new pool->fork_mutex.
> 
> And in pthreadpool_prepare_pool(), instead of waiting for
> the idle threads to exit, hold the pool->fork_mutex and
> signal each idle thread in turn, and wait for the pool->num_idle
> to go to zero - which means they're all blocked waiting on
> pool->fork_mutex.
> 
> When the parent continues, pthreadpool_parent()
> unlocks the pool->fork_mutex and all the previously
> 'idle' threads wake up (and you mention the thundering
> herd problem, which is as you say vanishingly small :-)
> and pick up any remaining job.
> 
> Bug: https://bugzilla.samba.org/show_bug.cgi?id=13179
> Signed-off-by: Volker Lendecke <vl at samba.org>
> Reviewed-by: Jeremy Allison <jra at samba.org>
> ---
>  lib/pthreadpool/pthreadpool.c | 93 ++++++++++++++++++++++++++++++++++---------
>  1 file changed, 75 insertions(+), 18 deletions(-)
> 
> diff --git a/lib/pthreadpool/pthreadpool.c b/lib/pthreadpool/pthreadpool.c
> index 23885aa6d11..0bf7109a8f8 100644
> --- a/lib/pthreadpool/pthreadpool.c
> +++ b/lib/pthreadpool/pthreadpool.c
> @@ -91,11 +91,19 @@ struct pthreadpool {
>  	int num_idle;
>  
>  	/*
> -	 * Condition variable indicating that we should quickly go
> -	 * away making way for fork() without anybody waiting on
> -	 * pool->condvar.
> +	 * Condition variable indicating that helper threads should
> +	 * quickly go away making way for fork() without anybody
> +	 * waiting on pool->condvar.
>  	 */
>  	pthread_cond_t *prefork_cond;
> +
> +	/*
> +	 * Waiting position for helper threads while fork is
> +	 * running. The forking thread will have locked it, and all
> +	 * idle helper threads will sit here until after the fork,
> +	 * where the forking thread will unlock it again.
> +	 */
> +	pthread_mutex_t fork_mutex;
>  };
>  
>  static pthread_mutex_t pthreadpools_mutex = PTHREAD_MUTEX_INITIALIZER;
> @@ -151,6 +159,15 @@ int pthreadpool_init(unsigned max_threads, struct pthreadpool **presult,
>  		return ret;
>  	}
>  
> +	ret = pthread_mutex_init(&pool->fork_mutex, NULL);
> +	if (ret != 0) {
> +		pthread_cond_destroy(&pool->condvar);
> +		pthread_mutex_destroy(&pool->mutex);
> +		free(pool->jobs);
> +		free(pool);
> +		return ret;
> +	}
> +
>  	pool->shutdown = false;
>  	pool->num_threads = 0;
>  	pool->max_threads = max_threads;
> @@ -159,6 +176,7 @@ int pthreadpool_init(unsigned max_threads, struct pthreadpool **presult,
>  
>  	ret = pthread_mutex_lock(&pthreadpools_mutex);
>  	if (ret != 0) {
> +		pthread_mutex_destroy(&pool->fork_mutex);
>  		pthread_cond_destroy(&pool->condvar);
>  		pthread_mutex_destroy(&pool->mutex);
>  		free(pool->jobs);
> @@ -179,18 +197,26 @@ int pthreadpool_init(unsigned max_threads, struct pthreadpool **presult,
>  
>  static void pthreadpool_prepare_pool(struct pthreadpool *pool)
>  {
> -	pthread_cond_t prefork_cond = PTHREAD_COND_INITIALIZER;
>  	int ret;
>  
> +	ret = pthread_mutex_lock(&pool->fork_mutex);
> +	assert(ret == 0);
> +
>  	ret = pthread_mutex_lock(&pool->mutex);
>  	assert(ret == 0);
>  
>  	while (pool->num_idle != 0) {
> +		int num_idle = pool->num_idle;
> +		pthread_cond_t prefork_cond;
> +
> +		ret = pthread_cond_init(&prefork_cond, NULL);
> +		assert(ret == 0);
> +
>  		/*
> -		 * Exit all idle threads, which are all blocked in
> -		 * pool->condvar. In the child we can destroy the
> -		 * pool, which would result in undefined behaviour in
> -		 * the pthread_cond_destroy(pool->condvar). glibc just
> +		 * Push all idle threads off pool->condvar. In the
> +		 * child we can destroy the pool, which would result
> +		 * in undefined behaviour in the
> +		 * pthread_cond_destroy(pool->condvar). glibc just
>  		 * blocks here.
>  		 */
>  		pool->prefork_cond = &prefork_cond;
> @@ -198,14 +224,16 @@ static void pthreadpool_prepare_pool(struct pthreadpool *pool)
>  		ret = pthread_cond_signal(&pool->condvar);
>  		assert(ret == 0);
>  
> -		ret = pthread_cond_wait(&prefork_cond, &pool->mutex);
> -		assert(ret == 0);
> +		while (pool->num_idle == num_idle) {
> +			ret = pthread_cond_wait(&prefork_cond, &pool->mutex);
> +			assert(ret == 0);
> +		}
>  
>  		pool->prefork_cond = NULL;
> -	}
>  
> -	ret = pthread_cond_destroy(&prefork_cond);
> -	assert(ret == 0);
> +		ret = pthread_cond_destroy(&prefork_cond);
> +		assert(ret == 0);
> +	}
>  
>  	/*
>  	 * Probably it's well-defined somewhere: What happens to
> @@ -246,6 +274,8 @@ static void pthreadpool_parent(void)
>  		assert(ret == 0);
>  		ret = pthread_mutex_unlock(&pool->mutex);
>  		assert(ret == 0);
> +		ret = pthread_mutex_unlock(&pool->fork_mutex);
> +		assert(ret == 0);
>  	}
>  
>  	ret = pthread_mutex_unlock(&pthreadpools_mutex);
> @@ -268,8 +298,12 @@ static void pthreadpool_child(void)
>  
>  		ret = pthread_cond_init(&pool->condvar, NULL);
>  		assert(ret == 0);
> +
>  		ret = pthread_mutex_unlock(&pool->mutex);
>  		assert(ret == 0);
> +
> +		ret = pthread_mutex_unlock(&pool->fork_mutex);
> +		assert(ret == 0);
>  	}
>  
>  	ret = pthread_mutex_unlock(&pthreadpools_mutex);
> @@ -284,7 +318,7 @@ static void pthreadpool_prep_atfork(void)
>  
>  static int pthreadpool_free(struct pthreadpool *pool)
>  {
> -	int ret, ret1;
> +	int ret, ret1, ret2;
>  
>  	ret = pthread_mutex_lock(&pthreadpools_mutex);
>  	if (ret != 0) {
> @@ -296,6 +330,7 @@ static int pthreadpool_free(struct pthreadpool *pool)
>  
>  	ret = pthread_mutex_destroy(&pool->mutex);
>  	ret1 = pthread_cond_destroy(&pool->condvar);
> +	ret2 = pthread_mutex_destroy(&pool->fork_mutex);
>  
>  	if (ret != 0) {
>  		return ret;
> @@ -303,6 +338,9 @@ static int pthreadpool_free(struct pthreadpool *pool)
>  	if (ret1 != 0) {
>  		return ret1;
>  	}
> +	if (ret2 != 0) {
> +		return ret2;
> +	}
>  
>  	free(pool->jobs);
>  	free(pool);
> @@ -462,11 +500,30 @@ static void *pthreadpool_server(void *arg)
>  				/*
>  				 * Me must allow fork() to continue
>  				 * without anybody waiting on
> -				 * &pool->condvar.
> +				 * &pool->condvar. Tell
> +				 * pthreadpool_prepare_pool that we
> +				 * got that message.
>  				 */
> -				pthread_cond_signal(pool->prefork_cond);
> -				pthreadpool_server_exit(pool);
> -				return NULL;
> +
> +				res = pthread_cond_signal(pool->prefork_cond);
> +				assert(res == 0);
> +
> +				res = pthread_mutex_unlock(&pool->mutex);
> +				assert(res == 0);
> +
> +				/*
> +				 * pthreadpool_prepare_pool has
> +				 * already locked this mutex across
> +				 * the fork. This makes us wait
> +				 * without sitting in a condvar.
> +				 */
> +				res = pthread_mutex_lock(&pool->fork_mutex);
> +				assert(res == 0);
> +				res = pthread_mutex_unlock(&pool->fork_mutex);
> +				assert(res == 0);
> +
> +				res = pthread_mutex_lock(&pool->mutex);
> +				assert(res == 0);
>  			}
>  
>  			if (res == ETIMEDOUT) {
> -- 
> 2.11.0
> 
> 
> From c9486a43c83083e47893979a3d450e4d4b464673 Mon Sep 17 00:00:00 2001
> From: Volker Lendecke <vl at samba.org>
> Date: Wed, 29 Nov 2017 18:55:21 +0100
> Subject: [PATCH 2/2] pthreadpool: Add a test for the race condition fixed in
>  the last commit
> 
> Bug: https://bugzilla.samba.org/show_bug.cgi?id=13179
> Signed-off-by: Volker Lendecke <vl at samba.org>
> Reviewed-by: Jeremy Allison <jra at samba.org>
> ---
>  lib/pthreadpool/tests.c | 82 +++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 82 insertions(+)
> 
> diff --git a/lib/pthreadpool/tests.c b/lib/pthreadpool/tests.c
> index 1aab80c2bb4..f0ae0aa4a93 100644
> --- a/lib/pthreadpool/tests.c
> +++ b/lib/pthreadpool/tests.c
> @@ -308,6 +308,82 @@ static int test_busyfork(void)
>  	return 0;
>  }
>  
> +static int test_busyfork2(void)
> +{
> +	struct pthreadpool_pipe *p;
> +	pid_t child;
> +	int ret, jobnum;
> +	struct pollfd pfd;
> +
> +	ret = pthreadpool_pipe_init(1, &p);
> +	if (ret != 0) {
> +		fprintf(stderr, "pthreadpool_pipe_init failed: %s\n",
> +			strerror(ret));
> +		return -1;
> +	}
> +
> +	ret = pthreadpool_pipe_add_job(p, 1, busyfork_job, NULL);
> +	if (ret != 0) {
> +		fprintf(stderr, "pthreadpool_add_job failed: %s\n",
> +			strerror(ret));
> +		return -1;
> +	}
> +
> +	ret = pthreadpool_pipe_finished_jobs(p, &jobnum, 1);
> +	if (ret != 1) {
> +		fprintf(stderr, "pthreadpool_pipe_finished_jobs failed\n");
> +		return -1;
> +	}
> +
> +	ret = poll(NULL, 0, 10);
> +	if (ret == -1) {
> +		perror("poll failed");
> +		return -1;
> +	}
> +
> +	ret = pthreadpool_pipe_add_job(p, 1, busyfork_job, NULL);
> +	if (ret != 0) {
> +		fprintf(stderr, "pthreadpool_add_job failed: %s\n",
> +			strerror(ret));
> +		return -1;
> +	}
> +
> +	/*
> +	 * Do the fork right after the add_job. This tests a race
> +	 * where the atfork prepare handler gets all idle threads off
> +	 * the condvar. If we are faster doing the fork than the
> +	 * existing idle thread could get out of idle and take the
> +	 * job, after the fork we end up with no threads to take care
> +	 * of the job.
> +	 */
> +
> +	child = fork();
> +	if (child < 0) {
> +		perror("fork failed");
> +		return -1;
> +	}
> +
> +	if (child == 0) {
> +		exit(0);
> +	}
> +
> +	pfd = (struct pollfd) {
> +		.fd = pthreadpool_pipe_signal_fd(p),
> +		.events = POLLIN|POLLERR
> +	};
> +
> +	do {
> +		ret = poll(&pfd, 1, 5000);
> +	} while ((ret == -1) && (errno == EINTR));
> +
> +	if (ret == 0) {
> +		fprintf(stderr, "job unfinished after 5 seconds\n");
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
>  static void test_tevent_wait(void *private_data)
>  {
>  	int *timeout = private_data;
> @@ -423,6 +499,12 @@ int main(void)
>  		return 1;
>  	}
>  
> +	ret = test_busyfork2();
> +	if (ret != 0) {
> +		fprintf(stderr, "test_busyfork2 failed\n");
> +		return 1;
> +	}
> +
>  	printf("success\n");
>  	return 0;
>  }
> -- 
> 2.11.0
> 




More information about the samba-technical mailing list