[PATCH] Fix a pthreadpool race
Jeremy Allison
jra at samba.org
Thu Dec 7 16:56:41 UTC 2017
On Thu, Dec 07, 2017 at 07:59:35AM +0100, Volker Lendecke wrote:
> On Wed, Dec 06, 2017 at 04:53:15PM -0800, Jeremy Allison wrote:
> > Feel free to push with any of the
> > text of this or the previous message in the
> > commit message if you think it will help others :-).
>
> Attached patch has the very nice explanation in the commit message. It
> also removes the necessity for the preliminary patch. Runs in a
> private autobuild now. I'll push once this survived and nobody objects
> in the meantime.
>
> Thanks!
Thanks for adding the extra text Volker, it makes it much
easier for people who don't understand MT-code (looking at
myself in the mirror here :-) to understand the commit
changes. Please push !
Jeremy.
> --
> SerNet GmbH, Bahnhofsallee 1b, 37081 Göttingen
> phone: +49-551-370000-0, fax: +49-551-370000-9
> AG Göttingen, HRB 2816, GF: Dr. Johannes Loxen
> http://www.sernet.de, mailto:kontakt at sernet.de
> From 287c0232d2e7ef4d7c6ce278e31606f2b5a470e4 Mon Sep 17 00:00:00 2001
> From: Volker Lendecke <vl at samba.org>
> Date: Wed, 29 Nov 2017 16:45:40 +0100
> Subject: [PATCH 1/2] pthreadpool: Fix starvation after fork
>
> After the race is before the race:
>
> 1) Create an idle thread
> 2) Add a job: This won't create a thread anymore
> 3) Immediately fork
>
> The idle thread will be woken twice before it's actually woken up: Both
> pthreadpool_add_job and pthreadpool_prepare_pool call cond_signal, for
> different reasons. We must look at pool->prefork_cond first because otherwise
> we will end up in a blocking job deep within a fork call, the helper thread
> must take its fingers off the condvar as quickly as possible. This means that
> after the fork there's no idle thread around anymore that would pick up the job
> submitted in 2). So we must keep the idle threads around across the fork.
>
> The quick solution to re-create one helper thread in pthreadpool_parent has a
> fatal flaw: What do we do if that pthread_create call fails? We're deep in an
> application calling fork(), and doing fancy signalling from there is really
> something we must avoid.
>
> This has one potential performance issue: If we have hundreds of idle threads
> (do we ever have that) during the fork, the call to pthread_mutex_lock on the
> fork_mutex from pthreadpool_server (the helper thread) will probably cause a
> thundering herd when the _parent call unlocks the fork_mutex. The solution for
> this to just keep one idle thread around. But this adds code that is not
> strictly required functionally for now.
>
> More detailed explanation from Jeremy:
>
> First, understanding the problem the test reproduces:
>
> add a job (num_jobs = 1) -> creates thread to run it.
> job finishes, thread sticks around (num_idle = 1).
> num_jobs is now zero (initial job finished).
>
> a) Idle thread is now waiting on pool->condvar inside
> pthreadpool_server() in pthread_cond_timedwait().
>
> Now, add another job ->
>
> pthreadpool_add_job()
> -> pthreadpool_put_job()
> This adds the job to the queue.
> Oh, there is an idle thread so don't
> create one, do:
>
> pthread_cond_signal(&pool->condvar);
>
> and return.
>
> Now call fork *before* idle thread in (a) wakes from
> the signaling of pool->condvar.
>
> In the parent (child is irrelevent):
>
> Go into: pthreadpool_prepare() ->
> pthreadpool_prepare_pool()
>
> Set the variable to tell idle threads to exit:
>
> pool->prefork_cond = &prefork_cond;
>
> then wake them up with:
>
> pthread_cond_signal(&pool->condvar);
>
> This does nothing as the idle thread
> is already awoken.
>
> b) Idle thread wakes up and does:
>
> Reduce idle thread count (num_idle = 0)
>
> pool->num_idle -= 1;
>
> Check if we're in the middle of a fork.
>
> if (pool->prefork_cond != NULL) {
>
> Yes we are, tell pthreadpool_prepare()
> we are exiting.
>
> pthread_cond_signal(pool->prefork_cond);
>
> And exit.
>
> pthreadpool_server_exit(pool);
> return NULL;
> }
>
> So we come back from the fork in the parent with num_jobs = 1,
> a job on the queue but no idle threads - and the code that
> creates a new thread on job submission was skipped because
> an idle thread existed at point (a).
>
> OK, assuming that the previous explaination is correct, the
> fix is to create a new pthreadpool context mutex:
>
> pool->fork_mutex
>
> and in pthreadpool_server(), when an idle thread wakes up and
> notices we're in the prepare fork state, it puts itself to
> sleep by waiting on the new pool->fork_mutex.
>
> And in pthreadpool_prepare_pool(), instead of waiting for
> the idle threads to exit, hold the pool->fork_mutex and
> signal each idle thread in turn, and wait for the pool->num_idle
> to go to zero - which means they're all blocked waiting on
> pool->fork_mutex.
>
> When the parent continues, pthreadpool_parent()
> unlocks the pool->fork_mutex and all the previously
> 'idle' threads wake up (and you mention the thundering
> herd problem, which is as you say vanishingly small :-)
> and pick up any remaining job.
>
> Bug: https://bugzilla.samba.org/show_bug.cgi?id=13179
> Signed-off-by: Volker Lendecke <vl at samba.org>
> Reviewed-by: Jeremy Allison <jra at samba.org>
> ---
> lib/pthreadpool/pthreadpool.c | 93 ++++++++++++++++++++++++++++++++++---------
> 1 file changed, 75 insertions(+), 18 deletions(-)
>
> diff --git a/lib/pthreadpool/pthreadpool.c b/lib/pthreadpool/pthreadpool.c
> index 23885aa6d11..0bf7109a8f8 100644
> --- a/lib/pthreadpool/pthreadpool.c
> +++ b/lib/pthreadpool/pthreadpool.c
> @@ -91,11 +91,19 @@ struct pthreadpool {
> int num_idle;
>
> /*
> - * Condition variable indicating that we should quickly go
> - * away making way for fork() without anybody waiting on
> - * pool->condvar.
> + * Condition variable indicating that helper threads should
> + * quickly go away making way for fork() without anybody
> + * waiting on pool->condvar.
> */
> pthread_cond_t *prefork_cond;
> +
> + /*
> + * Waiting position for helper threads while fork is
> + * running. The forking thread will have locked it, and all
> + * idle helper threads will sit here until after the fork,
> + * where the forking thread will unlock it again.
> + */
> + pthread_mutex_t fork_mutex;
> };
>
> static pthread_mutex_t pthreadpools_mutex = PTHREAD_MUTEX_INITIALIZER;
> @@ -151,6 +159,15 @@ int pthreadpool_init(unsigned max_threads, struct pthreadpool **presult,
> return ret;
> }
>
> + ret = pthread_mutex_init(&pool->fork_mutex, NULL);
> + if (ret != 0) {
> + pthread_cond_destroy(&pool->condvar);
> + pthread_mutex_destroy(&pool->mutex);
> + free(pool->jobs);
> + free(pool);
> + return ret;
> + }
> +
> pool->shutdown = false;
> pool->num_threads = 0;
> pool->max_threads = max_threads;
> @@ -159,6 +176,7 @@ int pthreadpool_init(unsigned max_threads, struct pthreadpool **presult,
>
> ret = pthread_mutex_lock(&pthreadpools_mutex);
> if (ret != 0) {
> + pthread_mutex_destroy(&pool->fork_mutex);
> pthread_cond_destroy(&pool->condvar);
> pthread_mutex_destroy(&pool->mutex);
> free(pool->jobs);
> @@ -179,18 +197,26 @@ int pthreadpool_init(unsigned max_threads, struct pthreadpool **presult,
>
> static void pthreadpool_prepare_pool(struct pthreadpool *pool)
> {
> - pthread_cond_t prefork_cond = PTHREAD_COND_INITIALIZER;
> int ret;
>
> + ret = pthread_mutex_lock(&pool->fork_mutex);
> + assert(ret == 0);
> +
> ret = pthread_mutex_lock(&pool->mutex);
> assert(ret == 0);
>
> while (pool->num_idle != 0) {
> + int num_idle = pool->num_idle;
> + pthread_cond_t prefork_cond;
> +
> + ret = pthread_cond_init(&prefork_cond, NULL);
> + assert(ret == 0);
> +
> /*
> - * Exit all idle threads, which are all blocked in
> - * pool->condvar. In the child we can destroy the
> - * pool, which would result in undefined behaviour in
> - * the pthread_cond_destroy(pool->condvar). glibc just
> + * Push all idle threads off pool->condvar. In the
> + * child we can destroy the pool, which would result
> + * in undefined behaviour in the
> + * pthread_cond_destroy(pool->condvar). glibc just
> * blocks here.
> */
> pool->prefork_cond = &prefork_cond;
> @@ -198,14 +224,16 @@ static void pthreadpool_prepare_pool(struct pthreadpool *pool)
> ret = pthread_cond_signal(&pool->condvar);
> assert(ret == 0);
>
> - ret = pthread_cond_wait(&prefork_cond, &pool->mutex);
> - assert(ret == 0);
> + while (pool->num_idle == num_idle) {
> + ret = pthread_cond_wait(&prefork_cond, &pool->mutex);
> + assert(ret == 0);
> + }
>
> pool->prefork_cond = NULL;
> - }
>
> - ret = pthread_cond_destroy(&prefork_cond);
> - assert(ret == 0);
> + ret = pthread_cond_destroy(&prefork_cond);
> + assert(ret == 0);
> + }
>
> /*
> * Probably it's well-defined somewhere: What happens to
> @@ -246,6 +274,8 @@ static void pthreadpool_parent(void)
> assert(ret == 0);
> ret = pthread_mutex_unlock(&pool->mutex);
> assert(ret == 0);
> + ret = pthread_mutex_unlock(&pool->fork_mutex);
> + assert(ret == 0);
> }
>
> ret = pthread_mutex_unlock(&pthreadpools_mutex);
> @@ -268,8 +298,12 @@ static void pthreadpool_child(void)
>
> ret = pthread_cond_init(&pool->condvar, NULL);
> assert(ret == 0);
> +
> ret = pthread_mutex_unlock(&pool->mutex);
> assert(ret == 0);
> +
> + ret = pthread_mutex_unlock(&pool->fork_mutex);
> + assert(ret == 0);
> }
>
> ret = pthread_mutex_unlock(&pthreadpools_mutex);
> @@ -284,7 +318,7 @@ static void pthreadpool_prep_atfork(void)
>
> static int pthreadpool_free(struct pthreadpool *pool)
> {
> - int ret, ret1;
> + int ret, ret1, ret2;
>
> ret = pthread_mutex_lock(&pthreadpools_mutex);
> if (ret != 0) {
> @@ -296,6 +330,7 @@ static int pthreadpool_free(struct pthreadpool *pool)
>
> ret = pthread_mutex_destroy(&pool->mutex);
> ret1 = pthread_cond_destroy(&pool->condvar);
> + ret2 = pthread_mutex_destroy(&pool->fork_mutex);
>
> if (ret != 0) {
> return ret;
> @@ -303,6 +338,9 @@ static int pthreadpool_free(struct pthreadpool *pool)
> if (ret1 != 0) {
> return ret1;
> }
> + if (ret2 != 0) {
> + return ret2;
> + }
>
> free(pool->jobs);
> free(pool);
> @@ -462,11 +500,30 @@ static void *pthreadpool_server(void *arg)
> /*
> * Me must allow fork() to continue
> * without anybody waiting on
> - * &pool->condvar.
> + * &pool->condvar. Tell
> + * pthreadpool_prepare_pool that we
> + * got that message.
> */
> - pthread_cond_signal(pool->prefork_cond);
> - pthreadpool_server_exit(pool);
> - return NULL;
> +
> + res = pthread_cond_signal(pool->prefork_cond);
> + assert(res == 0);
> +
> + res = pthread_mutex_unlock(&pool->mutex);
> + assert(res == 0);
> +
> + /*
> + * pthreadpool_prepare_pool has
> + * already locked this mutex across
> + * the fork. This makes us wait
> + * without sitting in a condvar.
> + */
> + res = pthread_mutex_lock(&pool->fork_mutex);
> + assert(res == 0);
> + res = pthread_mutex_unlock(&pool->fork_mutex);
> + assert(res == 0);
> +
> + res = pthread_mutex_lock(&pool->mutex);
> + assert(res == 0);
> }
>
> if (res == ETIMEDOUT) {
> --
> 2.11.0
>
>
> From c9486a43c83083e47893979a3d450e4d4b464673 Mon Sep 17 00:00:00 2001
> From: Volker Lendecke <vl at samba.org>
> Date: Wed, 29 Nov 2017 18:55:21 +0100
> Subject: [PATCH 2/2] pthreadpool: Add a test for the race condition fixed in
> the last commit
>
> Bug: https://bugzilla.samba.org/show_bug.cgi?id=13179
> Signed-off-by: Volker Lendecke <vl at samba.org>
> Reviewed-by: Jeremy Allison <jra at samba.org>
> ---
> lib/pthreadpool/tests.c | 82 +++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 82 insertions(+)
>
> diff --git a/lib/pthreadpool/tests.c b/lib/pthreadpool/tests.c
> index 1aab80c2bb4..f0ae0aa4a93 100644
> --- a/lib/pthreadpool/tests.c
> +++ b/lib/pthreadpool/tests.c
> @@ -308,6 +308,82 @@ static int test_busyfork(void)
> return 0;
> }
>
> +static int test_busyfork2(void)
> +{
> + struct pthreadpool_pipe *p;
> + pid_t child;
> + int ret, jobnum;
> + struct pollfd pfd;
> +
> + ret = pthreadpool_pipe_init(1, &p);
> + if (ret != 0) {
> + fprintf(stderr, "pthreadpool_pipe_init failed: %s\n",
> + strerror(ret));
> + return -1;
> + }
> +
> + ret = pthreadpool_pipe_add_job(p, 1, busyfork_job, NULL);
> + if (ret != 0) {
> + fprintf(stderr, "pthreadpool_add_job failed: %s\n",
> + strerror(ret));
> + return -1;
> + }
> +
> + ret = pthreadpool_pipe_finished_jobs(p, &jobnum, 1);
> + if (ret != 1) {
> + fprintf(stderr, "pthreadpool_pipe_finished_jobs failed\n");
> + return -1;
> + }
> +
> + ret = poll(NULL, 0, 10);
> + if (ret == -1) {
> + perror("poll failed");
> + return -1;
> + }
> +
> + ret = pthreadpool_pipe_add_job(p, 1, busyfork_job, NULL);
> + if (ret != 0) {
> + fprintf(stderr, "pthreadpool_add_job failed: %s\n",
> + strerror(ret));
> + return -1;
> + }
> +
> + /*
> + * Do the fork right after the add_job. This tests a race
> + * where the atfork prepare handler gets all idle threads off
> + * the condvar. If we are faster doing the fork than the
> + * existing idle thread could get out of idle and take the
> + * job, after the fork we end up with no threads to take care
> + * of the job.
> + */
> +
> + child = fork();
> + if (child < 0) {
> + perror("fork failed");
> + return -1;
> + }
> +
> + if (child == 0) {
> + exit(0);
> + }
> +
> + pfd = (struct pollfd) {
> + .fd = pthreadpool_pipe_signal_fd(p),
> + .events = POLLIN|POLLERR
> + };
> +
> + do {
> + ret = poll(&pfd, 1, 5000);
> + } while ((ret == -1) && (errno == EINTR));
> +
> + if (ret == 0) {
> + fprintf(stderr, "job unfinished after 5 seconds\n");
> + return -1;
> + }
> +
> + return 0;
> +}
> +
> static void test_tevent_wait(void *private_data)
> {
> int *timeout = private_data;
> @@ -423,6 +499,12 @@ int main(void)
> return 1;
> }
>
> + ret = test_busyfork2();
> + if (ret != 0) {
> + fprintf(stderr, "test_busyfork2 failed\n");
> + return 1;
> + }
> +
> printf("success\n");
> return 0;
> }
> --
> 2.11.0
>
More information about the samba-technical
mailing list