[PATCH] Fix a pthreadpool race
Jeremy Allison
jra at samba.org
Wed Dec 6 22:28:06 UTC 2017
On Wed, Dec 06, 2017 at 06:56:56PM +0100, Volker Lendecke via samba-technical wrote:
> On Wed, Dec 06, 2017 at 05:53:57PM +0100, Volker Lendecke wrote:
> > Hi!
> >
> > This survives pthreadpooltest, it has not seen a full autobuild yet.
> > I'd just like to post it for thorough review. It has gone through
> > quite a few internal discussions back and forth bouncing ideas about
> > different solutions. If someone has a simpler solution that survives
> > the test in the second patch, I'm more than happy to review it.
> >
> > Christof, looking at your patches now :-)
> >
> > Review appreciated!
>
> Sorry, this missed a prerequisite patch. Attached.
OK - reviewing. First, understanding the problem the
test reproduces:
add a job (num_jobs = 1) -> creates thread to run it.
job finishes, thread sticks around (num_idle = 1).
num_jobs is now zero (initil job finished).
a) Idle thread is now waiting on pool->condvar inside
pthreadpool_server() in pthread_cond_timedwait().
Now, add another job ->
pthreadpool_add_job()
-> pthreadpool_put_job()
This adds the job to the queue.
Oh, there is an idle thread so don't
create one, do:
pthread_cond_signal(&pool->condvar);
and return.
Now call fork *before* idle thread in (a) wakes from
the signaling of pool->condvar.
In the parent (child is irrelevent):
Go into: pthreadpool_prepare() ->
pthreadpool_prepare_pool()
Set the variable to tell idle threads to exit:
pool->prefork_cond = &prefork_cond;
then wake them up with:
pthread_cond_signal(&pool->condvar);
This does nothing as the idle thread
is already awoken.
b) Idle thread wakes up and does:
Reduce idle thread count (num_idle = 0)
pool->num_idle -= 1;
Check if we're in the middle of a fork.
if (pool->prefork_cond != NULL) {
Yes we are, tell pthreadpool_prepare()
we are exiting.
pthread_cond_signal(pool->prefork_cond);
And exit.
pthreadpool_server_exit(pool);
return NULL;
}
So we come back from the fork in the parent with num_jobs = 1,
a job on the queue but no idle threads - and the code that
creates a new thread on job submission was skipped because
an idle thread existed at point (a).
Is that a correct explaination of the test ? If so I'll go
on to review the fix :-) :-).
Cheers,
Jeremy.
> --
> SerNet GmbH, Bahnhofsallee 1b, 37081 Göttingen
> phone: +49-551-370000-0, fax: +49-551-370000-9
> AG Göttingen, HRB 2816, GF: Dr. Johannes Loxen
> http://www.sernet.de, mailto:kontakt at sernet.de
> From 20b2b0a926a2119c77cc25127d9f6c6c683d4b13 Mon Sep 17 00:00:00 2001
> From: Volker Lendecke <vl at samba.org>
> Date: Tue, 5 Dec 2017 15:34:09 +0100
> Subject: [PATCH 1/3] lib: Don't use PTHREAD_COND_INITIALIZER for a stack
> variable
>
> susv4 speaks about statically allocated variables. Stack vars are not.
>
> Signed-off-by: Volker Lendecke <vl at samba.org>
> ---
> lib/pthreadpool/pthreadpool.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/lib/pthreadpool/pthreadpool.c b/lib/pthreadpool/pthreadpool.c
> index 23885aa6d11..646708adb62 100644
> --- a/lib/pthreadpool/pthreadpool.c
> +++ b/lib/pthreadpool/pthreadpool.c
> @@ -179,9 +179,12 @@ int pthreadpool_init(unsigned max_threads, struct pthreadpool **presult,
>
> static void pthreadpool_prepare_pool(struct pthreadpool *pool)
> {
> - pthread_cond_t prefork_cond = PTHREAD_COND_INITIALIZER;
> + pthread_cond_t prefork_cond;
> int ret;
>
> + ret = pthread_cond_init(&prefork_cond, NULL);
> + assert(ret == 0);
> +
> ret = pthread_mutex_lock(&pool->mutex);
> assert(ret == 0);
>
> --
> 2.11.0
>
>
> From 03934e534d190542e91184ab47a591f68772c6ea Mon Sep 17 00:00:00 2001
> From: Volker Lendecke <vl at samba.org>
> Date: Wed, 29 Nov 2017 16:45:40 +0100
> Subject: [PATCH 2/3] pthreadpool: Fix starvation after fork
>
> After the race is before the race:
>
> 1) Create an idle thread
> 2) Add a job: This won't create a thread anymore
> 3) Immediately fork
>
> The idle thread will be woken twice before it's actually woken up: Both
> pthreadpool_add_job and pthreadpool_prepare_pool call cond_signal, for
> different reasons. We must look at pool->prefork_cond first because otherwise
> we will end up in a blocking job deep within a fork call, the helper thread
> must take its fingers off the condvar as quickly as possible. This means that
> after the fork there's no idle thread around anymore that would pick up the job
> submitted in 2). So we must keep the idle threads around across the fork.
>
> The quick solution to re-create one helper thread in pthreadpool_parent has a
> fatal flaw: What do we do if that pthread_create call fails? We're deep in an
> application calling fork(), and doing fancy signalling from there is really
> something we must avoid.
>
> This has one potential performance issue: If we have hundreds of idle threads
> (do we ever have that) during the fork, the call to pthread_mutex_lock on the
> fork_mutex from pthreadpool_server (the helper thread) will probably cause a
> thundering herd when the _parent call unlocks the fork_mutex. The solution for
> this to just keep one idle thread around. But this adds code that is not
> strictly required functionally for now.
>
> Signed-off-by: Volker Lendecke <vl at samba.org>
> ---
> lib/pthreadpool/pthreadpool.c | 92 ++++++++++++++++++++++++++++++++++---------
> 1 file changed, 73 insertions(+), 19 deletions(-)
>
> diff --git a/lib/pthreadpool/pthreadpool.c b/lib/pthreadpool/pthreadpool.c
> index 646708adb62..0bf7109a8f8 100644
> --- a/lib/pthreadpool/pthreadpool.c
> +++ b/lib/pthreadpool/pthreadpool.c
> @@ -91,11 +91,19 @@ struct pthreadpool {
> int num_idle;
>
> /*
> - * Condition variable indicating that we should quickly go
> - * away making way for fork() without anybody waiting on
> - * pool->condvar.
> + * Condition variable indicating that helper threads should
> + * quickly go away making way for fork() without anybody
> + * waiting on pool->condvar.
> */
> pthread_cond_t *prefork_cond;
> +
> + /*
> + * Waiting position for helper threads while fork is
> + * running. The forking thread will have locked it, and all
> + * idle helper threads will sit here until after the fork,
> + * where the forking thread will unlock it again.
> + */
> + pthread_mutex_t fork_mutex;
> };
>
> static pthread_mutex_t pthreadpools_mutex = PTHREAD_MUTEX_INITIALIZER;
> @@ -151,6 +159,15 @@ int pthreadpool_init(unsigned max_threads, struct pthreadpool **presult,
> return ret;
> }
>
> + ret = pthread_mutex_init(&pool->fork_mutex, NULL);
> + if (ret != 0) {
> + pthread_cond_destroy(&pool->condvar);
> + pthread_mutex_destroy(&pool->mutex);
> + free(pool->jobs);
> + free(pool);
> + return ret;
> + }
> +
> pool->shutdown = false;
> pool->num_threads = 0;
> pool->max_threads = max_threads;
> @@ -159,6 +176,7 @@ int pthreadpool_init(unsigned max_threads, struct pthreadpool **presult,
>
> ret = pthread_mutex_lock(&pthreadpools_mutex);
> if (ret != 0) {
> + pthread_mutex_destroy(&pool->fork_mutex);
> pthread_cond_destroy(&pool->condvar);
> pthread_mutex_destroy(&pool->mutex);
> free(pool->jobs);
> @@ -179,21 +197,26 @@ int pthreadpool_init(unsigned max_threads, struct pthreadpool **presult,
>
> static void pthreadpool_prepare_pool(struct pthreadpool *pool)
> {
> - pthread_cond_t prefork_cond;
> int ret;
>
> - ret = pthread_cond_init(&prefork_cond, NULL);
> + ret = pthread_mutex_lock(&pool->fork_mutex);
> assert(ret == 0);
>
> ret = pthread_mutex_lock(&pool->mutex);
> assert(ret == 0);
>
> while (pool->num_idle != 0) {
> + int num_idle = pool->num_idle;
> + pthread_cond_t prefork_cond;
> +
> + ret = pthread_cond_init(&prefork_cond, NULL);
> + assert(ret == 0);
> +
> /*
> - * Exit all idle threads, which are all blocked in
> - * pool->condvar. In the child we can destroy the
> - * pool, which would result in undefined behaviour in
> - * the pthread_cond_destroy(pool->condvar). glibc just
> + * Push all idle threads off pool->condvar. In the
> + * child we can destroy the pool, which would result
> + * in undefined behaviour in the
> + * pthread_cond_destroy(pool->condvar). glibc just
> * blocks here.
> */
> pool->prefork_cond = &prefork_cond;
> @@ -201,14 +224,16 @@ static void pthreadpool_prepare_pool(struct pthreadpool *pool)
> ret = pthread_cond_signal(&pool->condvar);
> assert(ret == 0);
>
> - ret = pthread_cond_wait(&prefork_cond, &pool->mutex);
> - assert(ret == 0);
> + while (pool->num_idle == num_idle) {
> + ret = pthread_cond_wait(&prefork_cond, &pool->mutex);
> + assert(ret == 0);
> + }
>
> pool->prefork_cond = NULL;
> - }
>
> - ret = pthread_cond_destroy(&prefork_cond);
> - assert(ret == 0);
> + ret = pthread_cond_destroy(&prefork_cond);
> + assert(ret == 0);
> + }
>
> /*
> * Probably it's well-defined somewhere: What happens to
> @@ -249,6 +274,8 @@ static void pthreadpool_parent(void)
> assert(ret == 0);
> ret = pthread_mutex_unlock(&pool->mutex);
> assert(ret == 0);
> + ret = pthread_mutex_unlock(&pool->fork_mutex);
> + assert(ret == 0);
> }
>
> ret = pthread_mutex_unlock(&pthreadpools_mutex);
> @@ -271,8 +298,12 @@ static void pthreadpool_child(void)
>
> ret = pthread_cond_init(&pool->condvar, NULL);
> assert(ret == 0);
> +
> ret = pthread_mutex_unlock(&pool->mutex);
> assert(ret == 0);
> +
> + ret = pthread_mutex_unlock(&pool->fork_mutex);
> + assert(ret == 0);
> }
>
> ret = pthread_mutex_unlock(&pthreadpools_mutex);
> @@ -287,7 +318,7 @@ static void pthreadpool_prep_atfork(void)
>
> static int pthreadpool_free(struct pthreadpool *pool)
> {
> - int ret, ret1;
> + int ret, ret1, ret2;
>
> ret = pthread_mutex_lock(&pthreadpools_mutex);
> if (ret != 0) {
> @@ -299,6 +330,7 @@ static int pthreadpool_free(struct pthreadpool *pool)
>
> ret = pthread_mutex_destroy(&pool->mutex);
> ret1 = pthread_cond_destroy(&pool->condvar);
> + ret2 = pthread_mutex_destroy(&pool->fork_mutex);
>
> if (ret != 0) {
> return ret;
> @@ -306,6 +338,9 @@ static int pthreadpool_free(struct pthreadpool *pool)
> if (ret1 != 0) {
> return ret1;
> }
> + if (ret2 != 0) {
> + return ret2;
> + }
>
> free(pool->jobs);
> free(pool);
> @@ -465,11 +500,30 @@ static void *pthreadpool_server(void *arg)
> /*
> * Me must allow fork() to continue
> * without anybody waiting on
> - * &pool->condvar.
> + * &pool->condvar. Tell
> + * pthreadpool_prepare_pool that we
> + * got that message.
> */
> - pthread_cond_signal(pool->prefork_cond);
> - pthreadpool_server_exit(pool);
> - return NULL;
> +
> + res = pthread_cond_signal(pool->prefork_cond);
> + assert(res == 0);
> +
> + res = pthread_mutex_unlock(&pool->mutex);
> + assert(res == 0);
> +
> + /*
> + * pthreadpool_prepare_pool has
> + * already locked this mutex across
> + * the fork. This makes us wait
> + * without sitting in a condvar.
> + */
> + res = pthread_mutex_lock(&pool->fork_mutex);
> + assert(res == 0);
> + res = pthread_mutex_unlock(&pool->fork_mutex);
> + assert(res == 0);
> +
> + res = pthread_mutex_lock(&pool->mutex);
> + assert(res == 0);
> }
>
> if (res == ETIMEDOUT) {
> --
> 2.11.0
>
>
> From b4d0c5ae20bce5849470d1acf2e567e6824685c9 Mon Sep 17 00:00:00 2001
> From: Volker Lendecke <vl at samba.org>
> Date: Wed, 29 Nov 2017 18:55:21 +0100
> Subject: [PATCH 3/3] pthreadpool: Add a test for the race condition fixed in
> the last commit
>
> Signed-off-by: Volker Lendecke <vl at samba.org>
> ---
> lib/pthreadpool/tests.c | 82 +++++++++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 82 insertions(+)
>
> diff --git a/lib/pthreadpool/tests.c b/lib/pthreadpool/tests.c
> index 1aab80c2bb4..f0ae0aa4a93 100644
> --- a/lib/pthreadpool/tests.c
> +++ b/lib/pthreadpool/tests.c
> @@ -308,6 +308,82 @@ static int test_busyfork(void)
> return 0;
> }
>
> +static int test_busyfork2(void)
> +{
> + struct pthreadpool_pipe *p;
> + pid_t child;
> + int ret, jobnum;
> + struct pollfd pfd;
> +
> + ret = pthreadpool_pipe_init(1, &p);
> + if (ret != 0) {
> + fprintf(stderr, "pthreadpool_pipe_init failed: %s\n",
> + strerror(ret));
> + return -1;
> + }
> +
> + ret = pthreadpool_pipe_add_job(p, 1, busyfork_job, NULL);
> + if (ret != 0) {
> + fprintf(stderr, "pthreadpool_add_job failed: %s\n",
> + strerror(ret));
> + return -1;
> + }
> +
> + ret = pthreadpool_pipe_finished_jobs(p, &jobnum, 1);
> + if (ret != 1) {
> + fprintf(stderr, "pthreadpool_pipe_finished_jobs failed\n");
> + return -1;
> + }
> +
> + ret = poll(NULL, 0, 10);
> + if (ret == -1) {
> + perror("poll failed");
> + return -1;
> + }
> +
> + ret = pthreadpool_pipe_add_job(p, 1, busyfork_job, NULL);
> + if (ret != 0) {
> + fprintf(stderr, "pthreadpool_add_job failed: %s\n",
> + strerror(ret));
> + return -1;
> + }
> +
> + /*
> + * Do the fork right after the add_job. This tests a race
> + * where the atfork prepare handler gets all idle threads off
> + * the condvar. If we are faster doing the fork than the
> + * existing idle thread could get out of idle and take the
> + * job, after the fork we end up with no threads to take care
> + * of the job.
> + */
> +
> + child = fork();
> + if (child < 0) {
> + perror("fork failed");
> + return -1;
> + }
> +
> + if (child == 0) {
> + exit(0);
> + }
> +
> + pfd = (struct pollfd) {
> + .fd = pthreadpool_pipe_signal_fd(p),
> + .events = POLLIN|POLLERR
> + };
> +
> + do {
> + ret = poll(&pfd, 1, 5000);
> + } while ((ret == -1) && (errno == EINTR));
> +
> + if (ret == 0) {
> + fprintf(stderr, "job unfinished after 5 seconds\n");
> + return -1;
> + }
> +
> + return 0;
> +}
> +
> static void test_tevent_wait(void *private_data)
> {
> int *timeout = private_data;
> @@ -423,6 +499,12 @@ int main(void)
> return 1;
> }
>
> + ret = test_busyfork2();
> + if (ret != 0) {
> + fprintf(stderr, "test_busyfork2 failed\n");
> + return 1;
> + }
> +
> printf("success\n");
> return 0;
> }
> --
> 2.11.0
>
More information about the samba-technical
mailing list