[PATCH] Fix a pthreadpool race
Jeremy Allison
jra at samba.org
Thu Dec 7 00:53:15 UTC 2017
On Wed, Dec 06, 2017 at 02:28:06PM -0800, Jeremy Allison wrote:
> On Wed, Dec 06, 2017 at 06:56:56PM +0100, Volker Lendecke via samba-technical wrote:
> > On Wed, Dec 06, 2017 at 05:53:57PM +0100, Volker Lendecke wrote:
> > > Hi!
> > >
> > > This survives pthreadpooltest, it has not seen a full autobuild yet.
> > > I'd just like to post it for thorough review. It has gone through
> > > quite a few internal discussions back and forth bouncing ideas about
> > > different solutions. If someone has a simpler solution that survives
> > > the test in the second patch, I'm more than happy to review it.
> > >
> > > Christof, looking at your patches now :-)
> > >
> > > Review appreciated!
> >
> > Sorry, this missed a prerequisite patch. Attached.
>
>
> OK - reviewing. First, understanding the problem the
> test reproduces:
>
> add a job (num_jobs = 1) -> creates thread to run it.
> job finishes, thread sticks around (num_idle = 1).
> num_jobs is now zero (initil job finished).
>
> a) Idle thread is now waiting on pool->condvar inside
> pthreadpool_server() in pthread_cond_timedwait().
>
> Now, add another job ->
>
> pthreadpool_add_job()
> -> pthreadpool_put_job()
> This adds the job to the queue.
> Oh, there is an idle thread so don't
> create one, do:
>
> pthread_cond_signal(&pool->condvar);
>
> and return.
>
> Now call fork *before* idle thread in (a) wakes from
> the signaling of pool->condvar.
>
> In the parent (child is irrelevent):
>
> Go into: pthreadpool_prepare() ->
> pthreadpool_prepare_pool()
>
> Set the variable to tell idle threads to exit:
>
> pool->prefork_cond = &prefork_cond;
>
> then wake them up with:
>
> pthread_cond_signal(&pool->condvar);
>
> This does nothing as the idle thread
> is already awoken.
>
> b) Idle thread wakes up and does:
>
> Reduce idle thread count (num_idle = 0)
>
> pool->num_idle -= 1;
>
> Check if we're in the middle of a fork.
>
> if (pool->prefork_cond != NULL) {
>
> Yes we are, tell pthreadpool_prepare()
> we are exiting.
>
> pthread_cond_signal(pool->prefork_cond);
>
> And exit.
>
> pthreadpool_server_exit(pool);
> return NULL;
> }
>
> So we come back from the fork in the parent with num_jobs = 1,
> a job on the queue but no idle threads - and the code that
> creates a new thread on job submission was skipped because
> an idle thread existed at point (a).
>
> Is that a correct explaination of the test ? If so I'll go
> on to review the fix :-) :-).
OK, assuming that the previous explaination is correct, the
fix is to create a new pthreadpool context mutex:
pool->fork_mutex
and in pthreadpool_server(), when an idle thread wakes up and
notices we're in the prepare fork state, it puts itself to
sleep by waiting on the new pool->fork_mutex.
And in pthreadpool_prepare_pool(), instead of waiting for
the idle threads to exit, hold the pool->fork_mutex and
signal each idle thread in turn, and wait for the pool->num_idle
to go to zero - which means they're all blocked waiting on
pool->fork_mutex.
When the parent continues, pthreadpool_parent()
unlocks the pool->fork_mutex and all the previously
'idle' threads wake up (and you mention the thundering
herd problem, which is as you say vanishingly small :-)
and pick up any remaining job.
Nice work. Looks elegant to me and I can't see any
other way of fixing it rather than the create pthread
in pthreadpool_parent() call, which you ruled out
for the lack of ability to return any error.
Cool ! Reviewed-by: Jeremy Allison <jra at samba.org>
Feel free to push with any of the
text of this or the previous message in the
commit message if you think it will help others :-).
Cheers,
Jeremy.
More information about the samba-technical
mailing list