[PATCH] Fix a pthreadpool race

Jeremy Allison jra at samba.org
Thu Dec 7 00:53:15 UTC 2017


On Wed, Dec 06, 2017 at 02:28:06PM -0800, Jeremy Allison wrote:
> On Wed, Dec 06, 2017 at 06:56:56PM +0100, Volker Lendecke via samba-technical wrote:
> > On Wed, Dec 06, 2017 at 05:53:57PM +0100, Volker Lendecke wrote:
> > > Hi!
> > > 
> > > This survives pthreadpooltest, it has not seen a full autobuild yet.
> > > I'd just like to post it for thorough review. It has gone through
> > > quite a few internal discussions back and forth bouncing ideas about
> > > different solutions. If someone has a simpler solution that survives
> > > the test in the second patch, I'm more than happy to review it.
> > > 
> > > Christof, looking at your patches now :-)
> > > 
> > > Review appreciated!
> > 
> > Sorry, this missed a prerequisite patch. Attached.
> 
> 
> OK - reviewing. First, understanding the problem the
> test reproduces:
> 
> add a job (num_jobs = 1) -> creates thread to run it.
> job finishes, thread sticks around (num_idle = 1).
> num_jobs is now zero (initil job finished).
> 
> a) Idle thread is now waiting on pool->condvar inside
> pthreadpool_server() in pthread_cond_timedwait().
> 
> Now, add another job ->
> 
> 	pthreadpool_add_job()
> 		-> pthreadpool_put_job()
> 			This adds the job to the queue.
> 		Oh, there is an idle thread so don't
> 		create one, do:
> 
> 		pthread_cond_signal(&pool->condvar);
> 
> 		and return.
> 
> Now call fork *before* idle thread in (a) wakes from
> the signaling of pool->condvar.
> 
> In the parent (child is irrelevent):
> 
> Go into: pthreadpool_prepare() ->
> 		pthreadpool_prepare_pool()
> 
> 		Set the variable to tell idle threads to exit:
> 
> 		pool->prefork_cond = &prefork_cond;
> 
> 		then wake them up with:
> 
> 		pthread_cond_signal(&pool->condvar);
> 
> 		This does nothing as the idle thread
> 		is already awoken.
> 
> b) Idle thread wakes up and does:
> 
> 		Reduce idle thread count (num_idle = 0)
> 
> 		pool->num_idle -= 1;
> 
> 		Check if we're in the middle of a fork.
> 
> 		if (pool->prefork_cond != NULL) {
> 
> 			Yes we are, tell pthreadpool_prepare()
> 			we are exiting.
> 
> 			pthread_cond_signal(pool->prefork_cond);
> 
> 			And exit.
> 
> 			pthreadpool_server_exit(pool);
> 			return NULL;
> 		}
> 
> So we come back from the fork in the parent with num_jobs = 1,
> a job on the queue but no idle threads - and the code that
> creates a new thread on job submission was skipped because
> an idle thread existed at point (a).
> 
> Is that a correct explaination of the test ? If so I'll go
> on to review the fix :-) :-).

OK, assuming that the previous explaination is correct, the
fix is to create a new pthreadpool context mutex:

pool->fork_mutex

and in pthreadpool_server(), when an idle thread wakes up and
notices we're in the prepare fork state, it puts itself to
sleep by waiting on the new pool->fork_mutex.

And in pthreadpool_prepare_pool(), instead of waiting for
the idle threads to exit, hold the pool->fork_mutex and
signal each idle thread in turn, and wait for the pool->num_idle
to go to zero - which means they're all blocked waiting on
pool->fork_mutex.

When the parent continues, pthreadpool_parent()
unlocks the pool->fork_mutex and all the previously
'idle' threads wake up (and you mention the thundering
herd problem, which is as you say vanishingly small :-)
and pick up any remaining job.

Nice work. Looks elegant to me and I can't see any
other way of fixing it rather than the create pthread
in pthreadpool_parent() call, which you ruled out
for the lack of ability to return any error.

Cool ! Reviewed-by: Jeremy Allison <jra at samba.org>

Feel free to push with any of the
text of this or the previous message in the
commit message if you think it will help others :-).

Cheers,

	Jeremy.



More information about the samba-technical mailing list