[PATCH] tevent and threads - infrastructure improvements - version #2
jra at samba.org
Tue Sep 15 16:46:30 UTC 2015
On Tue, Sep 15, 2015 at 06:32:55PM +0200, Ralph Böhme wrote:
> Afaict helgrind complains because destroying the mutex can't be
> done under protection of the lock and as such is subject to race
> conditions when other threads try to acquire the very same mutex.
Yeah, well that's part of the standard :-). Can't destroy a locked
> Applying the helgrind patch from this thread also lets helgrind pass
> without errors:
> Afaict the patch changes helgrind behaviour to not report *possible*
> races as actual errors. As the test passes helgrind with this patch,
> we know our code is safe in practice as long as the user only doesn't
> prematurely free the proxy object.
Oh cool ! I was looking for something like that yesterday,
but couldn't find the bug report. Thanks a *lot* for confirming
it's a helgrind issue.
> I'm not fully convinced. :)
> There's a race condition in tevent_thread_proxy_destructor() and
> helgrind pointed at it: we can't guarantee that a thread calling
> tevent_thread_proxy_schedule() with a reply request passing a proxy
> object doesn't free that proxy object prematurely for whatever reason.
Indeed. But that has to be done by programmer convention I
think. The whole point of this is that the synchronization
mechanism between the lock/unlock and destroy is done via
the messaging in immediate events passed between the threads,
thus outside the knowledge of any pthread analyzer.
Freeing the proxy object whilst in use is A BAD THING (tm) :-),
but there are many such conventions within talloc/tevent and
we just have to document them (as you point out below).
> The called thread would call tevent_thread_proxy_schedule() to send
> the reply, locking an already destroyed mutex which is undefined
> behaviour. Can we add some documentation that freeing proxy objects
> with in flight operations should be avoided or similar?
Yes, that's a good idea. I'll update the docs to add this and
post a new patchset so we have something complete to discuss
next week at SDC.
> Fwiw, my patch moving the signaling out the lock protected code in
> tevent_thread_proxy_schedule() indeed added a similar race that would
> be triggered by prematurely freeing the proxy object. At least this
> one can be fixed by keeping the signalling in the lock protected code.
Yep. I think we've gotten to something that works.
More information about the samba-technical