Race condition in tdb_runtime_check_for_robust_mutexes()

Uri Simchoni uri at samba.org
Sun Mar 27 16:45:21 UTC 2016

Whew! lots of activity during the weekend! :)
RB+ me - not pushing 'cause Jeremy's already into it, so let him have
the final word.


On 03/27/2016 10:34 AM, Ralph Boehme wrote:
> On Sat, Mar 26, 2016 at 09:59:24PM -0700, Jeremy Allison wrote:
>> On Sat, Mar 26, 2016 at 01:06:39PM +0100, Ralph Boehme wrote:
>>> On Fri, Mar 25, 2016 at 11:50:15AM -0700, Jeremy Allison wrote:
>>>> On Wed, Mar 23, 2016 at 01:02:48PM +0200, Uri Simchoni wrote:
>>>>> Here's a patch to fix the issue. I don't have confirmation that it
>>>>> fixes the issue but would like to continue the discussion on:
>>>>> a. Whether this is the right direction or should we eliminate the
>>>>> signal handler as Ralph suggested.
>>>>> b. The use of sigprocmask without #ifdefs, as opposed to installing
>>>>> and uninstalling the signal handler - it's hard as it is to get
>>>>> those things right, and I wonder whether we target envs which don't
>>>>> support sigprocmask. fork() hasn't #ifdefs around it ... :)
>>>> OK, I like this patch as clearly we need to make
>>>> the assignment to tdb_robust_mutex_pid atomic.
>>>> +1 from me.
>>> Looks generally ok. One issue: calling sigprocmask() in a
>>> mutlithreaded program results in undefined behaviour per the spec, so
>>> we should use pthread_sigmask(). We're using pthread* stuff anyway, so
>>> we know it will be available.
>>> Attached is something that should work as well that uses
>>> sigsuspend(). Feel free to push whatever you prefer. :)
>> Is sigsuspend real in glibc ?
> Yes, see sysdeps/unix/sysv/linux/sigsuspend.c. It's a syscall.
>> I remember the implementation
>> of pselect used to be something like:
>>     sigprocmask(SIG_SETMASK, &sigmask, &sigsaved);
>>     ready = select(nfds, &readfds, &writefds, &exceptfds, timeout);
>>     sigprocmask(SIG_SETMASK, &sigsaved, NULL);
>> which still has the race condition :-).
> Afaict this was fixed around 2006. :)
>> Also, the use of sigsuspend here isn't idiomatic,
>> which makes me nervous.
> That is just example code. The key point is to block the critical
> signal at some point before calling sigsupend() which will unblock and
> wait for that (or other) signal.
> Attached is an updated version that moves the signal mask business out
> of tdb_robust_mutex_setup_sigchild().
>> In the glibc manual we have:
>> http://www.gnu.org/software/libc/manual/html_node/Sigsuspend.html#Sigsuspend
>> The idiomatic use of sigsuspend is:
>> 	/* Set up the mask of signals to temporarily block. */
>> 	sigemptyset (&mask);
>> 	sigaddset (&mask, SIGUSR1);
>> 	/* Wait for a signal to arrive. */
>> 	sigprocmask (SIG_BLOCK, &mask, &oldmask);
>> 	while (!usr_interrupt)
>> 		sigsuspend (&oldmask);
>> 	sigprocmask (SIG_UNBLOCK, &mask, NULL);
>> "This last piece of code is a little tricky. The key point to
>> remember here is that when sigsuspend returns, it resets
>> the process’s signal mask to the original value, the value
>> from before the call to sigsuspend—in this case, the SIGUSR1
>> signal is once again blocked. The second call to sigprocmask
>> is necessary to explicitly unblock this signal."
>> I *think* the changes you made to tdb_robust_mutex_setup_sigchild()
>> are doing the same thing by doing the block/unblock inside
>> it depending on the 'bool' paramter,
> exactly.
>> but to be honest I'd much prefer it if you moved those calls outside
>> and back to just before/after the calls to
>> tdb_robust_mutex_setup_sigchild(), and leave that call as-is. That
>> way I can *see* they're there at the right point in the code.
> attached. :)
> Cheerio!
> -slow

More information about the samba-technical mailing list