Race condition in tdb_runtime_check_for_robust_mutexes()
Uri Simchoni
uri at samba.org
Sun Mar 27 16:45:21 UTC 2016
Whew! lots of activity during the weekend! :)
RB+ me - not pushing 'cause Jeremy's already into it, so let him have
the final word.
Thanks,
Uri.
On 03/27/2016 10:34 AM, Ralph Boehme wrote:
> On Sat, Mar 26, 2016 at 09:59:24PM -0700, Jeremy Allison wrote:
>> On Sat, Mar 26, 2016 at 01:06:39PM +0100, Ralph Boehme wrote:
>>> On Fri, Mar 25, 2016 at 11:50:15AM -0700, Jeremy Allison wrote:
>>>> On Wed, Mar 23, 2016 at 01:02:48PM +0200, Uri Simchoni wrote:
>>>>> Here's a patch to fix the issue. I don't have confirmation that it
>>>>> fixes the issue but would like to continue the discussion on:
>>>>> a. Whether this is the right direction or should we eliminate the
>>>>> signal handler as Ralph suggested.
>>>>> b. The use of sigprocmask without #ifdefs, as opposed to installing
>>>>> and uninstalling the signal handler - it's hard as it is to get
>>>>> those things right, and I wonder whether we target envs which don't
>>>>> support sigprocmask. fork() hasn't #ifdefs around it ... :)
>>>>
>>>> OK, I like this patch as clearly we need to make
>>>> the assignment to tdb_robust_mutex_pid atomic.
>>>>
>>>> +1 from me.
>>>
>>> Looks generally ok. One issue: calling sigprocmask() in a
>>> mutlithreaded program results in undefined behaviour per the spec, so
>>> we should use pthread_sigmask(). We're using pthread* stuff anyway, so
>>> we know it will be available.
>>>
>>> Attached is something that should work as well that uses
>>> sigsuspend(). Feel free to push whatever you prefer. :)
>>
>> Is sigsuspend real in glibc ?
>
> Yes, see sysdeps/unix/sysv/linux/sigsuspend.c. It's a syscall.
>
>> I remember the implementation
>> of pselect used to be something like:
>>
>> sigprocmask(SIG_SETMASK, &sigmask, &sigsaved);
>> ready = select(nfds, &readfds, &writefds, &exceptfds, timeout);
>> sigprocmask(SIG_SETMASK, &sigsaved, NULL);
>>
>> which still has the race condition :-).
>
> Afaict this was fixed around 2006. :)
>
>> Also, the use of sigsuspend here isn't idiomatic,
>> which makes me nervous.
>
> That is just example code. The key point is to block the critical
> signal at some point before calling sigsupend() which will unblock and
> wait for that (or other) signal.
>
> Attached is an updated version that moves the signal mask business out
> of tdb_robust_mutex_setup_sigchild().
>
>> In the glibc manual we have:
>>
>> http://www.gnu.org/software/libc/manual/html_node/Sigsuspend.html#Sigsuspend
>>
>> The idiomatic use of sigsuspend is:
>>
>> /* Set up the mask of signals to temporarily block. */
>> sigemptyset (&mask);
>> sigaddset (&mask, SIGUSR1);
>>
>> …
>>
>> /* Wait for a signal to arrive. */
>> sigprocmask (SIG_BLOCK, &mask, &oldmask);
>> while (!usr_interrupt)
>> sigsuspend (&oldmask);
>> sigprocmask (SIG_UNBLOCK, &mask, NULL);
>>
>> "This last piece of code is a little tricky. The key point to
>> remember here is that when sigsuspend returns, it resets
>> the process’s signal mask to the original value, the value
>> from before the call to sigsuspend—in this case, the SIGUSR1
>> signal is once again blocked. The second call to sigprocmask
>> is necessary to explicitly unblock this signal."
>>
>> I *think* the changes you made to tdb_robust_mutex_setup_sigchild()
>> are doing the same thing by doing the block/unblock inside
>> it depending on the 'bool' paramter,
>
> exactly.
>
>> but to be honest I'd much prefer it if you moved those calls outside
>> and back to just before/after the calls to
>> tdb_robust_mutex_setup_sigchild(), and leave that call as-is. That
>> way I can *see* they're there at the right point in the code.
>
> attached. :)
>
> Cheerio!
> -slow
>
More information about the samba-technical
mailing list