Race condition in tdb_runtime_check_for_robust_mutexes()
uri at samba.org
Sun Mar 27 16:45:21 UTC 2016
Whew! lots of activity during the weekend! :)
RB+ me - not pushing 'cause Jeremy's already into it, so let him have
the final word.
On 03/27/2016 10:34 AM, Ralph Boehme wrote:
> On Sat, Mar 26, 2016 at 09:59:24PM -0700, Jeremy Allison wrote:
>> On Sat, Mar 26, 2016 at 01:06:39PM +0100, Ralph Boehme wrote:
>>> On Fri, Mar 25, 2016 at 11:50:15AM -0700, Jeremy Allison wrote:
>>>> On Wed, Mar 23, 2016 at 01:02:48PM +0200, Uri Simchoni wrote:
>>>>> Here's a patch to fix the issue. I don't have confirmation that it
>>>>> fixes the issue but would like to continue the discussion on:
>>>>> a. Whether this is the right direction or should we eliminate the
>>>>> signal handler as Ralph suggested.
>>>>> b. The use of sigprocmask without #ifdefs, as opposed to installing
>>>>> and uninstalling the signal handler - it's hard as it is to get
>>>>> those things right, and I wonder whether we target envs which don't
>>>>> support sigprocmask. fork() hasn't #ifdefs around it ... :)
>>>> OK, I like this patch as clearly we need to make
>>>> the assignment to tdb_robust_mutex_pid atomic.
>>>> +1 from me.
>>> Looks generally ok. One issue: calling sigprocmask() in a
>>> mutlithreaded program results in undefined behaviour per the spec, so
>>> we should use pthread_sigmask(). We're using pthread* stuff anyway, so
>>> we know it will be available.
>>> Attached is something that should work as well that uses
>>> sigsuspend(). Feel free to push whatever you prefer. :)
>> Is sigsuspend real in glibc ?
> Yes, see sysdeps/unix/sysv/linux/sigsuspend.c. It's a syscall.
>> I remember the implementation
>> of pselect used to be something like:
>> sigprocmask(SIG_SETMASK, &sigmask, &sigsaved);
>> ready = select(nfds, &readfds, &writefds, &exceptfds, timeout);
>> sigprocmask(SIG_SETMASK, &sigsaved, NULL);
>> which still has the race condition :-).
> Afaict this was fixed around 2006. :)
>> Also, the use of sigsuspend here isn't idiomatic,
>> which makes me nervous.
> That is just example code. The key point is to block the critical
> signal at some point before calling sigsupend() which will unblock and
> wait for that (or other) signal.
> Attached is an updated version that moves the signal mask business out
> of tdb_robust_mutex_setup_sigchild().
>> In the glibc manual we have:
>> The idiomatic use of sigsuspend is:
>> /* Set up the mask of signals to temporarily block. */
>> sigemptyset (&mask);
>> sigaddset (&mask, SIGUSR1);
>> /* Wait for a signal to arrive. */
>> sigprocmask (SIG_BLOCK, &mask, &oldmask);
>> while (!usr_interrupt)
>> sigsuspend (&oldmask);
>> sigprocmask (SIG_UNBLOCK, &mask, NULL);
>> "This last piece of code is a little tricky. The key point to
>> remember here is that when sigsuspend returns, it resets
>> the process’s signal mask to the original value, the value
>> from before the call to sigsuspend—in this case, the SIGUSR1
>> signal is once again blocked. The second call to sigprocmask
>> is necessary to explicitly unblock this signal."
>> I *think* the changes you made to tdb_robust_mutex_setup_sigchild()
>> are doing the same thing by doing the block/unblock inside
>> it depending on the 'bool' paramter,
>> but to be honest I'd much prefer it if you moved those calls outside
>> and back to just before/after the calls to
>> tdb_robust_mutex_setup_sigchild(), and leave that call as-is. That
>> way I can *see* they're there at the right point in the code.
> attached. :)
More information about the samba-technical