Race condition in tdb_runtime_check_for_robust_mutexes()

Ralph Boehme slow at samba.org
Sun Mar 27 07:34:38 UTC 2016


On Sat, Mar 26, 2016 at 09:59:24PM -0700, Jeremy Allison wrote:
> On Sat, Mar 26, 2016 at 01:06:39PM +0100, Ralph Boehme wrote:
> > On Fri, Mar 25, 2016 at 11:50:15AM -0700, Jeremy Allison wrote:
> > > On Wed, Mar 23, 2016 at 01:02:48PM +0200, Uri Simchoni wrote:
> > > > Here's a patch to fix the issue. I don't have confirmation that it
> > > > fixes the issue but would like to continue the discussion on:
> > > > a. Whether this is the right direction or should we eliminate the
> > > > signal handler as Ralph suggested.
> > > > b. The use of sigprocmask without #ifdefs, as opposed to installing
> > > > and uninstalling the signal handler - it's hard as it is to get
> > > > those things right, and I wonder whether we target envs which don't
> > > > support sigprocmask. fork() hasn't #ifdefs around it ... :)
> > > 
> > > OK, I like this patch as clearly we need to make
> > > the assignment to tdb_robust_mutex_pid atomic.
> > > 
> > > +1 from me.
> > 
> > Looks generally ok. One issue: calling sigprocmask() in a
> > mutlithreaded program results in undefined behaviour per the spec, so
> > we should use pthread_sigmask(). We're using pthread* stuff anyway, so
> > we know it will be available.
> > 
> > Attached is something that should work as well that uses
> > sigsuspend(). Feel free to push whatever you prefer. :)
> 
> Is sigsuspend real in glibc ?

Yes, see sysdeps/unix/sysv/linux/sigsuspend.c. It's a syscall.

> I remember the implementation
> of pselect used to be something like:
> 
>     sigprocmask(SIG_SETMASK, &sigmask, &sigsaved);
>     ready = select(nfds, &readfds, &writefds, &exceptfds, timeout);
>     sigprocmask(SIG_SETMASK, &sigsaved, NULL);
> 
> which still has the race condition :-).

Afaict this was fixed around 2006. :)

> Also, the use of sigsuspend here isn't idiomatic,
> which makes me nervous.

That is just example code. The key point is to block the critical
signal at some point before calling sigsupend() which will unblock and
wait for that (or other) signal.

Attached is an updated version that moves the signal mask business out
of tdb_robust_mutex_setup_sigchild().

> In the glibc manual we have:
> 
> http://www.gnu.org/software/libc/manual/html_node/Sigsuspend.html#Sigsuspend
> 
> The idiomatic use of sigsuspend is:
> 
> 	/* Set up the mask of signals to temporarily block. */
> 	sigemptyset (&mask);
> 	sigaddset (&mask, SIGUSR1);
> 
>> 
> 	/* Wait for a signal to arrive. */
> 	sigprocmask (SIG_BLOCK, &mask, &oldmask);
> 	while (!usr_interrupt)
> 		sigsuspend (&oldmask);
> 	sigprocmask (SIG_UNBLOCK, &mask, NULL);
> 
> "This last piece of code is a little tricky. The key point to
> remember here is that when sigsuspend returns, it resets
> the process’s signal mask to the original value, the value
> from before the call to sigsuspend—in this case, the SIGUSR1
> signal is once again blocked. The second call to sigprocmask
> is necessary to explicitly unblock this signal."
> 
> I *think* the changes you made to tdb_robust_mutex_setup_sigchild()
> are doing the same thing by doing the block/unblock inside
> it depending on the 'bool' paramter,

exactly.

> but to be honest I'd much prefer it if you moved those calls outside
> and back to just before/after the calls to
> tdb_robust_mutex_setup_sigchild(), and leave that call as-is. That
> way I can *see* they're there at the right point in the code.

attached. :)

Cheerio!
-slow
-------------- next part --------------
From 40cef032dab216e7098bf08652f40be004594f9a Mon Sep 17 00:00:00 2001
From: Ralph Boehme <slow at samba.org>
Date: Sat, 26 Mar 2016 12:43:55 +0100
Subject: [PATCH] tdb: avoid a race condition when checking for robust mutexes

This fixes a race between calling waitpid() in two places (SIGCHLD the
signal handler and the rendezvous code when waiting for the child to
terminate), by

- blocking SIGCHLD before installing our signal handler

- in the rendezvous code call sigssuspend() which unblocks SIGCHLD and
  suspends the thread and waits for signal delivery

BUG: https://bugzilla.samba.org/show_bug.cgi?id=11808

Signed-off-by: Ralph Boehme <slow at samba.org>
---
 lib/tdb/common/mutex.c | 45 ++++++++++++++++++++++++---------------------
 1 file changed, 24 insertions(+), 21 deletions(-)

diff --git a/lib/tdb/common/mutex.c b/lib/tdb/common/mutex.c
index fae43d4..1f9ea22 100644
--- a/lib/tdb/common/mutex.c
+++ b/lib/tdb/common/mutex.c
@@ -777,6 +777,7 @@ _PUBLIC_ bool tdb_runtime_check_for_robust_mutexes(void)
 	bool ok;
 	int status;
 	static bool initialized;
+	sigset_t mask, old_mask, suspend_mask;
 
 	if (initialized) {
 		return tdb_mutex_locking_cached;
@@ -828,6 +829,19 @@ _PUBLIC_ bool tdb_runtime_check_for_robust_mutexes(void)
 		goto cleanup_ma;
 	}
 
+	/*
+	 * Block SIGCHLD so we can atomically wait for it later with
+	 * sigsuspend()
+	 */
+	sigemptyset(&mask);
+	sigaddset(&mask, SIGCHLD);
+	ret = pthread_sigmask(SIG_BLOCK, &mask, &old_mask);
+	if (ret != 0) {
+		goto cleanup_ma;
+	}
+	suspend_mask = old_mask;
+	sigdelset(&suspend_mask, SIGCHLD);
+
 	if (tdb_robust_mutex_setup_sigchild(tdb_robust_mutex_handler,
 			&tdb_robust_mutext_old_handler) == false) {
 		goto cleanup_ma;
@@ -884,16 +898,9 @@ _PUBLIC_ bool tdb_runtime_check_for_robust_mutexes(void)
 	}
 
 	while (tdb_robust_mutex_pid > 0) {
-		pid_t pid;
-
-		errno = 0;
-		pid = waitpid(tdb_robust_mutex_pid, &status, 0);
-		if (pid == tdb_robust_mutex_pid) {
-			tdb_robust_mutex_pid = -1;
-			break;
-		}
-		if (pid == -1 && errno != EINTR) {
-			goto cleanup_child;
+		ret = sigsuspend(&suspend_mask);
+		if (ret != -1 || errno != EINTR) {
+			abort();
 		}
 	}
 	tdb_robust_mutex_setup_sigchild(tdb_robust_mutext_old_handler, NULL);
@@ -927,22 +934,18 @@ _PUBLIC_ bool tdb_runtime_check_for_robust_mutexes(void)
 
 cleanup_child:
 	while (tdb_robust_mutex_pid > 0) {
-		pid_t pid;
-
 		kill(tdb_robust_mutex_pid, SIGKILL);
-
-		errno = 0;
-		pid = waitpid(tdb_robust_mutex_pid, &status, 0);
-		if (pid == tdb_robust_mutex_pid) {
-			tdb_robust_mutex_pid = -1;
-			break;
-		}
-		if (pid == -1 && errno != EINTR) {
-			break;
+		ret = sigsuspend(&suspend_mask);
+		if (ret != -1 || errno != EINTR) {
+			abort();
 		}
 	}
 cleanup_sig_child:
 	tdb_robust_mutex_setup_sigchild(tdb_robust_mutext_old_handler, NULL);
+	ret = pthread_sigmask(SIG_SETMASK, &old_mask, NULL);
+	if (ret != 0) {
+		abort();
+	}
 cleanup_m:
 	pthread_mutex_destroy(m);
 cleanup_ma:
-- 
2.5.0



More information about the samba-technical mailing list