Race condition in tdb_runtime_check_for_robust_mutexes()

Ralph Boehme rb at sernet.de
Sat Mar 26 12:06:39 UTC 2016


On Fri, Mar 25, 2016 at 11:50:15AM -0700, Jeremy Allison wrote:
> On Wed, Mar 23, 2016 at 01:02:48PM +0200, Uri Simchoni wrote:
> > Here's a patch to fix the issue. I don't have confirmation that it
> > fixes the issue but would like to continue the discussion on:
> > a. Whether this is the right direction or should we eliminate the
> > signal handler as Ralph suggested.
> > b. The use of sigprocmask without #ifdefs, as opposed to installing
> > and uninstalling the signal handler - it's hard as it is to get
> > those things right, and I wonder whether we target envs which don't
> > support sigprocmask. fork() hasn't #ifdefs around it ... :)
> 
> OK, I like this patch as clearly we need to make
> the assignment to tdb_robust_mutex_pid atomic.
> 
> +1 from me.

Looks generally ok. One issue: calling sigprocmask() in a
mutlithreaded program results in undefined behaviour per the spec, so
we should use pthread_sigmask(). We're using pthread* stuff anyway, so
we know it will be available.

Attached is something that should work as well that uses
sigsuspend(). Feel free to push whatever you prefer. :)

-Ralph

-- 
SerNet GmbH, Bahnhofsallee 1b, 37081 Göttingen
phone: +49-551-370000-0, fax: +49-551-370000-9
AG Göttingen, HRB 2816, GF: Dr. Johannes Loxen
http://www.sernet.de,mailto:kontakt@sernet.de
-------------- next part --------------
From 9f9ad6fff9a3000a984c9093b4f46d166c662d35 Mon Sep 17 00:00:00 2001
From: Ralph Boehme <slow at samba.org>
Date: Sat, 26 Mar 2016 12:43:55 +0100
Subject: [PATCH] tdb: avoid a race condition when checking for robust mutexes

This fixes a race between calling waitpid() in two places (SIGCHLD the
signal handler and the rendezvous code when waiting for the child to
terminate), by

- blocking SIGCHLD before installing our signal handler

- in the rendezvous code call sigssuspend() which unblocks SIGCHLD and
  suspends the thread and waits for signal delivery

BUG: https://bugzilla.samba.org/show_bug.cgi?id=11808

Signed-off-by: Ralph Boehme <slow at samba.org>
---
 lib/tdb/common/mutex.c | 53 ++++++++++++++++++++++++++------------------------
 1 file changed, 28 insertions(+), 25 deletions(-)

diff --git a/lib/tdb/common/mutex.c b/lib/tdb/common/mutex.c
index fae43d4..22107d4 100644
--- a/lib/tdb/common/mutex.c
+++ b/lib/tdb/common/mutex.c
@@ -714,11 +714,23 @@ static void (*tdb_robust_mutext_old_handler)(int) = SIG_ERR;
 static pid_t tdb_robust_mutex_pid = -1;
 
 static bool tdb_robust_mutex_setup_sigchild(void (*handler)(int),
-			void (**p_old_handler)(int))
+					    void (**p_old_handler)(int),
+					    bool block_sigchld,
+					    sigset_t *old_mask)
 {
 #ifdef HAVE_SIGACTION
+	int ret;
 	struct sigaction act;
 	struct sigaction oldact;
+	sigset_t mask;
+	int action = block_sigchld ? SIG_BLOCK : SIG_UNBLOCK;
+
+	sigemptyset(&mask);
+	sigaddset(&mask, SIGCHLD);
+	ret = pthread_sigmask(action, &mask, old_mask);
+	if (ret != 0) {
+		return false;
+	}
 
 	memset(&act, '\0', sizeof(act));
 
@@ -777,6 +789,7 @@ _PUBLIC_ bool tdb_runtime_check_for_robust_mutexes(void)
 	bool ok;
 	int status;
 	static bool initialized;
+	sigset_t old_mask;
 
 	if (initialized) {
 		return tdb_mutex_locking_cached;
@@ -828,8 +841,10 @@ _PUBLIC_ bool tdb_runtime_check_for_robust_mutexes(void)
 		goto cleanup_ma;
 	}
 
-	if (tdb_robust_mutex_setup_sigchild(tdb_robust_mutex_handler,
-			&tdb_robust_mutext_old_handler) == false) {
+	ok = tdb_robust_mutex_setup_sigchild(tdb_robust_mutex_handler,
+					     &tdb_robust_mutext_old_handler,
+					     true, &old_mask);
+	if (!ok) {
 		goto cleanup_ma;
 	}
 
@@ -883,20 +898,15 @@ _PUBLIC_ bool tdb_runtime_check_for_robust_mutexes(void)
 		goto cleanup_child;
 	}
 
+	sigdelset(&old_mask, SIGCHLD);
 	while (tdb_robust_mutex_pid > 0) {
-		pid_t pid;
-
-		errno = 0;
-		pid = waitpid(tdb_robust_mutex_pid, &status, 0);
-		if (pid == tdb_robust_mutex_pid) {
-			tdb_robust_mutex_pid = -1;
-			break;
-		}
-		if (pid == -1 && errno != EINTR) {
+		ret = sigsuspend(&old_mask);
+		if (ret != -1 || errno != EINTR) {
 			goto cleanup_child;
 		}
 	}
-	tdb_robust_mutex_setup_sigchild(tdb_robust_mutext_old_handler, NULL);
+	tdb_robust_mutex_setup_sigchild(tdb_robust_mutext_old_handler,
+					NULL, false, NULL);
 
 	ret = pthread_mutex_trylock(m);
 	if (ret != EOWNERDEAD) {
@@ -927,22 +937,15 @@ _PUBLIC_ bool tdb_runtime_check_for_robust_mutexes(void)
 
 cleanup_child:
 	while (tdb_robust_mutex_pid > 0) {
-		pid_t pid;
-
 		kill(tdb_robust_mutex_pid, SIGKILL);
-
-		errno = 0;
-		pid = waitpid(tdb_robust_mutex_pid, &status, 0);
-		if (pid == tdb_robust_mutex_pid) {
-			tdb_robust_mutex_pid = -1;
-			break;
-		}
-		if (pid == -1 && errno != EINTR) {
-			break;
+		ret = sigsuspend(&old_mask);
+		if (ret != -1 || errno != EINTR) {
+			goto cleanup_child;
 		}
 	}
 cleanup_sig_child:
-	tdb_robust_mutex_setup_sigchild(tdb_robust_mutext_old_handler, NULL);
+	tdb_robust_mutex_setup_sigchild(tdb_robust_mutext_old_handler,
+					NULL, false, NULL);
 cleanup_m:
 	pthread_mutex_destroy(m);
 cleanup_ma:
-- 
2.5.0



More information about the samba-technical mailing list