Race condition in tdb_runtime_check_for_robust_mutexes()
Ralph Boehme
slow at samba.org
Tue Mar 29 09:27:49 UTC 2016
On Mon, Mar 28, 2016 at 07:47:13PM +0200, Ralph Boehme wrote:
> On Mon, Mar 28, 2016 at 10:38:09AM -0700, Jeremy Allison wrote:
> > On Mon, Mar 28, 2016 at 09:28:34AM -0700, Jeremy Allison wrote:
> > > On Mon, Mar 28, 2016 at 09:23:48AM -0700, Jeremy Allison wrote:
> > > > On Sun, Mar 27, 2016 at 09:34:38AM +0200, Ralph Boehme wrote:
> > > > >
> > > > > > but to be honest I'd much prefer it if you moved those calls outside
> > > > > > and back to just before/after the calls to
> > > > > > tdb_robust_mutex_setup_sigchild(), and leave that call as-is. That
> > > > > > way I can *see* they're there at the right point in the code.
> > > > >
> > > > > attached. :)
> > > >
> > > > Thanks Ralph, I can actually understand your code
> > > > changes now :-).
> > > >
> > > > Pushed !
> > >
> > > With one minor change to remove the now-unused
> > > 'status' variable :-).
> >
> > Oh. I'm getting a reproducible fail on :
> >
> > UNEXPECTED(error): samba3.smbtorture_s3.plain(nt4_dc).CLEANUP1.smbtorture(nt4_dc)
> > REASON: Exception: Exception: reason (failure) interrupted
> >
> > command: /home/jeremy/src/samba/git/master/source3/script/tests/test_smbtorture_s3.sh CLEANUP1 //$SERVER_IP/tmp $USERNAME $PASSWORD /home/jeremy/src/samba/git/master/bin/smbtorture3 -l $LOCAL_PATH 2>&1 | /home/jeremy/src/samba/git/master/selftest/filter-subunit --fail-on-empty --prefix="samba3.smbtorture_s3.plain(nt4_dc).CLEANUP1." --suffix="(nt4_dc)"
> > expanded command: /home/jeremy/src/samba/git/master/source3/script/tests/test_smbtorture_s3.sh CLEANUP1 //127.0.0.3/tmp jeremy localntdc2pass /home/jeremy/src/samba/git/master/bin/smbtorture3 -l /home/jeremy/src/samba/git/master/st/nt4_dc/share 2>&1 | /home/jeremy/src/samba/git/master/selftest/filter-subunit --fail-on-empty --prefix="samba3.smbtorture_s3.plain(nt4_dc).CLEANUP1." --suffix="(nt4_dc)"
> > ERROR: Testsuite[samba3.smbtorture_s3.plain(nt4_dc).CLEANUP1]
> > REASON: Exit code was 1
> >
> > with this patchset. Ralph, can you
> > take a look please ? I'm off to a
> > conference (lasts most of the week)
> > today so I won't be able to check
> > immediately what this breaks.
>
> hm, I'll take a look.
ok, I forgot to update the goto hell, so when the test succeeded the
signal mask was not reset. Updated patchset attached.
Cheerio!
-slow
-------------- next part --------------
From 847daeacb35f4b125288192c00dbb1354840a44b Mon Sep 17 00:00:00 2001
From: Ralph Boehme <slow at samba.org>
Date: Sat, 26 Mar 2016 12:43:55 +0100
Subject: [PATCH] tdb: avoid a race condition when checking for robust mutexes
This fixes a race between calling waitpid() in two places (SIGCHLD the
signal handler and the rendezvous code when waiting for the child to
terminate), by
- blocking SIGCHLD before installing our signal handler
- in the rendezvous code call sigssuspend() which unblocks SIGCHLD and
suspends the thread and waits for signal delivery
BUG: https://bugzilla.samba.org/show_bug.cgi?id=11808
Signed-off-by: Ralph Boehme <slow at samba.org>
---
lib/tdb/common/mutex.c | 59 +++++++++++++++++++++++++++-----------------------
1 file changed, 32 insertions(+), 27 deletions(-)
diff --git a/lib/tdb/common/mutex.c b/lib/tdb/common/mutex.c
index fae43d4..e96b28d 100644
--- a/lib/tdb/common/mutex.c
+++ b/lib/tdb/common/mutex.c
@@ -777,6 +777,7 @@ _PUBLIC_ bool tdb_runtime_check_for_robust_mutexes(void)
bool ok;
int status;
static bool initialized;
+ sigset_t mask, old_mask, suspend_mask;
if (initialized) {
return tdb_mutex_locking_cached;
@@ -828,9 +829,22 @@ _PUBLIC_ bool tdb_runtime_check_for_robust_mutexes(void)
goto cleanup_ma;
}
+ /*
+ * Block SIGCHLD so we can atomically wait for it later with
+ * sigsuspend()
+ */
+ sigemptyset(&mask);
+ sigaddset(&mask, SIGCHLD);
+ ret = pthread_sigmask(SIG_BLOCK, &mask, &old_mask);
+ if (ret != 0) {
+ goto cleanup_ma;
+ }
+ suspend_mask = old_mask;
+ sigdelset(&suspend_mask, SIGCHLD);
+
if (tdb_robust_mutex_setup_sigchild(tdb_robust_mutex_handler,
&tdb_robust_mutext_old_handler) == false) {
- goto cleanup_ma;
+ goto cleanup_sigmask;
}
tdb_robust_mutex_pid = fork();
@@ -884,16 +898,9 @@ _PUBLIC_ bool tdb_runtime_check_for_robust_mutexes(void)
}
while (tdb_robust_mutex_pid > 0) {
- pid_t pid;
-
- errno = 0;
- pid = waitpid(tdb_robust_mutex_pid, &status, 0);
- if (pid == tdb_robust_mutex_pid) {
- tdb_robust_mutex_pid = -1;
- break;
- }
- if (pid == -1 && errno != EINTR) {
- goto cleanup_child;
+ ret = sigsuspend(&suspend_mask);
+ if (ret != -1 || errno != EINTR) {
+ abort();
}
}
tdb_robust_mutex_setup_sigchild(tdb_robust_mutext_old_handler, NULL);
@@ -903,46 +910,44 @@ _PUBLIC_ bool tdb_runtime_check_for_robust_mutexes(void)
if (ret == 0) {
pthread_mutex_unlock(m);
}
- goto cleanup_m;
+ goto cleanup_sigmask;
}
ret = pthread_mutex_consistent(m);
if (ret != 0) {
- goto cleanup_m;
+ goto cleanup_sigmask;
}
ret = pthread_mutex_trylock(m);
if (ret != EDEADLK) {
pthread_mutex_unlock(m);
- goto cleanup_m;
+ goto cleanup_sigmask;
}
ret = pthread_mutex_unlock(m);
if (ret != 0) {
- goto cleanup_m;
+ goto cleanup_sigmask;
}
tdb_mutex_locking_cached = true;
- goto cleanup_m;
+ goto cleanup_sigmask;
cleanup_child:
while (tdb_robust_mutex_pid > 0) {
- pid_t pid;
-
kill(tdb_robust_mutex_pid, SIGKILL);
-
- errno = 0;
- pid = waitpid(tdb_robust_mutex_pid, &status, 0);
- if (pid == tdb_robust_mutex_pid) {
- tdb_robust_mutex_pid = -1;
- break;
- }
- if (pid == -1 && errno != EINTR) {
- break;
+ ret = sigsuspend(&suspend_mask);
+ if (ret != -1 || errno != EINTR) {
+ abort();
}
}
+
cleanup_sig_child:
tdb_robust_mutex_setup_sigchild(tdb_robust_mutext_old_handler, NULL);
+cleanup_sigmask:
+ ret = pthread_sigmask(SIG_SETMASK, &old_mask, NULL);
+ if (ret != 0) {
+ abort();
+ }
cleanup_m:
pthread_mutex_destroy(m);
cleanup_ma:
--
2.5.0
More information about the samba-technical
mailing list