[PATCHES][BUG 13170] Fix crash in pthreadpool thread after failure from pthread_create

Christof Schmitt cs at samba.org
Tue Dec 5 23:23:18 UTC 2017


On Tue, Dec 05, 2017 at 12:24:21PM -0700, Christof Schmitt via samba-technical wrote:
> On Mon, Dec 04, 2017 at 01:33:20PM +0100, Andreas Schneider via samba-technical wrote:
> > On Thursday, 30 November 2017 20:14:35 CET Christof Schmitt via samba-
> > technical wrote:
> > > On Thu, Nov 30, 2017 at 01:40:00PM +0100, Volker Lendecke wrote:
> > > > On Wed, Nov 29, 2017 at 12:23:34PM -0700, Christof Schmitt via samba-
> > technical wrote:
> > > > > As discussed in a side-thread, my initial patch lets the error surface
> > > > > back to the client. Here is an updated patch that avoids this issue. If
> > > > > no thread can be created, but another still exists, let that handle the
> > > > > request. If no thread exists, then fallback to processing the job
> > > > > synchronously, as the system might be hitting a resource limit.
> > > > 
> > > > From a review perspective, this looks good. For the refactoring one of
> > > > course RB+. However, pthreadpool being such a fragile piece of code I
> > > > would love to see unit tests for this.  This will require explicit
> > > > error injects with some DEVELOPER only code, but is it possible to get
> > > > this? If you don't have time, I might try to code something up.
> > > 
> > > Yes, see attached patch. I am not sure if that is the best solution, but
> > > it is easy enough. A more complicate approach would be LD_PRELOAD to
> > > overwrite the libc functions.
> > 
> > Use mocking support from cmocka to mock pthreadpool_create_thread()?
> > 
> > 
> > See https://lwn.net/Articles/558106/
> 
> Thanks for the pointer. Mocking the function completely does not work,
> as the good path requires creating an actual thread, but the --wrap
> option to override the call to pthread_create in the linker is useful.
> 
> cmocka looks useful and the pthreadpool tests could be converted to
> cmocka. I decided against this approach for now to minimize the changes
> and to also allow an easy backport to 4.6 where cmocka is not yet
> available.

Another small update. I forgot to clear the inject variable at the end
of the test. That is not relevant for the test added here, but it causes
problems when adding more tests after that.

Christof
-------------- next part --------------
From 3addce5db0504bc9850fc29b8fa77a327ea20a70 Mon Sep 17 00:00:00 2001
From: Christof Schmitt <cs at samba.org>
Date: Tue, 28 Nov 2017 10:49:36 -0700
Subject: [PATCH 1/3] pthreadpool: Move creating of thread to new function

No functional change, but this simplifies error handling.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13170

Signed-off-by: Christof Schmitt <cs at samba.org>
---
 lib/pthreadpool/pthreadpool.c | 79 ++++++++++++++++++++++++-------------------
 1 file changed, 45 insertions(+), 34 deletions(-)

diff --git a/lib/pthreadpool/pthreadpool.c b/lib/pthreadpool/pthreadpool.c
index 23885aa..4e1e5d4 100644
--- a/lib/pthreadpool/pthreadpool.c
+++ b/lib/pthreadpool/pthreadpool.c
@@ -521,14 +521,56 @@ static void *pthreadpool_server(void *arg)
 	}
 }
 
-int pthreadpool_add_job(struct pthreadpool *pool, int job_id,
-			void (*fn)(void *private_data), void *private_data)
+static int pthreadpool_create_thread(struct pthreadpool *pool)
 {
 	pthread_attr_t thread_attr;
 	pthread_t thread_id;
 	int res;
 	sigset_t mask, omask;
 
+	/*
+	 * Create a new worker thread. It should not receive any signals.
+	 */
+
+	sigfillset(&mask);
+
+	res = pthread_attr_init(&thread_attr);
+	if (res != 0) {
+		return res;
+	}
+
+	res = pthread_attr_setdetachstate(
+		&thread_attr, PTHREAD_CREATE_DETACHED);
+	if (res != 0) {
+		pthread_attr_destroy(&thread_attr);
+		return res;
+	}
+
+	res = pthread_sigmask(SIG_BLOCK, &mask, &omask);
+	if (res != 0) {
+		pthread_attr_destroy(&thread_attr);
+		return res;
+	}
+
+	res = pthread_create(&thread_id, &thread_attr, pthreadpool_server,
+			     (void *)pool);
+
+	assert(pthread_sigmask(SIG_SETMASK, &omask, NULL) == 0);
+
+	pthread_attr_destroy(&thread_attr);
+
+	if (res == 0) {
+		pool->num_threads += 1;
+	}
+
+	return res;
+}
+
+int pthreadpool_add_job(struct pthreadpool *pool, int job_id,
+			void (*fn)(void *private_data), void *private_data)
+{
+	int res;
+
 	res = pthread_mutex_lock(&pool->mutex);
 	if (res != 0) {
 		return res;
@@ -570,43 +612,12 @@ int pthreadpool_add_job(struct pthreadpool *pool, int job_id,
 		return 0;
 	}
 
-	/*
-	 * Create a new worker thread. It should not receive any signals.
-	 */
-
-	sigfillset(&mask);
-
-	res = pthread_attr_init(&thread_attr);
-	if (res != 0) {
-		pthread_mutex_unlock(&pool->mutex);
-		return res;
-	}
-
-	res = pthread_attr_setdetachstate(
-		&thread_attr, PTHREAD_CREATE_DETACHED);
+	res = pthreadpool_create_thread(pool);
 	if (res != 0) {
-		pthread_attr_destroy(&thread_attr);
-		pthread_mutex_unlock(&pool->mutex);
-		return res;
-	}
-
-        res = pthread_sigmask(SIG_BLOCK, &mask, &omask);
-	if (res != 0) {
-		pthread_attr_destroy(&thread_attr);
 		pthread_mutex_unlock(&pool->mutex);
 		return res;
 	}
 
-	res = pthread_create(&thread_id, &thread_attr, pthreadpool_server,
-			     (void *)pool);
-	if (res == 0) {
-		pool->num_threads += 1;
-	}
-
-        assert(pthread_sigmask(SIG_SETMASK, &omask, NULL) == 0);
-
-	pthread_attr_destroy(&thread_attr);
-
 	pthread_mutex_unlock(&pool->mutex);
 	return res;
 }
-- 
1.8.3.1


From 927d6f69a1177d8ade815b0256370a140b79d695 Mon Sep 17 00:00:00 2001
From: Christof Schmitt <cs at samba.org>
Date: Tue, 28 Nov 2017 10:59:06 -0700
Subject: [PATCH 2/3] pthreadpool: Undo put_job when returning error

When an error is returned to the caller of pthreadpool_add_job, the job
should not be kept in the internal job array. Otherwise the caller might
free the data structure and a later worker thread would still reference
it.

When it is not possible to create a single worker thread, the system
might be out of resources or hitting a configured limit. In this case
fall back to calling the job function synchronously instead of raising
the error to the caller and possibly back to the SMB client.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13170

Signed-off-by: Christof Schmitt <cs at samba.org>
---
 lib/pthreadpool/pthreadpool.c | 28 ++++++++++++++++++++++++++--
 1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/lib/pthreadpool/pthreadpool.c b/lib/pthreadpool/pthreadpool.c
index 4e1e5d4..309aba9 100644
--- a/lib/pthreadpool/pthreadpool.c
+++ b/lib/pthreadpool/pthreadpool.c
@@ -429,6 +429,11 @@ static bool pthreadpool_put_job(struct pthreadpool *p,
 	return true;
 }
 
+static void pthreadpool_undo_put_job(struct pthreadpool *p)
+{
+	p->num_jobs -= 1;
+}
+
 static void *pthreadpool_server(void *arg)
 {
 	struct pthreadpool *pool = (struct pthreadpool *)arg;
@@ -599,6 +604,9 @@ int pthreadpool_add_job(struct pthreadpool *pool, int job_id,
 		 * We have idle threads, wake one.
 		 */
 		res = pthread_cond_signal(&pool->condvar);
+		if (res != 0) {
+			pthreadpool_undo_put_job(pool);
+		}
 		pthread_mutex_unlock(&pool->mutex);
 		return res;
 	}
@@ -614,8 +622,24 @@ int pthreadpool_add_job(struct pthreadpool *pool, int job_id,
 
 	res = pthreadpool_create_thread(pool);
 	if (res != 0) {
-		pthread_mutex_unlock(&pool->mutex);
-		return res;
+		if (pool->num_threads == 0) {
+			/*
+			 * No thread could be created to run job,
+			 * fallback to sync call.
+			 */
+			pthreadpool_undo_put_job(pool);
+			pthread_mutex_unlock(&pool->mutex);
+
+			fn(private_data);
+			return pool->signal_fn(job_id, fn, private_data,
+					       pool->signal_fn_private_data);
+		}
+
+		/*
+		 * At least one thread is still available, let
+		 * that one run the queued job.
+		 */
+		res = 0;
 	}
 
 	pthread_mutex_unlock(&pool->mutex);
-- 
1.8.3.1


From c1f742002e5a5977c4b09035549694bc64e169a5 Mon Sep 17 00:00:00 2001
From: Christof Schmitt <cs at samba.org>
Date: Tue, 5 Dec 2017 11:48:10 -0700
Subject: [PATCH 3/3] pthreadpool: Add test for handling pthread_create failure

This test uses the --wrap option of the linker to simulate errors from
pthread_create.

BUG: https://bugzilla.samba.org/show_bug.cgi?id=13170

Signed-off-by: Christof Schmitt <cs at samba.org>
---
 lib/pthreadpool/tests.c       | 149 ++++++++++++++++++++++++++++++++++++++++++
 lib/pthreadpool/wscript_build |   1 +
 2 files changed, 150 insertions(+)

diff --git a/lib/pthreadpool/tests.c b/lib/pthreadpool/tests.c
index 1aab80c..58f52be 100644
--- a/lib/pthreadpool/tests.c
+++ b/lib/pthreadpool/tests.c
@@ -382,6 +382,149 @@ static int test_tevent_1(void)
 	return 0;
 }
 
+static int inject_pthread_create_failure;
+
+int __wrap_pthread_create(pthread_t *thread, const pthread_attr_t *attr,
+			  void *(*start_routine) (void *), void *arg);
+int __real_pthread_create(pthread_t *thread, const pthread_attr_t *attr,
+			  void *(*start_routine) (void *),  void *arg);
+
+int __wrap_pthread_create(pthread_t *thread, const pthread_attr_t *attr,
+			  void *(*start_routine) (void *), void *arg)
+{
+	if (inject_pthread_create_failure != 0) {
+		return inject_pthread_create_failure;
+	}
+
+	return __real_pthread_create(thread, attr, start_routine, arg);
+}
+
+static void test_job_threadid(void *ptr)
+{
+	pthread_t *threadid = ptr;
+
+	*threadid = pthread_self();
+}
+
+static int test_create_do(struct tevent_context *ev,
+			  struct pthreadpool_tevent *pool,
+			  bool *in_main_thread)
+{
+	struct tevent_req *req;
+	pthread_t main_thread, worker_thread;
+	bool ok;
+	int ret;
+
+	main_thread = pthread_self();
+
+	req = pthreadpool_tevent_job_send(
+		ev, ev, pool, test_job_threadid, &worker_thread);
+	if (req == NULL) {
+		fprintf(stderr, "pthreadpool_tevent_job_send failed\n");
+		TALLOC_FREE(ev);
+		return ENOMEM;
+	}
+
+	ok = tevent_req_poll(req, ev);
+	if (!ok) {
+		ret = errno;
+		fprintf(stderr, "tevent_req_poll failed: %s\n",
+			strerror(ret));
+		TALLOC_FREE(ev);
+		return ret;
+	}
+
+
+	ret = pthreadpool_tevent_job_recv(req);
+	TALLOC_FREE(req);
+	if (ret != 0) {
+		fprintf(stderr, "tevent_req_recv failed: %s\n",
+			strerror(ret));
+		TALLOC_FREE(ev);
+		return ret;
+	}
+	*in_main_thread = pthread_equal(worker_thread, main_thread);
+
+	return 0;
+}
+
+static int test_create(void)
+{
+	struct tevent_context *ev;
+	struct pthreadpool_tevent *pool;
+	int ret;
+	bool in_main_thread;
+
+	ev = tevent_context_init(NULL);
+	if (ev == NULL) {
+		ret = errno;
+		fprintf(stderr, "tevent_context_init failed: %s\n",
+			strerror(ret));
+		return ret;
+	}
+	ret = pthreadpool_tevent_init(ev, 0, &pool);
+	if (ret != 0) {
+		fprintf(stderr, "pthreadpool_tevent_init failed: %s\n",
+			strerror(ret));
+		TALLOC_FREE(ev);
+		return ret;
+	}
+
+	/*
+	 * When pthreadpool cannot create the first worker thread,
+	 * this job will run in the sync fallback in the main thread.
+	 */
+	inject_pthread_create_failure = EAGAIN;
+	ret = test_create_do(ev, pool, &in_main_thread);
+	if (ret != 0) {
+		fprintf(stderr, "First job failed.\n");
+		goto out;
+	}
+	if (!in_main_thread) {
+		fprintf(stderr, "First job did not run in main thread.\n");
+		ret = -1;
+		goto out;
+	}
+
+	/*
+	 * When a thread can be created, the job will run in the worker thread.
+	 */
+	inject_pthread_create_failure = 0;
+	ret = test_create_do(ev, pool, &in_main_thread);
+	if (ret != 0) {
+		fprintf(stderr, "Second job failed.\n");
+		goto out;
+	}
+	if (in_main_thread) {
+		fprintf(stderr, "Second job did not run in worker thread.\n");
+		ret = -1;
+		goto out;
+	}
+
+	/*
+	 * Workerthread will still be active for a second; immediately
+	 * running another job will also use the worker thread, even
+	 * if a new thread cannot be created.
+	 */
+	inject_pthread_create_failure = EAGAIN;
+	ret = test_create_do(ev, pool, &in_main_thread);
+	if (ret != 0) {
+		fprintf(stderr, "Third job failed.\n");
+		goto out;
+	}
+	if (in_main_thread) {
+		fprintf(stderr, "Third job did not run in worker thread.\n");
+		ret = -1;
+		goto out;
+	}
+
+out:
+	inject_pthread_create_failure = 0;
+	TALLOC_FREE(pool);
+	TALLOC_FREE(ev);
+	return ret;
+}
+
 int main(void)
 {
 	int ret;
@@ -423,6 +566,12 @@ int main(void)
 		return 1;
 	}
 
+	ret = test_create();
+	if (ret != 0) {
+		fprintf(stderr, "test_create failed\n");
+		return 1;
+	}
+
 	printf("success\n");
 	return 0;
 }
diff --git a/lib/pthreadpool/wscript_build b/lib/pthreadpool/wscript_build
index d530463..b60557b 100644
--- a/lib/pthreadpool/wscript_build
+++ b/lib/pthreadpool/wscript_build
@@ -19,5 +19,6 @@ else:
 bld.SAMBA_BINARY('pthreadpooltest',
                   source='tests.c',
                   deps='PTHREADPOOL',
+                  ldflags='-Wl,--wrap=pthread_create',
                   enabled=bld.env.WITH_PTHREADPOOL,
                   install=False)
-- 
1.8.3.1



More information about the samba-technical mailing list