[PATCH] Fix a pthreadpool race

Christof Schmitt cs at samba.org
Tue Dec 12 20:02:46 UTC 2017


On Tue, Dec 12, 2017 at 01:40:29PM +0100, Volker Lendecke via samba-technical wrote:
> On Tue, Dec 12, 2017 at 06:57:22AM +1300, Gary Lockyer wrote:
> > Stack trace with symbols attached.
> 
> Thanks. This is pretty easy to reproduce in the new cmocka test by
> Christof Schmitt. Just run
> 
> while : ; do valgrind --tool=helgrind --num-callers=50 bin/pthreadpooltest_cmocka; done
> 
> and wait a few seconds. I'm getting
> 
> ==5733== ---Thread-Announcement------------------------------------------
> ==5733==
> ==5733== Thread #1 is the program's root thread
> ==5733==
> ==5733== ----------------------------------------------------------------
> ==5733==
> ==5733== Thread #1: Attempt to re-lock a non-recursive lock I already hold
> ==5733==    at 0x4C30044: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
> ==5733==    by 0x10AC08: pthreadpool_destroy (pthreadpool.c:360)
> ==5733==    by 0x10C77F: pthreadpool_tevent_destructor (pthreadpool_tevent.c:111)
> ==5733==    by 0x56887BD: _tc_free_internal (talloc.c:1078)
> ==5733==    by 0x568491B: _tc_free_children_internal (talloc.c:1593)
> ==5733==    by 0x568899C: _tc_free_internal (talloc.c:1104)
> ==5733==    by 0x568491B: _tc_free_children_internal (talloc.c:1593)
> ==5733==    by 0x568899C: _tc_free_internal (talloc.c:1104)
> ==5733==    by 0x5683C61: _talloc_free_internal (talloc.c:1174)
> ==5733==    by 0x5684B53: _talloc_free (talloc.c:1716)
> ==5733==    by 0x10A5D4: teardown_pthreadpool_tevent (tests_cmocka.c:60)
> ==5733==    by 0x50622AC: cmocka_run_one_test_or_fixture (cmocka.c:2628)
> ==5733==    by 0x5060C1D: cmocka_run_one_tests (cmocka.c:2752)
> ==5733==    by 0x5060414: _cmocka_run_group_tests (cmocka.c:2839)
> ==5733==    by 0x10A307: main (tests_cmocka.c:175)
> ==5733==  Lock was previously acquired
> ==5733==    at 0x4C3010C: ??? (in /usr/lib/valgrind/vgpreload_helgrind-amd64-linux.so)
> ==5733==    by 0x10AFA6: pthreadpool_add_job (pthreadpool.c:636)
> ==5733==    by 0x10CADB: pthreadpool_tevent_job_send (pthreadpool_tevent.c:297)
> ==5733==    by 0x10A641: test_create_do (tests_cmocka.c:101)
> ==5733==    by 0x10A465: test_create (tests_cmocka.c:160)
> ==5733==    by 0x50621EC: cmocka_run_one_test_or_fixture (cmocka.c:2616)
> ==5733==    by 0x5060B0B: cmocka_run_one_tests (cmocka.c:2724)
> ==5733==    by 0x5060414: _cmocka_run_group_tests (cmocka.c:2839)
> ==5733==    by 0x10A307: main (tests_cmocka.c:175)
> 
> and I am completely confused. How can we leave the mutex locked when
> leaving pthreadpool_add_job... Looking further...

cmocka keeps the global list to pass input to the mock function. Data is
appeneded through will_return and then the mock function calls mock_type
to dequeue an entry and act on it.

cmocka verifies that for each test, the number of will_return calls
matches the number of calls to mock_type. That could be a problem in
this test case:

	/*
	 * When a thread can be created, the job will run in the worker thread.
	 */
	will_return(__wrap_pthread_create, 0);
	ret = test_create_do(t->ev, t->pool, &in_main_thread);
	assert_return_code(ret, 0);
	assert_false(in_main_thread);

In the above, pthread_create will be called once, so having one entry passed
through will_return is always correct.

	/*
	 * Workerthread will still be active for a second; immediately
	 * running another job will also use the worker thread, even
	 * if a new thread cannot be created.
	 */
	ret = test_create_do(t->ev, t->pool, &in_main_thread);
	assert_return_code(ret, 0);
	assert_false(in_main_thread);

The assumption here is that no call to pthread_create is done and hence no call
to will_return is required. What likely happened is that the worker thread
exited, then test_create_do triggered the creation of a new pthread, which in
turn caused a call to mock_type. As there is no entry in the global list,
cmocka tried to abort through a jump while still holding the mutex.

Since this now appears to be a problem in some test environments, i would
suggest to simply remove the last part of the test (the call to test_create_do
without a previous will_return).

Christof



More information about the samba-technical mailing list