Race condition in tdb_runtime_check_for_robust_mutexes()
Uri Simchoni
uri at samba.org
Tue Mar 22 20:06:15 UTC 2016
I saw smbd hang on startup, and was lucky enough to get a decent stack
trace showing that it hangs in tdb_runtime_check_for_robust_mutexes(),
line 890, and that the child process it is waiting for is gone.
After some head scratching I think I may have figured it out:
886         while (tdb_robust_mutex_pid > 0) {
887                 pid_t pid;
888
889                 errno = 0;
                    /* BAM! SIGCHLD arrives here: the handler collects the
                       exit status and tdb_robust_mutex_pid becomes -1 */
890                 pid = waitpid(tdb_robust_mutex_pid, &status, 0);
                    /* this is now waitpid(-1, ...): wait for ANY child
                       process to finish - hang */
891                 if (pid == tdb_robust_mutex_pid) {
892                         tdb_robust_mutex_pid = -1;
893                         break;
894                 }
895                 if (pid == -1 && errno != EINTR) {
896                         goto cleanup_child;
897                 }
898         }
And the question, assuming this is correct, is: why do we have to call
waitpid() in the signal handler? (I understand we need the signal handler
to cope with SIG_IGN, since this is a generic library and we don't know
how the application has set up its signals.)
Also, it seems that tdb_robust_mutex_setup_sigchild() doesn't necessarily
restore the SIGCHLD disposition exactly to the way it was.
Comments?
Thanks,
Uri.