[PATCH] Handle SIGCHLD better in process_standard, remove process_{onefork,prefork}

Andrew Bartlett abartlet at samba.org
Mon Mar 2 17:05:20 MST 2015


On Mon, 2015-03-02 at 16:21 +0100, Volker Lendecke wrote:
> On Mon, Mar 02, 2015 at 05:14:32PM +1300, Andrew Bartlett wrote:
> > This patch set shows a good (better?) way to handle cleaning up dead
> > sockets from child process, because rather than expecting SIGCHLD and
> > handling that (doing a waitpid on all pids, including a PID that someone
> > else is hoping to clean up), we use the read event on the socket to know
> > exactly which child exited.  
> > 
> > You might want to do something similar in source3/ or hook the source3
> > process termination code up to this code. 
> 
> In source3/smbd/server.c we have a routine smbd_sig_chld_handler() that
> is able to figure out which child died. We maintain a list of normal smbd
> worker children, and in remove_child_pid() we have a DEBUG message that
> fires when an "unknown" child according to our list exits. This works
> pretty well even under high load. We respond to the MSG_SMB_NUM_CHILDREN
> control and this seems very reliable according to our tests.
> 
> I'm happy to improve this, but right now I don't see a problem with
> the smbd code. Can you clarify what the new approach fixes beyond what
> we have?

The difference is that this code knows what the PID is for waitpid(),
rather than using a PID of -1: 	while ((pid = sys_waitpid(-1, &status,
WNOHANG)) > 0) {
		bool unclean_shutdown = False;

The issue with waitpid() of -1, as I understand it, is that no other
waitpid (expecting a specific pid) can be called in that codebase,
because this one will preempt the more specific call, say in
samba_runcmd_recv(). 

Even without using the pattern I show, the code above could instead loop
with waitpid() over all the PIDs in the list, one at a time, testing to
see if this was the one that caused SIGCHLD.

What my approach does is use the fd enrolled in epoll to identify which
PID it is, directly.  

That said, the performance difference between these last two approaches
may not matter, and the pattern of children in smbd compared with
bin/samba may mean that even if samba_runcmd was used to improve the way
we run scripts in smbd, it might not matter.  

Either way, I figured you might be interested in the approach, and in
any case it is clear we need to have abnormal process termination
handling in source4 as well, which this now allows. 

Thanks,

Andrew Bartlett

-- 
Andrew Bartlett
http://samba.org/~abartlet/
Authentication Developer, Samba Team  http://samba.org
Samba Developer, Catalyst IT          http://catalyst.net.nz/services/samba






More information about the samba-technical mailing list