SIGPIPE problems

Andrew Tridgell tridge at samba.anu.edu.au
Fri Aug 28 15:36:58 GMT 1998


John asked me to look at a couple of recent bug reports for 1.9.18p10
where nmbd crashed with a SIGPIPE. I think I've worked out what is
going on.

In open_socket_out() we do some rather complex stuff to ensure that we
can control the timeout for opening a TCP connection. What we do is set
the socket non-blocking before the connect() call and loop until our
timeout passes, re-trying the connect() each time.
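
To make the pattern concrete, here is a minimal sketch of that kind of
retry loop. It is not the actual open_socket_out() code: the name
open_socket_out_sketch, its parameters and the 10 ms retry interval are
assumptions made up for illustration, and it only handles IPv4.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical sketch, NOT the real open_socket_out(): open a TCP
 * connection to ip:port, giving up after roughly timeout_ms.
 * Returns a connected, blocking socket, or -1 on failure. */
int open_socket_out_sketch(const char *ip, int port, int timeout_ms)
{
    const int step_ms = 10;                 /* assumed retry interval */
    struct timespec delay = { 0, step_ms * 1000000L };
    struct sockaddr_in addr;
    int waited_ms = 0;
    int fd, flags;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons((unsigned short)port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    /* Non-blocking, so connect() returns at once instead of waiting. */
    flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fd);
        return -1;
    }

    /* Re-try connect() until it succeeds, fails hard, or we time out.
     * This repeated connect() is the call involved in the race. */
    for (;;) {
        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) == 0 ||
            errno == EISCONN)
            break;                          /* connected */
        if ((errno != EINPROGRESS && errno != EALREADY &&
             errno != EINTR) || waited_ms >= timeout_ms) {
            close(fd);
            return -1;                      /* hard error or timeout */
        }
        nanosleep(&delay, NULL);
        waited_ms += step_ms;
    }

    /* Back to blocking mode for the caller. */
    if (fcntl(fd, F_SETFL, flags) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A caller would use something like
open_socket_out_sketch("127.0.0.1", 139, 2000) and treat -1 as "could
not connect within the timeout".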

In most cases this works, but unfortunately there is a nasty race
condition that can happen if the destination host is not reachable or
is not listening on that port.

What I think happens is this: the initial connect() causes the SYN to
be sent and we get EINPROGRESS back from connect(). The remote host
then sends us an ICMP error and the local OS marks the socket as
invalid. When we call connect() again we get a SIGPIPE because we are
operating on an invalid socket. Unfortunately we have set SIGPIPE as
fatal in nmbd (we didn't expect to get any), so nmbd just exits.

I think that the above scenario might explain a fairly large number of
the "nmbd exits" bug reports we are getting.

The obvious fix is to catch SIGPIPE (reinstalling the signal handler
if necessary). Any other suggestions? Any better ways of getting
timeouts on connect()?
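
As a rough illustration of both questions, the sketch below shows one
possible shape for each. It is not code from Samba: ignore_sigpipe and
connect_select_sketch are hypothetical names, and the second function
only handles IPv4. The first function ignores SIGPIPE rather than
catching it, using sigaction(), which keeps the disposition installed
so no reinstalling is needed. The second shows one commonly used
alternative for connect() timeouts: call connect() once on a
non-blocking socket, wait for writability in select(), then read the
result with getsockopt(SO_ERROR) instead of calling connect() again.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Ignore SIGPIPE so that operating on a dead socket fails with an
 * errno value instead of killing the process. Returns 0 on success. */
int ignore_sigpipe(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = SIG_IGN;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGPIPE, &sa, NULL);
}

/* Hypothetical sketch: connect to ip:port with a timeout, calling
 * connect() only once. Returns a connected, blocking socket or -1.
 * A select() interrupted by a signal is treated as a failure here. */
int connect_select_sketch(const char *ip, int port, int timeout_ms)
{
    struct sockaddr_in addr;
    struct timeval tv;
    fd_set wfds;
    socklen_t errlen;
    int fd, flags, rc, soerr = 0;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons((unsigned short)port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return -1;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd == -1)
        return -1;

    flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) {
        close(fd);
        return -1;
    }

    rc = connect(fd, (struct sockaddr *)&addr, sizeof(addr));
    if (rc == -1 && errno == EINPROGRESS) {
        FD_ZERO(&wfds);
        FD_SET(fd, &wfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        errlen = sizeof(soerr);
        /* Writable means the attempt has finished, well or badly;
         * SO_ERROR then says which, without a second connect(). */
        if (select(fd + 1, NULL, &wfds, NULL, &tv) == 1 &&
            getsockopt(fd, SOL_SOCKET, SO_ERROR, &soerr, &errlen) == 0 &&
            soerr == 0)
            rc = 0;
    }

    /* On success, hand back a blocking socket; otherwise clean up. */
    if (rc == -1 || fcntl(fd, F_SETFL, flags) == -1) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A daemon would call ignore_sigpipe() once at startup and then use
connect_select_sketch() wherever it needs a bounded connect.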

	 Cheers, Tridge

