[distcc] Re: upgrading from 2.16 to 2.17: compiler crashes

Martin Pool martinpool at gmail.com
Mon Oct 11 09:25:44 GMT 2004


> Maybe the signal handler is doing something unsafe.

That's not exactly it; or at least when thinking of it that way I
didn't understand the full problem.

The signal can go off when the application is in the middle of
malloc(), for example.  Not only can the signal handler not call
malloc, but we can't do anything on the stack until the signal handler
returns and the malloc() call completes.

One option is to not do anything on the stack, but rather to just exit
with a failure code.  That would almost be OK if you assume timeouts
are very rare, but they're probably not.

Therefore the signal handler has to return, and we have to detect the
expired timeout some other way.  Simply using select() with a timeout
around the blocking calls, or even just detecting EINTR from an alarm,
might be enough.
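
As a rough illustration of the select() approach (a sketch only, not
distcc's actual code): the helper name read_with_timeout() and its
return convention are assumptions made up for this example.  The
timeout is detected in ordinary control flow, so nothing unsafe has to
happen inside a signal handler.

```c
#include <assert.h>
#include <errno.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not part of distcc: wait up to timeout_secs for
 * fd to become readable, then read from it.  Returns the number of
 * bytes read, 0 on EOF, or -1 with errno set (ETIMEDOUT if nothing
 * arrived in time). */
ssize_t read_with_timeout(int fd, void *buf, size_t len, int timeout_secs)
{
    fd_set readfds;
    struct timeval tv;
    int ready;

    assert(timeout_secs >= 0);

    for (;;) {
        /* select() may modify both arguments, so rebuild them on
         * every pass through the loop. */
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);
        tv.tv_sec = timeout_secs;
        tv.tv_usec = 0;

        ready = select(fd + 1, &readfds, NULL, NULL, &tv);
        if (ready > 0)
            return read(fd, buf, len);
        if (ready == 0) {
            errno = ETIMEDOUT;      /* deadline passed, no data */
            return -1;
        }
        if (errno != EINTR)
            return -1;              /* real error from select() */
        /* Interrupted by some signal: try again.  For simplicity this
         * restarts the full timeout rather than tracking the time
         * already spent. */
    }
}
```

A caller would treat a return of -1 with errno == ETIMEDOUT as the
expired timeout and fall back to whatever it does on failure.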

The catch is name resolution.  I know of no portable way to set a
timeout on gethostbyname(), and the default is pretty long.  It would
be good to cover this case, since it's reasonably common for e.g. a
laptop to be unable to reach its nameserver.

It's impossible to work around this using signals; if we interrupt the
resolver it may leave its internal state in an arbitrary mess.  (Which
should have been obvious...)

There are a few options:

0- just suffer; or rather tell people to fix their nameserver or turn
down the timeout in resolv.conf

1- use the resolver interface from resolv.h directly, which can take a
timeout but might be nonportable

2- use an asynchronous DNS library; should be easy but adds another
dependency and/or thing that must be bundled

3- run lookups in a separate process, which can be killed off (a bit gross)

I guess #0 is probably the best thing.
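
For comparison, option #3 could look roughly like the following.  This
is a sketch under assumptions, not a proposed patch: the function name
lookup_with_timeout(), the IPv4-only result, and the use of a pipe to
return the address are all choices made for the example.  The blocking
gethostbyname() call runs in a child, so the parent can simply kill it
when the timeout expires and its own resolver state is never touched.

```c
#include <assert.h>
#include <errno.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <signal.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not part of distcc: resolve host to an IPv4
 * address in a child process, giving up after timeout_secs.  Returns 0
 * and fills *out on success, -1 on lookup failure or timeout. */
int lookup_with_timeout(const char *host, int timeout_secs,
                        struct in_addr *out)
{
    int pipefd[2];
    pid_t pid;
    fd_set readfds;
    struct timeval tv;
    int ready;
    int result = -1;

    assert(timeout_secs >= 0);

    if (pipe(pipefd) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        close(pipefd[0]);
        close(pipefd[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: do the blocking lookup and send back the address. */
        struct hostent *he;
        ssize_t n;

        close(pipefd[0]);
        he = gethostbyname(host);
        if (he == NULL || he->h_addrtype != AF_INET
            || he->h_length != (int) sizeof(struct in_addr))
            _exit(1);
        n = write(pipefd[1], he->h_addr_list[0], sizeof(struct in_addr));
        _exit(n == (ssize_t) sizeof(struct in_addr) ? 0 : 1);
    }

    /* Parent: wait for the answer, but only for so long. */
    close(pipefd[1]);
    do {
        FD_ZERO(&readfds);
        FD_SET(pipefd[0], &readfds);
        tv.tv_sec = timeout_secs;
        tv.tv_usec = 0;
        ready = select(pipefd[0] + 1, &readfds, NULL, NULL, &tv);
    } while (ready == -1 && errno == EINTR);

    if (ready > 0
        && read(pipefd[0], out, sizeof *out) == (ssize_t) sizeof *out)
        result = 0;
    else
        kill(pid, SIGKILL);     /* timed out, or the child failed */

    close(pipefd[0]);
    while (waitpid(pid, NULL, 0) == -1 && errno == EINTR)
        ;                       /* reap the child, retrying on EINTR */
    return result;
}
```

This costs a fork() per lookup, which is part of why it feels gross,
but the parent never interrupts its own resolver.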

-- 
Martin


