[distcc] [PATCH] retry compiling when remote gcc was killed by signal

Eric Frias efrias at goblin.syncad.com
Thu Jul 29 14:07:11 GMT 2004


Jakub Stachowski wrote:
> I noticed that return codes > 128 (which mean the compiler was killed by a 
> signal) are treated exactly like a normal compilation error.  However, a 
> killed compiler indicates a problem with the system (in most cases not 
> enough memory) or with the compiler itself, not with the source code.  So 
> there is still a chance that retrying the compilation on the local machine 
> will succeed.  Typical example: I compile KDE (lots of big C++ files) on 
> several machines with 64-128MB of RAM ('localhost' has 512MB RAM).  I use 
> distccKnoppix so there is no way to use swap on these machines.  Sometimes 
> one of them runs out of memory, gcc gets killed, and make finishes with an 
> error.  Then I have to restart make with -j1 (to build the offending file 
> on localhost, which has enough RAM), then restart it again with -j8 and 
> hope for the best.

Sorry for jumping in so late on this thread, I just joined the list today.  
This feature is something I've wanted for a long time.  I haven't tried 
this patch yet (I'm having other problems with distcc I'll complain about 
in a future message), but the description sounds like exactly what I've 
been looking for.
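
For anyone who hasn't read the patch, the idea as I understand it is roughly
the following.  This is only a sketch of the concept, not the actual patch:
run_cmd() is a throwaway helper invented for the example, and in the real
client the retry would of course be dispatched to localhost rather than
re-run in place.

    /* Sketch of the idea only -- not the actual distcc patch. */
    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/wait.h>
    #include <unistd.h>

    /* Run argv[0] with the given arguments, return the raw wait status. */
    static int run_cmd(char *const argv[])
    {
        pid_t pid = fork();
        if (pid < 0)
            return -1;                      /* treat fork failure as an error */
        if (pid == 0) {
            execvp(argv[0], argv);
            _exit(127);                     /* exec itself failed */
        }
        int status = 0;
        waitpid(pid, &status, 0);
        return status;
    }

    int main(int argc, char *argv[])
    {
        if (argc < 2) {
            fprintf(stderr, "usage: %s compiler [args...]\n", argv[0]);
            return 2;
        }

        int status = run_cmd(&argv[1]);

        if (WIFSIGNALED(status)) {
            /* The compiler died from a signal (OOM killer, etc.), not from
             * an error in the source, so retrying -- on localhost, in
             * distcc's case -- still has a chance of succeeding. */
            fprintf(stderr, "compiler killed by signal %d, retrying\n",
                    WTERMSIG(status));
            status = run_cmd(&argv[1]);
        }

        return WIFEXITED(status) ? WEXITSTATUS(status) : 1;
    }

A shell wrapper reports that kind of death as an exit code of 128 plus the
signal number, which I assume is where the "return codes > 128" in Jakub's
message comes from.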

I use distcc to speed up compile jobs on both linux and solaris.  All of the
other machines in the distcc farm are win2k/xp cygwin cross-compilers.  
All machines have 1G of RAM (most of the windows boxes were upgraded to 1G
just to run distcc).  The windows boxes are not dedicated to distcc -- they
are normal desktop systems other developers use, and they are sometimes
running local compiles of their own (with visual studio), which can eat up
resources temporarily.  It also seems that a large compile job our linux
machine can handle will sometimes run out of virtual memory if it is
distributed to a similarly configured windows box, even when that box isn't
heavily loaded.

Most of the files we compile are medium sized, but we have a handful of
files that are huge, usually generated by an IDL compiler.  With these,
the .cpp source file is larger than one megabyte, and when preprocessed it
is three megabytes or more.  These files often cause internal compiler
errors in g++ on the windows machines.  Initially, I had to do the same
thing as Jakub: manually 'make -j1' to compile that file, then break and
restart with -j15.  After a while, I patched our distcc client to force any
file greater than some arbitrary size (around 2M) to compile locally, and I
kept lowering that size until the errors mostly stopped.  What I really
wanted was a way to make it retry remote failures locally, but it wasn't
immediately obvious to me how to do that, so I made a simpler but uglier
fix.
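
In case it's useful to anyone, the check amounted to something like this
(paraphrased from memory, not real distcc code; the 2M threshold and the
must_compile_locally() name are just what I used, nothing official):

    #include <stdbool.h>
    #include <sys/types.h>
    #include <sys/stat.h>

    /* Picked by trial and error and lowered until the ICEs mostly stopped. */
    #define LOCAL_ONLY_THRESHOLD (2L * 1024 * 1024)    /* roughly 2M */

    /* Return true if the source file is big enough that a remote g++ is
     * likely to blow up on it, so it should be compiled on localhost. */
    bool must_compile_locally(const char *source_path)
    {
        struct stat st;
        if (stat(source_path, &st) != 0)
            return false;               /* can't stat it; let distcc decide */
        return st.st_size > LOCAL_ONLY_THRESHOLD;
    }

It worked well enough, but the retry approach in the patch is obviously the
better fix, since it doesn't penalize large files that would have compiled
fine remotely.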

Anyway, I just wanted to say thanks for the patch -- I look forward to
trying it.  This feature is definitely something I'd like to see become
part of the regular distcc distribution (even if it has to be enabled with
an environment variable).  And thanks to Martin and the rest of the distcc
developers...  distcc has made developing a large project on unix so much
more enjoyable.

Eric


