[distcc] Distcc crashes Linux (Intel e1000 ethernet driver?)

Martin Pool mbp at sourcefrog.net
Tue Aug 3 15:47:38 GMT 2004


On  3 Aug 2004, mhuhtala at abo.fi wrote:
> 
> This is probably a Linux e1000 driver problem, but I figured I'd ask on
> this list whether anyone else has seen it.
> 
> We run distcc in a server cluster. Running a large distcc compilation on
> 4 to 8 cluster nodes via rsh causes about half of the nodes running
> distccd to crash, seemingly at random. The entire system goes down, the
> crashed nodes do not respond to ping etc. Sometimes the e1000 network
> driver module fails to start upon reboot. A second reboot always brings
> the system and the e1000 interface up correctly. The same distcc and OS
> version work ok on desktop systems that use fast ethernet and other
> network drivers.

Yes, this is probably a kernel problem, not a distcc problem.  It's
probably this one:

  http://distcc.samba.org/faq.html#tg3-panic

It's not SMP-specific but it is timing-dependent.  It may be that only
the SMP machines are fast enough to make it happen.

So you can either upgrade your kernel or try setting DISTCC_MMAP=0.
Please let me know how it works out.
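
A minimal sketch of the second option, assuming a POSIX shell. Only the
variable name and value come from the advice above. Where it has to be set
is an assumption: presumably in the environment of whichever distcc process
runs on the crashing nodes, so the commented-out restart line is
illustrative only.

```shell
# Disable distcc's mmap-based file I/O for this shell and its children.
DISTCC_MMAP=0
export DISTCC_MMAP

# Assumption: the setting must be visible to the distcc process itself,
# so restart distccd from this environment on each affected node, e.g.:
#   distccd --daemon
# and start the client-side build from a shell where it is also set.
```

If the crashes stop with this setting, that points at the kernel issue
rather than at distcc.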

-- 
Martin


