[distcc] Re: upgrading from 2.16 to 2.17: compiler crashes

Dimitri Papadopoulos-Orfanos papadopo at www.NOSPAM.fr
Thu Aug 26 11:12:06 GMT 2004


I'm still struggling to analyze the distcc 2.17 crashes.

Martin, maybe you can have a look at this if you have time. I'm sure you 
can help me find out what's wrong with the new timeout code.

I've tried to instrument distcc 2.17 with Insure++. I did get some 
runtime errors, such as uninitialized reads, but they're unrelated and 
I'd rather fix the crash before looking into them.

The result is that distcc crashes Insure++ just like it crashes Valgrind:

### Unix/Signal.cc:332: panic: received signal 11 while in runtime
###  @(#)$RCSfile: Signal.cc,v $ $Revision: 32.52 $ $Date: 2003/07/28 16:15:14 $
### ThisThread.cc:593: abort
###  @(#)$RCSfile: ThisThread.cc,v $ $Revision: $ $Date: 2003/08/01 22:37:30 $

At least this seems to indicate that something's wrong in the timeout 
signal handler. Also the comments from the Valgrind team were:

| That's an ENTER instruction with a non-zero nesting level. It
| sounds like a pretty unusual instruction to be using - are you
| sure your program isn't jumping through a bad pointer somewhere?

| If the pointer is undefined and valgrind realises that it is
| undefined then it should warn you. It could be well defined but
| bogus however, in which case valgrind wouldn't be able to help.

See also this thread on the Valgrind-users mailing list:

It seems distcc is jumping to some bogus address somewhere in the timeout 
handlers. Do you have a clue where that could be happening?
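
To make the suspected failure mode concrete, here is a minimal sketch of a 
SIGALRM timeout that jumps out of its handler with siglongjmp(). It is not 
distcc's code: it assumes the timeout is implemented as a signal handler 
that jumps back to a saved context, and the names run_with_timeout, 
on_alarm, timeout_env, timeout_armed, work_finishes and work_times_out are 
all hypothetical. The point is the guard flag: if the handler jumps through 
a context whose stack frame has already returned, the behaviour is 
undefined and may well show up as a jump to a bogus address.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Saved context; only valid while run_with_timeout() is on the stack. */
static sigjmp_buf timeout_env;
/* Non-zero only while timeout_env may safely be jumped to. */
static volatile sig_atomic_t timeout_armed = 0;

/* SIGALRM handler: jump out only if the saved context is still live. */
void on_alarm(int signo)
{
    (void)signo;
    if (!timeout_armed)
        return;              /* stale alarm: jumping now would be undefined */
    timeout_armed = 0;
    siglongjmp(timeout_env, 1);
}

/* Run work() with a time limit.
 * Returns 0 if work() finished, -1 on timeout, -2 if setup failed. */
int run_with_timeout(unsigned int seconds, void (*work)(void))
{
    struct sigaction sa, old_sa;
    int result;

    assert(work != NULL);

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGALRM, &sa, &old_sa) != 0)
        return -2;

    /* Second argument 1: save the signal mask so siglongjmp() restores it. */
    if (sigsetjmp(timeout_env, 1) == 0) {
        timeout_armed = 1;
        alarm(seconds);
        work();
        result = 0;
    } else {
        result = -1;         /* reached via siglongjmp() from on_alarm() */
    }

    /* Disarm before this frame returns, then cancel any pending alarm. */
    timeout_armed = 0;
    alarm(0);
    sigaction(SIGALRM, &old_sa, NULL);
    return result;
}

/* Hypothetical workloads for trying the sketch out. */
void work_finishes(void)
{
}

void work_times_out(void)
{
    raise(SIGALRM);          /* simulate the alarm firing in mid-work */
}
```

Calling run_with_timeout(5, work_finishes) should return 0, and 
run_with_timeout(5, work_times_out) should return -1. Dropping the 
timeout_armed check, or leaving the handler installed after the function 
returns, gives a handler that can jump through a dead context, which is one 
possible way (among others) to end up executing at a garbage address.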

