Different problem (Re: [distcc] Problems with distcc hanging on large compiles (Patch not effective))

Andreas Granig andreas.granig at infonova.com
Thu Aug 29 23:47:00 GMT 2002


If I understood correctly, your problem is a hang on the client side? I
have the same problem here. The strange thing is that it only occurs
when distributing a job to one specific machine (client is Debian
unstable on 2.4.18, daemon is Debian stable on 2.2.20); all other
machines (Debian stable/unstable, RedHat, SuSE) run fine :o/

The client blocks in io.c, in dcc_pump_readwrite(...), while read()ing
the successfully compiled .o file. "wanted" is e.g. 150000 bytes, but I
only read 149050 and then read() blocks. It seems that in some
circumstances either "wanted" is calculated wrongly on the daemon side or
some bytes of the .o file get lost along the way...
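
To illustrate the failure mode, here is a minimal sketch of a bounded read
loop. It is not distcc code: read_exact(), the read_status values and the
timeout_ms parameter are hypothetical names made up for this example. The
idea is that a reader which trusts "wanted" can at least report how many
bytes really arrived, and tell a stalled peer apart from a closed one,
instead of sitting in read() forever.

```c
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical status codes for this sketch only. */
enum read_status {
    READ_ERROR   = -1,  /* poll() or read() failed */
    READ_OK      = 0,   /* all "wanted" bytes arrived */
    READ_EOF     = 1,   /* peer closed before "wanted" bytes arrived */
    READ_TIMEOUT = 2    /* no data within timeout_ms */
};

/*
 * Read exactly "wanted" bytes from fd into buf.
 * *got always holds the number of bytes actually received, so a caller
 * can log e.g. "wanted 150000, got 149050" when the transfer stalls.
 */
int read_exact(int fd, char *buf, size_t wanted, int timeout_ms, size_t *got)
{
    *got = 0;
    while (*got < wanted) {
        struct pollfd pfd = { .fd = fd, .events = POLLIN };
        int ready = poll(&pfd, 1, timeout_ms);

        if (ready == 0)
            return READ_TIMEOUT;        /* peer is silent: do not block */
        if (ready < 0) {
            if (errno == EINTR)
                continue;               /* interrupted by a signal, retry */
            return READ_ERROR;
        }

        ssize_t n = read(fd, buf + *got, wanted - *got);
        if (n == 0)
            return READ_EOF;            /* short file: sender closed early */
        if (n < 0) {
            if (errno == EINTR || errno == EAGAIN)
                continue;
            return READ_ERROR;
        }
        *got += (size_t)n;              /* short reads are normal on sockets */
    }
    return READ_OK;
}
```

With something like this, the symptom above would show up as READ_TIMEOUT
with *got == 149050 rather than as a hung client, which would at least
narrow down whether the length or the payload is wrong.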

A little more info:

*** client ***

 ** netstat **
[agranig@azrael:agranig]$ netstat -pnat|grep distcc
 tcp        0      0
 ESTABLISHED 13575/distcc

 ** gdb **
[agranig@azrael:agranig]$ gdb /usr/local/bin/distcc
GNU gdb 2002-08-18-cvs
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-linux"...
(gdb) attach 13575
Attaching to program: /usr/local/bin/distcc, process 13575
Reading symbols from /lib/libnsl.so.1...done.
Loaded symbols for /lib/libnsl.so.1
Reading symbols from /lib/libpopt.so.0...done.
Loaded symbols for /lib/libpopt.so.0
Reading symbols from /lib/libc.so.6...done.
Loaded symbols for /lib/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
Reading symbols from /lib/libnss_files.so.2...done.
Loaded symbols for /lib/libnss_files.so.2
Reading symbols from /lib/libnss_dns.so.2...done.
Loaded symbols for /lib/libnss_dns.so.2
Reading symbols from /lib/libresolv.so.2...done.
Loaded symbols for /lib/libresolv.so.2
0x4010abb4 in read () from /lib/libc.so.6
(gdb) backtrace full
#0  0x4010abb4 in read () from /lib/libc.so.6
No symbol table info available.
#1  0x4015ddd0 in __check_rhosts_file () from /lib/libc.so.6
No symbol table info available.
#2  0x0804d7d5 in dcc_r_fd (ifd=5, ofd=6, token=0x804e307 "DOTO", size_out=0x0)
    at bulk.c:198
        len = 206736
#3  0x0804d6a5 in dcc_r_file (ifd=5, filename=0xbffff609 "src/.debug/IString.o",
    token=0x804e307 "DOTO", size_out=0x0) at bulk.c:170
        ofd = 6
        ret = 388167
#4  0x080495b2 in dcc_compile_remote (argv=0x8050ed8,
    cpp_fname=0x8052178 "/tmp/distcc_000003e8/cppout_0000013575.i",
    output_fname=0xbffff609 "src/.debug/IString.o", cpp_pid=13605,
    host=0x8050f58, status=0xbffff398) at distcc.c:179
        stime_usec = 10000
        utime_usec = 90000
#5  0x08049860 in dcc_build_somewhere (argv=0x8050ed8, status=0xbffff398)
    at distcc.c:312
        input_fname = 0xbffff5f6 "src/IString.cxx"
        output_fname = 0xbffff609 "src/.debug/IString.o"
        cpp_fname = 0x8052178 "/tmp/distcc_000003e8/cppout_0000013575.i"
        cpp_pid = 13605
        ret = 0
        host = (struct dcc_hostdef *) 0x8050f58
#6  0x08049a56 in main (argc=14, argv=0xbffff404) at distcc.c:374
        status = 0

*** daemon ***

 ** netstat **
[agranig@corelli:agranig]$ netstat -pnat|grep distccd
tcp        0      0  *               LISTEN      22789/distccd

 ** ps **
[agranig@corelli:agranig]$ ps auxw|grep distccd
agranig  22789  0.0  0.0  1380  136 ?        SN   Aug28   0:00 src/distccd --concurrent 1 --nice 5 --log-file=/home/agranig/distccd_corelli.log --verbose

The logfile doesn't look very interesting, the process terminated

Btw, Martin, what about the idea/patch for task limitation I sent
you by mail last week? Have you had a look at it yet?

