[distcc] distcc 2.14 bugs in lzo code(?)

Arkadiusz Miskiewicz arekm at pld-linux.org
Tue May 25 14:55:44 GMT 2004


(Again; this time I've subscribed, so there is no need for the moderator to
approve my previous posting.)

Hi,

I'm having problems with distcc 2.14 - it segfaults when compiling
KDE 3.2.9x snapshots using their make replacement called unsermake.

I'm using it with 6 hosts (only one of them doesn't use LZO; the rest do).
All machines run PLD Linux 2.0 (development version) and
gcc is "gcc version 3.3.3 (PLD Linux)".

I run make -j10, with dmalloc set to the mode in which the program is
interrupted as soon as something bad is detected.

Core was generated by `/home/users/misiek/bin/distcc/g++ -DHAVE_CONFIG_H
 -I./kdecore -I./kdecore -I. -'.

Backtrace when running distcc linked with the dmalloc (www.dmalloc.com) library,
which protects and counts allocations, among other things:
#0  0x40048120 in _dmalloc_alloc_total () from /usr/lib/libdmalloc.so
#1  0x0804e5e3 in dcc_r_bulk_lzo1x (out_fd=7, in_fd=6, in_len=49525)
    at src/compress.c:326
#2  0x08052914 in dcc_r_bulk (ofd=7, ifd=6, f_size=49525,
    compression=DCC_COMPRESS_LZO1X) at src/pump.c:158
#3  0x080515dd in dcc_r_file (ifd=6,
    filename=0xbffff978 "./kdecore/.libs/kcalendarsystemgregorian.o",
    len=49525, compr=DCC_COMPRESS_LZO1X) at src/bulk.c:272
#4  0x080516fa in dcc_r_file_timed (ifd=6,
    fname=0xbffff978 "./kdecore/.libs/kcalendarsystemgregorian.o",
    size=49525, compr=DCC_COMPRESS_LZO1X) at src/bulk.c:305
#5  0x0804a2f0 in dcc_retrieve_results (net_fd=6, status=0xbffff3c4,
    output_fname=0xbffff978 "./kdecore/.libs/kcalendarsystemgregorian.o",
    host=0x80b6ec8) at src/clirpc.c:245
#6  0x0804b9af in dcc_compile_remote (argv=0x80b2f08,
    input_fname=0xbffff94e "./kdecore/kcalendarsystemgregorian.cpp",
    cpp_fname=0x80b6f88 "/home/users/misiek/tmp/distcc_2344f598.ii",
    output_fname=0xbffff978 "./kdecore/.libs/kcalendarsystemgregorian.o",
    cpp_pid=25602, host=0x80b6ec8, status=0xbffff3c4) at src/remote.c:171
#7  0x0804a4fa in dcc_build_somewhere (argv=0x80b2d08, sg_level=0,
    status=0xbffff3c4) at src/compile.c:157
#8  0x0804a65c in dcc_build_somewhere_timed (argv=0x80b2e08, sg_level=0,
    status=0xbffff3c4) at src/compile.c:208
#9  0x0804aa90 in main (argc=57, argv=0xbffff454) at src/distcc.c:217
(gdb) frame 1
#1  0x0804e5e3 in dcc_r_bulk_lzo1x (out_fd=7, in_fd=6, in_len=49525)
    at src/compress.c:326
326         free(in_buf);
(gdb) l
321             ret = EXIT_IO_ERROR;
322             goto out;
323         }
324
325     out:
326         free(in_buf);
327
328         if (is_mmapped) {
329             /* NOTE: We must ftruncate *after* unmapping -- it's unsafe to have
330              * mmapped data we care about past the end of the file. */
(gdb) print in_buf
$1 = 0x80d7008 "\005\177ELF\001\001\001"
(gdb) print ret
$2 = 0
(gdb) print is_mmapped
$3 = 1
(gdb) print out_buf
$4 = 0x40017000 "\177ELF\001\001\001"
(gdb) print out_size
$5 = 198100
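
In the session above, out_size (198100) is exactly four times in_len (49525). One possible explanation, and it is only a guess about what compress.c does, is that the output buffer starts at twice the input length and is doubled whenever the decompressor reports that it ran out of room. A minimal sketch of that pattern follows. The names demo_expand and demo_receive and the DEMO_* codes are made up for illustration, and demo_expand is a stand-in that merely repeats bytes; it is not LZO.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes for this sketch only. */
enum { DEMO_OK = 0, DEMO_OUTPUT_OVERRUN = -1, DEMO_NOMEM = -2 };

/* Stand-in "decompressor": expands every input byte into `factor` copies.
 * It refuses to write past out_cap instead of overrunning the buffer. */
static int demo_expand(const unsigned char *in, size_t in_len, size_t factor,
                       unsigned char *out, size_t out_cap, size_t *out_len)
{
    if (in_len * factor > out_cap)
        return DEMO_OUTPUT_OVERRUN;
    for (size_t i = 0; i < in_len; i++)
        memset(out + i * factor, in[i], factor);
    *out_len = in_len * factor;
    return DEMO_OK;
}

/* Guess an output size and double it each time an overrun is reported.
 * On success the caller owns *out_p and must free() it. */
static int demo_receive(const unsigned char *in, size_t in_len, size_t factor,
                        unsigned char **out_p, size_t *out_len, size_t *out_cap)
{
    size_t cap = 2 * in_len;            /* assumed initial guess */

    assert(in != NULL && in_len > 0);
    for (;;) {
        unsigned char *out = malloc(cap);
        if (out == NULL)
            return DEMO_NOMEM;
        if (demo_expand(in, in_len, factor, out, cap, out_len) == DEMO_OK) {
            *out_p = out;
            *out_cap = cap;
            return DEMO_OK;
        }
        free(out);                      /* drop the too-small buffer once */
        if (cap > SIZE_MAX / 2)
            return DEMO_NOMEM;          /* cannot grow any further */
        cap *= 2;
    }
}
```

With 5 input bytes and a factor of 3, the first attempt (10 bytes) is too small and the second (20 bytes) succeeds, the same shape as 49525 -> 99050 -> 198100. The point of the sketch is that each buffer has exactly one owner and is freed exactly once on every path.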


dmalloc says
1085337755: 86: WARNING: tried to free(0) from 'ra=0x804a790'
(gdb) x 0x804a790
0x804a790 <dcc_cpp_maybe+180>:  0x83f84589
(gdb) info line *(0x804a790)
Line 83 of "src/cpp.c" starts at address 0x804a776 <dcc_cpp_maybe+154> and
 ends at 0x804a799 <dcc_cpp_maybe+189>.

Line 83 of src/cpp.c is
    if ((ret = dcc_make_tmpnam("distcc", output_exten, cpp_fname)))
so something weird is going on.

It's repeatable. Since I started using dmalloc to debug
this, dmalloc stops the program in exactly the same place as above
(free(in_buf)).
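
For readers unfamiliar with debugging allocators: one common way such a library can notice, at free() time, that something wrote outside a block is to surround each allocation with known guard bytes and check them later. The sketch below illustrates that idea only. It is not dmalloc's implementation, and guard_malloc, guard_check and guard_free are hypothetical names.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN  8
#define GUARD_BYTE 0xA5

/* Block layout: [size_t user_len][guard][user data][guard] */
static void *guard_malloc(size_t len)
{
    unsigned char *raw = malloc(sizeof(size_t) + 2 * GUARD_LEN + len);
    if (raw == NULL)
        return NULL;
    memcpy(raw, &len, sizeof len);                       /* remember user size */
    memset(raw + sizeof(size_t), GUARD_BYTE, GUARD_LEN); /* front guard */
    memset(raw + sizeof(size_t) + GUARD_LEN + len, GUARD_BYTE, GUARD_LEN);
    return raw + sizeof(size_t) + GUARD_LEN;
}

/* Returns 1 if both guards are intact, 0 if something wrote outside
 * the user region. */
static int guard_check(const void *ptr)
{
    const unsigned char *user = ptr;
    const unsigned char *raw;
    size_t len;

    assert(ptr != NULL);
    raw = user - GUARD_LEN - sizeof(size_t);
    memcpy(&len, raw, sizeof len);
    for (size_t i = 0; i < GUARD_LEN; i++) {
        if (raw[sizeof(size_t) + i] != GUARD_BYTE)
            return 0;                                    /* underrun */
        if (user[len + i] != GUARD_BYTE)
            return 0;                                    /* overrun */
    }
    return 1;
}

/* Checks the guards, releases the block, and reports what the check found. */
static int guard_free(void *ptr)
{
    int intact = guard_check(ptr);
    free((unsigned char *)ptr - GUARD_LEN - sizeof(size_t));
    return intact;
}
```

A check like this fires when the block is freed, not when the bad write happens, so the place where the allocator stops need not be the place where the damage was done.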

Log from distcc:
distcc[25601] (dcc_x_token_int) send ARGV0000000f
distcc[25601] (dcc_x_token_int) send ARGV0000000f
distcc[25601] (dcc_x_token_int) send ARGV00000003
distcc[25601] (dcc_x_token_int) send ARGV00000003
distcc[25601] (dcc_x_token_int) send ARGV0000000d
distcc[25601] (dcc_x_token_int) send ARGV00000005
distcc[25601] (dcc_x_token_int) send ARGV00000011
distcc[25601] (dcc_x_token_int) send ARGV0000001a
distcc[25601] (dcc_x_token_int) send ARGV0000000f
distcc[25601] (dcc_x_token_int) send ARGV0000000e
distcc[25601] (dcc_x_token_int) send ARGV0000000b
distcc[25601] (dcc_x_token_int) send ARGV00000005
distcc[25601] (dcc_x_token_int) send ARGV00000002
distcc[25601] (dcc_x_token_int) send ARGV00000026
distcc[25601] (dcc_x_token_int) send ARGV00000002
distcc[25601] (dcc_x_token_int) send ARGV0000002a
distcc[25601] (dcc_note_state) note state 3, file "(NULL)", host "(NULL)"
distcc[25601] (dcc_collect_child) cpp child 25602 terminated with status 0
distcc[25601] (dcc_collect_child) cpp times: user 0.035994s, system 0.018997s, 756 minflt, 0 majflt
distcc[25601] cpp ./kdecore/kcalendarsystemgregorian.cpp on localhost completed ok
distcc[25601] (dcc_note_state) note state 4, file "(NULL)", host "(NULL)"
distcc[25601] (dcc_x_file) send 107903 byte file /home/users/misiek/tmp/distcc_2344f598.ii with token DOTI
distcc[25601] (dcc_compress_file_lzo1x) compress 107903 bytes using mmap
distcc[25601] (dcc_compress_lzo1x_alloc) compressed 107903 bytes to 31812 bytes: 29%
distcc[25601] (dcc_x_token_int) send DOTI00007c44
distcc[25601] (dcc_send_job) client finished sending request to server
distcc[25601] (dcc_note_state) note state 5, file "(NULL)", host "213.25.186.11"
distcc[25601] (dcc_r_token_int) got DONE00000002
distcc[25601] (dcc_r_result_header) got response header
distcc[25601] (dcc_note_state) note state 6, file "(NULL)", host "(NULL)"
distcc[25601] (dcc_r_token_int) got STAT00000000
distcc[25601] (dcc_r_token_int) got SERR00000000
distcc[25601] (dcc_r_token_int) got SOUT00000000
distcc[25601] (dcc_r_token_int) got DOTO0000c175
distcc[25601] (dcc_r_bulk_lzo1x) receive 49525 compressed bytes using mmap
distcc[25601] (dcc_r_bulk_lzo1x) LZO_E_OUTPUT_OVERRUN, trying again with 198100 byte buffer
distcc[25601] (dcc_r_bulk_lzo1x) receive 49525 compressed bytes using mmap
distcc[25601] (dcc_r_bulk_lzo1x) decompressed 49525 bytes to 99232 bytes: 49%
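
To make the sizes in the log easier to cross-check: the packet headers shown there look like a four-letter token followed by eight hexadecimal digits. Below is a small sketch of decoding and encoding that form. It is inferred from the log lines alone, and demo_parse_token and demo_format_token are made-up names, not distcc functions.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Parse a 12-character header such as "DOTO0000c175": 4 characters of
 * token name followed by 8 hexadecimal digits.
 * Returns 0 on success, -1 on malformed input. */
static int demo_parse_token(const char *hdr, char name[5], unsigned long *value)
{
    assert(hdr != NULL && name != NULL && value != NULL);
    if (strlen(hdr) != 12)
        return -1;
    for (int i = 4; i < 12; i++) {
        if (!isxdigit((unsigned char)hdr[i]))
            return -1;
    }
    memcpy(name, hdr, 4);
    name[4] = '\0';
    *value = strtoul(hdr + 4, NULL, 16);
    return 0;
}

/* Inverse operation: write the name plus a zero-padded 8-digit hex value.
 * The value is masked to 32 bits so the result is always 12 characters. */
static void demo_format_token(const char *name, unsigned long value, char out[13])
{
    snprintf(out, 13, "%.4s%08lx", name, value & 0xffffffffUL);
}
```

Decoding DOTO0000c175 gives 49525, which matches the "receive 49525 compressed bytes" line, and DOTI00007c44 gives 31812, which matches the compressed size of the .ii file.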

Note that without dmalloc it seems that some memory gets overwritten or freed,
since gdb shows a nonsensical backtrace:

#0  do_lookup_x (undef_name=0x8048b55 "unlink", hash=130363467,
    ref=0x8048514, result=0xbffff30c, scope=0x4003f0c4, i=1,
    version=0x40016978, flags=1, skip=0x0, type_class=1) at do-lookup.h:61
61      do-lookup.h: No such file or directory.
        in do-lookup.h
(gdb) bt
#0  do_lookup_x (undef_name=0x8048b55 "unlink", hash=130363467,
    ref=0x8048514, result=0xbffff30c, scope=0x4003f0c4, i=1,
    version=0x40016978, flags=1, skip=0x0, type_class=1) at do-lookup.h:61
#1  0x40007d98 in _dl_lookup_symbol_x (undef_name=0x8048b55 "unlink",
    undef_map=0x40015c90, ref=0xbffff380, symbol_scope=0x40015e30,
    version=0x40016978, type_class=1, flags=1, skip_map=0x0)
    at dl-lookup.c:246
#2  0x4000ae09 in fixup (l=0x40015c90, reloc_offset=1073832968)
    at dl-runtime.c:98
#3  0x4000afe0 in _dl_runtime_resolve () at dl-runtime.c:198
#4  0x0804cac4 in dcc_remove_timefile (lockname=0x8054e64 "backoff",
    host=0x80ab3e8) at src/timefile.c:109
#5  0x08049711 in dcc_enjoyed_host (host=0x80ab3e8) at src/backoff.c:67
#6  0x0804a510 in dcc_build_somewhere (argv=0x80ab108, sg_level=0,
    status=0xbffff4a4) at src/compile.c:167
#7  0x0804a65c in dcc_build_somewhere_timed (argv=0x80ab020, sg_level=0,
    status=0xbffff4a4) at src/compile.c:208
#8  0x0804aa90 in main (argc=55, argv=0xbffff534) at src/distcc.c:217
#4  0x0804cac4 in dcc_remove_timefile (lockname=0x8054e64 "backoff",
    host=0x80ab3e8) at src/timefile.c:109
109         if (unlink(filename) == 0) {
(gdb) l
104         int ret = 0;
105
106         if ((ret = dcc_make_lock_filename(lockname, host, 0, &filename)))
107             return ret;
108
109         if (unlink(filename) == 0) {
110             rs_trace("remove %s", filename);
111         } else {
112             if (errno == ENOENT) {
113                 /* it's ok if somebody else already removed it */

--
Arkadiusz Miśkiewicz     CS at FoE, Wroclaw University of Technology
arekm.pld-linux.org, 1024/3DB19BBD, JID: arekm.jabber.org, PLD/Linux


