[distcc] Repeatable .o and .so checksums with distcc

Martin Pool mbp at sourcefrog.net
Mon Jun 28 22:32:10 MDT 2010


On 29 June 2010 13:02, Jeff Kilpatrick <kilpatrick.jeff at gmail.com> wrote:
> Hello,
>
> At my work, we've just begun to investigate how much of an impact distcc
> will have on our builds.
>
> We typically perform 200 builds a week, ranging from a thousand lines of
> code up to 600,000 lines of code each. Our back-end build scripts are based
> on Python, and use Linux make to build. We are running VMware images on a
> blade cluster, and each of our three new build servers has 20 GHz of
> processing power and 4 GB of RAM. Our primary build environments are
> loopback ISOs from a central CIFS server, unioned together with unionfs. Our
> source code is then copied into this environment, and we proceed with our
> build, using chroot to enter our build environment. Our 'distcc' machines
> use the same loopback system, with only our OS and distcc being accessible.

That's pretty cool.

> One of the most important things for our builds, due to the market that we
> are in, is that our builds must be reproducible, with repeatable md5sums on
> our shared objects, based on the same label and same dependencies. In our
> recent tests, we were able to take a particular build from 24 minutes to 14
> minutes, then finally 5 minutes, using distcc and adjusting our VMs.
> However, when performing an md5sum on our final shared objects / object
> files, the checksums change every build. We dropped down to just using g++
> to perform our linking, all locally, but our object files are still
> mismatching.
>
> In the object files' `objdump -s` output, it appears that an entry is being
> made into all our object files with the following syntax "distccd_XXXXX",
> with XXXXX being a seemingly random combination of characters.

Hi Jeff,

I think this is coming from gcc recording the input file name in the
object file.  distccd_xxxx.ii is the temporary file name used on the
server.
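
One way to confirm this across a whole build tree is to scan each
object file for names of that shape.  Here is a minimal sketch in
Python, since your build scripts already use it.  It assumes the
temporary names always look like "distccd_" followed by letters or
digits and then ".i" or ".ii", which I am inferring only from the one
name in your dump; the function name is made up for illustration.

```python
import re

# Assumed pattern, based only on the single name visible in the objdump
# output (distccd_ac31c96a.ii); other distcc versions may differ.
TEMP_NAME = re.compile(rb"distccd_[0-9A-Za-z]+\.ii?")


def find_distcc_temp_names(data):
    """Return the sorted, de-duplicated temp-file names found in raw bytes."""
    return sorted({m.group(0).decode("ascii") for m in TEMP_NAME.finditer(data)})
```

Calling it on the bytes of each .o file (opened in "rb" mode) should
list the embedded names, or an empty list for objects built locally.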

> In the same object file, compiled locally without distcc, we get a rather
> generic <built-in> placeholder.

I think this means it's coming from the built-in preprocessor.

I probably won't have time to work on this myself, but if you have a
programmer interested in it, there are two possible avenues:

- make gcc read from a file called <built-in> in a temporary subdirectory

- find some way to stop it recording the compiler input file name

Is that the only difference in the object files?  It's pretty common
for compilers to also record something about the time the compilation
was run, or for source files to build this in (for example through
__DATE__ or __TIME__), which would mean they change every time.
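
To answer that question quickly, one could checksum each object file
with the temporary name masked out and see whether two builds then
agree.  Below is a rough sketch in Python, since your build scripts
already use it.  It assumes the name has the "distccd_<letters or
digits>.ii" shape seen in your dump, and the helper names are
hypothetical.  If the masked checksums still differ, something else
(such as a timestamp) is varying too.

```python
import hashlib
import re

# Assumed shape of the server-side temp name; the extension is captured so
# it can be kept in the placeholder.
TEMP_NAME = re.compile(rb"distccd_[0-9A-Za-z]+(\.ii?)")


def masked_md5(data):
    """MD5 of the bytes with every distccd temp name replaced by a placeholder."""
    masked = TEMP_NAME.sub(rb"distccd_X\1", data)
    return hashlib.md5(masked).hexdigest()


def differs_only_by_temp_name(first, second):
    """True if two object images are identical once temp names are masked."""
    return masked_md5(first) == masked_md5(second)
```

This only masks the name itself, so it is a diagnostic and not a fix:
the real object files would still have different md5sums.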

>
> I've reviewed the source code for distcc, and seen a few references to this
> distccd_xxxxx. Unfortunately, I'm not a programmer, and thus am at a loss as
> to how to further troubleshoot this, or even whether it's possible to get
> consistent checksums with distcc.
>
>
> Versions
> =======
> g++ (Gentoo 4.3.2-r4 p1.8, pie-10.1.5) 4.3.2
>
> distcc 3.1 i686-pc-linux-gnu
>   (protocols 1, 2 and 3) (default port 3632)
>   built Mar 29 2010 10:55:35
>
> Kernel: 2.6.9-89.ELsmp
>
> Command being issued:
>       DISTCC_VERBOSE=1 make -j24 CXX="distcc"
>
> Here's the partial output of objdump -s:
>  04f0 00030000 5f6d6f76 655f636f 6e737472  ...._move_constr
>  0500 7563745f 66776b2e 68000300 00474454  uct_fwk.h....GDT
>  0510 79706573 2e68000a 00007365 72646566  ypes.h....serdef
>  0520 732e6800 01000073 75666669 782e6870  s.h....suffix.hp
>  0530 70000b00 00646973 74636364 5f616333  p....distccd_ac3
>  0540 31633936 612e6969 000c0000 61646c5f  1c96a.ii....adl_
>  0550 62617272 6965722e 68707000 0d000062  barrier.hpp....b
>  0560 6f6f6c5f 6677642e 68707000 0e000069  ool_fwd.hpp....i
>  0570 6e746567 72616c5f 635f7461 672e6870  ntegral_c_tag.hp
>  0580 70000e00 00766f69 645f6677 642e6870  p....void_fwd.hp
>
> Thank you for reviewing my issue.
>
> -Jeff



-- 
Martin

