[distcc] Ada support in distcc

Martin Pool mbp at sourcefrog.net
Tue Feb 21 17:18:15 MST 2012


On 22 February 2012 10:40, Tim Dossett <timothy.dossett at gmail.com> wrote:
> Martin,
> Thanks for the review comments. All this code can be improved, but
> it's a matter of balancing cost and benefit. I'll attempt to address
> each of your concerns in-line below.

OK, thanks.  Since Fergus is the new maintainer and I'm a bit rusty on
this hopefully he can review the updated one (too).

>
> On Mon, Feb 20, 2012 at 8:03 PM, Martin Pool <mbp at sourcefrog.net> wrote:
>> Hi, thanks for the patch.
>>
>> I think I recall having to write the dining philosophers in Ada for a
>> uni assignment many years ago :)
>>
>> I have some comments on the diff but they don't mean I'm not happy to have this.
>>
>> It's good to have some smoke tests for Ada.  I do wonder if the
>> example code is a bit larger than it really has to be to make sure
>> that it compiles correctly.  Probably just 'hello world' would do
>> about as well, unless exercising eg cursor movement is particularly
>> useful to be sure things link correctly.  I suppose disk is cheap, but
>> some people will download this over slow connections and there's no
>> point making things much bigger than they have to be.  Also, the more
>> code, the more likely it will cause spurious ada-environment-dependent
>> failures.
>
> Agree; this source was chosen because it was quick and easy. Smaller
> would be better; I'll see what I can do.

Sure, that makes sense, I think it's probably fine as it is.

>> The cgpr files seem to have hardcoded paths and options.  What uses them?
>>
>
> .cgpr files are gnat build instruction files, used by the gnat .gpr
> "makefile". They are auto-generated by gprconfig, and are supposed to
> have full paths valid for the local environment. Probably should be
> deleted in this patch. I need to add instructions for building with
> gprbuild/cprconfig.

OK, it does sound like they'll need to be generated by autoconf to
work for other people. It should be pretty easy to have foo.cgpr.in
templates and then tell autoconf to expand them.

>> [tweak] You seem to be mixing in tabs and spaces, which scrambles the
>> diff.  Please reindent everything to just use spaces and to be
>> consistent with the current style (4-space columns etc).  Also, you're
>> sometimes putting a space between the function and the arguments;
>> please don't.
>
> I'll try to fix these. I generally don't get too excited about
> formatting, and let the IDE do it for me.

It's fine with me if the IDE does it. :-)

I also don't normally get too excited about formatting, but your diff
has things indented contrary to their semantic blocks and that Is
likely to be misinterpreted.
>
>>
>> +
>>  /**
>> + * Transmit environment information
>> + *  - only sent to include server, not distcc server
>> + *  - needed for ADA_PRJ_INCLUDE_FILE, CPATH, C_INCLUDE_PATH,
>> CPLUS_INCLUDE_PATH
>> + **/
>> +int dcc_x_env(int fd, unsigned int n_env, const char **envnames)
>> +{
>> +    char *env_value;
>> +    char *envname;
>> +
>> +    if (dcc_x_token_int(fd, "NENV", (unsigned) n_env))
>> +        return EXIT_PROTOCOL_ERROR;
>> +
>> +    for (; *envnames != NULL; ++envnames) {
>> +        envname = (char*) *envnames;
>> +       // make sure at least an empty string is sent
>> +       asprintf (&env_value, getenv(envname) ? getenv(envname) : "");
>> +        if (dcc_x_token_string(fd, "NAME", envname) ||
>> +            dcc_x_token_string(fd, "ENVI", env_value)) {
>> +           printf("client failed to send env %s = %s \n", envname, env_value);
>> +               free (env_value);
>> +            return -1;
>> +        }
>> +        free (env_value);
>> +   }
>> +   return 0;
>> +}
>>
>> [fix] The n_env is redundant with looking for the null; this will be
>> less likely to go wrong if you just count the number of items within
>> this function, before you send it.
>>
>> It looks like this is going to be an incompatible protocol break, so
>> people will need to upgrade both clients and servers together?  That's
>> ok, but it ought to be clearly called out in the news.
>
> Yep; this function evolved this way, and was never cleaned up. And
> yes, there is a protocol change, which needs to be documented
> somewhere. Additionally, I did not know how to update the protocol
> version sent back by the server, so the protocol check on the client
> is not updated either.

I think the fact that it's changed should be in the news and the
readme, and the protocol version ought to be bumped where ever that is
sent.  Perhaps it should get a new major version number to indicate a
new protocol.


>
>>
>> There seems to be some mojibake in the news file.
>
> I noticed the non-English character changes in the NEWS, which should
> be unchanged except for the Ada news. I'll see if I can fix this.
>
>
>>
>> [fix] There ought to be a manpage update for this, at least mentioning
>> that it _can_ do Ada, and the new config options.
>
> Agree it should be somewhere; how about a Readme.ada?

Or just a section in the readme would be fine.

>
>
>>
>> Maybe you should mention, in the readme, something about how ada
>> support works, since most people who might want to hack on this (or
>> review your patch ;-) won't have used it much.
>
> Add the basic fact that Ada is supported to the readme, and add a
> Readme.ada for more detail on how to use it and how it works?
>
>>
>> [fix] You're using printf to stdout for some errors; is that right?
>
> OK; I'll use rs_log_error()
>
>
>>
>>
>> +//static int tweak_ada_gnatem_file (char *filename, const char *root_dir)
>>
>> If this is going to be useful to anyone else, make it configurable,
>> otherwise delete it.
>
> Agree; I'll delete.
>
>>
>> +
>> +    // ada does not use the -MMD, -MF options for dependency
>> generation; skip for ada
>> +    if (!source_is_ada)
>> +    {
>>
>> it seems like those are not really "not ada" but rather
>> source_is_cish.  more generally, let's introduce a concept (probably
>> an enum) of say 'language family', that can be C, Ada, maybe other
>> things.  That will give people a clue how to add non-gcc support.
>>
>>
>
> Agree, but this seems like a larger design issue to me. I'd be happy
> to make this however you want it, but I don't want to design this as a
> part of this patch. Similar issue for the python changes.

ok

>
>
>> +    int Ada_compile = FALSE;
>>
>> lowercase locals please
>
> OK
>
>>
>> It would be good to run your version under Valgrind, for both C and
>> Ada, to see if there are any memory handling bugs.
>
> I did this; there are a few pre-existing leaks in distcc which I did
> not address.
>
>
>>
>>
>> ===================================================================
>> --- src/compile.c       (revision 755)
>> +++ src/compile.c       (working copy)
>> @@ -220,18 +220,6 @@
>>                                        host->cpp_where,
>>                                        &host->protover);
>>     }
>> -    /* Environment variables CPATH and two friends are hidden ways of passing
>> -     * -I's. Beware! */
>> -    if (getenv("CPATH") || getenv("C_INCLUDE_PATH")
>> -        || getenv("CPLUS_INCLUDE_PATH")) {
>> -        rs_log_warning("cannot use distcc_pump with any of environment"
>> -                       " variables CPATH, C_INCLUDE_PATH or CPLUS_INCLUDE_PATH"
>> -                       " set, preprocessing locally");
>> -        host->cpp_where = DCC_CPP_ON_CLIENT;
>> -        dcc_get_protover_from_features(host->compr,
>> -                                       host->cpp_where,
>> -                                       &host->protover);
>> -    }
>>  }
>>
>>
>> Can you explain why you removed this?
>
> This is part of a feature which allows distributed pre-processing of
> C/C++ when CPATH, C_INCLUDE_PATH or CPLUS_INCLUDE_PATH are defined.

Is that feature removed by this patch?

>
>
>>
>>
>> +
>> +    /* Fallback to non-pump mode for files listed in the env variable
>> DISTCC_NO_PUMP */
>> +    /* example: export DISTCC_NO_PUMP="file1.c file2.ada file3.cpp" */
>> +    char *skip_list = NULL;
>> +    char *skip_file = NULL;
>> +    asprintf (&skip_list, getenv("DISTCC_NO_PUMP"));
>>
>> Why this line?
>
> This is a feature which allows users/makefiles to specify files which
> should not be compiled with pump mode. What we see when building Ada
> is there are certain source files which always fail distributed
> compilation, then fall back to the local host after a timeout; this
> increases build time. Having an environmental variable set with these
> files shortcuts this, and decreases the build time.

OK, I understand the feature.  My question was about the specific line

>> +    asprintf (&skip_list, getenv("DISTCC_NO_PUMP"));

which is using DISTCC_NO_PUMP as a printf template?  I guess you need
to copy the string to pass it to strtok, but perhaps just a strdup
would be easier?

Rather than strtok'ing, I think you could just look for the next
whitespace and then strncmp that against the input file name.

>
>
>>
>> +    for (skip_file = strtok(skip_list, "       "); skip_file != NULL; ) {
>> +       if (strcmp (basename(input_fname), basename(skip_file)) == 0) {
>> +          host->cpp_where = DCC_CPP_ON_CLIENT;
>> +           dcc_get_protover_from_features(host->compr,
>> +                                          host->cpp_where,
>> +                                          &host->protover);
>> +           break;
>> +       }
>> +       skip_file = strtok(NULL, "      ");
>> +    }
>> +       free(skip_list);
>>
>> Please put tabs in the code as \t not ascii 9, or it will definitely get broken.
>>
>
> Good catch; I'll fix this.
>
>
>> In some ways it would be cleaner to just pass the option for the
>> compiler command that's specifically having trouble.  I can see how it
>> may be hard to get a makefile to do that and you want distcc to match
>> on the basename....
>>
>> strtok is an awful api; maybe there are better alternatives already in distcc?
>>
>
> distcc is just C, no libs, correct? I am open to suggestions on this.
>
>
>
>>
>> @@ -786,6 +794,20 @@
>>         free(server_side_argv);
>>     }
>>     free(discrepancy_filename);
>> +    free(deps_fname);
>> +//-    free(files);
>> +
>> +//    free(input_fname);
>> +//    free(output_fname);
>> +//-    free(cpp_fname);
>> +//     free(*files);
>> +//     free(*server_side_argv);
>> +       free(host);
>> +       free(server_stderr_fname);
>> +
>> +
>> +
>> +
>>     return ret;
>>  }
>>
>> why this?  It smells a bit like there are unknown memory bugs.
>
> Valgrind reports memory leaks for stuff that is not free'ed before
> distcc exits; I think I decided it was not important since distcc was
> going to exit anyway, but it's been awhile since I looked at this.

I agree that some amount of leaking (in the client) is fine, if not
exactly desirable.  Maybe those lines should just be deleted.

>
>>
>>
>> @@ -234,6 +235,8 @@
>>  int dcc_is_source(const char *sfile)
>>  {
>>     const char *dot, *ext;
>> +    char lower_ext[] = "xxx";
>> +    char *c;
>>     dot = dcc_find_extension_const(sfile);
>>     if (!dot)
>>         return 0;
>> @@ -268,13 +271,45 @@
>>     case 'S':
>>         return !strcmp(ext, "S");
>>  #endif
>> -    default:
>> +     // TODO: how to make this accommodate all ada suffixes
>> +     case 'a':
>> +     case 'A':
>> +        // match all cases
>> +           strncpy (lower_ext, ext, 3);
>> +           for (c=lower_ext; *c != 0; *c = tolower(*c), c++);
>> +        return !strcmp(lower_ext,"ada")
>> +            || !strcmp(lower_ext,"adb")
>> +            || !strcmp(lower_ext,"ads");
>> +   default:
>>         return 0;
>>     }
>>  }
>>
>> [fix] This is writing over a constant string, lower_ext.  I'm
>> surprised it's not crashing for you, and I'm pretty sure it will crash
>> on some systems.  Why not just use strcasecmp?
>
> lower_ext is an array variable initialized to "xxx", so I can write to it.

Oops, the rust on my C is showing.

> But your strcasecmp() suggestion is a  much better solution. I'll fix this.
>
>
>>
>>
>> +/**
>> + * Is the file ada source?
>> + **/
>> +int dcc_is_ada(const char *sfile)
>> +{
>> +    const char *dot, *ext;
>> +    char lower_ext[] = "xxx";
>> +    char *c;
>> +    dot = dcc_find_extension_const(sfile);
>> +    if (!dot)
>> +        return 0;
>> +    ext = dot+1;
>>
>> +    // TODO: how to make this accommodate all ada suffixes?
>> +    // => read gnatec file
>> +    // for now, match all cases
>> +    strncpy (lower_ext, ext, 3);
>> +    for (c=lower_ext; *c != 0; *c = tolower(*c), c++);
>> +    return !strcmp(lower_ext,"ada")
>> +        || !strcmp(lower_ext,"adb")
>> +        || !strcmp(lower_ext,"ads");
>> +}
>> +
>> +
>>  /**
>>
>> This seems redundant with the code above, and it has the same serious
>> bug.  Please put spaces between parameters like the rest of the code.
>
> Same comment as above. I'm not very picky about formatting.
>
>>
>>
>> Index: src/cleanup.c
>> ===================================================================
>> --- src/cleanup.c       (revision 755)
>> +++ src/cleanup.c       (working copy)
>> @@ -100,7 +100,7 @@
>>              * if that fails, try removing is as a file.
>>              * Report the error from removing-as-a-file
>>              * if both fail. */
>> -            if ((rmdir(cleanups[i]) == -1) &&
>> +             if ((rmdir(cleanups[i]) == -1) &&
>>                 (unlink(cleanups[i]) == -1) &&
>>                 (errno != ENOENT)) {
>>                 rs_log_notice("cleanup %s failed: %s", cleanups[i],
>>
>> it was fine where it was :)
>
> OK
>
>>
>> I think this would be safer with some unit tests for the new ada logic.
>>
> Are you talking about the python-based unit tests? I didn't have time
> to figure out the python unit test setup, so I went with the scripted
> tests in the test_ada directory. I agree it would be better integrated
> if the tests used the python test script.
>
>>
>> @@ -186,6 +189,35 @@
>>     }
>>
>>     for (i = 0; (a = argv[i]); i++) {
>> +
>> +       // Special case for gnat ada, where .ali files are
>> +       // always generated, and are always named <source>.ali
>> +        if (strstr(a, "-gnat") != NULL) {
>>
>> If this is saying "if there's any argument containing '-gnat' it's an
>> ada compile" then I think it's wrong.
>
> I think that's what it says; why do you think it's wrong? Any suggestions?

Well, suppose I'm running

   distcc -c test-squashing-gnats.cc

That's not Ada...

What's the condition you want to test?  I would guess it's something
about the compiler name, or is it to do with the source files having
Ada file names?


>
>
> (dotd.c)
>>
>> +               *needs_dotd = 1;
>> +               *sets_dotd_target = 1;
>> +               rs_log_info("found gnat");
>> +
>> +               // find source filename and mod the extension
>> +            for (j = 0; (a = argv[j]); j++) {
>> +               if (dcc_is_source (a)) {
>> +                       rs_log_info("found source name: %s", a);
>> +                       if (asprintf (&temp_dotd_fname, "%s---", a) ==-1) {
>> +                       return EXIT_OUT_OF_MEMORY;
>> +                       }
>>
>> is this a weird way of allocating a 3-byte longer buffer? why?
>
> I don't remember why I did this; it does seem complex. It's like I
> really wanted to use dcc_output_from_source().
>
>>
>> +                           // this changes the file extension to .ali
>> +                       dcc_output_from_source(temp_dotd_fname, ".ali", dotd_fname);
>> +                       free (temp_dotd_fname);
>> +                       rs_log_info("ali=%s", *dotd_fname);
>> +                       return 0;
>> +               }
>> +            }
>> +            // could not find the source file???
>> +            rs_log_error("Cound not find the gnat source file to set
>> the .ali filename");
>> +            return EXIT_DISTCC_FAILED;
>> +        }
>> +
>> +
>> +
>>
>>
>> you have several function calls such as asprintf, whose return values
>> need to be checked but aren't.
>
> OK
>
>
>>
>> +                       //TODO: - utime overflows in the year 2038
>>
>> :-)
>>
>> So this could probably do with another look over the specific ada
>> manipulations.  I do suspect there is some dodgy pointer manipulation
>> there and running some tests under valgrind and writing some unit
>> tests would be a good way to shake that out.
>>
>> regards,
>> Martin
>
> Thanks again for the comments. I'll try to get a revised patch
> together in the next week or so.




-- 
Martin


More information about the distcc mailing list