CMake Proposal

tridge at samba.org
Wed Feb 24 20:40:45 MST 2010


Hi Simo,

 > Oh come on, you know I did not mean that cmake is standard, I wasn't
 > referring to cmake at all, I was saying that I would like to make things
 > in a more standardized way whether that is the cmake way, the waf way or
 > the autotools way.

What we are trying to do in Samba is complex. What features of Samba4
do you want to drop in order to make it 'simpler' ? Do you want to
drop PIDL? Maybe you'd like to drop the python scripting? Or perhaps
you don't want to use ASN1?

Those are the sorts of things that make the build complex, but they
are also things that save us a lot of time in other ways. Samba tries
to do in 500k lines of code what Windows does in vastly more code. If
we went with 'plain vanilla' code, we wouldn't be able to achieve
what we want to achieve.

 > The problem I have with the current build system is that it is too
 > smart, we have created a mess that encouraged circular dependencies and
 > broken interfaces all over.

I appreciate this was a rant, and you weren't really thinking hard
about it, but it might be worthwhile for you to look at what I've done
with this dependency problem in the waf Samba build.

Rather than just shouting "it's too complex!", I'm trying to tame what
we have. The config.mk system contains the information we need, but it
doesn't have enough validation around it to ensure that the result is
sane.

If you look at CHECK_DEPENDENCIES() in the wafsamba.py tool, you'll
see that it calculates the full dependency tree of Samba, and then it
automatically fixes the circular dependencies we have. You get a
message like this when you run "waf -v" to build Samba:

tridge@blu:~/samba/git/combined/source4$ waf -v
 Waf: Entering directory `/home/tridge/samba/git/combined/source4/bin'
 ERROR: Circular dependency for auth_system_session: auth_system_session->auth_session->SAMDB->auth_system_session
 ERROR: Circular dependency for LDBSAMBA: LDBSAMBA->SAMDB_SCHEMA->LDBSAMBA

If you go and look at our config.mk files, you'll see that it's
right. The waf scripts were auto-generated from our existing config.mk
files, so they have the same circular dependency.

I've set up waf at the moment to automatically fix the dependencies,
and it then says:

  Removing dependency auth_session from target auth_system_session

which happens to be the right fix. We could instead just fail the
build, but I am not doing that right now as I'm aiming for the full
build to work without modifying our code, so that we can run a waf
build and the existing build system in parallel, giving us a smooth
transition.

Later on we would disable the "auto-fix" of the circular dependencies,
and instead throw an error (in fact, I have already set it to throw an
error, but I've made the dependency checker catch the exception and
fix it if it can).
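
As a rough illustration of that "throw an error, then catch it and
fix it" flow, here is a minimal Python sketch. It is not the code in
wafsamba.py: the names CircularDependency, find_cycle() and
check_dependencies() are hypothetical, and the choice of which edge to
drop (the first one in the cycle) is only a placeholder assumption.

```python
# Hypothetical sketch only; this is not the real wafsamba.py code.

class CircularDependency(Exception):
    """Raised when a target can reach itself through its dependencies."""

    def __init__(self, target, path):
        super().__init__("Circular dependency for %s: %s"
                         % (target, "->".join(path)))
        self.target = target
        self.path = path


def find_cycle(deps, start):
    """Return a path start->...->start if one exists, otherwise None."""
    stack = [(start, [start])]
    seen = set()
    while stack:
        node, path = stack.pop()
        for dep in deps.get(node, []):
            if dep == start:
                return path + [start]
            if dep not in seen:
                seen.add(dep)
                stack.append((dep, path + [dep]))
    return None


def check_dependencies(deps, autofix=True):
    """Check every target. With autofix, drop the first edge of each cycle
    (an assumed policy); without it, let the exception fail the build."""
    removed = []
    for target in list(deps):
        while True:
            try:
                path = find_cycle(deps, target)
                if path is not None:
                    raise CircularDependency(target, path)
                break
            except CircularDependency as e:
                if not autofix:
                    raise
                dropped = e.path[1]
                deps[target] = [d for d in deps[target] if d != dropped]
                removed.append((dropped, target))
    return removed


if __name__ == "__main__":
    # The two cycles reported in the waf output quoted earlier.
    deps = {
        "auth_system_session": ["auth_session"],
        "auth_session": ["SAMDB"],
        "SAMDB": ["auth_system_session"],
        "LDBSAMBA": ["SAMDB_SCHEMA"],
        "SAMDB_SCHEMA": ["LDBSAMBA"],
    }
    for dropped, target in check_dependencies(deps):
        print("Removing dependency %s from target %s" % (dropped, target))
```

Calling check_dependencies(deps, autofix=False) lets the exception
propagate, which corresponds to failing the build instead of repairing
the graph.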

If you want to see all the gory details on how this is done, then run:

  waf --zone=deps

and it will show you all the dependency calculations it does.

 > We still pay the price of the mega proto.h file in samba3 land,
 > even if a lot of people are begging us to reduce the size of the
 > binaries.

On the size of the binaries, that is not a build system problem. It's
a design problem of the project, but luckily it's easily fixable.

The key is this:

  - all SUBSYSTEM elements should only be linked into one library. It
    should be an error to link a SUBSYSTEM into two libraries (that
    is why our libraries are so large)

  - a SUBSYSTEM should never be linked both directly into a binary,
    and into a library used by that binary. That leads to duplicated
    symbols (big binaries) and also to the problem of symbols being
    instantiated twice in the binary (which leads to runtime errors).

  - similarly, a SUBSYSTEM should never be linked into both a library
    and a module. That should throw an error.

With waf I can give it those rules, and it can throw a build error if
the rules are ever violated. I can do that because all of the
dependencies, along with the type of each dependency, are available to
me as simple Python objects. I can also only do it if we include in
our build rules the type of the target, which is why I have separate
SAMBA_MODULE(), SAMBA_SUBSYSTEM(), SAMBA_BINARY() etc wrappers. By
using those wrappers I'm ensuring that the build system knows enough
about each target to be able to check the project rules.
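
To make the three rules concrete, here is a small Python sketch of how
they could be checked once every target carries a type. Everything in
it is an assumption for illustration: the function names, the type
strings other than 'MODULE', and the simplification of looking only at
direct dependencies (a real check would have to follow the full
dependency tree).

```python
# Hypothetical sketch; names and type strings are assumptions, not wafsamba.py.

def find_link_violations(target_type, deps):
    """Return a list of messages, one per violated link rule.

    target_type maps target name -> 'SUBSYSTEM', 'LIBRARY', 'MODULE' or 'BINARY'.
    deps maps target name -> list of direct dependencies.
    Only direct dependencies are examined, to keep the sketch short.
    """
    errors = []

    def subsystems_of(name):
        return [d for d in deps.get(name, [])
                if target_type.get(d) == "SUBSYSTEM"]

    # Record which libraries and modules each subsystem is linked into.
    in_libs, in_mods = {}, {}
    for name, kind in target_type.items():
        if kind == "LIBRARY":
            for sub in subsystems_of(name):
                in_libs.setdefault(sub, []).append(name)
        elif kind == "MODULE":
            for sub in subsystems_of(name):
                in_mods.setdefault(sub, []).append(name)

    # Rule 1: a subsystem is linked into at most one library.
    for sub, libs in in_libs.items():
        if len(libs) > 1:
            errors.append("%s linked into more than one library: %s"
                          % (sub, ", ".join(libs)))

    # Rule 2: never both directly into a binary and into a library it uses.
    for name, kind in target_type.items():
        if kind != "BINARY":
            continue
        for sub in subsystems_of(name):
            for lib in in_libs.get(sub, []):
                if lib in deps.get(name, []):
                    errors.append("%s linked into binary %s directly and via library %s"
                                  % (sub, name, lib))

    # Rule 3: never into both a library and a module.
    for sub in in_libs:
        if sub in in_mods:
            errors.append("%s linked into both library %s and module %s"
                          % (sub, in_libs[sub][0], in_mods[sub][0]))

    return errors


def check_link_rules(target_type, deps):
    """Fail the (hypothetical) build if any rule is violated."""
    errors = find_link_violations(target_type, deps)
    if errors:
        raise ValueError("\n".join(errors))
```

The point of the sketch is only that the check needs both the
dependency lists and the type of each target, which is the information
the typed wrappers provide.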

No build system can have this sort of rule built in, because it is a
rule about the project, not about how to build things. To fix our
current problem we would have to encode the same types of rules no
matter what build system we use.

Have a look at the 'TARGET_TYPE' code I've added in wafsamba.py. It
knows that when we build a Samba module, it gets the 'TARGET_TYPE'
of 'MODULE'. Then CHECK_DEPENDENCIES() can validate that all the rules
for building Samba are obeyed. Right now CHECK_DEPENDENCIES() just
checks the dependency graph and fixes any circular dependencies, but
it is very little additional code to add rules about linking.

 > The problem is so bad that some still try to use samba 2.2.x
 > because of the size!!

I remember a funny incident from the early days of Samba. I had
released one of the Samba 1.x series, and there was a discussion on
comp.protocols.pathworks about Samba. Someone popped up and said that
you shouldn't use Samba, you should instead use a much smaller and
neater program called 'server-0.5'. He thought it was much better, as
it was so much less code. The punchline of course is that 'server' was
what I called Samba before we changed the name.

Anyway, the current build sizes of Samba4 are indeed silly. What is
needed though is not rants, it's someone to start to actually fix
it. Any build system (including cmake) will let you do silly
things. You could write cmake rules to build identical huge binaries
to what we have now. That is the nature of build systems.

To see that this is true, just have a look at what was done in the
cmake conversion of talloc versus the cmake conversion of
replace. Even though these conversions were done by the same person,
they are already inconsistent. Using cmake on both, I get this:

     # ldd test/replace_testsuite:
        linux-vdso.so.1 =>  (0x00007fffb12c0000)
        libc.so.6 => /lib/libc.so.6 (0x00007fe9f9add000)
        /lib64/ld-linux-x86-64.so.2 (0x00007fe9f9e4c000)

     # ldd talloc_testsuite
       linux-vdso.so.1 =>  (0x00007fffb97ff000)
       libtalloc.so.2 => /home/tridge/samba/git/s3/lib/talloc/build/libtalloc.so.2 (0x00007fe7c4c2f000)
       libc.so.6 => /lib/libc.so.6 (0x00007fe7c48c0000)
       /lib64/ld-linux-x86-64.so.2 (0x00007fe7c4e37000)

Why did it put one binary in build/test/ and the other in build? Why
did it decide to link one directly but the other with an rpath library?
Exactly the same options were given to cmake in both cases.

If we're going to build a library, then the testsuite for that library
needs to link against the library. Otherwise we're not testing the
library. If you look in the build farm right now it's even worse. The
test suite for tdb runs against the system libtdb.so on many of the
systems, which means it isn't testing the current version of tdb at
all.
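
One way such a mistake could be caught automatically is to inspect the
ldd output of each testsuite binary and insist that the library under
test resolves to a path inside the build tree. The Python sketch below
is only an illustration under that assumption: the function names are
hypothetical, and it parses ldd text that is passed in rather than
running ldd itself.

```python
# Hypothetical sketch: check where a testsuite's library was resolved from.

def resolved_libraries(ldd_output):
    """Map library name -> resolved path for 'name => /path (addr)' lines."""
    resolved = {}
    for line in ldd_output.splitlines():
        parts = line.split()
        # Skip lines without a real path, such as linux-vdso or the loader.
        if len(parts) >= 3 and parts[1] == "=>" and parts[2].startswith("/"):
            resolved[parts[0]] = parts[2]
    return resolved


def uses_built_library(ldd_output, name_prefix, build_dir):
    """True only if a library whose name starts with name_prefix is linked
    and its resolved path lies inside build_dir."""
    build_dir = build_dir.rstrip("/") + "/"
    for name, path in resolved_libraries(ldd_output).items():
        if name.startswith(name_prefix):
            return path.startswith(build_dir)
    return False  # not dynamically linked against the library at all


# The talloc_testsuite output quoted above.
TALLOC_LDD = """\
linux-vdso.so.1 =>  (0x00007fffb97ff000)
libtalloc.so.2 => /home/tridge/samba/git/s3/lib/talloc/build/libtalloc.so.2 (0x00007fe7c4c2f000)
libc.so.6 => /lib/libc.so.6 (0x00007fe7c48c0000)
/lib64/ld-linux-x86-64.so.2 (0x00007fe7c4e37000)
"""

if __name__ == "__main__":
    ok = uses_built_library(TALLOC_LDD, "libtalloc.so",
                            "/home/tridge/samba/git/s3/lib/talloc/build")
    print("talloc_testsuite uses the freshly built libtalloc:", ok)
```

A testsuite that picks up a system copy of the library, or that does
not link against the shared library at all, would fail this check.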

We need to make it nearly impossible to make these sorts of
mistakes. The only way we can do that is by adding rules to say what a
mistake is. 

So, what we need is to work out what rules we want for linking, then
we need to encode those rules into the build system so that we ensure
we stick to them. I've listed the rules above that I'm thinking about
at the moment. If you have any suggestions on alternative rules, or
additional rules, then please let us know. We need these rules no
matter what build system we use.

Note also that most FOSS projects don't ever encode rules like this
into the build system. Instead it just evolves. So if we went with the
standard (simple?) way of doing things then we'd not have any rules,
and just let developers make up their own way for each library, binary
or module. I think that is not a good approach.

Cheers, Tridge

