code duplication in ccan (was: Re: snprintf on SunOS)

Sat Jun 16 02:28:46 MDT 2012

Hi Rusty!

On Fri, Jun 15, 2012 at 11:40:13AM +0930, Rusty Russell wrote:
> On Fri, 15 Jun 2012 00:36:57 +0200, Jelmer Vernooij <jelmer at samba.org> wrote:
>         I'm delighted we're actually discussing this stuff BTW!
> Sometimes it seems hard to get other people's attention on these kind of
> details.

Thanks for not taking it the wrong way; Sorry about not bringing up my
concerns earlier.

> > On Sun, Jun 10, 2012 at 01:02:50PM +0930, Rusty Russell wrote:
> > > If you think of CCAN as a separate project, the question becomes: should
> > > we help CCAN, or just do our own samba-specific code?
> > If we want to get our utility code from an external source we could
> > also look at using e.g. glib.  That's got a fair amount of
> > functionality, and is present on a lot of systems already. One of the
> > advantages would also be that we would no longer have to worry about
> > maintaining that code.  Instead, we could ditch parts of samba-util
> > that provide the same functionality.
> I really dislike glib; it's more a framework you have to adapt to than a
> set of things we can cherry-pick from.  If we were writing Samba from
> scratch, perhaps.
I've used it just for its convenience functions and data structures in
several projects without it having a big impact. Some of the related
code - gobject for example - does operate more like a framework.

Anyway, the point I was trying to make was that there are several
libraries out there with convenience functions like this; there is a
good case to be made for ccan, but I don't see think it should be the
default choice.

> > ccan as is is not really appropriate to build as a shared library - it
> > is more like a code collection from which you can cherrypick what you
> > like.
> Yes!  It's built as a shared lib because that's how Samba builds things,
> not because it's sane.
Sure, but that does make ccan a bad match for Samba in its current form.

> > In its current form it has extra overhead with each module (a copy of
> > the license and a description) and a test system that's different from
> > the rest of Samba.

> In the ccan repo we symlink the licenses.  We could probably do that
> here (it's mainly 10 copies of the GPL).  The description is
> documentation: that's not overhead!

> See below on the test system integration...
Do we really need 10 files for that in the tree though? I see how it
would be useful if users grab just one or two modules from the
website, in this case it's just extra noise.

Likewise, the _info file uses a format that's completely different
from the doxygen format that we've standardized on (for better or worse) in the
rest of Samba.

> > > OTOH, I'd like SAMBA to unify on ccan/likely and ccan/time; the former
> > > because it has nice infrastructure for measuring branches which
> > > CCAN_LIKELY_DEBUG is set, and the latter because Samba doesn't have a
> > > unified time infrastructure; there are pieces all over the tree.
> > Note that samba already has a definition of likely() somewhere in
> > lib/util.
> Gah!!!  You keep saying vague things like this.  If it's worth
> discussing, it's worth reading the code first!

> There are *five* definitions of likely() in the tree.  Four are
> potentially buggy, because they treat the condition differently for
> gcc vs other compilers.

> I want to unify all of them.  Let's discuss how!

> > Samba has time functionality in lib/util/time.h, in what way is the
> > version in ccan better?

> And we also have a complete duplication of functionality in tevent :(

> That's a bit hard to solve, since it's in the ABI, but we should
> back-end them onto common code, IMHO.  Pulling in libutil is overkill
> though, because it has samba-specific functions like NTTIME stuff.

> But it would be easy if lib/util/time.h used ccan/time for the generic
> stuff.  So could tevent.

> I think ccan/time could use improvement, too.  Mainly, it would be nice
> just to use timespec everywhere, even on systems where we don't get ns
> accuracy.  Frobbing between different time formats is a recipe for
> confusion.  I think Samba would be neater for such a change.  What do
> you think?

> I do think that we should pass structures by copy rather than pointer,
> as it makes it far easier to chain things, eg: 

>    end = time_add(time_now(), time_from_msec(500));

> We should also consider some inlining for the trivial stuff, too.

> I think if we tried to unify them we'd understand much better what a
> great set of time primitives would look like.  Put those in ccan/time.
> Put the Samba-specific stuff in lib/util/time.h.

I know the code isn't the same and that the existing code isn't
perfect. My objection is against adding another version without a
clear path to reconciliation rather than fixing the existing version(s).

Anyway, we seem to agree that we want to reconcile this stuff in a
single location so let's focus on that.

> > > It makes sense also to share some of samba-util with others, by turning
> > > them into CCAN modules, eg. dlinklist.
> > Why not just expose samba-util as a shared library? That seems like an
> > easier way of sharing it with others than CCAN.

> Because you'd have to maintain an ABI.  And that would stop us from
> fixing bugs or removing functions; you really don't want a bag of utils
> to be a shared library!

> Or we could keep changing the ABI, making it useless for anyone outside
> Samba anyway.
The problem is that in its current form, we already rely on the ccan
ABI not changing.

If you have an older version of tdb installed on your system which includes a
copy of ccan and you build a newer version of Samba, then you end up
with ABI incompatibilities.

> > > Well, ccan and tdb2/ntdb share the same test infrastructure.  I'd
> > > actually like to make that more common: having infrastructure for
> > > such low-level unit tests in Samba would be a significant benefit.
> > I don't see why we need to come up with a new infrastructure for
> > low-level unit tests for that, the existing code can be used for that
> > as well.
> Specifically smbtorture?  I don't think we can :(

> We want these lowlevel unit tests to depend on as little as possible,
> and we want them run first.  Little test programs are nice for that;
> easy to write, easy to run.

Not really smbtorture so much as subunit, compatibility with our
test lists in selftest/tests.py and integration in the test namespace.

This gives us a uniform way of reporting and analysing test results
and of running (selected) tests.

I'm not a fan of lots of little C files with simple tests.
If we would e.g. split up the talloc tests like that we would end up
with two dozen extra C files; I don't particularly see the advantage
in that. That's not particularly what I have a problem with here

smbtorture isn't the only way we run tests - we also have perl tests
that generate subunit output, shell tests, python based tests, etc.

> The talloc testsuite.  How would we unify this?
The talloc testsuite seems fair reasonable to me at the moment; it would be
nice to have a simple way to run individual tests, but it's not that
slow.

> The tdb testsuite?  It uses direct includes, which is a very powerful
> unit testing technique which is much harder to do in a framework like
> smbtorture.

> ldb also seems to have its own testsuite.

> Let's unify the test infrastructure, but we need to do it intelligently.
> My preference would be to treat unit tests separately: standalone little
> programs which test individual things, and get run first, then
> smbtorture for higher-level tests?  If we standardize on naming, we can
> make nice helpers to find, build, and run them.
We already have that infrastructure - but it's not smbtorture.
smbtorture is just one of the test runners, but it matches the behaviour of the
others.

> I'd like to write more unit tests for Samba code: it's nicer to get a
> local failure than have smbtorture report some high-level failure
> halfway through a 2 hour test run :(
You can run individual tests, there is no need to run the whole thing.

E.g. to run just the talloc tests:

make test TESTS=talloc

Cheers,

Jelmer
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 198 bytes
Desc: Digital signature
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20120616/c42077d0/attachment.pgp>