Using submodules for third_party/

Stefan (metze) Metzmacher metze at samba.org
Mon Dec 8 02:01:31 MST 2014


Hi Jelmer,

> At the moment we're manually bundling a bunch of third party libraries
> in third_party/. Rather than keeping a (usually partial) copy of these
> libraries in our Git repository, I would propose using git
> submodules where possible (in other words, where the upstream is using
> git). Submodules have come a long way since they were originally
> introduced in Git.
> 
> Using submodules would have the following advantages. Mainly:
> 
> * it is easy to avoid bundled third party libraries by simply not
>   running 'git submodule init'.
> 
> * easy to review updates of upstream revisions we bundle; updating a
>   submodule shows up as a one-line diff. This means we can be sure
>   we're using an unmodified upstream revision; at the moment this is
>   hard to verify. You have to manually pull down a copy to verify that
>   the changes being made to the copy of a third party library are
>   the same as in the upstream repo.
>
> Some other nice benefits:
> 
> * we're sure we always ship the pristine upstream source; what the
>   system version would provide too
> 
> * easy to update, allows killing update-external.sh
> 
> * reduces unnecessary growth of our own git repo :)
> 
> There are two minor downsides I can think of:
> 
> * after checkout, it is necessary to run 'git submodule init' to
>   do the initial checkout of submodules and then run 'git submodule
>   update' whenever there are changes to the submodules. This
>   can be avoided by setting the 'fetch.recurseSubmodules' setting in
>   Git to 'yes'.
> 
> * if the upstream repository is down for some reason, you can't check
>   out the third party library. We could work around this by hosting
>   our own clone of third party libraries on git.samba.org.
> 
>   That said, I don't think such a workaround is necessary. In the rare
>   cases that the upstream repository is down, users can always install the
>   system version of an external library (since we would only use
>   submodules for third party libraries).

I can think of the following problem: an upstream project changes the URL
of its repository. A few years later, you check out an older Samba version
in order to track down a customer bug, and that old version still
references the old upstream URL...
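
One possible way to cope with a moved upstream is Git's
'url.<base>.insteadOf' setting, which rewrites matching URLs before they are
used. The following is only a sketch under assumptions: both host names are
hypothetical, and the rule is written into a scratch repository's config so
that nothing global is touched. For submodule fetches the rule would more
likely be set with 'git config --global'.

```shell
#!/bin/sh
set -e

# Scratch repository standing in for an old Samba checkout.
work=$(mktemp -d)
git init --quiet "$work/old-samba"

old_base="git://old-upstream.example.org/"        # hypothetical former upstream
new_base="git://mirror.example.org/third_party/"  # hypothetical current location

# Rewrite every URL that starts with $old_base so it starts with $new_base.
git -C "$work/old-samba" config "url.$new_base.insteadOf" "$old_base"

# --get-url expands a URL through the insteadOf rules without
# contacting the remote, so this works offline.
rewritten=$(git -C "$work/old-samba" ls-remote --get-url "${old_base}foo.git")
echo "$rewritten"
```

This leaves the URL recorded in the old revision untouched and only changes
where Git actually connects.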

>   This also only affects new checkouts and fetches of changes to the
>   submodules. If the submodule reference doesn't change, there is no
>   need for updates.

My typical setup is the following:

I have a bare repository on my laptop in which I have configured a lot of
remotes; a cron job runs 'git remote update' every few minutes.

Then I have working repositories/checkouts, which are configured like this:

metze@SERNOX14:~/devel/samba/3.X/masterF$ cat .git/objects/info/alternates
/home/metze/devel/samba/samba-bare.git/objects
metze@SERNOX14:~/devel/samba/3.X/masterF$ ls -la .git/refs/
total 64
drwxrws--- 6 metze metze  4096 Nov 16  2013 .
drwxrws--- 9 metze metze  4096 Dec  4 10:21 ..
drwxrws--- 2 metze metze  4096 Apr 21  2009 bisect
drwxrws--- 2 metze metze  4096 Dec  4 10:21 heads
lrwxrwxrwx 1 metze metze    51 Jun  7  2011 remotes -> /home/metze/devel/samba/samba-bare.git/refs/remotes
-rw-rw---- 1 metze metze    41 Nov 16  2013 stash
drwxrws--- 2 metze metze  4096 Apr 29  2010 stash.d
drwxrws--- 4 metze metze 40960 Jun 23 14:09 tags
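
The layout above can be reproduced roughly as follows. This is only a sketch
under assumptions: the scratch directory and the repository names below it are
hypothetical stand-ins for the paths in the listing, and the blob written at
the end exists purely to show that the checkout can read objects stored in the
bare repository.

```shell
#!/bin/sh
set -e

# Hypothetical scratch locations; the real paths are the ones listed above.
work=$(mktemp -d)
bare="$work/samba-bare.git"
checkout="$work/masterF"

git init --quiet --bare "$bare"
git init --quiet "$checkout"

# Borrow objects from the bare repository instead of storing them twice.
echo "$bare/objects" > "$checkout/.git/objects/info/alternates"

# Share the remote-tracking refs that 'git remote update' maintains
# in the bare repository.
mkdir -p "$bare/refs/remotes"
rm -rf "$checkout/.git/refs/remotes"
ln -s "$bare/refs/remotes" "$checkout/.git/refs/remotes"

# Write an object into the bare repository only ...
blob=$(printf 'hello\n' | git --git-dir="$bare" hash-object -w --stdin)

# ... and read it from the checkout through the alternates file.
git -C "$checkout" cat-file -t "$blob"
```

With this arrangement, whatever the bare repository has fetched is usable in
every working checkout without further network access.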

If I remember correctly, this setup wasn't supported with git submodules
the last time we discussed this topic, so I NACKed the proposal.

However, I'm open to reevaluating this, but everything needs to be available
offline after doing a 'git clone git://git.samba.org/samba.git' with
'fetch.recurseSubmodules = yes' configured in ~/.gitconfig. It also needs
to support my workflow...

I just tested this with your repository which has submodules
in the following branch:
https://git.samba.org/?p=jelmer/samba.git;a=shortlog;h=refs/heads/for-review/submodules

metze@SERNOX14:/dev/shm$ git config fetch.recurseSubmodules
yes
metze@SERNOX14:/dev/shm$ git clone git://git.samba.org/jelmer/samba.git
Cloning into 'samba'...
remote: Counting objects: 986939, done.
remote: Compressing objects: 100% (230556/230556), done.
remote: Total 986939 (delta 752585), reused 979504 (delta 745183)
Receiving objects: 100% (986939/986939), 228.08 MiB | 1.78 MiB/s, done.
Resolving deltas: 100% (752585/752585), done.
Checking connectivity... done.
metze@SERNOX14:/dev/shm$ cd samba/
metze@SERNOX14:/dev/shm/samba$ git show 5a0c331b259407896e63267e578efafee879ed4f | grep 'Subproject commit'
+Subproject commit 43c14fd73b3b94211ff8bfad8f894b48cce4e577
metze@SERNOX14:/dev/shm/samba$ git show 43c14fd73b3b94211ff8bfad8f894b48cce4e577
fatal: bad object 43c14fd73b3b94211ff8bfad8f894b48cce4e577

As long as that doesn't work, it gets a NACK from me, sorry.
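
The check in the transcript can be generalised to every submodule reference
in a checkout. The following is a minimal sketch, assuming a scratch
repository in which a gitlink entry is created by hand; the path
third_party/example is hypothetical, and the commit ID is the one from the
'Subproject commit' line above. It reports whether each referenced commit can
be read from the object store of the repository it is run in.

```shell
#!/bin/sh
set -e

# Scratch superproject; nothing is fetched from any remote.
work=$(mktemp -d)
super="$work/super"
git init --quiet "$super"

# Record a submodule reference (mode 160000) in the index without
# having the referenced commit locally.
sha=43c14fd73b3b94211ff8bfad8f894b48cce4e577
git -C "$super" update-index --add --cacheinfo 160000 "$sha" third_party/example

# For every gitlink in the index, test whether its commit is readable here.
report=$(git -C "$super" ls-files --stage |
while read -r mode object stage path; do
    [ "$mode" = 160000 ] || continue
    if git -C "$super" cat-file -e "$object^{commit}" 2>/dev/null; then
        echo "present $object $path"
    else
        echo "missing $object $path"
    fi
done)
echo "$report"
```

Note that this only looks at the object store of the repository it runs in,
plus any alternates configured there. A submodule that has been cloned keeps
its objects in its own repository.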

metze


More information about the samba-technical mailing list