Using submodules for third_party/

Simo simo at samba.org
Mon Dec 8 06:56:00 MST 2014


On Mon, 2014-12-08 at 10:01 +0100, Stefan (metze) Metzmacher wrote:
> Hi Jelmer,
> 
> > At the moment we're manually bundling a bunch of third party libraries
> > in third_party/. Rather than keeping a (usually partial) copy of these
> > libraries in our Git repository, I would propose using git
> > submodules where possible (in other words, where the upstream is using
> > git). Submodules have come a long way since they were originally
> > introduced in Git.
> > 
> > Using submodules would have the following main advantages:
> > 
> > * it is easy to avoid bundled third party libraries by simply not
> >   running 'git submodule init'.
> > 
> > * easy to review updates of upstream revisions we bundle; updating a
> >   submodule shows up as a one-line diff. This means we can be sure
> >   we're using an unmodified upstream revision; at the moment this is
> >   hard to verify. You have to manually pull down a copy to verify that
> >   the changes being made to the copy of a third party library are
> >   the same as in the upstream repo.
> >
> > Some other nice benefits:
> > 
> > * we can be sure we always ship the pristine upstream source, which
> >   is also what the system version would provide
> > 
> > * easy to update, allows killing update-external.sh
> > 
> > * reduces unnecessary growth of our own git repo :)
> > 
> > There are two minor downsides I can think of:
> > 
> > * after checkout, it is necessary to run 'git submodule init' to
> >   do the initial checkout of submodules and then run 'git submodule
> >   update' whenever there are changes to the submodules. This
> >   can be avoided by setting the 'fetch.recurseSubmodules' setting in
> >   Git to 'yes'.
> > 
> > * if the upstream repository is down for some reason, you can't check
> >   out the third party library. We could work around this by hosting
> >   our own clone of third party libraries on git.samba.org.
> > 
> >   That said, I don't think such a workaround is necessary. In the rare
> >   cases that the upstream repository is down, users can always install the
> >   system version of an external library (since we would only use
> >   submodules for third party libraries).
> 
> I can think of the following problem: an upstream project changes the URL
> of its repo. Then you check out an older Samba version after a few years
> in order to track down a customer bug. And the old Samba version still
> references the old upstream URL...
> 
> >   This also only affects new checkouts and fetches of changes to the
> >   submodules. If the submodule reference doesn't change, there is no
> >   need for updates.
> 
> My typical setup is the following:
> 
> I have a bare repository on my laptop where I configured a lot of
> remotes, a cronjob runs 'git remote update' every few minutes.
> 
> Then I have working repositories/checkouts, which are configured like this:
> 
> metze at SERNOX14:~/devel/samba/3.X/masterF$ cat .git/objects/info/alternates
> /home/metze/devel/samba/samba-bare.git/objects
> metze at SERNOX14:~/devel/samba/3.X/masterF$ ls -la .git/refs/
> total 64
> drwxrws--- 6 metze metze  4096 Nov 16  2013 .
> drwxrws--- 9 metze metze  4096 Dec  4 10:21 ..
> drwxrws--- 2 metze metze  4096 Apr 21  2009 bisect
> drwxrws--- 2 metze metze  4096 Dec  4 10:21 heads
> lrwxrwxrwx 1 metze metze    51 Jun  7  2011 remotes -> /home/metze/devel/samba/samba-bare.git/refs/remotes
> -rw-rw---- 1 metze metze    41 Nov 16  2013 stash
> drwxrws--- 2 metze metze  4096 Apr 29  2010 stash.d
> drwxrws--- 4 metze metze 40960 Jun 23 14:09 tags
> 
> If I remember correctly this wasn't supported when using git submodules,
> when we discussed this topic the last time. So I nacked the proposal.
> 
> However, I'm open to reevaluating, but everything needs to be available
> offline after doing a 'git clone git://git.samba.org/samba.git' with
> 'fetch.recurseSubmodules = yes' configured in ~/.gitconfig. And it needs
> to support my workflow...
> 
> I just tested this with your repository which has submodules
> in the following branch:
> https://git.samba.org/?p=jelmer/samba.git;a=shortlog;h=refs/heads/for-review/submodules
> 
> metze at SERNOX14:/dev/shm$ git config fetch.recurseSubmodules
> yes
> metze at SERNOX14:/dev/shm$ git clone git://git.samba.org/jelmer/samba.git
> Cloning into 'samba'...
> remote: Counting objects: 986939, done.
> remote: Compressing objects: 100% (230556/230556), done.
> remote: Total 986939 (delta 752585), reused 979504 (delta 745183)
> Receiving objects: 100% (986939/986939), 228.08 MiB | 1.78 MiB/s, done.
> Resolving deltas: 100% (752585/752585), done.
> Checking connectivity... done.
> metze at SERNOX14:/dev/shm$ cd samba/
> metze at SERNOX14:/dev/shm/samba$ git show 5a0c331b259407896e63267e578efafee879ed4f | grep 'Subproject commit'
> +Subproject commit 43c14fd73b3b94211ff8bfad8f894b48cce4e577
> metze at SERNOX14:/dev/shm/samba$ git show 43c14fd73b3b94211ff8bfad8f894b48cce4e577
> fatal: bad object 43c14fd73b3b94211ff8bfad8f894b48cce4e577
> 
> As long as that doesn't work, it gets a NACK from me, sorry.

Would it be acceptable if you had a git-clone alias/script that did the
right thing?
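
A rough sketch of what such a wrapper could look like, assuming it is
enough to clone with --recurse-submodules and then check that no
submodule was left uninitialized. The function name
clone_with_submodules is made up for illustration and is not an
existing Samba script, and the sketch does not try to cover the
alternates setup described above.

```shell
#!/bin/sh
# Hypothetical helper: the name and behaviour are assumptions for
# illustration, not an existing Samba script.
# Clones a repository and checks out all of its submodules in one step.
clone_with_submodules() {
    url=$1
    dir=$2

    # --recurse-submodules initializes and updates every submodule
    # right after the superproject has been cloned.
    git clone --quiet --recurse-submodules "$url" "$dir" || return 1

    # 'git submodule status' prefixes submodules that are not
    # initialized with '-', so treat any such line as a failure.
    if git -C "$dir" submodule status --recursive | grep -q '^-'; then
        echo "some submodules were not checked out" >&2
        return 1
    fi
}
```

It would be called along the lines of
'clone_with_submodules git://git.samba.org/samba.git samba'. Whether
the result fits the bare-repository workflow would still need testing.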

Simo.




More information about the samba-technical mailing list