Using submodules for third_party/
Simo
simo at samba.org
Mon Dec 8 06:56:00 MST 2014
On Mon, 2014-12-08 at 10:01 +0100, Stefan (metze) Metzmacher wrote:
> Hi Jelmer,
>
> > At the moment we're manually bundling a bunch of third party libraries
> > in third_party/. Rather than keeping a (usually partial) copy of these
> > libraries in our Git repository, I would propose using git
> > submodules where possible (in other words, where the upstream is using
> > git). Submodules have come a long way since they were originally
> > introduced in Git.
> >
> > Using submodules would have the following advantages. Mainly:
> >
> > * it is easy to avoid bundled third party libraries by simply not
> > running 'git submodule init'.
> >
> > * easy to review updates of upstream revisions we bundle; updating a
> > submodule shows up as a one-line diff. This means we can be sure
> > we're using an unmodified upstream revision; at the moment this is
> > hard to verify. You have to manually pull down a copy to verify that
> > the changes being made to the copy of a third party library are
> > the same as in the upstream repo.
> >
> > Some other nice benefits:
> >
> > * we're sure we always ship the pristine upstream source, which is
> > what the system version would provide too
> >
> > * easy to update, allows killing update-external.sh
> >
> > * reduces unnecessary growth of our own git repo :)
> >
> > There are two minor downsides I can think of:
> >
> > * after checkout, it is necessary to run 'git submodule init' to
> > do the initial checkout of submodules and then run 'git submodule
> > update' whenever there are changes to the submodules. This
> > can be avoided by setting the 'fetch.recurseSubmodules' setting in
> > Git to 'yes'.
> >
> > * if the upstream repository is down for some reason, you can't check
> > out the third party library. We could work around this by hosting
> > our own clone of third party libraries on git.samba.org.
> >
> > That said, I don't think such a workaround is necessary. In the rare
> > cases that the upstream repository is down, users can always install the
> > system version of an external library (since we would only use
> > submodules for third party libraries).
>
> I can think of the following problem: an upstream project changes the URL
> of its repo. Then you check out an older Samba version after a few years
> in order to track down a customer bug, and the old Samba version still
> references the old upstream URL...
>
> > This also only affects new checkouts and fetches of changes to the
> > submodules. If the submodule reference doesn't change, there is no
> > need for updates.
>
> My typical setup is the following:
>
> I have a bare repository on my laptop where I configured a lot of
> remotes, a cronjob runs 'git remote update' every few minutes.
>
> Then I have working repositories/checkouts, which are configured like this:
>
> metze at SERNOX14:~/devel/samba/3.X/masterF$ cat .git/objects/info/alternates
> /home/metze/devel/samba/samba-bare.git/objects
> metze at SERNOX14:~/devel/samba/3.X/masterF$ ls -la .git/refs/
> total 64
> drwxrws--- 6 metze metze 4096 Nov 16 2013 .
> drwxrws--- 9 metze metze 4096 Dec 4 10:21 ..
> drwxrws--- 2 metze metze 4096 Apr 21 2009 bisect
> drwxrws--- 2 metze metze 4096 Dec 4 10:21 heads
> lrwxrwxrwx 1 metze metze 51 Jun 7 2011 remotes -> /home/metze/devel/samba/samba-bare.git/refs/remotes
> -rw-rw---- 1 metze metze 41 Nov 16 2013 stash
> drwxrws--- 2 metze metze 4096 Apr 29 2010 stash.d
> drwxrws--- 4 metze metze 40960 Jun 23 14:09 tags
>
> If I remember correctly, this setup wasn't supported with git submodules
> when we last discussed this topic, so I NACKed the proposal.
>
> However, I'm open to reevaluating, but everything needs to be available
> offline after doing a 'git clone git://git.samba.org/samba.git' with
> 'fetch.recurseSubmodules = yes' configured in ~/.gitconfig. And it needs
> to support my workflow...
>
> I just tested this with your repository which has submodules
> in the following branch:
> https://git.samba.org/?p=jelmer/samba.git;a=shortlog;h=refs/heads/for-review/submodules
>
> metze at SERNOX14:/dev/shm$ git config fetch.recurseSubmodules
> yes
> metze at SERNOX14:/dev/shm$ git clone git://git.samba.org/jelmer/samba.git
> Cloning into 'samba'...
> remote: Counting objects: 986939, done.
> remote: Compressing objects: 100% (230556/230556), done.
> remote: Total 986939 (delta 752585), reused 979504 (delta 745183)
> Receiving objects: 100% (986939/986939), 228.08 MiB | 1.78 MiB/s, done.
> Resolving deltas: 100% (752585/752585), done.
> Checking connectivity... done.
> metze at SERNOX14:/dev/shm$ cd samba/
> metze at SERNOX14:/dev/shm/samba$ git show 5a0c331b259407896e63267e578efafee879ed4f | grep 'Subproject commit'
> +Subproject commit 43c14fd73b3b94211ff8bfad8f894b48cce4e577
> metze at SERNOX14:/dev/shm/samba$ git show 43c14fd73b3b94211ff8bfad8f894b48cce4e577
> fatal: bad object 43c14fd73b3b94211ff8bfad8f894b48cce4e577
>
> As long as that doesn't work, it gets a NACK from me, sorry.
Would it be acceptable if you had a git-clone alias/script that did the
right thing?
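
Something along these lines is what I have in mind. It is only a rough
sketch: the function name clone_with_submodules and the SAMBA_REFERENCE
variable are made up for illustration, and I have not checked whether
--reference is enough to cover the alternates/symlink setup described
above.

```shell
#!/bin/sh
# Hypothetical helper: clone a repository and populate its submodules
# in one step.
# Usage: clone_with_submodules URL [DIRECTORY]
# If SAMBA_REFERENCE (a made-up variable) names a local repository,
# its objects are borrowed via --reference instead of being downloaded.
clone_with_submodules() {
    url=${1:-}
    if [ -z "$url" ]; then
        echo "usage: clone_with_submodules URL [DIRECTORY]" >&2
        return 2
    fi
    # Default target directory: last path component without ".git".
    dir=${2:-$(basename "$url" .git)}

    if [ -n "${SAMBA_REFERENCE:-}" ]; then
        git clone --recursive --reference "$SAMBA_REFERENCE" \
            "$url" "$dir" || return 1
    else
        git clone --recursive "$url" "$dir" || return 1
    fi

    # Print the commit each submodule is checked out at (prints nothing
    # if the repository has no submodules).
    git -C "$dir" submodule status
}
```

After sourcing the file you would run, for example,
'clone_with_submodules git://git.samba.org/samba.git'. Note that the
objects of a submodule are stored separately from those of the
superproject, so a submodule commit would have to be inspected from
inside the submodule's directory rather than with 'git show' at the top
level.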
Simo.