[clug] database backups with git

Ian Munsie darkstarsword at gmail.com
Mon Jun 11 22:37:00 MDT 2012

> It was OK but, as others have pointed out, trimming the backups isn't
> one of git's strong points.

I was just thinking about this problem, which stems from the fact that
git will not throw away any data that it still holds a reference to.
This is what I came up with to remove all but the last $trim commits
from history to allow them to be garbage collected:

% trim=10
% cutoff=$(git show --format=%H HEAD~${trim}|head -n 1)
% git filter-branch -f --parent-filter "sed 's/-p ${cutoff}//'" $cutoff..HEAD
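
As a rough sketch, the two commands above could be wrapped in a shell
function with a guard for histories that are already short enough.
The name trim_history is made up for illustration, and the sketch
assumes a linear history (no merges), as a backup branch would have.
FILTER_BRANCH_SQUELCH_WARNING only silences the deprecation notice
that newer git versions print; older versions ignore it.

```shell
# Hypothetical helper: keep only the last $1 commits of the current branch.
# Assumes a linear history; uses plain (global) shell variables.
trim_history() {
    trim=$1
    # Refuse to run unless there are more than $trim commits to begin with.
    total=$(git rev-list --count HEAD) || return 1
    if [ "$total" -le "$trim" ]; then
        echo "only $total commits, nothing to trim" >&2
        return 1
    fi
    # rev-parse prints just the commit id of the cutoff commit.
    cutoff=$(git rev-parse "HEAD~${trim}") || return 1
    # Remove the cutoff commit from the parent list of the commit after it,
    # so that commit becomes the new root of the rewritten history.
    FILTER_BRANCH_SQUELCH_WARNING=1 \
        git filter-branch -f --parent-filter "sed 's/-p ${cutoff}//'" "${cutoff}..HEAD"
}
```

Calling "trim_history 10" in a repository with more than 10 commits
should leave "git rev-list --count HEAD" reporting 10.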

This creates a new history that looks like it began $trim commits ago.
At this point all the old history actually still exists in the
repository (git cannot change existing commits as that would change
their SHA1, so editing a commit always creates a new commit).
To remove the old history we have to first remove all the references to
it before git will consider deleting it. If nothing else it will have
a backup reference created by filter-branch that we need to remove, as
well as the reference logs of HEAD and master:

% rm -fr .git/refs/original .git/logs
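
If you would rather not delete files under .git by hand, the same
cleanup can be sketched with git's own commands. This variant also
catches backup refs that have been moved into .git/packed-refs. The
function name drop_old_refs is hypothetical.

```shell
# Hypothetical helper: drop the backup refs and reflogs that still pin old history.
drop_old_refs() {
    # Delete every backup ref that filter-branch left under refs/original/.
    git for-each-ref --format='%(refname)' refs/original/ |
    while read -r ref; do
        git update-ref -d "$ref"
    done
    # Expire all reflog entries immediately, for every ref.
    git reflog expire --expire=now --all
}
```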

If you have other references to the old history, such as branches or
tags, they will also need to be removed or updated to point to the new
history.
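
One way to find such leftover references is to ask git which branches
and tags still contain the old cutoff commit. This is only a sketch:
refs_holding is a made-up name, and it expects the commit id that was
stored in $cutoff before the rewrite.

```shell
# Hypothetical helper: print branches and tags whose history still includes commit $1.
refs_holding() {
    # Local and remote-tracking branches that can still reach the commit.
    git branch --all --contains "$1"
    # Tags that can still reach the commit.
    git tag --contains "$1"
}
```

Anything printed by "refs_holding $cutoff" is still keeping the old
history alive and would need to be deleted or moved.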

And finally, tell git to do a garbage collection to actually delete
the old history, telling it to prune everything (by default it will
only remove unreferenced objects that are older than 2 weeks):

% git gc --prune=now
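
To check that nothing was left behind, git fsck can list objects that
no branch or tag reaches. The helper below is a sketch with a made-up
name. It passes --no-reflogs so that objects kept alive only by a
reflog are counted too, which helps spot reflogs that were missed.

```shell
# Hypothetical helper: count objects that no ref can reach.
# A result of 0 is expected once the old history has been pruned.
count_unreachable() {
    # fsck prints one "unreachable <type> <id>" line per such object.
    git fsck --unreachable --no-reflogs 2>/dev/null | grep -c '^unreachable'
}
```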

To test if this worked I did:

% git init test
% cd test
% for x in $(seq 100); do dd if=/dev/urandom of=foo bs=1M count=1; git add foo; git commit -m r$x; done
% du -sh .
104M    .
... above commands ...
% du -sh .


