(Synchronization among clients with history)

Matt McCutchen matt at mattmccutchen.net
Tue Jan 13 01:27:20 GMT 2009


On Sat, 2009-01-10 at 15:01 -0600, Jeff Allen wrote:
>  I'm looking to build a rough implementation of a multi-client
> rdiff-backup system; in order to do this I'm using rsync before
> rdiff-backup.
> 
> (We'll say there's a server, Client A, and Client B. Files should be
> synced between A and B but the server should keep a master list of all
> differences and changes made in any file, by any client in the
> directory I'm syncing).
> Essentially, I envision that syncing client A would go something like
> this:
> 1. Rsync down from the server to Client A to ensure that any files
> newly created by Client B (which would already have been uploaded to
> the server via rdiff-backup) are added to the local directory on
> Client A.
> 2. Rdiff-backup from Client A to the server. This will not create new
> increments for the freshly downloaded files created by Client B, as
> the modification times are equal. However, it will upload any files
> created or edited on Client A since the last sync.
 
Do I understand correctly that you're taking advantage of the fact that
rdiff-backup leaves the latest files in an ordinary tree that you can
read via rsync, provided that you pass --exclude=/rdiff-backup-data?

>  However, I will run into problems when I delete a file.
> If I delete a file off of either client, the file will be un-deleted
> when I rsync down in step one, as the file would still exist on the
> server. But if I use rsync --del, it would just delete any and all new
> files created on a client since the last sync.
> 
> The best solution I can envision is to write a shell script (or modify
> the rsync source) which would alter step 1 above to the following:
> 
> global variable lastSync; //last synchronization for this client
> function syncFile(file, modifiedDate){
>   if (modifiedDate > lastSync){
>      //this must be a new file created from another client.
>      download the file from the server
>   }
>   else{
>      //the file has been deleted on the client since the last sync; delete it.
>      delete the file.
>   }
> }
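
The quoted pseudocode can be written out as a small Python sketch. This
is only an illustration of the proposed heuristic, not something rsync
provides: the names Action and decide_missing_file are hypothetical, and
the sketch assumes the function is called only for files that exist on
the server but are absent on the client.

```python
from enum import Enum


class Action(Enum):
    DOWNLOAD = "download"        # file is new on the server; fetch it
    TREAT_AS_DELETED = "skip"    # file was deleted locally; do not restore it


def decide_missing_file(server_mtime: float, last_sync: float) -> Action:
    """Classify a file that exists on the server but not on this client.

    Both arguments are POSIX timestamps (seconds since the epoch).
    last_sync is the time of this client's last synchronization.
    """
    if server_mtime > last_sync:
        # Modified after our last sync: assume another client created it.
        return Action.DOWNLOAD
    # Not modified since our last sync: assume we deleted it locally.
    return Action.TREAT_AS_DELETED
```

Note that the decision rests entirely on modification times. A file
that another client uploads with a preserved mtime older than lastSync
would be classified as a local deletion, so the heuristic can lose data.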
 
It just so happens that I had a similar need a few years ago (but
without the need to save history) and made a similar proposal as my
first rsync bug:

https://bugzilla.samba.org/show_bug.cgi?id=2094

Wayne wisely advised me to use a real two-way synchronization tool such
as unison ( http://www.cis.upenn.edu/~bcpierce/unison/ ) instead, and I
would give you the same advice.  But what makes your case more difficult
is that you don't want to write directly to the rdiff-backup dir with
unison, since changes made that way would bypass rdiff-backup's history.

If unison had an option to propagate changes in one direction and skip
any changes detected in the other direction, you could use that in step
1 and count on the next run of unison to recognize the changes made by
rdiff-backup as convergent.  Unfortunately, unison has no such option,
though you may be able to rig up a script to accomplish this in unison's
interactive mode.

Alternatively, you could introduce an intermediate directory containing
another copy of the data (which could be on either each client or the
server) and use the following procedure:

1. Rsync from rdiff-backup dir to intermediate dir.
2. Synchronize intermediate dir with client via unison.
3. Back up intermediate dir to rdiff-backup dir.

But this uses extra space.
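
The three-step procedure above might be scripted roughly as follows.
This is a sketch under assumptions, not a tested setup: the function
names and example paths are hypothetical, all three directories are
assumed to be reachable as local paths, and the use of --delete in step
1 (so that the intermediate dir mirrors deletions made by other clients)
is my assumption.  With -batch, unison runs without prompting and skips
conflicting changes.

```python
import subprocess
from typing import List


def build_sync_commands(backup_dir: str, intermediate_dir: str,
                        client_dir: str) -> List[List[str]]:
    """Return the three commands for one sync cycle, in order."""
    return [
        # 1. Mirror the current tree out of the rdiff-backup dir,
        #    skipping rdiff-backup's own metadata directory.
        ["rsync", "-a", "--delete", "--exclude=/rdiff-backup-data",
         backup_dir.rstrip("/") + "/", intermediate_dir.rstrip("/") + "/"],
        # 2. Two-way synchronization between the intermediate copy
        #    and the client's working directory.
        ["unison", intermediate_dir, client_dir, "-batch"],
        # 3. Record the merged result as a new rdiff-backup increment.
        ["rdiff-backup", intermediate_dir, backup_dir],
    ]


def run_sync(backup_dir: str, intermediate_dir: str, client_dir: str) -> None:
    """Run the cycle, stopping at the first command that fails."""
    for cmd in build_sync_commands(backup_dir, intermediate_dir, client_dir):
        subprocess.run(cmd, check=True)
```

For example, run_sync("/srv/backup", "/srv/stage", "/home/a/data") would
run the three steps in sequence.  Stopping on the first failure keeps a
failed rsync or unison run from being recorded as a backup increment.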

Given your requirements for both history and synchronization, you may be
better served by using a full version-control tool in place of both
rdiff-backup and unison.  My personal favorite is git
( http://git.or.cz/ ).  The downside is that you'll have to jump through
extra hoops if you care about file attributes.  See this thread for some
ideas (written with reference to git but may apply to other tools too):

http://www.gelato.unsw.edu.au/archives/git/0612/index.html#34154

I hope one of these approaches works for you.  If not, give me some more
information and I will see if I can come up with anything else.

-- 
Matt



More information about the rsync mailing list