batch-mode fixes [was: [PATCH] fix read-batch SEGFAULT]

Alberto Accomazzi aaccomazzi at cfa.harvard.edu
Tue May 18 14:06:52 GMT 2004


Chris Shoemaker wrote:

> 	Indeed, what you describe seems to have been the design motivation.  I
> can share what my desired application is: I want to create a mirror of a
> public server onto my local machine which is physically disconnected from the
> Internet, and keep it current.  So, I intend to first rsync update my own copy
> which _is_ networked while creating the batch set.  Then I can sneakernet the
> batch set to the unnetworked machine and use rsync --read-batch to update it. 
> This keeps the batch sets smallish even though the mirror is largish. 

This was something I looked into a couple of years ago.  Back then I 
even posted an email to the list 
(http://lists.samba.org/archive/rsync/2002-August/003433.html) and got 
no feedback, which led me to conclude that people were not doing any of 
this at the time.  To restate the obvious, batch mode is really just a 
glorified diff/patch operation.  The problem I have with it is that 
AFAICT it's a very fragile one: changing a single file on either the 
sender or the receiver after the batch has been created invalidates 
the batch.  Contrast this with diff/patch, which has built-in measures 
to account for fuzzy matches and is therefore a much more robust tool.
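
To make the fragility concrete, here is a toy model in Python.  It is 
not rsync's batch format; make_delta and apply_delta are hypothetical 
helpers that record a delta as "copy this byte range of the old file" 
plus literal data, which is roughly the idea.  The point is only that 
such a delta encodes offsets into one specific old file, so replaying 
it against a file that has drifted gives the wrong result.

```python
from difflib import SequenceMatcher


def make_delta(basis: bytes, target: bytes) -> list:
    """Describe target as copies of basis byte ranges plus literal data."""
    ops = []
    matcher = SequenceMatcher(None, basis, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            # Bytes already present in the basis: store offsets only.
            ops.append(("copy", i1, i2))
        elif j2 > j1:
            # Inserted or replaced bytes: store them literally.
            ops.append(("data", target[j1:j2]))
    return ops


def apply_delta(basis: bytes, ops: list) -> bytes:
    """Replay a delta against a basis; no check that the basis is the right one."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            out += basis[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)


old = b"alpha\nbeta\ngamma\n"
new = b"alpha\nbeta\ndelta\ngamma\n"
delta = make_delta(old, new)

print(apply_delta(old, delta) == new)      # True: basis is what the delta expects

drifted = b"ALPHA!\nbeta\ngamma\n"         # receiver's copy changed after the delta was made
print(apply_delta(drifted, delta) == new)  # False: offsets no longer line up
```

patch(1), by contrast, locates each hunk by its surrounding context 
lines, which is what lets it tolerate some drift.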

In the end my motivation for using the rsync-via-sneakernet approach 
disappeared when I convinced myself that the whole operation would be 
far too unreliable, at least for our application, where files are 
updated all the time and there is never really a "freeze" of a release 
against which a batch file can be created.  I won't go as far as saying 
that the feature is useless, but I would caution people that they need 
to understand the assumptions on which this use of rsync is based.  
Also, I would suggest checking out other diff/patch tools such as 
rdiff-backup or xdelta.

> 	BTW, there is a work-around.  If you don't mind duplicating the mirror
> twice, one solution is to do a regular (no --write-batch) rsync update of one
> copy of the mirror, and then do the --write-batch during a local to local
> rsync update of another copy of the mirror.  Actually, this has some real
> advantages if your network connection is unreliable. 

This is really the only circumstance under which I would even consider 
using batch mode.  There should also be safeguards built into the batch 
mode operation to guarantee that the source files to which the batch is 
applied are in the state we expect them to be.  I wouldn't otherwise 
want rsync to touch my files.
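
As a rough sketch of the kind of safeguard I mean, done outside rsync: 
record a checksum manifest of the tree the batch is created against, 
carry it along with the batch, and refuse to apply the batch unless the 
destination tree still matches.  The function names and the choice of 
SHA-256 below are my own assumptions, not anything rsync provides.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """SHA-256 of a file, read in chunks so large files are fine."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root: Path) -> dict:
    """Map each regular file's path (relative to root) to its digest."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): file_digest(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def check_manifest(root: Path, manifest: dict) -> list:
    """Return a sorted list of differences; an empty list means the tree matches."""
    current = build_manifest(root)
    problems = []
    for name, digest in manifest.items():
        if name not in current:
            problems.append(f"missing: {name}")
        elif current[name] != digest:
            problems.append(f"changed: {name}")
    for name in current.keys() - manifest.keys():
        problems.append(f"unexpected: {name}")
    return sorted(problems)
```

The manifest would have to be built on the second copy of the mirror 
just before the local-to-local --write-batch run, and checked on the 
unnetworked machine just before --read-batch.  If check_manifest 
returns anything at all, the batch should not be applied.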

> 	Thanks for your input.

Likewise.  Good luck...

-- Alberto


********************************************************************
Alberto Accomazzi                      aaccomazzi(at)cfa harvard edu
NASA Astrophysics Data System                        ads.harvard.edu
Harvard-Smithsonian Center for Astrophysics      www.cfa.harvard.edu
60 Garden St, MS 31, Cambridge, MA 02138, USA
********************************************************************


