batch-mode fixes [was: [PATCH] fix read-batch SEGFAULT]

Tue May 18 03:10:57 GMT 2004

On Mon, May 17, 2004 at 05:18:10PM -0400, Chris Shoemaker wrote:
> The "knowledge" or "memory" of that exact state is more likely to
> reside with the receiver (who just left that state) than with the
> sender (who may never have been in that state).  Therefore it is more
> likely to be useful to the receiver than to sender.

This is only true if you imagine a receiver doing one pull and then
forwarding the update on to multiple hosts.  For instance, you could
use a pull to create the batch files and then make them available
for people to download, which would help to take load off the
original server.  That said, I think most of the time a receiver is
going to be a leaf node, so the server tends to be the place where
a batch is more likely to be useful, IMO.

Thinking about batch mode, its restrictions seem to make it useful
in only a very small set of circumstances.  Since the receiving
systems must all have identical starting hierarchies, that
requirement really does limit how often it can be used.

I'm wondering if batch mode should be removed from the main rsync
release and relegated to a parallel project?  It seems to me that a
better feature for the mainstream utility would be something that
optimized away some of the load on the sending system when it is
serving lots of users.  So, having the ability to cache a directory
tree's information, and the ability to cache checksums for files
would be useful (especially if the data were auto-updated as it
became stale).  That would make all transfers more efficient,
regardless of what files the receiving system started from.
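
To make the checksum-caching idea concrete, here is a rough sketch
in Python.  It is not anything rsync implements: the ChecksumCache
class, its method names, and the choice of SHA-256 as the checksum
are all assumptions made for illustration.  An entry is treated as
stale, and recomputed on demand, whenever the file's size or
modification time no longer matches what was recorded.

```python
import hashlib
import os


class ChecksumCache:
    """Hypothetical sender-side cache mapping a path to its checksum.

    Each entry remembers the (mtime_ns, size) seen when the checksum
    was computed; a mismatch marks the entry stale.
    """

    def __init__(self):
        self._entries = {}   # path -> ((mtime_ns, size), hex digest)
        self.recomputed = 0  # number of times a file was actually read

    def checksum(self, path):
        st = os.stat(path)
        key = (st.st_mtime_ns, st.st_size)
        entry = self._entries.get(path)
        if entry is not None and entry[0] == key:
            return entry[1]  # cache hit: no file I/O beyond stat()
        digest = self._hash_file(path)
        self._entries[path] = (key, digest)
        self.recomputed += 1
        return digest

    @staticmethod
    def _hash_file(path, chunk_size=64 * 1024):
        # SHA-256 is a stand-in here, not rsync's own checksum.
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
        return h.hexdigest()
```

With something like this on the sending side, repeated requests for
an unchanged file cost one stat() call each instead of a full read.
A change that preserves both size and mtime would go unnoticed, so
a real design would need a stronger staleness check.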

Such a new feature would probably best be added to an rsync
replacement project, though.

..wayne..