batch-mode fixes [was: [PATCH] fix read-batch SEGFAULT]
Chris Shoemaker
c.shoemaker at cox.net
Tue May 18 20:30:27 GMT 2004
On Tue, May 18, 2004 at 11:11:51AM -0400, Alberto Accomazzi wrote:
>
> Wayne Davison wrote:
> >
<snip>
> >I'm wondering if batch mode should be removed from the main rsync
> >release and relegated to a parallel project? It seems to me that a
> >better feature for the mainstream utility would be something that
> >optimized away some of the load on the sending system when it is
> >serving lots of users. So, having the ability to cache a directory
> >tree's information, and the ability to cache checksums for files
> >would be useful (especially if the data was auto-updated as it
> >became stale). That would make all transfers more optimal,
> >regardless of what files the receiving system started from.
>
> First of all, I have a feeling that the number of people who have
> *considered* using batch mode is quite small, and those who actually
> have used it in the recent past is certainly an even smaller number (I'm
> thinking zero, actually). So removing the functionality from the
/me waves his hand frantically.
One, here. :-)
> mainstream rsync would not be a problem, in fact I think it would be a
> good thing. It doesn't make sense to keep something in the code that is
> not used and cannot be reliably supported. Although I applaud Jos's
> efforts in providing this functionality to rsync, I was surprised to see
Jos did that? Good job!
> it included in the main distribution, especially since it underwent
> virtually no testing as far as I can tell.
>
> There's no doubt that caching the file list on the server side would
> indeed be a very useful feature for all those who use rsyncd as a
> distribution method. We all know how difficult it can be to reliably
> rsync a large directory tree because of the memory and I/O costs in
> keeping a huge filelist in memory. This may best be done by creating a
> separate helper application (say rsyncd-cache or such) that can be run
> on a regular basis to create a cached version of a directory tree
> corresponding to an rsyncd "module" on the server side. The trick in
> getting this right will be to separate out the client-supplied options
> concerning file selection, checksumming, etc, so that the cache is as
> general as possible and can be used for a large set of connections so as
> to minimize the number of times that the actual filesystem is scanned.
What client options are you thinking will be tricky? Wouldn't the
helper app just cache _all_ the metadata for the module, and then rsync would
query only the subset it needed? It's not like the client can change the
checksum stride. [That would hurt.]
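
To make the "cache everything, query a subset" idea concrete, here is a
minimal sketch in Python. It is only an illustration under stated
assumptions: the function names (build_module_cache, query_cache), the
choice of cached fields, and the glob-style selection are all hypothetical
and do not correspond to anything in rsync or its protocol.

```python
import fnmatch
import os
import tempfile


def build_module_cache(module_root):
    """Walk the module tree once and record metadata for every file entry.

    Hypothetical helper: keys are paths relative to the module root,
    values hold a few stat fields a file list might need.
    """
    cache = {}
    for dirpath, _dirnames, filenames in os.walk(module_root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)  # lstat: do not follow symlinks
            rel = os.path.relpath(full, module_root).replace(os.sep, "/")
            cache[rel] = {
                "size": st.st_size,
                "mtime": int(st.st_mtime),
                "mode": st.st_mode,
            }
    return cache


def query_cache(cache, include_patterns):
    """Return the cached paths matching any client-supplied pattern.

    The filesystem is not touched here; only the cache is consulted.
    Note that fnmatch's "*" also matches "/", unlike rsync's own rules.
    """
    return sorted(
        path
        for path in cache
        if any(fnmatch.fnmatch(path, pat) for pat in include_patterns)
    )


if __name__ == "__main__":
    # Tiny demonstration on a throwaway directory.
    with tempfile.TemporaryDirectory() as root:
        os.makedirs(os.path.join(root, "docs"))
        with open(os.path.join(root, "docs", "a.txt"), "w") as f:
            f.write("hello")
        with open(os.path.join(root, "b.bin"), "w") as f:
            f.write("xy")
        cache = build_module_cache(root)
        print(query_cache(cache, ["docs/*"]))
```

The point of the sketch is that the scan happens once per cache refresh,
while each connection only filters the already-collected entries. Keeping
the cache fresh as files change is not addressed here.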
-chris
>
> >Such a new feature would probably best be added to an rsync
> >replacement project, though.
>
> Hmmm... "replacement"? why not make this a utility that can be run
> alongside an rsync daemon? Or are you thinking of a design for a "new"
> rsync?
>
>
> -- Alberto
>
>
> ********************************************************************
> Alberto Accomazzi aaccomazzi(at)cfa harvard edu
> NASA Astrophysics Data System ads.harvard.edu
> Harvard-Smithsonian Center for Astrophysics www.cfa.harvard.edu
> 60 Garden St, MS 31, Cambridge, MA 02138, USA
> ********************************************************************
>