Improving the rsync protocol (RE: Rsync dies)

Phil Howard phil-rsync at ipal.net
Mon May 20 08:04:01 EST 2002


On Fri, May 17, 2002 at 01:42:31PM -0700, Wayne Davison wrote:

| On Fri, 17 May 2002, Allen, John L. wrote:
| > In my humble opinion, this problem with rsync growing a huge memory
| > footprint when large numbers of files are involved should be #1 on
| > the list of things to fix.
| 
| I have certainly been interested in working on this issue.  I think it
| might be time to implement a new algorithm, one that would let us
| correct a number of flaws that have shown up in the current approach.

OTOH, this mode of operation also needs to be retained.  While I certainly
would love to have an rsync that can keep millions of files in sync all at
once, I also have need for an rsync that can readily detect files being
moved around.  There are obvious difficulties in combining those needs,
so it should be a deployment issue to decide what to use.

If a subtree is moved from one place to another within the tree being
synchronized, I would like to be able to detect, on the basis of checksums,
that files have been moved.  This also would extend to hardlinks being
added.  If I move a directory of 10 large ISO images to a different name
then I would like for rsync to detect this, if it has been instructed to
collect full checksums for all files.  Then those 10 large ISO images
would just be relinked on the destination side, instead of redundantly
transferred.
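
To make the idea concrete, here is a rough sketch in Python of what I
mean.  It is not how rsync works today, and every name in it
(file_digest, index_tree, plan_sync, apply_relinks) is made up for
illustration.  It assumes both sides can produce a full checksum per
file, uses SHA-256 only as a stand-in for whatever checksum is chosen,
and only relinks when the new path does not yet exist on the
destination.  Deleting the old names afterwards is left out.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Full-file checksum, read in chunks so large ISO images fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()


def index_tree(root):
    """Map relative path -> digest for every regular file under root."""
    index = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            if os.path.isfile(full) and not os.path.islink(full):
                index[os.path.relpath(full, root)] = file_digest(full)
    return index


def plan_sync(src_index, dst_index):
    """Split the work into relinks (content already on dst) and transfers."""
    by_digest = {}
    for rel, digest in dst_index.items():
        by_digest.setdefault(digest, rel)

    relink, transfer = [], []
    for rel, digest in sorted(src_index.items()):
        if dst_index.get(rel) == digest:
            continue  # already in sync at the same path
        if rel not in dst_index and digest in by_digest:
            # Same content exists elsewhere on dst: link it, do not resend.
            relink.append((by_digest[digest], rel))
        else:
            transfer.append(rel)
    return relink, transfer


def apply_relinks(dst_root, relink):
    """Create the hard links on the destination side."""
    for old_rel, new_rel in relink:
        new_full = os.path.join(dst_root, new_rel)
        os.makedirs(os.path.dirname(new_full) or ".", exist_ok=True)
        os.link(os.path.join(dst_root, old_rel), new_full)
```

The cost is obvious: both sides have to read every file in full and
hold the whole index in memory, which is exactly why this has to be an
option and not the default for trees with millions of files.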


| As for who spawns the receiver, it would be nice if this was done by the
| sender (so they could work alone), but an alternative would be to have
| the generator spawn the receiver and then let the receiver hook up
| with the sender via the existing ssh connection.

What if it is not via ssh?  You're not thinking of trying to use extra ssh
channels for this, are you?  To be clean, IMHO, any solution should work
over any transparent stream transport without having to know what that
transport is once it is established.  Much of the use I make of rsync is
not over ssh.
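
As a sketch of what "transport-agnostic" means to me: the protocol code
should only ever see a byte stream it can read from and write to.  The
Python below is purely illustrative and is not rsync's wire format.
The length-prefixed framing and the names send_msg, recv_msg, and
read_exact are my own assumptions.  The point is that nothing in it
knows whether the stream is a socket, a pipe to ssh or rsh, or
something else.

```python
import struct


def read_exact(stream, n):
    """Read exactly n bytes from a binary stream or raise EOFError."""
    buf = bytearray()
    while len(buf) < n:
        chunk = stream.read(n - len(buf))
        if not chunk:
            raise EOFError("stream closed mid-message")
        buf.extend(chunk)
    return bytes(buf)


def send_msg(stream, payload):
    """Write one message: 4-byte big-endian length, then the payload."""
    stream.write(struct.pack("!I", len(payload)))
    stream.write(payload)
    stream.flush()


def recv_msg(stream):
    """Read one length-prefixed message from the stream."""
    (length,) = struct.unpack("!I", read_exact(stream, 4))
    return read_exact(stream, length)
```

The same two functions work unchanged on the file objects from
socket.makefile(), on a subprocess's stdin/stdout pipes, or on an
in-memory buffer for testing.  Any extra channel the receiver needs
would have to be multiplexed inside that one stream, not opened on the
side.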

-- 
-----------------------------------------------------------------
| Phil Howard - KA9WGN |   Dallas   | http://linuxhomepage.com/ |
| phil-nospam at ipal.net | Texas, USA | http://phil.ipal.org/     |
-----------------------------------------------------------------