Adding support for versioned files in rsync

jw schultz jw at pegasys.ws
Tue Oct 14 06:18:26 EST 2003


On Mon, Oct 13, 2003 at 03:43:35PM -0400, Jason M. Felice wrote:
> On Mon, Oct 13, 2003 at 10:44:52AM -0700, jw schultz wrote:
> > Try dirvish or one of the other backup systems already out
> > there.
> 
> dirvish looks interesting.  One of the requirements that I now realize I
> didn't write into the proposal was the ability to store only a single
> copy of duplicate files... duplicate as determined by file contents, not
> naming or inode or anything.  (Rationale: will likely need to back up
> large numbers of clients with lots of files in common.)
> 
> Dirvish does not appear to have that capability, but it *does* appear to
> solve the right problem.  Extending Dirvish to make hard links from the
> image into a directory tree based on SHA1 hash might work to implement
> this.
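
As a rough illustration of the hard-link idea quoted above, here is a
minimal Python sketch.  It is an assumption-laden example, not Dirvish or
rsync code: the function names (sha1_of, dedupe_image), the two-character
pool subdirectories, and the ".dedupe-tmp" suffix are all hypothetical.
It assumes the image and the pool live on the same filesystem (hard links
cannot cross filesystems) and that the pool is outside the image tree.

```python
import hashlib
import os


def sha1_of(path, chunk_size=1 << 20):
    """Return the hex SHA1 digest of a file's contents, read in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def dedupe_image(image_root, pool_root):
    """Hard-link every regular file under image_root into a hash-named pool.

    Files whose contents are already in the pool are replaced by a hard
    link to the pooled copy.  Returns the number of files replaced.
    Hypothetical helper: pool_root must not be inside image_root.
    """
    replaced = 0
    for dirpath, _dirnames, filenames in os.walk(image_root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            digest = sha1_of(path)
            pool_dir = os.path.join(pool_root, digest[:2])
            pooled = os.path.join(pool_dir, digest)
            os.makedirs(pool_dir, exist_ok=True)
            if not os.path.exists(pooled):
                # First file seen with this content: it becomes the pool entry.
                os.link(path, pooled)
            elif not os.path.samefile(path, pooled):
                # Duplicate content: link to the pooled copy under a
                # temporary name, then rename over the original.
                tmp = path + ".dedupe-tmp"
                os.link(pooled, tmp)
                os.replace(tmp, path)
                replaced += 1
    return replaced
```

Note that hard-linked files share one inode, so they also share owner,
permissions and timestamps.  A real implementation would have to decide
what to do when two files have identical contents but different metadata.
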

That would be better accomplished by adding an extended
rename detection to rsync.  I say extended because a normal
rename detection would depend on leveraging deletes.

The question that has to be addressed is whether or not the
added complexity is worthwhile for the space and bandwidth
efficiency.  I'm not yet convinced, although directory
rename, which is quite different from file rename, might be
worth it if we could reliably detect it.
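
To make the extended rename detection mentioned above a little more
concrete, here is a minimal Python sketch of one possible reading of it:
for each file the sender has that the receiver lacks at that path, look
for any receiver file with identical contents, whether or not that file
is about to be deleted.  This is not how rsync works internally.  The
function names (content_index, find_local_sources) are hypothetical, both
trees are assumed to be local directories, and hashing every file is
exactly the kind of added cost being questioned.

```python
import hashlib
import os


def content_index(root):
    """Map SHA1 hex digest -> sorted list of file paths relative to root."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            # Reads the whole file at once; fine for a sketch, not for
            # large files.
            with open(path, "rb") as handle:
                digest = hashlib.sha1(handle.read()).hexdigest()
            index.setdefault(digest, []).append(os.path.relpath(path, root))
    for paths in index.values():
        paths.sort()
    return index


def find_local_sources(sender_root, receiver_root):
    """Find receiver files that could stand in for sender files.

    Returns {sender_relpath: receiver_relpath} for every sender file
    whose contents already exist on the receiver under another path.
    """
    receiver = content_index(receiver_root)
    matches = {}
    for digest, paths in content_index(sender_root).items():
        candidates = receiver.get(digest)
        if not candidates:
            continue
        for rel in paths:
            # Same content already at the same path: nothing to do.
            if rel not in candidates:
                matches[rel] = candidates[0]
    return matches
```

Each match is a file that could be copied or hard-linked locally on the
receiving side instead of being sent over the wire.
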

> [snip]
> > Ever hear of vfs?  I have considered pluggability and might
> > consider it for the next version of rsync (3.x or perhaps 2.6)
> 
> Sounds interesting.  I remember the VFS from mc, and there is now some
> gnome-vfs stuff.  I'm not sure if you are referring to one of these or
> something else.  It definitely sounds like something to investigate,
> could you send a link?

I don't have any links, but the VFS I'm referring to is the
addition of a layer between the physical filesystem and the
user-mode system calls.  This is often used for compression
and encryption and would be a good candidate for versioning.
One of the things this allows is different mount points
having different views of the same underlying filesystem.
Historically this has also been done using NFS.

> 
> [snip]
> > Forget it.  Version awareness will just bog things down.
> > They did it on VMS and it has mainly served to be a
> > full-employment feature for admins.  If you want to use
> > version aware filesystems fine, rsync doesn't need to be
> > aware of that.  At most rsync might want to be able to
> > detect renames or, perhaps through plugin, be able to select
> > whether to see all versions with versioning disabled or just
> > latest.
> 
> I'm pondering these ideas of plugins plus the ideas Andy put in his
> post.  I might be able to make the system out of it with less
> hacking-and-slashing in code.

Because of how rsync does file I/O, the plugin approach has
at least a chance, but it may have some reliability and
portability issues.  A VFS approach may actually be better
suited for your purposes.

> 
> [snip] 
[snip]
> 
> > You have outlined an ambitious project.  Some of the ideas have
> > merit.  One or two would be nice to see in rsync.  The rsync
> > team are a few unpaid volunteers.  It sounds to me like you
> > propose creating a monstrosity out of rsync for the benefit
> > of one piece of vaporware.  Doing so on the backs of unpaid
> > volunteers rankles as would hijacking rsync.
> 
> I'm not trying to force patches on you guys (not that I even could).
> I basically have to solve a problem for a client and would like to do so
> in a way that I can most benefit the rsync project (or another project
> if rsync isn't the "right" project).  You seem to be unduly afraid that I
> have some sort of power here, and really all I can do is take my marbles and
> go home.  Well, perhaps I could spam you with awful patches, too ;-)

That differs somewhat from my impression of your
proposal/presentation.  My error?  It sounded like you were
aiming to create a shrink-wrap application and were
proposing to do so dependent on significant functional
changes to rsync that I recognised would alter its codebase
structurally.

As you rightly recognise, you couldn't force patches on us.
You could fork the project but that has its own drawbacks,
political as well as practical.  I was merely trying to
point out that we have our own agenda(s) which do not align
well with some of the specific proposals and (rather
bluntly) remind you that it would be good to respect the
volunteer nature of the project.  That isn't to say that
some commercial involvement would not be accepted, only that
there are ways that would be welcomed more than others.

-- 
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:		jw at pegasys.ws

		Remember Cernan and Schmitt


