[Bug 9814] New: --cache parameter for storing recent file data

samba-bugs at samba.org samba-bugs at samba.org
Thu Apr 18 12:19:56 MDT 2013


https://bugzilla.samba.org/show_bug.cgi?id=9814

           Summary: --cache parameter for storing recent file data
           Product: rsync
           Version: 3.1.0
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: enhancement
          Priority: P5
         Component: core
        AssignedTo: wayned at samba.org
        ReportedBy: me at haravikk.com
         QAContact: rsync-qa at samba.org


I know rsync is generally stateless, but caching of recent data is something
that could significantly speed it up by skipping checksumming entirely.

The idea is that a file's absolute path will be checksummed (not its contents)
and then looked up in a folder structure of cached details, or maybe even a
database. A filesystem solution can optimise by using lines in a file as final
indices to the cache (so checksums for multiple files can be grouped into a
single file until it gets too large), since checksums are a fixed size, and
timestamps can be as well.
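
To make the file-system variant more concrete, here is a rough sketch of how
the lookup could work. It is not rsync code: the record layout, the two-level
directory fan-out, the choice of SHA-256 for the path hash, and the names
`RECORD`, `store` and `lookup` are all assumptions for illustration. It also
uses fixed-size binary records rather than text lines, which keeps the same
"fixed size per entry" property.

```python
import hashlib
import os
import struct

# Hypothetical fixed-size record: 16-byte path-hash tag, mtime in ns,
# file size, and a 16-byte content checksum (e.g. an MD5 digest).
RECORD = struct.Struct("<16sqq16s")


def path_key(abs_path):
    """Hash the absolute path (not the file contents)."""
    return hashlib.sha256(os.fsencode(abs_path)).digest()


def bucket_file(cache_root, key):
    """Pick a bucket file from the first hex digits of the path hash."""
    hexkey = key.hex()
    return os.path.join(cache_root, hexkey[:2], hexkey[2:4])


def store(cache_root, abs_path, mtime_ns, size, checksum):
    """Insert or overwrite the record for abs_path in its bucket file."""
    key = path_key(abs_path)
    bucket = bucket_file(cache_root, key)
    os.makedirs(os.path.dirname(bucket), exist_ok=True)
    tag = key[:16]
    record = RECORD.pack(tag, mtime_ns, size, checksum)
    data = b""
    if os.path.exists(bucket):
        with open(bucket, "rb") as f:
            data = f.read()
    for off in range(0, len(data), RECORD.size):
        if data[off:off + 16] == tag:
            # Same path seen before: replace its record in place.
            data = data[:off] + record + data[off + RECORD.size:]
            break
    else:
        data += record
    with open(bucket, "wb") as f:
        f.write(data)


def lookup(cache_root, abs_path):
    """Return (mtime_ns, size, checksum) for abs_path, or None if not cached."""
    key = path_key(abs_path)
    try:
        with open(bucket_file(cache_root, key), "rb") as f:
            data = f.read()
    except FileNotFoundError:
        return None
    for off in range(0, len(data) - RECORD.size + 1, RECORD.size):
        tag, mtime_ns, size, checksum = RECORD.unpack_from(data, off)
        if tag == key[:16]:
            return mtime_ns, size, checksum
    return None
```

A real implementation would presumably split a bucket once it grows too
large and would need some form of locking; both are left out here.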

Ideally we'd get support for at least the file-system method.

Necessary options would include a threshold for discarding cache entries that
are too old, judged by when the entry was last modified and/or by how many
times it has been accessed. The latter would allow rsync to recheck files only
on every second pass, for example.
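
The two thresholds could be combined roughly as follows. This is only a
sketch of the idea; `CacheEntry`, `use_entry`, `max_age` and `max_hits` are
made-up names, not proposed option names.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    checksum: bytes
    stored_at: float  # when the entry was written, in seconds
    hits: int = 0     # how many times the entry has been used since then


def use_entry(entry, now, max_age=None, max_hits=None):
    """Return the cached checksum, or None if the entry should be discarded."""
    if max_age is not None and now - entry.stored_at > max_age:
        return None  # too old: caller should re-checksum and store a new entry
    if max_hits is not None and entry.hits >= max_hits:
        return None  # used often enough: force a recheck on this pass
    entry.hits += 1
    return entry.checksum
```

With `max_hits=1`, an entry is trusted on the pass after it was written and
rejected on the one after that, so each file is rechecked on every second pass.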

For continuous incremental updates, a cache that balances speed and size well
should allow comparisons to be performed extremely quickly. It could also be
used to skip files entirely on the client side if some kind of cache comparison
can be performed (so that the client can quickly decide if a file has probably
already been backed up).
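
One possible form of that client-side comparison is sketched below, assuming
the cache remembers each file's size and modification time from the last
successful transfer. The in-memory dict and the function names are
placeholders for illustration only.

```python
import os


def remember(path, cache):
    """Record the file's current size and mtime after a successful backup."""
    st = os.stat(path)
    cache[os.path.abspath(path)] = (st.st_size, st.st_mtime_ns)


def probably_backed_up(path, cache):
    """True if the file still matches what was cached at the last backup."""
    cached = cache.get(os.path.abspath(path))
    if cached is None:
        return False  # never seen: must be transferred or checked
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns) == cached
```

A match only means the file has probably not changed, so this would be a
hint for skipping work rather than a guarantee.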

