Caching {filePath,mtime64,checksum} values to speed up execution-time

Doug Robinson doug.robinson at wandisco.com
Wed Mar 12 13:38:35 MDT 2014


Kevin:

(your reply did finally show up)

On Tue, Mar 11, 2014 at 6:20 PM, Kevin Korb <kmk at sanitarium.net> wrote:

> OK, in that case you should try using --ignore-times instead of
> --checksum.  With --ignore-times rsync will redo the delta transfer of
> all files.  This is usually faster than --checksum and won't cause
> much additional data transfer.
>
> Unless --whole-file is in play.

If I use --ignore-times then every file will be checksummed via the normal
mechanism (per "http://rsync.samba.org/how-rsync-works.html").  The point
of my proposal was to avoid all of that checksum work (I/O, CPU) at the
time of the copy by pre-computing the whole-file checksums for comparison.
Yes, this means using the --checksum option, but then most of the work done
by the generator and sender in computing block checksums can be skipped at
sync time simply by comparing the whole-file checksums and finding that
they match.
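
A rough sketch of that sync-time comparison, in Python rather than rsync's
C.  It assumes each side already holds a pre-computed map from relative
path to whole-file checksum; the function name and the dict layout are
hypothetical and are not part of rsync.

```python
def files_needing_transfer(src_sums, dst_sums):
    """Return the paths whose cached whole-file checksums do not match.

    src_sums and dst_sums are hypothetical pre-computed caches mapping a
    relative path to a hex digest.  A path missing on the destination
    counts as a mismatch.
    """
    return sorted(
        path
        for path, digest in src_sums.items()
        if dst_sums.get(path) != digest
    )


if __name__ == "__main__":
    src = {"a.txt": "aa11", "b.txt": "bb22", "c.txt": "cc33"}
    dst = {"a.txt": "aa11", "b.txt": "ffff"}
    # a.txt matches and is skipped; b.txt differs; c.txt is missing on dst.
    print(files_needing_transfer(src, dst))
```

Only the files returned here would still need the block-checksum pass;
everything else is settled by a dictionary lookup at sync time.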

My goal is to spread out the I/O- and CPU-intensive portions of the sync so
that they happen "before" the sync itself is invoked.  Cache validation is
as I described, although adding the 64-bit ctime to the mix would also
catch anyone trying to fake non-modification by resetting the 64-bit mtime
(a trick noted in
https://lists.samba.org/archive/rsync/2011-August/026676.html).
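
A minimal sketch of that validation, in Python rather than rsync's C, and
not taken from the "db.diff" patch.  The cache is assumed to be a plain
dict keyed by path; cached_checksum and file_md5 are hypothetical names,
and hashlib.md5 merely stands in for whatever whole-file checksum the
rsync build in question uses.

```python
import hashlib
import os


def file_md5(path, chunk_size=1 << 20):
    """Compute a whole-file MD5 by reading the file in chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def cached_checksum(path, cache):
    """Return (digest, was_cache_hit) for path.

    A cache entry is trusted only while both the nanosecond mtime and the
    nanosecond ctime are unchanged.  ctime cannot be set back with utime(),
    so resetting mtime alone still invalidates the entry.
    """
    st = os.stat(path)
    stamp = (st.st_mtime_ns, st.st_ctime_ns)
    entry = cache.get(path)
    if entry is not None and entry[0] == stamp:
        return entry[1], True
    digest = file_md5(path)
    cache[path] = (stamp, digest)
    return digest, False
```

Running something like this over the tree ahead of time is what moves the
checksum I/O out of the sync itself; at sync time only the stat() calls
and the cache lookups remain for unchanged files.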

That last URL was found by one of my co-workers.  Now I need to go track
down the "db.diff" patch that Wayne mentions and see if I can tweak it to
do the 64-bit stuff and so on.

Thank you.

Doug
-- 
Doug Robinson

WANdisco // *Non-Stop Data*

t. 925-396-1125
e. doug.robinson at wandisco.com
