Any way to predict the amount of data to be copied when re-copying a file?
Andrew Gideon
c182driver1 at gideon.org
Sun Nov 29 09:07:39 MST 2009
I do backups using rsync, and - every so often - a file takes far longer
than it normally does. These are large data files which typically change
only a little over time.
I'm guessing that these large transfers are caused by occasional changes
that "break" (i.e., yield poor performance from) the "copy only changed
pages" algorithm. But I've no proof of this. And I view unusual things
as possible warning flags of serious problems that will rise to bite me
on a Friday evening. I'd prefer to address them before they become real
problems (which makes for a far less stressful life {8^).
So I'd like to *know* that these occasional slow transfers are just
artifacts of how rsync's "copy only changed pages" algorithm works. Is
there some way to run rsync in some type of "dry run" mode but where an
actual determination of what pages should be copied is performed?
The current --dry-run doesn't go down to this level of detail as far as I
can see. It determines which files will need to be copied, but not which
pages of those files need to be copied.
So is there something that goes that next step in detail?
Note that this doesn't even have to work across a network to meet my
needs, though that would be ideal. I could always run it after the
transfer is completed (which means I'll have both copies of the file on
the same system).
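
For that local, after-the-fact case, the kind of check I have in mind could
be sketched roughly as below. This is only a simplified approximation of
rsync's delta matching, not rsync's actual code: the function names
(weak_sum, estimate_literal_bytes), the fixed 4096-byte block size, and the
use of SHA-256 as the strong hash are all assumptions made for illustration.
It loads both copies into memory and scans byte by byte in pure Python, so
it is only practical on modest file sizes or samples.

```python
import hashlib


def weak_sum(block):
    """Cheap rolling-style checksum over a block: returns the pair (a, b)."""
    length = len(block)
    a = sum(block) & 0xFFFF
    b = sum((length - i) * x for i, x in enumerate(block)) & 0xFFFF
    return a, b


def estimate_literal_bytes(old, new, block_size=4096):
    """Estimate how many bytes of `new` match no full block of `old`.

    Simplification (assumption): a short trailing block of `old` is ignored,
    and protocol overhead (checksums, match tokens) is not counted.
    """
    # Index every full, aligned block of the old copy by weak checksum.
    sigs = {}
    for off in range(0, len(old) - block_size + 1, block_size):
        blk = old[off:off + block_size]
        sigs.setdefault(weak_sum(blk), set()).add(hashlib.sha256(blk).digest())

    n = len(new)
    i = 0
    literal = 0
    fresh = True  # True when the weak checksum must be recomputed from scratch
    a = b = 0
    while i + block_size <= n:
        if fresh:
            a, b = weak_sum(new[i:i + block_size])
            fresh = False
        strong_set = sigs.get((a, b))
        if strong_set:
            window = new[i:i + block_size]
            # Confirm a weak hit with the strong hash before trusting it.
            if hashlib.sha256(window).digest() in strong_set:
                i += block_size  # whole block is already on the other side
                fresh = True
                continue
        # No match at this offset: new[i] would have to be sent literally.
        literal += 1
        if i + block_size < n:
            # Slide the window one byte: drop new[i], add new[i + block_size].
            out_byte, in_byte = new[i], new[i + block_size]
            a = (a - out_byte + in_byte) & 0xFFFF
            b = (b - block_size * out_byte + a) & 0xFFFF
        i += 1
    # Whatever is left is shorter than one block; count it as literal data.
    return literal + (n - i)


def estimate_for_files(old_path, new_path, block_size=4096):
    """Convenience wrapper; the two paths are whatever copies you have."""
    with open(old_path, "rb") as f_old, open(new_path, "rb") as f_new:
        return estimate_literal_bytes(f_old.read(), f_new.read(), block_size)
```

Calling estimate_for_files() on yesterday's and today's copy would give a
byte count to compare against the file size. A result near the full size
would suggest the change pattern defeats block matching; a small result
would point at something else being slow. For comparison with a real run,
rsync's --stats output reports "Literal data" and "Matched data" after a
transfer, and --no-whole-file makes a local copy use the delta algorithm
instead of copying whole files.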
Thanks...
Andrew