What is it doing?
Linda A. Walsh
rsync at tlinx.org
Mon Jan 13 21:14:21 MST 2014
Perry Smith wrote:
> This is my first time to really use rsync. I did small tests to get the arguments like I wanted and then kicked off the big rsync about 2 and a half hours ago. So far, it has not copied over any files.
>
> -----
>
> Is it really making progress? Or will it take this long to really start copying files over each day I start it?
>
> I expect the total amount copied to be about 400G and about 4 million files
-----
This appears to be a classic case of using a hammer to drive in a screw.
rsync was designed to save network bandwidth by running on both hosts
(the server and your client), so that each side does the file-stat
intensive work against its own local hard disk.
Your use case does very badly because rsync needs direct access to the
files on both ends. Only THEN can it minimize the amount of data that
needs to be copied over the network. You are not getting ANY of that
benefit, because all of those stats go over the network via NFS, which
is notoriously slow in many or most cases (especially with lots of stat
calls).
Your copy job would already be done if you had used 'tar' and just
copied over everything. On the receiving end, tell tar not to overwrite
newer stuff. Yes, it will waste more network bandwidth, but it would
very likely already be done.
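A minimal sketch of that tar copy, assuming GNU tar on the receiving
side (--keep-newer-files is a GNU tar option) and that the source tree
is reachable as a mounted path. The function name copy_tree and the
paths in the usage line are made up for illustration.

```shell
# Copy a whole tree with tar, leaving alone any destination file that is
# newer than (or the same age as) the copy in the archive.
# Assumption: GNU tar, which provides --keep-newer-files.
copy_tree() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    # Left side: archive the contents of $src to stdout.
    # Right side: unpack into $dst, skipping files that are newer on disk.
    tar -C "$src" -cf - . | tar -C "$dst" -xpf - --keep-newer-files
}

# Hypothetical usage:
# copy_tree /mnt/server/data /local/data
```

tar reads each file once and never compares the two sides file by file,
which is why it avoids most of the stat traffic.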
As you have described the problem, there is no real reason to use rsync,
as it is unable to optimize network bandwidth because all the stats are
remote. Even "cp -au src/. dst/." will likely be faster than trying to
use rsync... talk about tool abuse! ;-)
For rsync to do a reasonable job, you really need to tell whoever owns
that server to put rsync ON that server so it can access the files
locally. Then it could do what it does best: build up a list of
differences so that it only needs to transfer the changed stuff.
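Once rsync is installed on the server, that list of differences can be
previewed before anything is transferred. A rough sketch, assuming ssh
access to the server; the function name list_changes and the
user@server path in the usage line are placeholders, not details of the
original setup.

```shell
# Show what rsync would transfer, without copying anything.
#   -a  archive mode (recurse, keep times, permissions, and so on)
#   -n  dry run: nothing is actually transferred
#   -i  itemize: print one line per item that differs
list_changes() {
    rsync -a -n -i "$1" "$2"
}

# Hypothetical usage: with a remote source like this, the stat calls for
# that side run on the server against its own local disk.
# list_changes user@server:/export/data/ /local/data/
```

Dropping -n from the same command performs the actual transfer.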
Even if you do have rsync on the remote end, for the 1st transfer, if
you need to transfer most of the files, it would be better just to
create a tar on the remote end, compress it, and copy that locally.
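For that first bulk transfer, one variant is to stream the compressed
archive over ssh instead of writing it to disk on the server first. A
sketch under the assumption that you have an ssh login and a tar with
gzip support (-z) on both ends; pull_tree, the host name, and the paths
are illustrative only.

```shell
# Stream a gzip-compressed tar of a remote directory and unpack it locally.
# Note: remote_dir is wrapped in single quotes for the remote shell, so
# this sketch does not handle paths that contain a single quote.
pull_tree() {
    host=$1
    remote_dir=$2
    local_dir=$3
    mkdir -p "$local_dir" || return 1
    # Remote side: archive and compress to stdout.
    # Local side: decompress and unpack, preserving permissions.
    ssh "$host" "tar -C '$remote_dir' -czf - ." | tar -C "$local_dir" -xzpf -
}

# Hypothetical usage:
# pull_tree user@server /export/data /local/data
```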
How is it that you have so much data on a server where you don't have
any ability to run a local 'job'? It really sounds like an impediment to
getting your work done.