What is it doing?

Linda A. Walsh rsync at tlinx.org
Mon Jan 13 21:14:21 MST 2014


Perry Smith wrote:
> This is my first time to really use rsync.  I did small tests to get the arguments like I wanted and then kicked off the big rsync about 2 and a half hours ago.  So far, it has not copied over any files.
>
> -----
>
> Is it really making progress?  Or will it take this long to really start copying files over each day I start it?
>
> I expect the total amount copied to be about 400G and about 4 million files
-----
This appears to be a classic case of using a hammer to drive in a screw.

Um... rsync was designed to save network bandwidth by running on both
hosts and doing the file-stat-intensive work ON each machine's local
hard disk (one rsync process on the server and one on your client).

But your use case does very badly, because rsync needs direct access on
both ends. THEN it optimizes what gets transferred, to minimize the
amount that has to be copied over the network. You are not getting ANY
benefit from that, because it will do all of those stats over the
network via NFS, which is notoriously slow in many or most cases
(especially with lots of stat calls).

Your copy job would already be done if you had used 'tar' and just
copied over everything. On the receiving end, tell tar not to overwrite
newer stuff. Yes, it will waste more network bandwidth, but it would
very likely already be done.
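
A minimal sketch of that tar copy, assuming GNU tar on the receiving
side (--keep-newer-files is its option for leaving newer existing files
alone) and assuming both trees are reachable as ordinary paths, e.g. the
source through the NFS mount. The function name and the paths in the
usage comment are made up for illustration:

```shell
#!/bin/sh
# Copy a whole tree through a tar pipe instead of rsync.
# Assumption: GNU tar; other tar implementations may lack
# --keep-newer-files.
tar_copy() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    # The first tar reads the source tree and writes an archive to stdout;
    # the second unpacks it, skipping files whose existing copy in dst
    # is newer than (or the same age as) the archived one.
    tar -C "$src" -cf - . | tar -C "$dst" -xpf - --keep-newer-files
}

# Usage (hypothetical paths):
#   tar_copy /mnt/nfs/data /local/backup
```

tar prints a warning on stderr for each file it skips because the
existing copy is newer; that is expected here.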

As you have described the problem, there is no real reason to use
rsync, as it is unable to optimize network bandwidth because all the
stats are remote.

Even "cp -au src/. dst/." will likely be faster than trying to use
rsync... talk about tool abuse!  ;-)

For rsync to do a reasonable job, you really need to tell whoever owns
that server to put rsync ON that server so it can access the files
locally. Then it could do what it does best: build up a list of
differences so it only needs to transfer the changed stuff.
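
For illustration only, and assuming rsync gets installed on the server
and you can log in over ssh: giving rsync a host:path source makes it
start a second rsync on that host through the remote shell, so each side
scans its own local disk. The function name and the host and paths in
the comments are placeholders, not anything from your setup:

```shell
#!/bin/sh
# Pull a tree with an rsync process running on each end.
pull_changes() {
    remote_src=$1   # e.g. user@server.example.com:/export/data/ (placeholder)
    local_dst=$2    # e.g. /local/backup/ (placeholder)
    # -a (archive mode) recurses and preserves permissions, times and
    # symlinks. The trailing slash on the source means "the contents of
    # this directory" rather than the directory itself.
    rsync -a "$remote_src" "$local_dst"
}

# Usage (hypothetical host and paths):
#   pull_changes user@server.example.com:/export/data/ /local/backup/
```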

Certainly, even if you have rsync on the remote end, for the 1st
transfer, if you need to transfer most of the files, it would be better
just to create a tar on the remote end, compress it, and copy that
locally.
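
One way that first bulk transfer might look, as a sketch: this variant
streams the compressed tar straight over the connection instead of
writing an archive file on the server first. It assumes ssh access and
tar with gzip support (-z) on both ends. REMOTE_SHELL, the function
name, the host and the paths are all hypothetical:

```shell
#!/bin/sh
# Run tar on the remote end, compress there, unpack locally.
# REMOTE_SHELL is a made-up knob for this sketch; it is the command used
# to run something on the server.
REMOTE_SHELL=${REMOTE_SHELL:-ssh user@server.example.com}

fetch_tree() {
    remote_dir=$1
    local_dir=$2
    mkdir -p "$local_dir" || return 1
    # $REMOTE_SHELL is left unquoted on purpose so "ssh user@host" splits
    # into a command and its argument. remote_dir must not contain single
    # quotes, since it is quoted that way for the remote shell.
    $REMOTE_SHELL "tar -C '$remote_dir' -czf - ." | tar -C "$local_dir" -xzpf -
}

# Usage (hypothetical paths):
#   fetch_tree /export/data /local/backup
```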

How is it that you have so much data on a server you don't have any
ability to run a local 'job' on? It really sounds like an impediment to
you getting your work done.



