User controlled i/o block size?
greg.freemyer at gmail.com
Tue Apr 12 18:54:36 UTC 2016
On Mon, Apr 11, 2016 at 7:05 PM, Kevin Korb <kmk at sanitarium.net> wrote:
> You didn't say if you were networking or what features of rsync you
> are using but if you aren't networking and aren't doing anything fancy
> you are probably better off with cp -au which is essentially the same
> as rsync -au except faster.
I was curious whether "cp -au" was indeed as robust as rsync.
No, it isn't. My test:
Create a folder with numerous files in it (a dozen in my case). Have
one of them be 9GB (or anything relatively big).
cp -au <src-folder> <dest-folder>
Watch the destination folder, and when you see the 9GB file growing,
kill "cp -au" (I just pressed Ctrl-C).
Restart "cp -au".
I ended up with a truncated copy of the 9GB file (roughly 3GB).
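A minimal sketch of why the restart leaves the file truncated,
assuming GNU cp: -u copies a file only when the source is newer than
the destination (or the destination is missing), and a partial copy
carries the time it was last written, so it looks newer than the
source. Rather than killing a real copy, the script fakes the
interrupted state by writing a short destination file by hand. The
file name and the sizes are made up for illustration.

```shell
#!/bin/sh
# Simulate an interrupted "cp -au" and show that a restart does not repair it.
# Assumes GNU coreutils; big.bin and the sizes are illustrative only.
set -eu
work=$(mktemp -d)
mkdir "$work/src" "$work/dest"

# 1 MiB stand-in for the large source file, backdated so it is clearly old.
dd if=/dev/zero of="$work/src/big.bin" bs=1024 count=1024 2>/dev/null
touch -t 201604111200 "$work/src/big.bin"

# Fake the killed copy: only the first 300 KiB arrived, mtime is "now".
dd if="$work/src/big.bin" of="$work/dest/big.bin" bs=1024 count=300 2>/dev/null

# "Restart" the copy. -u sees a destination newer than the source and skips it.
cp -au "$work/src/." "$work/dest/"

src_size=$(wc -c < "$work/src/big.bin")
dest_size=$(wc -c < "$work/dest/big.bin")
echo "source: $src_size bytes, destination after restart: $dest_size bytes"

rm -rf "$work"
```

The destination stays at its truncated size, which matches what I saw
with the 9GB file.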
The copy I did yesterday was about 1200 files. Almost all were about
1.5GB in size, so that was a multi-hour process to make the copy.
Using rsync, I can kill the copy at any time (on purpose or because
of a system problem) and just restart it.
Using the simple "rsync -avp --progress" command, I end up recopying
the file that was in progress when rsync was aborted, but 1.5GB files
only take 10 or 15 seconds to copy, so the wasted effort is minimal
for a copy process that runs for hours.
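As a rough back-of-the-envelope check of those numbers (a sketch only,
treating every file as exactly 1.5GB and taking the 10 and 15 second
figures as the bounds):

```shell
#!/bin/sh
# Rough estimate from the figures in this mail; all inputs are approximations.
files=1200
size_tenths_gb=15      # 1.5 GB per file, kept in tenths to stay in integer math
fast=10                # seconds per file, best case
slow=15                # seconds per file, worst case

total_gb=$(( files * size_tenths_gb / 10 ))
min_s=$(( files * fast ))
max_s=$(( files * slow ))

echo "total data: ${total_gb} GB"
echo "copy time: $(( min_s / 60 )) to $(( max_s / 60 )) minutes"
echo "redo after an abort: at most ${slow} s, one file out of ${files}"
```

That comes to about 1800 GB and somewhere between 200 and 300 minutes
of copying, against at most 15 seconds redone per restart.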
FYI: In my job I work with 100GB+ read-only datasets all the time.
The tools are all designed to segment the data into 1.5 GB files.
One advantage is if a file becomes corrupt, just that segment file has
to be replaced. All the large files are validated via MD5 hash (or
SHA-256, etc). I keep a minimum of two copies of all datasets.
Yesterday I was making a third copy of several of the datasets, so I
had almost 2TB of data to copy.
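For anyone curious, per-segment validation can be sketched with plain
md5sum. This is not the tooling I actually use, just an illustration:
the dataset.NNN names and the MD5SUMS manifest are hypothetical, and
the tiny files stand in for 1.5GB segments.

```shell
#!/bin/sh
# Hash each segment once, then find out which single segment went bad.
# File names and contents are placeholders for illustration.
set -eu
work=$(mktemp -d)

# Three tiny stand-ins for segment files.
for i in 001 002 003; do
    printf 'segment %s payload\n' "$i" > "$work/dataset.$i"
done

# Record one hash per segment in a manifest.
(cd "$work" && md5sum dataset.* > MD5SUMS)

# Simulate corruption of a single segment.
printf 'bit rot\n' >> "$work/dataset.002"

# md5sum -c reports each file as OK or FAILED; keep only the failures.
failed=$(cd "$work" && md5sum -c MD5SUMS 2>/dev/null | sed -n 's/: FAILED$//p')
echo "segments to replace: $failed"

rm -rf "$work"
```

Only the segment that fails the check needs to be copied again from
one of the other copies. sha256sum can be substituted the same way.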