how to migrate 40T data and 180M files

Kyle Lanclos lanclos at ucolick.org
Tue Aug 11 11:52:09 MDT 2009


Ming Gao wrote:
> The first question is that if there is any risk for such a big number of
> files? should I divide them into groups and rsync them in parallel or in
> serial? If yes, how many groups is better?

For that amount of data, you ought to use something simple and recursive,
like cp -rp. In my experience, a tar pipe will typically break after a
couple of terabytes.

After the initial cp, follow up with an rsync. How long the rsync takes
will depend immensely on how good your NFS servers are at caching file
metadata.
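
As a rough sketch of that two-step sequence, the helper below only builds
the two command lines and does not run them. The function name and the
/old and /new paths are hypothetical, and the choice of rsync -a for the
follow-up pass is an assumption; adjust the flags to whatever your testing
shows you need.

```python
def two_pass_commands(src, dst):
    """Return [bulk copy, catch-up sync] as argv lists (nothing is executed)."""
    src = src.rstrip("/")
    dst = dst.rstrip("/")
    # Pass 1: plain recursive copy of the contents of src into dst,
    # preserving modes and timestamps.
    bulk_copy = ["cp", "-rp", src + "/.", dst]
    # Pass 2: rsync in archive mode picks up whatever changed or was
    # missed during the long-running first pass.
    catch_up = ["rsync", "-a", src + "/", dst + "/"]
    return [bulk_copy, catch_up]


# Hypothetical mount points, for illustration only.
for argv in two_pass_commands("/old", "/new"):
    print(" ".join(argv))
```

Each list can be handed to subprocess.run() once you are happy with it.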

If your testing demonstrates the time-to-rsync is not acceptable, and you
are not otherwise disk-bound, you may want to investigate breaking the task
up into multiple simultaneous rsync processes. It will be easier to manage
if you don't have to do that, though.
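
If you do end up splitting the work, one possible approach (a sketch, not
something I am claiming is the best split) is to deal the top-level
directory entries into a few groups and start one rsync per group. The
function names, the round-robin split, and the -a flag are all assumptions
made for illustration.

```python
import os
import subprocess


def split_round_robin(names, groups):
    """Deal the sorted names into `groups` buckets; empty buckets are dropped."""
    buckets = [[] for _ in range(groups)]
    for i, name in enumerate(sorted(names)):
        buckets[i % groups].append(name)
    return [b for b in buckets if b]


def build_rsync_commands(src, dst, names, groups):
    """Build one rsync argv per bucket of top-level entries under src."""
    target = dst.rstrip("/") + "/"
    commands = []
    for bucket in split_round_robin(names, groups):
        sources = [os.path.join(src, name) for name in bucket]
        commands.append(["rsync", "-a", *sources, target])
    return commands


def run_parallel(commands):
    """Start every command at once and return the exit codes in order."""
    procs = [subprocess.Popen(cmd) for cmd in commands]
    return [p.wait() for p in procs]
```

A typical call would be build_rsync_commands("/old", "/new",
os.listdir("/old"), 4) followed by run_parallel() on the result. Note that
round-robin balances the number of entries per group, not their size, so
one large directory can still dominate the total run time.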

--Kyle

