Speeding Up Rsync for Large File Sets

Tim Gustafson tjg at soe.ucsc.edu
Mon Dec 3 09:49:05 MST 2012


> Have you checked that you're not running out of memory?

I have not seen any errors or warnings to that effect.
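
For what it's worth, one way I could keep an eye on rsync's memory
footprint during a run (assuming GNU ps on the Linux side) would be
something like:

    # Poll the resident and virtual size of every running rsync process
    watch -n 5 'ps -C rsync -o pid,rss,vsz,etime,args'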

> You probably want --delete-during instead of --delete.

Will that speed up the file comparison stage, or is that just a good idea?
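
For reference, I mean the difference between invocations like these
(flags and paths are made up for illustration; as I understand it,
older rsyncs treat a bare --delete as --delete-before, which does a
separate scan-and-delete pass before any files move):

    # Delete pass happens up front, before the transfer starts
    rsync -aH --delete /tank/users/ backup:/tank/users/

    # Deletions are interleaved with the transfer instead, so each
    # directory is walked only once
    rsync -aH --delete-during /tank/users/ backup:/tank/users/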

> If you're checking for hard links, rsync has to track links for the entire
> session. That can bog things down for zillions of files. If you know you
> won't have links outside particular boundaries, you might benefit from
> chunking up your rsync invocations to match those boundaries.
>
> You might benefit from doing that regardless, since multiple rsyncs in
> parallel will do a better job of saturating your I/O bandwidth in this
> particular case.

We're already doing one rsync for each file system.  We have about
2,000 file systems that we're rsyncing between these servers.  There
are just two file systems in particular that are troublesome.  In
fact, the two file systems in question were formerly one file system,
and we split it in two to accomplish exactly what you describe.  Maybe
more splitting is in order, but because of the nature of the data,
we're always going to have at least one tree of folders with a zillion
files in it.
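
In case it helps anyone else on the list, our per-file-system approach
amounts to something like this sketch (file names, paths, and flags
are illustrative, and it assumes GNU xargs for the -P parallelism):

    # Run up to four rsyncs at once, one per file system listed in
    # fs.txt (one name per line)
    xargs -P4 -I{} \
        rsync -aH --delete-during /tank/{}/ backup:/tank/{}/ < fs.txt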

-- 

Tim Gustafson
tjg at soe.ucsc.edu
831-459-5354
Baskin Engineering, Room 313A

