Speeding Up Rsync for Large File Sets
k.skarlatos at gmail.com
Mon Dec 3 13:13:42 MST 2012
On Monday, 3 December 2012 6:49:05 PM, Tim Gustafson wrote:
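Lsyncd would be good for such a case: it watches the tree for changes and
runs rsync only on what changed, instead of scanning the whole tree on
every run.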
>> Have you checked that you're not running out of memory?
> I have not seen any errors or warnings to that effect.
>> You probably want --delete-during instead of --delete.
> Will that speed up the file comparison stage, or is that just a good idea?
>> If you're checking for hard links, rsync has to track links for the entire
>> session. That can bog things down for zillions of files. If you know you
>> won't have links outside particular boundaries, you might benefit from
>> chunking up your rsync invocations to match those boundaries.
>> You might benefit from doing that regardless, since multiple rsyncs in
>> parallel will do a better job of saturating your I/O bandwidth in this
>> particular case.
> We're already doing one rsync for each file system. We have about
> 2,000 file systems that we're rsyncing between these servers. There
> are just two file systems in particular that are troublesome. In
> fact, the two file systems in question were formerly one file system,
> and we split it in two to accomplish exactly what you describe. Maybe
> more splitting is in order, but because of the nature of the data,
> we're always going to have at least one tree of folders with a zillion
> files in it.
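
For what it's worth, the chunking idea quoted above could be sketched
roughly like this. It is only an illustration, not a tested production
script: it assumes the tree splits cleanly at the first directory level,
that no hard links cross those subdirectories, and that the destination
root already exists. The function names, the worker count, and the
example paths are made up for the sketch. Note that files sitting directly
in the top-level directory, and subdirectories that were removed from the
source, are not handled here and would need a separate pass.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_rsync_commands(src_root, dest, extra_flags=("-a", "--delete-during")):
    """Return one rsync command per immediate subdirectory of src_root.

    Hypothetical helper: files directly under src_root are skipped.
    """
    commands = []
    for child in sorted(Path(src_root).iterdir()):
        # Skip plain files and symlinks; only real directories become chunks.
        if child.is_dir() and not child.is_symlink():
            # Trailing slashes: sync the directory's contents into the
            # same-named directory under the destination root.
            commands.append(
                ["rsync", *extra_flags, f"{child}/", f"{dest}/{child.name}/"]
            )
    return commands


def run_parallel(commands, workers=4):
    """Run the commands concurrently; return their exit codes in input order."""
    def run_one(cmd):
        return subprocess.run(cmd).returncode

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_one, commands))
```

Usage would be something like
run_parallel(build_rsync_commands("/export/bigfs", "backuphost:/export/bigfs")),
with both paths replaced by your own. A non-zero entry in the returned
list means that chunk's rsync failed and should be retried or inspected.
The worker count is worth tuning: too many parallel rsyncs on the same
disks can make the seek load worse rather than better.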