Speeding Up Rsync for Large File Sets

Konstantinos Skarlatos k.skarlatos at gmail.com
Mon Dec 3 13:13:42 MST 2012


On Monday, 3 December 2012 6:49:05 PM, Tim Gustafson wrote:
Lsyncd would be a good fit for a case like this: it watches the tree with inotify and triggers rsync only for the paths that actually changed, so you skip the full rescan entirely.
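
For example, a minimal sketch (assuming lsyncd 2.x and its rsync one-liner mode; the paths here are placeholders):

    # Watch /export/bigfs with inotify and rsync only the changed paths,
    # instead of rescanning the whole tree on a schedule.
    lsyncd -rsync /export/bigfs backuphost:/export/bigfs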

>> Have you checked that you're not running out of memory?
>
> I have not seen any errors or warnings to that effect.
>
>> You probably want --delete-during instead of --delete.
>
> Will that speed up the file comparison stage, or is that just a good idea?
>
>> If you're checking for hard links, rsync has to track links for the entire
>> session. That can bog things down for zillions of files. If you know you
>> won't have links outside particular boundaries, you might benefit from
>> chunking up your rsync invocations to match those boundaries.
>>
>> You might benefit from doing that regardless, since multiple rsyncs in
>> parallel will do a better job of saturating your I/O bandwidth in this
>> particular case.
>
> We're already doing one rsync for each file system.  We have about
> 2,000 file systems that we're rsyncing between these servers.  There
> are just two file systems in particular that are troublesome.  In
> fact, the two file systems in question were formerly one file system,
> and we split it in two to accomplish exactly what you describe.  Maybe
> more splitting is in order, but because of the nature of the data,
> we're always going to have at least one tree of folders with a zillion
> files in it.
>
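
On the --delete question: if I remember the man page right, it is more than just a good idea. Plain --delete on rsync older than 3.0 behaves like --delete-before, which forces the complete file-list scan to finish before the first file transfers; --delete-during removes extraneous files as each directory is processed, so it works with the incremental scan in rsync 3.x. A sketch, with placeholder paths:

    # Delete extraneous destination files while walking each directory,
    # rather than in a separate pass over the whole tree first.
    rsync -a --delete-during /export/fs1/ backuphost:/export/fs1/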
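And on chunking: if hard links never cross the top-level subdirectories of the big tree, something like this (GNU find/xargs; paths and the parallelism level are placeholders) runs one rsync per subdirectory, a few at a time:

    # One rsync per top-level directory, 4 in parallel; -H only has to
    # track hard links within each chunk. Note this will not delete a
    # top-level directory that disappeared at the source.
    cd /export/bigfs &&
        find . -mindepth 1 -maxdepth 1 -type d -print0 |
        xargs -0 -P4 -I{} rsync -aH --delete-during {}/ backuphost:/export/bigfs/{}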



