Parallel rsync's for better Performance.
matt at mattmccutchen.net
Wed Oct 28 08:41:33 MDT 2009
On Wed, 2009-10-28 at 10:01 +0100, Matthias Schniedermeyer wrote:
> On 28.10.2009 09:05, Satish Shukla wrote:
> > We have a huge amount of data to sync, usually every day, and I wish rsync could guarantee performance.
> > I thought of splitting the directories and running parallel rsyncs on them. It may cost me some network bandwidth, but I can control that with the MAX_RSYNC_PROCESS variable. Can someone evaluate the pros and cons of this design? Any help is heartily appreciated.
> That only works IF:
> - You have SSDs (preferably good ones, both sides)
> - Each rsync covers a different physical HDD (both sides)
> - You have a massive Array with truck-loads of HDDs and a matching
> controller or something along that line (again both sides).
> - A combination of the above would also work
> Otherwise parallel rsyncs completely kill any performance you had because
> normal HDDs will fall into a seek-storm, when more than 1 rsync works on
Asynchronous I/O may solve that, on OSes that support it.
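For reference, the design described in the original post (split the tree by directory, cap the number of concurrent rsyncs with MAX_RSYNC_PROCESS) could be sketched roughly as below. This is a minimal Python sketch under stated assumptions, not something rsync provides: build_commands, run_parallel and the runner parameter are hypothetical names, the split is one rsync per top-level subdirectory, and the only rsync option used is the standard -a. Files sitting directly in the source root are not covered, and the seek-storm caveat quoted above still applies to whatever limit you choose.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Cap on concurrent rsync processes (variable name taken from the original post;
# the value 4 is an arbitrary assumption).
MAX_RSYNC_PROCESS = 4


def build_commands(src_root, dest):
    """Build one rsync command per top-level subdirectory of src_root."""
    subdirs = sorted(p for p in Path(src_root).iterdir() if p.is_dir())
    # No trailing slash on the source path, so rsync recreates each
    # subdirectory by name under dest.
    return [["rsync", "-a", str(p), dest] for p in subdirs]


def run_parallel(commands, max_procs=MAX_RSYNC_PROCESS, runner=subprocess.call):
    """Run commands with at most max_procs active at once; return exit codes in order."""
    with ThreadPoolExecutor(max_workers=max_procs) as pool:
        return list(pool.map(runner, commands))


if __name__ == "__main__":
    # Hypothetical source and destination, for illustration only.
    cmds = build_commands("/data", "backuphost:/backup/")
    codes = run_parallel(cmds)
    failed = [c for c, rc in zip(cmds, codes) if rc != 0]
    print(f"{len(cmds) - len(failed)} succeeded, {len(failed)} failed")
```

Setting max_procs=1 reduces this to a plain sequential run, which gives a baseline to compare against before assuming that more processes help on a given set of disks.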
See also this RFE, on which I have just commented: