Parallel rsyncs for better performance

Matthias Schniedermeyer ms at citd.de
Wed Oct 28 10:24:32 MDT 2009


On 28.10.2009 10:35, Matt McCutchen wrote:
> On Wed, 2009-10-28 at 10:01 +0100, Matthias Schniedermeyer wrote:
> > On 28.10.2009 09:05, Satish Shukla wrote:
> > > We have a huge amount of data to sync, usually every day, and I wish rsync could guarantee performance.
> > > 
> > > I thought of splitting the directories and running parallel rsyncs on them. It may cost me some network bandwidth, but I can control that with the MAX_RSYNC_PROCESS variable. Can someone evaluate the pros and cons of this design? Any help is heartily appreciated.
> > 
> > That only works IF:
> > - You have SSDs (preferably good ones, both sides)
> > - Each rsync covers a different physical HDD (both sides)
> > - You have a massive Array with truck-loads of HDDs and a matching
> >   controller or something along that line (again both sides).
> > - A combination of the above would also work
> > 
> > Otherwise parallel rsyncs completely kill any performance you had, because 
> > normal HDDs fall into a seek storm when more than one rsync works on 
> > them.
> 
> Asynchronous I/O may solve that, on OSes that support it.

No. That's a fundamental problem with ANY rotating media device.

I'm not saying that you can't build something for the people who have 
that kind of hardware, or who are constrained by high-bandwidth, 
high-latency network connections (you don't need it for low bandwidth 
and/or low latency). But it would be utterly useless for the other 
95-99% of rsync users.
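
For those who do have that kind of hardware, the design under discussion (one rsync per directory, capped by MAX_RSYNC_PROCESS) could look roughly like the sketch below. It is only an illustration under assumptions: the cap of 4, the function names, and the destination layout are made up for the example, and only the standard rsync -a (archive) option is used. Files sitting directly in the source root are not covered by it.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Name taken from the thread; the value 4 is an assumption for illustration.
MAX_RSYNC_PROCESS = 4


def build_commands(src_root, dest):
    """Build one rsync command per top-level subdirectory of src_root.

    dest is a hypothetical target such as "host:/backup".
    """
    subdirs = sorted(p for p in Path(src_root).iterdir() if p.is_dir())
    # Trailing slashes make rsync copy the directory contents, not the
    # directory itself, into the matching destination directory.
    return [["rsync", "-a", f"{d}/", f"{dest}/{d.name}/"] for d in subdirs]


def run_parallel(commands, max_procs=MAX_RSYNC_PROCESS, runner=subprocess.call):
    """Run the commands with at most max_procs in flight; return exit codes."""
    with ThreadPoolExecutor(max_workers=max_procs) as pool:
        # Each worker thread blocks on one rsync process at a time, so the
        # pool size bounds the number of concurrent rsyncs.
        return list(pool.map(runner, commands))
```

Calling run_parallel(build_commands("/data", "host:/backup")) would start the transfers. Whether this helps still depends on the storage: on a single rotating disk, setting max_procs to 1 is the safe choice.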

See you

-- 
Real Programmers consider "what you see is what you get" to be just as 
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a "you asked for it, you got it" text editor -- complicated, 
cryptic, powerful, unforgiving, dangerous.


