Parallel rsync's for better Performance.
Matthias Schniedermeyer
ms at citd.de
Wed Oct 28 10:24:32 MDT 2009
On 28.10.2009 10:35, Matt McCutchen wrote:
> On Wed, 2009-10-28 at 10:01 +0100, Matthias Schniedermeyer wrote:
> > On 28.10.2009 09:05, Satish Shukla wrote:
> > > We have a huge amount of data to sync, usually every day, and I wish rsync could guarantee performance.
> > >
> > > I thought of splitting the directories and running parallel rsyncs on them. It may cost me some network bandwidth, but I can control that with the MAX_RSYNC_PROCESS variable. Can someone evaluate the pros and cons of this design? Any help is heartily appreciated.
> >
> > That only works IF:
> > - You have SSDs (preferably good ones, both sides)
> > - Each rsync covers a different physical HDD (both sides)
> > - You have a massive Array with truck-loads of HDDs and a matching
> > controller or something along that line (again both sides).
> > - A combination of the above would also work
> >
> > Otherwise parallel rsyncs completely kill any performance you had, because
> > normal HDDs fall into a seek-storm when more than one rsync works on
> > them.
>
> Asynchronous I/O may solve that, on OSes that support it.
No. That's a fundamental problem with ANY rotating media device.
I'm not saying that you can't build something for the people who have
that kind of hardware, or who are constrained by high-bandwidth,
high-latency network connections (you don't need it for low bandwidth
and/or low latency). But it would be utterly useless for the other 95-99%
of rsync users.
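
For those who do have that kind of hardware, the splitting idea from the
original question could be sketched roughly as follows. This is only a
minimal sketch in Python, assuming one rsync per top-level subdirectory.
The function names, the default cap of 4 and the "-a" option choice are
illustrative assumptions; only the name MAX_RSYNC_PROCESS comes from the
question.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Cap on concurrent rsyncs. The name comes from the original question;
# the value 4 is an assumption for illustration.
MAX_RSYNC_PROCESS = 4


def build_rsync_commands(src_root, dest):
    """Return one rsync argv list per top-level subdirectory of src_root."""
    subdirs = sorted(p for p in Path(src_root).iterdir() if p.is_dir())
    # "-a" is rsync's archive mode. Without a trailing slash on the source,
    # rsync recreates the directory itself under dest.
    return [["rsync", "-a", str(p), dest] for p in subdirs]


def run_parallel(commands, max_procs=MAX_RSYNC_PROCESS, runner=subprocess.call):
    """Run commands with at most max_procs in flight; return exit codes in order."""
    # Each worker thread blocks on one child process, so the pool size
    # is the upper bound on simultaneous rsyncs.
    with ThreadPoolExecutor(max_workers=max_procs) as pool:
        return list(pool.map(runner, commands))
```

Calling run_parallel(build_rsync_commands(src, dest)) with your own source
directory and destination would start the transfers. Files lying directly
in the source root are not covered by this sketch, and on a single ordinary
HDD it will run straight into the seek-storm described above.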
See you
--
Real Programmers consider "what you see is what you get" to be just as
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a "you asked for it, you got it" text editor -- complicated,
cryptic, powerful, unforgiving, dangerous.