Parallel rsync's for better Performance.
matt at mattmccutchen.net
Wed Oct 28 17:00:53 MDT 2009
On Wed, 2009-10-28 at 23:46 +0100, Matthias Schniedermeyer wrote:
> On 28.10.2009 18:27, Matt McCutchen wrote:
> > On Wed, 2009-10-28 at 17:24 +0100, Matthias Schniedermeyer wrote:
> > > On 28.10.2009 10:35, Matt McCutchen wrote:
> > > > On Wed, 2009-10-28 at 10:01 +0100, Matthias Schniedermeyer wrote:
> > > > > Otherwise parallel rsyncs completely kill any performance you had because
> > > > > normal HDDs will fall into a seek-storm, when more than 1 rsync works on
> > > > > them.
> > > >
> > > > Asynchronous I/O may solve that, on OSes that support it.
> > >
> > > No. That's a fundamental problem with ANY rotating media device.
> >
> > "Solve" may be an overstatement, but asynchronous I/O would at least
> > help significantly because one process could issue many I/O requests to
> > the same area of disk at once, and the disk scheduler could fulfill all
> > of those requests before seeking elsewhere. Without asynchronous I/O,
> > after the scheduler fulfills one request, it is left to either seek or
> > wait for the process to issue another request.
>
> And "same disc region" is kind of a problem. In most modern filesystems
> inodes can be pretty random so you can't for sure sort the files by
> inode, or something like that.

I wasn't implying any effort on the process's part to choose files in
the same disk region. Your statement that running parallel rsyncs
creates a worse seek storm than one rsync alone is based on the
assumption that each rsync tends to process files that are together on
disk, and I was simply referring to that assumption.
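
To illustrate the asynchronous I/O point quoted above (one process
keeping many requests outstanding against the same file), here is a
minimal sketch. It is not taken from rsync. It assumes a POSIX system
and uses a thread pool around os.pread() as a stand-in for real
asynchronous I/O; the function name, chunk size, and worker count are
hypothetical. Whether the disk scheduler actually serves the requests
without seeking elsewhere depends on the OS and the on-disk layout.

```python
import os
from concurrent.futures import ThreadPoolExecutor

CHUNK = 64 * 1024  # hypothetical request size, chosen only for illustration


def read_chunks_concurrently(path, chunk=CHUNK, workers=8):
    """Read a file while keeping several pread() calls outstanding at once.

    Threads stand in for real asynchronous I/O here. The point is only
    that the kernel sees many requests for the same file at the same
    time, instead of one request followed by a pause.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offsets = range(0, size, chunk)
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # map() queues every request immediately; up to `workers` of
            # them are in flight at once, and results come back in
            # offset order.
            parts = list(pool.map(lambda off: os.pread(fd, chunk, off), offsets))
    finally:
        os.close(fd)
    return b"".join(parts)
```

A synchronous loop would issue one read, wait for it, and only then
issue the next, which is the gap during which the scheduler may seek
away to serve another process.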