rsync backup performance question

jw schultz jw at pegasys.ws
Mon Jun 23 00:05:10 EST 2003


On Sun, Jun 22, 2003 at 04:20:34PM +0200, Ron Arts wrote:
> jw schultz wrote:
[snip] 
> Would it be feasible to have a separate process pre-creating
> blocksums during the day in separate files (ending in ",rsync")?
> Or, for example, while writing the changed file, the receiver
> would precompute and save the blocksums, for using it on
> the next run? This would save at least half my I/O.

No.  Not with the current codebase.

> 
> >
> >>>The easiest way to manage the scheduling is to have the
> >>>server pull.  If that isn't possible then you will need to
> >>>use an rsync wrapper that keeps the simultaneous runs within
> >>>limits or put a good deal of smarts into the clients.
> >>>
> >>
> >>Yeah, pulling is out of the question, because the server can't
> >>activate the ISDN link. The clients' rsync start time will need
> >>to be hashed across the night.
> >
> >
> >I'd favour a wrapper over depending on hashing the start
> >times.  An alternate approach might be to have the clients
> >open the connection with port forwarding, write a queue file
> >and wait for a completion indicator before closing the
> >connection.  The server could then pull, using the queue
> >files to identify waiting clients.  While a bit more
> >complicated it avoids the temporal gaps caused by the
> >fallback-sleep-retry of the wrappers.
> >
> 
> What do you mean by a wrapper? Something that connects,
> checks whether the server has resources free, and tries again
> later? Does it already exist?

Something that would accept the connection, test to see if
it is ok and, if not, either loop until it is ok or return an
error.  The client side would either have to accept the long
delay or retry when the not-ok error is detected.
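
As a rough illustration only, a server-side wrapper along those
lines might look like the sketch below.  Nothing here is part of
rsync: the names acquire_slot and run_limited, the lock directory,
the slot count and the EX_TEMPFAIL exit status are all assumptions
made for the example, and it relies on flock(), so it is Unix-only.
It takes one of a fixed number of lock files before running the
command it was given, and either waits for a free slot or returns
a "try later" status.

```python
import fcntl
import os
import subprocess
import time

EX_TEMPFAIL = 75  # hypothetical "busy, try again later" exit status


def acquire_slot(lock_dir, max_slots):
    """Return an open, locked slot file, or None if all slots are taken."""
    os.makedirs(lock_dir, exist_ok=True)
    for i in range(max_slots):
        slot = open(os.path.join(lock_dir, "slot%d.lock" % i), "w")
        try:
            # Non-blocking exclusive lock; fails at once if the slot is in use.
            fcntl.flock(slot, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return slot
        except OSError:
            slot.close()
    return None


def run_limited(cmd, lock_dir, max_slots, max_wait=0.0, poll=5.0):
    """Run cmd while holding a slot; return EX_TEMPFAIL if none frees up."""
    deadline = time.monotonic() + max_wait
    slot = acquire_slot(lock_dir, max_slots)
    while slot is None and time.monotonic() < deadline:
        time.sleep(poll)  # the "loop until it is ok" case
        slot = acquire_slot(lock_dir, max_slots)
    if slot is None:
        return EX_TEMPFAIL  # the "return an error" case
    try:
        return subprocess.call(cmd)
    finally:
        slot.close()  # closing the file releases the lock
```

With max_wait left at 0 the wrapper fails fast and the client has
to retry; a larger max_wait trades that for a longer connection,
which matters if the line is billed by time.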

> This might incur ISDN call-setup costs that might be
> unacceptable. Same thing with keep-line-open-until-server-pulls.
> But on the other hand, this will maximize server performance.

You would still use a start time hash to manage the number
of clients waiting to run.  This would just serve to
maximize server performance by ensuring you don't have an
overload.
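
One way to get such a start time hash, sketched under the
assumption that each client has some stable identifier (its
hostname, say), is to hash the identifier into an offset inside
the backup window.  The function name, the choice of SHA-256 and
the window length are illustrative assumptions, not anything
rsync provides.

```python
import hashlib


def start_offset_seconds(client_id, window_seconds):
    """Map a client identifier to a stable offset within the backup window."""
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    # The first 8 bytes of the digest are plenty to spread clients evenly.
    return int.from_bytes(digest[:8], "big") % window_seconds


if __name__ == "__main__":
    # Hypothetical 8-hour window; the client sleeps this long after the
    # window opens before dialing out.
    offset = start_offset_seconds("client-a.example", 8 * 3600)
    print("start %d:%02d after the window opens" % (offset // 3600, offset % 3600 // 60))
```

The same client always gets the same slot, so no coordination is
needed, but two clients can still land close together, which is
why the wrapper remains useful as a backstop.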

> On the other hand, I will probably need to spread the load
> across multiple servers anyway, so maybe something like the
> linux virtual server project would come in handy.... have
> to look into that too.

Rsync doesn't perform well on non-local filesystems.


-- 
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:		jw at pegasys.ws

		Remember Cernan and Schmitt


