rsync backup performance question

jw schultz jw at pegasys.ws
Sun Jun 22 20:31:35 EST 2003


On Sun, Jun 22, 2003 at 11:42:46AM +0200, Ron Arts wrote:
> Dear all,
> 
> I am implementing a backup system, where thousands of postgreSQL
> databases (max 1 GB in size) on as many clients need to be backed
> up nightly across ISDN lines.
> 
> Because of the limited bandwidth, rsync is the prime candidate of
> course.

Only if you are updating an existing file on the backup
server with sufficient commonality from one version to the
next.  pg_dump --format=t would be good.  Avoid the built-in
compression in pg_dump as it defeats rsync.  gzip with the
rsyncable patch and bzip2 are OK if you must compress.
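
A minimal sketch of the dump-then-sync step described above, assuming
a client that pushes to the server.  The function names, the paths and
the destination are hypothetical; only pg_dump --format=t and the
rsync options -z, -v and --stats are real.  The dump is written to the
same path every night so that rsync updates an existing file on the
server instead of sending a new one.

```python
import subprocess


def dump_command(dbname, dump_path):
    # Tar-format dump; pg_dump's built-in compression is deliberately not used.
    return ["pg_dump", "--format=t", "--file", dump_path, dbname]


def rsync_command(dump_path, dest):
    # -z compresses the datastream in transit only; the file on disk
    # stays uncompressed so consecutive versions remain similar.
    return ["rsync", "-z", "-v", "--stats", dump_path, dest]


def backup(dbname, dump_path, dest):
    # Hypothetical driver: dump first, then sync; stop on the first failure.
    subprocess.run(dump_command(dbname, dump_path), check=True)
    subprocess.run(rsync_command(dump_path, dest), check=True)
```

For example, backup("shop", "/var/backups/shop.tar",
"backuphost:/srv/backups/client42/") would run both steps for one
database (all three values are placeholders).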

The other issue is individual file size.  Rsync versions
prior to what is in CVS start having some performance issues
with files larger than the 200-500MB range.  

> Potential problems I see are server load (I/O and CPU), and filesystem 
> limits.

Most of the load is on the sender.  Over ISDN, even with
rsync compressing the datastream, no single update should be
a CPU or I/O issue.  The issue is scheduling so you don't
have too many running simultaneously.

The easiest way to manage the scheduling is to have the
server pull.  If that isn't possible then you will need to
use an rsync wrapper that keeps the simultaneous runs within
limits or put a good deal of smarts into the clients.
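
One way the server-pull scheduling could look, as a rough sketch
rather than a finished tool: a fixed-size worker pool caps how many
rsync processes run at once.  MAX_SIMULTANEOUS, pull_one, pull_all and
the job list layout are assumptions made for illustration; the right
limit depends on the server and on how many ISDN channels it has.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

MAX_SIMULTANEOUS = 4  # assumed limit; tune for the server and link count


def pull_one(source, dest, run=subprocess.run):
    # One pull from a client; returns the source and rsync's exit status.
    result = run(["rsync", "-z", "--stats", source, dest])
    return source, result.returncode


def pull_all(jobs, limit=MAX_SIMULTANEOUS, run=subprocess.run):
    # jobs is a list of (source, dest) pairs.  The pool never runs more
    # than `limit` pulls at the same time; the rest wait their turn.
    with ThreadPoolExecutor(max_workers=limit) as pool:
        return list(pool.map(lambda job: pull_one(job[0], job[1], run), jobs))
```

Each job would name one client's dump and its own directory on the
server, e.g. ("client42:/var/backups/shop.tar",
"/srv/backups/client42/"), so that pulls do not overwrite each other.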

> Does anyone have experience with such setups?

Unlikely on that scale over that sort of link.

I'd suggest experimenting with the -v and --stats options turned on.
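
When experimenting, the --stats output can be reduced to one number:
how much of the file was matched on the receiver rather than sent as
literal data.  The sketch below assumes the output contains lines of
the form "Literal data: N bytes" and "Matched data: N bytes"; check
that against the output of the rsync version in use.  delta_summary is
a hypothetical helper name.

```python
import re


def delta_summary(stats_text):
    # Pull one "<label>: N bytes" figure out of the --stats text.
    def grab(label):
        m = re.search(rf"^{label}:\s*([\d,]+)\s+bytes", stats_text, re.MULTILINE)
        return int(m.group(1).replace(",", "")) if m else None

    literal = grab("Literal data")
    matched = grab("Matched data")
    if literal is None or matched is None or literal + matched == 0:
        return None  # lines missing or nothing transferred
    return {
        "literal": literal,
        "matched": matched,
        "matched_fraction": matched / (literal + matched),
    }
```

A matched fraction close to 1 means consecutive dumps have a lot in
common and rsync is saving bandwidth; a value near 0 means nearly the
whole file is being resent.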

-- 
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:		jw at pegasys.ws

		Remember Cernan and Schmitt


