performance problems while building the filelist...

jw schultz jw at pegasys.ws
Fri Aug 16 02:12:00 EST 2002


On Fri, Aug 16, 2002 at 12:08:39PM +0200, Scheufen Stephan wrote:
> Hello rsync PROs,
> 
> I'm satisfied with the rsync features... never seen a better replication tool!! ;-)
> 
> But I have some problems... :-(
> OK, here is my installation:
> - one Compaq ProLiant ML370 with cached SCSI HDDs, 1.2 GHz and 1 GB RAM;
> on this machine we have the rsync daemon running to export the data we want to replicate
> - now we have 68 other NAS machines in our branch offices, and these NAS machines connect to the ProLiant server to replicate the data it holds
> - approx. number of files on the ProLiant server = 26,000 files (~6 GB)

First let's make sure we have an accurate picture:
	You have a central master server running the rsync
	daemon.  68 servers in branch offices connect to the
	master server as rsync clients.  The fileset is only
	26,000 files filling about 6GB.

> Here's the problem:
> - the ProLiant server HDDs are overloaded when more than 3 NAS machines are creating their file lists before they start to replicate.
> - if all 68 machines are connecting, it takes 4 hours (!) before all machines have their file lists and start to replicate :-(
> 
> possible solutions (other than upgrading the cache memory of the HDD controller)???
> does somebody have an idea?
> can we precompute the file list and hold it somewhere?
> what else can we do....?

Upgrading the HDD cache won't buy you much, if anything.

Several ideas in random order:

1.  Schedule your client connections so that they don't
collide with each other.  With 68 clients this could be a bit
difficult, as you have probably noticed.  (A crude cron
example follows the list.)

2.  Create a wrapper for your rsync jobs that queries the
master server to check the load before running.  This could
be done very simply with a small tool on the server (perhaps
a CGI script) that tracks the number of concurrent jobs and
refuses new ones past a limit.  This is a little complicated
and prone to error, but in some cases might be the ticket.
(A rough sketch follows the list.)

3.  set "max connections" in rsyncd.conf and wrap the client
job in a script that tests for connection refusal and does a
fallback+retry.

4.  Break the requests up.
This is a standard answer for file-scan slowness, but 26,000
files isn't very much, so this is probably not your problem.

5.  Increase the size of your inode cache.  You don't say
what OS you are running; it might have a fixed-size inode
cache that the tree scan is thrashing.

6.  Don't use the -c|--checksum option.  The "refuse
options" parameter in rsyncd.conf may be your friend (see
the example after the list).  You didn't show your command
line, so it is possible you are checksumming every file,
which is seldom needed and really thrashes the caches.

7.  Increase the number of levels in the replication
hierarchy.  Don't have all 68 machines sync with one.

	a.  add a couple of other servers alongside your
	master to spread the load so that only a fraction
	have to sync directly from the master.

	b.  use a two tier system where most of your clients
	sync to other clients.  This might use regional
	offices as middlemen.  The practicality of this is
	very dependent on network topology and the
	reliability of the second tier systems.
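
To illustrate idea 1: the crude version is just hand-staggered
cron entries on the branch boxes.  The times, module name and
destination below are placeholders, not your setup:

	# /etc/crontab fragment on NAS box #1 (illustrative only)
	0  1 * * *   root   rsync -a rsync://master/data/ /local/replica/
	# ...on NAS box #2, twenty minutes later
	20 1 * * *   root   rsync -a rsync://master/data/ /local/replica/
	# ...on NAS box #3, and so on around the clock
	40 1 * * *   root   rsync -a rsync://master/data/ /local/replica/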
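
For idea 2, a minimal client-side sketch in Python.  It assumes
a tiny counter CGI on the master (not shown here) that does
nothing but print the number of rsync jobs currently running;
the URL, threshold, retry interval and rsync command line are
all made-up placeholders:

	#!/usr/bin/env python3
	import subprocess, sys, time, urllib.request

	LOAD_URL = "http://master/cgi-bin/rsync-load"   # hypothetical job counter
	MAX_JOBS = 3          # don't start if the master already serves this many
	RETRY_SECONDS = 300   # wait 5 minutes between polls

	def master_job_count():
	    with urllib.request.urlopen(LOAD_URL, timeout=10) as resp:
	        return int(resp.read().decode().strip())

	while True:
	    try:
	        busy = master_job_count()
	    except (OSError, ValueError):
	        busy = MAX_JOBS            # can't tell -- assume busy and wait
	    if busy < MAX_JOBS:
	        break
	    time.sleep(RETRY_SECONDS)

	# placeholder transfer; substitute the real module and destination
	sys.exit(subprocess.call(
	    ["rsync", "-a", "rsync://master/data/", "/local/replica/"]))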
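
For idea 3, the daemon side is one line in rsyncd.conf (3 is
just an example value; pick whatever the ProLiant's disks can
stand):

	# rsyncd.conf on the master
	max connections = 3

The client side then needs a back-off loop.  A hedged sketch in
the same style as above -- for simplicity it treats any non-zero
exit (including the daemon's "max connections" refusal) as
"busy, try again later", and the command line and intervals are
placeholders:

	import random, subprocess, time

	CMD = ["rsync", "-a", "rsync://master/data/", "/local/replica/"]

	for attempt in range(20):              # give up after 20 tries
	    if subprocess.call(CMD) == 0:
	        break                          # transfer finished
	    # refused or failed: back off with a little jitter so the
	    # 68 clients don't all retry at the same instant
	    time.sleep(600 + random.randint(0, 300))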
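
For idea 6, the daemon can simply refuse the option so that no
branch-office command line can trigger it, e.g.:

	# rsyncd.conf on the master -- clients asking for -c/--checksum are refused
	refuse options = checksum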


Some of these ideas are only appropriate in a few situations.
Your solution may involve more than one of them.

> 
> I have no more ideas myself, but:
> I would like to thank you in advance for helpful hints and tips!!
> 
> greetings from Germany
> Stephan
> 



-- 
________________________________________________________________
	J.W. Schultz            Pegasystems Technologies
	email address:		jw at pegasys.ws

		Remember Cernan and Schmitt


