rsync bottleneck

tim.conway at philips.com tim.conway at philips.com
Wed May 29 09:57:02 EST 2002


Dib:  sorry about the long delay in answering.  No time for the list for a 
week.  I saw one response, but it didn't address your question.
While I can't tell you how your system will act under your load, I can 
give an analogy.
The generation of the file list is the big, uncontrollable load.  Since it 
happens locally, it's not affected by --bwlimit.
Your situation, being backups FROM multiple machines TO a single machine, 
mitigates things a bit.  It will be like running "find backupdirectory 
-type f -exec sum {} \;" for each rsync job, where backupdirectory is the 
destination for that job's backup.  Since the jobs don't all back up over 
each other, they're partitioned by machine, which makes things milder than 
my case, where I have a single master which needs to be duplicated to 29 
different replicas, necessitating 29 discrete "find directory -type f 
-exec sum {} \;" processes.  Now, rsync won't really be that CPU-intensive, 
as it checksums files only on a time/size mismatch, or if given the "-c" 
option, so the actual load is somewhere between "find directory -ls" and 
the sum version.  On a run where no existing file has changed (touched, 
modified), it will be purely like a "find -ls".  Either way, it is 
terribly disk-I/O intensive.
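If you want a rough feel for that cost on your own tree, you can time the two bounding cases directly.  This is just a sketch; the path is a placeholder for one job's backup destination:

```shell
# Lower bound -- no changed files: rsync's list pass is a pure
# metadata walk, roughly like this.
time find /backup/hostA -ls > /dev/null

# Upper bound -- "-c", or lots of time/size mismatches: every file
# gets read and checksummed, roughly like this.
time find /backup/hostA -type f -exec sum {} \; > /dev/null
```

The real rsync run lands somewhere between the two, depending on how many files miss on time/size.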
If the backup machine will be performing ANY other function besides these 
backups while they're running, find a way to run rsync at a low priority. 
If you start so many that they all fail, find a way to stagger the 
invocations, such that only a limited number of rsyncs are in the 
file-list generation stage at any one time.  You can probably support 
maybe 4 simultaneous invocations per CPU.  After the list is generated, it 
becomes more of a network-limited process, which you can further throttle 
with --bwlimit.
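A minimal sketch of that staggering, assuming a GNU userland (xargs -P) and pull-style backups from the server; the hostnames, paths, and the 500 KB/s cap are made-up placeholders:

```shell
#!/bin/sh
# Hypothetical staggering wrapper: at most 4 rsync jobs at once,
# each at low CPU priority and capped on the wire with --bwlimit
# (in KB/s).  hosts.txt holds one client hostname per line.
printf '%s\n' hostA hostB hostC hostD hostE > hosts.txt

# -P4 limits concurrency: a new job starts only when one finishes,
# so only a few file-list generations are running at any one time.
# nice -n 19 keeps them from starving whatever else the box does.
xargs -P4 -I HOST \
    nice -n 19 rsync -a --bwlimit=500 HOST:/data/ /backup/HOST/ \
    < hosts.txt
```

With one backup directory per host, as above, the jobs stay partitioned by machine the way you described.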

Tim Conway
tim.conway at philips.com
303.682.4917
Philips Semiconductor - Longmont TC
1880 Industrial Circle, Suite D
Longmont, CO 80501
Available via SameTime Connect within Philips, n9hmg on AIM
perl -e 'print pack(nnnnnnnnnnnn, 
19061,29556,8289,28271,29800,25970,8304,25970,27680,26721,25451,25970), 
".\n" '
"There are some who call me.... Tim?"




"diburim" <diburim at hotmail.com>
Sent by: rsync-admin at lists.samba.org
05/23/2002 12:38 AM

 
        To:     <rsync at lists.samba.org>
        cc:     (bcc: Tim Conway/LMT/SC/PHILIPS)
        Subject:        rsync bottleneck
        Classification: 



Hello,

I'm planning to use rsync to back up a lot of nodes to one rsync server
over a WAN.
I'm aware of the fact that on a big directory tree rsync will consume a
lot of memory and sometimes even hang.
Do you have any estimate of how many rsync clients can work with one
server at the same time?
I can control each client's bandwidth. What will be the bottleneck on the
server's resources?
Is there any rule of thumb?
My server runs Linux on a 1 GHz Pentium CPU with 512 MB of memory.
Please share your experience.

Thanks
Dib Urim

-- 
To unsubscribe or change options: 
http://lists.samba.org/mailman/listinfo/rsync
Before posting, read: http://www.tuxedo.org/~esr/faqs/smart-questions.html






