performance problem of using parallel rsync to stage data from 1 source to multiple destinations

xuehai zhang hai at cs.uchicago.edu
Wed Aug 31 21:42:40 GMT 2005


Hi all,
I am new to rsync and I apologize in advance if my question is shallow.
I wrote a simple script that uses rsync to transfer a big file (~600MB) from a single source to a 
variable number of destinations in parallel. When I transfer the file to 4 destination machines, the 
overall transfer time is X. When I transfer the same file to 8 destination machines, the overall 
transfer time is Y. However, in my experiments Y is smaller than 2 * X. Why is the time to transfer 
the file to 2N nodes shorter than twice the time to transfer the same file to N nodes? Does this 
make sense to you? If it does, what could be the reason?
Thank you so much for your help in advance!
Xuehai

P.S. Here is the script that does the parallel rsync:

#!/bin/sh

LIST="ccn2"

if [ "$#" -gt "0" ] ; then
   if  [ "$1" -eq "2" ] ; then
      LIST="ccn2"
   fi
   if  [ "$1" -eq "4" ] ; then
      LIST="ccn2 ccn3 ccn4"
   fi
   if  [ "$1" -eq "6" ] ; then
      LIST="ccn2 ccn3 ccn4 ccn7 ccn6"
   fi
   if  [ "$1" -eq "8" ] ; then
      LIST="ccn2 ccn3 ccn4 ccn5 ccn6 ccn7 ccn8"
   fi
fi

echo "nodes: $LIST"

date
for dest in $LIST
do
   time rsync -az /tmp/disk.img "$dest":/tmp/ &
done
wait
date
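
The overall time that the two `date` calls bracket could also be captured directly as a number of seconds, which makes it easier to compare X and Y across runs. The following is only a minimal sketch under some assumptions: `timed_parallel` and `transfer` are hypothetical helper names that are not part of the script above, and `date +%s` is assumed to be available (it is on GNU and BSD systems but is not required by POSIX).

```shell
#!/bin/sh
# Hypothetical helper: the single-destination transfer used in the script.
# It is kept in its own function so it can be swapped out for a dry run.
transfer() {
    rsync -az /tmp/disk.img "$1":/tmp/
}

# Hypothetical helper: start one transfer per destination in the
# background, wait for all of them, and print
# "<number of destinations> <overall wall-clock seconds>".
timed_parallel() {
    start=$(date +%s)      # assumption: date supports %s (epoch seconds)
    for dest in "$@"; do
        transfer "$dest" &
    done
    wait                   # block until every background transfer exits
    end=$(date +%s)
    echo "$# $((end - start))"
}

# Example call (commented out so that sourcing this file transfers nothing):
# timed_parallel ccn2 ccn3 ccn4
```

Because the elapsed time is measured around `wait`, it reflects the slowest of the parallel transfers, which is the same quantity the `date` lines in the script report.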

