Question about rsync and BIG mirror

Tony at ServaCorp.com
Fri Mar 3 15:32:35 GMT 2006


Flames invited if I'm wrong on any of this, but:

Some (long overdue) backups indicate that network speed
should be much more important than cpu speed.
Your results will depend heavily on your exact mix
and I cannot think of any reasonable way to quantify it.
That said, this may help give you a clue.

This is about two and a half hours to transfer two months of changes
to 15GB over a connection limited to about 20 kbytes/second (--bwlimit=20).
This is approx. 2.38% of the files, 1.19% of the volume changed.
Most if not all of the changed files are actually new files.
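
For anyone who wants to recheck those percentages, here is a rough Python sketch using the figures from the rsync summary further down. The variable names are mine, and treating the "sent" bytes as the changed volume is an assumption: with -z the sent figure is compressed, so it probably understates the real volume somewhat.

```python
# Figures copied from the rsync summary in this message.
files_total = 58183             # files in the tree
files_sent = 1385               # xfer# counter on the last file shown
bytes_sent = 178_475_844        # "sent" bytes (compressed, since -z was used)
bytes_total = 15_034_880_070    # "total size"
elapsed_s = 146 * 60 + 43.557   # "real" time reported by time

pct_files = 100 * files_sent / files_total
pct_volume = 100 * bytes_sent / bytes_total
avg_rate = bytes_sent / elapsed_s   # average bytes/second over the whole run

print(f"files changed:  {pct_files:.2f}%")
print(f"volume changed: {pct_volume:.2f}%")
print(f"average rate:   {avg_rate:,.0f} bytes/sec over {elapsed_s / 3600:.2f} hours")
```

The average rate comes out close to the --bwlimit value, which suggests the link (or rather the limit placed on it) was the bottleneck, not the CPUs.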

Be aware that big files that are logically almost the same
but are physically a bit different
can be rather time consuming to transfer.
Watch the timeouts: older versions of rsync
sometimes needed very large timeout values.

You probably want the exact same version of rsync on both sides,
although I've had almost no problems using whatever version
happened to be available at the time.
The only problems I've seen were slightly spooky error messages.

If the computers are fast and the network is slow,
compression (-z) is probably a good idea.
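
To put rough numbers on that for the setup in the quoted question (500MB to 3GB of changes per night, a 7 hour window), here is a back-of-the-envelope Python sketch. It rests on several assumptions of mine: that "2MB symmetric" means 2 Mbit/second, that whole changed files go over the wire (rsync's delta transfer may send less), that protocol overhead is ignored, and that compression achieves 2:1, which is a made-up ratio that depends entirely on the data.

```python
def transfer_hours(megabytes, link_mbit_per_s, compression_ratio=1.0):
    """Hours needed to push `megabytes` over the link, ignoring protocol overhead."""
    bits_on_wire = megabytes * 8e6 / compression_ratio
    return bits_on_wire / (link_mbit_per_s * 1e6) / 3600

LINK_MBIT = 2        # assumption: "2MB symmetric" read as 2 Mbit/second
WINDOW_HOURS = 7     # 0:00 to 7:00 from the question

for mb in (500, 3000):
    plain = transfer_hours(mb, LINK_MBIT)
    squeezed = transfer_hours(mb, LINK_MBIT, compression_ratio=2.0)  # hypothetical 2:1
    verdict = "fits" if plain <= WINDOW_HOURS else "does not fit"
    print(f"{mb} MB: {plain:.2f} h plain, {squeezed:.2f} h at 2:1 ({verdict} in the window)")
```

If the line is really 2 Mbyte/second, everything is eight times faster and compression matters correspondingly less.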


Files which have the same size and last modified times are assumed to be
identical and are not further checked. (There are switches to check anyway)
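
As a rough illustration of that quick check, here is a minimal Python sketch. It is not rsync's actual code: the function name is mine, and it ignores rsync's finer points. The switches I had in mind are --checksum (-c) and --ignore-times (-I).

```python
import os
import tempfile

def looks_unchanged(src_path, dst_path):
    """True if dst exists with the same size and mtime (whole seconds) as src."""
    try:
        src, dst = os.stat(src_path), os.stat(dst_path)
    except FileNotFoundError:
        return False  # a missing file always needs transferring
    return src.st_size == dst.st_size and int(src.st_mtime) == int(dst.st_mtime)

# Small demonstration with throwaway files of equal size but different contents.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src.dwg")
    dst = os.path.join(tmp, "dst.dwg")
    for path, data in ((src, b"AAAA"), (dst, b"BBBB")):
        with open(path, "wb") as fh:
            fh.write(data)
    os.utime(src, (1_000_000_000, 1_000_000_000))
    os.utime(dst, (1_000_000_000, 1_000_000_000))
    same_stamp = looks_unchanged(src, dst)    # True, although the contents differ
    os.utime(dst, (1_000_000_060, 1_000_000_060))
    newer_stamp = looks_unchanged(src, dst)   # False, the mtime differs
    missing = looks_unchanged(src, os.path.join(tmp, "nope"))  # False, no such file

print(same_stamp, newer_stamp, missing)
```

The first case is the one to keep in mind: same size and timestamp but different contents gets skipped, which is why the check-anyway switches exist. It also means the timestamps have to be preserved on the copy (-a does that) for the check to be of any use.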

What will matter is how much changed and where.
You are probably better off setting things up so you can cope sensibly
when a lot of stuff gets rearranged. It happens.

From a rather overdue (Dec 30) backup of almost 15GB of drawings
(secondary, off-off-site; T1 to backwater cable modem):
Receiver (server) looks 99+% idle (with top taking most of the CPU)
Sender (client) looks 94% idle (top taking about 4%), 700MHz AMD Duron
58183 files

time rsync -avPz --bwlimit=20 --timeout=750 \
        --password-file=/etc/rsync.secrets/xxxxxx \
        /home/xxxxxxxxxxx/*  rsync-sjs-dwg@ssssssss::xxxxxxxxx/

      146587 100%   44.89kB/s    0:00:03 (xfer#1385, to-check=643/58183)

sent 178,475,844 bytes  received 198,042 bytes  20,348.94 bytes/sec
total size is 15,034,880,070  speedup is 84.15

real    146m43.557s
user    2m4.680s
sys     0m16.950s


time rsync -av rsync-xxxxxxx@ssssssss24.241.188.11::rsync-xxxxxxx/ \
        /home/rsync-xxxxxxxx/
(very stale, Sep 8 2005; about 15% idle; 300MHz P2)

sent 315061 bytes  received 1,239,438,988 bytes  5,571,928.31 bytes/sec
total size is 15,186,475,211  speedup is 12.25

real    3m42.096s
user    0m54.560s
sys     1m44.730s
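
The "speedup" figure in both summaries appears to be simply the total size divided by the bytes that actually crossed the wire (sent plus received). A small Python sketch to check that against the two runs above; the function name and run labels are mine.

```python
def speedup(total_size, sent, received):
    """Total size of the tree divided by the bytes actually exchanged."""
    return total_size / (sent + received)

# (total size, sent, received) copied from the two summaries above.
runs = {
    "first run (push, bwlimit 20)": (15_034_880_070, 178_475_844, 198_042),
    "second run (pull, no limit)": (15_186_475_211, 315_061, 1_239_438_988),
}

for label, (total, sent, received) in runs.items():
    print(f"{label}: speedup is {speedup(total, sent, received):.2f}")
```

Both values agree with what rsync printed (84.15 and 12.25), so the figure is mostly a measure of how little of the tree had changed since the previous run.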



> -----Original Message-----
> From: rsync-bounces+tony=servacorp.com at lists.samba.org
> [mailto:rsync-bounces+tony=servacorp.com at lists.samba.org]On Behalf Of
> johan.boye at latecoere.fr
> Sent: Friday, March 03, 2006 1:03 AM
> To: rsync at lists.samba.org
> Subject: Question about rsync and BIG mirror
>
>
> // I wonder if this message has been posted, so I sent it again //
>
> Hello,
>
>   I'm quite a n00b on rsync stuff but I went to the website, read
> FAQ/how-to, Google and more, I setup my own rsync server and clients:
> everything works fine :-D
>
>   I'm preparing a plan for a production mode in my company: we need to
> mirror around 100GB of data through a special VPN internet line 2MB
> symmetric.
>   The first time, the data will be transferred by a medium such as a HD.
> Next, each night, we will try to update clients from the master server.
> It should be around 500MB to 3GB, not so much in comparison to the
> original size of data.
>   I discovered "rsync" uses a lot of CPU and RAM to run "checksums" on
> files that have to be synchronised. I need an opinion about my situation:
>
>
>   So: each night, from 0:00am to maximum 7:00am, the server will have to
> check the 100GB of files and see what files have been modified, then
> upload them to the clients. Each file is around 4MB to 40MB on average.
>
> I would like to know your opinion about this situation:
>  - Should I setup a strong dual CPU computer dedicated to calculate this
> whole stuff?
>  - What about the memory I should install?
>  - Is there any bandwidth used during the checksums computation? Mine is
> quite limited.
>  - I know the client computer will have to check files too; Disk I/O
> will be the most used. I think this computer will have NFS mount from a
> "datacenter" computer with a GB LAN card, I wonder it will be enough...
>
>   I'm quite scared of the amount of data to check before synchronising
> clients, and how long it will take. To finish shortly, what do YOU
> think? Any advice?
>
>
> Thanks,
>
> Johan


