Question about rsync and BIG mirror

Shachar Shemesh rsync at shemesh.biz
Fri Mar 3 10:00:16 GMT 2006


johan.boye at latecoere.fr wrote:

>Hello,
>
>  So: each night, from 0:00am to maximum 7:00am, the server will have to
>check the 100GB of files and see what files have been modified, then,
>upload them to the clients. Each file is around 4MB to 40MB on average.
>  
>
Are the clients what you call the "mirror"? Are there several of them?

>I would like to know your opinion about this situation:  
> - Should I setup a strong dual CPU computer dedicated to calculate this
>whole stuff? 
>  
>
That depends.

> - What about the memory I should install? 
> - Is there any bandwidth used during the checksums computation? Mine is
>quite limited.
>  
>
Is that "2 mega BYTE per second" or "2 mega BIT per second"?

> - I know the client computer will have to check files too; Disk I/O
>will be the most used. I think this computer will have NFS mount from a
>"datacenter" computer with a GB LAN card, I wonder it will be enough...
>  
>
Scanning 100GB of data in 7 hours doesn't require that much disk
bandwidth: it averages out to roughly 4 MB per second.
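
If you want to redo that estimate with your own numbers, here is a
minimal sketch in Python. The function name is made up for illustration,
and I am assuming decimal units (1 GB = 1000 MB) and a perfectly even
read rate over the whole window:

```python
def required_scan_rate_mb_per_s(total_gb: float, window_hours: float) -> float:
    """Average read rate (MB/s) needed to read total_gb within window_hours."""
    total_mb = total_gb * 1000           # decimal units: 1 GB = 1000 MB
    window_seconds = window_hours * 3600
    return total_mb / window_seconds


rate = required_scan_rate_mb_per_s(100, 7)
print(f"{rate:.2f} MB/s")  # prints "3.97 MB/s"
```

For comparison, a gigabit link tops out at about 125 MB per second in
theory, so even with NFS overhead there should be plenty of headroom.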

>  I'm quite scared of the amount of data to check before synchronise
>clients, and how long it will take. To finish shortly, what do YOU
>think? Any advices?
>  
>
Here are a few performance characteristics of rsync I think you should
be aware of:
- By default, rsync only checks files that are different between
receiver and sender in timestamp or size. If most files in your archive
did not change at all, you can discard them altogether from your
bandwidth calculations.
- The receiver only does a linear scan of the file, followed by
generating a second file (which MAY require random access of the first
file, if blocks in the file changed order). Its CPU requirements are
negligible, which means the sender carries almost all of the CPU load.
This is bad for the case where you have one mirror source sending out
info to many mirrors, as all the CPU load falls on the single server.
- If your bandwidth is 2 mega BIT per second, you are a bit marginal as
far as transferring 5GB of data in 7 hours goes. This has nothing to do
with rsync, though. A simple calculation shows the same result: getting
full bandwidth for the entire 7 hours will allow you to transfer about
6.3 GB of data, and that is before any protocol overhead.
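
The bandwidth arithmetic from the last point can be sketched as follows.
This is only a rough upper bound: the function name is hypothetical, I am
assuming decimal units (1 megabit = 1,000,000 bits, 1 GB = 10^9 bytes),
and it ignores protocol overhead and any other traffic on the link.

```python
def max_transfer_gb(link_mbit_per_s: float, window_hours: float) -> float:
    """Upper bound on data (GB) movable over the link within the window."""
    bytes_per_second = link_mbit_per_s * 1_000_000 / 8  # bits -> bytes
    return bytes_per_second * window_hours * 3600 / 1e9


# 2 megaBIT per second for 7 hours
print(f"{max_transfer_gb(2, 7):.1f} GB")   # prints "6.3 GB"

# 2 megaBYTE per second (16 megabit) for 7 hours
print(f"{max_transfer_gb(16, 7):.1f} GB")  # prints "50.4 GB"
```

The factor of eight between the two cases is why the BIT/BYTE question
matters so much here.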

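To illustrate the first point above (the default check on timestamp and
size), here is a sketch of the idea in Python. It is not rsync's actual
code: the function name is made up, both paths are assumed to be local
files, and comparing modification times at whole-second resolution is an
assumption of the sketch.

```python
import os


def needs_check(src_path: str, dst_path: str) -> bool:
    """Return True if dst is missing or differs from src in size or mtime."""
    try:
        dst = os.stat(dst_path)
    except FileNotFoundError:
        return True  # nothing on the receiving side yet
    src = os.stat(src_path)
    # Same size and same modification time: assume the file is unchanged
    # and skip it without reading its contents.
    return (src.st_size != dst.st_size
            or int(src.st_mtime) != int(dst.st_mtime))
```

Only files for which a test like this returns True cost you any disk
reads or bandwidth; the rest are skipped after a stat call.
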
>Thanks,
>
>Johan
>  
>

