Large file - match process taking days

Shachar Shemesh shachar at shemesh.biz
Wed Jul 30 13:33:04 GMT 2008


Rob Bosch wrote:
> I've been trying to figure out why some large files are taking a long time
> to rsync (80GB file).  With this file, the match process is taking days.
> I've added logging to verbose level 4.  The output from match.c is at the
> point where it is writing out the "potential match at" message.  In a 9 hour
> period the match verbiage has changed from:
>
>   
Can you tell where the bottleneck is? Is it the sender's CPU? The 
receiver's? The network? Local I/O on either side?
> I believe this means that 4.8GB of the file has been processed in this 9
> hour period?  Blocksize is currently manually set at 1149728, 4 times the
> default value. 
Rsync does have some CPU-inefficient behavior for especially large 
files. However, that should not kick in at the block size you are using 
(assuming the files are fairly identical). Try increasing the block size 
a little further, to 1638400 (80% utilization on the hash table), and 
see if things get any better.
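
For reference, here is a rough sketch of the arithmetic behind the 80%
figure. It rests on two assumptions that are not stated above: that "80GB"
means 80 GiB, and that the hash table has 65536 (1 << 16) slots. The
function name table_load is made up for illustration and is not an rsync
identifier.

```python
# Rough estimate of hash table utilization for a given block size.
# ASSUMPTIONS (not from the original message): the file is exactly
# 80 GiB, and the hash table has 1 << 16 slots.

FILE_SIZE = 80 * 2**30   # assumed: "80GB" read as 80 GiB
TABLE_SLOTS = 1 << 16    # assumed: 65536-slot hash table


def table_load(block_size, file_size=FILE_SIZE, slots=TABLE_SLOTS):
    """Return blocks-per-slot for the given block size (hypothetical helper)."""
    # Integer ceiling division: a final partial block still counts as a block.
    blocks = -(-file_size // block_size)
    return blocks / slots


if __name__ == "__main__":
    for bs in (1149728, 1638400):
        print(f"block size {bs}: load {table_load(bs):.2f}")
```

Under those assumptions, 1638400 gives a load of about 0.80, while
1149728 gives about 1.14, that is, more blocks than table slots.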

Are the files fairly identical?

Shachar

