Large file - match process taking days
shachar at shemesh.biz
Wed Jul 30 13:33:04 GMT 2008
Rob Bosch wrote:
> I've been trying to figure out why some large files are taking a long time
> to rsync (80GB file). With this file, the match process is taking days.
> I've added logging to verbose level 4. The output from match.c is at the
> point where it is writing out the "potential match at" message. In a 9 hour
> period the match verbiage has changed from:
Can you tell where the bottleneck is? Is it on the sender's CPU? The
receiver's? The network? Local I/O on either side?
> I believe this means that 4.8GB of the file has been processed in this 9
> hour period? Blocksize is currently manually set at 1149728, 4 times the
> default value.
Rsync does have some CPU-inefficient behavior for especially large
files. However, it should not happen at the block size you are using
(assuming the files are fairly identical). Try increasing it a little
further, to 1638400 (about 80% utilization of the hash table), and see
if things are any better.
Are the files fairly identical?
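
For reference, the 80% figure for a block size of 1638400 can be
reproduced with a little arithmetic. The sketch below rests on two
assumptions that are not stated in this thread: that "80GB" means
80 GiB, and that the hash table has 65536 slots (a value inferred from
the 80% number itself). The names block_count, table_load, and
ASSUMED_TABLE_SLOTS are made up for illustration and are not rsync
identifiers.

```python
# Rough estimate of hash table load for a given rsync block size.
# Assumptions (not from the thread): file size is exactly 80 GiB and
# the hash table has 65536 slots.

ASSUMED_FILE_SIZE = 80 * 2**30      # bytes, assuming "80GB" means 80 GiB
ASSUMED_TABLE_SLOTS = 65536         # assumed number of hash table slots


def block_count(file_size: int, block_size: int) -> int:
    """Number of blocks needed to cover the file (ceiling division)."""
    return -(-file_size // block_size)


def table_load(file_size: int, block_size: int,
               slots: int = ASSUMED_TABLE_SLOTS) -> float:
    """Blocks per hash table slot; values above 1.0 mean more blocks than slots."""
    return block_count(file_size, block_size) / slots


if __name__ == "__main__":
    for bs in (1149728, 1638400):
        blocks = block_count(ASSUMED_FILE_SIZE, bs)
        load = table_load(ASSUMED_FILE_SIZE, bs)
        print(f"block size {bs}: {blocks} blocks, load {load:.2f}")
```

Under these assumptions the current block size of 1149728 yields more
blocks than there are slots, while 1638400 brings the load down to
roughly 0.8. If the real file size or table size differs, the numbers
shift accordingly.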