Rsyncing really large files

Shachar Shemesh rsync at shemesh.biz
Sat Mar 5 12:10:11 GMT 2005


Lars Karlslund wrote:

>>And I'm suggesting making it static, by adjusting the hash table's size 
>>according to the number of blocks. Just do 
>>"hashtablesize=(numblocks/8+1)*10;", and you should be set.
>
> Or maybe it should really be dynamic.

I'm talking about the hash table load, i.e. the ratio between the 
number of buckets the table has and the number of blocks that go into it. 
This is almost unrelated to your problem.
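
To make the load argument concrete, here is a minimal sketch in Python. It mirrors the quoted expression "hashtablesize=(numblocks/8+1)*10;" using integer division. The fixed table size of 65536 buckets is an assumption used only for comparison, and the function names are made up for illustration.

```python
def hash_table_size(num_blocks):
    # Mirrors the quoted C expression; // reproduces C integer division.
    return (num_blocks // 8 + 1) * 10


def load_factor(num_blocks, num_buckets):
    # Blocks per bucket: the average chain length a lookup has to walk.
    return num_blocks / num_buckets


if __name__ == "__main__":
    fixed_buckets = 65536  # assumed fixed table size, for comparison only
    for num_blocks in (1_000, 100_000, 10_000_000):
        scaled = hash_table_size(num_blocks)
        print(
            num_blocks,
            scaled,
            round(load_factor(num_blocks, scaled), 3),
            round(load_factor(num_blocks, fixed_buckets), 3),
        )
```

With the scaled table the load stays at roughly 0.8 blocks per bucket however many blocks there are, while with a fixed number of buckets it grows linearly with the block count.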

> I adjusted the block-size as an experiment, as I read somewhere about 
> the default blocksize of 700 bytes. Now I'm told the blocksize is 
> calculated automatically. Which is it?

According to my very unofficial reading of the source code, it is 
dynamic. Do try dropping the parameter and see how things go. 
Also, try setting it really high, say 50MB, and tell us how things go 
then. This is just so we can find out where the bottleneck is in your case.
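
As an illustration of what "dynamic" could mean here, the sketch below assumes a square-root heuristic with a 700-byte floor. Both the heuristic and the names are assumptions made for this example, not something confirmed from the rsync source in this thread.

```python
import math

MIN_BLOCK_SIZE = 700  # assumed floor, matching the "700 bytes" figure quoted above


def guess_block_size(file_len):
    """Hypothetical heuristic: about sqrt(file_len), never below the floor."""
    return max(MIN_BLOCK_SIZE, math.isqrt(file_len))


if __name__ == "__main__":
    for file_len in (1_000, 1_048_576, 500 * 10**9):
        print(file_len, guess_block_size(file_len))
```

Under this assumption small files stay at the floor, and only a file in the hundreds of gigabytes would end up with a block size around 700K.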

> But hey, I can run all the tests you want. Just tell me what to do.

See previous paragraph. Comparative numbers for:
- Always transferring the whole file.
- Block sizes of 64k (as you have been doing so far).
- Default block sizes (about 700K, according to my calculations).
- 50MB block sizes.

Also, knowing the CPU and network load of each solution would be very 
beneficial.
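
The four runs could be set up along the lines of the sketch below, which only builds and prints the command lines. The source and destination paths are placeholders, and the helper name is made up for this example. It uses rsync's --whole-file, --block-size and --stats options. Some rsync versions may refuse a block size as large as 50MB.

```python
import shlex


def build_test_commands(src, dest):
    # --stats makes rsync report how much data was actually sent.
    base = ["rsync", "--stats"]
    return {
        "whole-file": base + ["--whole-file", src, dest],
        "64k": base + ["--block-size=65536", src, dest],
        "default": base + [src, dest],
        "50MB": base + ["--block-size={}".format(50 * 1024 * 1024), src, dest],
    }


if __name__ == "__main__":
    # Placeholder paths; substitute the real file and destination.
    for name, cmd in build_test_commands("bigfile.img", "backuphost:/backup/").items():
        print(name + ":", shlex.join(cmd))
```

Running each command under a timing tool such as time(1) would give a rough CPU figure to go with the transfer statistics.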

> Okay, okay, my mistake. Should I just remove the parameter altogether?

Would probably be better, yes.

>>Keyin, I'm trying to make rsync better. Lars' problem is an opportunity 
>>to find a potential bottleneck. Trying to solve his use of possibly 
> Well, it's probably a non-standard situation for rsync anyway.

The fact that your setup is very likely non-optimal does not mean that 
rsync cannot be made even better.

>>non-optimal values won't help rsync, though it will help him. Let's keep 
>
> Well, me either, as the rsync job processes both this gigantic file 
> and other smaller ones.

If you don't specify block sizes, this should not be a problem.

> Whoa, is that the subject? I thought the subject was solving my 
> problem <big smile>

Not for four or five messages, no :-)

          Shachar

-- 
Shachar Shemesh
Lingnu Open Source Consulting ltd.
Have you backed up today's work? http://www.lingnu.com/backup.html


