Why is the variable last_i needed in rsync's match.c?

Martin Pool mbp at samba.org
Mon Mar 25 13:32:50 EST 2002


On 23 Mar 2002, Kim Jongtae <javu at enpia.net> wrote:
>  Hi all
> 
>   I have been reading the rsync source. rsync builds a hash table,
> tag_table, and searches it to find an index into the array of
> struct sum_buf, which is a member of struct sum_struct.
> 
> According to the source code, variable last_i is used to encourage
> adjacent matches allowing the RLL coding of the output to work more
> efficiently.

I think this code handles the case where there are two identical
blocks in the basis file.  When sending the delta, we could therefore
use the index of either block and we would get the same result.

When rsync transmits block numbers in the delta, it actually transmits
deltas between block indexes.  So in the common situation where a
series of consecutive blocks in the basis are reproduced in the new
file, they will be transmitted as 1, 1, 1, 1, 1...  This compresses
well when we run it through gzip later.
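
To make the arithmetic concrete, here is a minimal sketch of that kind of
delta coding. It is a simplified illustration, not rsync's actual wire
format: the function name encode_index_deltas is hypothetical, and
starting the "previous" index at -1 is an assumption made only for this
example.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from the rsync source.
 * Writes the difference between each matched block index and the
 * previous one into out[]. The index "before" the first match is
 * assumed to be -1 for this sketch. */
void encode_index_deltas(const int *idx, size_t n, int *out)
{
    int prev = -1;
    for (size_t k = 0; k < n; k++) {
        out[k] = idx[k] - prev;  /* consecutive blocks give a delta of 1 */
        prev = idx[k];
    }
}
```

With matched block indexes 4, 5, 6, 7, 2 this yields 5, 1, 1, 1, -5: the
run of consecutive blocks collapses to repeated 1s, which is the part that
compresses well.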

I think the last_i test just tries to prefer the block following the
last block matched when there is more than one to choose from.  (But I
haven't looked very closely so I might be wrong.)
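
As a rough sketch of that preference (my reading of the idea, not the
code in match.c): given several candidate blocks that all match, pick
the one at last_i + 1 if it is among them, otherwise fall back to the
first candidate. The function choose_match and its candidates array are
hypothetical names introduced for this example.

```c
#include <stddef.h>

/* Hypothetical helper, not taken from the rsync source.
 * candidates[] holds the indexes of basis blocks that all match the
 * current data. Returns the index to use, or -1 if there are none. */
int choose_match(const int *candidates, size_t n, int last_i)
{
    if (n == 0)
        return -1;

    /* Prefer the block right after the previous match, so that the
     * transmitted index delta stays at 1. */
    for (size_t k = 0; k < n; k++) {
        if (candidates[k] == last_i + 1)
            return candidates[k];
    }

    /* No adjacent candidate: any match is equally valid, take the first. */
    return candidates[0];
}
```

For example, if blocks 2 and 7 are identical and the previous match was
block 6, this picks 7 rather than 2, so the run of adjacent matches
continues.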

-- 
Martin 



