Rsyncing really large files

Wayne Davison wayned at samba.org
Mon Feb 28 18:55:46 GMT 2005


On Mon, Feb 28, 2005 at 08:33:52PM +0200, Shachar Shemesh wrote:
> If so, we can probably make it much much (much much much) more
> efficient by using a hash table instead. 

That's what "tag_table" is -- it's an array of 65536 pointers into
the checksum data sorted by weak checksum.  The code then does a
linear search for a matching strong checksum through the matching
weak checksums.  There should not be many blocks with an identical
weak checksum unless the block size is too small (in which case there
would be far too many blocks for the 65536 buckets).
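
To make the lookup concrete, here is a minimal Python sketch of the idea, not rsync's actual code. The names weak_sum, strong_sum, tag_of, build_tag_table and find_block are all hypothetical. Adler-32 and MD5 are stand-ins picked only because the standard library ships them, and the 16-bit fold in tag_of is an assumption about how a 32-bit weak checksum might be reduced to a bucket index.

```python
import hashlib
import zlib

TABLE_SIZE = 1 << 16  # 65536 buckets, matching the description above


def weak_sum(block: bytes) -> int:
    # Stand-in 32-bit weak checksum (zlib's Adler-32), not rsync's own.
    return zlib.adler32(block) & 0xFFFFFFFF


def strong_sum(block: bytes) -> bytes:
    # Stand-in strong checksum, assumed here purely for illustration.
    return hashlib.md5(block).digest()


def tag_of(weak: int) -> int:
    # Hypothetical fold of the 32-bit weak checksum into a 16-bit index.
    return ((weak & 0xFFFF) + (weak >> 16)) & 0xFFFF


def build_tag_table(blocks):
    # One (weak, strong, block_index) entry per block, sorted by bucket
    # index and then by weak checksum.
    sums = sorted(
        ((weak_sum(b), strong_sum(b), i) for i, b in enumerate(blocks)),
        key=lambda s: (tag_of(s[0]), s[0]),
    )
    # tag_table[t] holds the position of the first entry in bucket t,
    # or -1 if the bucket is empty. Walking backwards leaves the lowest
    # position in each slot.
    tag_table = [-1] * TABLE_SIZE
    for pos in range(len(sums) - 1, -1, -1):
        tag_table[tag_of(sums[pos][0])] = pos
    return sums, tag_table


def find_block(data: bytes, sums, tag_table):
    weak = weak_sum(data)
    tag = tag_of(weak)
    pos = tag_table[tag]
    if pos < 0:
        return None  # empty bucket: no candidate blocks at all
    strong = None
    # Linear search through the entries that share this bucket.
    while pos < len(sums) and tag_of(sums[pos][0]) == tag:
        w, s, idx = sums[pos]
        if w == weak:
            if strong is None:
                strong = strong_sum(data)  # computed only when needed
            if s == strong:
                return idx
        pos += 1
    return None
```

With a sensible block size each bucket holds only a handful of entries, so the linear scan stays short; it only degrades when the number of blocks is large compared with the 65536 buckets.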

..wayne..

