[RSYNC] Re: Compressed backup
Ph. Marek
marek at bmlv.gv.at
Thu Jun 13 03:38:02 EST 2002
I've got a suggestion regarding the mail Kevin wrote:
Instead of comparing the least m bits of n bytes I'd suggest using a
algorithm as described in the Paper
http://webglimpse.org/publications.html
"Siff -- Finding Similar Files in a Large File System"
ftp://ftp.cs.arizona.edu/reports/1993/TR93-33.ps
(Search for "Manber finding" in google for more references)
This calculates a rolling checksum for n bytes (n can be chosen).
In this paper a action is taken if m bits of this checksum are a defined
value, similar to
if (!(checksum & 0xff)) { ... }
for 8 bits = 0 which gives a 1/256 chance.
I wrote the program described there (siff) and it works very well.
Regards,
Phil
More information about the rsync
mailing list