[RSYNC] Re: Compressed backup

Ph. Marek marek at bmlv.gv.at
Thu Jun 13 03:38:02 EST 2002

I've got a suggestion regarding the mail Kevin wrote:

Instead of comparing the least m bits of n bytes I'd suggest using a
algorithm as described in the Paper
	"Siff -- Finding Similar Files in a Large File System"


(Search for "Manber finding" in google for more references)

This calculates a rolling checksum for n bytes (n can be chosen).
In this paper a action is taken if m bits of this checksum are a defined
value, similar to 
	if (!(checksum & 0xff)) { ... }
for 8 bits = 0 which gives a 1/256 chance.

I wrote the program described there (siff) and it works very well.



More information about the rsync mailing list