Weak CheckSum Question
s.buys at icon.co.za
Fri Mar 15 08:15:22 EST 2002
I am writing an xdelta-like application as a personal experiment and am busy
implementing the rsync protocol. So far so good. I am using C++ templates and
creating the algorithms so that they operate on any stream, array, etc. through
All seems well except that I am getting a lot of false hits with the weak
checksum. When generating checksums of blocksize 1024 on the RedHat 7.1 ISO I
generate about 760 000 checksums which go into a hash_multimap.
When running the rolling checksums on the RedHat 7.2 ISO (against the
checksums in the hash_multimap) I am getting almost 95% false hits (i.e. the
weak checksums match, but when comparing the offsets the comparison fails).
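
For reference, here is a minimal sketch of the setup described above, so that the false-hit count can be measured on small inputs. It assumes the weak checksum is the usual rsync-style pair of 16-bit running sums, and it uses std::unordered_multimap as a stand-in for hash_multimap. The names Rolling, build_table, scan and Hits are hypothetical and are not taken from my actual code. A hit is confirmed here by a direct byte comparison of the two blocks, which assumes both files are available locally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>

// Weak checksum state: two 16-bit running sums over a window of `len` bytes.
struct Rolling {
    std::uint32_t a = 0, b = 0;
    std::size_t len = 0;

    // Compute both sums from scratch for data[pos, pos + n).
    void init(const std::string& data, std::size_t pos, std::size_t n) {
        assert(pos + n <= data.size());
        a = b = 0;
        len = n;
        for (std::size_t i = 0; i < n; ++i) {
            std::uint32_t x = static_cast<unsigned char>(data[pos + i]);
            a = (a + x) & 0xffff;
            // The first byte of the window has weight n, the last has weight 1.
            b = (b + static_cast<std::uint32_t>(n - i) * x) & 0xffff;
        }
    }

    // Slide the window by one byte: drop `out`, append `in`.
    void roll(unsigned char out, unsigned char in) {
        a = (a - out + in) & 0xffff;
        b = (b - static_cast<std::uint32_t>(len) * out + a) & 0xffff;
    }

    std::uint32_t digest() const { return a | (b << 16); }
};

// Maps weak checksum -> block offset in the old file.
using Table = std::unordered_multimap<std::uint32_t, std::size_t>;

// One entry per non-overlapping block of the old file (a trailing
// partial block is ignored in this sketch).
Table build_table(const std::string& old_data, std::size_t block) {
    Table t;
    Rolling r;
    for (std::size_t pos = 0; pos + block <= old_data.size(); pos += block) {
        r.init(old_data, pos, block);
        t.emplace(r.digest(), pos);
    }
    return t;
}

struct Hits {
    std::size_t real = 0;      // weak checksum matched and the bytes matched
    std::size_t spurious = 0;  // weak checksum matched but the bytes differ
};

// Roll over every window of the new file and classify each table hit.
Hits scan(const std::string& old_data, const std::string& new_data,
          const Table& t, std::size_t block) {
    Hits h;
    if (new_data.size() < block) return h;
    Rolling r;
    r.init(new_data, 0, block);
    for (std::size_t pos = 0;; ++pos) {
        auto range = t.equal_range(r.digest());
        for (auto it = range.first; it != range.second; ++it) {
            if (old_data.compare(it->second, block, new_data, pos, block) == 0)
                ++h.real;
            else
                ++h.spurious;
        }
        if (pos + block >= new_data.size()) break;
        r.roll(static_cast<unsigned char>(new_data[pos]),
               static_cast<unsigned char>(new_data[pos + block]));
    }
    return h;
}
```

With this layout, the false-hit rate I am asking about is spurious / (real + spurious) over the whole scan. Note that every entry stored under a matching checksum is compared, so duplicate blocks in the old file each count as a separate hit.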
Is this expected? Is there a stronger rolling checksum I could implement? It
takes about 80 seconds to generate all the checksums on a 650MB file, so I
could certainly afford a stronger algorithm that causes fewer false hits.
Any ideas would be greatly appreciated.
PS: Please reply directly to me, as I am not on the mailing list.