Chance of equal checksum and changing blocks

Matt McCutchen matt at mattmccutchen.net
Thu Jan 22 19:13:17 GMT 2009


On Thu, 2009-01-22 at 10:43 +0100, David de Lama wrote:
> - First, am I right that the chance of getting the same 32-bit rolling
> checksum is 1/2^16 and to get the same 128-bit MD5 Hash is 1/2^127?

You might know something I don't, but I would expect the collision
probability to be 1/2^32 for 32 bits and 1/2^128 for 128 bits.  These
values are for two given blocks; to find the probability of at least one
collision in a collection of blocks (e.g., a file), you would have to
account for all pairs.  The values further assume that the checksums of
all the blocks under consideration are independent and uniformly random.
Of course, one can craft an input file that causes a collision.
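
To make "account for all pairs" concrete, here is the usual
birthday-problem estimate under the same independence and uniformity
assumptions.  The symbols n (number of blocks) and b (checksum width in
bits) are introduced only for this sketch:

```latex
P(\text{at least one collision})
  = 1 - \prod_{k=0}^{n-1}\left(1 - \frac{k}{2^{b}}\right)
  \approx \frac{n(n-1)}{2^{b+1}}
  \qquad \text{for } n^{2} \ll 2^{b}
```

For example, n = 10240 blocks with b = 32 gives roughly
10240 * 10239 / 2^33, or about 1.2%; with b = 128 the same n gives a
probability far too small to matter.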

> - Finally I want to know if it is possible to change an amount of
> blocks manually?
>   e.g. I made a 100 MB file with "dd if=/dev/zero of=/home/test.xyz
> bs=1M count=100" and now I want to change, let's say, 10 blocks of
> this file. Is it possible?

I guess you're performing some kind of test of the delta-transfer
algorithm?

The block size for each file defaults to approximately the square root
of its size.  You can find out the exact block size rsync is using for a
file by passing -vvv (--debug=deltasum2 in rsync 3.1.*) and looking for
"blength=" in the output, or you can specify a block size to use for all
files with the --block-size option.  Then, just overwrite the desired
areas of the file.
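
As a rough illustration of the square-root rule, the following sketch
estimates the block size for a 100 MB file.  It is only an estimate:
rsync applies its own rounding and limits, so the "blength=" value in
the debug output remains the authoritative number.  The variable names
are made up for this example.

```shell
#!/bin/sh
# Size of the 100 MB test file from the question (100 * 1024 * 1024 bytes).
size=104857600

# Approximate the default block size as the integer part of sqrt(size).
# This is an estimate only; confirm the real value via "blength=".
approx=$(awk -v n="$size" 'BEGIN { printf "%d\n", sqrt(n) }')

echo "estimated block size: $approx bytes"
```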

>   The blocksize above (bs=1M) has nothing to do with the blocksize
> rsync uses, right?!

Correct, although setting the dd block size equal to the rsync block
size and using the "seek" option (with "conv=notrunc", so that dd does
not truncate the rest of the file) does give you a convenient way to
overwrite individual rsync blocks.

-- 
Matt


