rsync hashing / collision handling?

Andrew Pennebaker andrew.pennebaker at
Wed Oct 3 07:04:08 MDT 2012

I find your responses quite interesting!

> > I'm wondering why rsync chooses not to work like version control
> > software,
> Because it's designed to copy large amounts of data.

Ah, rsync is designed to copy large file systems, not just individual
source code projects.

> I bet a lot more accurate than that.  After a certain point
> you're a lot more likely to have random bit flips due to,
> say, cosmic rays, than to hash collisions.  But you'll
> have to compute the number and show it to me if you want
> to prove me wrong.

md4coll is a security tool
that produces collisions for a given block of data.

I assume that by now rsync uses a more modern hash algorithm such as MD5
(for which collisions are also known) or SHA-1.
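
Deliberately constructed collisions are a different matter from accidental
ones. For the accidental case, the number asked for above can be estimated
with the birthday bound. This is only a rough sketch: it assumes a 128-bit
digest (the size of MD4 and MD5) that behaves like a uniform random function,
and nobody crafting inputs. The function name and the workload of 2**30
blocks are my own illustrative assumptions, not anything taken from rsync.

```python
import math


def collision_probability(num_blocks, hash_bits):
    """Birthday-bound estimate of P(at least one accidental collision)
    among num_blocks values hashed uniformly into hash_bits bits."""
    pairs = num_blocks * (num_blocks - 1) / 2
    # 1 - exp(-pairs / 2**bits); expm1 keeps precision for tiny probabilities.
    return -math.expm1(-pairs / 2.0 ** hash_bits)


# Hypothetical workload: 2**30 blocks (for example 1 TiB in 1 KiB blocks).
p = collision_probability(2 ** 30, 128)
print(f"{p:.3e}")
```

Under those assumptions the estimate comes out on the order of 1e-21, which
is consistent with the point that hardware faults are the larger risk. It
says nothing about inputs built on purpose to collide.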

> > No file system hooks would be necessary, as demonstrated by version
> > control software, which does not continuously monitor the file system
> > for changes but merely confirms discrete file changes during version
> > control commits.
> So use version control software.  Why not?
> But VCS does not scale to the filesystem/entire system level.

Hmm, you might be right. I wonder if diff-based sync algorithms could
reasonably work for large file systems.
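
To make the commit-time idea concrete, here is a minimal sketch of what I
had in mind: hash every file at two discrete points in time and compare the
results, with no file system hooks. The names snapshot and changed_paths are
hypothetical, SHA-1 is just a stand-in digest, and this is not how rsync or
any particular version control system is actually implemented.

```python
import hashlib
import os


def snapshot(root):
    """Map each regular file's path (relative to root) to its SHA-1 hex digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            digest = hashlib.sha1()
            with open(path, "rb") as handle:
                # Read in chunks so large files do not have to fit in memory.
                for chunk in iter(lambda: handle.read(65536), b""):
                    digest.update(chunk)
            manifest[os.path.relpath(path, root)] = digest.hexdigest()
    return manifest


def changed_paths(old, new):
    """Return the paths added, removed, or modified between two snapshots."""
    return {path for path in old.keys() | new.keys()
            if old.get(path) != new.get(path)}
```

Note that each snapshot reads every byte of every file, which is cheap for a
source tree and expensive for a whole file system. That cost is probably the
heart of the scaling objection.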

> Everything has its niche.   Trying to make one-size-fits-all
> usually results in an unwieldy mess.   Rsync is already
> way, way too option-heavy, IMO.

Too true!

I don't know the particulars of the rsync options, but I agree
wholeheartedly with the Unix principle: do one thing and do it well.


Andrew Pennebaker