rsync hashing / collision handling?

Andrew Pennebaker andrew.pennebaker at gmail.com
Wed Oct 3 07:04:08 MDT 2012


I find your responses quite interesting!

> > I'm wondering why rsync chooses not to work like version control
> > software,
>
> Because it's designed to copy large amounts of data.
>

Ah, rsync is designed to copy large file systems, not just individual
source code projects.


> I bet a lot more accurate than that.  After a certain point
> you're a lot more likely to have random bit flips due to,
> say, cosmic rays, than to hash collisions.  But you'll
> have to compute the number and show it to me if you want
> to prove me wrong.
>

md4coll <http://www.freshports.org/security/md4coll/> is a security tool
that generates MD4 collisions.

I assume that by now rsync uses a more modern hash algorithm such as MD5
(collisions <http://www.win.tue.nl/hashclash/>) or SHA-1
(collisions <http://www.schneier.com/blog/archives/2005/02/cryptanalysis_o.html>).
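
The number asked for above can be estimated with the birthday bound. Here is
a rough sketch in Python. It assumes the hash behaves like a uniformly random
function, so it only covers accidental collisions and says nothing about
deliberately constructed ones such as those from the tools linked above. The
function name and the block count of 10^9 are my own illustrative choices,
not anything taken from rsync.

```python
import math


def accidental_collision_probability(num_blocks: int, hash_bits: int) -> float:
    """Birthday-bound estimate of at least one collision among num_blocks
    items hashed with an ideal hash_bits-bit hash (an approximation)."""
    # Number of distinct pairs of blocks that could collide.
    pairs = num_blocks * (num_blocks - 1) / 2
    # p ~= 1 - exp(-pairs / 2^bits); expm1 keeps precision when p is tiny.
    return -math.expm1(-pairs / 2.0 ** hash_bits)


if __name__ == "__main__":
    # Hypothetical workload: one billion blocks.
    for bits in (32, 128, 160):
        p = accidental_collision_probability(10**9, bits)
        print(f"{bits}-bit hash, 1e9 blocks: p ~ {p:.3e}")
```

Under those assumptions, a 128-bit hash over a billion blocks gives a
probability on the order of 1e-21, while a 32-bit checksum alone would
almost certainly collide somewhere.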


> > No file system hooks would be necessary, as demonstrated by version
> > control software, which does not continuously monitor the file system
> > for changes but merely confirms discrete file changes during version
> > control commits.
>
> So use version control software.  Why not?
> But VCS does not scale to the filesystem/entire system level.
>

Hmm, you might be right. I wonder if diff-based sync algorithms could
reasonably work for large file systems.
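
To make the "confirm changes at discrete points" idea concrete, here is a
minimal Python sketch of snapshot-based change detection with no file system
hooks. The names snapshot and changes are hypothetical, and this is not how
rsync or any particular version control tool is implemented. It hashes every
file in full on each snapshot, which hints at the scaling cost on a large
file system.

```python
import hashlib
import os


def snapshot(root: str) -> dict[str, str]:
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Reads the whole file into memory; fine for a sketch only.
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            manifest[os.path.relpath(path, root)] = digest
    return manifest


def changes(old: dict[str, str], new: dict[str, str]) -> dict[str, list[str]]:
    """Compare two snapshots and report added, removed, and modified paths."""
    common = old.keys() & new.keys()
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "modified": sorted(p for p in common if old[p] != new[p]),
    }
```

Usage would be to call snapshot() once, store the result, call it again
later, and pass both manifests to changes(). Real tools typically avoid
rehashing unchanged files by checking sizes and timestamps first.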


> Everything has its niche.   Trying to make one-size-fits-all
> usually results in an unwieldy mess.   Rsync is already
> way, way too option-heavy, IMO.
>

Too true!

I don't know the particulars of the rsync options, but I agree
wholeheartedly with the Unix principle: do one thing and do it well.

-- 
Cheers,

Andrew Pennebaker
www.yellosoft.us