rsync hashing / collision handling?
andrew.pennebaker at gmail.com
Wed Oct 3 07:04:08 MDT 2012
I find your responses quite interesting!
> > I'm wondering why rsync chooses not to work like version control
> > software,
> Because it's designed to copy large amounts of data.
Ah, rsync is designed to copy large file systems, not just individual
source code projects.
> I bet a lot more accurate than that. After a certain point
> you're a lot more likely to have random bit flips due to,
> say, cosmic rays, than to hash collisions. But you'll
> have to compute the number and show it to me if you want
> to prove me wrong.
md4coll <http://www.freshports.org/security/md4coll/> is a security tool
that generates pairs of inputs whose MD4 hashes collide.
I assume that by now rsync uses a more modern hash algorithm such as MD5
(collisions <http://www.win.tue.nl/hashclash/>) or SHA-1.
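For the "compute the number" challenge above, here is a rough sketch of the
usual birthday-bound estimate for accidental collisions. It assumes the hash
behaves like a uniformly random function, which says nothing about deliberately
constructed collisions of the md4coll kind. The function name and the example
figure of a billion blocks are my own illustration, not anything taken from
rsync.

```python
import math

def collision_probability(n_items: int, hash_bits: int) -> float:
    """Approximate chance that at least two of n_items random inputs
    share a hash value, for an ideal hash with hash_bits bits of output.

    Uses the birthday approximation p ~= 1 - exp(-n(n-1) / 2^(bits+1)).
    """
    exponent = -n_items * (n_items - 1) / 2.0 ** (hash_bits + 1)
    # expm1 keeps precision when the probability is extremely small.
    return -math.expm1(exponent)

# Hypothetical workload: one billion blocks hashed with a 128-bit digest.
p = collision_probability(10**9, 128)
print(f"accidental collision probability: {p:.3e}")
```

Under those assumptions the result is many orders of magnitude below one in a
billion, which is the sense in which hardware faults become the likelier
failure. The estimate only covers random data, not an adversary choosing
inputs.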
> > No file system hooks would be necessary, as demonstrated by version control
> > software, which does not continuously monitor the file system for changes
> > but merely confirms discrete file changes during version control commits.
> So use version control software. Why not?
> But VCS does not scale to the filesystem/entire system level.
Hmm, you might be right. I wonder if diff-based sync algorithms could
reasonably work for large file systems.
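To make the "discrete file changes" idea concrete, here is a minimal sketch of
detecting changes by comparing two snapshots of content hashes, with no file
system hooks involved. This is an illustration only, not how rsync or any
particular version control system is implemented; the names `snapshot` and
`diff_snapshots` and the choice of SHA-256 are assumptions of mine.

```python
import hashlib
import os

def snapshot(root):
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            h = hashlib.sha256()
            with open(path, "rb") as f:
                # Read in 1 MiB chunks so large files are not loaded whole.
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            digests[os.path.relpath(path, root)] = h.hexdigest()
    return digests

def diff_snapshots(old, new):
    """Return (added, removed, modified) path lists between two snapshots."""
    added = sorted(new.keys() - old.keys())
    removed = sorted(old.keys() - new.keys())
    modified = sorted(p for p in old.keys() & new.keys() if old[p] != new[p])
    return added, removed, modified
```

Taking a snapshot this way reads every byte of every file each time, which is
one reason to doubt that the approach carries over unchanged from a source tree
to an entire file system.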
> Everything has its niche. Trying to make one-size-fits-all
> usually results in an unwieldy mess. Rsync is already
> way, way too option-heavy, IMO.
I don't know the particulars of the rsync options, but I agree
wholeheartedly with the Unix principle: do one thing and do it well.