recent discussion regarding 'checksums'

grarpamp grarpamp at gmail.com
Thu Sep 30 15:51:29 MDT 2010


> Also as I think Wayne pointed out, using "-c" isn't very common in the first place.

>  transfer. The delay due to I/O wait is going to be orders of magnitude higher
>  than any reasonable hash computation.

In a parallel life, I also mirrored some websites generated from database
backends. For some unknown reason, they often changed embedded
fixed-length strings in the HTML, but the last-mod date didn't change.
I never caught this until I tried -c for grins. Then I ran it against the entire
filesystem and turned up some more interesting diffs. It was just a
thousand or so small files, so there wasn't much I/O time involved in -c.
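
For anyone wanting to hunt for the same kind of thing outside rsync, here is
a rough sketch of the check. It walks a source tree and flags files whose
size and mtime match the mirror copy (so a size+mtime quick check would skip
them) but whose contents differ. The function names, the use of SHA-256, and
the whole-second mtime comparison are my own assumptions for illustration;
this is not how rsync implements -c internally.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def silent_changes(src_root, dst_root):
    """List relative paths whose size and mtime match but whose content differs.

    Hypothetical helper: mtimes are compared at whole-second granularity,
    which is an assumption made here, not a statement about rsync.
    """
    changed = []
    for dirpath, _dirs, files in os.walk(src_root):
        for name in files:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, src_root)
            dst = os.path.join(dst_root, rel)
            if not (os.path.isfile(src) and os.path.isfile(dst)):
                continue  # only compare files present on both sides
            s, d = os.stat(src), os.stat(dst)
            same_meta = (s.st_size == d.st_size
                         and int(s.st_mtime) == int(d.st_mtime))
            # Hash only when the metadata says "unchanged".
            if same_meta and file_digest(src) != file_digest(dst):
                changed.append(rel)
    return sorted(changed)
```

Calling silent_changes("/path/to/source", "/path/to/mirror") (placeholder
paths) returns the files that only a content comparison would notice.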

I've also compared find -ls's for grins [reported most recently, albeit
indirectly as:  Subject:  Abysmal sparse file performance!]

So the various use cases do exist out there. I've solved media file
integrity with ZFS sha256 checksums on top of crypto block devices.
And since I have some CPU to spare, I still make and compare strong
checksum indexes to catch new changes like those mentioned two paragraphs up.
Doing it separately is only marginally faster than taking the rsync -c overhead.
And the index is a bonus used for fastfind, etc.
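
A minimal sketch of what such a separate checksum index might look like,
assuming SHA-256 and an in-memory dict keyed by relative path. The names
build_index and diff_indexes are hypothetical, made up for this example, and
a real index would presumably be saved to disk between runs.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_index(root):
    """Map each regular file's path (relative to root) to its digest."""
    index = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            if not os.path.isfile(full):
                continue  # skip broken symlinks and other non-files
            index[os.path.relpath(full, root)] = sha256_of(full)
    return index


def diff_indexes(old, new):
    """Return (added, removed, changed) path lists between two indexes."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return added, removed, changed
```

Build one index per run and diff it against the previous one; the same dict
can then double as a path lookup table.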

Anyways :)

