Purpose of --checksum-seed ?

Thu Aug 11 04:46:45 MDT 2011

Hi Wayne,

On 11/08/2011 02:58, Wayne Davison wrote:
> On Wed, Aug 10, 2011 at 3:19 PM, Andrew Gideon <c182driver1 at gideon.org> wrote:
> 
>> So...what is the point of fixing the seed?
>>
> 
> It's not really intended for normal use.  It can help with some types of
> debugging and other fringe uses.
> 
>> Is there a "best solution" for caching checksums?
> 
> 
> For use with user-changeable files (as opposed to things like mirror files
> that don't change in unpredictable ways), my favorite patch is db.diff -- it
> allows you to cache checksums in a SQLite or MySQL db.  If you copy files
> to/from temporarily-mounted filesystems, you will need to update the "disk"
> table to match the mounted devices each time that changes or just leave out
> the temp-mounted disks (which will just not cache checksums associated with
> a missing disk device number).  If your mounts don't vary, a single-time db
> init via the perl script is all you need (perldb --db=foo.cfg --init
> --mounts).
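
To make the caching idea above concrete, here is a minimal sketch of a
checksum cache in SQLite. It is not the db.diff patch: the table layout,
the key (device number plus inode, validated by size and mtime) and the
names open_cache, file_md5 and cached_md5 are all assumptions made for
illustration, and the real patch may store things differently.

```python
import hashlib
import os
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) a hypothetical checksum cache; not db.diff's schema."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sums ("
        " file_key TEXT PRIMARY KEY,"   # "<device>:<inode>"
        " size INTEGER,"
        " mtime_ns INTEGER,"
        " digest TEXT)"
    )
    return conn


def file_md5(path, bufsize=1 << 20):
    """Whole-file MD5, read in chunks so large files do not fill memory."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(bufsize), b""):
            h.update(chunk)
    return h.hexdigest()


def cached_md5(conn, path):
    """Return (digest, was_cached). Recompute if size or mtime changed."""
    st = os.stat(path)
    key = f"{st.st_dev}:{st.st_ino}"
    row = conn.execute(
        "SELECT size, mtime_ns, digest FROM sums WHERE file_key = ?", (key,)
    ).fetchone()
    if row and row[0] == st.st_size and row[1] == st.st_mtime_ns:
        return row[2], True
    digest = file_md5(path)
    conn.execute(
        "INSERT OR REPLACE INTO sums VALUES (?, ?, ?, ?)",
        (key, st.st_size, st.st_mtime_ns, digest),
    )
    conn.commit()
    return digest, False
```

Because the device number is part of the key in this sketch, a filesystem
that gets a different device number on the next mount would simply miss
the cache, which is the situation the "disk" table is described as
handling.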

Thanks for the explanation. I was playing around with that patch
recently but didn't get it to work.
What exactly is the point of --mounts? There is no /etc/mtab on FreeBSD,
and from looking at the code I couldn't figure out what that is supposed
to do...

> I also recommend using the mysql db support unless you're just trying things
> out, in which case the SQLite support can be somewhat useful.  The SQLite DB
> likes to do a lot of exclusive locking, and so it is pretty slow when
> writing lots of checksums.  It can also not play nice with a local transfer
> where you want both the sender and receiver to be querying and updating the
> DB at the same time.

Would using a separate SQLite db for each side be fine?

> Note also that each side of the transfer needs its own --db option, which is
> why the patch expects the --remote-option (-M) patch to be applied first.
>  e.g. rsync -aiv --db=/etc/db.cfg -M--db=/etc/db.cfg src/ dest/
> 
>> I'm unclear how caching checksums doesn't cause exactly that problem (since
>> a cached checksum means that the file isn't checked for changes each time
>> rsync runs).

My use case was to cache on the receiving side, because I can
guarantee there are no modifications to the files there.

Also, I might have misunderstood why it requires the -c option. If file
times/sizes/etc. mismatch, then is there any point in computing a checksum?
Or is that the very same checksum that is used to figure out which parts of
the file need to be transferred? (In which case it would need to be
recomputed anyway to figure out what has changed locally.)
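
To illustrate the two kinds of checksum I am asking about, here is a rough
sketch. As far as I understand the man page, -c compares one whole-file
checksum per file to decide whether a transfer is needed, while the
delta-transfer step uses per-block checksums that include the checksum
seed. The code below is only a model of that distinction, not rsync's
algorithm: zlib.adler32 stands in for rsync's rolling checksum, the way
the seed is appended is an assumption, and the function names are made up.

```python
import hashlib
import zlib


def whole_file_digest(data: bytes) -> str:
    """One digest per file: answers 'does this file differ at all?'"""
    return hashlib.md5(data).hexdigest()


def block_signatures(data: bytes, block_size: int, seed: int = 0):
    """Per-block (weak, strong) pairs: answers 'which parts differ?'

    The seed is mixed into the strong sum only; the exact mixing used
    here is illustrative, not rsync's on-the-wire format.
    """
    seed_bytes = seed.to_bytes(4, "little")
    sigs = []
    for off in range(0, len(data), block_size):
        block = data[off:off + block_size]
        weak = zlib.adler32(block)  # stand-in for the rolling checksum
        strong = hashlib.md5(block + seed_bytes).hexdigest()
        sigs.append((weak, strong))
    return sigs


old = b"a" * 1000 + b"b" * 1000
new = b"a" * 1000 + b"c" * 1000

# The whole-file digests differ, so the file needs attention.
needs_transfer = whole_file_digest(old) != whole_file_digest(new)

# The block signatures show that only the second block changed.
changed_blocks = [
    i for i, (x, y) in enumerate(
        zip(block_signatures(old, 1000), block_signatures(new, 1000)))
    if x != y
]
```

In this model a cached whole-file digest stays valid across runs, whereas
the strong block sums change whenever the seed does, so they could only be
cached if the seed were fixed.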

Johannes