question about 2.6.3pre2's --link-by-hash behaviour

Paul Slootman paul at debian.org
Thu Sep 23 14:27:01 GMT 2004


On Wed 22 Sep 2004, Wayne Davison wrote:

> On Wed, Sep 22, 2004 at 04:54:32AM -0400, Erik Jan Tromp wrote:
> > Are there plans to make --link-by-hash pay attention to file externals?
> 
> The issue has come up before:
> 
> http://lists.samba.org/archive/rsync/2004-February/008630.html
> 
> I don't know of any plans for changing the --link-by-hash patch, but if
> someone is interested in working on it, I can integrate changes into it.
>
> One simple change would be to prime the computed hash using some
> external data (e.g. uid, gid, and/or mode), but it sounds like at least
> the original creator would like this to be optional.

I might be interested...
Your suggested approach was the first thing I thought of too :-)
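
A minimal sketch of that priming idea, written in Python rather than in
rsync's C and not taken from the --link-by-hash patch itself. The function
name, the keyword arguments and the choice of MD5 are assumptions made for
illustration only. Each selected attribute is fed into the hash before the
file contents, so two files with identical data but different metadata end
up with different hashes, and each attribute remains optional.

```python
import hashlib
import os


def primed_file_hash(path, use_uid=False, use_gid=False, use_mode=False):
    """Hash a file's contents, optionally primed with selected metadata.

    Hypothetical helper: the name, the flags and the use of MD5 are
    illustrative assumptions, not part of rsync or the patch.
    """
    st = os.stat(path)
    h = hashlib.md5()

    # Prime the hash with the chosen "external" attributes first.
    if use_uid:
        h.update(f"uid={st.st_uid};".encode())
    if use_gid:
        h.update(f"gid={st.st_gid};".encode())
    if use_mode:
        # Keep only the permission bits, not the file-type bits.
        h.update(f"mode={st.st_mode & 0o7777:o};".encode())

    # Then hash the file data in fixed-size chunks.
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()
```

With all flags left off this degenerates to a plain content hash, which
matches the wish that the behaviour be optional.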

I tried to copy over a 128GB archive of a Debian mirror (more than a
year's worth of daily snapshots, where identical files are hardlinked
across the daily snapshots). Unfortunately, even with 2.6.3pre1, using
the -H option meant that the 4GB address space available to rsync was
exhausted :-(  I'm now thinking of using this --link-by-hash option to
do it day by day.

I also tried using --link-dest=yesterdays-dir on an incremental basis
(i.e. doing one day at a time), but that didn't quite work out: the
128GB became 188GB, which almost makes the migration of the archive
useless (the new, larger archive is now almost full as well).
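
One way to see how much hardlink sharing survived such a copy is to compare
the apparent size of a tree with the size counted once per inode. The sketch
below is a generic diagnostic under stated assumptions, not anything rsync
provides: the function name is hypothetical, only regular files are counted,
and sizes are taken from st_size rather than from allocated blocks.

```python
import os
import stat


def tree_usage(root):
    """Return (apparent_bytes, unique_bytes) for regular files under root.

    apparent_bytes counts every directory entry; unique_bytes counts each
    (device, inode) pair once, so hardlinked copies are not double counted.
    Hypothetical helper for illustration only.
    """
    seen = set()
    apparent = 0
    unique = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            if not stat.S_ISREG(st.st_mode):
                continue  # skip symlinks, devices, etc.
            apparent += st.st_size
            key = (st.st_dev, st.st_ino)
            if key not in seen:
                seen.add(key)
                unique += st.st_size
    return apparent, unique
```

Running this over the old and the new archive and comparing the two
unique_bytes figures would show how many links were lost in the copy.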

When it's done, I'll report my findings :-)

BTW, I found that when trying to use --checksum on an rsyncd that has
"refuse options = checksum", I get "This server does not support
--checksum (-c)". This is a bit misleading: I first thought that the
server indeed didn't support it, while it's simply refusing it. Perhaps
a better message would be: "This server refuses the --checksum (-c)
option".


Paul Slootman

