New feature: detect and avoid transferring renamed files

Mon Sep 22 02:36:49 GMT 2008

On Tue, 9 Sep 2008 07:49:06 -0700, Wayne Davison wrote:
> Sorry for the slow reply -- I marked your message for more in-depth
> study, and failed to get back to it until now.

That's OK, I've done worse :-(

> drawbacks:
> 
>  - It creates a single (potentially really big) directory of files on
>    the receiver for the byinode/* files.
[others deleted]

Indeed, it more or less assumes you have a filesystem that handles very
large directories well. Your other observations are also quite correct.

> I had been thinking of extending the db patch to add the ability to
> track files by checksum in a database.  This would allow a run that used
> the DB to be an efficient checksum run (reading the checksums from the
> DB, not slowly generating them) and look up matching checksums in the DB

That is a good idea because the database can be used for all sorts of
other purposes too. Here are the drawbacks I see:

  - I think you will have trouble catching files that have both moved
    and changed since the last rsync run, because their new checksum
    will not match anything in the DB (typical example: /var/log/syslog
    had data appended and was then rotated to /var/log/syslog.0 before
    rsync runs to do an incremental backup).

    You can solve this by storing the device and inode number (dev and
    inum) in the database. I'm not familiar with the db patch; perhaps
    it already does this?

  - It might be inconvenient to have a database on both sides of the
    transfer. For example, when I back up many devices to a backup
    server, it's cleaner to keep state only on the backup server.
    Some small devices (mobile phones?) might not even tolerate the
    creation of "extra" files (such as an SQLite database) in their
    filesystem.
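
To make the dev/inum idea from the first point concrete, here is a rough
sketch in Python. It is not rsync code and not how the db patch works; the
table layout and the names open_db, walk_files, record_tree and find_renames
are all made up for illustration. It records a "dev:inum" key for every file
at the end of one run, and on the next run reports any file whose key was
previously stored under a different path, whether or not its contents changed.

```python
import os
import sqlite3
import stat


def open_db(path=":memory:"):
    """Open (or create) the state database (hypothetical schema)."""
    db = sqlite3.connect(path)
    # One row per file, keyed by "dev:inum" stored as text so that very
    # large inode numbers cannot overflow an SQLite integer column.
    db.execute(
        "CREATE TABLE IF NOT EXISTS seen (id TEXT PRIMARY KEY, path TEXT)"
    )
    return db


def walk_files(root):
    """Yield (file id, path relative to root) for each regular file."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            if stat.S_ISREG(st.st_mode):
                yield f"{st.st_dev}:{st.st_ino}", os.path.relpath(full, root)


def record_tree(db, root):
    """Remember where every file lives at the end of a run."""
    db.execute("DELETE FROM seen")
    # Hard links share one id; this sketch simply keeps the last path seen.
    db.executemany("INSERT OR REPLACE INTO seen VALUES (?, ?)", walk_files(root))
    db.commit()


def find_renames(db, root):
    """Return {new_path: old_path} for files seen earlier under another name."""
    renames = {}
    for file_id, rel in walk_files(root):
        row = db.execute(
            "SELECT path FROM seen WHERE id = ?", (file_id,)
        ).fetchone()
        if row is not None and row[0] != rel:
            renames[rel] = row[0]
    return renames
```

With the syslog example above, calling record_tree() before the rotation and
find_renames() after it should map syslog.0 back to syslog even though data
was appended in between, while the freshly created syslog has a new inode and
is not matched. A real implementation would also have to cope with inode
numbers being reused after a delete, which this sketch ignores.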

As I indicated, the first is solvable, and I think I can live with the
second one.

-Phil