Unexpected behavior with --hard-links and --ignore-existing

Benjamin Pflugmann benjamin-rsync at pflugmann.de
Tue Dec 27 05:25:37 MST 2011


Hi,

this is a re-send, because I apparently needed to subscribe to the
list first. The confirmation mail said: "If you are joining the list
with a held message, no NOT resend the message without first canceling
the held message!" I am sorry, but I am not sure whether my previous mail
counts as a "held" message and, if so, what I need to do to cancel it
(as an aside, the "no NOT" above is a typo, isn't it?).

Back to my original request:

I hope this is the right place for my concern; if not, kindly direct me
to the right one. Thank you.

I searched via Google and the bug tracker but didn't find anyone with
a similar problem[1]. Summary: When (repeatedly) running rsync with
--hard-links and --ignore-existing, new hard links are copied instead
of linked.

Long story: I am trying to distribute a heavily hard-linked source
directory to several machines. Because of the kind of files and services
involved, I usually only want to distribute new files and avoid
modifying or deleting existing ones. Therefore I use --ignore-existing,
which does the job just fine for normal files. To save space and time, I
want to add --hard-links, which also works as expected on its own. But
when the two are combined, it seems that the existing files are not
considered as candidates for linking.

Reproduction recipe:

----------------------------------------------------------------------
$ rsync --version
rsync  version 3.0.8  protocol version 30
Copyright (C) 1996-2011 by Andrew Tridgell, Wayne Davison, and others.
Web site: http://rsync.samba.org/
Capabilities:
    64-bit files, 64-bit inums, 64-bit timestamps, 64-bit long ints,
    socketpairs, hardlinks, symlinks, IPv6, batchfiles, inplace,
    append, ACLs, xattrs, iconv, symtimes

rsync comes with ABSOLUTELY NO WARRANTY.  This is free software, and you
are welcome to redistribute it under certain conditions.  See the GNU
General Public Licence for details.

# setting up an example source/target
$ mkdir source
$ echo "foo" > source/one
$ rsync -av --hard-links --ignore-existing source/ target
sending incremental file list
created directory target
./
one

sent 109 bytes  received 34 bytes  286.00 bytes/sec
total size is 4  speedup is 0.03

# now create a new hard link
$ ln source/one source/two
$ rsync -av --hard-links --ignore-existing source/ target
sending incremental file list
./
two

sent 121 bytes  received 34 bytes  310.00 bytes/sec
total size is 8  speedup is 0.05
# It got copied instead of linked

# Now, if we omit --ignore-existing, a hard link is created, as I
# would have expected from the previous command already.
$ rsync -av --hard-links source/ target
sending incremental file list
two => one

sent 78 bytes  received 19 bytes  194.00 bytes/sec
total size is 8  speedup is 0.08
----------------------------------------------------------------------
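
To confirm that a run really produced a copy rather than a link, one can
compare the inodes of the two target files. The following is only a
minimal sketch, assuming a shell whose "test" supports the -ef operator
(bash and most modern shells do); the helper name same_inode is made up
for illustration, and the demonstration uses a scratch directory instead
of rsync itself:

```shell
# Hypothetical helper: succeeds if both paths refer to the same inode
# on the same device, i.e. they are hard links to one file.
same_inode() {
    [ "$1" -ef "$2" ]
}

# Self-contained demonstration in a scratch directory (no rsync needed).
demo=$(mktemp -d)
echo "foo" > "$demo/one"
ln "$demo/one" "$demo/two"      # hard link: shares the inode of "one"
cp "$demo/one" "$demo/three"    # plain copy: gets its own inode

same_inode "$demo/one" "$demo/two"   && echo "two is a hard link of one"
same_inode "$demo/one" "$demo/three" || echo "three is a separate copy"
```

Applied to the recipe above, "same_inode target/one target/two" would be
expected to fail after the second run (with --ignore-existing) and to
succeed after the third run (without it).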

Well, is this intended behaviour? The man page says:
--ignore-existing
    This tells rsync to skip updating files that already exist on the
    destination (this does not ignore existing directories, or nothing
    would get done).  See also --existing.

    This option is a transfer rule, not an exclude, so it doesn’t
    affect the data that goes into the file-lists, and thus it doesn’t
    affect deletions.  It just limits the files that the receiver
    requests to be transferred.

    This option can be useful for those doing backups using the
    --link-dest option when they need to continue a backup run that
    got interrupted.  Since a --link-dest run is copied into a new
    directory hierarchy (when it is used properly), using
    --ignore-existing will ensure that the already-handled files don’t
    get tweaked (which avoids a change in permissions on the hard-linked
    files).  This does mean that this option is only looking at the
    existing files in the destination hierarchy itself.

To me, this doesn't suggest that the new file cannot be hard-linked to
an existing file, but I must admit I do not understand all the
implications of transfer rules and excludes.

Otherwise, if this is working as intended, is there any suggestion for
how to get the advantages of --hard-links (in size and speed) without
modifying existing files on the target? For now I run rsync without
--ignore-existing after checking that rsync with --dry-run --existing
--delete would do nothing. But that solution has its own problems.

Thank you in advance,

      Benjamin Pflugmann.


[1] During my search I noticed some broken links, e.g. on
    ftp://ftp.samba.org/pub/unpacked/rsyncweb/bugzilla.html
    "NEWS file from the git repository" points to
        ftp://ftp.samba.org/ftp/unpacked/rsync/NEWS
    "patches dir" to
        ftp://ftp.samba.org/ftp/rsync/dev/patches/
    which both give me a "550 Failed to change directory" error when I
    click the links in my browser.

    Same on ftp://ftp.samba.org/pub/unpacked/rsyncweb/issues.html for
    "TODO file" pointing to
        ftp://ftp.samba.org/ftp/rsync/TODO

