breakage? when using --ignore-times with --link-dest

Ray and Sandie Clark rclark03 at rochester.rr.com
Thu Aug 30 19:01:13 GMT 2007


Overview
--------
I am trying to use --ignore-times with --link-dest and find that all
files are duplicated inappropriately (IMHO).  I think it is because
--link-dest creates a hard link, which changes the link count and
therefore the Change Time (ctime).  This requires rsync to create a new
inode and duplicate the data just to preserve the ctime.  If
--ignore-times is not specified, only mtime and the file size are
checked, so rsync doesn't notice that other inode information changed
(ctime and links).

1) I am hoping that someone can confirm my analysis of the situation.

2) I am proposing that perhaps this behavior is not appropriate and
should be changed.

I would be interested in comments and suggestions of all kinds!

Thank you.

Detail
------
I am using rsync with the --link-dest option to perform backups to a
local USB disk.  It works great without the --ignore-times option, but
now I want to add that option to make it more bulletproof.  My rsync
command is below (sometimes run with, sometimes without --ignore-times).

I got unexpected behavior, and am trying to figure it out.

WITHOUT --ignore-times, files that have not changed are not
duplicated.  Instead a hard link is put into place, as it should be with
the --link-dest option.

WITH --ignore-times, *every file* is duplicated in the target directory,
even if it has not changed.  I have verified that the file data has not
changed by comparing the md5sum of each copy.
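
For anyone who wants to repeat that check, here is a minimal sketch.  It
assumes md5sum from GNU coreutils is available; the same_data helper and
the file names are hypothetical and only stand in for a source file and
its copy in the backup tree.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit 0) if both files have the same MD5 digest.
# Reading via stdin keeps the file name out of md5sum's output, so the two
# digests can be compared as plain strings.
same_data() {
    sum_a=$(md5sum < "$1") || return 2
    sum_b=$(md5sum < "$2") || return 2
    [ "$sum_a" = "$sum_b" ]
}

# Throwaway demo files standing in for the source and the backup copy.
demo_dir=$(mktemp -d)
printf 'example payload\n' > "$demo_dir/source.tgz"
cp -p "$demo_dir/source.tgz" "$demo_dir/backup.tgz"

if same_data "$demo_dir/source.tgz" "$demo_dir/backup.tgz"; then
    data_result="identical"
else
    data_result="different"
fi
echo "file data: $data_result"

rm -rf "$demo_dir"
```

An identical digest only says that the file data matches; it says nothing
about the inode metadata.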

Another reason to duplicate the file would be to duplicate the metadata
in the inode.  I assume that this is what is causing the copy.

The output of "stat" for the source file and the --link-dest file is
included below.  Other than device and inode number (which
necessarily differ and which I ASSUME are not checked), the only
differences are the link count, Access Time and inode Change Time.

With --ignore-times, does rsync decide to duplicate the file in order to
preserve the Access Time, Change Time, or link count, even when nothing
else (including all other inode information) has changed?  This appears
to be true from what I can see.  Can someone confirm that?

If so, --ignore-times defeats the purpose of --link-dest, since the
inode will necessarily be changed to create the hard link, and creating
the hard link changes ctime.
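
The effect of a hard link on the inode can be reproduced in isolation,
without rsync.  The following is only a sketch: it assumes GNU coreutils
"stat" (the -c format option with %h, %Y and %Z) and uses hypothetical
file names in a temporary directory.  It says nothing about what rsync
itself compares.

```shell
#!/bin/sh
# Shows that creating a hard link bumps the link count and ctime of the
# existing inode while leaving mtime alone.
# Assumes GNU stat: %h = link count, %Y = mtime, %Z = ctime (epoch seconds).
demo_dir=$(mktemp -d)
printf 'example payload\n' > "$demo_dir/original"

links_before=$(stat -c %h "$demo_dir/original")
mtime_before=$(stat -c %Y "$demo_dir/original")
ctime_before=$(stat -c %Z "$demo_dir/original")

# Wait so that the ctime change is visible at one-second resolution.
sleep 2
ln "$demo_dir/original" "$demo_dir/hardlink"

links_after=$(stat -c %h "$demo_dir/original")
mtime_after=$(stat -c %Y "$demo_dir/original")
ctime_after=$(stat -c %Z "$demo_dir/original")

echo "links: $links_before -> $links_after"
echo "mtime: $mtime_before -> $mtime_after"
echo "ctime: $ctime_before -> $ctime_after"

rm -rf "$demo_dir"
```

On a typical Linux file system the link count goes from 1 to 2 and ctime
advances, while mtime stays the same.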

I would argue that updating the Change Time when a hard link is made
conflates FILE metadata with FILE SYSTEM metadata.  (It might be
better to have two Change Times, one for FILE metadata and one for FILE
SYSTEM metadata, but we are certainly not going to change that!)

For rsync purposes the link count should be ignored when deciding
whether a file has changed.  A hard link will get handled in due course
if the --hard-links option is selected.  Instead rsync should base its
decision that a new file must be created on pure FILE metadata (other
stuff in the inode such as permissions, owner, group, etc.).  Ctime by
itself is not a useful indicator of whether FILE metadata has changed,
and so should not dictate creation of a new file.
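
To make the proposal concrete, here is a rough sketch of the comparison I
have in mind.  It is NOT rsync's actual algorithm, just an illustration:
the file_meta and same_file_meta helpers are hypothetical, GNU coreutils
"stat" is assumed, and the set of fields (size, mtime, permissions,
owner, group) is my own choice of what counts as pure FILE metadata.

```shell
#!/bin/sh
# Hypothetical helpers: compare only "pure FILE metadata" and deliberately
# ignore link count, atime and ctime.
# Assumes GNU stat: %s size, %Y mtime, %a permissions, %u uid, %g gid.
file_meta() {
    stat -c '%s %Y %a %u %g' "$1"
}

same_file_meta() {
    meta_a=$(file_meta "$1") || return 2
    meta_b=$(file_meta "$2") || return 2
    [ "$meta_a" = "$meta_b" ]
}

# Demo: a source file, a copy standing in for the previous backup, and an
# extra hard link to that copy (which changes its link count and ctime).
demo_dir=$(mktemp -d)
printf 'example payload\n' > "$demo_dir/source"
cp -p "$demo_dir/source" "$demo_dir/previous"
ln "$demo_dir/previous" "$demo_dir/current"

if same_file_meta "$demo_dir/source" "$demo_dir/previous"; then
    meta_result="unchanged"
else
    meta_result="changed"
fi
echo "file metadata: $meta_result"

rm -rf "$demo_dir"
```

With a comparison like this, the extra hard link does not make the file
look changed, whereas a chmod or chown on either side still would.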

Comments, rebuttals, etc.?

Thank you.

--Ray


     rsync \
         --delete \
         --devices \
         --bwlimit "${BWLIMIT}" \
         --group \
         --hard-links \
         --ignore-times \
         --links \
         --numeric-ids \
         --owner \
         --perms \
         --recursive \
         --sparse \
         --specials \
         --stats \
         --times \
         --verbose \
         --verbose \
         --verbose \
         --verbose \
         "--link-dest=${workingDir}/${previousRSyncPath}/${userName}" \
         "${snapMountName}/" \
         "${snapRSyncHome}/${userName}"


Source to be rsynced:
  File: `newUserExample.tgz'
  Size: 14631           Blocks: 32         IO Block: 4096   regular file
Device: fd01h/64769d    Inode: 11          Links: 1
Access: (0664/-rw-rw-r--)  Uid: ( 2010/sysadmin)   Gid: ( 2010/sysadmin)
Access: 2007-07-22 00:56:24.000000000 -0400
Modify: 2007-04-04 17:26:02.000000000 -0400
Change: 2007-04-10 19:25:08.000000000 -0400


File at --link-dest:
  File: `newUserExample.tgz'
  Size: 14631           Blocks: 32         IO Block: 4096   regular file
Device: 805h/2053d      Inode: 10797118    Links: 3
Access: (0664/-rw-rw-r--)  Uid: ( 2010/sysadmin)   Gid: ( 2010/sysadmin)
Access: 2007-08-28 20:52:11.000000000 -0400
Modify: 2007-04-04 17:26:02.000000000 -0400
Change: 2007-08-28 21:10:22.000000000 -0400




