[Bug 13735] New: Synchronize files when the sending side has newer change times while modification times and sizes are identical on both sides

samba-bugs at samba.org samba-bugs at samba.org
Wed Jan 2 17:28:00 UTC 2019


https://bugzilla.samba.org/show_bug.cgi?id=13735

            Bug ID: 13735
           Summary: Synchronize files when the sending side has newer
                    change times while modification times and sizes are
                    identical on both sides
           Product: rsync
           Version: 3.1.3
          Hardware: All
                OS: All
            Status: NEW
          Severity: normal
          Priority: P5
         Component: core
          Assignee: wayned at samba.org
          Reporter: sbehuret at gmail.com
        QA Contact: rsync-qa at samba.org

Files that have identical sizes and modification times on the sending and
receiving sides, but different contents, are not synchronized by default (e.g.
rsync -a --delete source:/path/ dest:/path/). Synchronizing these files requires
the --checksum or --ignore-times option, both of which are suboptimal in most
cases (see caveats below). I would like to propose new options to synchronize
these files efficiently.

To make this report as clear as possible: modification times (mtime) can be
set manually by users (e.g. with touch), and rsync preserves them during
synchronization (with -a, which includes -t). By contrast, change times (ctime)
are updated automatically by the OS whenever a file's content or metadata
changes. By default, rsync relies on sizes and modification times to decide
whether it should skip or transfer files.
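
The difference between the two timestamps can be observed directly. The
following is a minimal sketch, assuming GNU coreutils (stat -c and touch -d
with a date string; BSD/macOS stat uses different flags). The file name and
the chosen date are arbitrary.

```shell
# Assumption: GNU coreutils stat/touch. The path and date are illustrative only.
tmp=$(mktemp -d)
f="$tmp/demo"
echo 'data' > "$f"

# A user can move mtime anywhere, here back to 2001.
touch -m -d '2001-01-01 00:00:00 UTC' "$f"

mtime=$(stat -c '%Y' "$f")   # last data modification, seconds since epoch
ctime=$(stat -c '%Z' "$f")   # last status change, set by the OS

# mtime shows the user-chosen date; ctime shows when touch actually ran.
echo "mtime=$mtime ctime=$ctime"
rm -r "$tmp"
```

The mtime reflects what the user asked for, while the ctime records the moment
the inode was last changed.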

There are some use cases where files are modified while preserving their
original sizes and mtimes. This can happen when a fixed-size file is updated
with new content in a build chain while forcibly preserving a specific mtime.
To the best of my knowledge, rsync does not have any option to transfer these
files in an efficient manner.

I illustrate the issue below:

# Create two different files with same size
echo 'new content' > srcfile
echo 'old content' > destfile

# Set identical mtime for both files: update srcfile's mtime to match destfile's mtime
touch -mr destfile srcfile

# At this point, srcfile and destfile have:
# - identical size
# - identical mtime
# - different content
# - srcfile's ctime is newer than destfile's ctime

# rsync experiments (dry runs)
rsync -avn srcfile destfile # will not synchronize
rsync -avn --checksum srcfile destfile # will synchronize
rsync -avn --ignore-times srcfile destfile # will synchronize

The desired behavior is to synchronize srcfile to destfile, because srcfile is
different and has a newer ctime.

Caveats: --checksum is I/O intensive: it reads and checksums every file
         --ignore-times forces a (re)synchronization of all files

To solve this issue, we could check change times. Because rsync cannot control
change times on the receiving side, we must be careful. If rsync compared
ctimes (and not mtimes), it would find srcfile newer on the first pass; the
transfer would then give destfile a fresh ctime, so subsequent passes would
find destfile newer. Therefore we cannot simply transfer files whenever
[srcfile's ctime != destfile's ctime]. However, a combination of mtime and
ctime resolves this ambiguity.
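
The ambiguity can be reproduced without rsync. In this sketch, cp -p stands in
for an rsync -t transfer (an assumption for illustration: both preserve mtime
but cannot preserve ctime). GNU stat is assumed, and the file names are
hypothetical.

```shell
# Assumption: GNU coreutils stat; cp -p is only a stand-in for rsync -t.
tmp=$(mktemp -d)
echo 'new content' > "$tmp/src"
sleep 1                          # make the ctime difference visible in whole seconds
cp -p "$tmp/src" "$tmp/dst"      # copy while preserving mtime

src_mtime=$(stat -c '%Y' "$tmp/src")
dst_mtime=$(stat -c '%Y' "$tmp/dst")
src_ctime=$(stat -c '%Z' "$tmp/src")
dst_ctime=$(stat -c '%Z' "$tmp/dst")

# mtimes match after the copy, but the destination now has the newer ctime.
echo "mtime: src=$src_mtime dst=$dst_mtime"
echo "ctime: src=$src_ctime dst=$dst_ctime"
rm -r "$tmp"
```

After the copy the destination always carries the newer ctime, so a plain
inequality test on ctimes would never settle.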

Considering that both files have identical size and mtime, I propose the
following new flags:
--ignore-times-if-newer
  "don't skip files that match size and (m)time if source has newer ctime"
  This would force a transfer of files that have a newer ctime on the sending
  side, even though their sizes and mtimes match.
--checksum-if-newer
  "skip based on checksum if source has newer ctime"
  Likewise, this would force a transfer of files that have a newer ctime on the
  sending side and differ in content.

Similarly, --ignore-times-if-older and --checksum-if-older may be desirable
when we trust older files on the sending side more than newer files on the
receiving side.
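
The intended decision rule for the --ignore-times-if-newer/-if-older variants
can be sketched as a shell function. This is not rsync code: would_transfer is
a hypothetical helper that only illustrates the proposed comparison. It
assumes GNU stat and local files.

```shell
# Hypothetical helper, not part of rsync. Assumption: GNU coreutils stat.
# usage: would_transfer MODE SRC DST   (MODE is "newer" or "older")
# Returns 0 if the file would be transferred, 1 if it would be skipped.
would_transfer() {
    mode=$1 src=$2 dst=$3
    # A missing destination is always transferred.
    [ -e "$dst" ] || return 0
    # Different size or mtime: the default quick check already transfers.
    [ "$(stat -c '%s %Y' "$src")" = "$(stat -c '%s %Y' "$dst")" ] || return 0
    src_ctime=$(stat -c '%Z' "$src")
    dst_ctime=$(stat -c '%Z' "$dst")
    # Size and mtime match: let ctime break the tie in the requested direction.
    case $mode in
        newer) [ "$src_ctime" -gt "$dst_ctime" ] ;;
        older) [ "$src_ctime" -lt "$dst_ctime" ] ;;
        *) return 2 ;;
    esac
}

# Reproduce the scenario from this report in a scratch directory.
tmp=$(mktemp -d)
echo 'old content' > "$tmp/destfile"
sleep 1                                    # ensure ctimes differ by a whole second
echo 'new content' > "$tmp/srcfile"
touch -mr "$tmp/destfile" "$tmp/srcfile"   # same mtime, srcfile's ctime is newer

if would_transfer newer "$tmp/srcfile" "$tmp/destfile"; then r_newer=transfer; else r_newer=skip; fi
if would_transfer older "$tmp/srcfile" "$tmp/destfile"; then r_older=transfer; else r_older=skip; fi
echo "if-newer: $r_newer, if-older: $r_older"
rm -r "$tmp"
```

A --checksum-if-newer variant would add a content comparison before deciding
to transfer, instead of transferring unconditionally.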

New rsync experiments:

# Try to synchronize srcfile to destfile
rsync -avn --ignore-times-if-newer srcfile destfile # will synchronize
rsync -avn --ignore-times-if-older srcfile destfile # will not synchronize

# Swap source and destination: Try to synchronize destfile to srcfile
rsync -avn --ignore-times-if-newer destfile srcfile # will not synchronize
rsync -avn --ignore-times-if-older destfile srcfile # will synchronize

# Real examples (wet runs) to test multiple rsync passes
# Synchronizing srcfile to destfile
rsync -av --ignore-times-if-newer srcfile destfile # 1st pass: will synchronize
rsync -av --ignore-times-if-newer srcfile destfile # 2nd pass: will skip
# nth pass: will always skip: srcfile's ctime is no longer newer, so rsync
# relies only on size and mtime, which are identical
rsync -av --ignore-times-if-newer srcfile destfile
rsync -av --ignore-times-if-older srcfile destfile # 1st pass: will synchronize
rsync -av --ignore-times-if-older srcfile destfile # 2nd pass: will synchronize
# nth pass: will always synchronize: srcfile's ctime will always be older at
# this point
rsync -av --ignore-times-if-older srcfile destfile

# Final cleanup
rm srcfile destfile

And similarly for --checksum-if-newer/--checksum-if-older.

These additional options would efficiently synchronize newer
(--ignore-times-if-newer/--checksum-if-newer) or older
(--ignore-times-if-older/--checksum-if-older) files that do not differ in size
and mtime.
