[Bug 13735] Synchronize files when the sending side has newer change times while modification times and sizes are identical on both sides

samba-bugs at samba.org samba-bugs at samba.org
Tue Jan 22 10:57:39 UTC 2019


--- Comment #3 from Sébastien Béhuret <sbehuret at gmail.com> ---
Thank you for suggesting the patches repo. An improved checksum/maybe-checksum
algorithm would be great, but it appears to require a lot of work. Checksums
are very handy for special cases (e.g. to detect and fix data corruption), but
they are still relatively slow and prone to collisions, or they require
specific patches, as you suggested. Ideally, we want a way to force the
synchronization of files that are more recent on the sending side when mtime
and size are identical on both sides. This would improve the reliability of
system backup software that is based on rsync, and it could be implemented as
a new option that alters the behavior of the quick-check algorithm.

Overall, rsync lacks a solid way to detect and transfer back-dated files. I
feel like the importance of dealing with back-dated files is underestimated:

In a file system, file back-dating may occur during software updates without
malicious intent and without users being aware of it. One example is found in
the Firefox package in Debian-based distributions. Some JS files in the
/usr/share/firefox/browser/defaults/preferences/ directory are always dated
2010-01-01 00:00:00. When changes in these files are small (e.g. a version
string, or a fixed-size series of characters such as a timestamp, hash or key),
the files end up with the same size and mtime, and the changes won’t be
detected by rsync’s quick-check algorithm. Backup software relying on rsync for
incremental updates will eventually produce incorrect backups unless it uses
the --checksum option, but this is sub-optimal (and sometimes buggy), and most
backup systems don’t even allow the user to add this option.
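
The blind spot described above can be reproduced with a small Python sketch.
It is only an illustration: quick_check_differs() is a simplified stand-in for
rsync's quick check (not rsync's actual code), and the file names and pref()
contents are made up for the example.

```python
import os
import tempfile

# 2010-01-01 00:00:00 UTC, the fixed date mentioned for the Firefox files.
FIXED_MTIME = 1262304000


def quick_check_differs(src, dst):
    """Simplified stand-in for the quick check: size or mtime differs."""
    a, b = os.stat(src), os.stat(dst)
    return a.st_size != b.st_size or a.st_mtime_ns != b.st_mtime_ns


def write_backdated(path, data):
    """Write data, then force the mtime back to the fixed date."""
    with open(path, "wb") as f:
        f.write(data)
    os.utime(path, (FIXED_MTIME, FIXED_MTIME))


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src.js")  # hypothetical updated file
    dst = os.path.join(tmp, "dst.js")  # hypothetical stale backup copy
    old = b'pref("example.version", "64.0.1");\n'
    new = b'pref("example.version", "64.0.2");\n'  # same length as old
    write_backdated(dst, old)
    write_backdated(src, new)
    missed = not quick_check_differs(src, dst)
    print("contents differ:", old != new)
    print("quick check misses the change:", missed)
```

Both files end up with identical size and mtime, so a check based on those two
attributes alone reports no difference even though the contents changed.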

Quick fix suggestion:

This may be a bit of an oversimplification, but assuming that the current rsync
quick-check algorithm looks like this:

synchronize(source, dest) IF [ mtime(source) != mtime(dest) OR size(source) !=
size(dest) ]

Then a new option (e.g. --use-ctime or --ignore-times-if-newer) could alter it
in the following way:

synchronize(source, dest) IF [[ ctime(source) > ctime(dest) ] OR [
mtime(source) != mtime(dest) OR size(source) != size(dest) ]]

(Notice the use of ‘greater than’ rather than ‘not equal’ to compare ctimes.)
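
As a rough Python sketch of the proposed rule (not rsync code): it assumes the
quick check transfers a file when either its size or its mtime differs, and
the FileMeta type and the use_ctime parameter are hypothetical names chosen
for illustration, the latter modelled on the option suggested above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical container for the attributes the check looks at."""
    size: int
    mtime: float
    ctime: float


def quick_check(src: FileMeta, dst: FileMeta) -> bool:
    """Assumed current behavior: transfer if size or mtime differs."""
    return src.mtime != dst.mtime or src.size != dst.size


def needs_sync(src: FileMeta, dst: FileMeta, use_ctime: bool = False) -> bool:
    """Proposed behavior: also transfer when the source ctime is newer."""
    if use_ctime and src.ctime > dst.ctime:
        return True
    return quick_check(src, dst)


# Back-dated file: same size and mtime, but changed later on the source.
backup = FileMeta(size=120, mtime=1262304000.0, ctime=1540000000.0)
source = FileMeta(size=120, mtime=1262304000.0, ctime=1548000000.0)

print(needs_sync(source, backup))                  # quick check only
print(needs_sync(source, backup, use_ctime=True))  # with the ctime test
```

With use_ctime left off, the back-dated file is skipped; with it on, the newer
ctime on the source triggers a transfer. Because the comparison is ‘greater
than’, a destination whose ctime is newer than the source does not trigger one.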

This would do the trick and ensure that files that were back-dated are properly
detected and synchronized during incremental updates. I think that such an
option is a must-have for reliable backup software, and could even be enabled
by default since atime updates do not alter ctime.

