duplicated file removal: call for comment

Craig Barratt craig at atheros.com
Wed Feb 12 18:17:36 EST 2003


> This problem may be discussed now, because in versions before
> rsync-2.5.6, the algorithm for removing the so called "duplicated files"
> was broken.
> That's why we expect nobody used it anyway in earlier versions - but who
> knows..

I agree it should be the last argument that wins, but as Wayne points
out, both your code and 2.5.6 behave unpredictably, since qsort() is not
a stable sort and can return identical names in any order.
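
One way to make "last argument wins" deterministic is to break ties
in the comparison function on the order in which the names were given.
The following is only a sketch: struct entry, arg_order, cmp_entry()
and sort_and_clean() are hypothetical names for illustration, not
rsync's own structures or functions.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry type for illustration; not rsync's file_struct. */
struct entry {
    const char *name;  /* path name; set to NULL once removed as a duplicate */
    int arg_order;     /* position in which the name was supplied (unique) */
};

/* Order by name, then by supplied position.  Because arg_order is
 * unique, no two entries compare equal, so the result of qsort() no
 * longer depends on how it treats equal elements. */
static int cmp_entry(const void *a, const void *b)
{
    const struct entry *x = a;
    const struct entry *y = b;
    int c = strcmp(x->name, y->name);

    if (c != 0)
        return c;
    return (x->arg_order > y->arg_order) - (x->arg_order < y->arg_order);
}

/* Sort the list, then mark every entry that is followed by an entry
 * with the same name.  Only the last-supplied entry of each run of
 * identical names keeps its name. */
static void sort_and_clean(struct entry *list, size_t n)
{
    size_t i;

    if (n == 0)
        return;
    assert(list != NULL);

    qsort(list, n, sizeof list[0], cmp_entry);
    for (i = 0; i + 1 < n; i++) {
        if (strcmp(list[i].name, list[i + 1].name) == 0)
            list[i].name = NULL;
    }
}
```

Both sides would have to apply the same tie-break for their cleaned
lists to stay identical, which is the compatibility concern below.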

Another concern I have about this fix in 2.5.6 is that there is a risk
that the change is not backward compatible with earlier protocol versions.
The file list is sent (unsorted and uncleaned) from the sender to the
receiver, and each side then sorts and cleans the list.  Since the
duplicate removal changed in 2.5.6, but the protocol number didn't
change, it is possible that with duplicates the file lists are no
longer identical.  Specifically, with three or more duplicates, 2.5.5
and earlier will remove the even ones, while 2.5.6 correctly removes
all but the first.  Remember that the files are referred to as an
integer index into the sorted file list, and the receiver skips
NULL (duplicate) files.
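
To make the difference concrete, here is a small model of the two
behaviors described above, run over an already sorted list of names.
It is a sketch only: clean_adjacent() is an assumption about how
"removes the even ones" could come about (comparing each entry only
with its immediate predecessor), and none of these functions are taken
from the rsync source.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Model of the older behavior (assumed): compare each entry with its
 * immediate predecessor, and skip the comparison when either one has
 * already been removed.  With three identical names the 2nd is
 * removed and the 3rd survives. */
static void clean_adjacent(const char **names, size_t n)
{
    size_t i;

    assert(names != NULL || n == 0);
    for (i = 1; i < n; i++) {
        if (names[i] && names[i - 1] &&
            strcmp(names[i], names[i - 1]) == 0)
            names[i] = NULL;
    }
}

/* Model of the 2.5.6 behavior as described: compare each entry with
 * the last entry that was kept, so all but the first of a run of
 * identical names are removed.  Assumes no entry is NULL on input. */
static void clean_all_but_first(const char **names, size_t n)
{
    size_t i, prev = 0;

    assert(names != NULL || n == 0);
    for (i = 1; i < n; i++) {
        if (strcmp(names[i], names[prev]) == 0)
            names[i] = NULL;
        else
            prev = i;
    }
}

/* Return the first index at which exactly one of the two cleaned
 * lists has a removed (NULL) entry, or n if they agree everywhere. */
static size_t first_disagreement(const char **a, const char **b, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if ((a[i] == NULL) != (b[i] == NULL))
            return i;
    }
    return n;
}
```

For the sorted names "a", "a", "a", "b", the first model keeps indexes
0, 2 and 3, the second keeps indexes 0 and 3, and the two lists first
disagree at index 2, i.e. the 3rd file.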

I suspect (but haven't checked) that if a 2.5.5 receiver is talking to
a 2.5.6 sender then 2.5.5 will send the index for the 3rd file, which
will be null_file on 2.5.6.

Craig

