[Bug 14109] New: Support Custom Fuzzy Basis Selection Algorithm
samba-bugs at samba.org
Sun Sep 1 22:55:39 UTC 2019
Bug ID: 14109
Summary: Support Custom Fuzzy Basis Selection Algorithm
Assignee: wayne at opencoder.net
Reporter: lonniebiz at yahoo.com
QA Contact: rsync-qa at samba.org
The --fuzzy argument does an incredible job at syncing large files when it
chooses the correct fuzzy basis.
However, the default "fuzzy-basis-destination-file-selection algorithm" is not
correct for every situation, so I propose the ability to pass an argument to
the fuzzy parameter that specifies which
"fuzzy-basis-destination-file-selection algorithm" to use.
I've posted a question detailing my needs here:
In short, some of the files in my source-folder are 200GB in size. When rsync
chooses the correct existing-destination-file for its "fuzzy basis", my
synchronization (of these files) seems magical in terms of the data that gets
transferred over the wire.
However, when it chooses the wrong existing-destination-file as the source
file's fuzzy basis, the data transfer can take days.
Look at the filenames in both my source-folder and destination-folder (below):
# Source Folder's new files (from today's on-site backup):
# Destination-Folder's old files (from yesterday's off-site backup):
In my case, the fuzzy-basis-selection-algorithm needs to select the existing
destination file that:
1) Has the same file extension as the source file
2) Begins with the most consecutively identical characters as the source file
The default algorithm does not meet these requirements.
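
To make the two requirements above concrete, here is a minimal sketch in
Python of the selection rule I have in mind. It is only an illustration, not
rsync's code: the function names and the example filenames are hypothetical,
and ties are simply broken in favor of the first candidate listed.

```python
import os


def common_prefix_len(a: str, b: str) -> int:
    """Count how many leading characters two names share."""
    n = 0
    for ca, cb in zip(a, b):
        if ca != cb:
            break
        n += 1
    return n


def pick_fuzzy_basis(source_name: str, dest_names: list[str]):
    """Pick a basis file for source_name from the destination names.

    Rule 1: keep only destination files with the same extension.
    Rule 2: among those, take the longest shared leading run of characters.
    Returns None when no destination file has a matching extension.
    """
    src_ext = os.path.splitext(source_name)[1]
    candidates = [d for d in dest_names if os.path.splitext(d)[1] == src_ext]
    if not candidates:
        return None
    # max() keeps the first candidate when several share the same prefix length.
    return max(candidates, key=lambda d: common_prefix_len(source_name, d))


# Hypothetical filenames, for illustration only.
dest = [
    "vm-alpha_2019-08-31.vhdx",
    "vm-beta_2019-09-01.vhdx",
    "vm-alpha_2019-08-31.log",
]
print(pick_fuzzy_basis("vm-alpha_2019-09-01.vhdx", dest))
```

With these made-up names, yesterday's "vm-alpha" image is chosen even though
another file carries today's date, because it shares the longer leading run of
characters and has the same extension.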
Therefore, I propose the ability to pass an argument that allows the user to
specify non-default fuzzy basis selection algorithms.
There should probably be a few common, baked-in ones (as time goes on) that you
can choose from by name, and it would be even more flexible if rsync also
gave the user the ability to pass a file into the command that specifies a
custom "fuzzy-basis-destination-file-selection algorithm".
Naturally, if these features are granted, the documentation would also need to
be updated to give guidance on specifying these things.
If these things are already implemented, and I have somehow overlooked them,
would you kindly post an answer to my question here?: