combining --preallocate and --fuzzy

Matt McCutchen matt at mattmccutchen.net
Fri Apr 4 00:55:24 GMT 2008


On Thu, 2008-04-03 at 15:49 -0400, John Taylor wrote:
> What I would like to accomplish is the merging of the --preallocate patch
> and the --fuzzy option.  When these two switches are used together (at least
> on Windows platforms), I want rsync to determine which fuzzy file to use
> (from the fuzzy_distance() function), but then go ahead and preallocate
> the new file, not with posix_fallocate(), but with the contents of the
> file matched from fuzzy_distance().  This would keep the destination
> file from being severely fragmented.
> 
> What I see happening is rsync doing a local file copy first, but not
> block by block.  When you copy a file on Windows, with the copy command,
> the entire file (size) is preallocated so that no fragmentation occurs.
> After this step, the copy command performs the actual transfer of data
> from the source file to the destination file.  You could think of this
> almost as a local, non-fragmenting file copy operation, and then the
> rsync algorithm is used to update the new, destination file.

Do I understand correctly: you want rsync to copy the fuzzy basis file
to the new destination file (either by reading and writing or with some
special Windows system call) and then proceed with the file transfer?  I
don't see how the extra step would reduce fragmentation compared to
rsync's current technique.  Rsync currently preallocates the new
destination file to the length of the source file using posix_fallocate(),
which (IIRC) maps to Windows's SetEndOfFile(), and then fills in the data.
This is pretty much the same as what the Windows copy command does, and
Rob Bosch has found that it mostly eliminates fragmentation.
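
To make the preallocate-then-fill sequence concrete, here is a minimal
sketch in Python (not rsync's actual C code).  The helper name
copy_with_preallocation and the chunk size are made up for illustration,
and the truncate() fallback is an assumption for platforms without
posix_fallocate; it only sets the file length and may not reserve blocks.

```python
import os


def copy_with_preallocation(src_path, dst_path, chunk_size=64 * 1024):
    """Reserve dst_path at the full source length, then fill in the data.

    Hypothetical helper for illustration; not part of rsync.
    Returns the number of bytes the destination was sized to.
    """
    size = os.path.getsize(src_path)
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        if size > 0:
            try:
                # Ask the filesystem to reserve the whole extent up front,
                # so later writes do not have to grow the file piecemeal.
                os.posix_fallocate(dst.fileno(), 0, size)
            except (AttributeError, OSError):
                # Assumption: fall back to just setting the length when
                # posix_fallocate is missing or unsupported here.
                dst.truncate(size)
        # The file position is still 0, so the data overwrites the
        # range that was reserved above.
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
    return size
```

Calling copy_with_preallocation("basis.bin", "new.bin") (file names are
placeholders) sizes the destination once and then writes into it, which is
the same order of operations as described above.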

Matt