File fragmentation

Matt McCutchen hashproduct+rsync at gmail.com
Fri Dec 1 03:55:24 GMT 2006


On 9/5/06, Matt McCutchen <hashproduct+rsync at gmail.com> wrote:
> There are, however, two things about the preallocate patch that I mean
> to revisit:
> (1) Rsync should preallocate locally copied files (e.g., due to
> --copy-dest) as well as transferred ones.
> (2) posix_fallocate actually extends the file with logical zeros.  If
> for some reason the file ends up being shorter than rsync expected,
> rsync needs to truncate the file to the new size.  I know this
> situation could arise if a file shrinks while it is being locally
> copied; I don't know whether it can arise if a source file shrinks
> while the sender is transferring it.

I have improved the preallocation patch to handle #1 and #2.  The
improved version is attached.  Please test and comment!  Wayne, please
consider adding the patch to patches/ of the rsync source code.
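
For readers without the attachment, here is a minimal sketch of point (2). It is not taken from the patch: the helper name write_preallocated and its parameters are hypothetical, and it assumes a POSIX system that provides posix_fallocate, pwrite, and ftruncate. It reserves the expected size, writes whatever data actually arrived, and trims the file if it ended up longer than the data.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the patch): preallocate expected_len bytes,
 * write the actual_len bytes in buf, then cut off any preallocated tail.
 * Returns 0 on success or an errno-style error number. */
int write_preallocated(int fd, const char *buf, off_t actual_len, off_t expected_len)
{
    struct stat st;
    off_t done = 0;

    assert(actual_len >= 0 && actual_len <= expected_len);

    if (expected_len > 0) {
        /* posix_fallocate returns the error number directly; it does not
         * set errno. On success the file size is at least expected_len. */
        int err = posix_fallocate(fd, 0, expected_len);
        if (err != 0)
            return err;
    }

    while (done < actual_len) {
        ssize_t n = pwrite(fd, buf + done, (size_t)(actual_len - done), done);
        if (n < 0)
            return errno;
        if (n == 0)
            return EIO;   /* no progress; give up rather than loop forever */
        done += n;
    }

    /* The source shrank: drop the logical zeros left past the real data. */
    if (fstat(fd, &st) != 0)
        return errno;
    if (st.st_size > actual_len && ftruncate(fd, actual_len) != 0)
        return errno;
    return 0;
}
```

Without the final ftruncate, a file that was expected to be 4096 bytes but delivered only 5 would be left at 4096 bytes, padded with zeros.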

On 8/26/06, Wayne Davison <wayned at samba.org> wrote:
> I'm wondering if this will slow down rsync.  Since the function changes
> the size of the file before rsync writes out the data, does this result
> in extra disk I/O, especially for large files?  We'd probably need to
> test both Windows systems and Linux/unix systems separately and possibly
> conditionalize the code (if it does not slow things down somewhere) or
> make it a configure (command-line?) option (if someone wants to pay the
> price for reduced fragmentation).

I did a simple test with a 100MB file on Linux, and preallocation
indeed seemed to slow rsync down.  I looked at the strace, and
posix_fallocate appears to be implemented by writing a zero byte into
each needed disk block using pwrite, forcing it to be allocated.
Yuck!  I guess it works though.
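
The behavior seen in the strace can be approximated as follows. This is an illustrative sketch only, not the C library's actual code: the function name create_zero_touched is hypothetical, st_blksize is used as a stand-in for the filesystem block size, and the file is created with O_EXCL so that the zero bytes can never overwrite existing data.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create a new file of len bytes and force its blocks
 * to be allocated by writing one zero byte into each of them.
 * Returns an open descriptor on success, -1 on failure. */
int create_zero_touched(const char *path, off_t len)
{
    const char zero = 0;
    struct stat st;
    off_t off;
    int fd;

    assert(len > 0);

    /* O_EXCL: refuse to touch a file that already exists. */
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;

    /* Assumption: st_blksize approximates the allocation unit. */
    if (fstat(fd, &st) != 0 || st.st_blksize <= 0)
        goto fail;

    /* One small write per block: this is the extra I/O that makes
     * preallocation slower than simply writing the data once. */
    for (off = 0; off < len; off += st.st_blksize)
        if (pwrite(fd, &zero, 1, off) != 1)
            goto fail;

    /* Touch the final byte so the file size is exactly len. */
    if (pwrite(fd, &zero, 1, len - 1) != 1)
        goto fail;
    return fd;

fail:
    close(fd);
    unlink(path);
    return -1;
}
```

With a 4096-byte block size, a 100MB file costs 25,600 one-byte writes before any real data is written, which is consistent with the slowdown noted above.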

On 9/13/06, Rob Bosch <robbosch at msn.com> wrote:
> Wayne…my vote is for a command-line option.

In my improved patch, preallocation is controlled by the command-line
option --preallocate.  If rsync finds both ftruncate and posix_fallocate
at configure time, it supports this option when it is the receiver.
Either way, it supports passing the option on to the other side when it
is the sender.
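
The configure-time rule can be sketched like this. The HAVE_FTRUNCATE and HAVE_POSIX_FALLOCATE macros follow the usual autoconf naming convention and would normally come from config.h; SUPPORT_PREALLOCATION, preallocate_usable, and the am_sender parameter are hypothetical names chosen for illustration and may not match the patch.

```c
#include <assert.h>

/* In a real build these HAVE_* macros are defined by configure in config.h.
 * SUPPORT_PREALLOCATION is a hypothetical name for "both were found". */
#if defined(HAVE_FTRUNCATE) && defined(HAVE_POSIX_FALLOCATE)
#define SUPPORT_PREALLOCATION 1
#endif

/* Returns 1 if --preallocate can be honored in the given role, else 0. */
int preallocate_usable(int am_sender)
{
    assert(am_sender == 0 || am_sender == 1);

    if (am_sender)
        return 1;   /* the sender only forwards the option to the other side */
#ifdef SUPPORT_PREALLOCATION
    return 1;       /* receiver was built with both functions available */
#else
    return 0;       /* receiver lacks ftruncate or posix_fallocate */
#endif
}
```

Compiled on its own, without a config.h, neither macro is defined, so the receiver case reports the option as unsupported while the sender case still accepts it.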

An option gives the user the most flexibility.  It might be more
convenient to the user if the receiving rsync preallocated by default
on systems where preallocation is an improvement, but I'm not sure how
to test that at configure time!

Matt
-------------- next part --------------
A non-text attachment was scrubbed...
Name: preallocate.diff
Type: text/x-patch
Size: 7763 bytes
Desc: not available
Url : http://lists.samba.org/archive/rsync/attachments/20061130/2daf2ef1/preallocate.bin

