Current status of --inplace?

Matt McCutchen matt at mattmccutchen.net
Mon Nov 10 20:32:07 GMT 2008


On Mon, 2008-11-10 at 09:57 -0600, Steve Bergman wrote:
> In the 3.0.4 version of the man page, dated June 29, 2008, it still
> states:
> 
> """
> (5) the efficiency of rsync’s delta-transfer algorithm may be reduced if
> some data in the destination file is overwritten before it can be copied
> to a position later in the file
> """
> 
> Also, I know I have read somewhere in the past that the limitation stems
> from the fact that "rsync does not yet sort the blocks to be updated". I
> presume that the current manual is accurate, but would be interested in
> confirmation.

Yes, this is still the case.

> Also, how significant is this effect?  I have some
> possible uses for --inplace which could benefit from not having to copy
> all that data locally every time, but which also require good network
> efficiency.

It depends on how the source file is changing.  An insertion in the
middle of the file moves the data after the insertion to a later offset,
so rsync will retransmit all of that data because it is overwritten on
the destination before it can be copied to the later offset.  This is
easily demonstrated:

$ cp PATH/TO/eclipse-SDK-3.4-linux-gtk.tar.gz dest
$ cat <(head -c 100000000 dest) <(echo "INSERTED DATA") \
	<(tail -c +100000001 dest) >src
$ rsync --only-write-batch=batch --no-whole-file --inplace --stats src dest

Number of files: 1
Number of files transferred: 1
Total file size: 158375423 bytes
Total transferred file size: 158375423 bytes
Literal data: 58382959 bytes  # everything after insertion retransmitted
Matched data: 99992464 bytes
File list size: 18
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 28
Total bytes received: 88133

sent 28 bytes  received 88133 bytes  16029.27 bytes/sec
total size is 158375423  speedup is 1796.43 (BATCH ONLY)

Compare to:

$ rsync --only-write-batch=batch --no-whole-file --stats src dest

Number of files: 1
Number of files transferred: 1
Total file size: 158375423 bytes
Total transferred file size: 158375423 bytes
Literal data: 12598 bytes  # only the affected block retransmitted
Matched data: 158362825 bytes
File list size: 18
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 28
Total bytes received: 88133

sent 28 bytes  received 88133 bytes  35264.40 bytes/sec
total size is 158375423  speedup is 1796.43 (BATCH ONLY)
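
The two literal-data figures above can be checked with simple arithmetic. What follows is a toy model under stated assumptions, not rsync's implementation: the block size of 12584 bytes is inferred from the 12598-byte literal figure minus the 14 inserted bytes ("INSERTED DATA" plus a newline), the destination size is the total size minus those 14 bytes, and predict_literal is a hypothetical helper name.

```shell
# Toy model (NOT rsync's code): predict literal bytes for one insertion.
# Assumptions: block size 12584 is inferred from the stats output above;
# the insertion falls strictly inside one full-sized block.

# predict_literal MODE DEST_SIZE BLOCK INSERT_AT INSERT_LEN
# MODE is "inplace" or "copy"; prints the predicted literal byte count.
predict_literal() {
    local mode=$1 dest_size=$2 block=$3 insert_at=$4 insert_len=$5
    local src_size=$(( dest_size + insert_len ))
    local matched
    if [ "$mode" = inplace ]; then
        # Only whole blocks lying entirely before the insertion keep their
        # offset. Later data has moved to a higher offset and is overwritten
        # on the destination before it could be copied there.
        matched=$(( insert_at / block * block ))
    else
        # Writing to a separate copy: every block except the one containing
        # the insertion point can still be matched.
        matched=$(( dest_size - block ))
    fi
    echo $(( src_size - matched ))
}

predict_literal inplace 158375409 12584 100000000 14   # prints 58382959
predict_literal copy    158375409 12584 100000000 14   # prints 12598
```

Both results agree with the "Literal data" lines in the two runs, which suggests the loss with --inplace is roughly everything from the block containing the insertion to the end of the file.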

On the other hand, in-place changes (common for databases) do not move
any data and thus do not incur any efficiency loss with --inplace:

# The "echo" replaces bytes 100000001 through 100000013
# (counting from 1, like tail does).
$ cat <(head -c 100000000 dest) <(echo "CHANGED DATA") \
	<(tail -c +100000014 dest) >src
$ rsync --only-write-batch=batch --no-whole-file --inplace --stats src dest

Number of files: 1
Number of files transferred: 1
Total file size: 158375409 bytes
Total transferred file size: 158375409 bytes
Literal data: 12584 bytes  # only the affected block retransmitted
Matched data: 158362825 bytes
File list size: 18
File list generation time: 0.001 seconds
File list transfer time: 0.000 seconds
Total bytes sent: 28
Total bytes received: 88133

sent 28 bytes  received 88133 bytes  35264.40 bytes/sec
total size is 158375409  speedup is 1796.43 (BATCH ONLY)
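
For anyone without a large tarball at hand, the same experiment can be set up at a smaller scale. This is only a sketch: the 1 MiB random file, the 500000-byte cut point, and the names src_insert, src_change and size are illustrative choices, not part of the runs above. The brace groups build the same files as the process-substitution form used earlier, and the rsync flags are the ones from the commands above.

```shell
# Small-scale setup for the insertion and in-place-change cases.
work=$(mktemp -d)
cd "$work" || exit 1

# 1 MiB of random data stands in for the tarball (hypothetical test file).
head -c 1048576 /dev/urandom >dest
cut=500000   # byte offset at which each src starts to differ from dest

# Insertion: 14 extra bytes; everything after them moves to a later offset.
{ head -c "$cut" dest; echo "INSERTED DATA"; \
  tail -c +"$((cut + 1))" dest; } >src_insert

# In-place change: 13 bytes are replaced; nothing moves.
{ head -c "$cut" dest; echo "CHANGED DATA"; \
  tail -c +"$((cut + 14))" dest; } >src_change

# Print a file's size in bytes.
size() { wc -c <"$1" | tr -d ' '; }
echo "dest=$(size dest) insert=$(size src_insert) change=$(size src_change)"

# --only-write-batch leaves dest untouched, so both cases can reuse it.
if command -v rsync >/dev/null 2>&1; then
    for src in src_insert src_change; do
        echo "== $src =="
        rsync --only-write-batch=batch --no-whole-file --inplace --stats \
            "$src" dest | grep -E '^(Literal|Matched) data' || true
    done
fi
```

The size line should show src_insert 14 bytes larger than dest and src_change the same size as dest. The literal and matched figures will depend on the block size rsync picks for a file this small, so no specific numbers are claimed here.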

Matt


