lee.j.i.winter at gmail.com
Wed Sep 16 08:13:40 MDT 2009
On Wed, Sep 16, 2009 at 1:44 AM, Matt McCutchen <matt at mattmccutchen.net> wrote:
> Both of your proposals have been discussed before (see below),
> neither has been taken very far because they would both involve large
> changes to rsync.
The delta computations are addressed below. Externalizing the file
list should have quite minimal impact on the existing code. I say
that without having planned the revisions to the existing code base;
for me such a planning effort would be equivalent to doing the
modifications, so if that is what I should be doing (essentially
forking the code) then further discussion is not useful.
However, it appears to me that the issue can be decomposed into simple
changes that would not disturb the existing design, so discussion of
the feasibility is probably worthwhile.
To externalize the file list we do need a substantial change to the
_behavior_ of the program, but I think those changes can be driven by
two fairly local changes to the code. The harder of the two changes
is using a pregenerated file list. The option that provides the
pregenerated file list must replace the reading of the local file
system with the reading of the file containing the pregenerated file
list. The only complication I expect is that the reading of the local
file system might be intertwined with the processing of it.
One way to deal with this is to make the change at a very low level.
Essentially we would capture the file system scanning transactions as
they take place and preserve their logical sequence. The capture
would have to include the request parameters and the returned results.
Then "replaying" the transactions can be done by injecting the
results at the same level where they were captured. This would have a
substantial impact on that section of the code, but it would be
completely transparent to the rest of the code (including intrusive
stuff like progress indicators).
This isn't hard. Nor is it complex. And it is easy to test: you can
run the captured playback alongside a live scan in parallel and
compare the results to validate the fidelity of the capture/playback.
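To make the capture/playback idea concrete, here is a minimal Python sketch (rsync itself is C, and all names here are illustrative, not rsync's): a recorder logs each scan request and its results in logical sequence, and a replayer injects the recorded results back at the same level, so everything above that level sees no difference.

```python
import os

class FSRecorder:
    """Capture file-system scanning transactions (request parameters
    plus returned results) in the order they occur."""
    def __init__(self):
        self.log = []

    def scan(self, path):
        entries = []
        for name in sorted(os.listdir(path)):
            st = os.lstat(os.path.join(path, name))
            entries.append({"name": name, "mode": st.st_mode,
                            "size": st.st_size, "mtime": int(st.st_mtime)})
        self.log.append({"path": path, "entries": entries})
        return entries

class FSReplayer:
    """Replay the captured transactions at the same point where they
    were captured, without touching the file system."""
    def __init__(self, log):
        self.log = iter(log)

    def scan(self, path):
        rec = next(self.log)
        assert rec["path"] == path, "replay out of logical sequence"
        return rec["entries"]
```

The parallel-run test described above falls out naturally: scan live with the recorder, then replay, and compare the two result streams.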
A less dramatic approach would be to implement the change at a point
between the phases of the application. That phase change point would
be a natural point of "cleavage" where the complexity of the change
would be trivial. It appears to me that such a point must exist
because both ends are required to sort their file lists prior to
sending them. So changing the existing behavior involves only making
the file-system processing that happens prior to the sorting
conditional, and then, conversely, conditionally substituting the
pregenerated, presorted file list in place of the sorting.
Adding a conditional write plus an exit after the existing sorting
would be a simple way to implement the pre-generation of the file
list. Failing that, it can be skipped and the pregeneration performed
by a separate program (which would probably not be part of rsync
proper).
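As a sketch of that cleavage point, here is hypothetical Python (the hook names and the JSON on-disk format are my invention, not rsync's): the normal path scans and sorts, a save_to hook writes the sorted list out and could exit there, and a load_from hook substitutes a pregenerated, presorted list for the entire scan-and-sort phase.

```python
import json, os

def build_file_list(root):
    """Scan the local file system and produce the sorted list that
    would otherwise be built internally (fields are illustrative)."""
    flist = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            p = os.path.join(dirpath, name)
            st = os.lstat(p)
            flist.append({"path": os.path.relpath(p, root),
                          "size": st.st_size, "mtime": int(st.st_mtime)})
    flist.sort(key=lambda f: f["path"])   # both ends sort before sending
    return flist

def get_file_list(root, save_to=None, load_from=None):
    """Conditional substitution at the phase boundary (hypothetical)."""
    if load_from:
        # Replace scan-and-sort with the pregenerated, presorted list.
        with open(load_from) as fh:
            return json.load(fh)
    flist = build_file_list(root)
    if save_to:
        # Pre-generation: a conditional write after the existing sort.
        with open(save_to, "w") as fh:
            json.dump(flist, fh)
    return flist
```

A mirror would call this once with save_to after each update, and every subsequent client transfer would use load_from.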
> On Wed, 2009-09-16 at 01:16 -0400, Lee Winter wrote:
>> A simplistic approach would be to add a file-list option able to
>> create-and-save a new file list or load-and-use an existing file list.
>> This would permit a newly updated mirror to perform the file list
>> generation once per update rather than once per client.
> A first step in this direction is the maintained patch "db.diff", which
> caches checksums of entire source files but does not (yet) cache the
> file list itself.
OK, so how hard would it be to add the file list? Implicitly it is
available (and used), but just never saved, right?
>> But in a severely asymmetrical environment where the goal is minimal
>> load on the sender, it would be better to swap the roles by using the
>> sender's newer version as the "basis file" and let each receiver chew
>> through its comparisons rather than having the sender perform the
>> comparisons for all of the receivers.
> This is a good idea that has been discussed before and is implemented in
> another tool called "zsync". Wayne noted some of its drawbacks here:
> Perhaps you could respond to his concerns.
Given the fact that the intended usage involves dense files for which
the delta processing is not expected to be beneficial, this part of
the discussion is probably moot.
First, I was not proposing an unconditional change to the direction of
delta handling. I was actually proposing a more ambitious change
that would make the directionality a user option.
That seriously increases the scope of the change, but it also gains
you the benefit of the feature (flexible direction), so it might be
justified on that basis.
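To show what the reversed direction looks like, here is a toy Python sketch of the zsync-style flow: the sender publishes checksums of its newer version, and each receiver does the matching against its own basis file, so the per-receiver comparison cost lands on the receivers. This uses fixed-offset matching and a tiny block size purely for demonstration; the real algorithm uses a rolling checksum to find matches at arbitrary offsets.

```python
import hashlib

BLOCK = 4   # toy block size; real blocks are hundreds of bytes

def block_sums(data):
    """Sender side: checksums of the *newer* file, published once
    for all receivers."""
    return [hashlib.md5(data[i:i + BLOCK]).hexdigest()
            for i in range(0, len(data), BLOCK)]

def plan_fetch(basis, sums):
    """Receiver side: match the local basis file against the sender's
    checksums and decide which blocks must be requested."""
    have = {hashlib.md5(basis[i:i + BLOCK]).hexdigest(): i
            for i in range(0, len(basis), BLOCK)}
    plan = []
    for n, s in enumerate(sums):
        if s in have:
            plan.append(("copy", have[s]))   # reuse local data
        else:
            plan.append(("fetch", n))        # request block n from sender
    return plan
```

The sender's work (one checksum pass) is done once per update instead of once per client, which is the whole point in the asymmetric case.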
Second, I looked at the comments, and while technically valid I don't
think they are reasonable.
----- quote start
(1) This increases the amount of data flowing in the transfer
direction, removing the largest part of the data currently flowing in
the opposite direction. This is a bad thing if the network is the
bottleneck for the transfer. The current algorithm helps to balance
out the network use by sending the checksum data in the opposite
direction of the transfer.
----- quote stop
The whole point of the rsync delta algorithm is that the checksums are
small compared to the full data set. So the bandwidth impact should
be negligible except in the extreme cases where a few bytes of a large
file are changed. That situation is both rare and predictable. And
in that situation the delta processing is less effective than a simple
diff (based on a sender-cached image of the modified file) because the
delta processing uses essentially arbitrary block boundaries. If a
block-boundary search were added, which I have also seen discussed,
then delta processing would be competitive with diff even in the
extreme cases. But those searches are quite expensive in terms of
processing load, so they too would have to be a user option.
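A quick back-of-envelope supports the "negligible" claim. Assuming rsync-like per-block overhead of roughly 20 bytes (a 4-byte rolling sum plus a strong checksum; exact sizes vary with version and options), the checksum stream is a small fraction of the file:

```python
def checksum_overhead(file_size, block_size, per_block=20):
    """Fraction of file_size consumed by checksum data, assuming
    per_block bytes of checksum per block (4-byte rolling sum plus
    a 16-byte strong sum as a rough rsync-like figure)."""
    nblocks = (file_size + block_size - 1) // block_size
    return nblocks * per_block / file_size

# A 100 MB file with ~10 KB blocks: overhead is about 0.2%.
ratio = checksum_overhead(100 * 2**20, 10 * 1024)
```

So reversing the direction moves roughly 0.2% of the transfer volume to the other side of the pipe, which is only visible when the network is the bottleneck and the files barely change.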
----- quote start
(2) This increases the disk I/O on the sending side because it would
need to read the file twice (assuming that cached checksums aren't
available). Since rsync is running in a pipelined mode, the sender
will be iterating over several future files by the time it gets back a
request for data chunks from the receiving side. Hopefully the data
will still be in the disk cache, but if it is not, the transfer would
bog down to a significant degree.
----- quote stop
Again we have a non-quantitative statement. The absolute worst-case
effect is to double the disk I/O. I have a hard time imagining an
rsync system that is within a factor of two of being disk-bound. The
best I can come up with is a hardware chimera of an ST-506 drive
trying to fill an OC-192 pipe. It would have to be a chimera because
I've never seen a motherboard that could accept both kinds of bus
adapters. There could be a software bottleneck, such as anything
running micros~1 window~1, but in neither of those cases does anyone
actually care about the overall performance of the system. If anyone
cared about performance, the system wouldn't be built like that.
(Cf. the anthropic principle.)
----- quote start
(3) This increases the time during which a change on the sending side
results in the sending of a bogus file to the receiver. The reason is
that the checksums are sent some amount of elapsed time prior to when
the requests for file hunks begin arriving. If changes happen during
this period, the file will not be constructed correctly. The current
algorithm has a similar problem, but is sensitive to changes occurring
on the receiving side. For many applications, the receiving side only
changes when the files are updated by rsync (e.g. backups), so the
current algorithm has the advantage that the checksum-aging window is
on the side that is less likely to be changing.
----- quote stop
Still no numbers. What is the percentage increase in the window of
vulnerability? I expect it to be small because the pipelining of
multiple overlapping updates already increases the vulnerability
window by a large factor.
----- quote start
(4) Either the number of active temporary files increases, or the
memory needed for remembering the matching blocks increases. Either
way, it looks like the disk I/O on the receiving side also increases.
This occurs because the receiver first gets the checksum data a good
time before the sender data starts arriving for the file. One
algorithm to handle this is to start constructing a sparse temporary
file as the basis data is read, keeping the temp file open until the
server data arrives to finish the file. The bad thing about this
algorithm is that it makes it hard to compute the full-file checksum
without re-reading all the matched data from the temp file (it also
increases the number of active temporary files, which increases the
demands on free disk space). The alternative is to not do any temp-
file building until the sender data starts arriving, at which point
we'd need to re-read the matched data from the basis file -- this
would require caching (in memory?) all the matching block info we
computed for all the files that are awaiting sender-side data.
One solution to some of the above is to remove the pipelining from the
transfer so that rsync is operating on one file at a time, adding
delays to the transfer as each file is requested in sequence.
The main benefit of a reverse-algorithm is that the sender does not
need as much CPU, especially if the checksum data is cached. It seems
to me that the extra disk I/O needed on both sides might outweigh this
advantage, especially for large files that cannot remain in the disk-
cache long enough for the necessary re-reading.
----- quote stop
This is not a reasonable point. Exactly the same processing is going
on, just in different (better) places. But the above comments count
only the increases and none of the decreases, which is quite unfair.
>> But, while I did see some traffic about the
>> conflict between delta processing and compressed files, I never
>> located a resolution or summary of the issue. Does the current
>> version of rsync actually provide much benefit for compressed files?
> Not for typical compression formats, where a local change to the input
> changes the output from that point on. If compression would preserve
> locality of changes, then delta transfer would work, but locality is
> somewhat at odds with effective compression. There used to be a patch
> "gzip-rsyncable.diff" that modified gzip for better locality, but I
> don't know if it's maintained any more.
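The locality problem is easy to demonstrate. In the sketch below (Python's zlib standing in for gzip), flipping one byte near the start of the input changes the compressed stream all the way through its trailing checksum, so block matching finds essentially nothing to reuse; compressing independent chunks, a crude stand-in for the locality gzip's rsyncable patch aims at, confines the damage to a single chunk.

```python
import zlib

data1 = bytes(range(256)) * 64        # 16 KB of input
data2 = bytearray(data1)
data2[10] ^= 0xFF                     # flip one byte near the start
data2 = bytes(data2)

# Whole-stream compression: the one-byte change alters the output
# from that point on, including the trailing Adler-32 checksum.
c1, c2 = zlib.compress(data1), zlib.compress(data2)

def chunked(data, chunk=4096):
    """Compress each chunk independently -- a crude stand-in for the
    change-locality that gzip --rsyncable tries to preserve."""
    return [zlib.compress(data[i:i + chunk])
            for i in range(0, len(data), chunk)]

# Per-chunk compression: only the chunk holding the flipped byte
# changes; the other chunks stay byte-for-byte identical.
ch1, ch2 = chunked(data1), chunked(data2)
```

With whole-stream output, rsync's delta matching has nothing stable to anchor on after the change; with chunked output, all but one chunk would match.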
This renders most of the above discussion on reverse delta handling
moot. With a large collection of dense files there is no point to
making any change. Better to just drop the delta-handling and keep
the (impressive) collection of file system heuristics that have grown
up around the compression algorithm.
> The other approach is to delta-transfer the uncompressed content:
A non-starter, I think.
We should ignore the reverse delta issue. We should not spend much
time on the generation of the file list. Instead we should
concentrate on the issue of optionally inserting an existing file list
instead of generating a new one. So we aren't talking about a change
to rsync, but a change to db.diff. If that is hard, will someone
please tell me what is hard about it?