Proposed tweaks

Tue Sep 15 23:16:54 MDT 2009

On Tue, Sep 15, 2009 at 9:48 PM, Matt McCutchen <matt at mattmccutchen.net> wrote:
> On Mon, 2009-09-14 at 21:13 -0400, Lee Winter wrote:
>> The purpose of this note is to inquire about your collective interest
>> in optimizing rsync for certain uses, particularly atomic,
>> unidirectional transfers with few or single writer and many, often
>> very many, readers.
>>
>> After studying the requirements and the documentation on samba.org I
>> have reached the tentative conclusion that serious improvements can be
>> obtained with a few small tweaks and one medium-sized internal change.
>
> Would you care to elaborate to attract more comments?

OK, taking that as an affirmation that there exists at least a minimal
level of interest, there seem to be three possible topics: goals,
theory, and implementation.  Since the goals and associated rationales
are the only things worth discussing a priori, I'll start with them
and then present some theory to demonstrate the feasibility of the
objectives.

The use case that needs some optimization is that of online
repositories -- mirrors.  In contrast to other kinds of usage such as
file synchronization, replication, backup, etc., mirrors present a
quite different set of needs.  The most important distinction is that
mirrors, especially hierarchical collections, have high reader/writer
ratios.  They also have a very disciplined update process and it is
quite beneficial if that update process is as close to atomic as
possible.

The issue with the current implementation of rsync is that it imposes
a heavy load on the source mirror (sender in rsync terminology).  The
load is composed of two components, one being IO necessary to scan the
file system and the other being the computational cost of the delta
calculations.

The burden of those loads tends to reduce the maximum fan-out of the
mirror hierarchy.  It also tends to lower the ceiling on the update
frequency, and a high update frequency would be useful.  At present
the IO overhead of each update is a constant, so it sets a hard lower
bound on the time it takes to perform an update.  That minimum time
both widens the non-atomic gap during the update and reduces the
number of clients that can be serviced by each source.

I propose to eliminate both forms of load.

The end-point goal is to have source mirrors that are completely
passive, which means they simply act as specialized (rsync-specific)
file servers, but have no burdensome role in the rsync protocol.

The theory behind this proposal has two major components.  One is
related to the phases of the rsync application and the other is
related to the rsync protocol.  Changing the phases of the application
should not be hard.  The only requirement is to expose the file list.
A simplistic approach would be to add a file-list option able to
create-and-save a new file list or load-and-use an existing file list.
This would permit a newly updated mirror to perform the file list
generation once per update rather than once per client.
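
To make the file-list idea concrete, here is a minimal sketch in
Python.  It is only an illustration under my own assumptions: the
function names and the JSON layout are hypothetical and are not
rsync's internal file-list format or an existing rsync option.  The
point is simply that the tree is scanned once and the result is reused
for every client.

```python
import json
import os
import stat

def build_file_list(root):
    """Walk root once and record path, size and mtime of each regular file."""
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            st = os.lstat(full)
            if not stat.S_ISREG(st.st_mode):
                continue  # this sketch ignores symlinks and special files
            entries.append({
                "path": os.path.relpath(full, root).replace(os.sep, "/"),
                "size": st.st_size,
                "mtime": int(st.st_mtime),
            })
    return entries

def save_file_list(entries, list_path):
    """Create-and-save: done once, right after the mirror is updated."""
    with open(list_path, "w", encoding="utf-8") as fh:
        json.dump(entries, fh)

def load_file_list(list_path):
    """Load-and-use: done per client, with no file system scan."""
    with open(list_path, encoding="utf-8") as fh:
        return json.load(fh)
```

With something like this, the per-client cost on the source drops from
a full tree walk to reading one file.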

The second component of this proposal is to optimize the roles of the
sender and receiver within the rsync protocol.  The proposed change
should be transparent and easy to add while maintaining backward
compatibility.

At present the computation of the delta information has specific,
non-symmetrical roles assigned to the sender and receiver.  The key
distinction is that one end has possession of the "basis file" and
needs only to compute the block checksums and send them over the wire.
The other end must compute the rolling block checksums and compare
them to the hash table of the block checksums of the basis file.  The
latter processing represents a substantial computational burden.
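
The asymmetry between the two roles can be sketched as follows.  This
is a simplified illustration of the general rolling-checksum
technique, not rsync's actual code: the names are mine, SHA-256 is
used as a stand-in strong hash, and a trailing partial block of the
basis file is simply ignored.

```python
import hashlib

M = 1 << 16

def weak_checksum(block):
    """Two 16-bit sums over a block; cheap to slide by one byte."""
    a = sum(block) % M
    b = sum((len(block) - i) * byte for i, byte in enumerate(block)) % M
    return a, b

def roll(a, b, out_byte, in_byte, block_len):
    """Update (a, b) when the window moves one byte to the right."""
    a = (a - out_byte + in_byte) % M
    b = (b - block_len * out_byte + a) % M
    return a, b

def block_signature(basis, block_len):
    """Cheap role: one weak and one strong checksum per whole block."""
    table = {}
    for index in range(len(basis) // block_len):
        block = basis[index * block_len:(index + 1) * block_len]
        strong = hashlib.sha256(block).digest()
        table.setdefault(weak_checksum(block), []).append((strong, index))
    return table

def find_matches(data, table, block_len):
    """Expensive role: test a window at every byte offset of data."""
    matches = []  # (offset in data, block index in basis)
    if len(data) < block_len:
        return matches
    pos = 0
    a, b = weak_checksum(data[:block_len])
    while True:
        hit = None
        for strong, index in table.get((a, b), ()):
            window = data[pos:pos + block_len]
            if hashlib.sha256(window).digest() == strong:
                hit = index
                break
        if hit is not None:
            matches.append((pos, hit))
            pos += block_len  # skip past the matched block
            if pos + block_len > len(data):
                break
            a, b = weak_checksum(data[pos:pos + block_len])
        else:
            if pos + block_len >= len(data):
                break
            a, b = roll(a, b, data[pos], data[pos + block_len], block_len)
            pos += 1
    return matches
```

block_signature() touches each block once, while find_matches() may
do a table lookup at every byte offset, which is where the load comes
from.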

In the existing implementation the "basis file" role is assigned to
the receiver.  That actually makes sense if the overall intention is
to fully optimize the use of the connection because the sender needs
the deltas' descriptors before it can transmit them to the receiver
along with the deltas' content.  So in a symmetrical environment it
makes sense to generate the deltas on the sender, because that
contributes to a very serious benefit that rsync offers, which is
pipelining the pipe (sorry).

But in a severely asymmetrical environment where the goal is minimal
load on the sender, it would be better to swap the roles by using the
sender's newer version as the "basis file" and let each receiver chew
through its comparisons rather than having the sender perform the
comparisons for all of the receivers.

Of course the fact that the sender's files are static and only need to
be block checksummed once per update is an additional optimization
that comes almost for free (it can be done with a utility based on the
rsync lib if it is not worth adding a second option to the rsync app
to compute and save the block checksums).
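
As a rough sketch of such a utility, again in Python and again under
my own assumptions: the ".sig" suffix, the JSON layout and the use of
SHA-256 are all hypothetical, chosen only to show the
compute-once-and-save pattern.

```python
import hashlib
import json
import os

def write_signature(path, block_len=4096):
    """Hash each fixed-size block of path once and save the result beside it."""
    digests = []
    with open(path, "rb") as fh:
        while True:
            block = fh.read(block_len)
            if not block:
                break
            digests.append(hashlib.sha256(block).hexdigest())
    st = os.stat(path)
    sig = {"block_len": block_len, "size": st.st_size,
           "mtime_ns": st.st_mtime_ns, "blocks": digests}
    with open(path + ".sig", "w", encoding="utf-8") as out:
        json.dump(sig, out)
    return sig

def load_signature(path):
    """Return the saved signature, or None if it is missing or stale."""
    try:
        with open(path + ".sig", encoding="utf-8") as fh:
            sig = json.load(fh)
    except FileNotFoundError:
        return None
    st = os.stat(path)
    if (sig["size"], sig["mtime_ns"]) != (st.st_size, st.st_mtime_ns):
        return None  # file changed since the signature was written
    return sig
```

The source would run write_signature() once per update and then hand
out the saved signatures like any other static file.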

A word about priorities is probably called for. If only one component
of this proposal can be accepted then clearly the externalization of
the file list is the most important issue.  The reasons for that are
as follows:

1.  CPU performance is increasing faster than disk performance, so
eliminating the IO burden is the bigger win.

2.  Repositories tend to have files that are already fairly dense.  So
they probably don't benefit all that much from the delta handling.  So
if the "basis file" can't be swapped to the sender then the
computational load can still be eliminated by using --whole-file mode
despite the small loss in transport efficiency.  I admit that I have
not tested this premise.  But, while I did see some traffic about the
conflict between delta processing and compressed files, I never
located a resolution or summary of the issue.  Does the current
version of rsync actually provide much benefit for compressed files?
(My 80G mirrors consistently report "speed up is 1.00" with an
occasional 1.01).

3.  Most mirrors offer methods for slicing and dicing the repository
along several different dimensions, so having external access to the
file list will make that aspect of mirroring much simpler.

My conclusion is that having external access to the file list yields a
vast improvement even without the proposed change in the basis file.

This describes the WHY and the WHAT, but not the HOW.  If the WHY is
good enough and the WHAT sounds reasonable it would be worth
discussing the HOW and then I think the WHO.

Naturally if this explanation is not sufficient I will be happy to
/g/a/r/b/l/e/ explain it further.

Lee Winter
NP Engineering
Nashua, New Hampshire