superlifter design notes and a new proposal
jw at pegasys.ws
Mon Aug 5 20:35:02 EST 2002
On Tue, Aug 06, 2002 at 01:09:46PM +1000, Martin Pool wrote:
> On 5 Aug 2002, Paul Haas <paulh at hamjudo.com> wrote:
> > > Fast networks are a significant use case that ought to be considered.
> > > On GigE, having low processing overhead and being able to use things
> > > like sendfile() to transmit data are going to be enormously more
> > > important than the length of the header.
I concur. GigE is a while away for most transfers (except
the TB+ ones), but I think we can say that the vast majority
of rsyncs probably fall into either 100Mb/s switched
networks or 384Kb/s-1.4Mb/s internet links.
> > Which is another situation where compression code in rsync may not be a
> > win.
> > Is there a big real world CPU penalty for letting ssh do the
> > compression?
> I doubt it.
There are three gains to be had from doing the compression
in rsync: it allows compression when the connection isn't
over ssh; it can use less CPU-intensive compression (lzo)
for mid-speed networks; and it reduces the amount of data
being copied. Sendfile is well and good, but as far as I
know ssh isn't going to take advantage of it. Except for
the non-ssh connection, I don't think they are worth the
added complexity.
Certainly putting the compression in is the last thing that
should be done. Let's leave it to the network layers for
now and get the rest working. Then try doing internal
compression with modified (get|send)_msg and see how much it
buys us over ssh.
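
As a rough illustration of what a modified (get|send)_msg
pair with optional internal compression might look like,
here is a minimal Python sketch. Everything in it is an
assumption made for illustration: the 8-byte header layout,
the FLAG_COMPRESSED bit, and the use of zlib at level 1 as a
stand-in for a cheaper compressor such as lzo (which is not
in the Python standard library). It is not the superlifter
wire format.

```python
import struct
import zlib

# Hypothetical fixed-size header: 4-byte payload length, 1-byte flags,
# 3 pad bytes, so every header is exactly 8 bytes.
HEADER = struct.Struct("!IB3x")
FLAG_COMPRESSED = 0x01  # hypothetical flag bit, not from any real protocol


def send_msg(payload: bytes, compress: bool = False) -> bytes:
    """Frame a payload, compressing it only when that makes it smaller."""
    flags = 0
    body = payload
    if compress:
        # zlib level 1 stands in for a low-CPU compressor such as lzo.
        packed = zlib.compress(payload, 1)
        if len(packed) < len(payload):
            body = packed
            flags |= FLAG_COMPRESSED
    return HEADER.pack(len(body), flags) + body


def get_msg(frame: bytes) -> bytes:
    """Parse one frame and return the original payload."""
    length, flags = HEADER.unpack_from(frame, 0)
    body = frame[HEADER.size:HEADER.size + length]
    if len(body) != length:
        raise ValueError("truncated frame")
    if flags & FLAG_COMPRESSED:
        return zlib.decompress(body)
    return body
```

With compression confined to these two functions, the
comparison against ssh compression comes down to calling
send_msg with compress=True or compress=False and measuring
bytes sent and CPU time for each on the same transfer.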
> > That would mean uncompressed bits go over the local socket, but the
> > network links are compressed.
> These two extremes are what lead me to think that micro-optimizing the
> headers is not a good idea.
> - for fast networks, being quick to handle (i.e. aligned, fixed-size)
> is more important than size
> - for ssh, you have to compress anyhow, so bumming bytes out is not
J.W. Schultz Pegasystems Technologies
email address: jw at pegasys.ws
Remember Cernan and Schmitt