request: add TCP buffer options to rsync CLI?

Mon Nov 7 13:40:43 GMT 2005

Jason,

Summary guess-on-my-part:
   "Maybe" (w.r.t. will_larger_buffers_help?).  Depending on which
   satellite-height you're using, what your current default buffers are,
   and which operating system, 1-4Mbps is on the edge where
   more-than-default buffers are useful.
   (But it's equally likely, perhaps more likely, that you're getting
   just enough loss, on such a long path, that TCP is frequently getting
   its rate reduced; and on long paths it can take a fairly long time to
   climb back up to "full rate".)
   More detail below...

More detail:
   Jason- after looking at the following, if you'd like me to try to help
   further, but conclude the discussion will bore rsync-list-members,
   feel free to reply unicast.  If you think it'll be "interesting" to
   list members, OK...

Stuff that will help you/(us) figure out what the core issue is:

   1. One key item is to figure out the bandwidth*delay product
     of the path.  The whole thing about adjusting TCP buffers is that
     the Sender-transmit-buf, and the Receiver-receive-buf, need to
     be large enough to "fill the pipe".  I.e., be as large as the
     bandwidth*delay product:
       (Mbits/sec * RTT_seconds * (1byte/8bits)) --> xx MBytes.
     Depending on implementation, sometimes the buffers need to be
     2-3x this value.

   2. What's the ping-time to the other end?  Knowing the round-trip-time will
     help figure out if it's a Low-, Medium-, or High(geosync)- earth-orbit
     satellite (perhaps you already know...).  LEO's don't usually have
     a huge in-air delay, so it's less likely that underbuffering is
     the culprit.  But you mentioned inter-continental, so the delay will
     likely be large anyway.  Either way, it'll help estimate bandwidth*delay.
     Example: a geosync might have 500msec RTT (1/2 second).
     At 4mbits/sec, that's 0.5MBytes/sec * 0.5sec --> 250KBytes.
     Most systems don't have a 250KByte default, so a means of requesting
     buffers would be needed. (64kB is a pretty common default, as is
     32k, some 128k; the OS X (10.4.x) from which I'm writing uses 63KB.)

   3. What's the operating system on both ends? (Linux is simple to tweak;
     some others I'm less familiar with...)

   4. Do you know how the current TCP buffers are set up in the OS?
     (Some OS's allow min/default/max settings for each direction,
     others don't....)
     Aim any browser at: http://miranda.ctd.anl.gov:7123/
     and that tool will tell you, among other things, what your system
     is using for TCP buffers. You have to hit "statistics" after the test is
     run to find that detail.  The tool sends bytes towards you for 10sec, and
     pulls bytes from you for 10sec, then makes inferences based on
     data available from "web100" (see below) at the server.

   5. If you're using Linux, there's a nice set of tools called "web100"
     (web100.org) that, via a kernel patch, make most (100+) of the internal
     TCP variables accessible, and can be very helpful in trying to diagnose
     these issues.  Example: it lets you plot loss-over-time, to see if
     that's suppressing TCP.

   6. As a test, you can increase the buffers that every TCP session gets.
      Not appropriate for long-term use on a general-purpose box, but would
      allow testing the buffer hypothesis without rsync patching.

   7. If it turns out to be loss-related, rather than buffer-related,
     you could explore using alternate TCP stacks.  This is probably most
     appropriate if you're the sole-user of the link.  Many of the stacks
     try to figure out if they're in a "high-performance" environment, and
     behave more aggressively, while still trying to be "fair" to
     regular-TCP.  Some do a better job of fairness than others, thus the
     caveat about "sole-user" above.  Some of the stacks also attempt to
     address a "classic" TCP problem, specifically, whether the loss is
     due to "congestion", where backing-off-the-rate is appropriate, or just
     due to "non-congestive loss" (e.g. packet corruption, likely in a
     satellite environment), in which case "backing off" isn't necessarily
     the right thing.
     Example stack alternatives include:  HS-TCP, FAST, H-TCP.

   8. One other technique used on some satellite links is "TCP proxy".
     It's a bit controversial, but essentially "terminates" the TCP connection
     at each ground-station, so the sender sees a smaller RTT.
     (Smaller RTT means TCP can increase its rate faster, and
     recover from loss/congestion_signals more quickly.)
     Then the proxy is free to do "fancier" things across the satellite
     link, like extra Forward-Error-Correction (FEC), or whatever.
     (Even without tcp-proxy, satellite links often employ FEC to
     try to mask link-errors.)  If you'd like more references on this
     topic, or on high-performance tuning generally, let me know.

   9. re: "defaults" - it's a little tricky.  The best scenario is if
     you're running a current Linux kernel.  Then you can set the "default"
     to something like 65kB, and the max to 8-10MBytes.  And the system
     will auto-tune to find the largest-useful buffers for both sender and
     receiver.  For any other systems (that don't autotune), processes that
     don't specifically request a buffer (like using the options in the
     rsync patch) will be stuck with whatever is "default" for your OS.
     If it's too small (most are, for many combinations of link-speed and
     RTT), they will never be able to "fill the pipe".  If you set
     the default too big, you waste real memory, and might starve resources on
     the machine (if too many processes gobble up too much real memory).
     That's why the auto-tuning thing is so nice.
     The key on non-autotuning systems is to estimate the bandwidth*delay
     product of a "typical" application (skeptics would ask: "is there such a
     thing?"), and make sure you allow for that much in the default.  Or,
     as rsync might evolve, allow a generous system-max, and let the user
     ask for more-than-default when they know it's useful.
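
A minimal sketch of the arithmetic in items 1 and 2, in Python.  The function
names are mine (purely illustrative) and it assumes decimal megabits; it only
restates the bandwidth*delay rule, it doesn't measure anything:

```python
def bdp_bytes(mbits_per_sec, rtt_seconds):
    """Bandwidth*delay product of a path, in bytes."""
    bits_per_sec = mbits_per_sec * 1_000_000   # assumes decimal megabits
    return bits_per_sec * rtt_seconds / 8      # 8 bits per byte


def max_mbits_per_sec(buffer_bytes, rtt_seconds):
    """Rough ceiling on one TCP session's rate for a given buffer size.

    At most one buffer's worth of data can be in flight per round trip.
    """
    return buffer_bytes * 8 / rtt_seconds / 1_000_000


# Item 2's example: geosync path, 4 Mbit/s, 500 msec RTT.
print(bdp_bytes(4, 0.5))                 # 250000.0 bytes, i.e. ~250 KBytes

# The same path with a common 64 kB default buffer.
print(max_mbits_per_sec(65536, 0.5))     # 1.048576 Mbit/s
```

With a 64kB buffer on a 500msec path the ceiling comes out near 1Mbit/sec no
matter how fast the link is, which is why default buffers can matter on a
1-4Mbps long-haul link.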

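For item 9's "let the user ask for more-than-default", this is roughly what a
per-process request looks like at the socket level.  It's a Python sketch, not
the rsync patch: request_buffers is a hypothetical helper, and the 250000
figure is just the geosync example from item 2.  The OS is free to clamp or
round what it grants (Linux reports about double the requested value, up to
the system max), so read the value back rather than trusting the request.
The receive buffer should be requested before connect() (or listen()), so the
TCP window-scale option is negotiated to match.

```python
import socket


def request_buffers(sock, nbytes):
    """Ask for nbytes of send and receive buffer; return what was granted.

    Hypothetical helper for illustration only.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, nbytes)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, nbytes)
    granted_snd = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    granted_rcv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return granted_snd, granted_rcv


s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
try:
    # Request the buffers before any connect() call would be made.
    granted = request_buffers(s, 250_000)
finally:
    s.close()

# Values vary by OS and by the configured system maximum.
print("send/recv buffers granted:", granted)
```

If the granted values come back well under the request, the system-max is the
limit, and that's the knob to raise before a per-process option can help.
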
As I mentioned, feel free to let me know if I can help further
    (but consider taking it off-list).

Larry
--

At 9:37 AM +1300 11/7/05, Jason Haar wrote:
>Can someone tell me if such options would help the likes of 
>ourselves with more "old fashion" linkspeeds?
>
>We have 1-4Mbps VPN-over-Internet links (between continents - i.e. 
>high latency), and routinely find rsync only capable of 300Kbps - 
>and rsync is the best performer we can find. If we "parallelize" 
>several rsync jobs (i.e. start a bunch at the same time), we can 
>certainly chew up to 80% of the max bandwidth - so the raw 
>throughput potential is there.
>
>Would such options help single rsync jobs? Actually, are there good 
>default options in general that we could use that might help in our 
>high speed, high latency environment?
>
>Thanks!
>
>--
>Cheers
>
>Jason Haar
>Information Security Manager, Trimble Navigation Ltd.
>Phone: +64 3 9635 377 Fax: +64 3 9635 417
>PGP Fingerprint: 7A2E 0407 C9A6 CAF6 2B9F 8422 C063 5EBB FE1D 66D1
>