request: add TCP buffer options to rsync CLI?
Lawrence D. Dunn
ldunn at cisco.com
Mon Nov 7 13:40:43 GMT 2005
Jason,
Summary (a guess on my part): "maybe", w.r.t. whether larger buffers
will help. Depending on which satellite height you're using, what your
current default buffers are, and which operating system, 1-4 Mbps is on
the edge where more-than-default buffers are useful.

(But it's equally likely, perhaps more likely, that you're getting just
enough loss on such a long path that TCP is frequently getting its rate
reduced; and on long paths it can take a fairly long time to climb back
up to "full rate".)

More detail:
Jason - after looking at the following, if you'd like me to try to help
further, but conclude the discussion will bore rsync list members, feel
free to reply unicast. If you think it'll be "interesting" to list
members, OK...

Stuff that will help you (and us) figure out what the core issue is:
1. One key item is to figure out the bandwidth*delay product of the
path. The whole point of adjusting TCP buffers is that the sender's
transmit buffer and the receiver's receive buffer need to be large
enough to "fill the pipe", i.e. at least as large as the
bandwidth*delay product (Mbits/sec * RTT_seconds * (1 byte/8 bits)
--> xx bytes). Depending on implementation, sometimes the buffers
need to be 2-3x this value.
2. What's the ping time to the other end? Knowing the round-trip time
will help figure out whether it's a Low-, Medium-, or High (geosync)
earth-orbit satellite (perhaps you already know...). LEOs don't usually
have a huge in-air delay, so it's less likely that underbuffering is
the culprit. But you mentioned inter-continental, so the delay will
likely be large anyway. Either way, it'll help estimate
bandwidth*delay.
Example: a geosync might have 500 msec RTT (1/2 second). At 4 Mbits/sec,
that's 0.5 MBytes/sec * 0.5 sec --> 250 KBytes. Most systems don't have
a 250 KByte default, so a means of requesting buffers would be needed.
(64 KB is a pretty common default, as is 32 KB, some 128 KB; the OS X
(10.4.x) machine from which I'm writing uses 63 KB.)
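For concreteness, the arithmetic above can be done in a one-liner
(the numbers are the hypothetical geosync figures from the example,
not measurements from your link):

```shell
# Bandwidth*delay product: bytes needed to "fill the pipe".
# Hypothetical link: 4 Mbit/s with a 500 ms (geosync) RTT.
BW_BITS_PER_SEC=4000000   # 4 Mbit/s
RTT_MS=500                # 500 ms round trip
# bytes = (bits/sec / 8) * (RTT_ms / 1000)
echo $(( BW_BITS_PER_SEC / 8 * RTT_MS / 1000 ))
# -> 250000 (i.e. ~250 KBytes)
```

Plug in your own measured ping time and link rate to size the buffers
(then allow 2-3x headroom, per the caveat above).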
3. What's the operating system on both ends? (Linux is simple to tweak;
some others I'm less familiar with...)
4. Do you know how the current TCP buffers are set up in the OS? (Some
OS's allow min/default/max settings for each direction, others
don't...) Aim any browser at: http://miranda.ctd.anl.gov:7123/
and that tool will tell you, among other things, what your system is
using for TCP buffers. You have to hit "statistics" after the test is
run to find that detail. The tool sends bytes towards you for 10 sec,
and pulls bytes from you for 10 sec, then makes inferences based on
data available from "web100" (see below) at the server.
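On Linux you can also check the current settings directly, without the
web tool. A sketch, assuming the usual 2.4/2.6-era sysctl names (they
can vary by kernel version):

```shell
# System-wide ceilings on what an application may request via setsockopt():
sysctl net.core.rmem_max net.core.wmem_max
# Per-connection TCP min/default/max, in bytes (three values each):
sysctl net.ipv4.tcp_rmem net.ipv4.tcp_wmem
```

Compare the default (middle) value against your bandwidth*delay
estimate from item 1.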
5. If you're using Linux, there's a nice set of tools called "web100"
(web100.org) that, via a kernel patch, make most (100+) of the internal
TCP variables accessible, and can be very helpful in trying to diagnose
these issues. Example: it lets you plot loss-over-time, to see if
that's suppressing TCP.
6. As a test, you can increase the buffers that every TCP session gets.
Not appropriate for long-term use on a general-purpose box, but would
allow testing the buffer hypothesis without rsync patching.
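A minimal sketch of such a test on Linux (run as root; the 256 KB
value is illustrative, sized for the ~250 KB example above):

```shell
# Temporarily give *every* TCP session larger default buffers.
sysctl -w net.core.rmem_default=262144
sysctl -w net.core.wmem_default=262144
# Re-run the rsync transfer and compare throughput; revert (or reboot)
# afterwards - defaults this large waste real memory on a
# general-purpose box, per item 9 below.
```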
7. If it turns out to be loss-related, rather than buffer-related, you
could explore using alternate TCP stacks. This is probably most
appropriate if you're the sole user of the link. Many of the stacks try
to figure out if they're in a "high-performance" environment and behave
more aggressively, while still trying to be "fair" to regular TCP. Some
do a better job of fairness than others, thus the caveat about "sole
user" above. Some of the stacks also attempt to address a "classic" TCP
problem, specifically, whether the loss is due to "congestion", where
backing off the rate is appropriate, or just due to "non-congestive
loss" (e.g. packet corruption, likely in a satellite environment), in
which case "backing off" isn't necessarily the right thing.
Example stack alternatives include: HS-TCP, FAST, H-TCP.
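On sufficiently new Linux kernels (2.6.13 and later) the congestion
control algorithm is pluggable and can be swapped via sysctl; whether a
given algorithm (e.g. H-TCP) is available depends on how the kernel was
built:

```shell
# Show which congestion-control algorithm is currently in use:
sysctl net.ipv4.tcp_congestion_control
# Try H-TCP instead, if the module is compiled in / loadable:
sysctl -w net.ipv4.tcp_congestion_control=htcp
```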
8. One other technique used on some satellite links is a "TCP proxy".
It's a bit controversial, but essentially "terminates" the TCP
connection at each ground station, so the sender sees a smaller RTT.
(Smaller RTT means TCP can increase its rate faster, and recover from
loss/congestion signals more quickly.) Then the proxy is free to do
"fancier" things across the satellite link, like extra Forward Error
Correction (FEC), or whatever. (Even without a TCP proxy, satellite
links often employ FEC to try to mask link errors.) If you'd like more
references on this topic, or on high-performance tuning generally, let
me know.
9. Re: "defaults" - it's a little tricky. The best scenario is if
you're running a current Linux kernel. Then you can set the "default"
to something like 65 KB, and the max to 8-10 MBytes, and the system
will auto-tune to find the largest useful buffers for both sender and
receiver. On any other systems (that don't autotune), processes that
don't specifically request a buffer (like using the options in the
rsync patch) will be stuck with whatever is "default" for your OS. If
it's too small (most are, for many combinations of link speed and
RTT), they will never be able to "fill the pipe". If you set the
default too big, you waste real memory, and might starve resources on
the machine (if too many processes gobble up too much real memory).
That's why the auto-tuning thing is so nice.
The key on such systems is to estimate the bandwidth*delay product of
a "typical" application (skeptics would ask: "is there such a
thing?"), and make sure you allow for that much in the default. Or, as
rsync might evolve, allow a generous system max, and let the user ask
for more-than-default when they know it's useful.
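A sketch of that "default ~64 KB, max ~8 MB" autotuning setup on a
Linux kernel that supports it (values are illustrative, not a
recommendation for every box):

```shell
# TCP autotuning triples: min / default / max, in bytes.
sysctl -w net.ipv4.tcp_rmem="4096 65536 8388608"
sysctl -w net.ipv4.tcp_wmem="4096 65536 8388608"
# net.core.*_max caps what setsockopt() may request (e.g. by a patched
# rsync asking for a specific buffer), so raise those ceilings too:
sysctl -w net.core.rmem_max=8388608
sysctl -w net.core.wmem_max=8388608
```

Add the same keys to /etc/sysctl.conf to make them survive a reboot.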
As I mentioned, feel free to let me know if I can help further
(but consider taking it off-list).
Larry
--
At 9:37 AM +1300 11/7/05, Jason Haar wrote:
>Can someone tell me if such options would help the likes of
>ourselves with more "old fashion" linkspeeds?
>
>We have 1-4Mbps VPN-over-Internet links (between continents - i.e.
>high latency), and routinely find rsync only capable of 300Kbps -
>and rsync is the best performer we can find. If we "parallelize"
>several rsync jobs (i.e. start a bunch at the same time), we can
>certainly chew up to 80% of the max bandwidth - so the raw
>throughput potential is there.
>
>Would such options help single rsync jobs? Actually, are there good
>default options in general that we could use that might help in our
>high speed, high latency environment?
>
>Thanks!
>
>--
>Cheers
>
>Jason Haar
>Information Security Manager, Trimble Navigation Ltd.
>Phone: +64 3 9635 377 Fax: +64 3 9635 417
>PGP Fingerprint: 7A2E 0407 C9A6 CAF6 2B9F 8422 C063 5EBB FE1D 66D1
>
>--
>To unsubscribe or change options:
>https://lists.samba.org/mailman/listinfo/rsync
>Before posting, read: http://www.catb.org/~esr/faqs/smart-questions.html