The SND/RCV LO/HI WAT options
David Collier-Brown
davecb at canada.sun.com
Thu Jun 10 13:47:00 GMT 1999
Majid Tajamolian wrote:
> > > I'm working on SAMBA performance enhancement. Can anyone guide me for a
> > > reference about the 4 WATer mark socket options and their effects on
> > > TCP/IP performance?
> > > Does anyone have any experience with using them in the Samba code?
I asked:
> > A colleague has been experimenting with them in a non-samba
> > context, and we've been looking at their behavior quantitatively:
> > Alas, I'm not doing the experiments myself...
> >
> > What would you like to know?
> >
> It seems that anything about them could be useful (i.e. their meaning, how
> they change the TCP behavior, and how they relate to other
> settings such as SNDBUF, RCVBUF, ...).
> If you don't know about them, can you help me to find some information in
> the internet?
The best reference is Stevens' TCP/IP Illustrated,
and a colleague and I have been playing with the
high- and low-water marks in a group of Solaris
servers and clients on 100baseT ethernet.
Let's look at it in /net3 terms (BSD & Linux):
On the send side, an application writes a big chunk
of data, say 10 KB. The so_send processing hands
this off in 2048-byte mbufs to the lower level of
the protocol, until there is enough data sent but
not acknowledged to exceed the high-water mark, which
is at 8k. At this point the sending stops until
some acks come in. The sending process is suspended.
The process is not awakened until the send-but-unacked
data falls below the low-water mark, 2 KB. At this point
it sends the next 2 KB and is done for the moment.
In a large-data scenario (which is what I have at work),
you want the high-water mark to be set high enough that
1) the expected write size is below the
high-water mark (minimum), and
2) the mark is high enough that enough data will
be transferred and acked on back-to-back writes
that there will be at least a write's worth of
space left below the high-water mark (optimum)
Let's assume Samba is being used to access big database relations,
and max xmit is at its default setting of 65,535.
Logically, you'd
1) want the high-water-mark set above 64 KB
That's easy: the /net3 maximum is 262,144. (The Solaris
maximum is different, and affects the TCP scaling options.)
2) want there to be at least 64KB free during back-to-back
transfers.
This is harder: you'll have to measure to see how fast samba
can issue transfers back-to-back (by adding some timers), then
compute how much data your ethernet transfers in the same period
and then set the high-water mark to suit. Stevens talks about
the bandwidth-delay product: this is another place it comes into play.
If you don't want to go the analytic route, you can just do
experiments and plot high-water mark -vs- throughput.
A test with no load on the ethernet will give you a curve that
**underestimates** the buffering needed. Do your benchmarking in
a production environment, in a busy period!
If you can't, I'd use a packet sniffer (snoop, in my case) to see
what your "normal" load is, then run a bunch of ftp jobs on another
pair of machines to simulate it on your test net.
<simulated guru hat on>
I'd expect the curve to look like this
_____
___/ ---__
_/
/
/
|
|
+------+-------+----
0 default lots
In other words, the performance when it's small would be
badly throttled. When it's "enough" performance will
jump up quickly, rise slowly to a peak and then drop off
when big buffers starve other parts of the system
for memory.
<guru hat off>
The gentle curve at the top of the graph is caused by
probabilistic effects: every once in a while, there's a burst
of traffic, Samba doesn't get enough transferred in time and
the process hits the high-water mark and is suspended until
the data drains. Add a bit more buffer and the probability
is reduced.
In my opinion, the important thing to know is where the performance
jumps upward, measured on 10 mbit/sec ethernet for low, "average"
and near-saturation loads. THAT curve is interesting to someone
doing performance tuning, as it will tell them
what's the minimum they must have
what they need in the worst case, and
the shape of the curve between those points.
A concave curve means you can be low without much risk: a convex
one means you'd better get as close to the maximum as you can.
One known good point is the default: it's known to be sane for
10 mbit/sec ethernets and ftp sending 24-odd-KB worth of buffers.
For 100, we don't need quite as much as the net acks data faster.
For T1 and below, I have to ask a non-simulated guru!
--dave
--
David Collier-Brown, | Always do right. This will gratify some people
185 Ellerslie Ave., | and astonish the rest. -- Mark Twain
Willowdale, Ontario | http://java.science.yorku.ca/~davecb
Work: (905) 477-0437 Home: (416) 223-8968 Email: davecb at canada.sun.com