How to debug 100x variance in writing the same file?
gerth-samba at gpo.stanford.edu
Sun Jan 23 04:36:43 GMT 2005
Richard Sharpe wrote:
> On Fri, 21 Jan 2005, John Gerth wrote:
>>>OK, to be more specific, if a packet is dropped towards the end of a
>>>read response (or a write request) where there are no subsequent segments
>>>to trigger normal TCP behavior (fast-retransmit) then we will see big
>>>reductions in throughput as Chris mentions.
>> Is the packet timeout you mention configurable for SMB or is
>> it TCP specific for the OS?
> This is a function of the TCP layer. It computes a retransmit timeout
> (RTO), which is typically a function of the smoothed round trip time that
> the TCP layer is seeing for the connection.
> When a segment gets dropped by the network, if there is more data coming,
> then the subsequent segments will cause the receiver to issue acks
> pointing to the dropped segment (there is an RFC on this, and Richard
> Stevens' book has a good description). Once the sender sees a couple of
> acks for a segment it has already sent it is supposed to resend that
> segment on the assumption that the receiver hasn't got it because it got
> dropped. However, at the end of a higher layer request or response, there
> are no more segments, and we have to wait for RTO to go off.
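The tail-loss case described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: exactly one segment in a burst is lost, every later segment arrives and triggers an immediate duplicate ACK, and nothing else follows the burst. The function name `recovery_mode` and the `dupack_threshold` parameter are made up for this example; three duplicate ACKs is the usual fast-retransmit threshold.

```python
def recovery_mode(total_segments, dropped_index, dupack_threshold=3):
    """Return how a sender would notice the loss of one segment in a burst.

    Segments are numbered 0..total_segments-1 and only `dropped_index` is lost.
    Hypothetical helper for illustration; not taken from any TCP stack.
    """
    if not 0 <= dropped_index < total_segments:
        raise ValueError("dropped_index must name a segment in the burst")
    # Each segment that arrives after the hole makes the receiver repeat its
    # ACK for the missing segment: one duplicate ACK per later segment.
    dup_acks = total_segments - 1 - dropped_index
    if dup_acks >= dupack_threshold:
        return "fast-retransmit"
    # Too few later segments: the sender can only wait for the RTO timer.
    return "rto-timeout"


if __name__ == "__main__":
    # A 10-segment response with a loss early, near the end, and at the end.
    for dropped in (2, 8, 9):
        print(dropped, recovery_mode(10, dropped))
```

A drop early in the burst is repaired by fast retransmit, while a drop in the last couple of segments leaves the sender with nothing to do but wait for the retransmit timer.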
It makes sense that it's all handled by the TCP layer. I guess I was
confused by the earlier post which said that TCP apps like scp and ftp
(I had used scp to exonerate the network layer) would recover faster
from drops than Samba would. That implied that not all TCP apps are
equal and so I thought maybe it was at least partly configurable.
>> I'm asking because the delay I observe on the client's Task Manager network
>> activity graph is on the order of *tens of seconds* of no activity.
> Then there is something really wrong. RTO is often set to something like
> four times the SRTT value, and I have not normally seen anything more than
> a few milliseconds.
> A trace would be useful, as we said before, so we can see where the delay
> is and what is actually causing it. It would be useful to mention again
> what OSes are involved.
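For reference, here is a rough sketch of how an RTO is usually derived from RTT samples, following the standard estimator (RFC 2988, later RFC 6298): RTO = SRTT + 4 * RTTVAR, with gains of 1/8 and 1/4. The class name `RtoEstimator`, the helper `total_stall`, and the `min_rto`/`max_rto` defaults are assumptions for this example. The RFC recommends a 1-second floor, but real stacks differ, so the floor is a parameter here. The second function only shows how doubling the timer on each successive timeout adds up; it is not a diagnosis of the stall reported in this thread.

```python
class RtoEstimator:
    """Sketch of the standard SRTT/RTTVAR estimator (times in seconds)."""

    ALPHA = 1 / 8  # gain for the smoothed RTT
    BETA = 1 / 4   # gain for the RTT variation
    K = 4          # multiplier applied to the variation

    def __init__(self, min_rto=1.0, max_rto=60.0):
        # min_rto/max_rto are assumed bounds; actual values are OS specific.
        self.min_rto = min_rto
        self.max_rto = max_rto
        self.srtt = None
        self.rttvar = None

    def update(self, rtt):
        """Feed one RTT sample and return the resulting RTO."""
        if self.srtt is None:
            # First measurement initialises both state variables.
            self.srtt = rtt
            self.rttvar = rtt / 2
        else:
            # RTTVAR is updated before SRTT, using the old SRTT.
            self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - rtt)
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt
        raw = self.srtt + self.K * self.rttvar
        return min(self.max_rto, max(self.min_rto, raw))


def total_stall(initial_rto, timeouts, max_rto=60.0):
    """Sum the waits if the same segment times out `timeouts` times in a row."""
    stall, rto = 0.0, initial_rto
    for _ in range(timeouts):
        stall += rto
        rto = min(max_rto, rto * 2)  # exponential backoff, capped
    return stall


if __name__ == "__main__":
    lan = RtoEstimator(min_rto=0.0)  # no floor, to expose the raw formula
    for sample in (0.002, 0.002, 0.003):
        print(f"rtt={sample:.3f}s rto={lan.update(sample):.4f}s")
    print("stall after 5 consecutive timeouts from 1s:", total_stall(1.0, 5))
```

With no floor, a 2 ms LAN round trip gives an RTO of a few milliseconds. A single timeout therefore cannot explain tens of seconds of silence on its own, though repeated timeouts with backoff, or a large floor, could add up. A trace would show which, if either, applies.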
I will work to get some traces after the SIGGRAPH deadline passes this week.
I do think it may be an OS issue, since the server is Apple OS X 10.3 (Samba 3.0.5).
The clients are W2K+(SP4+patches) and WinXP+(SP2+patches). I don't think there
are hardware/wiring faults, as no client has a problem with our W2K3 server.
Ultimately I can also test a Linux/Samba server, but we picked the Apple some
time ago because it gave us off-the-shelf auth for Unix/Linux/Windows.
Now there's the XAD option, too.
Thanks for the tutorial,
More information about the samba-technical