[cifs-protocol] [Pfif] [REG: 110120160951867] Requesting clarification of CIFS client timeout behavior

Fri Dec 3 14:12:28 MST 2010

On Fri, Dec 3, 2010 at 2:21 PM, Volker Lendecke
<Volker.Lendecke at sernet.de> wrote:

> On Fri, Dec 03, 2010 at 01:50:11PM -0500, Jeff Layton wrote:
> > > Probably needs two tests.  One to see what happens if the (single)
> > > connection is lost, and another to see what happens if a single
> > > operation takes a very, very long time to complete (as you describe).
> > >
> >
> > I did an experiment with this on win2k8. I first doctored an smbd to
> > discard write requests. When I try to copy a file to this host (via
> > copy.exe), the server usually waits a little while (the time seems to
> > vary between 30-60s or so), sends a single echo request and then
> > reconnects the socket if it still doesn't get a write reply in about
> > 30s. copy.exe then says "The specified network name is no longer
> > available." Heh.
> >
> > That said, the behavior seems to be really inconsistent. In at least
> > one case, no echo was sent and the socket was shut down <30s after the
> > write request was sent.
> >
> > The timeout before sending an echo also seems to vary quite a bit. My
> > suspicion is that that indicates that the client has the echo ping on a
> > separate timer, and just selectively sends it whenever the timer pops
> > based on certain criteria.
>
> Probably all this timeout stuff varies too much with
> different application behaviours. I have the same discussion
> right now with the opposite direction: How can a server
> reliably tell that a client died hard? The question here is:
> When can we reliably throw away share mode entries? A
> colleague just measured a W2k8 timeout of 5 minutes in this
> case, but is this dependable? I suspect we have to develop
> our own policies for this.
>
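
As a rough sketch only, the client behavior Jeff observed could be modeled
as below. The 45 s and 30 s values and all of the names are assumptions
picked from the ranges he reports, not documented Windows constants, and
the sketch ignores the inconsistent case where no echo was sent at all.

```python
from enum import Enum


class Action(Enum):
    WAIT = "wait"
    SEND_ECHO = "send_echo"
    RECONNECT = "reconnect"


# Assumed values taken from the observed ranges; not documented constants.
ECHO_AFTER_S = 45.0    # echo was seen somewhere between 30 and 60 seconds
REPLY_GRACE_S = 30.0   # socket was reconnected roughly 30 seconds after the echo


def next_action(now, request_sent_at, echo_sent_at=None):
    """Decide what a client might do while a request is still unanswered.

    All times are in seconds on the same clock. echo_sent_at is None
    until an echo request has gone out.
    """
    if echo_sent_at is None:
        # No probe yet: send one once the request has been pending long enough.
        if now - request_sent_at >= ECHO_AFTER_S:
            return Action.SEND_ECHO
        return Action.WAIT
    # Probe already sent: give the server a grace period, then reconnect.
    if now - echo_sent_at >= REPLY_GRACE_S:
        return Action.RECONNECT
    return Action.WAIT
```

A caller would invoke next_action() each time its timer pops, which fits
the guess that the echo runs on a separate timer.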

A loosely related question is whether POSIX forbids returning
EIO or EHOSTDOWN from some syscalls.  If the standard did
specify that, then for those syscalls at least, POSIX
clients could never time out (or would have to time out and then
cancel and resubmit, and/or reconnect transparently).
Currently, writes beyond end of file (and operations on
offline files) are the only known special cases where a timeout would
be inappropriate, but we may find other syscalls where it
would be inappropriate for a client to return an error to the user.
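
To make the distinction concrete, here is a minimal sketch of a
per-operation timeout policy. The operation names, the table, and the
default are all hypothetical; only the two special cases come from the
paragraph above, and nothing here reflects what POSIX or any existing
client actually specifies.

```python
import errno
from enum import Enum


class OnTimeout(Enum):
    RETURN_ERROR = 1   # surface EIO or EHOSTDOWN to the caller
    RETRY = 2          # cancel/resubmit or reconnect; never fail the call


# Hypothetical policy table. Only the two known special cases are listed.
POLICY = {
    "write_past_eof": OnTimeout.RETRY,
    "offline_file_op": OnTimeout.RETRY,
}
# Assumption: anything not listed is allowed to time out.
DEFAULT_POLICY = OnTimeout.RETURN_ERROR

# EHOSTDOWN is not defined on every platform, so fall back to EIO.
EHOSTDOWN = getattr(errno, "EHOSTDOWN", errno.EIO)


def handle_timeout(op, host_down=False):
    """Return an errno to give the caller, or None to keep retrying."""
    if POLICY.get(op, DEFAULT_POLICY) is OnTimeout.RETRY:
        return None
    return EHOSTDOWN if host_down else errno.EIO
```

With a table like this, the open question becomes which entries the
standard would force into the RETRY column.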

For Windows (whose behavior may differ slightly from POSIX
but is still important for implementers to understand),
it would be helpful to know which operations
are allowed to return errors to the user (if the host
hangs or goes down) and which must retry forever.

-- 
Thanks,

Steve