<br><br><div class="gmail_quote">On Fri, Dec 3, 2010 at 2:21 PM, Volker Lendecke <span dir="ltr">&lt;<a href="mailto:Volker.Lendecke@sernet.de">Volker.Lendecke@sernet.de</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">

<div class="im">On Fri, Dec 03, 2010 at 01:50:11PM -0500, Jeff Layton wrote:<br>

&gt; &gt; Probably needs two tests.  One to see what happens if the (single)<br>

&gt; &gt; connection is lost, and another to see what happens if a single operation<br>

&gt; &gt; takes a very, very long time to complete (as you describe).<br>

&gt; &gt;<br>

&gt;<br>

&gt; I did an experiment with this on win2k8. I first doctored an smbd to<br>

&gt; discard write requests. When I try to copy a file to this host (via<br>

&gt; copy.exe), the server usually waits a little while (the time seems to<br>

&gt; vary between 30-60s or so), sends a single echo request and then<br>

&gt; reconnects the socket if it still doesn&#39;t get a write reply in about<br>

&gt; 30s. copy.exe then says &quot;The specified network name is no longer<br>

&gt; available.&quot; Heh.<br>

&gt;<br>

&gt; That said, the behavior seems to be really inconsistent. In at least<br>

&gt; one case, no echo was sent and the socket was shut down &lt;30s after the<br>

&gt; write request was sent.<br>

&gt;<br>

&gt; The timeout before sending an echo also seems to vary quite a bit. My<br>

&gt; suspicion is that that indicates that the client has the echo ping on a<br>

&gt; separate timer, and just selectively sends it whenever the timer pops<br>

&gt; based on certain criteria.<br>

<br>

</div>Probably all this timeout stuff varies too much with<br>

different application behaviours. I have the same discussion<br>

right now with the opposite direction: How can a server<br>

reliably tell that a client died hard? The question here is:<br>

When can we reliably throw away share mode entries? A<br>

colleague just measured a W2k8 timeout of 5 minutes in this<br>

case, but is this dependable? I suspect we have to develop<br>

our own policies for this.<font color="#888888"></font><br></blockquote><div><br>A loosely related question is whether POSIX forbids<br>returning EIO or EHOSTDOWN from some syscalls.  If the standard<br>does forbid them, then at least for those syscalls POSIX<br>

clients can never time out (or must time out and then<br>cancel and resubmit the request and/or reconnect transparently).<br>Currently, writes beyond end of file (and operations on<br>offline files) are the only known special cases where a timeout would<br>

be inappropriate, but we may find other syscalls where it<br>would be inappropriate for a client to return an error to the user.<br><br>For Windows (whose behavior may differ slightly from<br>POSIX but is still important for implementers to understand),<br>

it would be helpful to know which operations<br>are allowed to return errors to the user (if the host<br>hangs or goes down) and which must retry forever.<br></div></div><br><br>-- <br>Thanks,<br><br>Steve<br>