[PATCH] CIFS: Decrease reconnection delay when switching nics

Tom Talpey ttalpey at microsoft.com
Wed Feb 27 18:26:26 MST 2013

> -----Original Message-----
> From: linux-cifs-owner at vger.kernel.org [mailto:linux-cifs-owner at vger.kernel.org] On Behalf Of Dave Chiluk
> Sent: Wednesday, February 27, 2013 5:44 PM
> To: Steve French
> Cc: Jeff Layton; Stefan (metze) Metzmacher; Dave Chiluk; Steve French;
> linux-cifs at vger.kernel.org; samba-technical at lists.samba.org;
> linux-kernel at vger.kernel.org
> Subject: Re: [PATCH] CIFS: Decrease reconnection delay when switching nics
> On 02/27/2013 04:40 PM, Steve French wrote:
> > On Wed, Feb 27, 2013 at 4:24 PM, Dave Chiluk <dave.chiluk at canonical.com> wrote:
> >> On 02/27/2013 10:34 AM, Jeff Layton wrote:
> >>> On Wed, 27 Feb 2013 12:06:14 +0100
> >>> "Stefan (metze) Metzmacher" <metze at samba.org> wrote:
> >>>
> >>>> Hi Dave,
> >>>>
> >>>>> When messages are currently in queue awaiting a response, decrease
> >>>>> amount of time before attempting cifs_reconnect to SMB_MAX_RTT =
> >>>>> 10 seconds. The current wait time before attempting to reconnect
> >>>>> is 2*SMB_ECHO_INTERVAL (120 seconds) since the last response was
> >>>>> received.  This does not take into account the fact that messages
> >>>>> waiting for a response should be serviced within a reasonable
> >>>>> round trip time.
> >>>>
> >>>> Wouldn't that mean that the client will disconnect a good
> >>>> connection, if the server doesn't respond within 10 seconds?
> >>>> Reads and Writes can take longer than 10 seconds...
> >>>>
> >>>
> >>> Where does this magic value of 10s come from? Note that a slow
> >>> server can take *minutes* to respond to writes that are long past
> >>> the EOF.
> >> It comes from the desire to decrease the reconnection delay to
> >> something better than a random number between 60 and 120 seconds.  I
> >> am not committed to this number, and it is open for discussion.
> >> Additionally, if you look closely at the logic, it's not 10 seconds
> >> per request: once requests have been in flight for more than 10
> >> seconds, we check that we've heard from the server in the last 10
> >> seconds.
> >>
> >> Can you explain more fully your use case of writes that are long past
> >> the EOF?  Perhaps with a test case or script that I can test?  As far
> >> as I know, writes long past EOF will just result in a sparse file and
> >> return in a reasonable round trip time (that's at least what I'm
> >> seeing with my testing).  Running
> >>   dd if=/dev/zero of=/mnt/cifs/a bs=1M count=100 seek=100000
> >> starts receiving responses from the server in about .05 seconds, with
> >> subsequent responses following at roughly .002-.01 second intervals.
> >> This is well within my 10 second value.
> >
> > Note that not all Linux file systems support sparse files and
> > certainly there are cifs servers running on operating systems other
> > than Linux which have popular file systems which don't support sparse
> > files (e.g. FAT32 but there are many others) - in any case, writes
> > after end of file can take a LONG time if sparse files are not
> > supported and I don't know a good way for the client to know that
> > attribute of the server file system ahead of time (although we could
> > attempt to set the sparse flag, servers can and do lie)
> >
> It doesn't matter how long it takes for the entire operation to complete,
> just so long as the server acks something in less than 10 seconds.  Now the
> question becomes: is there an OS out there that doesn't ack the request or
> doesn't ack the progress regularly?
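
For readers following the thread, the check Dave describes above might be
sketched roughly as below. This is only an illustration of the logic as
stated in the discussion, not the patch itself: the struct, its fields, and
should_reconnect() are hypothetical names, times are plain seconds rather
than jiffies, and SMB_ECHO_INTERVAL = 60 is inferred from the "120 seconds"
figure quoted for 2*SMB_ECHO_INTERVAL.

```c
#include <assert.h>
#include <stdbool.h>

/* Values taken from the thread, in seconds. */
#define SMB_ECHO_INTERVAL 60
#define SMB_MAX_RTT       10

/* Hypothetical, simplified connection state; not the kernel's structure. */
struct conn_state {
    long last_response;   /* time the last response of any kind arrived */
    long oldest_request;  /* send time of the oldest unanswered request */
    bool has_pending;     /* true if any request is awaiting a response */
};

/* Returns true if the client should reconnect at time `now`. */
bool should_reconnect(const struct conn_state *c, long now)
{
    assert(now >= c->last_response);
    long silent_for = now - c->last_response;

    /* Existing rule: nothing heard for two echo intervals. */
    if (silent_for >= 2 * SMB_ECHO_INTERVAL)
        return true;

    /* Proposed rule: a request is overdue AND the server has been silent.
     * Any response within the last SMB_MAX_RTT seconds keeps the
     * connection alive, however long the oldest request has waited. */
    if (c->has_pending &&
        now - c->oldest_request > SMB_MAX_RTT &&
        silent_for > SMB_MAX_RTT)
        return true;

    return false;
}
```

Under this reading, a long-running write is only a problem if the server
sends nothing at all for more than SMB_MAX_RTT seconds while it is
outstanding, which is the case the rest of the thread is concerned with.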

SMB/CIFS servers will signal the operation "going async" by returning a
STATUS_PENDING response if the operation is not prompt, but this only
happens once. The client is still expected to run a timer, and recover from
possibly lost responses and/or unresponsive servers. Windows clients
extend their timeout when this occurs, typically quadrupling it.

Some clients will issue ECHO requests to probe the server in this
case, but it is neither a protocol requirement nor does it truly address
the issue of tracking each pending operation. Windows SMB2 clients
do not do this.
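
As a rough illustration of the per-request timer described above, a client
might track each pending operation along these lines. This is a minimal
sketch under stated assumptions, not any particular client's
implementation: the struct and function names are hypothetical, and
BASE_TIMEOUT is an arbitrary placeholder. Only the STATUS_PENDING value and
the factor of four come from the protocol and the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define STATUS_PENDING   0x00000103u /* NTSTATUS carried by the interim response */
#define BASE_TIMEOUT     30          /* seconds; hypothetical placeholder */
#define ASYNC_MULTIPLIER 4           /* "typically quadrupling", per the text */

/* Hypothetical per-request bookkeeping. */
struct pending_request {
    long sent_at;     /* time the request was sent, in seconds */
    long timeout;     /* current timeout for this request, in seconds */
    bool went_async;  /* true once STATUS_PENDING has been seen */
};

void request_init(struct pending_request *r, long now)
{
    assert(r != NULL);
    r->sent_at = now;
    r->timeout = BASE_TIMEOUT;
    r->went_async = false;
}

/* Called when a response status arrives for this request. The server
 * signals "going async" only once, so the timeout is extended only once. */
void request_on_status(struct pending_request *r, uint32_t status)
{
    assert(r != NULL);
    if (status == STATUS_PENDING && !r->went_async) {
        r->went_async = true;
        r->timeout *= ASYNC_MULTIPLIER;
    }
}

/* The client still runs its own timer, even after the interim response. */
bool request_expired(const struct pending_request *r, long now)
{
    assert(r != NULL);
    return now - r->sent_at > r->timeout;
}
```

The point of the sketch is that the interim response changes how long the
client waits, not whether it waits: a lost final response or an
unresponsive server is still detected by request_expired().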
