[linux-cifs-client] Re: [PATCH 6/6] cifs: don't retry on blocking sends

Jeff Layton jlayton at redhat.com
Mon Dec 1 16:40:26 GMT 2008


On Mon, 1 Dec 2008 09:19:29 -0600
"Shirish Pargaonkar" <shirishpargaonkar at gmail.com> wrote:

> On Mon, Dec 1, 2008 at 6:35 AM, Jeff Layton <jlayton at redhat.com> wrote:
> > On Sun, 30 Nov 2008 20:31:54 -0600
> > "Steve French" <smfrench at gmail.com> wrote:
> >
> >> The large write case is unique for a few reasons (the write past the
> >> end of file is treated differently for the timeout on waiting for a
> >> response, but that is not the issue here on the tcp send), but
> >> primarily because a shorter timeout will increase the possibility of
> >> data corruption, and the write SMB request is so much larger than the
> >> typical case.
> >>
> >> Waiting too long (20 seconds or more) on a firewall or stuck tcp
> >> session could be very annoying for the other 90%+ of the (non-SMB
> >> write) requests though and would make the cifs mount appear
> >> unresponsive.
> >>
> >
> > One of the things I'm working toward here is some consistency in how
> > "hard" and "soft" mount options are implemented. While I haven't tested
> > them extensively, I'm pretty convinced from looking at the code that
> > neither really works as advertised. Thinking this through...
> >
> > We should only block if we're unable to buffer the data for the send.
> > Either:
> >
> > a) the system is low on memory and can't allocate enough to copy
> >
> > b) the socket buffers are full, and can't be grown any larger
> >
> > I think "b" is the main worry. That should mainly happen when we've
> > buffered up a bunch of sends and the receive window on the other end is
> > closed.
> >
> > We could do different timeouts for "large" and "small" calls, but
> > what will that mean in practice? In many (most?) cases we're not
> > going to return an error to the user on a timeout; we're going to
> > return -EAGAIN to the caller of SendReceive or SendReceive2, and that
> > caller is going to retry. A smaller timeout doesn't seem like it will
> > make a difference in most cases...
> >
> > If we do decide that different timeouts here give us something, I still
> > think it would be better to just set the socket timeout properly before
> > doing the call. We can call kernel_setsockopt or just set sk_sndtimeo
> > directly (they're equivalent AFAICT). The perf impact there will be low.
> 
> So why was the kernel_setsockopt function created?  If setting a socket
> option directly is equivalent to using kernel_setsockopt to set the same
> option, why use kernel_setsockopt at all?
> Does kernel_setsockopt do any additional error checking or validation of
> the values to be assigned that would make it useful/advantageous
> compared to setting an option directly?
> 

I assume it's simply to provide kernel callers with an interface
consistent with userspace setsockopt.

Some SO_* options do a bit more than just set a variable. In the case of
SO_SNDTIMEO/SO_RCVTIMEO though, it essentially just sets the
corresponding field in the struct sock (after some bounds checking,
etc). Given that we'd be setting these values to something sane and
well-defined, we'd probably be safe to bypass the extra checks and just
set the field. But it's more future-proof to use kernel_setsockopt()
(in case later changes make it do something more there).
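
To make that concrete, here's a rough sketch of what I have in mind
(untested; "ssocket" stands in for the server's struct socket pointer
as in TCP_Server_Info, and the helper name is made up):

    #include <linux/net.h>
    #include <net/sock.h>

    /* Set the socket send timeout before a blocking send. The timeout
     * is passed in seconds; kernel_setsockopt does the bounds checking
     * and converts it to jiffies for us. */
    static int cifs_set_sndtimeo(struct socket *ssocket, long secs)
    {
            struct timeval tv = { .tv_sec = secs, .tv_usec = 0 };

            return kernel_setsockopt(ssocket, SOL_SOCKET, SO_SNDTIMEO,
                                     (char *)&tv, sizeof(tv));
    }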

IMO, it doesn't look terribly inefficient to just call kernel_setsockopt
here. The call chain would just be:

kernel_setsockopt
   sock_setsockopt
      sock_set_timeout

...and we only need to do that if it actually needs to be changed.
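
Since sk_sndtimeo is stored in jiffies, we can check it cheaply and skip
the call entirely when the timeout is already what we want. Again, just
a sketch with a made-up helper name:

    /* Only touch the socket option when the desired timeout (in
     * jiffies) differs from what is already set on the socket. */
    static int cifs_update_sndtimeo(struct socket *ssocket, long timeo)
    {
            struct timeval tv;

            if (ssocket->sk->sk_sndtimeo == timeo)
                    return 0;

            tv.tv_sec = timeo / HZ;
            tv.tv_usec = 0;
            return kernel_setsockopt(ssocket, SOL_SOCKET, SO_SNDTIMEO,
                                     (char *)&tv, sizeof(tv));
    }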

That seems more efficient than having to call kernel_sendmsg in a loop
and makes the cifs code a lot simpler.

That said, I'm not convinced that it's a good idea to have different
timeouts for different calls. A consistent timeout that works for all
calls seems like the most reasonable thing to do. In most situations,
the caller will just retry and we'll end up back down in smb_send*
anyway, so the processes will be hung regardless.

A reasonably long timeout will be easier to document and explain to
users too. We have too many special cases in CIFS as it is. Simplifying
this code would make all of our lives easier. If there is a need for
it later, we can always add the complexity back in...

Just my US $.02...

-- 
Jeff Layton <jlayton at redhat.com>

