[PATCH 1/2] cifs: change wait_for_free_request to take number of requests to wait for

Wed Mar 6 02:15:18 UTC 2019

On Wed, Feb 27, 2019 at 09:12 PM, ronnie sahlberg
<ronniesahlberg at gmail.com> wrote:
>
> On Thu, Feb 21, 2019 at 11:39 AM Pavel Shilovsky <piastryyy at gmail.com> wrote:
> >
...
> >
> > Waiting for more than one credit without a timeout is a source of
> > possibility to be stuck. We already have 'timeout' argument which
> > contains some part of flags passed to transport layer from PDU. We
> > might consider renaming 'optype' argument into flags and use 'timeout'
> > with its true meaning - timeout of the waiting for credits.
> >
> > Once implemented, we can define some default period of time which we
> > can *afford* to wait for credits and then fall back to sequential code
> > if we timed out.
> >
> > I also suggest to add another function wait_for_compound_credits and
> > make wait_for_free_request calls it - doesn't require changes to all
> > the places where the latter is being used.
>
> Thanks, I am reworking it and will resend shortly.
> Passing flags down into these functions instead of timeout/optype
> makes sense and I will do that.
>
> I am not sure we need to add a timeout to represent a deadline after
> we return -EAGAIN or similar.
> But for sure we need to prevent waiting atomically for >1 credits to
> block forever.

(adding Jeremy to comment and cc'ing samba-technical)

Thanks for taking a look at this.

If we don't set a timeout on waiting for several credits at once, we
may end up waiting forever. It is only a matter of what is a
possibility to be stuck. We can try to guess if the server is going to
grant us more or not in future by analyzing the number of requests in
flight or other parameters but we can't say for sure because the spec
doesn't say anything about minimal number of credits. This means that
under certain conditions the server may end up giving us a very few
credits that won't be enough to do compounding.

Those conditions might be different but from the top of the head I can
give an example of a highly loaded file server with thousands of
active connections that experiences CPU and/or memory pressure and
decides to limit the number of credits per connection. That's why I do
think that the ability to set a timeout on waiting is essential. We
can start with something between 100ms and 1s (and even making it
auto-tunable - let's say a couple round trips to the server) and then
decide what is the best (or even make it configurable).

> This could happen in two scenarios, the first would be if the number
> of free credits are persistently pegged at <=1 credits
> due to persistent and massive number of threads causing single credits
> SMB2 I/O.
> I.e. we are constantly starved for multi-credit requests.
>
> The second would be if we get stuck at <=1 available credits even when
> the session is idle.
> This would be a bug and the best is probably to let the reconnect
> eventually happen and thus force the credits to be in sync again
> between server and client.
> So I don't think we need t do anything explicit for that case here.
>
> The first case is a genuine case, but I think we can handle that in a
> different way instead of switching back to a "grab one credit at a
> time" like we have now.
> In that scenario we would have a different problem that is  there
> could be so many concurrent calls for compounded operations requiring
> >=3 credits each
> that we eventually consume MAX_CREDITS number of credits across these
> threads. Each thread having already allocated one of more credits less
> that it needs for the full compound
> and the the threads are now all deadlocked each waiting for one or
> more credits that will never become available.
> Making the allocation of credits for a compound allocate all of them
> atomically solves this type of deadlock, but would be prone to the
> credits starvation issue.

The approach of grabbing one credit in a time was a temporary work
around that didn't require a lot of code changes and was easier to be
pushed to stable kernels. There are no doubts that it left a
possibility to be stuck and we need a proper solution to handle
compound requests.

> The way I propose to solve the credits starvation problem for
> compounds is something like this (I haven't coded it yet so I dont
> know exactly what it will look like)
> * After the session has been fully established, we should be able to
> assume we WILL have a whole bunch of credits available.
> Modulo a hypothetical situation where the sever will just never grant
> us more than 1 or 2 credits in total. I don't think this is a
> reasonable situation we need to worry too much about and maybe the
> right thing to do here is just to block all compound
> operations until we get a session reconnect where hopefully the server
> will be more reasonable.
> * We can define a new MAX_CREDITS_FOR_COMPOUND which would be the
> largest number of credits that we will ever create a single compound
> for.
> I think that would be 4 at this stage.

Note, that we reserve 1 credit for echo request and 1 credit for
oplock/lease break ack, so those operations won't be stuck waiting for
credits. If the maximum number of requests in the compounded chain is
4, then 5 credits granted by the server will make our client useless
for certain operations.

> * In the loop in wait_for_free_credits(), If this is a request for a
> single credit AND iff it is a normal operation, i.e. &CIFS_OP_MASK ==
> 0
> then we will NOT grant the single credit but continue looping.
>
> I.e. basically reserving the last MAX_CREDITS_FOR_COMPOUND credits
> before we completely run out to only be available for compounds.
> That way we do guarantee that compound requests will be making
> progress and will not be completely starved by non-compound requests..
>

Trying to reserve MAX_CREDITS_FOR_COMPOUNDING increases latency of
single-credit requests. In some workloads compounding won't be even
used (let's say it is pure IO workload), so instead of using all the
bandwidth available to the client (and the credit bandwidth might be
pretty narrow for highly loaded servers) we will end up using only a
part of it.

But it is not only about latency of a single-credit request, it is
also about latency of a compounded request itself. Instead of waiting
for all the credits for a long period of time we can start making
progress by doing sequential operations. If the round trip time is
small, the penalty of indefinite waiting instead of going sequential
will be rather substantial. That's why waiting only for some
reasonable period of time seems like a good option to try: if we have
already waited for some reasonable time, let's stop waiting, return
special error code and let the upper layer process the request
sequentially.

So, I think that compounding is a great optimization and it is
important to try using it as much as possible across the client code.
In the same time, the SMB3 client should be capable to handle any
workload or scenario that doesn't contradict the spec.

Thoughts?

--
Best regards,
Pavel Shilovsky