[PATCH] Version 2: Patchset for bug #10344 - SessionLogoff on a signed connection with an outstanding notify request crashes smbd.

Tue Mar 11 14:59:08 MDT 2014

I'm replying to this out of order, as it (hopefully)
will make the organization of my thoughts a little
clearer :-).

I've attached an updated jra1.diff which fixes the
complaints you have, other than the ones where we
have a disagreement on API construction in the
server. But more on that at the end (and a proposed
solution :-).

Sorry - this is a really long reply. But it covers
a lot of ground (and I think I've found a bug in
metze.diff also - more on that below).

Metze if you want to jump to the bug in the logic
in your patch just search for :

******METZE***BUG***ALERT***********

below :-).

On Mon, Mar 10, 2014 at 02:41:48PM +0100, Stefan (metze) Metzmacher wrote:

> In order to make some progress on I hope we can agree on attachments
> tmp1.diff and tmp2.diff. And continue to discuss if we take metze.diff
> or an updated version of jra.diff.

I certainly agree on tmp1.diff and tmp2.diff, with the minor
change of the commit message should change from :

This is based on a patch from Jeremy Allison <jra at samba.org>.
Signed-off-by: Stefan Metzmacher <metze at samba.org>

to:

Signed-off-by: Stefan Metzmacher <metze at samba.org>
Signed-off-by: Jeremy Allison <jra at samba.org>

This is asychronous distributed pair-programming if
ever there was such :-).

> Why does it require 'struct smbd_smb2_request *smb2req_to_awaken'
> instead of just TALLOC_CTX *mem_ctx, we may want to use this
> for the terminate_connection code, which doesn't have a
> struct smbd_smb2_request.

Fixed. Good catch ! It is only being used
as a talloc context and so should be labelled
as such.

>> static int smbd_smb2_cleanup_wait_list_entry(struct smb2_request_wait_list *pwe)
>> {
>>        if (pwe->prev) {
>>                /*
>>                 * This is only the case when the SMB2 request to
>>                 * be awakened has finished and is being freed before
>>                 * the SMB2 request that should awaken us. This should
>>                 * be a very rare occurrence.
>>                 */
>>                struct smb2_request_wait_list *wl_head;
>>                DLIST_HEAD(pwe, wl_head);
>>                DLIST_REMOVE(wl_head, pwe);
>
> This doesn't cleanup smb2req_to_wait_for->wait_list !
>
> wl_head will have the same value as smb2req_to_wait_for->wait_list,
> but you have no chance to get the address of
> smb2req_to_wait_for->wait_list.
>
> So struct smb2_request_wait_list needs a pointer to smb2req_to_wait_for.

Fixed. *Great catch* :-). This is why I love
your code reviews. :-).

> Now a few more comments about jra.diff:
>
> 1) the session/tcon is not marked as DELETED before
> wait_for_outstanding_requests()
>  which means following request would be able to use the session/tcon
>  and the number of pending requests may not decrease.

Fixed. Although as I mentioned to you privately
I don't consider that as big a problem as you do.
However, the suggestion to just set the status
in the session/connection before setting up the
waits is certainly a good one !

> 2) find_outstanding_session_request() and find_outstanding_tcon_request()
>  both just skip requests with compound_related = true, which means
>  we don't wait for them to be completed and still have the original
> problem.

Actually the code in the tear-down I believe now
makes them safe from the original problem (in that
it's the initial 'hack' you mentioned which I originally
used but didn't like much :-).

But waiting on even compound_related = true requests
without cancelling them is a good idea, and the revised
jra.diff now does this.

> 3) you're not allowed to do anything after tevent_req_nterror()
>   other then 'return;' or 'return tevent_req_post(req, ev);'
>   DEBUG() or tevent_req_done(req); is wrong...

Fixed - thanks !

> 4) For think the jump between helper functions like
> wait_for_outstanding_requests,
>  find_outstanding_tcon_request, try_to_tear_down_tcon is a bit confusing.
>
>   I think some
> find_outstanding_session_request/find_outstanding_tcon_request
>   would be better inlined.

Bah. Matter of taste IMHO. I prefer them as separate functions
as they are re-used. I hate inlining code that is used twice.
Can generate cut-and-paste errors later.

>  It's not the typical tevent_req style where all step functions have:
>  a) the same prefix smbd_smb2_logoff_ or smbd_smb2_tdis_
>  b) they only return void and the _send() function uses
>       if (!tevent_req_is_in_progress(req)) {
>           return tevent_req_post(req, ev);
>       }
>     to check the result.
>  c) they only take (struct tevent_req *req) or (struct tevent_req *subreq)
>     as arguments.
>  d) reading from the function from 'struct smbd_smb2_logout_state {' until
>     the end of smbd_smb2_logoff_recv() should make it almost as easy as to
>     read a sync function. I hope you see what I mean when looking at metze.diff.

This has merit. That style does look cleaner, but in
this case it comes with a cost (which I'll explain
below).

> 5) I'd skip the remove_outstanding_session_refs() patch as it will do
>   nothing in try_to_tear_down_session(), as all requests are finished.
>  Just leave smbd_smb2_session_setup_state_destructor() alone for now...

As above. Adding remove_outstanding_session_refs() made
leaving the compound_related requests alone safe. The
new patch doesn't suffer from that problem, but I left
the code alone as I don't think it hurts either.

>> The rest of the code is built on top of this,
>> and is split into a series of micro commits
>> to make it easier to review.
> For me they make it harder to review the real change,
> because I don't see why a function is added and why
> it's written in a specific way.

Arggggh. I can't win. I specifically spent a day splitting
everything up into micro-commits to try and make your and Volkers'
lives easier. No good deed goes unpunished :-(. Next
time I'm just going to ship you what I got and damn
your commit style :-).

> But not with a full make test :-)
> It fails this tests:
>
> [3/23 in 12s, 1 errors] samba3.smb2.lock(s3dc)

That's a low blow :-). That test depends on
Windows server specifics on returns from
terminating active connections. I'd argue
returning cancelled makes just as much sense.

However - Fixed :-). There's no point complaining,
we can't regress on tests :-).

OK, now for the big finish !

> I have problems with this API, as it introduces a new style of doing async
> programming, which is very confusing.

ARE YOU KIDDING ME !!!!!

Look at the API used in metze.diff. Notably this
utter horror...

+       for (preq = smb2req->sconn->smb2.requests; preq != NULL; preq = preq->next) {
		.... do stuff....
+               /*
+                * Now wait until the request is finished.
+                *
+                * We don't set a callback, as we just want to block the
+                * wait queue and the talloc_free() of the request will
+                * remove the item from the wait queue.
+                */
+               subreq = tevent_queue_wait_send(preq, ev, state->wait_queue);
+               if (tevent_req_nomem(subreq, req)) {
+                       return tevent_req_post(req, ev);
+               }
+        /*
+        * Now we add our own waiter to the end of the queue,
+        * this way we get notified when all pending requests are finished
+        * and send to the socket.
+        */
+       subreq = tevent_queue_wait_send(state, ev, state->wait_queue);
+       if (tevent_req_nomem(subreq, req)) {
+               return tevent_req_post(req, ev);
+       }
+       tevent_req_set_callback(subreq, smbd_smb2_tdis_wait_done, req);

How do these subreq = tevent_queue_wait_send() terminate ?

Oh that's right, they magically disappear when preq
is freed. Thats *really* obvious (note the sarcasm
here :-). And why does smbd_smb2_tdis_wait_done()
get called when that last of them disappears ? After
all, state isn't being freed here ? Oh yes, you added
it as a callback. Hmm, wonder why you didn't have
to do that for the subreq's hung off preq ? The
comment says "this way we get notified when all pending requests are finished
and send to the socket." Why is that ? How
does that happen ? Nothing added here that
seems to achieve this. More destructor magic
in another time and place...

It uses the _send()/_recv() pattern true,
but that is all it has going for it. I understand
it by going line-by-line very carefully through
lib/tevent/tevent_queue.c but MY GOD that
code is OPAQUE. Imagine trying to explain the
control flow inside that to someone new to
Samba (and maybe to async programming).

I don't think you could. They'd just have
to take it on trust that this code does
what the comments say it should.

I ended up having to take copious notes
whilst reviewing lib/tevent/tevent_queue.c
to ensure I understood exactly what it is
doing in terms of control flow.

(For the record, every call to
tevent_queue_wait_send() sets up
an immediate function tevent_queue_immediate_trigger()
which causes tevent_queue_wait_trigger() to be
called once the the ulogoff_send function
returns. tevent_queue_wait_trigger() calls
tevent_req_done() on the created subreq,
which - as there is no callback added
does nothing. Everything then waits on
the queue until the destructors for the
'blocking' subreq's on the queue are
called. As each one terminates then
the entry is removed from the queue
and the trigger function on the next
entry is called - tevent_queue_wait_trigger(),
which calls tevent_req_done(). The
last subreq actually has a callback,
smbd_smb2_tdis_wait_done() which is
then called as the last entry on
the queue).

******METZE***BUG***ALERT***********

OK. So after writing all that down I
think there's a logic bug here.

tevent_queue_wait_send() sets up an
implicit ordering - you expect each
of the queued SMB2 requests to finish
in order so that when the penultimate
SMB2 request frees itself then it triggers
smbd_smb2_tdis_wait_done() from the
next wait queue trigger.

However THERE IS NO GUARENTEED ORDERING
in queued SMB2 requests terminating,
especially in ones that don't cancel
cleanly and we're just waiting for
completion (see the code inside
cancel_smb2_aio() for details - those
requests can finish in any order).

You might have 99 async SMB2 write
requests queued up, and this code
will trigger smbd_smb2_tdis_wait_done()
if SMB2 write request 99 finishes
*before* SMB2 write requests 1-98 finishes.

Using a queue here implies an ordering
in the SMB2 requests finishing that simply
doesn't exist. Most of the time you'll
get away with it, but 0.0001% on a
heavily loaded system it'll crash :-).

As I said before, as you need to wait
for all requests to finish the only
safe way to do this is to wait for
the first one, and when it finishes
re-scan and wait on the next 'first'
one. Only when there are no more
entries pending then it's safe
to do the callback.

******METZE***BUG***ALERT***********

Note it's only used in one other place:

source4/librpc/rpc/dcerpc.c:            p->transport.read_subreq = tevent_queue_wait_send(..

but my fear is that this will get cut-and-pasted
wrongly into any place we need to do (if you'll
pardon the Windows-ism) WaitForMultipleObjects()
where we really only need in this case a
WaitForSingleObject().

Compare that with:

+       smb2_wait_req = find_outstanding_session_request(smb2req);
+       if (smb2_wait_req != NULL) {
		... do stuff...
+
+               status = smbd_smb2_wait_for_request_to_complete(
+                               smb2req,        /* Talloc context. */
+                               smb2_wait_req,  /* Request we're waiting on. */
+                               state->ev,      /* Event context. */
+                               smbd_smb2_logoff_do, /* callback. */
+                               req);           /* Private pointer. */
+               if (!NT_STATUS_IS_OK(status)) {
+                       DEBUG(0, ("logoff wait failed: %s\n"
+                               nt_errstr(status)));
+                       return status;
+               }
+               /* We're waiting on another SMB2 request. */
+               return NT_STATUS_RETRY;

Pretty obvious. /* callback */ gets called when smb2_wait_req
finishes. Not much magic there.

I gotta admit. I *hate* the tevent_queue_wait_send() API.
Much too complex for what it's doing here IMHO (see
the bug alert above :-).

CONCLUSION (if anyone actually gets down as far as this :-).
------------------------------------------------------------

I agree on tmp1.diff and tmp2.diff, with the 'Signed-off-by'
change mentioned above.

I hate it, but the tevent_queue_wait_send() already exists
and is in use inside our code, and even though I think my
code is easier to understand people can disagree, so in
the interests of not adding extra API's I'm willing to accept
metze's fix in place of mine, so long as the logic bug
gets fixed. I might have a crack at this myself :-).

I also think (at least for the smb2_sesssetup.c
and smb2_tcon.c changes that we should change the
'Signed-off-by' lines to include both Metze and
myself, as this has been a herculean effort from
both of us.

OK, I'm done.... :-).

Jeremy.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: jra2.diff
Type: text/x-diff
Size: 26453 bytes
Desc: not available
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20140311/2bb996da/attachment.diff>