pidl:Samba4/Client: don't mix rpc and application NTSTATUS errors anymore for the new bindings

Mon Apr 5 16:09:26 MDT 2010

On Tue, 2010-03-30 at 13:12 +0200, Stefan (metze) Metzmacher wrote:
> Hi Andrew,
> 
> >> - Log -----------------------------------------------------------------
> >> commit e230c8dd2441181963c6df678be06cdaaf6cb822
> >> Author: Stefan Metzmacher <metze at samba.org>
> >> Date:   Sat Mar 27 12:48:21 2010 +0100
> >>
> >>     pidl:Samba4/Client: don't mix rpc and application NTSTATUS errors anymore for the new bindings
> >>     
> >>     The new dcerpc_binding_handle based client stubs don't mix error codes anymore,
> >>     while the old dcerpc_pipe based ones still do to keep OpenChange happy for now.
> >>     
> >>     metze
> > 
> > Metze,
> > 
> > This seems a step back into the past. 
> > 
> > When Samba4 started, this was how we programmed DCERPC.  It was a
> > disaster - the extra check was forgotten so often that we moved to the
> > current scheme, where application and NDR level errors were combined, so
> > that both were kept checked.
> >
> > How are we going to avoid this happening again, now that we have
> > returned to that API?
> 
> We already have explicit checks for most of the rpc calls (on all which
> don't return NTSTATUS).
> 
> They calls returning WERROR are a lot more than the NTSTATUS ones.
> 
> $ git grep WERROR librpc/idl/ | wc -l
> 593
> $ git grep NTSTATUS librpc/idl/ | wc -l
> 213
> 
> $ git grep WERROR librpc/idl/| grep -v todo | wc -l
> 429
> $ git grep NTSTATUS librpc/idl/| grep -v todo | wc -l
> 180

Sure, but I was under the impression that most of the the implemented
ones - and the ones we use most often - are still NTSTATUS.

> I think the bigger problem is inconsistency. Developers see NTSTATUS
> based example and forget the extra check for WERROR based calls.
> We need a strict rule to always check both errors.

And the best way to get a strict rule is to have that rule enforced by
the autogenerated code.  Leaving it up to the caller isn't 'strict',
it's 'lax'. 

> Maybe having some helper macros for torture test, will also reduce
> the risk of forgetting it.

I worry that it will instead just ensure other code forgets it.  That is
*exactly* what happened the first time we did this. 

> For the current sync s3 stubs, we have much more error mapping logic,
> which also leads to pidl warnings/errors, because we try to map any
> result to NTSTATUS. The async s3 stubs don't have that silly mapping
> but require the caller to pass in a variable to get the application
> level result, which makes it also less likely that someone forget
> about checking the result. (The new dcerpc_foo() binding, without the
> '_r' at the end will look similar to the current (async) s3 stubs).
> 
> After trying to fix reconnect bugs in the s3 winbindd I think mixing the
> error codes only leads to problems.

I don't mind allowing them to be separate on demand, but given that in
the remaining cases, we just mash them back into one status variable, I
think it's just asking for trouble to leave it to the caller. 

> I'd really like to avoid putting any logic into the pidl generated
> stubs. They should be just simple type/function specific wrappers
> arround a generic.
> 
> NTSTATUS dcerpc_binding_handle_call(struct dcerpc_binding_handle *b,
> 				    TALLOC *mem_ctx,
> 				    uint32_t opnum,
> 				    struct ndr_interface_table *table,
> 				    void *r);
> 
> function. (The MS-RPCE documentation seems to indicate that is would be
> better to return WERROR instead of NTSTATUS, but *that* would be
> something that we need to carefully analyze and discuss on the list).
> 
> It would be nice if we could let the stubs return the same return
> as defined in the idl (as on Windows), but we don't have exceptions
> in C where we could deliver the separate rpc error. (I think we should
> think about triggering exception in the python bindings on rpc errors).
> 
> > It seems very rare that we actually wish to do anything *except* combine
> > the error codes.  Why not have an API that does that by default, and
> > allow the separate error codes to be inspected in the rare cases where
> > that is desired?
> 
> The rare conditions mostly happen in torture test, but this doesn't look
> to complicated to me:
> 
> +       torture_assert_ntstatus_ok(tctx, dcerpc_samr_Close_r(b, tctx, &r),
> +               "Close failed");
> +       torture_assert_ntstatus_ok(tctx, r.out.result, "Close failed");
> 
> But I planed to introduce dcerpc specific torture macros, to make it even
> more easy and less error prone.

Thanks.

> Most real callers like winbindd really need to handle both errors
> differently and we should avoid to make it much more complex
> for them to decide how to handle errors. I think there're much more
> things in the s3 winbindd which we need to improve once we can use
> sane stubs.

I don't mind it being optionally available.  I just want callers that
will just going to combine the two errors to have that done by the PIDL
generated function, so that unless asked for, it just 'does the right
thing'.  

> > I really appreciate your work here, but perhaps you could put big
> > changes like this past the list for comment?
> 
> My impression was, this was a more or less trivial change,
> (as I tested with a modified pidl, which generated an additional
> argument to the stubs and had patches to let it still compile,
> which made sure that I catch all places where we relied on the error
> mixing).
> 
> But I'll try to push more patches via the list in future.

Thanks

> > (I know it's on your wiki
> > page at http://wiki.samba.org/index.php/DCERPC but I just didn't realise
> > this was what you meant!)
> 
> One of the next steps will be to get rid of p->last_fault_code,
> which is ugly for the binding handle abstraction and also wrong
> for async requests (which we fully support now).
> 
> At first I'll return corresponding NT_STATUS_RPC_* value for the
> fault codes. (As said above WERR_RPC_* would maybe be a better choice,
> but we this is something that we need to discuss).

So, can't we just return these in the combined 'status' variable, and
therefore get the fault info still?

I guess my point is that, given the size of the Samba codebase, the only
'strict rules' we successfully apply are those that are enforced by the
codebase.  

So the natural, 'just seems to work' application of the API should do
the simple and right thing, while a developer that knows he needs to
check for faults and intends to *do something* with that extra
information (rather than just error return anyway) can signal that
intention, and make a different call.  

As I said, we once had a 'strict rule' that both error codes must be
checked, and it wasn't followed.   That is what lead to the current
'error mapping' situation. 

Andrew Bartlett

-- 
Andrew Bartlett                                http://samba.org/~abartlet/
Authentication Developer, Samba Team           http://samba.org
Samba Developer, Cisco Inc.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 190 bytes
Desc: This is a digitally signed message part
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20100406/ddb765e2/attachment.pgp>