[Samba] Winbind timeouts/hangs(?)

Rowland penny rpenny at samba.org
Thu Aug 22 13:31:17 UTC 2019


On 22/08/2019 13:41, Heiko Wundram via samba wrote:
> Hello group,
>
> I'm hitting a problem where group resolution through winbind seems to 
> time out sometimes, which in turn causes invalid group information to 
> propagate to Cronjob users or similar. The profiling information that 
> winbind spits out is the following:
>
> [2019/08/22 14:34:14.809540,  0] 
> ../source3/winbindd/winbindd.c:1058(winbind_client_processed)
>   winbind_client_processed: request took 60.000364 seconds
>   [struct process_request_state] ../source3/winbindd/winbindd.c:683 
> [2019/08/22 14:33:14.809169] ../source3/winbindd/winbindd.c:853 
> [2019/08/22 14:34:14.809533] [60.000364] -> TEVENT_REQ_DONE (2 0))
>    [struct winbindd_getgroups_state] 
> ../source3/winbindd/winbindd_getgroups.c:53 [2019/08/22 
> 14:33:14.809175] ../source3/winbindd/winbindd_getgroups.c:107 
> [2019/08/22 14:34:14.809485] [60.000310] -> TEVENT_REQ_USER_ERROR (3 
> 10483072397370982515))
>     [struct wb_lookupname_state] 
> ../source3/winbindd/wb_lookupname.c:47 [2019/08/22 14:33:14.809184] 
> ../source3/winbindd/wb_lookupname.c:95 [2019/08/22 14:34:14.809483] 
> [60.000299] -> TEVENT_REQ_USER_ERROR (3 10483072397370982515))
>      [struct dcerpc_wbint_LookupName_state] 
> default/librpc/gen_ndr/ndr_winbind_c.c:781 [2019/08/22 
> 14:33:14.809188] default/librpc/gen_ndr/ndr_winbind_c.c:847 
> [2019/08/22 14:34:14.809481] [60.000293] -> TEVENT_REQ_DONE (2 0))
>       [struct dcerpc_wbint_LookupName_r_state] 
> default/librpc/gen_ndr/ndr_winbind_c.c:693 [2019/08/22 
> 14:33:14.809189] default/librpc/gen_ndr/ndr_winbind_c.c:727 
> [2019/08/22 14:34:14.809479] [60.000290] -> TEVENT_REQ_DONE (2 0))
>        [struct dcerpc_binding_handle_call_state] 
> ../librpc/rpc/binding_handle.c:371 [2019/08/22 14:33:14.809190] 
> ../librpc/rpc/binding_handle.c:520 [2019/08/22 14:34:14.809477] 
> [60.000287] -> TEVENT_REQ_DONE (2 0))
>         [struct dcerpc_binding_handle_raw_call_state] 
> ../librpc/rpc/binding_handle.c:149 [2019/08/22 14:33:14.809197] 
> ../librpc/rpc/binding_handle.c:203 [2019/08/22 14:34:14.809471] 
> [60.000274] -> TEVENT_REQ_DONE (2 0))
>          [struct wbint_bh_raw_call_state] 
> ../source3/winbindd/winbindd_dual_ndr.c:89 [2019/08/22 
> 14:33:14.809199] ../source3/winbindd/winbindd_dual_ndr.c:204 
> [2019/08/22 14:34:14.809469] [60.000270] -> TEVENT_REQ_DONE (2 0))
>           [struct wb_domain_request_state] 
> ../source3/winbindd/winbindd_dual.c:473 [2019/08/22 14:33:14.809202] 
> ../source3/winbindd/winbindd_dual.c:708 [2019/08/22 14:34:14.809460] 
> [60.000258] -> TEVENT_REQ_DONE (2 0))
>            [struct wb_child_request_state] 
> ../source3/winbindd/winbindd_dual.c:198 [2019/08/22 14:33:44.809349] 
> ../source3/winbindd/winbindd_dual.c:273 [2019/08/22 14:34:14.809457] 
> [30.000108] -> TEVENT_REQ_DONE (2 0))
>             [struct tevent_queue_wait_state] ../tevent_queue.c:336 
> [2019/08/22 14:33:44.809350] ../tevent_queue.c:355 [2019/08/22 
> 14:33:44.809351] [0.000001] -> TEVENT_REQ_DONE (2 0))
>             [struct wb_simple_trans_state] 
> ../nsswitch/wb_reqtrans.c:375 [2019/08/22 14:33:44.809352] 
> ../nsswitch/wb_reqtrans.c:432 [2019/08/22 14:34:14.809450] [30.000098] 
> -> TEVENT_REQ_DONE (2 0))
>              [struct req_write_state] ../nsswitch/wb_reqtrans.c:158 
> [2019/08/22 14:33:44.809353] ../nsswitch/wb_reqtrans.c:194 [2019/08/22 
> 14:33:44.809377] [0.000024] -> TEVENT_REQ_DONE (2 0))
>               [struct writev_state] ../lib/async_req/async_sock.c:263 
> [2019/08/22 14:33:44.809353] ../lib/async_req/async_sock.c:412 
> [2019/08/22 14:33:44.809376] [0.000023] -> TEVENT_REQ_DONE (2 0))
>              [struct resp_read_state] ../nsswitch/wb_reqtrans.c:222 
> [2019/08/22 14:33:44.809378] ../nsswitch/wb_reqtrans.c:275 [2019/08/22 
> 14:34:14.809449] [30.000071] -> TEVENT_REQ_DONE (2 0))
>               [struct read_packet_state] 
> ../lib/async_req/async_sock.c:458 [2019/08/22 14:33:44.809379] 
> ../lib/async_req/async_sock.c:546 [2019/08/22 14:34:14.809446] 
> [30.000067] -> TEVENT_REQ_DONE (2 0))
>    [struct resp_write_state] ../nsswitch/wb_reqtrans.c:307 [2019/08/22 
> 14:34:14.809495] ../nsswitch/wb_reqtrans.c:344 [2019/08/22 
> 14:34:14.809531] [0.000036] -> TEVENT_REQ_DONE (2 0))
>     [struct writev_state] ../lib/async_req/async_sock.c:263 
> [2019/08/22 14:34:14.809496] ../lib/async_req/async_sock.c:412 
> [2019/08/22 14:34:14.809528] [0.000032] -> TEVENT_REQ_DONE (2 0))
>
> (so, basically, most of the time is spent in wb_domain_request_state, 
> with half of that in resp_read_state) . The system in question is 
> doing absolutely nothing, no network traffic, and the CPU is idling 
> (i.e., winbind seems to be sleeping), when this is logged, and there's 
> also no definitive moment where an answer from one of the DCs arrives 
> which causes completion. I've tried to follow the control path in the 
> corresponding sources noted in the profiling, but couldn't find a 
> specific point that would cause hangs, that's why I thought asking 
> here whether anyone has seen this behaviour.
>
> The Samba package that's being used is the Debian 10 prepackaged Samba 
> 4.9.5, and this is on a simple domain-joined client which has correct 
> Kerberos and PAM configuration to mirror the AD accounts to the 
> operating system. The system has nested groups on, but the complexity 
> of those is minor, and all in all, there are (only) about 100 groups 
> and 250 accounts in the complete domain. The system in question was 
> updated from Debian 9 (using the prepackaged Samba/winbind sources of 
> stretch), and the configuration hasn't changed. Debian 9 packaged 
> Samba did not show this behaviour.
>
> Thanks for any insights on what I might do to debug this further!
>
Sorry, but you cannot rely on the output of 'id' to identify your users 
group memberships. Having said that, I wouldn't have thought it would 
take 60 secs to do nothing ;-)

Please post your smb.conf.

Rowland




More information about the samba mailing list