winbindd callbacks not triggered

Isaac Boukris iboukris at gmail.com
Mon Sep 28 08:51:59 UTC 2020


Hi,

I'm looking into a customer issue which i couldn't reproduce, they are
running v4.10.4 (i didn't find any recent related fixes), and report
that when winbindd runs for a while it becomes slow to respond and
commands like 'id root' may take more than a minute.

The first thing that caught my eyes is this one time error I see in
log.winbindd, of which looks quite strange:

[2020/09/02 07:04:35.886276,  0, pid=42540, effective(0, 0), real(0,
0), class=winbind]
../../source3/winbindd/wb_lookupsids.c:663(wb_lookupsids_recv)
  res_names->count = 2, expected 3
[2020/09/02 07:04:35.886322,  5, pid=42540, effective(0, 0), real(0,
0), class=winbind]
../../source3/winbindd/wb_sids2xids.c:426(wb_sids2xids_recv)
  wb_sids_to_xids failed: NT_STATUS_INTERNAL_ERROR

But the real issue is with requests that seem to get lost and for
which the callbacks do not seem to be triggered, for example:

[2020/09/01 20:12:01.603270,  6, pid=15783, effective(0, 0), real(0,
0), class=winbind]
../../source3/winbindd/winbindd.c:920(new_connection)
  accepted socket 32
[2020/09/01 20:12:01.603329, 10, pid=15783, effective(0, 0), real(0,
0), class=winbind]
../../source3/winbindd/winbindd.c:768(process_request_send)
  process_request_send: process_request: request fn INTERFACE_VERSION
[2020/09/01 20:12:01.603346,  3, pid=15783, effective(0, 0), real(0,
0), class=winbind]
../../source3/winbindd/winbindd_misc.c:432(winbindd_interface_version)
  winbindd_interface_version: [nss_winbind (19519)]: request interface
version (version = 31)
[2020/09/01 20:12:01.603376, 10, pid=15783, effective(0, 0), real(0,
0), class=winbind]
../../source3/winbindd/winbindd.c:854(process_request_written)
  process_request_written: [nss_winbind(19519):unknown request]:
delivered response to client
[2020/09/01 20:12:01.603441, 10, pid=15783, effective(0, 0), real(0,
0), class=winbind]
../../source3/winbindd/winbindd.c:744(process_request_send)
  process_request_send: process_request: Handling async request
nss_winbind(19519):GETGROUPS
[2020/09/01 20:12:01.603460,  3, pid=15783, effective(0, 0), real(0,
0), class=winbind]
../../source3/winbindd/winbindd_getgroups.c:66(winbindd_getgroups_send)
  winbindd_getgroups_send: [nss_winbind (19519)] getgroups root
[2020/09/01 20:12:01.603723, 10, pid=15783, effective(0, 0), real(0,
0), class=winbind]
../../source3/winbindd/wb_sids2xids.c:114(wb_sids2xids_send)
  SID 0: S-1-22-1-0
[2020/09/01 20:12:01.603767, 10, pid=15783, effective(0, 0), real(0,
0), class=winbind]
../../source3/winbindd/winbindd_util.c:1462(find_lookup_domain_from_sid)
  find_lookup_domain_from_sid: SID [S-1-22-1-0]

And that's it, after that winbindd goes back to the main loop and
handles another request, notice the pid (some times after several
seconds like here, which shows that winbind is rather idle, sometimes
immediately):

[2020/09/01 20:12:13.111935,  6, pid=15783, effective(0, 0), real(0,
0), class=winbind]
../../source3/winbindd/winbindd.c:920(new_connection)
  accepted socket 29

Reviewing the code flow from wb_sids2xids_send() it is not clear why
it doesn't get it from cache, but mostly I could not make sense of
these logs (assuming level 10), that is I expect to see more printouts
before winbind goes back to the event loop...

Then after a minute we get this (lp_winbind_request_timeout):

[2020/09/01 20:13:02.700696,  5, pid=15783, effective(0, 0), real(0,
0), class=winbind]
../../source3/winbindd/winbindd.c:1212(remove_timed_out_clients)
  Client request timed out, shutting down sock 32, pid 19519

Or sometimes the client gives up first, and we see some like this:

[2020/09/01 20:13:05.144441,  6, pid=15783, effective(0, 0), real(0,
0), class=winbind]
../../source3/winbindd/winbindd.c:1019(winbind_client_activity)
  winbind_client_activity[19583:GETGROUPS]: client has closed
connection - removing client

Thoughts?

Thanks!



More information about the samba-technical mailing list