Winbind can't kill connections to dead PDC?

Esh, Andrew AEsh at tricord.com
Thu Sep 27 14:58:03 GMT 2001


I just reported a bug in the Samba bug tracking system:
	
http://bugs.samba.org/cgi-bin/samba-bugs/incoming?id=21790;user=guest#themesg

Now I'm not so sure I characterized the bug right, so I thought I should
discuss it with you folks. Here's a log snippet I referred to:

[2001/09/27 15:08:31, 1] nsswitch/winbindd_idmap.c:get_rid_from_id(263)
  unknown domain DOMAIN for rid 1829
[2001/09/27 15:08:31, 1] nsswitch/winbindd_user.c:winbindd_getpwnam_from_uid(211)
  Could not convert uid 1053 to rid
[2001/09/27 15:08:31, 1] nsswitch/winbindd_util.c:get_domain_info(470)
  Getting domain info for domain DOMAIN
[2001/09/27 15:08:31, 1] nsswitch/winbindd_util.c:lookup_domain_sid(422)
  looking up sid for domain DOMAIN
[2001/09/27 15:08:31, 2] libsmb/namequery.c:name_query(407)
  Got a positive name query response from 10.10.9.52 ( 10.10.9.52 10.10.9.52 )
[2001/09/27 15:08:51, 0] rpc_client/cli_pipe.c:rpc_api_pipe(358)
  cli_pipe: return critical error. Error was code 0
[2001/09/27 15:08:51, 0] nsswitch/winbindd_util.c:get_domain_info(475)
  could not find sid for domain DOMAIN
[2001/09/27 15:08:51, 0] nsswitch/winbindd_util.c:winbindd_kill_connections(244)
  killing connections to domain DOMAIN with controller GC52
[2001/09/27 15:09:11, 0] rpc_client/cli_pipe.c:rpc_api_pipe(358)
  cli_pipe: return critical error. Error was code 0
[2001/09/27 15:09:11, 1] nsswitch/winbindd_idmap.c:get_rid_from_id(263)
  unknown domain DOMAIN for rid 1125
[2001/09/27 15:09:11, 1] nsswitch/winbindd_user.c:winbindd_getpwnam_from_uid(211)
  Could not convert uid 1027 to rid
[2001/09/27 15:09:11, 1] nsswitch/winbindd_util.c:get_domain_info(470)
  Getting domain info for domain DOMAIN
[2001/09/27 15:09:11, 1] nsswitch/winbindd_util.c:lookup_domain_sid(422)
  looking up sid for domain DOMAIN
[2001/09/27 15:09:11, 2] libsmb/namequery.c:name_query(407)
  Got a positive name query response from 10.10.9.52 ( 10.10.9.52 10.10.9.52 )
[2001/09/27 15:09:31, 0] rpc_client/cli_pipe.c:rpc_api_pipe(358)
  cli_pipe: return critical error. Error was code 0

At first the problem appeared to be that winbindd_kill_connections calls
rpc_api_pipe to close the PDC connection, that this call fails, and that the
failure makes winbind decide to close all its connections again, producing a
loop. This is not the case.

What is really happening is that a long list of users is being queried, so
the lookup has to fail over the network for EACH USER (about 20 seconds per
attempt, going by the timestamps above). There is no loop, just a long list,
so there probably isn't the bug in winbind that I first reported.

It would be nicer, though, if winbind took the rpc_api_pipe failure as an
indication that the PDC connection is already down, and killed off the local
end of it. As it is, every time winbind reaches the point of looking up a
RID, it tries to use the bad connection again. Winbind should close it and
open a fresh one. Here's why: the PDC had recovered and was up and running
while winbind on my machine was still working through these failures, trying
to resolve RIDs over a bad connection. Once winbind had been restarted, it
worked just fine. The PDC was there; winbind just couldn't recover the old
connection to it, and didn't try to open a new one.
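
To make the suggestion concrete, here is a rough sketch in C of the
close-and-reopen behaviour I have in mind. It is not based on the winbind
source: struct dc_conn, rpc_call, conn_open, conn_close and
rpc_call_with_reconnect are all made-up names, and the "connection" is only
simulated with flags so the logic can be seen in isolation.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a cached connection to a domain controller. */
struct dc_conn {
    bool open;       /* local end of the connection is open */
    bool stale;      /* peer went away underneath us (simulated) */
    int  generation; /* incremented each time the connection is reopened */
};

/* Simulated RPC: fails whenever the connection is closed or stale. */
bool rpc_call(const struct dc_conn *c)
{
    return c->open && !c->stale;
}

/* Drop the local end of the connection. */
void conn_close(struct dc_conn *c)
{
    c->open = false;
}

/* Simulated reconnect: succeeds only if the controller is reachable. */
bool conn_open(struct dc_conn *c, bool dc_reachable)
{
    if (!dc_reachable)
        return false;
    c->open = true;
    c->stale = false;
    c->generation++;
    return true;
}

/*
 * Try the call on the cached connection. If it fails, treat that as a
 * sign the connection is dead: close it, open a fresh one, and retry
 * exactly once. Returns false if the controller is still unreachable.
 */
bool rpc_call_with_reconnect(struct dc_conn *c, bool dc_reachable)
{
    if (rpc_call(c))
        return true;

    conn_close(c);
    if (!conn_open(c, dc_reachable))
        return false;

    return rpc_call(c);
}
```

With something like this, a lookup made after the PDC has come back would
succeed on the retry instead of failing on the stale connection, and a lookup
made while the PDC is still down would fail once and leave the local end
closed. Retrying only once is a deliberate choice here, so that a PDC which
really is down does not add further delay to each lookup.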

If you guys come up with a fix, I will soon have the tools in place to test
it. I'd try to fix it, but as you can see, I am not very familiar with the
code.

---
Andrew C. Esh                mail:Andrew.Esh at tricord.com
Tricord Systems, Inc.
2905 Northwest Blvd., Suite 20        763-557-9005 (main)
Plymouth, MN 55441-2644 USA      763-551-6418 (direct)
http://www.tricord.com - Tricord Home Page



More information about the samba-technical mailing list