winbindd stuck at getaddrinfo.

Richard Sharpe realrichardsharpe at gmail.com
Wed Oct 29 12:14:58 MDT 2014


On Wed, Oct 29, 2014 at 12:20 AM, Hemanth Thummala
<hemanth.thummala at gmail.com> wrote:
> Hi All,
>
> We are using the Samba 3.6.12+ stack on FreeBSD 8.0. Recently we have been
> hitting a winbindd hang at a few of our customers. Each time we could see
> that winbindd was hung in getaddrinfo.

Curiously, I am seeing what might be a similar issue at one site in
Italy. In this case it manifests as a long pause during
authentication, which looks like a winbindd timeout in
wb_is_trusted_domain, but because none of the functions in
lib/winbind_util.c has any DEBUG statements, I can only guess
that this is the case. The site has a very large domain (it is part of
a Class A network that is not 10.0.0.0).

This is with 3.6.9 (CentOS 6.x).

I have provided them with a modified RPM to give me extra info at log
level 10 and should know more later today.

I will look at the master code in this area and might submit a patch
that provides extra info at log level 10 because that would sure
improve our ability to debug things quickly.
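
As a rough illustration of the kind of instrumentation I mean (this is not
the actual patch, and it does not use Samba's real DEBUG macro), the sketch
below wraps a possibly slow call with entry and exit messages plus the
elapsed time, printed only at level 10. The names sketch_log_level,
SKETCH_DEBUG, slow_lookup_stub and timed_lookup_with_debug are all
hypothetical, and the stub returns immediately instead of talking to
winbindd.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for the configured log level. */
int sketch_log_level = 10;

/* Hypothetical stand-in for a DEBUG-style macro: print only when the
 * configured level is at least the level of the message. */
#define SKETCH_DEBUG(level, ...) \
	do { \
		if (sketch_log_level >= (level)) { \
			fprintf(stderr, __VA_ARGS__); \
		} \
	} while (0)

/* Seconds elapsed between two monotonic timestamps. */
double elapsed_seconds(const struct timespec *start,
		       const struct timespec *end)
{
	return (double)(end->tv_sec - start->tv_sec) +
	       (double)(end->tv_nsec - start->tv_nsec) / 1e9;
}

/* Hypothetical placeholder for a call that may block, such as a
 * request to winbindd. Here it only checks for a non-empty name. */
int slow_lookup_stub(const char *domain)
{
	return domain[0] != '\0';
}

/* Call the stub, logging entry, result and duration at level 10. */
int timed_lookup_with_debug(const char *domain, double *secs)
{
	struct timespec start, end;
	int result;

	SKETCH_DEBUG(10, "lookup: asking about domain [%s]\n", domain);
	clock_gettime(CLOCK_MONOTONIC, &start);
	result = slow_lookup_stub(domain);
	clock_gettime(CLOCK_MONOTONIC, &end);
	*secs = elapsed_seconds(&start, &end);
	SKETCH_DEBUG(10, "lookup: domain [%s] -> %d after %.3f s\n",
		     domain, result, *secs);
	return result;
}
```

With messages like these in place, a long gap between the "asking" and
"after" lines in the log would point at the call in between rather than
leaving it to guesswork.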


> Here are a couple of instances.
>
> Thread 1 (Thread 8030021c0 (LWP 101442)):
> #0  0x00000008027079dc in kevent () from /lib/libc.so.7
> #1  0x00000008026d9d81 in ?? () from /lib/libc.so.7
> #2  0x00000008026da6e4 in __res_nsend () from /lib/libc.so.7
> #3  0x00000008026e91ae in ?? () from /lib/libc.so.7
> #4  0x00000008026e94bf in ?? () from /lib/libc.so.7
> #5  0x00000008026ea2ba in ?? () from /lib/libc.so.7
> #6  0x00000008026fb7e3 in nsdispatch () from /lib/libc.so.7
> #7  0x00000008026eb907 in getaddrinfo () from /lib/libc.so.7
> #8  0x0000000801505d82 in krb5_krbhst_get_addrinfo () from
> /usr/lib/libkrb5.so.10
> #9  0x00000008015051c0 in krb5_sendto () from /usr/lib/libkrb5.so.10
> #10 0x000000080150549f in krb5_sendto_context () from /usr/lib/libkrb5.so.10
> #11 0x00000008014ee82c in krb5_get_init_creds () from /usr/lib/libkrb5.so.10
> #12 0x00000008014ef34a in krb5_get_init_creds_password () from
> /usr/lib/libkrb5.so.10
> #13 0x000000000082c193 in kerberos_kinit_password_ext
> (principal=0x803030b80 "HOSTNAME$@CORP.DOMAIN.COM", password=0x80300a4d0
> "4hOZxLoyOqynNr", time_offset=0, expire_time=0x0, renew_till_time=0x0,
> cache_name=0x80300cdc0 "MEMORY:cliconnect", request_pac=false,
> add_netbios_addr=false, renewable_time=0, ntstatus=0x0) at
> libads/kerberos.c:232
> #14 0x000000000082c426 in kerberos_kinit_password (principal=0x1c <Address
> 0x1c out of bounds>, password=0x7fffffff9ba0 "\035", time_offset=1,
> cache_name=0x7fffffff9bd0 "\n") at libads/kerberos.c:657
> #15 0x000000000058e47f in cli_session_setup_spnego (cli=0x803049f50,
> user=0x803030b80 "HOSTNAME$@CORP.DOMAIN.COM", pass=0x80300a4d0
> "4hOZxLoyOqynNr", user_domain=0x80300d0f0 "CORP",
> dest_realm=0x80305f300 "corp.domain.com") at libsmb/cliconnect.c:1861
> #16 0x000000000049f404 in cm_prepare_connection (retry=<optimized out>,
> cli=<optimized out>, controller=<optimized out>, sockfd=<optimized out>,
> domain=<optimized out>) at winbindd/winbindd_cm.c:893
> #17 cm_open_connection (domain=0x80305f200, new_conn=0x80305f720) at
> winbindd/winbindd_cm.c:1606
> #18 0x000000000049f89d in init_dc_connection_network (domain=0x80305f200)
> at winbindd/winbindd_cm.c:1788
> #19 0x000000000049f8ee in init_dc_connection (domain=0x1c) at
> winbindd/winbindd_cm.c:1808
> #20 0x000000000049f911 in init_dc_connection_rpc (domain=0x1c) at
> winbindd/winbindd_cm.c:1815
> #21 0x000000000049f97d in cm_connect_netlogon (domain=0x1c,
> cli=0x7fffffff9ba0) at winbindd/winbindd_cm.c:2623
> #22 0x00000000004983da in winbind_samlogon_retry_loop (domain=0x80305f200,
> mem_ctx=0x8030095b0, logon_parameters=2080, server=0x80305f590
> "DC01.corp.domain.com", username=0x7fffffffe074 "1420djc",
> domainname=0x7fffffffe174 "CORP", workstation=0x7fffffffe47c "MH2017",
> chal=0x7fffffffe068 "\375\211X\310{\303\021\300 \b", lm_response=...,
> nt_response=..., info3=0x7fffffffd108) at winbindd/winbindd_pam.c:1178
> #23 0x0000000000499626 in winbindd_dual_pam_auth_crap (domain=0x80305f200,
> state=0x7fffffffe800) at winbindd/winbindd_pam.c:1875
> #24 0x00000000004aeeed in child_process_request (state=<optimized out>,
> child=<optimized out>) at winbindd/winbindd_dual.c:495
> #25 fork_domain_child (child=<optimized out>) at
> winbindd/winbindd_dual.c:1609
> #26 wb_child_request_trigger (req=<optimized out>, private_data=<optimized
> out>) at winbindd/winbindd_dual.c:200
> #27 0x0000000000569db0 in tevent_common_loop_immediate (ev=0x80301e110) at
> ../lib/tevent/tevent_immediate.c:139
> #28 0x0000000000568075 in run_events_poll (ev=0x80301e110, pollrtn=0,
> pfds=0x0, num_pfds=0) at lib/events.c:197
> #29 0x0000000000568799 in s3_event_loop_once (ev=0x80301e110,
> location=<optimized out>) at lib/events.c:331
> #30 0x0000000000568bb1 in _tevent_loop_once (ev=0x80301e110,
> location=0x8b45fa "winbindd/winbindd.c:1491") at ../lib/tevent/tevent.c:494
> #31 0x0000000000489a22 in main (argc=<optimized out>, argv=<optimized out>,
> envp=<optimized out>) at winbindd/winbindd.c:1491
>
> The stack above belongs to the parent winbindd process, and the one below
> belongs to a child winbindd process.
>
> Thread 1 (Thread 8030021c0 (LWP 101793)):
> #0  0x00000008027079dc in kevent () from /lib/libc.so.7
> #1  0x00000008026d9d81 in ?? () from /lib/libc.so.7
> #2  0x00000008026da6e4 in __res_nsend () from /lib/libc.so.7
> #3  0x00000008026e91ae in ?? () from /lib/libc.so.7
> #4  0x00000008026e94bf in ?? () from /lib/libc.so.7
> #5  0x00000008026ea2ba in ?? () from /lib/libc.so.7
> #6  0x00000008026fb7e3 in nsdispatch () from /lib/libc.so.7
> #7  0x00000008026eb907 in getaddrinfo () from /lib/libc.so.7
> #8  0x0000000801505d82 in krb5_krbhst_get_addrinfo () from
> /usr/lib/libkrb5.so.10
> #9  0x00000008015051c0 in krb5_sendto () from /usr/lib/libkrb5.so.10
> #10 0x000000080150549f in krb5_sendto_context () from /usr/lib/libkrb5.so.10
> #11 0x00000008014ee82c in krb5_get_init_creds () from /usr/lib/libkrb5.so.10
> #12 0x00000008014ef34a in krb5_get_init_creds_password () from
> /usr/lib/libkrb5.so.10
> #13 0x000000000082c413 in kerberos_kinit_password_ext
> (principal=0x80302fb00 "HOSTNAME$@HOME.DOMAIN.COM", password=0x80300a370
> "QD1eo3h0HEf-cA", time_offset=0, expire_time=0x0, renew_till_time=0x0,
> cache_name=0x80300c940 "MEMORY:cliconnect", request_pac=false,
> add_netbios_addr=false, renewable_time=0, ntstatus=0x0) at
> libads/kerberos.c:232
> #14 0x000000000082c6a6 in kerberos_kinit_password (principal=0x1c <Address
> 0x1c out of bounds>, password=0x7fffffff9cd0 "\036", time_offset=1,
> cache_name=0x7fffffff9d00 "\005") at libads/kerberos.c:657
> #15 0x000000000058e6ef in cli_session_setup_spnego (cli=0x803048150,
> user=0x80302fb00 "HOSTNAME$@HOME.DOMAIN.COM", pass=0x80300a370
> "QD1eo3h0HEf-cA", user_domain=0x80301c0a4 "OHM",
> dest_realm=0x803051300 "home.domain.com") at libsmb/cliconnect.c:1861
> #16 0x000000000049f554 in cm_prepare_connection (retry=<optimized out>,
> cli=<optimized out>, controller=<optimized out>, sockfd=<optimized out>,
> domain=<optimized out>) at winbindd/winbindd_cm.c:893
> #17 cm_open_connection (domain=0x803051200, new_conn=0x803051720) at
> winbindd/winbindd_cm.c:1606
> #18 0x000000000049f9ed in init_dc_connection_network (domain=0x803051200)
> at winbindd/winbindd_cm.c:1788
> #19 0x000000000053c179 in messaging_dispatch_rec (msg_ctx=0x1c,
> rec=0x803087290) at lib/messages.c:376
> #20 0x000000000053dfa1 in message_dispatch (msg_ctx=0x80302f0d0) at
> lib/messages_local.c:516
> #21 messaging_tdb_signal_handler (ev_ctx=<optimized out>, se=<optimized
> out>, signum=<optimized out>, count=<optimized out>, _info=<optimized out>,
> private_data=<optimized out>) at lib/messages_local.c:77
> #22 0x000000000056a744 in tevent_common_check_signal (ev=0x80301e110) at
> ../lib/tevent/tevent_signal.c:395
> #23 0x00000000005682d9 in run_events_poll (ev=0x1c, pollrtn=-1,
> pfds=0x80301be20, num_pfds=2) at lib/events.c:192
> #24 0x00000000004aed51 in fork_domain_child (child=<optimized out>) at
> winbindd/winbindd_dual.c:1568
> #25 wb_child_request_trigger (req=<optimized out>, private_data=<optimized
> out>) at winbindd/winbindd_dual.c:200
> #26 0x000000000056a030 in tevent_common_loop_immediate (ev=0x80301e110) at
> ../lib/tevent/tevent_immediate.c:139
> #27 0x00000000005682f5 in run_events_poll (ev=0x80301e110, pollrtn=0,
> pfds=0x0, num_pfds=0) at lib/events.c:197
> #28 0x0000000000568a19 in s3_event_loop_once (ev=0x80301e110,
> location=<optimized out>) at lib/events.c:331
> #29 0x0000000000568e31 in _tevent_loop_once (ev=0x80301e110,
> location=0x8b48da "winbindd/winbindd.c:1491") at ../lib/tevent/tevent.c:494
> #30 0x0000000000489b02 in main (argc=<optimized out>, argv=<optimized out>,
> envp=<optimized out>) at winbindd/winbindd.c:1491
>
>
> When the issue occurs, winbindd stops authenticating new
> connections.
>
> I could reproduce the issue internally by setting a firewall rule to
> drop UDP DNS responses before they reach the Samba server. When I restored
> the traffic, winbindd returned to normal, whereas in our customer's case we
> had to restart winbindd manually to restore the service.
>
> I would like to know if anyone else has faced the same issue in the past.
> We are using Heimdal Kerberos, and getaddrinfo() actually comes from libc.
>
> Thanks,
> Hemanth.
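
For anyone wanting to check how long the resolver blocks outside winbindd
while the DNS responses are being dropped, a small stand-alone timing test
along these lines might help. It is only a sketch: time_getaddrinfo is a
made-up helper name, and it does not reproduce what
krb5_krbhst_get_addrinfo does internally before it reaches getaddrinfo().

```c
#define _POSIX_C_SOURCE 200809L
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <time.h>

/*
 * Resolve `host` with getaddrinfo() and report how long the call blocked.
 * Returns the getaddrinfo() result code (0 on success). `flags` is passed
 * through as ai_flags, e.g. AI_NUMERICHOST to skip DNS entirely.
 * (Hypothetical helper, for illustration only.)
 */
int time_getaddrinfo(const char *host, int flags, double *secs)
{
	struct addrinfo hints;
	struct addrinfo *res = NULL;
	struct timespec start, end;
	int rc;

	memset(&hints, 0, sizeof(hints));
	hints.ai_family = AF_UNSPEC;
	hints.ai_socktype = SOCK_STREAM;
	hints.ai_flags = flags;

	/* Time only the resolver call itself. */
	clock_gettime(CLOCK_MONOTONIC, &start);
	rc = getaddrinfo(host, NULL, &hints, &res);
	clock_gettime(CLOCK_MONOTONIC, &end);

	*secs = (double)(end.tv_sec - start.tv_sec) +
		(double)(end.tv_nsec - start.tv_nsec) / 1e9;

	if (rc == 0) {
		freeaddrinfo(res);
	} else {
		fprintf(stderr, "getaddrinfo(%s): %s\n", host,
			gai_strerror(rc));
	}
	return rc;
}
```

Calling it with flags set to 0 and the name of a KDC while the firewall rule
is active should show how long a single lookup stalls. With AI_NUMERICHOST
and a literal address it never touches DNS, which gives a baseline for
comparison.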



-- 
Regards,
Richard Sharpe
(何以解憂?唯有杜康。--曹操)

