[Samba] Samba 4 replication causes bind dns to freeze

Nikos Mitas nkmitas at gmail.com
Sat Mar 1 04:28:01 MST 2014


Hello again,

I still have the problem with BIND DNS becoming unresponsive when trying to
add a third Samba 4 DC to an existing environment.
When I stop the processes on the third DC, everything goes back to normal.
After 3 or 4 days of testing with various options and settings, it seems that
BIND 9 becomes unresponsive when samba_dlz is starting a transaction:

/////////  named.log /////////
28-Feb-2014 17:59:04.238 database: info: samba_dlz: starting transaction on
zone example.com
28-Feb-2014 17:59:04.240 update-security: error: client 10.61.10.112#53754:
update 'example.com/IN' denied
28-Feb-2014 17:59:04.240 database: info: samba_dlz: cancelling transaction
on zone example.com
28-Feb-2014 17:59:04.787 database: info: samba_dlz: starting transaction on
zone example.com
28-Feb-2014 17:59:04.791 database: info: samba_dlz: allowing update of
signer=pc29878\$\@example.com name=PC29878.example.com tcpaddr= type=A
key=1272-ms-7.1-1b82f.11df676e-9fc8-11e3-5194-000ffe8d65ed/160/0
28-Feb-2014 17:59:04.793 database: info: samba_dlz: allowing update of
signer=pc29878\$\@example.com name=PC29878.example.com tcpaddr= type=A
key=1272-ms-7.1-1b82f.11df676e-9fc8-11e3-5194-000ffe8d65ed/160/0
28-Feb-2014 17:59:04.793 update: info: client 10.61.10.112#50022/key
pc29878\$\@example.com: updating zone 'example.com/NONE': deleting rrset at
'PC29878.example.com' A
28-Feb-2014 17:59:04.795 database: info: samba_dlz: subtracted rdataset
PC29878.example.com 'PC29878.example.com. 1200 IN A 10.61.10.112'
28-Feb-2014 17:59:04.796 update: info: client 10.61.10.112#50022/key
pc29878\$\@example.com: updating zone 'example.com/NONE': adding an RR at '
PC29878.example.com' A
28-Feb-2014 17:59:04.798 database: info: samba_dlz: added rdataset
PC29878.example.com 'PC29878.example.com. 1200 IN A 10.61.10.112'
28-Feb-2014 17:59:04.808 database: info: samba_dlz: committed transaction
on zone example.com
28-Feb-2014 17:59:05.135 database: info: samba_dlz: starting transaction on
zone 10.61.10.in-addr.arpa
28-Feb-2014 17:59:05.137 update-security: error: client 10.61.10.112#60452:
update '10.61.10.in-addr.arpa/IN' denied
28-Feb-2014 17:59:05.137 database: info: samba_dlz: cancelling transaction
on zone 10.61.10.in-addr.arpa
28-Feb-2014 17:59:05.154 database: info: samba_dlz: starting transaction on
zone 10.61.10.in-addr.arpa
28-Feb-2014 17:59:05.156 database: info: samba_dlz: allowing update of
signer=pc29878\$\@example.com name=112.10.61.10.in-addr.arpa tcpaddr=
type=PTR key=1272-ms-7.1-1b82f.11df676e-9fc8-11e3-5194-000ffe8d65ed/160/0
28-Feb-2014 17:59:05.158 database: info: samba_dlz: allowing update of
signer=pc29878\$\@example.com name=112.10.61.10.in-addr.arpa tcpaddr=
type=PTR key=1272-ms-7.1-1b82f.11df676e-9fc8-11e3-5194-000ffe8d65ed/160/0
28-Feb-2014 17:59:05.158 update: info: client 10.61.10.112#57482/key
pc29878\$\@example.com: updating zone '10.61.10.in-addr.arpa/NONE':
deleting rrset at '112.10.61.10.in-addr.arpa' PTR
28-Feb-2014 17:59:05.160 database: info: samba_dlz: subtracted rdataset
112.10.61.10.in-addr.arpa '112.10.61.10.in-addr.arpa. 1200 IN PTR
PC29878.example.com.'
28-Feb-2014 17:59:05.161 update: info: client 10.61.10.112#57482/key
pc29878\$\@example.com: updating zone '10.61.10.in-addr.arpa/NONE': adding
an RR at '112.10.61.10.in-addr.arpa' PTR
28-Feb-2014 17:59:05.163 database: info: samba_dlz: added rdataset
112.10.61.10.in-addr.arpa '112.10.61.10.in-addr.arpa. 1200 IN PTR
PC29878.example.com.'
28-Feb-2014 17:59:05.172 database: info: samba_dlz: committed transaction
on zone 10.61.10.in-addr.arpa
28-Feb-2014 17:59:07.564 resolver: debug 1: createfetch: a26.ms.akamai.net A
28-Feb-2014 17:59:07.564 resolver: debug 1: createfetch: . NS
28-Feb-2014 17:59:07.568 database: debug 1: decrement_reference: delete
from rbt: 0x7f1a493da178 fbcdn-profile-a.ak.fbcdn.akamaihd.net.akadns.net
28-Feb-2014 17:59:08.595 resolver: debug 1: createfetch:
statsfe2.update.microsoft.com A
28-Feb-2014 17:59:08.596 resolver: debug 1: createfetch: . NS
28-Feb-2014 17:59:08.706 database: debug 1: decrement_reference: delete
from rbt: 0x7f1a4416abb0 a1408.dspw43.akamai.net
28-Feb-2014 17:59:08.707 resolver: debug 1: createfetch:
statsfe2.update.microsoft.com.akadns.net A
28-Feb-2014 17:59:17.376 resolver: debug 1: createfetch:
audownload.windowsupdate.nsatc.net A
28-Feb-2014 17:59:17.376 resolver: debug 1: createfetch: . NS
28-Feb-2014 17:59:17.381 database: debug 1: decrement_reference: delete
from rbt: 0x7f1a3841d028 a1406.dspw42.akamai.net
28-Feb-2014 17:59:17.385 resolver: debug 1: createfetch: a695.d.akamai.net A
28-Feb-2014 17:59:17.390 database: debug 1: decrement_reference: delete
from rbt: 0x7f1a493ce118 a1007.dspw43.akamai.net
28-Feb-2014 17:59:17.932 database: info: samba_dlz: starting transaction on
zone example.com


*At this point, nslookup times out and rndc commands do not work; all
I can do is kill the named process.*
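
One way to check where the log stops is to look for a samba_dlz transaction
that is started but never committed or cancelled. This is only a minimal
Python sketch, assuming the log format shown above; the function name
open_transactions and the sample text are made up for illustration:

```python
import re

# Matches the samba_dlz transaction messages seen in the named.log excerpt.
# \s+ tolerates the line wrapping added by the mail client.
EVENT = re.compile(
    r"samba_dlz:\s+(starting|committed|cancelling)\s+transaction\s+on\s+zone\s+(\S+)"
)


def open_transactions(log_text):
    """Return {zone: count} for transactions started but not yet closed."""
    balance = {}
    for action, zone in EVENT.findall(log_text):
        # A start opens a transaction; a commit or cancel closes one.
        step = 1 if action == "starting" else -1
        balance[zone] = balance.get(zone, 0) + step
    return {zone: n for zone, n in balance.items() if n > 0}


# Shortened sample built from lines in the excerpt (hypothetical test input).
sample = """\
28-Feb-2014 17:59:04.238 database: info: samba_dlz: starting transaction on
zone example.com
28-Feb-2014 17:59:04.240 database: info: samba_dlz: cancelling transaction
on zone example.com
28-Feb-2014 17:59:17.932 database: info: samba_dlz: starting transaction on
zone example.com
"""

print(open_transactions(sample))
```

In the excerpt above, the last "starting transaction" on zone example.com is
the one that has no matching commit or cancel.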

Also, pmap for the named process reports the memory usage as: *total
1102400K*

*And this is the output of the pstack command for named:*

Thread 11 (Thread 0x7f2cb7294700 (LWP 31214)):
#0  0x00000039bb80e264 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00000039bb809508 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x00000039bb8093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000471289 in dlopen_dlz_findzonedb ()
#4  0x00000039bc90af64 in dns_sdlzfindzone () from /usr/lib64/libdns.so.100
#5  0x00000039bc850a94 in dns_dlzfindzone () from /usr/lib64/libdns.so.100
#6  0x0000000000433e33 in query_find ()
#7  0x000000000043c25e in ns_query_start ()
#8  0x000000000041c34c in client_request ()
#9  0x00000039bc432240 in run () from /usr/lib64/libisc.so.95
#10 0x00000039bb8079d1 in start_thread () from /lib64/libpthread.so.0
#11 0x00000039bb4e8b6d in clone () from /lib64/libc.so.6
Thread 10 (Thread 0x7f2cb6893700 (LWP 31215)):
#0  0x00000039bb80e264 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00000039bb809508 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x00000039bb8093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000471289 in dlopen_dlz_findzonedb ()
#4  0x00000039bc90af64 in dns_sdlzfindzone () from /usr/lib64/libdns.so.100
#5  0x00000039bc850a94 in dns_dlzfindzone () from /usr/lib64/libdns.so.100
#6  0x0000000000433e33 in query_find ()
#7  0x000000000043c25e in ns_query_start ()
#8  0x000000000041c34c in client_request ()
#9  0x00000039bc432240 in run () from /usr/lib64/libisc.so.95
#10 0x00000039bb8079d1 in start_thread () from /lib64/libpthread.so.0
#11 0x00000039bb4e8b6d in clone () from /lib64/libc.so.6
Thread 9 (Thread 0x7f2cb5e92700 (LWP 31216)):
#0  0x00000039bb80e264 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00000039bb809508 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x00000039bb8093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000471289 in dlopen_dlz_findzonedb ()
#4  0x00000039bc90af64 in dns_sdlzfindzone () from /usr/lib64/libdns.so.100
#5  0x00000039bc850a94 in dns_dlzfindzone () from /usr/lib64/libdns.so.100
#6  0x0000000000433e33 in query_find ()
#7  0x000000000043c25e in ns_query_start ()
#8  0x000000000041c34c in client_request ()
#9  0x00000039bc432240 in run () from /usr/lib64/libisc.so.95
#10 0x00000039bb8079d1 in start_thread () from /lib64/libpthread.so.0
#11 0x00000039bb4e8b6d in clone () from /lib64/libc.so.6
Thread 8 (Thread 0x7f2cb5491700 (LWP 31217)):
#0  0x00000039bb80e264 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00000039bb809508 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x00000039bb8093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000471289 in dlopen_dlz_findzonedb ()
#4  0x00000039bc90af64 in dns_sdlzfindzone () from /usr/lib64/libdns.so.100
#5  0x00000039bc850a94 in dns_dlzfindzone () from /usr/lib64/libdns.so.100
#6  0x0000000000433e33 in query_find ()
#7  0x000000000043c25e in ns_query_start ()
#8  0x000000000041c34c in client_request ()
#9  0x00000039bc432240 in run () from /usr/lib64/libisc.so.95
#10 0x00000039bb8079d1 in start_thread () from /lib64/libpthread.so.0
#11 0x00000039bb4e8b6d in clone () from /lib64/libc.so.6
Thread 7 (Thread 0x7f2cb4a90700 (LWP 31218)):
#0  0x00000039bb80e938 in fcntl () from /lib64/libpthread.so.0
#1  0x00007f2ca6976347 in fcntl_lock () from
/usr/local/samba/lib/private/libtdb.so.1
#2  0x00007f2ca697644f in tdb_brlock () from
/usr/local/samba/lib/private/libtdb.so.1
#3  0x00007f2ca6976919 in tdb_nest_lock () from
/usr/local/samba/lib/private/libtdb.so.1
#4  0x00007f2ca6976f2c in tdb_transaction_lock () from
/usr/local/samba/lib/private/libtdb.so.1
#5  0x00007f2ca697c3be in _tdb_transaction_start () from
/usr/local/samba/lib/private/libtdb.so.1
#6  0x00007f2ca697c6d3 in tdb_transaction_start () from
/usr/local/samba/lib/private/libtdb.so.1
#7  0x00007f2ca0656321 in partition_metadata_start_trans () from
/usr/local/samba/lib/ldb/partition.so
#8  0x00007f2ca0651f1f in partition_start_trans () from
/usr/local/samba/lib/ldb/partition.so
#9  0x00007f2cabba76e1 in ldb_next_start_trans () from
/usr/local/samba/lib/private/libldb.so.1
#10 0x00007f2ca187f925 in linked_attributes_start_transaction () from
/usr/local/samba/lib/ldb/linked_attributes.so
#11 0x00007f2cabba76e1 in ldb_next_start_trans () from
/usr/local/samba/lib/private/libldb.so.1
#12 0x00007f2c9fe300c3 in replmd_start_transaction () from
/usr/local/samba/lib/ldb/repl_meta_data.so
#13 0x00007f2cabba76e1 in ldb_next_start_trans () from
/usr/local/samba/lib/private/libldb.so.1
#14 0x00007f2ca36ddb12 in descriptor_start_transaction () from
/usr/local/samba/lib/ldb/descriptor.so
#15 0x00007f2cabba76e1 in ldb_next_start_trans () from
/usr/local/samba/lib/private/libldb.so.1
#16 0x00007f2c9e7e270f in schema_load_start_transaction () from
/usr/local/samba/lib/ldb/schema_load.so
#17 0x00007f2cabbc4eeb in ldb_transaction_start () from
/usr/local/samba/lib/private/libldb.so.1
#18 0x00007f2cb067fea9 in dlz_newversion () from
/usr/local/samba/lib/bind9/dlz_bind9_9.so
#19 0x0000000000470d6d in dlopen_dlz_newversion ()
#20 0x00000039bc90b57c in newversion () from /usr/lib64/libdns.so.100
#21 0x000000000045948f in update_action ()
#22 0x00000039bc432240 in run () from /usr/lib64/libisc.so.95
#23 0x00000039bb8079d1 in start_thread () from /lib64/libpthread.so.0
#24 0x00000039bb4e8b6d in clone () from /lib64/libc.so.6
Thread 6 (Thread 0x7f2cb408f700 (LWP 31219)):
#0  0x00000039bb80e264 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00000039bb809508 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x00000039bb8093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000471289 in dlopen_dlz_findzonedb ()
#4  0x00000039bc90af64 in dns_sdlzfindzone () from /usr/lib64/libdns.so.100
#5  0x00000039bc850a94 in dns_dlzfindzone () from /usr/lib64/libdns.so.100
#6  0x0000000000433e33 in query_find ()
#7  0x000000000043c25e in ns_query_start ()
#8  0x000000000041c34c in client_request ()
#9  0x00000039bc432240 in run () from /usr/lib64/libisc.so.95
#10 0x00000039bb8079d1 in start_thread () from /lib64/libpthread.so.0
#11 0x00000039bb4e8b6d in clone () from /lib64/libc.so.6
Thread 5 (Thread 0x7f2cb368e700 (LWP 31220)):
#0  0x00000039bb80e264 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00000039bb809508 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x00000039bb8093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000471289 in dlopen_dlz_findzonedb ()
#4  0x00000039bc90af64 in dns_sdlzfindzone () from /usr/lib64/libdns.so.100
#5  0x00000039bc850a94 in dns_dlzfindzone () from /usr/lib64/libdns.so.100
#6  0x0000000000433e33 in query_find ()
#7  0x000000000043c25e in ns_query_start ()
#8  0x000000000041c34c in client_request ()
#9  0x00000039bc432240 in run () from /usr/lib64/libisc.so.95
#10 0x00000039bb8079d1 in start_thread () from /lib64/libpthread.so.0
#11 0x00000039bb4e8b6d in clone () from /lib64/libc.so.6
Thread 4 (Thread 0x7f2cb2c8d700 (LWP 31221)):
#0  0x00000039bb80e264 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00000039bb809508 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x00000039bb8093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000471289 in dlopen_dlz_findzonedb ()
#4  0x00000039bc90af64 in dns_sdlzfindzone () from /usr/lib64/libdns.so.100
#5  0x00000039bc850a94 in dns_dlzfindzone () from /usr/lib64/libdns.so.100
#6  0x0000000000433e33 in query_find ()
#7  0x000000000043c25e in ns_query_start ()
#8  0x000000000041c34c in client_request ()
#9  0x00000039bc432240 in run () from /usr/lib64/libisc.so.95
#10 0x00000039bb8079d1 in start_thread () from /lib64/libpthread.so.0
#11 0x00000039bb4e8b6d in clone () from /lib64/libc.so.6
Thread 3 (Thread 0x7f2cb228c700 (LWP 31222)):
#0  0x00000039bb80b98e in pthread_cond_timedwait@@GLIBC_2.3.2 () from
/lib64/libpthread.so.0
#1  0x00000039bc44993e in isc_condition_waituntil () from
/usr/lib64/libisc.so.95
#2  0x00000039bc4363bf in run () from /usr/lib64/libisc.so.95
#3  0x00000039bb8079d1 in start_thread () from /lib64/libpthread.so.0
#4  0x00000039bb4e8b6d in clone () from /lib64/libc.so.6
Thread 2 (Thread 0x7f2cb188b700 (LWP 31223)):
#0  0x00000039bb4e9163 in epoll_wait () from /lib64/libc.so.6
#1  0x00000039bc44757c in watcher () from /usr/lib64/libisc.so.95
#2  0x00000039bb8079d1 in start_thread () from /lib64/libpthread.so.0
#3  0x00000039bb4e8b6d in clone () from /lib64/libc.so.6
Thread 1 (Thread 0x7f2cb74e47c0 (LWP 31213)):
#0  0x00000039bb432cd4 in sigsuspend () from /lib64/libc.so.6
#1  0x00000039bc437904 in isc__app_ctxrun () from /usr/lib64/libisc.so.95
#2  0x000000000042748b in main ()
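
To make a trace like this easier to read, the threads can be grouped by the
first frame that is not a low-level lock or syscall wrapper. The following is
only a rough Python sketch, assuming pstack output in the format above; the
names blocking_callers and WRAPPERS are made up for illustration:

```python
import re
from collections import Counter

THREAD = re.compile(r"^Thread (\d+) ")
FRAME = re.compile(r"^#\d+\s+0x[0-9a-f]+ in (\S+) \(")

# Low-level frames to skip when looking for the caller that is waiting.
WRAPPERS = {"__lll_lock_wait", "_L_lock_854", "pthread_mutex_lock", "fcntl"}


def blocking_callers(pstack_text):
    """Count threads by their first frame that is not in WRAPPERS."""
    stacks = []
    for line in pstack_text.splitlines():
        if THREAD.match(line):
            stacks.append([])  # a new thread section begins
            continue
        frame = FRAME.match(line)
        if frame and stacks:
            stacks[-1].append(frame.group(1))
    counts = Counter()
    for frames in stacks:
        caller = next((f for f in frames if f not in WRAPPERS), "<unknown>")
        counts[caller] += 1
    return counts


# Shortened sample taken from the trace (thread ids are hypothetical).
sample = """\
Thread 2 (Thread 0x1 (LWP 1)):
#0  0x00000039bb80e264 in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00000039bb809508 in _L_lock_854 () from /lib64/libpthread.so.0
#2  0x00000039bb8093d7 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x0000000000471289 in dlopen_dlz_findzonedb ()
Thread 1 (Thread 0x2 (LWP 2)):
#0  0x00000039bb80e938 in fcntl () from /lib64/libpthread.so.0
#1  0x00007f2ca6976347 in fcntl_lock () from
/usr/local/samba/lib/private/libtdb.so.1
"""

for caller, n in blocking_callers(sample).most_common():
    print(n, caller)
```

Reading the full trace by hand gives the same picture: seven threads wait on a
mutex in dlopen_dlz_findzonedb, while Thread 7 sits in fcntl_lock under
tdb_transaction_start, called from dlz_newversion.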

Any help will be very welcome.

Nikos Mitas



2014-02-25 11:36 GMT+02:00 Nikos Mitas <nkmitas at gmail.com>:

> Hi Daniel,
>
> Thanks for the suggestion.
> I will check this right now, but both DNS servers work fine until I
> start up the third DC and replication starts.
>
> Two more things to check today are:
> a) the dns_update Samba process, but I don't know if I can safely disable it
> (temporarily)
> b) disabling the DNS forwarders inside named.conf
>
> Nikos Mitas
>
>
> 2014-02-25 8:39 GMT+02:00 Daniel Müller <mueller at tropenklinik.de>:
>
>> I think some service is taking the same port as BIND!?
>> In my case, on CentOS 6.4, it was portreserve. I switched it off and
>> everything started to work.
>>
>>
>>
>>
>> EDV Daniel Müller
>>
>> Leitung EDV
>> Tropenklinik Paul-Lechler-Krankenhaus
>> Paul-Lechler-Str. 24
>> 72076 Tübingen
>> Tel.: 07071/206-463, Fax: 07071/206-499
>> eMail: mueller at tropenklinik.de
>> Internet: www.tropenklinik.de
>> "Der Mensch ist die Medizin des Menschen"
>>
>>
>>
>>
>> -----Original Message-----
>> From: samba-bounces at lists.samba.org [mailto:samba-bounces at lists.samba.org]
>> On behalf of Nikos Mitas
>> Sent: Monday, 24 February 2014 23:10
>> To: Marc Muehlfeld
>> Cc: Samba
>> Subject: Re: [Samba] Samba 4 replication causes bind dns to freeze
>>
>> Missed the last questions......
>>
>> -Which DNS server have you configured as primary in your old and new DCs
>> /etc/resolv.conf?
>>
>> On domain1, the IP of domain1.
>> On domain2, the IP of domain2.
>> On the new DC, as it does not have DNS, the IPs of both old DCs.
>>
>> -Can you query the DNS on both hosts from each other?
>>
>> Yes, I followed this wiki page:
>>
>> https://wiki.samba.org/index.php/Samba4/HOWTO/Join_a_domain_as_a_DC
>> Host resolution, GUID name resolution, etc. are working.
>>
>> -Any firewall stuff prevent from accessing port 53?
>>
>> No firewall, no SELinux.
>>
>> -Is the DLZ module for 9.9 enabled in /usr/local/samba/private/named.conf?
>> Yes, I have commented out the 9.8 line and enabled the 9.9 one.
>>
>> -What Samba version are you running and is it self compiled or from where
>> you got it?
>>
>> Self-compiled, Samba 4.1.0.
>>
>> The old DC pair has been working since October without problems.
>>
>> Thanks
>>
>> Nikos
>> On Feb 24, 2014 11:46 PM, "Marc Muehlfeld" <samba at marc-muehlfeld.de>
>> wrote:
>>
>> > Hello Nikos,
>> >
>> > On 24.02.2014 21:49, Nikos Mitas wrote:
>> >
>> >> Joined a new samba 4 dc to an existing pair of samba 4 domain
>> >> controllers, but i have a problem with replication.
>> >>
>> >
>> > Is the replication working before it hangs ('samba-tool drs showrepl')?
>> >
>> >
>> >
>> >
>> >> 5-10 minutes after starting samba services on the new samba 4 server,
>> >> both DNS servers on the old domain controllers freeze. Nothing works.
>> >>
>> >> All I can do is kill all the services (ntp, named, samba) and start
>> >> over again.
>> >>
>> >> I get this message on the new DC in this file:
>> >> /usr/local/samba/var/log.samba:
>> >> ....
>> >> dreplsrv_notify: Failed to send DsReplicaSync to
>> >> 4d2038d4-3b1c-41a8-9865-142f7e9cadba._msdcs.example.com for
>> >> DC=example,DC=com - NT_STATUS_IO_TIMEOUT : WERR_SEM_TIMEOUT .....
>> >> environment:
>> >> Redhat 6.5
>> >> Bind with dlz v9.9.5
>> >>
>> >
>> > Which DNS server have you configured as primary in your old and new
>> > DCs /etc/resolv.conf?
>> >
>> > Can you query the DNS on both hosts from each other?
>> >
>> > Any firewall stuff prevent from accessing port 53?
>> >
>> > Is the DLZ module for 9.9 enabled in
>> > /usr/local/samba/private/named.conf?
>> >
>> > What Samba version are you running and is it self compiled or from
>> > where you got it?
>> >
>> >
>> > Regards,
>> > Marc
>> >
>> >

