[Samba] Non Functioning Internal DNS - Samba4

Donaldson Jeff Jeff.Donaldson at ncs.k12.de.us
Fri Oct 31 07:55:31 MDT 2014


Recently one of my Samba4 (version 4.2.0) domain controllers started acting up. Authentication against it would time out and fail, but until recently its internal DNS was still working. Now the internal DNS fails as well. If I run nslookup, set the server to this DC, and look up any hostname, I get "connection timed out; no servers could be reached". This DC is my primary and holds all FSMO roles. I need to get it working again so that I can seize those roles on one of my other DCs. Here are some of the things I found while troubleshooting.

If I nslookup the IP address of my primary DC from one of my other servers, I get two records:

25.2.xxx.xx.in-addr.arpa          name = hostname.
25.2.xxx.xx.in-addr.arpa          name = FQDN.
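
To make the two answers easier to compare, here is a small sketch of how a reverse-lookup name is formed and how the short PTR target differs from the fully qualified one. The address 10.1.2.25 and the names dc1 and example.lan are placeholders, not values from my network.

```python
import ipaddress

def ptr_name(ip: str) -> str:
    """Return the reverse-lookup (PTR) owner name for an IP address."""
    return ipaddress.ip_address(ip).reverse_pointer

def is_single_label(target: str) -> bool:
    """True if a PTR target such as 'dc1.' has no domain part."""
    return "." not in target.rstrip(".")

# 10.1.2.25, dc1 and example.lan are made-up placeholders.
print(ptr_name("10.1.2.25"))
for target in ("dc1.", "dc1.example.lan."):
    kind = "single-label" if is_single_label(target) else "fully qualified"
    print(target, kind)
```

The "hostname." answer corresponds to the single-label case: a PTR target that ends at the root with no domain part.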

When I look up my other DCs, I get only the FQDN. When I found this, I tried to use samba-tool to delete the "hostname." record, but I got a message that the record doesn't exist. If I then run samba-tool dns serverinfo hostname, I get the following error:

ERROR(runtime): uncaught exception - (-1073741643, 'NT_STATUS_IO_TIMEOUT')
  File "/usr/local/samba/lib/python2.7/site-packages/samba/netcmd/__init__.py", line 175, in _run
    return self.run(*args, **kwargs)
  File "/usr/local/samba/lib/python2.7/site-packages/samba/netcmd/dns.py", line 703, in run
    dns_conn = dns_connect(server, self.lp, self.creds)
  File "/usr/local/samba/lib/python2.7/site-packages/samba/netcmd/dns.py", line 37, in dns_connect
    dns_conn = dnsserver.dnsserver(binding_str, lp, creds)
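
The number in the first line of the traceback is the NTSTATUS code printed as a signed 32-bit integer. A minimal sketch of converting it to the hexadecimal form normally used when searching for status codes (the function name is my own):

```python
def ntstatus_hex(code: int) -> str:
    """Convert a signed 32-bit NTSTATUS value to 0x-prefixed hex."""
    # Mask to 32 bits so a negative value maps to its unsigned form.
    return "0x{:08X}".format(code & 0xFFFFFFFF)

print(ntstatus_hex(-1073741643))  # 0xC00000B5
```

0xC00000B5 is the value listed for STATUS_IO_TIMEOUT, which matches the name in the error. The traceback stops in dns_connect, so the timeout happens while connecting to the server, before any record is touched.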

I then checked sam.ldb to see how the record is stored, using ldbedit --url=sam.ldb. Its record has at least 10 additional servicePrincipalName lines pointing to an old, orphaned DC that I had to remove manually using ADSI and AD Sites and Services several months back. They are somehow attached to the primary DC's record in sam.ldb now. Could this be causing the DNS failure? If so, what if I took each DC down (over a weekend, of course) and manually edited the record in sam.ldb on each one, making sure that only the DC being edited was up at a time, and then brought them all back online once the changes were complete? The record would then be identical on all DCs, so replication shouldn't cause any further damage.
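
For reference, this is roughly how the stray servicePrincipalName values could be listed from an LDIF dump of the record without changing anything. It is only a sketch: the sample record and the host names dc1 and olddc are invented, it uses a plain substring match (so "olddc" would also match "olddc2"), and it does not decode base64 ("::") values.

```python
def stale_spns(ldif_text: str, old_host: str) -> list:
    """Return servicePrincipalName values that mention old_host."""
    hits = []
    for line in ldif_text.splitlines():
        attr, sep, value = line.partition(":")
        if sep and attr.strip().lower() == "serviceprincipalname":
            value = value.strip()
            if old_host.lower() in value.lower():
                hits.append(value)
    return hits

# Hypothetical record; dc1, olddc and example.lan are placeholders.
sample = """\
dn: CN=DC1,OU=Domain Controllers,DC=example,DC=lan
servicePrincipalName: HOST/dc1.example.lan
servicePrincipalName: HOST/olddc.example.lan
servicePrincipalName: ldap/olddc.example.lan/example.lan
"""
for spn in stale_spns(sample, "olddc"):
    print(spn)
```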

Oddly enough, despite all of this, I can still connect to this DC with DNS Manager. It's really slow, but I can see all of the records, and I even attempted to delete the PTR record for the odd hostname. I got a similar error saying that the record does not exist. I can only assume that nslookup hits a timeout when querying DNS that DNS Manager doesn't hit.
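
One way to test that assumption would be to send a plain DNS query with a much longer timeout than nslookup uses and see whether the DC ever answers. As far as I know, DNS Manager manages the server over RPC rather than sending ordinary DNS queries, so the two may not fail in the same way. The sketch below uses only the Python standard library; the function names, the query ID, and the example address and name in the usage note are all placeholders.

```python
import socket
import struct

def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet (qtype 1 = A record)."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, ended by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)  # class 1 = IN

def probe_udp(server: str, name: str, timeout: float = 5.0) -> bool:
    """Return True if the server sends any UDP reply within timeout."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(build_query(name), (server, 53))
            sock.recvfrom(512)
            return True
        except OSError:  # includes socket timeouts
            return False
```

For example, probe_udp("192.0.2.10", "dc1.example.lan", timeout=30.0) would report whether any reply arrives within 30 seconds. It does not parse the reply, so it only distinguishes "slow" from "not answering at all".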

Is there anything else I may be missing in troubleshooting this problem? If needed I can provide info from resolv.conf and hosts. Any help is appreciated.



Jeff Donaldson
Technology Director
Newark Charter School
jeff.donaldson at ncs.k12.de.us
(302) 369-2001 ext: 425
