[Samba] Domain member cannot authenticate when first domain controller is down

Dale samba at txschroeder.family
Fri Mar 5 04:28:30 UTC 2021



On 3/4/21 1:46 PM, Rowland penny via samba wrote:
> On 04/03/2021 17:39, Dale via samba wrote:
>>
>> I'm very open to suggestions.
>>
>
> OK, I tested this on my small domain, from an rpi running 4.13.4. I 
> did not change anything except for resolv.conf, which I changed to this:
>
> # wait 2 seconds : default 5 seconds
> options timeout:2
> # make 1 attempt before trying next nameserver : default 2
> options attempts:1
> # round robin nameservers
> #options rotate
> search samdom.example.com
> nameserver 192.168.0.8
> nameserver 192.168.0.6
>
> I commented out 'rotate' because it round-robins the nameservers, 
> something I didn't want to happen.
>
> Also 192.168.0.8 is dc01.samdom.example.com and 192.168.0.6 is 
> dc4.samdom.example.com
>
> Ran this command on the rpi:
>
> time host -v -t SRV _ldap._tcp.samdom.example.com.
>
> And got this output:
>
> Trying "_ldap._tcp.samdom.example.com"
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 53889
> ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 2, ADDITIONAL: 2
>
> ;; QUESTION SECTION:
> ;_ldap._tcp.samdom.example.com.    IN    SRV
>
> ;; ANSWER SECTION:
> _ldap._tcp.samdom.example.com. 900 IN    SRV    0 100 389 dc4.samdom.example.com.
> _ldap._tcp.samdom.example.com. 900 IN    SRV    0 100 389 dc01.samdom.example.com.
>
> ;; AUTHORITY SECTION:
> samdom.example.com.    900    IN    NS    dc4.samdom.example.com.
> samdom.example.com.    900    IN    NS    dc01.samdom.example.com.
>
> ;; ADDITIONAL SECTION:
> dc4.samdom.example.com.    900    IN    A    192.168.0.6
> dc01.samdom.example.com. 900    IN    A    192.168.0.8
>
> Received 192 bytes from 192.168.0.8#53 in 78 ms
>
> real    0m0.153s
> user    0m0.038s
> sys        0m0.038s
>
> So far, so good.
>
> I then turned off bind9 on dc01 and ran the command again, this time 
> the output was:
>
> Trying "_ldap._tcp.samdom.example.com"
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 63152
> ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
>
> ;; QUESTION SECTION:
> ;_ldap._tcp.samdom.example.com.    IN    SRV
>
> ;; ANSWER SECTION:
> _ldap._tcp.samdom.example.com. 900 IN    SRV    0 100 389 dc4.samdom.example.com.
> _ldap._tcp.samdom.example.com. 900 IN    SRV    0 100 389 dc01.samdom.example.com.
>
> Received 132 bytes from 192.168.0.6#53 in 6 ms
>
> real    0m1.074s
> user    0m0.031s
> sys      0m0.041s
>
> As you can see, this time dc4 replied and fairly quickly.
>
> I think you may have missing or incorrect records for DC2, I will try 
> and come up with something to check your records.
>
> Rowland

Running the same commands that you did, I have good news and what I 
think might be bad news.

Good - Using the resolv.conf option values that you have (no rotate), I 
was able to log into other member servers fairly quickly.  A "getent 
user" took a little longer, but was acceptable.

Bad - Running the "time host..." command that you used returns only two 
sections, QUESTION and ANSWER.  There are no AUTHORITY or ADDITIONAL 
sections.  I don't know how essential they are.

_*Client resolv.conf*_
The client is LMDE4 and Samba is 4.13.4 from Louis' repo.
[I keep the values consistent under resolvconf by editing 
/etc/resolvconf/resolv.conf.d/base, which produces the /etc/resolv.conf 
shown below.]

# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
#     DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 192.168.0.7
nameserver 192.168.0.8
search workgroup.realm.tld
options timeout:2
options attempts:1
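
To check each DC's records separately rather than relying on resolver 
failover, something like the following might help. It is only a sketch: 
the function names list_nameservers and check_srv_per_nameserver are 
made up for illustration, and it assumes the "host" utility is installed 
and that the nameserver lines are the DCs you want to compare.

```shell
# Print the nameserver addresses from a resolv.conf-style file, in file order.
# $1 = path to the file (for example /etc/resolv.conf).
list_nameservers() {
    awk '$1 == "nameserver" { print $2 }' "$1"
}

# Ask every listed nameserver for the same SRV record, one server at a time,
# so the answers from each DC can be compared side by side.
# $1 = resolv.conf path, $2 = SRV name to look up.
check_srv_per_nameserver() {
    for ns in $(list_nameservers "$1"); do
        echo "== $ns =="
        host -t SRV "$2" "$ns"
    done
}
```

It could be run as: check_srv_per_nameserver /etc/resolv.conf _ldap._tcp.workgroup.realm.tld.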

_*Both DCs on the network*_
Trying "_ldap._tcp.workgroup.realm.tld"
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 48104
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;_ldap._tcp.workgroup.realm.tld.	IN SRV

;; ANSWER SECTION:
_ldap._tcp.workgroup.realm.tld.	900 IN SRV 0 100 389 dc1.workgroup.realm.tld.
_ldap._tcp.workgroup.realm.tld.	900 IN SRV 0 100 389 dc2.workgroup.realm.tld.

Received 158 bytes from 192.168.0.7#53 in 6 ms

real	0m0.025s
user	0m0.010s
sys	0m0.010s

_*Ethernet cable unplugged from DC1*_
Trying "_ldap._tcp.workgroup.realm.tld"
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10495
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;_ldap._tcp.workgroup.realm.tld.	IN SRV

;; ANSWER SECTION:
_ldap._tcp.workgroup.realm.tld.	900 IN SRV 0 100 389 dc1.workgroup.realm.tld.
_ldap._tcp.workgroup.realm.tld.	900 IN SRV 0 100 389 dc2.workgroup.realm.tld.

Received 158 bytes from 192.168.0.8#53 in 8 ms

real	0m1.032s
user	0m0.020s
sys	0m0.005s
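
The "Received ... from" line is what shows which nameserver answered. A 
small filter along these lines could pull that address out when repeating 
the test. This is a sketch only: answering_server is a hypothetical 
helper name, and it assumes the line format shown in the "host -v" output 
above.

```shell
# Read "host -v" output on stdin and print the IP address of the server
# that answered, taken from the "Received N bytes from IP#PORT in T ms" line.
answering_server() {
    sed -n 's/^Received [0-9]* bytes from \([^#]*\)#[0-9]* in .*$/\1/p'
}
```

For example: host -v -t SRV _ldap._tcp.workgroup.realm.tld. | answering_server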

So, failover now appears to work acceptably, but I can't explain why the 
AUTHORITY and ADDITIONAL sections are missing from the "time host..." 
command results.

Dale



