[Samba] Domain member cannot authenticate when first domain controller is down
Dale
samba at txschroeder.family
Fri Mar 5 04:28:30 UTC 2021
On 3/4/21 1:46 PM, Rowland penny via samba wrote:
> On 04/03/2021 17:39, Dale via samba wrote:
>>
>> I'm very open to suggestions.
>>
>
> OK, I tested this on my small domain, from an rpi running 4.13.4. I
> did not change anything except for resolv.conf, which I changed to this:
>
> # wait 2 seconds : default 5 seconds
> options timeout:2
> # make 1 attempt before trying next nameserver : default 2
> options attempts:1
> # round robin nameservers
> #options rotate
> search samdom.example.com
> nameserver 192.168.0.8
> nameserver 192.168.0.6
>
> I commented 'rotate' because it round robins nameservers, something I
> didn't want to happen.
>
> Also 192.168.0.8 is dc01.samdom.example.com and 192.168.0.6 is
> dc4.samdom.example.com
>
> Ran this command on the rpi:
>
> time host -v -t SRV _ldap._tcp.samdom.example.com.
>
> And got this output:
>
> Trying "_ldap._tcp.samdom.example.com"
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 53889
> ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 2, ADDITIONAL: 2
>
> ;; QUESTION SECTION:
> ;_ldap._tcp.samdom.example.com. IN SRV
>
> ;; ANSWER SECTION:
> _ldap._tcp.samdom.example.com. 900 IN SRV 0 100 389
> dc4.samdom.example.com.
> _ldap._tcp.samdom.example.com. 900 IN SRV 0 100 389
> dc01.samdom.example.com.
>
> ;; AUTHORITY SECTION:
> samdom.example.com. 900 IN NS dc4.samdom.example.com.
> samdom.example.com. 900 IN NS dc01.samdom.example.com.
>
> ;; ADDITIONAL SECTION:
> dc4.samdom.example.com. 900 IN A 192.168.0.6
> dc01.samdom.example.com. 900 IN A 192.168.0.8
>
> Received 192 bytes from 192.168.0.8#53 in 78 ms
>
> real 0m0.153s
> user 0m0.038s
> sys 0m0.038s
>
> So far, so good.
>
> I then turned off bind9 on dc01 and ran the command again, this time
> the output was:
>
> Trying "_ldap._tcp.samdom.example.com"
> ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 63152
> ;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
>
> ;; QUESTION SECTION:
> ;_ldap._tcp.samdom.example.com. IN SRV
>
> ;; ANSWER SECTION:
> _ldap._tcp.samdom.example.com. 900 IN SRV 0 100 389
> dc4.samdom.example.com.
> _ldap._tcp.samdom.example.com. 900 IN SRV 0 100 389
> dc01.samdom.example.com.
>
> Received 132 bytes from 192.168.0.6#53 in 6 ms
>
> real 0m1.074s
> user 0m0.031s
> sys 0m0.041s
>
> As you can see, this time dc4 replied and fairly quickly.
>
> I think you may have missing or incorrect records for DC2, I will try
> and come up with something to check your records.
>
> Rowland
Running the same commands that you did, I have good news and what I
think might be bad news.

Good - Using your resolv.conf option values (no rotate), I was able to
log into other member servers fairly quickly. A "getent user" took a
little longer, but was acceptable.

Bad - The "time host..." command that you used returns only two
sections, QUESTION and ANSWER. There is no AUTHORITY or ADDITIONAL
section. I don't know how essential that is.
_*Client resolv.conf*_
The client is LMDE4 and Samba is 4.13.4 from Louis' repo.
[I get consistent values from resolvconf by editing
/etc/resolvconf/resolv.conf.d/base to get the values shown below in
/etc/resolv.conf.]
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
# DO NOT EDIT THIS FILE BY HAND -- YOUR CHANGES WILL BE OVERWRITTEN
nameserver 192.168.0.7
nameserver 192.168.0.8
search workgroup.realm.tld
options timeout:2
options attempts:1
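
To confirm that the generated /etc/resolv.conf really carries these values, the file can be parsed and its settings printed. The following is only a minimal Python sketch: the function name parse_resolv_conf is hypothetical, and the sample text simply repeats the values shown above rather than reading the live file.

```python
def parse_resolv_conf(text):
    """Collect nameservers, search domains and options from resolv.conf text."""
    conf = {"nameservers": [], "search": [], "options": {}}
    for raw in text.splitlines():
        line = raw.strip()
        # Lines starting with '#' or ';' are comments in resolv.conf(5).
        if not line or line[0] in "#;":
            continue
        keyword, *args = line.split()
        if keyword == "nameserver" and args:
            conf["nameservers"].append(args[0])
        elif keyword == "search":
            conf["search"] = args
        elif keyword == "options":
            for opt in args:
                # "timeout:2" becomes ("timeout", 2); a bare flag such as
                # "rotate" is stored as True.
                name, _, value = opt.partition(":")
                conf["options"][name] = int(value) if value.isdigit() else True
    return conf


# Sample text copied from the resolv.conf shown above (an assumption:
# on a real client you would read /etc/resolv.conf instead).
sample = """\
# Dynamic resolv.conf(5) file for glibc resolver(3) generated by resolvconf(8)
nameserver 192.168.0.7
nameserver 192.168.0.8
search workgroup.realm.tld
options timeout:2
options attempts:1
"""

conf = parse_resolv_conf(sample)
print(conf["nameservers"])
print(conf["options"])
print("rotate enabled:", "rotate" in conf["options"])
```

The order of the nameservers list matters here: without rotate, the resolver tries them in the order listed, so the first entry is the one whose failure triggers the timeout.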
_*Both DCs on the network*_
Trying "_ldap._tcp.workgroup.realm.tld"
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 48104
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;_ldap._tcp.workgroup.realm.tld. IN SRV
;; ANSWER SECTION:
_ldap._tcp.workgroup.realm.tld. 900 IN SRV 0 100 389 dc1.workgroup.realm.tld.
_ldap._tcp.workgroup.realm.tld. 900 IN SRV 0 100 389 dc2.workgroup.realm.tld.
Received 158 bytes from 192.168.0.7#53 in 6 ms
real 0m0.025s
user 0m0.010s
sys 0m0.010s
_*Ethernet cable unplugged from DC1*_
Trying "_ldap._tcp.workgroup.realm.tld"
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 10495
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
;; QUESTION SECTION:
;_ldap._tcp.workgroup.realm.tld. IN SRV
;; ANSWER SECTION:
_ldap._tcp.workgroup.realm.tld. 900 IN SRV 0 100 389 dc1.workgroup.realm.tld.
_ldap._tcp.workgroup.realm.tld. 900 IN SRV 0 100 389 dc2.workgroup.realm.tld.
Received 158 bytes from 192.168.0.8#53 in 8 ms
real 0m1.032s
user 0m0.020s
sys 0m0.005s
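
To compare runs like the two above without reading them by eye, the section counts and the responding server can be pulled out of the "host -v" output. This is a rough Python sketch, not something from the original test: the function name summarize is hypothetical, and the two sample strings are just the relevant lines copied from the output above.

```python
import re

# Matches the counts on the ";; flags:" line of "host -v" output.
COUNTS_RE = re.compile(
    r"QUERY: (\d+), ANSWER: (\d+), AUTHORITY: (\d+), ADDITIONAL: (\d+)"
)
# Matches the "Received N bytes from <ip>#<port>" line.
SERVER_RE = re.compile(r"Received \d+ bytes from\s+([0-9.]+)#(\d+)")


def summarize(output):
    """Return section counts and the answering server, or None if not found."""
    counts = COUNTS_RE.search(output)
    server = SERVER_RE.search(output)
    if counts is None or server is None:
        return None
    return {
        "answer": int(counts.group(2)),
        "authority": int(counts.group(3)),
        "additional": int(counts.group(4)),
        "server": server.group(1),
    }


# Relevant lines from the run with both DCs on the network.
both_up = """\
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
Received 158 bytes from 192.168.0.7#53 in 6 ms
"""

# Relevant lines from the run with DC1 unplugged.
dc1_down = """\
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 0
Received 158 bytes from 192.168.0.8#53 in 8 ms
"""

for label, text in (("both up", both_up), ("DC1 down", dc1_down)):
    print(label, summarize(text))
```

Feeding it the real output (for example by piping the command into a script that reads standard input) would show at a glance which DC answered and whether the AUTHORITY and ADDITIONAL counts differ between servers.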
So, failover now appears to work acceptably, but I can't explain why
the AUTHORITY and ADDITIONAL sections are missing from the first
"time host..." command results.
Dale