[Samba] BIND9.8 DLZ performance issue
Arthur Ramsey
arthur_ramsey at mediture.com
Fri Oct 7 16:12:08 UTC 2016
I'm hoping the issue is just load balancing, but I'm not sure. I can't
seem to get the traffic balanced across the two DCs.
I ran this script on all Linux nodes to balance the traffic.
#!/usr/bin/perl
use strict;
use warnings;

my $primary_name_server;
my $random = int(rand(10));

# Find the first nameserver currently listed in /etc/resolv.conf.
open(my $resolv_conf_fh, '<', '/etc/resolv.conf') or die("Unable to open /etc/resolv.conf for reading: $!");
while (<$resolv_conf_fh>) {
    chomp;
    if ($_ =~ /nameserver (.*)/) {
        $primary_name_server = $1;
        last;
    }
}
close($resolv_conf_fh);

# Only rewrite the configuration on nodes that point at one of the two DCs
# (or have no nameserver configured at all).
if (! defined($primary_name_server) || $primary_name_server eq '192.168.168.64' || $primary_name_server eq '192.168.168.65') {
    # Randomize the nameserver order so roughly 60% of nodes list .64 first
    # and 40% list .65 first.
    open(my $resolv_conf_fh, '>', '/etc/resolv.conf') or die("Unable to open /etc/resolv.conf for writing: $!");
    print $resolv_conf_fh "search mediture.dom\n";
    print $resolv_conf_fh "options rotate timeout:1\n";
    if ($random >= 4) {
        print $resolv_conf_fh "nameserver 192.168.168.64\n";
        print $resolv_conf_fh "nameserver 192.168.168.65\n";
    } else {
        print $resolv_conf_fh "nameserver 192.168.168.65\n";
        print $resolv_conf_fh "nameserver 192.168.168.64\n";
    }
    close($resolv_conf_fh);

    # Nodes with winbind installed also get krb5.conf and smb.conf rewritten
    # with a randomized KDC / password server ordering.
    if (-f '/usr/bin/wbinfo') {
        open(my $krb5_conf_fh, '>', '/etc/krb5.conf') or die("Unable to open /etc/krb5.conf for writing: $!");
        print $krb5_conf_fh q([logging]
default = FILE:/var/log/krb5libs.log
kdc = FILE:/var/log/krb5kdc.log
admin_server = FILE:/var/log/kadmind.log
default_realm = MEDITURE.DOM
[libdefaults]
default_realm = MEDITURE.DOM
dns_lookup_realm = false
dns_lookup_kdc = false
ticket_lifetime = 24h
renew_lifetime = 7d
forwardable = true
default_keytab_name = FILE:/etc/krb5.keytab
[realms]
MEDITURE.DOM = {);
        if ($random >= 4) {
            print $krb5_conf_fh " kdc = dc01.mediture.dom\n";
            print $krb5_conf_fh " kdc = dc03.mediture.dom\n";
            print $krb5_conf_fh " kdc = dc02.mediture.dom\n";
            print $krb5_conf_fh " kdc = dc04.mediture.dom\n";
        } else {
            print $krb5_conf_fh " kdc = dc03.mediture.dom\n";
            print $krb5_conf_fh " kdc = dc01.mediture.dom\n";
            print $krb5_conf_fh " kdc = dc04.mediture.dom\n";
            print $krb5_conf_fh " kdc = dc02.mediture.dom\n";
        }
        print $krb5_conf_fh q( default_realm = MEDITURE.DOM
}
[domain_realm]
mediture.dom = MEDITURE.DOM
.mediture.dom = MEDITURE.DOM);
        close($krb5_conf_fh);

        open(my $smb_conf_fh, '>', '/etc/samba/smb.conf') or die("Unable to open /etc/samba/smb.conf for writing: $!");
        print $smb_conf_fh q([global]
#--authconfig--start-line--
workgroup = MEDITURE
password server = );
        if ($random >= 4) {
            print $smb_conf_fh 'dc01.mediture.dom ';
            print $smb_conf_fh 'dc03.mediture.dom ';
            print $smb_conf_fh 'dc02.mediture.dom ';
            print $smb_conf_fh 'dc04.mediture.dom';
        } else {
            print $smb_conf_fh 'dc03.mediture.dom ';
            print $smb_conf_fh 'dc01.mediture.dom ';
            print $smb_conf_fh 'dc04.mediture.dom ';
            print $smb_conf_fh 'dc02.mediture.dom';
        }
        print $smb_conf_fh q(
realm = MEDITURE.DOM
security = ads
template homedir = /home/%U
template shell = /bin/bash
winbind use default domain = true
#--authconfig--end-line--
server string = Samba Server Version %v
# logs split per machine
log file = /var/log/samba/log.%m
# max 50KB per log file, then rotate
max log size = 50
passdb backend = tdbsam
winbind refresh tickets = yes
winbind offline logon = yes
winbind use default domain = yes
winbind nss info = rfc2307
winbind enum users = yes
winbind enum groups = yes
winbind nested groups = yes
kerberos method = secrets and keytab
idmap config *: backend = tdb
idmap config *: range = 90000001-100000000
idmap config MEDITURE: backend = ad
idmap config MEDITURE: range = 10000-49999
idmap config MEDITURE: schema mode = rfc2307);
        close($smb_conf_fh);
    }
}
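For what it's worth, the int(rand(10)) >= 4 test above splits the nodes roughly 60/40 rather than 50/50, since values 4 through 9 (6 of 10) take the first branch. A quick, purely illustrative sanity check of that split (not part of the deployment script):

#!/usr/bin/perl
# Illustrative only: confirm the split produced by int(rand(10)) >= 4.
use strict;
use warnings;

my %branch = (first => 0, second => 0);
my $trials = 100_000;
for (1 .. $trials) {
    my $random = int(rand(10));
    $branch{ $random >= 4 ? 'first' : 'second' }++;
}
printf "first ordering: %.1f%%  second ordering: %.1f%%\n",
    100 * $branch{first} / $trials, 100 * $branch{second} / $trials;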
I also have AD sites set up and have manually configured the SRV record
weights to perform load balancing (see the selection sketch after the dig
output below).
$ dig +short srv _ldap._tcp.vsc._sites.dc._msdcs.mediture.dom
0 50 389 dc02.mediture.dom.
0 25 389 dc04.mediture.dom.
0 100 389 dc01.mediture.dom.
0 100 389 dc03.mediture.dom.
$ dig +short srv _ldap._tcp.aws._sites.dc._msdcs.mediture.dom
0 25 389 dc02.mediture.dom.
0 100 389 dc04.mediture.dom.
0 50 389 dc01.mediture.dom.
0 50 389 dc03.mediture.dom.
$ dig +short srv _ldap._tcp.epo._sites.dc._msdcs.mediture.dom
0 25 389 dc04.mediture.dom.
0 100 389 DC02.mediture.dom.
0 50 389 dc01.mediture.dom.
0 50 389 dc03.mediture.dom.
$ dig +short srv _ldap._tcp.Default-First-Site-Name._sites.dc._msdcs.mediture.dom
0 100 389 dc01.mediture.dom.
0 100 389 dc03.mediture.dom.
$ dig +short srv _ldap._tcp.vsc._sites.mediture.dom
0 100 389 dc01.mediture.dom.
0 100 389 dc03.mediture.dom.
0 50 389 dc02.mediture.dom.
0 25 389 dc04.mediture.dom.
$ dig +short srv _ldap._tcp.aws._sites.mediture.dom
0 100 389 dc04.mediture.dom.
0 50 389 dc01.mediture.dom.
0 50 389 dc03.mediture.dom.
0 25 3268 dc02.mediture.dom.
$ dig +short srv _ldap._tcp.epo._sites.mediture.dom
0 25 389 dc04.mediture.dom.
0 100 389 dc02.mediture.dom.
0 50 389 dc01.mediture.dom.
0 50 389 dc03.mediture.dom.
$ dig +short srv _ldap._tcp.Default-First-Site-Name._sites.mediture.dom
0 100 389 dc04.mediture.dom.
0 100 389 dc01.mediture.dom.
0 100 389 dc02.mediture.dom.
0 100 389 dc03.mediture.dom.
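Assuming clients actually honor the SRV weights (RFC 2782 weighted selection within a priority class), the vsc records above should spread new connections roughly in proportion to weight: about 36% each to dc01 and dc03, 18% to dc02 and 9% to dc04. A minimal sketch of that selection logic, using the vsc weights, just to show what I would expect the distribution to look like:

#!/usr/bin/perl
# Illustrative only: RFC 2782-style weighted pick among SRV records
# that share the same priority, using the vsc site weights above.
use strict;
use warnings;

my %weight = (
    'dc01.mediture.dom' => 100,
    'dc03.mediture.dom' => 100,
    'dc02.mediture.dom' => 50,
    'dc04.mediture.dom' => 25,
);

sub weighted_pick {
    my $total = 0;
    $total += $_ for values %weight;
    my $roll = rand($total);
    for my $host (sort keys %weight) {
        return $host if ($roll -= $weight{$host}) < 0;
    }
}

my %hits;
my $trials = 100_000;
$hits{ weighted_pick() }++ for 1 .. $trials;
printf "%-22s %5.1f%%\n", $_, 100 * $hits{$_} / $trials for sort keys %hits;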
I'm not seeing balanced traffic though.
[root at dc01 ~]# netstat -an | grep 445 | grep -c ESTABLISHED
164
[root at dc03 ~]# netstat -an | grep 445 | grep -c ESTABLISHED
10
[root at dc01 ~]# netstat -an | grep 88 | grep -c ESTABLISHED
20
[root at dc03 ~]# netstat -an | grep 88 | grep -c ESTABLISHED
2
[root at dc01 ~]# netstat -an | grep 389 | grep -c ESTABLISHED
175
[root at dc03 ~]# netstat -an | grep 389 | grep -c ESTABLISHED
23
[root at dc01 ~]# netstat -an | grep 636 | grep -c ESTABLISHED
3
[root at dc03 ~]# netstat -an | grep 636 | grep -c ESTABLISHED
7
[root at dc01 ~]# netstat -an | grep 53 | grep -c ESTABLISHED
42
[root at dc03 ~]# netstat -an | grep 53 | grep -c ESTABLISHED
6
I only have a handful of Windows instances joined to the domain at that
site, VSC, but over 100 Linux nodes.
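A rough way to see where the Linux nodes have actually ended up (as opposed to what resolv.conf and smb.conf happen to list first) would be to run something like this on each node and tally the output centrally. This is only a sketch: it assumes the net tool from the Samba client packages is installed, and the exact output fields of net ads info may vary by version.

#!/usr/bin/perl
# Rough sketch: report which DC this node is currently bound to,
# according to 'net ads info'. Run on each node and tally centrally.
use strict;
use warnings;

my @info = `net ads info 2>/dev/null`;
for my $line (@info) {
    # The "LDAP server" / "LDAP server name" lines show the DC in use.
    print $line if $line =~ /^LDAP server/i;
}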
Thanks,
Arthur
On 09/29/2016 10:16 AM, Arthur Ramsey wrote:
> Hello,
>
> I'm running Samba 4.5.0 and bind-9.8.2-0.47.rc1.el6_8.1. One DC of
> four, the PDC, is orders of magnitude slower when running
> /usr/local/samba/sbin/samba_dnsupdate --verbose --all-names. While
> that is running on that DC it seems to block all queries. The load
> average is usually under 0.5. The DC was unsafely halted, which could
> have corrupted something. I ran a dbcheck with samba-tool and it came
> back clean other than the expected cleanup after upgrading to 4.5.0.
> Are there any caches or similar that I could try clearing for BIND?
> Usually at least once a day memory usage climbs from the typical ~1
> GB to everything the box has, 8 GB physical and 10 GB swap, requiring
> a forceful restart, so there appears to be a memory leak as well.
> When memory usage is high, it comes from the smbd processes, which I
> wouldn't expect to be related to BIND. Could it be that, rather than
> a memory leak, the blocking seen with DNS queries is also blocking
> smb clients, resulting in a pile-up of connections and high memory
> usage? The load under this condition is very high, but that is due to
> high I/O and CPU usage from swapping. I had similar behavior with
> 4.4.5, but it was fine for the first couple of weeks after the
> upgrade.
>
> Thanks,
> Arthur