[Samba] BIND9.8 DLZ performance issue
Arthur Ramsey
arthur_ramsey at mediture.com
Thu Oct 13 15:11:25 UTC 2016
I got core dumps when the issue was happening. Here are the backtraces:
http://pastebin.com/N0e2fsSQ.
Seems to be TDB contention?
Thanks,
Arthur
On 10/7/2016 11:12 AM, Arthur Ramsey wrote:
>
> I'm hoping the issue is just load balancing, but I'm not sure. I can't
> seem to get the traffic balanced across two DCs.
>
> I ran this script on all Linux nodes to balance the traffic.
>
> #!/usr/bin/perl
> use strict;
> use warnings;
>
> my $primary_name_server;
> my $random = int(rand(10));
>
> open(my $resolv_conf_fh, '< /etc/resolv.conf') or die("Unable to open /etc/resolv.conf for reading: $!");
> while(<$resolv_conf_fh>) {
> chomp;
> if ($_ =~ /nameserver (.*)/) {
> $primary_name_server = $1;
> last;
> }
> }
> close($resolv_conf_fh);
>
> if (! defined($primary_name_server) || $primary_name_server eq '192.168.168.64' || $primary_name_server eq '192.168.168.65') {
> open(my $resolv_conf_fh, '> /etc/resolv.conf') or die("Unable to open /etc/resolv.conf for writing: $!");
> print $resolv_conf_fh "search mediture.dom\n";
> print $resolv_conf_fh "options rotate timeout:1\n";
> if ($random >= 4) {
> print $resolv_conf_fh "nameserver 192.168.168.64\n";
> print $resolv_conf_fh "nameserver 192.168.168.65\n";
> } else {
> print $resolv_conf_fh "nameserver 192.168.168.65\n";
> print $resolv_conf_fh "nameserver 192.168.168.64\n";
> }
> close($resolv_conf_fh);
>
> if (-f '/usr/bin/wbinfo') {
> open(my $krb5_conf_fh, '> /etc/krb5.conf') or die("Unable to open /etc/krb5.conf for writing: $!");
> print $krb5_conf_fh q([logging]
> default =FILE:/var/log/krb5libs.log
> kdc =FILE:/var/log/krb5kdc.log
> admin_server =FILE:/var/log/kadmind.log
> default_realm = MEDITURE.DOM
>
> [libdefaults]
> default_realm = MEDITURE.DOM
> dns_lookup_realm = false
> dns_lookup_kdc = false
> ticket_lifetime = 24h
> renew_lifetime = 7d
> forwardable = true
> default_keytab_name =FILE:/etc/krb5.keytab
>
> [realms]
> MEDITURE.DOM = {);
> if ($random >= 4) {
> print $krb5_conf_fh " kdc = dc01.mediture.dom\n";
> print $krb5_conf_fh " kdc = dc03.mediture.dom\n";
> print $krb5_conf_fh " kdc = dc02.mediture.dom\n";
> print $krb5_conf_fh " kdc = dc04.mediture.dom\n";
> } else {
> print $krb5_conf_fh " kdc = dc03.mediture.dom\n";
> print $krb5_conf_fh " kdc = dc01.mediture.dom\n";
> print $krb5_conf_fh " kdc = dc04.mediture.dom\n";
> print $krb5_conf_fh " kdc = dc02.mediture.dom\n";
> }
> print $krb5_conf_fh q( default_realm = MEDITURE.DOM
> }
>
> [domain_realm]
> mediture.dom = MEDITURE.DOM
> .mediture.dom = MEDITURE.DOM);
> close($krb5_conf_fh);
>
> open(my $smb_conf_fh, '> /etc/samba/smb.conf') or die("Unable to open /etc/samba/smb.conf for writing: $!");
> print $smb_conf_fh q([global]
> #--authconfig--start-line--
> workgroup = MEDITURE
> password server = );
> if ($random >= 4) {
> print $smb_conf_fh 'dc01.mediture.dom ';
> print $smb_conf_fh 'dc03.mediture.dom ';
> print $smb_conf_fh 'dc02.mediture.dom ';
> print $smb_conf_fh 'dc04.mediture.dom';
> } else {
> print $smb_conf_fh 'dc03.mediture.dom ';
> print $smb_conf_fh 'dc01.mediture.dom ';
> print $smb_conf_fh 'dc04.mediture.dom ';
> print $smb_conf_fh 'dc02.mediture.dom';
> }
> print $smb_conf_fh q(
> realm = MEDITURE.DOM
> security = ads
>
> template homedir = /home/%U
> template shell = /bin/bash
>
> winbind use default domain = true
>
> #--authconfig--end-line--
> server string = Samba Server Version %v
>
> # logs split per machine
> log file = /var/log/samba/log.%m
> # max 50KB per log file, then rotate
> max log size = 50
>
> passdb backend = tdbsam
>
> winbind refresh tickets = yes
> winbind offline logon = yes
> winbind use default domain = yes
> winbind nss info = rfc2307
> winbind enum users = yes
> winbind enum groups = yes
> winbind nested groups = yes
>
> kerberos method = secrets and keytab
>
> idmap config *: backend = tdb
> idmap config *: range = 90000001-100000000
>
> idmap config MEDITURE: backend = ad
> idmap config MEDITURE: range = 10000-49999
> idmap config MEDITURE: schema mode = rfc2307);
> close($smb_conf_fh);
> }
> }
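
One detail of the script above is worth checking: int(rand(10)) yields the integers 0 through 9 with equal probability, so the "$random >= 4" test is true for six of the ten values. Roughly 60% of nodes would therefore list 192.168.168.64 (and dc01) first, not half. A minimal shell sketch that enumerates the cases; the loop and variable names are illustrative and not part of the original script:

```shell
# Enumerate the ten equally likely values of int(rand(10)) and count
# how many satisfy the script's ">= 4" test.
first_64=0
first_65=0
for r in 0 1 2 3 4 5 6 7 8 9; do
    if [ "$r" -ge 4 ]; then
        first_64=$((first_64 + 1))   # branch that lists 192.168.168.64 first
    else
        first_65=$((first_65 + 1))   # branch that lists 192.168.168.65 first
    fi
done
echo "192.168.168.64 first: $first_64 of 10"
echo "192.168.168.65 first: $first_65 of 10"
```

If an even split was intended, ">= 5" would give it. Either way, a 60/40 split would not by itself account for a large imbalance between the two DCs.
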
>
> I also have AD sites set up and have manually configured SRV records to
> perform load balancing.
>
> $ dig +short srv _ldap._tcp.vsc._sites.dc._msdcs.mediture.dom
> 0 50 389 dc02.mediture.dom.
> 0 25 389 dc04.mediture.dom.
> 0 100 389 dc01.mediture.dom.
> 0 100 389 dc03.mediture.dom.
>
> $ dig +short srv _ldap._tcp.aws._sites.dc._msdcs.mediture.dom
> 0 25 389 dc02.mediture.dom.
> 0 100 389 dc04.mediture.dom.
> 0 50 389 dc01.mediture.dom.
> 0 50 389 dc03.mediture.dom.
>
> $ dig +short srv _ldap._tcp.epo._sites.dc._msdcs.mediture.dom
> 0 25 389 dc04.mediture.dom.
> 0 100 389 DC02.mediture.dom.
> 0 50 389 dc01.mediture.dom.
> 0 50 389 dc03.mediture.dom.
>
> $ dig +short srv _ldap._tcp.Default-First-Site-Name._sites.dc._msdcs.mediture.dom
> 0 100 389 dc01.mediture.dom.
> 0 100 389 dc03.mediture.dom.
>
> $ dig +short srv _ldap._tcp.vsc._sites.mediture.dom
> 0 100 389 dc01.mediture.dom.
> 0 100 389 dc03.mediture.dom.
> 0 50 389 dc02.mediture.dom.
> 0 25 389 dc04.mediture.dom.
>
> $ dig +short srv _ldap._tcp.aws._sites.mediture.dom
> 0 100 389 dc04.mediture.dom.
> 0 50 389 dc01.mediture.dom.
> 0 50 389 dc03.mediture.dom.
> 0 25 3268 dc02.mediture.dom.
>
> $ dig +short srv _ldap._tcp.epo._sites.mediture.dom
> 0 25 389 dc04.mediture.dom.
> 0 100 389 dc02.mediture.dom.
> 0 50 389 dc01.mediture.dom.
> 0 50 389 dc03.mediture.dom.
>
> $ dig +short srv _ldap._tcp.Default-First-Site-Name._sites.mediture.dom
> 0 100 389 dc04.mediture.dom.
> 0 100 389 dc01.mediture.dom.
> 0 100 389 dc02.mediture.dom.
> 0 100 389 dc03.mediture.dom.
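
For records that share a priority, RFC 2782 describes picking a target with probability proportional to its weight. Assuming the clients actually implement that weighted selection (an assumption; not every client does), the expected share per DC follows directly from the weights. A rough sketch using the _ldap._tcp.vsc._sites.dc._msdcs answer quoted above; the srv_share helper name is made up for illustration:

```shell
# srv_share: read "priority weight port target" lines (dig +short srv
# output) on stdin and print each target's expected share of selections.
# Assumes every record has the same priority and that the client follows
# the RFC 2782 weighted selection.
srv_share() {
    awk '{ weight[$4] = $2; total += $2 }
         END {
             for (t in weight) printf "%s %.1f%%\n", t, 100 * weight[t] / total
         }' | sort
}

# Records copied from the vsc site answer in this message.
srv_share <<'EOF'
0 50 389 dc02.mediture.dom.
0 25 389 dc04.mediture.dom.
0 100 389 dc01.mediture.dom.
0 100 389 dc03.mediture.dom.
EOF
```

Under that assumption dc01 and dc03 carry equal weight in every answer listed above, so the SRV records on their own would predict a roughly even split between those two DCs.
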
>
> I'm not seeing balanced traffic, though.
>
> [root@dc01 ~]# netstat -an | grep 445 | grep -c ESTABLISHED
> 164
> [root@dc03 ~]# netstat -an | grep 445 | grep -c ESTABLISHED
> 10
>
> [root@dc01 ~]# netstat -an | grep 88 | grep -c ESTABLISHED
> 20
> [root@dc03 ~]# netstat -an | grep 88 | grep -c ESTABLISHED
> 2
>
> [root@dc01 ~]# netstat -an | grep 389 | grep -c ESTABLISHED
> 175
> [root@dc03 ~]# netstat -an | grep 389 | grep -c ESTABLISHED
> 23
>
> [root@dc01 ~]# netstat -an | grep 636 | grep -c ESTABLISHED
> 3
> [root@dc03 ~]# netstat -an | grep 636 | grep -c ESTABLISHED
> 7
>
> [root@dc01 ~]# netstat -an | grep 53 | grep -c ESTABLISHED
> 42
> [root@dc03 ~]# netstat -an | grep 53 | grep -c ESTABLISHED
> 6
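
A caveat on the counts above: "grep 445" (and likewise 88, 389, 53) matches the digits anywhere on the line, so ephemeral ports such as 44512, outbound connections to another DC, and even IP addresses containing those digits are counted too. A minimal sketch that counts only ESTABLISHED TCP connections whose local port matches exactly, assuming the usual Linux "netstat -an" column layout; the count_established name and the sample lines are hypothetical:

```shell
# count_established PORT: read `netstat -an` output on stdin and count
# ESTABLISHED TCP connections whose *local* port is exactly PORT.
count_established() {
    awk -v port="$1" '
        $1 ~ /^tcp/ && $6 == "ESTABLISHED" {
            n = split($4, parts, ":")      # local address is field 4
            if (parts[n] == port) count++  # last piece is the port
        }
        END { print count + 0 }'
}

# Hypothetical sample lines (not taken from the DCs in this thread):
count_established 445 <<'EOF'
tcp        0      0 192.168.168.64:445      192.168.168.20:51234    ESTABLISHED
tcp        0      0 192.168.168.64:389      192.168.168.21:44512    ESTABLISHED
tcp        0      0 192.168.168.64:44500    192.168.168.22:389      ESTABLISHED
tcp        0      0 0.0.0.0:445             0.0.0.0:*               LISTEN
EOF
```

On this sample the function prints 1, whereas "grep 445 | grep -c ESTABLISHED" would count 3. On a live DC the equivalent would be "netstat -an | count_established 445". The real imbalance may well still be there, but the per-port numbers would be easier to trust.
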
> I only have a handful of Windows instances joined to the domain at
> that site, VSC, but over 100 Linux nodes.
>
> Thanks,
> Arthur
>
> On 09/29/2016 10:16 AM, Arthur Ramsey wrote:
>> Hello,
>>
>> I'm running Samba 4.5.0 and bind-9.8.2-0.47.rc1.el6_8.1. One DC of
>> four, the PDC, is orders of magnitude slower running
>> /usr/local/samba/sbin/samba_dnsupdate --verbose --all-names. While
>> that is running on that DC, it seems to block any queries. The load
>> average is usually under 0.5. The DC was unsafely halted, which
>> could have corrupted something. I ran a dbcheck with samba-tool and
>> it came back clean other than the expected cleanup after upgrading to
>> 4.5.0. Are there any caches or similar that I could try clearing for
>> BIND? Usually at least once a day the memory usage increases from the
>> typical ~1 GB to everything the box has, 8 GB physical and
>> 10 GB swap, requiring a forceful restart, so there appears to be a
>> memory leak as well. When memory usage is high, it is from the smbd
>> process, which I wouldn't expect to correlate with BIND.
>> Or, rather than a memory leak, is the blocking seen with DNS queries
>> also blocking SMB clients, resulting in a pile of connections and high
>> memory usage? The load under this condition is very high, but that is
>> due to high IO and CPU usage from swapping. I had similar behavior
>> with 4.4.5, but it was fine for the first couple of weeks after the
>> upgrade.
>>
>> Thanks,
>> Arthur