[Samba] Winbind is "sticky" on one DC

Jonathan Gazeley Jonathan.Gazeley at bristol.ac.uk
Fri Oct 3 03:07:36 MDT 2014

On 02/10/14 22:08, steve wrote:
> On 02/10/14 20:54, Jonathan Gazeley wrote:
>> On 02/10/14 16:42, Allen Chen wrote:
>>> On 10/1/2014 10:05 AM, Jonathan Gazeley wrote:
>>>> On 01/10/14 11:56, Jonathan Gazeley wrote:
>>>>> Hi chaps,
>>>>> I've been using Winbind for several years to authenticate 802.1x
>>>>> wireless users against Active Directory via FreeRADIUS. The solution
>>>>> we've been using until now has been adequate but I've noticed some
>>>>> problematic behaviour. We're running all stock packages from CentOS
>>>>> 6 repos. Current version of winbind is 3.6.9. Unfortunately the
>>>>> Windows DCs are managed by a different team and we don't have access
>>>>> to their settings or logs.
>>>>> We locate domain controllers using a DNS round-robin on
>>>>> ads.bris.ac.uk which returns about 10 DCs. I've noticed that quite
>>>>> often, our three RADIUS servers all latch onto the same DC and cause
>>>>> loading problems.
>>>>> In my smb.conf I've set "password server" to the DNS name of
>>>>> individual DCs but this parameter seems to be ignored. Even after
>>>>> restarting winbind or rebooting, the system always goes back to the
>>>>> same DC.
>>>>> I've also tried explicitly setting the names of individual DCs in
>>>>> krb5.conf and this does not help the situation.
>>>>> Can someone with winbind experience please explain what is going on,
>>>>> and how I can force my RADIUS servers to latch onto specific DCs for
>>>>> their authentications, so I can ensure that they don't all pile onto
>>>>> the same DC and overload it.
>>>>> Thanks,
>>>>> Jonathan
>>>> Bit of information from further testing - I was able to make winbind
>>>> stop using the first DC by temporarily adding an iptables rule that
>>>> dropped all outbound traffic to the first DC. Then, when restarting
>>>> winbind, it picked a different DC. Surely there's a better way than
>>>> this?
>>>> Thanks,
>>>> Jonathan
>>> Hi Jonathan,
>>> What is the DNS setting on your Radius server?
>>> I guess it points to your company's DNS server, which then forwards
>>> to your DCs?
>>> Allen
>> Yes, exactly. The Radius server uses the main DNS server to look up the
>> fully qualified domain names of the DCs. The name ads.bristol.ac.uk
>> returns round-robin records for all 10 DCs, but I have also set
>> password server to the DNS name of one individual DC.
> What are the DCs called? Guessing, e.g. dc1.ads.bristol.ac.uk, 
> dc2.ads.bristol.ac.uk... You need to get to the AD side for the round 
> robin to work. Difficult without smb.conf and a bit more info...
> Cheers,

OK, here goes.

ads.bris.ac.uk returns all the IP addresses of the DCs:

[jg4461@radius03 ~]$ nslookup ads.bris.ac.uk

Non-authoritative answer:
Name:    ads.bris.ac.uk
Name:    ads.bris.ac.uk
Name:    ads.bris.ac.uk
Name:    ads.bris.ac.uk
Name:    ads.bris.ac.uk
Name:    ads.bris.ac.uk
Name:    ads.bris.ac.uk

The DCs themselves are all in different subdomains of bris.ac.uk, not in 
ads.bris.ac.uk. For example, one is called uob-dc10.bris.ac.uk.
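
For anyone wanting to reproduce this, the mapping from the round-robin 
name to the individual DC hostnames can be listed with a small shell 
sketch like the one below. It assumes glibc's getent is available and 
that reverse DNS is set up for the DCs; unique_addrs and list_dcs are 
hypothetical helper names, not part of Samba.

```shell
#!/bin/sh
# unique_addrs (hypothetical helper): read `getent ahosts` style lines on
# stdin and print each distinct address once.
unique_addrs() {
    awk '{ print $1 }' | sort -u
}

# list_dcs (hypothetical helper): print "address hostname" for every
# address behind a round-robin name. Assumes forward and reverse DNS work.
list_dcs() {
    getent ahosts "$1" | unique_addrs | while read -r addr; do
        # `getent hosts <address>` performs a reverse lookup
        name=$(getent hosts "$addr" | awk '{ print $2 }')
        printf '%s %s\n' "$addr" "${name:-unknown}"
    done
}
```

Usage would be something like: list_dcs ads.bris.ac.uk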

My smb.conf is quite short (no file shares, mostly just AD 
configuration). Here are the relevant params:

         workgroup = UOB
         server string = radius03.nomadic-core.bris.ac.uk
         netbios name = rn-de89a707
         security = ads
         realm = ads.bris.ac.uk
         password server = cse-lox.cse.bris.ac.uk
         winbind use default domain = no
         winbind nested groups = Yes
         winbind enum users = No
         winbind enum groups = No
         socket options = TCP_NODELAY SO_RCVBUF=8192 SO_SNDBUF=8192
         dns proxy = yes
         name resolve order = host lmhost
         winbind max domain connections = 5
         winbind max clients = 300

After restarting winbind, it continues to use uob-dc10.bris.ac.uk (as 
returned from the dns lookup of ads.bris.ac.uk) despite the fact that 
password server is telling it to use a different DC.

I've used tcpdump to watch which DNS lookups the system makes when 
winbind is restarted, and apparently it is not performing a DNS lookup 
at all. It only starts to use cse-lox in preference to uob-dc10 when I 
add a firewall rule on the RADIUS server that rejects all outbound 
traffic to uob-dc10.
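
The capture and the firewall workaround described above can be sketched 
roughly as follows. This is a minimal sketch rather than the exact 
commands that were run: it assumes CentOS 6 style tooling (iptables, 
service, tcpdump), and run, watch_dns, steer_away_from_dc and the 
DRY_RUN switch are hypothetical names introduced for illustration. With 
DRY_RUN=1 (the default here) the commands are only printed.

```shell
#!/bin/sh
# DRY_RUN=1 prints each command instead of running it.
# Set DRY_RUN=0 and run as root to actually execute them.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Watch DNS traffic on all interfaces while winbind is restarted elsewhere.
watch_dns() {
    run tcpdump -n -i any port 53
}

# $1 is the IP address of the DC that winbind keeps choosing.
steer_away_from_dc() {
    dc_ip=$1
    run iptables -I OUTPUT -d "$dc_ip" -j REJECT   # block traffic to that DC
    run service winbind restart                    # winbind must pick another DC
    run iptables -D OUTPUT -d "$dc_ip" -j REJECT   # remove the temporary rule
}
```

Usage would be something like: steer_away_from_dc 192.0.2.10 
(192.0.2.10 is a placeholder address, not one of the real DCs).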

I hope this gives a bit more info about our setup.

