[Samba] Slow, Incorrect Group Resolution through Winbind

Rich Otero rotero at editshare.com
Wed Sep 13 14:48:18 UTC 2017


Hello. I am observing some strange behavior on a Linux system that has
joined a Windows Active Directory domain using the Samba suite. Our servers
are based on Ubuntu v12.04 but have kernel v3.12.17 and Samba v4.3.6.

The problem that I'm trying to understand is that group name resolution
through Winbind occasionally fails. Here's an example where one group name
could not be resolved. This causes "groups" to hang, presumably because it
is waiting for Winbind to provide the name and Winbind is waiting for the
domain controller:

editshare at es-exp1:~$ time groups dwill627
dwill627 : domain users _adsso_editors editors exp1-promos groups: cannot find name for group ID 16777230
16777230 KUTZTOWN\computeradministrativeaccessclassrooms allstudents KUTZTOWN\oitfs_software_r KUTZTOWN\computeradministrativeaccessconferencerooms KUTZTOWN\mediasiteviewonly pcns kup-passpol-stu-temp editshareusers BUILTIN\users

real    1m21.472s
user    0m0.064s
sys     0m0.000s

However, the user dwill627 is apparently not a member of the group with ID
16777230:

editshare at es-exp1:~$ getent group 16777230
KUTZTOWN\computeradministrativeaccesslabs:x:16777230:KUTZTOWN\techcreel,KUTZTOWN\techstamm,KUTZTOWN\techeben,KUTZTOWN\techjulian,KUTZTOWN\chemnmr,KUTZTOWN\librarypatron,KUTZTOWN\olympiad,KUTZTOWN\labprint

I don't understand why there is this discrepancy.
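
One way to narrow this down might be to reproduce what "groups" does step
by step: ask NSS for the user's full group list, then resolve each GID
separately and time each lookup. The following is only a minimal Python
sketch under stated assumptions, not something from the original report. The
function name resolve_groups and its injectable lookup parameters are
hypothetical, it assumes Python 3.3 or later is available on the server (for
os.getgrouplist), and the membership comparison is a loose heuristic that
ignores any DOMAIN\ prefix and letter case.

```python
import grp
import os
import pwd
import time


def _short(name):
    # Drop an optional DOMAIN\ prefix and ignore case (loose heuristic).
    return name.rsplit("\\", 1)[-1].lower()


def resolve_groups(user, getpw=pwd.getpwnam, getlist=os.getgrouplist,
                   getgr=grp.getgrgid, clock=time.monotonic):
    """Return one (gid, name_or_None, listed_as_member, seconds) per GID.

    The lookup functions are parameters only so that they can be swapped
    out; by default the real NSS-backed calls are used.
    """
    primary_gid = getpw(user).pw_gid
    results = []
    for gid in getlist(user, primary_gid):
        start = clock()
        try:
            entry = getgr(gid)          # may block while NSS waits on Winbind
            name, members = entry.gr_name, entry.gr_mem
        except KeyError:                # GID could not be resolved to a name
            name, members = None, []
        elapsed = clock() - start
        listed = _short(user) in [_short(m) for m in members]
        results.append((gid, name, listed, elapsed))
    return results
```

Calling resolve_groups("dwill627") on the affected server and printing the
tuples would show which GIDs are slow, which fail to resolve, and which
groups do not list the user among their members even though the GID appears
in the user's group list. A user's primary group normally does not list the
user as a member, so a False there is expected.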

Here's the global configuration as reported by "testparm":

[global]
        workgroup = STUDENTS
        realm = STUDENTS.KUTZTOWN.EDU
        server string = es-exp1
        security = ADS
        password server = kustudc01.students.kutztown.edu, kustudc02.students.kutztown.edu
        smb passwd file = /var/cache/samba/smbpasswd
        passdb backend = smbpasswd
        restrict anonymous = 2
        log file = /var/log/samba/log.%I
        server max protocol = SMB2_22
        max protocol = SMB2_22
        protocol = SMB2_22
        max xmit = 65535
        unix extensions = No
        max open files = 32768
        socket options = TCP_NODELAY SO_RCVBUF=65536 SO_SNDBUF=1048576
        load printers = No
        printcap name = /dev/null
        machine password timeout = 0
        os level = 33
        dns proxy = No
        wins support = Yes
        ldap debug level = 1
        ldap debug threshold = 5
        idmap uid = 16777216-33554431
        idmap gid = 16777216-33554431
        template homedir = /home/%U
        template shell = /sbin/nologin
        winbind use default domain = Yes
        winbind expand groups = 1
        idmap config * : range = 16777216-33554431
        idmap config * : backend = tdb
        aio read size = 1
        aio write size = 1
        use sendfile = Yes
        include = /etc/samba/smb.0.0.0.0.conf
        wide links = Yes

I know that we are using some deprecated options, but this configuration
typically works well for us. From that whole config, these are the few
options that I have added in the course of troubleshooting this system
(some of which are unrelated to my current question):

ldap debug level = 1
ldap debug threshold = 5
log level = winbind:5
password server = kustudc01.students.kutztown.edu kustudc02.students.kutztown.edu
winbind request timeout = 10

Besides the logging options, allow me to explain the other two: I set
"password server" to restrict Winbind from contacting DCs that it can't
actually reach. For reasons that I do not completely understand, our
customer has set up DNS such that it provides SRV records that point to
hosts that we are prevented from accessing by a firewall. The two DCs that
I've listed for "password server" are the ones that are accessible to our
server (on the same side of the firewall). I set "winbind request timeout"
to attempt to deal with the unusually long time to resolve group IDs to
group names. My thought is that if we can't resolve a GID because a DC is
taking too long to reply, a short timeout should either cause Winbind to
try another DC or give up altogether. I've lightly tested this change and
it seems to help.
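
To confirm which DCs are actually reachable from this server, a small probe
along the following lines could be used. This is a hedged Python sketch, not
part of the setup described above: the names probe and probe_all are
hypothetical, and the choice of TCP ports 389 (LDAP) and 445 (SMB) is my
assumption about which services matter here. The list of candidate hosts
would have to be collected separately, for example from the SRV records that
DNS returns for the realm.

```python
import socket
import time


def probe(host, port, timeout=3.0):
    """Attempt one TCP connection; return (reachable, seconds_taken)."""
    start = time.monotonic()
    try:
        # Name resolution errors, refusals, and timeouts are all OSError.
        with socket.create_connection((host, port), timeout=timeout):
            reachable = True
    except OSError:
        reachable = False
    return reachable, time.monotonic() - start


def probe_all(hosts, ports=(389, 445), timeout=3.0):
    """Probe every host/port pair; keys are (host, port) tuples."""
    return {(h, p): probe(h, p, timeout) for h in hosts for p in ports}
```

For example, probe_all(["kustudc01.students.kutztown.edu",
"kustudc02.students.kutztown.edu"]) should report both hosts as reachable,
while a DC behind the firewall would be expected to come back False after
roughly the full timeout if the firewall silently drops packets.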

One theory that I have is that Winbind is still trying to contact one of
the inaccessible DCs to do group ID resolution. (I understand that the GID
comes from the idmap mechanism, not the DC, but I imagine that there still
must be some initial interaction with the DC. Is that accurate?) Does
"password server" affect group ID resolution? Or is it only used for user
authentication as the manual suggests? If it has nothing to do with group
ID resolution, is there a corresponding option for Winbind that would have
this effect? (I couldn't find one.)

But even if I could explain that part of it (GID resolution taking a very
long time), the other behavior is also quite confusing: Considering the
example with the dwill627 account that I showed, why is "groups dwill627"
attempting to resolve GID 16777230 if "getent group 16777230" indicates
that dwill627 isn't a member? Is there a problem in the idmap?

Regards,
Rich Otero
Technical Support and Professional Services
EditShare
rotero at editshare.com
617-782-0479

