[Samba] winbind causing huge timeouts/delays since 4.8

Alexander Spannagel aspannagel at gmx.de
Fri Feb 22 12:59:15 UTC 2019


Hello!

I want to share some findings with the community about huge 
timeouts/delays we have seen since upgrading to Samba 4.8 at the end of 
last year, and a patch that fixes them in our setup. It would be great if 
someone from the Samba dev team could take a look and, if acceptable, 
apply the patch to the common code base. The issue may also affect the 
current stable release and the release candidates.
The patch expects the patch from BUG 13503 "getpwnam resolves local 
system accounts to AD" to be applied already.

At the company I work for, we frequently see system hangs/slowness for 
a couple of seconds on servers using winbind passwd/group resolution via 
nsswitch.conf. This started when we updated our OS from CentOS 7.5 to 
CentOS 7.6, which includes a Samba update from 4.7 to 4.8.

We tracked it down to winbind being asked for an unknown local user 
account. That is, the user name in question is not in the local passwd 
file and does not contain a domain part such as SOMEDOMAIN\account or 
account@SOMEDOMAIN. The expected behavior is an immediate return with an 
error like "no such user" or "unknown user", but instead a call like 
"id unknown" takes 60+ seconds. Increasing "winbind max domain 
connections" reduced this to 10+ seconds, and setting "winbind use 
default domain" to yes brought back the expected immediate response. A 
log of the different setups can be found at the bottom.
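
For anyone who wants to measure the same delay from a program instead 
of from "id", here is a minimal sketch in C. It only uses the standard 
getpwnam() and clock_gettime() calls, so the lookup goes through 
whatever nsswitch.conf has configured for passwd. The helper name 
time_getpwnam is made up for this example and is not part of Samba.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pwd.h>
#include <stddef.h>
#include <time.h>

/*
 * Hypothetical helper: time a single getpwnam() call.
 * Returns the elapsed wall-clock time in seconds. If 'found' is not
 * NULL, it is set to 1 when the name resolved and to 0 otherwise.
 */
double time_getpwnam(const char *name, int *found)
{
    struct timespec start, end;
    struct passwd *pw;

    assert(name != NULL);

    clock_gettime(CLOCK_MONOTONIC, &start);
    pw = getpwnam(name);            /* goes through the NSS passwd sources */
    clock_gettime(CLOCK_MONOTONIC, &end);

    if (found != NULL) {
        *found = (pw != NULL);
    }
    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

On an affected server, time_getpwnam("unknown", &found) would be 
expected to take about as long as "time id unknown" does, with found 
set to 0.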

As none of these config changes make sense to me as a requirement, and 
setting "winbind use default domain" to yes isn't usable on some of our 
servers, I dug deeper, using wbinfo to talk to winbind more directly 
and so avoid other services affecting the test.
The finding was pretty clear:
[root@centos7dev64 ~]# testparm -v 2>&1 < /dev/null|grep "winbind use default domain" ; time wbinfo -i unknown
         winbind use default domain = No
failed to call wbcGetpwnam: WBC_ERR_WINBIND_NOT_AVAILABLE
Could not get info for user unknown

real    1m2.522s
user    0m0.005s
sys     0m0.009s
[root@centos7dev64 ~]# vi /etc/samba/smb.conf ; systemctl restart winbind
[root@centos7dev64 ~]# testparm -v 2>&1 < /dev/null|grep "winbind use default domain" ; time wbinfo -i unknown
         winbind use default domain = Yes
failed to call wbcGetpwnam: WBC_ERR_DOMAIN_NOT_FOUND
Could not get info for user unknown

real    0m0.015s
user    0m0.005s
sys     0m0.005s

Doing some code research, I could track it down to a logic change 
affecting the return value of the function parse_domain_user in 
source3/winbindd/winbindd_util.c.
Calling the function under these conditions:
- no domain (i.e. empty)
- user without a domain part (i.e. not DOM\user or user@DOM)
- "winbind use default domain" set to No/false (which is the default)
causes different return values:
- up to version 4.7: false
- since version 4.8: \0, i.e. an empty string
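
To make the difference concrete, here is a simplified model in C. It is 
a sketch of the behavior described above, not the Samba source: the 
names model_cfg and model_parse_domain_user, the legacy_behaviour 
switch and the buffer size are all made up for illustration. It also 
assumes that the 4.8 behavior amounts to reporting success while 
leaving the domain as an empty string.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MODEL_NAME_LEN 256

/* Hypothetical stand-in for the relevant smb.conf settings. */
struct model_cfg {
    bool use_default_domain;  /* "winbind use default domain" */
    const char *workgroup;    /* domain used when the option is on */
    char separator;           /* winbind separator, '\\' by default */
    bool legacy_behaviour;    /* true: model of <= 4.7, false: model of >= 4.8 */
};

/* Split "DOM\user" or "user@DOM"; 'domain' and 'user' hold MODEL_NAME_LEN bytes. */
bool model_parse_domain_user(const struct model_cfg *cfg, const char *domuser,
                             char *domain, char *user)
{
    const char *sep = strchr(domuser, cfg->separator);
    const char *at = strchr(domuser, '@');

    if (sep != NULL) {                       /* DOM\user */
        snprintf(domain, MODEL_NAME_LEN, "%.*s", (int)(sep - domuser), domuser);
        snprintf(user, MODEL_NAME_LEN, "%s", sep + 1);
        return true;
    }
    if (at != NULL) {                        /* user@DOM */
        snprintf(user, MODEL_NAME_LEN, "%.*s", (int)(at - domuser), domuser);
        snprintf(domain, MODEL_NAME_LEN, "%s", at + 1);
        return true;
    }

    snprintf(user, MODEL_NAME_LEN, "%s", domuser);
    if (cfg->use_default_domain) {           /* bare name, default domain assumed */
        snprintf(domain, MODEL_NAME_LEN, "%s", cfg->workgroup);
        return true;
    }

    /* Bare name and no default domain: this is the case that changed. */
    domain[0] = '\0';
    return !cfg->legacy_behaviour;           /* <= 4.7: false, >= 4.8: true */
}
```

In this model, a bare name like "unknown" with the option off is 
rejected in the 4.7 variant, so a caller can fail fast. In the 4.8 
variant it is accepted with an empty domain, so a caller would carry on 
with the lookup.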

Applying the attached patch, which re-introduces the return value of 
false instead of '\0', fixed the described issues. With our patched 
version of Samba we could revert to our former config, leaving "winbind 
use default domain" and "winbind max domain connections" at their 
default values.

Hopefully this helps others. I would appreciate it if the fix made it 
into the common code base of Samba, so that it reaches the usual update 
channels of the distributions out there. For CentOS I have already 
reported a bug (15795) for further processing.

Best regards

Alex

#######

Here is a log of a walk through the different config settings on one of 
our servers. It is reproducible on our other servers using winbind and 
samba-4.8:

[root@centos7dev64 ~]# rpm -q samba-4*
samba-4.8.3-4.el7.x86_64
[root@centos7dev64 ~]# testparm -v 2>&1 < /dev/null|egrep "(Server role|winbind use default domain|max domain connections)"  ; time id unknown
Server role: ROLE_DOMAIN_MEMBER
         winbind max domain connections = 1
         winbind use default domain = No
id: unknown: no such user

real    1m8.630s
user    0m0.000s
sys     0m0.009s
[root@centos7dev64 ~]# vi /etc/samba/smb.conf ; systemctl restart winbind ; sss_cache -E
[root@centos7dev64 ~]# testparm -v 2>&1 < /dev/null|egrep "(Server role|winbind use default domain|max domain connections)"  ; time id unknown
Server role: ROLE_DOMAIN_MEMBER
         winbind max domain connections = 10
         winbind use default domain = No

id: unknown: no such user

real    0m10.914s
user    0m0.000s
sys     0m0.005s
[root@centos7dev64 ~]# vi /etc/samba/smb.conf ; systemctl restart winbind ; sss_cache -E
[root@ecentos7dev64 ~]# testparm -v 2>&1 < /dev/null|egrep "(Server role|winbind use default domain|max domain connections)"  ; time id unknown
Server role: ROLE_DOMAIN_MEMBER
         winbind max domain connections = 10
         winbind use default domain = Yes
id: unknown: no such user

real    0m0.020s
user    0m0.002s
sys     0m0.003s
-------------- next part --------------
A non-text attachment was scrubbed...
Name: samba-4.8.9-fix_winbind_empty_domain.patch
Type: text/x-patch
Size: 459 bytes
Desc: not available
URL: <http://lists.samba.org/pipermail/samba/attachments/20190222/3eb9e8e7/samba-4.8.9-fix_winbind_empty_domain.bin>

