[Samba] winbind causing huge timeouts/delays since 4.8
Alexander Spannagel
aspannagel at gmx.de
Fri Feb 22 12:59:15 UTC 2019
Hello!
I want to share some findings with the community about huge
timeouts/delays we have seen since upgrading to Samba 4.8 at the end of
last year, along with a patch that fixes them in our setup. It would be
great if someone from the Samba dev team could take a look and, if
acceptable, apply the patch to the common code base. The issue may also
affect the current stable release and release candidates.
The patch expects the patch from BUG 13503 "getpwnam resolves local
system accounts to AD" to be applied already.
At the company I work for, we frequently see system hangs/slowness for
a couple of seconds on servers using winbind passwd/group resolution via
nsswitch.conf since we updated our OS from CentOS 7.5 to CentOS 7.6,
which includes a Samba update from 4.7 to 4.8.
We could track it down to winbind being asked for an unknown local user
account, i.e. an account name that is not in the local passwd and does
not contain a domain part like SOMEDOMAIN\account or
account@SOMEDOMAIN. The expected behavior is an immediate return with an
error like "no such user" or "unknown user", but instead a call like
"id unknown" takes 60+ seconds. Increasing "winbind max domain
connections" reduced this to 10+ seconds, and setting "winbind use
default domain" to yes brought it back to the expected immediate
response. A log of the different setups can be found at the bottom.
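For anyone who wants to reproduce the delay outside of id or wbinfo, a
minimal sketch along these lines may help. It assumes Python 3 is
installed and that nsswitch.conf lists winbind for passwd, so that
pwd.getpwnam goes through the same NSS path as "id unknown".
timed_lookup is a made-up helper name for illustration, not part of
Samba.

```python
import pwd
import time


def timed_lookup(name, lookup=pwd.getpwnam):
    """Return (found, seconds) for a single passwd lookup of `name`.

    `lookup` defaults to pwd.getpwnam, which resolves through the
    NSS sources configured in nsswitch.conf. It is a parameter only
    so the helper can be exercised without touching the real NSS.
    """
    start = time.monotonic()
    try:
        lookup(name)
        found = True
    except KeyError:
        # pwd.getpwnam raises KeyError when no source knows the name.
        found = False
    return found, time.monotonic() - start


if __name__ == "__main__":
    # "unknown" is the same account name used in the transcripts.
    found, seconds = timed_lookup("unknown")
    print(f"found={found} elapsed={seconds:.3f}s")
```

On an affected server the elapsed time should be in the range of the
"real" values shown in the log at the bottom; on an unaffected one it
should be close to zero.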
As none of these config changes makes sense to me as a requirement, and
setting "winbind use default domain" to yes isn't usable on some of our
servers, I dug deeper using wbinfo to talk to winbind more directly and
avoid other services affecting the test.
The finding was pretty clear:
[root@centos7dev64 ~]# testparm -v 2>&1 < /dev/null|grep "winbind use default domain" ; time wbinfo -i unknown
winbind use default domain = No
failed to call wbcGetpwnam: WBC_ERR_WINBIND_NOT_AVAILABLE
Could not get info for user unknown
real 1m2.522s
user 0m0.005s
sys 0m0.009s
[root@centos7dev64 ~]# vi /etc/samba/smb.conf ; systemctl restart winbind
[root@centos7dev64 ~]# testparm -v 2>&1 < /dev/null|grep "winbind use default domain" ; time wbinfo -i unknown
winbind use default domain = Yes
failed to call wbcGetpwnam: WBC_ERR_DOMAIN_NOT_FOUND
Could not get info for user unknown
real 0m0.015s
user 0m0.005s
sys 0m0.005s
Doing some code research, I could track it down to a logic change in
the return value of the function parse_domain_user in
source3/winbindd/winbindd_util.c.
Calling the function under these conditions:
- no domain (i.e. empty)
- user without a domain part (i.e. not DOM\user or user@DOM)
- "winbind use default domain" set to No/false (which is the default)
gives different return values:
- up to version 4.7: false
- since version 4.8: '\0', i.e. an empty string
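To make the difference concrete, here is a simplified model of the two
behaviors in Python. This is a sketch under assumptions, not the Samba
source: the function names are invented, None stands in for a "false"
return, default_domain=None stands in for "winbind use default domain =
No", and the separator is assumed to be the backslash.

```python
def split_account(name, separator="\\"):
    """Split 'DOM\\user' or 'user@DOM' into (domain, user).

    The domain is '' when the name carries no domain part.
    """
    if separator in name:
        domain, user = name.split(separator, 1)
        return domain, user
    if "@" in name:
        user, domain = name.split("@", 1)
        return domain, user
    return "", name


def parse_like_4_7(name, default_domain=None):
    """Model of the result up to 4.7: unqualified names are rejected."""
    domain, user = split_account(name)
    if not domain:
        if default_domain is None:
            return None  # stands in for "return false"
        domain = default_domain
    return domain, user


def parse_like_4_8(name, default_domain=None):
    """Model of the result since 4.8: unqualified names get domain ''."""
    domain, user = split_account(name)
    if not domain and default_domain is not None:
        domain = default_domain
    return domain, user


if __name__ == "__main__":
    for name in ("unknown", "DOM\\user", "user@DOM"):
        print(name, parse_like_4_7(name), parse_like_4_8(name))
```

In this model the older variant lets a caller stop right away for a
plain "unknown", while the newer variant hands back an empty domain that
the caller still has to deal with. The model only restates the return
values listed above; it does not show what winbind does with the empty
domain afterwards.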
Applying the attached patch, which re-introduces the return value of
false instead of '\0', fixed the described issues. With our patched
version of Samba we could revert to our former config, leaving "winbind
use default domain" and "winbind max domain connections" at their
default values.
Hopefully this helps others. I would appreciate it if the fix got into
the common code base of Samba, so that it can reach the usual update
channels of the distributions out there. For CentOS I have already
reported a bug (15795) for further processing.
Best regards
Alex
#######
Here is a log of a walk through the different config settings on one of
our servers; it is reproducible on our other servers using winbind and
samba-4.8:
[root@centos7dev64 ~]# rpm -q samba-4*
samba-4.8.3-4.el7.x86_64
[root@centos7dev64 ~]# testparm -v 2>&1 < /dev/null|egrep "(Server role|winbind use default domain|max domain connections)" ; time id unknown
Server role: ROLE_DOMAIN_MEMBER
winbind max domain connections = 1
winbind use default domain = No
id: unknown: no such user
real 1m8.630s
user 0m0.000s
sys 0m0.009s
[root@centos7dev64 ~]# vi /etc/samba/smb.conf ; systemctl restart winbind ; sss_cache -E
[root@centos7dev64 ~]# testparm -v 2>&1 < /dev/null|egrep "(Server role|winbind use default domain|max domain connections)" ; time id unknown
Server role: ROLE_DOMAIN_MEMBER
winbind max domain connections = 10
winbind use default domain = No
id: unknown: no such user
real 0m10.914s
user 0m0.000s
sys 0m0.005s
[root@centos7dev64 ~]# vi /etc/samba/smb.conf ; systemctl restart winbind ; sss_cache -E
[root@ecentos7dev64 ~]# testparm -v 2>&1 < /dev/null|egrep "(Server role|winbind use default domain|max domain connections)" ; time id unknown
Server role: ROLE_DOMAIN_MEMBER
winbind max domain connections = 10
winbind use default domain = Yes
id: unknown: no such user
real 0m0.020s
user 0m0.002s
sys 0m0.003s
-------------- next part --------------
A non-text attachment was scrubbed...
Name: samba-4.8.9-fix_winbind_empty_domain.patch
Type: text/x-patch
Size: 459 bytes
Desc: not available
URL: <http://lists.samba.org/pipermail/samba/attachments/20190222/3eb9e8e7/samba-4.8.9-fix_winbind_empty_domain.bin>