winbind cache issue for NDR entries
shilpa.krishnareddy at gmail.com
Fri Jan 29 14:09:40 UTC 2021
A customer reported that users were unable to log in for about 30 minutes,
after which the problem cleared on its own. They are using Samba as a
member server in a domain that has a two-way trust with
another domain (say ABC.COM). Upon investigation, we found that there was a
problem with trusted domain DCs for a very short duration as per the event
log on the DC of the primary domain. This problem seems to have been
cleared after a short time. Around the same time, a user belonging to the
trusted domain mapped a Samba share and encountered a problem. At that
point, it looks like an NDR cache entry for the trusted domain group
"Domain Users" was added to winbindd_cache.tdb to record the lookup
failure, with the status NT_STATUS_TRUSTED_DOMAIN_FAILURE stored as part of
the entry. Once the issue with the trusted domain DC was cleared and the
domain was
back online, PAM_AUTH succeeded for users who tried to log in, but getpwnam
failed while looking up the SID for "Domain Users". The failure was served
from the entry in winbindd_cache.tdb, because wcache_fetch_ndr() succeeded
for this entry. As a result, users belonging to the trusted domain were not
able to log in. Once the cache entry expired, getpwnam succeeded for
trusted domain users and the shares could be mapped.

To resolve this issue, should we not refresh the sequence number when the
domain goes online? By the way, we are using "winbind cache time = 1800"
(1800 seconds, i.e. 30 minutes, which matches how long the logins failed).
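
To make the question concrete, here is a toy model of the behaviour
described above. It is not winbindd code: the class, its methods and the
validity rule (an entry is served while its stored sequence number matches
the domain's and its timeout has not passed) are simplifying assumptions
made only for illustration. It shows a cached failure outliving the outage
until the timeout, and the same entry being discarded right away if the
sequence number is refreshed when the domain comes back online.

```python
from dataclasses import dataclass

CACHE_TIME = 1800  # seconds, mirrors "winbind cache time = 1800"
FAILURE = "NT_STATUS_TRUSTED_DOMAIN_FAILURE"
OK = "NT_STATUS_OK"


@dataclass
class Entry:
    status: str        # result stored with the entry (success or failure)
    seqnum: int        # domain sequence number at the time of caching
    expires_at: float  # time after which the entry is no longer served


class ToyDomainCache:
    """Hypothetical stand-in for a per-domain lookup cache."""

    def __init__(self, cache_time=CACHE_TIME):
        self.cache_time = cache_time
        self.seqnum = 0
        self.online = True
        self.entries = {}

    def backend_lookup(self, key):
        # Pretend to ask the trusted domain's DC.
        return OK if self.online else FAILURE

    def lookup(self, key, now):
        entry = self.entries.get(key)
        # Assumed validity rule: same sequence number and not yet expired.
        if entry and entry.seqnum == self.seqnum and now < entry.expires_at:
            return entry.status
        status = self.backend_lookup(key)
        self.entries[key] = Entry(status, self.seqnum, now + self.cache_time)
        return status

    def set_online(self, refresh_seqnum):
        self.online = True
        if refresh_seqnum:
            # The proposed change: a new sequence number makes every
            # previously cached entry stale.
            self.seqnum += 1


def simulate(refresh_seqnum):
    cache = ToyDomainCache()
    cache.online = False                     # short outage of the trusted DC
    during = cache.lookup("Domain Users", now=0)
    cache.set_online(refresh_seqnum)         # domain is reachable again
    after_60s = cache.lookup("Domain Users", now=60)
    after_timeout = cache.lookup("Domain Users", now=CACHE_TIME)
    return during, after_60s, after_timeout


if __name__ == "__main__":
    print("no refresh:  ", simulate(refresh_seqnum=False))
    print("with refresh:", simulate(refresh_seqnum=True))
```

Without the refresh, the lookup one minute after recovery still returns the
cached failure and only succeeds once the timeout has passed. With the
refresh, the first lookup after recovery goes back to the DC.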