winbind cache issue for NDR entries
asn at samba.org
Sat Jan 30 19:48:53 UTC 2021
On Friday, 29 January 2021 22:05:11 CET Jeremy Allison via samba-technical
wrote:
> On Fri, Jan 29, 2021 at 07:39:40PM +0530, Shilpa K via samba-technical
> wrote:
> >We had a customer report that users were not able to log in for about
> >30 minutes, after which the problem cleared by itself. They
> >are using Samba as a member server in a domain which has 2 way trust with
> >another domain (say ABC.COM). Upon investigation, we found that there was a
> >problem with trusted domain DCs for a very short duration as per the event
> >log on the DC of the primary domain. This problem seems to have been
> >cleared away after a short duration. Around the same time, a user belonging
> >to a trusted domain mapped Samba share and encountered a problem. At this
> >time, looks like NDR cache entry for trusted domain group "Domain Users"
> >was added in winbindd_cache.tdb to indicate that there was a lookup problem
> >and the status NT_STATUS_TRUSTED_DOMAIN_FAILURE was stored as part of this
> >entry. Once the issue with trusted domain DC was cleared and the domain was
> >back online, when users tried to login, PAM_AUTH was successful for the
> >users but getpwnam failed while looking up SID for "Domain Users". This
> >failure was returned from the entry in the winbindd_cache.tdb as
> >wcache_fetch_ndr() succeeded for this entry. Due to this, users belonging
> >to the trusted domain were not able to login. Once the cache was expired,
> >getpwnam succeeded for trusted domain users and the shares could be mapped.
> >In order to resolve this issue, should we not refresh the sequence number
> >when the domain goes online? Btw, we are using "winbind cache time = 1800".
> Yep, looks like we should add a call to force a refresh of the
> sequence number in the cache here:
> 539 domain->online = True;
> Add a force_refresh_domain_sequence_number(domain) call above.
> Here is a (raw, untested) patch that implements this.
> Any chance you can test this for me ?
I wonder if this is the dc-connect issue with trusted domains.
A fix for this we are currently using is:
This is just a hack, as the right fix would be to completely get rid of the
dc-connect child. However, the winbind parent needs the dc-connect child just
to refresh the sequence number.
Isaac started to investigate this further and just had a draft for this which
was never finished. We really need to fix this correctly.
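
The failure mode in the quoted report can be illustrated with a small toy
model. This is only a sketch and not Samba code: the class and function names,
the SEQ_UNKNOWN placeholder, the DC sequence number 4711 and the exact validity
rule (an entry is served only while its stored sequence number matches the
domain's current one and its timeout has not passed) are assumptions made for
illustration. Only the 1800 second cache time, the "Domain Users" lookup and
the NT_STATUS_TRUSTED_DOMAIN_FAILURE status come from the report.

```python
from dataclasses import dataclass, field

CACHE_TIME = 1800   # seconds, as in "winbind cache time = 1800"
SEQ_UNKNOWN = -1    # hypothetical placeholder while the DC is unreachable


@dataclass
class Entry:
    status: str     # cached lookup result (here: a failure status)
    seq: int        # domain sequence number at the time of caching
    expires: int    # absolute time at which the entry times out


@dataclass
class Domain:
    online: bool = False
    seq: int = SEQ_UNKNOWN
    cache: dict = field(default_factory=dict)

    def store(self, key, status, now):
        # Cache a result together with the current sequence number.
        self.cache[key] = Entry(status, self.seq, now + CACHE_TIME)

    def fetch(self, key, now):
        # Return the cached status, or None if a fresh lookup is needed.
        entry = self.cache.get(key)
        if entry is None or entry.seq != self.seq or entry.expires <= now:
            return None
        return entry.status

    def set_online(self, dc_seq, force_refresh):
        # Assumed behaviour: without a forced refresh the stale sequence
        # number is kept, so entries cached during the outage stay valid.
        if force_refresh:
            self.seq = dc_seq
        self.online = True


def login_result(force_refresh, login_time=120):
    dom = Domain()
    # t=0: trusted DC unreachable, the failure is written to the cache.
    dom.store("Domain Users", "NT_STATUS_TRUSTED_DOMAIN_FAILURE", now=0)
    # The DC comes back and the domain is marked online.
    dom.set_online(dc_seq=4711, force_refresh=force_refresh)
    # A user logs in and "Domain Users" is looked up again.
    return dom.fetch("Domain Users", now=login_time)


if __name__ == "__main__":
    print(login_result(force_refresh=False))  # stale failure is served
    print(login_result(force_refresh=True))   # None: entry is looked up again
```

In this model the cached failure keeps being returned until the entry times
out, which matches the roughly 30 minute outage seen by the customer, whereas
refreshing the sequence number on the online transition invalidates it
immediately.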