[Samba] Winbind and caching - idmap, DC

Alexey A Nikitin nikitin at amazon.com
Fri Oct 18 20:13:25 UTC 2019

On Friday, 18 October 2019 12:29:45 PDT Rowland penny via samba wrote:
> On 18/10/2019 19:45, Alexey A Nikitin wrote:
> > On Friday, 18 October 2019 10:52:40 PDT Rowland penny via samba wrote:
> >> On 18/10/2019 18:26, Alexey A Nikitin via samba wrote:
> >>> Hi everyone,
> >>>
> >>> I have a few questions about Winbind on an AD DS domain member that I'm having difficulty finding answers to in the docs on my own:
> >>> * Does Winbind remember the last DC it was connected to on shutdown? Will it attempt to connect to the same DC on restart, or will it go through the DC location process again?
> >> I don't think it caches the last DC, and as it might not always use
> >> the same DC in the same session, it doesn't really matter.
> > In some corner cases that we're hitting (related to replication delay, I believe) it does matter.
> >
> >>> * If yes, will that information be wiped out when one runs 'net cache flush'?
> >>> * If yes, is 'net cache flush' necessary when changing the idmap configuration? It seems that even after a winbind restart, querying user info still returns the old UID from before the idmap config change :-/
> >> What do you mean by 'changing idmap configuration'?
> >>
> >> Why are you doing this and, more importantly, how are you doing this?
> >>
> > Meaning I'd gladly use something like autorid, but apparently it doesn't allocate the range for a given domain until someone from that domain actually authenticates on the machine. As I need to make certain configurations for the user before they log in for the first time, this doesn't work for me. So I use the rid backend, but my understanding is that this backend needs ranges configured for specific domains, and I don't know ahead of time what domain the user belongs to; I only have the user's SID. So the current approach is: I use the default idmap backend (tdb) after the initial domain join to query user info and get their domain, then I stop winbind and change the backends to autorid for * and rid for the user's domain, something like
> >
> >          idmap config * : backend = autorid
> >          idmap config * : range = 100010000-2100010000
> >          idmap config * : rangesize = 100000000
> >          idmap config <DOMAIN> : backend = rid
> >          idmap config <DOMAIN> : range = 10000-100009999
> >
> > With this approach it appears I have to flush the Winbind cache before I can query user info again and get the UID based on the new idmap configuration instead of the old default tdb idmap. Once I get the new UID I can make the necessary configurations and finish the script.
> >
> >>> * If yes, can the cache be wiped selectively, i.e. only the idmap cache without the last DC cache (assuming the answer to the first question is yes)?
> >>> * If no, can 'net cache flush' be done while Winbind is running, and will it achieve the desired effect with regard to the SID-UID id mapping change without losing the connection to a particular DC?
> >> It shouldn't matter which DC you connect to, for a given smb.conf, you
> >> should always get the same UID for a given user.
> >>
> > Sure, but I have to change the smb.conf as an ugly workaround for a limitation outside of my control (I have only the user's SID; I don't have their domain, nor do I even know how many domains there are in the forest), so the UID is expected to change. A DC change, on the other hand, is undesired, because this config change and Winbind restart are scripted and happen within seconds after the initial domain join. It is my understanding that if change notification is disabled within AD DS, the new machine account simply doesn't get enough time to replicate to the rest of the DCs in the domain, which is why I'm seeing something like this in the logs after 'net cache flush' and Winbind restart:
> >
> > [<TIMESTAMP>,  1] ../source3/winbindd/winbindd_cm.c:1300(cm_prepare_connection)
> >    Failed to prepare SMB connection to <DC>: NT_STATUS_LOGON_FAILURE
> > [<TIMESTAMP>,  1] ../source3/winbindd/winbindd_cm.c:1160(cm_prepare_connection)
> >    authenticated session setup to <DC> using <DOMAIN>\<HOSTNAME>$ failed with NT_STATUS_LOGON_FAILURE
> >
> > This issue is intermittent, but seems to be more likely the more DCs and sites there are in the domain. Likely some AD DS misconfiguration is involved too, as I sometimes see Winbind connecting to a DC in the wrong site, but that is also outside of my control. There is also a timeout value outside of my control that limits how long I can wait and retry.
> >
> > One more thing to note is that Windows machines have zero issues in this situation. One reason is that we don't have to fiddle with idmap at all and can just use the SID directly to set up the necessary configurations for the user. The other is that, AFAIK, the Windows NetLogon service (at least according to the MS docs) actually caches the last DC it was using and tries to re-use it on start, going into DC location only if either it cannot connect to the cached DC or that DC reports on connection that the client site has changed and it should locate a different, closer DC.
> >
> > Thanks!
> If you are using 'sites', then you should also be using subnets and your
> clients should only use DCs in their site, but you also mention 'forest',
> which is a collection of domains, not sites.
> Just what configuration do you have to do?

Any and all. 'I' am not using either sites or a forest; the domain setup is completely out of my hands. I have zero control over the domain(s) setup and almost zero knowledge of the configuration I have to deal with. All I have is the creds for the service account used to join the machine to the domain, the name of the domain (and the OU in that domain) I'm supposed to join the machine to, and the SID of the user I need to pre-populate the profile and make certain other configurations for. That's all; everything else I have to discover as I go.

>  From my understanding 'autorid' could have been written for you.

And I do make use of it; I was just missing the part where I could pre-allocate the mapping range for a given domain SID without having someone from that domain authenticate on the machine. So I was using the rid backend for the specific given domain, the name of which I have to discover, hence the hack. I'll try pre-allocating the range for the domain of interest using its SID and see if that solves the problem, so that I no longer need this ugly hack of re-configuring idmap after the join.
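
For illustration, here is a minimal Python sketch of how a user SID splits into a domain SID plus RID, and how the two backends would then derive a Unix ID from that RID. It follows the formulas as I understand them from the idmap_rid and idmap_autorid man pages (ID = RID - BASE_RID + LOW_RANGE_ID for rid; ID = REDUCED RID + LOW VALUE + RANGE NUMBER * RANGE SIZE for autorid). The helper names, the example SID, and the range bounds are made up for the example; this is not Samba code, and the autorid range number is whatever ends up allocated to the domain.

```python
def split_sid(user_sid):
    """Split 'S-1-...-RID' into (domain_sid, rid). Hypothetical helper."""
    domain_sid, _, rid = user_sid.rpartition("-")
    if not domain_sid.startswith("S-1-") or not rid.isdigit():
        raise ValueError("not a user SID: %r" % user_sid)
    return domain_sid, int(rid)


def rid_backend_id(rid, low, high, base_rid=0):
    """Unix ID per the idmap_rid formula: RID - BASE_RID + LOW_RANGE_ID."""
    unix_id = rid - base_rid + low
    if not low <= unix_id <= high:
        raise ValueError("RID %d does not fit into range %d-%d" % (rid, low, high))
    return unix_id


def autorid_id(rid, low, rangesize, range_number):
    """Unix ID per the idmap_autorid formula for an already allocated range."""
    reduced_rid = rid % rangesize
    return reduced_rid + low + range_number * rangesize


# Made-up SID; bounds are illustrative values in the spirit of the config above.
domain_sid, rid = split_sid("S-1-5-21-1111111111-2222222222-3333333333-1104")
print(domain_sid, rid)
print(rid_backend_id(rid, low=10000, high=100009999))
print(autorid_id(rid, low=100010000, rangesize=100000000, range_number=0))
```

The point is that the domain SID is recoverable from the user SID alone, so a range keyed on the domain SID does not require knowing the domain name first.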
