A future module like idmap_hash or sssd? (was: Re: [PATCH] Check if the idmap_hash range is big enough)

Sun Feb 26 08:59:35 UTC 2017

On Tue, 2017-02-21 at 10:16 +0100, Michael Adam wrote:
> 
> Let me try to explain what the module does. This is not yet a
> polished text, but may serve as a basis:
> 
> =========================================================================
> The idmap_hash module calculates a Unix ID for a given SID as
> follows:
> 
> - Write the SID as DOMAINSID-RID.
> - The module calculates a 12-bit hash value of the DOMAINSID,
>   i.e. a value hash(DOMAINSID) between 0 and 4095.
> - The unix-ID for SID is then calculated as
> 
>     unix-id(SID) = hash(DOMAINSID) * 0x080000 + (RID % 0x080000)
> 
>   (Note 0x080000 == 524288 and 4095 == 0x0FFF.)
> 
> 
> Hence:
> 
> - Each domain has its predefined fixed range of
> 
>     hash(DOMAINSID)*0x080000 -- (hash(DOMAINSID)*0x080000 + 524287)
> 
> - The overall required range to be able to map all SIDs is
> 
>     0 -- 4096 * 524288 - 1 = 2147483647
> 
> This leads to a few issues:
> 
> - Any range smaller than 0 - 2147483647 will filter some SIDs.
> - Since we cannot start the range at 0, some SIDs can *never*
>   be mapped.
> - Some domain SIDs will be mapped to the same range.
> - RIDs will wrap around, i.e. DOMSID-RID and
>   DOMSID-(RID+524288) will be mapped to the same ID.
> 
> Hence the recommendation is:
> 
>    DO NOT USE THIS MODULE!
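
To make the arithmetic above concrete, here is a minimal Python sketch of
the mapping as Michael describes it. It is not the module's code: the
12-bit hash of the domain SID is passed in as a plain integer because the
hash function itself is not spelled out here, and the names unix_id,
domain_range, SLOT_SIZE and NUM_SLOTS are made up for illustration.

```python
SLOT_SIZE = 0x080000   # 524288 IDs per domain, as in the formula above
NUM_SLOTS = 4096       # a 12-bit hash gives 4096 possible slots


def unix_id(domain_hash, rid):
    """unix-id(SID) = hash(DOMAINSID) * 0x080000 + (RID % 0x080000)."""
    if not 0 <= domain_hash < NUM_SLOTS:
        raise ValueError("domain hash must fit in 12 bits")
    return domain_hash * SLOT_SIZE + (rid % SLOT_SIZE)


def domain_range(domain_hash):
    """Return the (low, high) Unix ID range reserved for one hash value."""
    low = domain_hash * SLOT_SIZE
    return low, low + SLOT_SIZE - 1


if __name__ == "__main__":
    # The RID wrap-around: RID and RID+524288 land on the same Unix ID.
    print(unix_id(1, 1000), unix_id(1, 1000 + SLOT_SIZE))
    # The top of the last slot is the end of the overall range.
    print(domain_range(NUM_SLOTS - 1))
```

The last call shows why the module needs the whole 0 -- 2147483647 range
before every SID can be mapped.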

As I work to make the Samba AD DC more useful to POSIX-centric
organisations, I would like to include (and perhaps eventually default
to) a hash-based system for ID allocation in the AD DC, so as to gain
consistent but distributed uid/gid allocation.

Could a module that runs on the DC (and so can write to the directory)
address some of these concerns?

I'm thinking we could use the trustPosixOffset to store the hash
offset.  While there is a non-zero chance that two disconnected forests
may gain an inter-forest trust and turn out to share the same SID hash,
we could at least detect that.
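
As a rough illustration of the detection step, assuming we already had
one stored hash slot per known domain (how that value would be read from
or written to trustPosixOffset is not shown, and find_slot_collisions is
a hypothetical helper, not an existing Samba function):

```python
from collections import defaultdict


def find_slot_collisions(domain_slots):
    """Map each slot claimed by more than one domain to those domains.

    domain_slots: dict of domain name -> hash slot (an integer).
    """
    by_slot = defaultdict(list)
    for domain, slot in domain_slots.items():
        by_slot[slot].append(domain)
    # Keep only the slots that more than one domain hashed to.
    return {slot: sorted(names)
            for slot, names in by_slot.items() if len(names) > 1}


if __name__ == "__main__":
    # Made-up example data: OURS and PARTNER share slot 5.
    print(find_slot_collisions({"OURS": 5, "OTHER": 7, "PARTNER": 5}))
```

A check like this could run when a trust is established, so the admin
is told about the clash rather than silently getting overlapping IDs.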

We could also detect via RID allocation pools if we are using more than
one 'slot' in our own domain (if not for other domains).
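
The check itself is simple arithmetic. A sketch, assuming the slot size
of 0x080000 from Michael's description and that we can learn the highest
RID handed out so far (slots_needed and fits_one_slot are illustrative
names only):

```python
SLOT_SIZE = 0x080000  # 524288, the per-domain slot size described above


def slots_needed(highest_rid):
    """Number of slot-sized windows touched by RIDs 0..highest_rid."""
    if highest_rid < 0:
        raise ValueError("RID must be non-negative")
    return highest_rid // SLOT_SIZE + 1


def fits_one_slot(highest_rid):
    """True while no allocated RID has wrapped past the first slot."""
    return slots_needed(highest_rid) == 1


if __name__ == "__main__":
    for rid in (499999, 524287, 524288):
        print(rid, slots_needed(rid), fits_one_slot(rid))
```

A domain only needs a second slot, or starts wrapping, once a RID of
524288 or higher has been allocated.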

Finally, I would like to write the uidNumber or gidNumber generated by
this hash into the directory at user creation time, knowing that RID
allocation is unique and avoiding the need to create yet another
distributed number allocation scheme.

I'm told that sssd has a scheme like idmap_hash that is less offensive.
Is that the case, or is it just that it is outside Samba so we don't
hear about the problems?

I realise that all hashes are lossy, and there is always a non-zero
chance that a domain will collide when a trust is established, but do
you think there is a way we could make the default behaviour sensible
and configurable (if not perfect)?

Currently folks copy around an idmap.ldb file or have randomised UID
values on each DC and member server by default.  Do you think we could
do something with automatic defaults for the vast majority of installs,
which have well under 500,000 RIDs?

Thanks,

Andrew Bartlett

-- 
Andrew Bartlett                       http://samba.org/~abartlet/
Authentication Developer, Samba Team  http://samba.org
Samba Developer, Catalyst IT          http://catalyst.net.nz/services/samba