A future module like idmap_hash or sssd? (was: Re: [PATCH] Check if the idmap_hash range is big enough)

Scott Lovenberg scott.lovenberg at gmail.com
Wed Mar 1 01:59:24 UTC 2017


> On Sun, Feb 26, 2017 at 2:59 AM, Andrew Bartlett <abartlet at samba.org> wrote:
>
> On Tue, 2017-02-21 at 10:16 +0100, Michael Adam wrote:
> >
> > Let me try to explain what the module does. This is not yet a
> > polished text, but may serve as a basis:
> >
> > =========================================================================
> > The idmap_hash module calculates a Unix ID for a given SID as
> > follows:
> >
> > - Write the SID as DOMAINSID-RID.
> > - The module calculates a 12-bit hash value of the DOMAINSID,
> >   i.e. a value hash(DOMAINSID) between 0 and 4095.
> > - The unix-ID for SID is then calculated as
> >
> >     unix-id(SID) = hash(DOMAINSID) * 0x080000 + (RID % 0x080000)
> >
> >   (Note 0x080000 == 524288 and 4095 == 0x0FFF.)
> >
> >
> > Hence:
> >
> > - Each domain has its predefined fixed range of
> >
> >     hash(DOMAINSID)*0x080000 -- (hash(domainsid)*0x080000 + 524287)
> >
> > - The overall required range to be able to map all SIDs is
> >
> >     0 -- 4096 * 524288 - 1 = 2147483647
> >
> > This leads to a few issues:
> >
> > - Any range smaller than 0 - 2147483647 will filter some SIDs.
> > - Since we can not start the range at 0, some SIDs can *never*
> >   be mapped.
> > - Some domain SIDs will be mapped to the same range.
> > - RIDs will wrap around, i.e. DOMSID-RID and
> >   DOMSID-(RID+524288) will be mapped to the same ID.
> >
> > Hence the recommendation is:
> >
> >    DO NOT USE THIS MODULE!
>
> As I try to work to make the Samba AD DC more useful to posix-centric
> organisations, I would like to include (and perhaps eventually default
> to) a hash-based system for id allocation in the AD DC, so as to gain
> consistent but distributed uid/gid allocation.
[...]
>
> I'm told that sssd has a scheme like idmap_hash that is less offensive,
> is that the case, or is it just that it is outside Samba so we don't
> hear about the problems?
>
> I realise that all hashes are lossy, and there is always a non-zero
> chance that a domain will collide when a trust is established, but do
> you think there is a way we could make the default behaviour sensible
> and configurable (if not perfect)?
[...]
> Currently folks copy around an idmap.ldb file or have randomised UID
> values on each DC and member server by default.  Do you think we could
> do something with automatic defaults for the vast majority of
> installs with well less than 500,000 RIDs?
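
For anyone following along, the arithmetic Michael describes above can be
sketched roughly as below. This is only an illustration of the quoted
formula, not the module's actual code: the real 12-bit hash of the domain
SID is not reproduced, so the hash is passed in as a plain integer, and the
names unix_id and domain_range are made up for this sketch.

```python
# Sketch of the idmap_hash arithmetic as described in the quoted text.
# ASSUMPTION: the real hash(DOMAINSID) is not implemented here; callers
# supply the 12-bit hash value directly. Function names are hypothetical.

RANGE_SIZE = 0x080000   # 524288 IDs reserved per domain hash value
HASH_VALUES = 4096      # a 12-bit hash yields values 0..4095


def unix_id(domain_hash, rid):
    """unix-id(SID) = hash(DOMAINSID) * 0x080000 + (RID % 0x080000)."""
    if not 0 <= domain_hash < HASH_VALUES:
        raise ValueError("domain hash must fit in 12 bits")
    return domain_hash * RANGE_SIZE + (rid % RANGE_SIZE)


def domain_range(domain_hash):
    """Return the (lowest, highest) Unix ID a domain hash value can produce."""
    low = domain_hash * RANGE_SIZE
    return low, low + RANGE_SIZE - 1


if __name__ == "__main__":
    # RID wrap-around: RID and RID + 524288 land on the same Unix ID.
    print(unix_id(1, 1000), unix_id(1, 1000 + RANGE_SIZE))
    # The top of the last domain's range is the overall upper bound.
    print(domain_range(HASH_VALUES - 1))
```

The wrap-around and the 0 -- 2147483647 overall range from the quoted text
both fall straight out of these two functions.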

I had a couple of thoughts, or more accurately - questions for which
I'm not nearly informed enough to even gauge my own lack of knowledge
on the topic. :)

First, that number, 500K RIDs, assumes the hashing function has zero
collisions, even when it is fed existing values that were allocated
randomly (possibly mixed with linearly allocated ones), right? I'm
terrible at math, but that address space might get ugly way before
saturating the half million RID mark unless I'm mistaken (collisions
~50% of the time after 250K and quickly degrading from that point, I
would guess). That's probably not an important point, but I thought it
might be worth mentioning when working out how many RIDs are actually
usable in the given address space.
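
To put rough numbers on that guess, here is a small sketch using the
standard birthday-problem approximation. It assumes values are spread
uniformly at random over one domain's 524288 slots, which is NOT what the
quoted formula does (that uses RID % 0x080000, so sequentially allocated
RIDs below 524288 do not collide within a domain). The function names are
made up for illustration.

```python
import math

SLOTS = 0x080000  # 524288 IDs in one domain's range


def any_collision_probability(n, slots=SLOTS):
    """Approximate chance that n uniformly hashed values contain a collision."""
    return 1.0 - math.exp(-n * (n - 1) / (2.0 * slots))


def half_collision_point(slots=SLOTS):
    """Approximate n at which a collision somewhere becomes a coin flip."""
    return math.sqrt(2.0 * slots * math.log(2.0))


def next_value_collision_chance(n, slots=SLOTS):
    """Expected fraction of slots occupied after n uniformly hashed values,
    i.e. the chance that one more value lands on a slot already in use."""
    return 1.0 - (1.0 - 1.0 / slots) ** n


if __name__ == "__main__":
    print(half_collision_point())               # roughly 853
    print(any_collision_probability(250000))    # effectively 1.0
    print(next_value_collision_chance(250000))  # about 0.38
```

Under that uniform-hash assumption the first collision is likely after
fewer than a thousand values, and by 250K entries a bit over a third of
new values would land on an ID already in use.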

Second, this sounds like the exact definition of a distributed hash
table, as I'm sure you're painfully aware, or a reasonable case for
the use of RAFT***.  I understand that additional libraries (at the
least) and plumbing would be required, but I'm guessing that's the
case for any non-trivial solution to this problem space.  OTOH, a RAFT
solution already has most of the plumbing in place since there is
already NT4/"FSMO" election code and clusters are defined explicitly
or could be found via winbind/remote announce/whatever else, etc.
Does anyone have thoughts or insight on how well this problem space
maps to the ideal (existing or theoretical) implementations of DHT
and/or RAFT solutions?

*** For those unfamiliar with RAFT and curious: https://raft.github.io
-- 
Peace and Blessings,
-Scott.


