Proposal/Idea: Remove support for using rfc2307 attributes for s4 id-mapping?

Mon Oct 15 18:13:23 MDT 2012

On Mon, 2012-10-15 at 23:39 +0200, Michael Adam wrote:
> Simo,
> 
> On 2012-10-15 at 11:46 -0400, simo wrote:
> > On Mon, 2012-10-15 at 16:51 +0200, Michael Adam wrote:
> > > Hi Simo,
> > > 
> > > On 2012-10-15 at 10:25 -0400, simo wrote:
> > > > On Mon, 2012-10-15 at 15:17 +0200, Michael Adam wrote:
> > > > > 
> > > > > Am I missing something important here?
> > > > 
> > > > Sorry Michael, I think this would be a very bad mistake.
> > > > 
> > > > I actually would think we should use *only* rfc2307 attributes, as those
> > > > are the authoritative ones when an admin wants to use them.
> > > 
> > > I was thinking the same initially.
> > > A long and fruitful discussion with Matthieu and Metze
> > > convinced me of the opposite..
> > > 
> > > > What are the exact difficulties here ?
> > > 
> > > - We want the users in our AD to be able to access files in the
> > >   shares, e.g. the sysvol share, right?
> > 
> > ack
> > 
> > > - Hence they need unix IDs (for their token SIDs).
> > >   These should be added automatically when users are created
> > >   or when users are connecting to the file server on the DC.
> > 
> > ack
> > 
> > > - While this would not be problematic in a single-DC setup,
> > >   we have no reliable and no official way of working with the
> > >   unix-ID-pool in a multi-DC setup. (There is no concept of
> > >   unix-ID master (fsmo) as there is for rids.)
> > > 
> > >   I know that multi-DC is not the focus for 4.0, but I am convinced
> > >   this will bite us quite badly later..
> > 
> > why is this a problem ?
> > if you use algorithmic mapping all you need to do is rid -> uid
> > and why wouldn't we do algorithmic mapping ? It's the thing that makes
> > the most sense.
> 
> That does not work, at least it can not always work:
> 
> - iirc, initially support for rfc2307 was introduced as part of
>   the s3->s4 upgrade migrating samba-ldap posix attributes to the
>   ad rfc2307 attributes. ==> no algorithmic mapping
>   (hence that would only work for new provisions)
> 
> - I think nothing prevents the admin from creating a user object
>   including the rfc attribute before samba's mechanism can kick in.

Yes, admins are all-powerful and can create quite some trouble; they can
also load arbitrary mappings that do not make sense into any idmap backend.

> > > - Windows does not automatically create the posix-IDs
> > >   when creating users/groups. If we would, how would you
> > >   ensure consistency when in a multi-DC setup admins are
> > >   working on multiple DCs simultaneously with the user manager?
> > 
> > When a samba DC gets the replication of a new user it adds the
> > attributes if they are missing. It would be a quite simple plugin, and
> > if you use algorithmic mapping it doesn't matter which samba server does
> > it first or if multiple ones do that, they all add the same data hence
> > conflict resolution should have no issues when they try to replicate
> > back.
> > 
> > > - Thinking this through further: In a heterogeneous environment,
> > >   the admin working on a windows-DC (by chance) would create
> > >   users that would stay locked out of the samba file server, or
> > >   would get fallback unix-IDs from idmap.ldb if that was still
> > >   enabled.
> > 
> > No, see above.
> 
> This is simply not how the SFU AD extension was designed by Microsoft:
> What if an admin creates a user when working on a windows DC
> and manually assigns a posix user id to that object before it
> is replicated.
> ==> oops

What is the oops ? Admins need to know how the system works; if they act
against the system, too bad for them.
Note that we can still have an idmap.tdb caching backend to resolve
non-algorithmic mappings if needed. What I am saying is that this
shouldn't hold us back.

> Or else, the object gets replicated out to a samba DC, which adds
> an algorithmic ID, but the admin modifies the object, adding a
> unix ID before the change is replicated back from the samba DC.
> ==> oops

If the admin changes the ID it will get replicated around and conflict
resolution will take place. Admins need to know that samba will
automatically assign IDs, if they want to change assignments after the
fact it is up to them.
We allow the same thing in IPA and document the caveats, all works just
fine.

> So it _might_ be possible to get it working correctly with a
> domain consisting only of samba DCs. But it will fail miserably
> with a heterogeneous domain.

Always fail ? Come on, stop with the hyperbole. It will fail only for
admins that want to break it. Admins who read a minimum of docs or ask on
the list will know how the system is supposed to work.

> In short you are trying to mix a database (the AD SFU posix
> attributes) with an algorithmic mapping. That is doomed.

I am just saying that should be the default assignment. I do not see
what's doomed here. Admins can change assignments; they will need to
understand the consequences (which will not be very dramatic unless they
change long-held IDs already used in ACLs on files, or they are clumsy
and assign duplicate IDs).

> > > - Addressing one frequent request:
> > >   There is no good reason I know for requiring a user/group to
> > >   have the same unix-ID on all DCs for a given domain.
> > 
> > Of course there is, you need sysvol and all shares to be consistent
> > across the domain for file replication,
> 
> File replication should ideally be done with windows mechanisms (DFSR).

Right now we have no such replication, and it is done via rsync, but in any
case, consistent UIDs across the domain is just a requirement. You can't
be serious in wanting to allow different machines in the same domain to
have different UIDs for the same users depending on which DC you are
connected to, that's just masochism.

> > for rsyncing,
> 
> If you need to use rsync between two samba hosts e.g., you
> should *not* use --numeric-ids. Going via the name will work
> even if the IDs are not in sync.
> 
> > for exporting stuff via NFS if it is needed.
> 
> I'd say omit serving NFS from a S4 AD DC by all means!
> What is more, I'd suggest to not use the DC for
> extensive file serving (SMB) if possible.
> Rather stick to sysvol and netlogon and add member
> file servers...

And yet people will do it, and will do backups and all other things.
Come on, be serious, we can have the same UIDs across the domain and it
makes no sense to not have them consistent.

> After all, a bunch of DCs is not a parallel NFS cluster.
> 
> This is a very frequent misconception that it helps
> having IDs on multiple samba servers in sync (unless
> of course it is a cluster that appears as a single
> server to the client). We don't need it. It might only
> give a deceptively safe feeling. :-)

No you really need to have the same IDs across the domain, move a disk
from one machine to another and you know you do. (and again NFS and any
other network filesystem including cifs.ko with unix extension calls for
it, don't kid yourself).
You can somehow cope with different IDs, but it is a lot less painful to
just keep them synchronized.

> > >   All communication between the DCs is done via SIDs not unix-IDs.
> > >   With the local idmap.ldbs, this just works (tm). No need to
> > >   publish that extra complexity (id mapping) to the outside,
> > >   creating an additional DB that needs to be kept in sync.
> > 
> > As long as the mapping is algorithmic it is a non-issue.
> 
> As pointed out above, algorithmic is not (always) possible.

And it is not a strict requirement as long as the admin does not
voluntarily create duplicates. Algorithmic assignment makes life easy,
but admins can ultimately assign what they want, because in the
directory you have all the mapping you need (user sid - uidNumber, group
sid - gidNumber).
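
As a minimal sketch of what such an algorithmic assignment could look
like (Python; the base offset and the function names are assumptions made
up for illustration, not existing Samba code):

```python
# Hypothetical sketch: derive a unix ID from the RID of a SID in our
# own domain. ID_BASE is an assumed start of the reserved ID range.
ID_BASE = 100000


def rid_from_sid(sid):
    """Return the last sub-authority (the RID) of a SID string."""
    return int(sid.rsplit("-", 1)[1])


def algorithmic_id(sid, base=ID_BASE):
    """Same SID in, same unix ID out, whichever DC computes it."""
    return base + rid_from_sid(sid)
```

Because the result depends only on the SID and a fixed base, every DC
that fills in a missing attribute would write the same value.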

> > > > Andrew pointed out some issues with IDMAP_BOTH as the SDC, but I think
> > > > we can find a method to handle idmap_both, without too much pain.
> > > 
> > > No, that was not the point I was making.
> > > You are right, these problems will get solved.
> > > 
> > > Have I been able to make my thoughts a little clearer?
> > > Am I making sense?
> > 
> > Sorry I do not see any of the issues you raised relevant unless we want
> > to keep the madness of dynamic assignment of mappings.
> > 
> > That was needed on member servers because we could get whatever random
> > SID, but on a DC we have just one domain SID and we can control ranges
> > and have a very simple algorithmic mapping.
> 
> We could think about an algorithmic mapping for the s4-dc's id
> mapping (for its own domain), but *not* for storing it in the RFC attributes.

Can you explain why not ? You haven't yet told me why you wouldn't use
rfc2307.
It is a very simple mapping database: sid/xidNumber stored in objects that
define the ID type. It doesn't get simpler than that, and the DC can do the
assignment on its own using a simple algorithmic assignment. What more do
you want ?
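
A rough sketch of that lookup, assuming a directory entry is handed over
as a plain dict (Python; the dict layout, ID_BASE and the function name
are hypothetical, only uidNumber/gidNumber are the real rfc2307
attribute names):

```python
# Hypothetical sketch: resolve the unix ID for one directory entry.
ID_BASE = 100000  # assumed range start, not a Samba default


def unix_id_for(entry):
    """Return (attribute name, unix ID) for a user or group entry."""
    # The object class decides the ID type: groups get a gid, users a uid.
    is_group = "group" in entry["objectClass"]
    attr = "gidNumber" if is_group else "uidNumber"
    if attr in entry:
        # An ID already stored in the directory (e.g. set by an admin) wins.
        return attr, int(entry[attr])
    # Otherwise fall back to the default algorithmic assignment.
    rid = int(entry["objectSid"].rsplit("-", 1)[1])
    return attr, ID_BASE + rid
```

An entry without the attribute gets the algorithmic default; an entry
where an admin already stored a value keeps that value.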

> > We have the chance of simplifying and make this stuff rock solid at
> > least for the DC case.
> 
> Yes, this is precisely my intention!

Apparently we have very different views, and unfortunately you haven't
conveyed yet what is difficult in using rfc2307, so far I have read only
handwaving. Can you please summarize in a few points the actual
difficulties ?

> > We have also wanted forever to have a 'standardish' way to publish unix
> > IDs in the classic samba DC case, we never did the unix info pipe but
> > always wanted to.
> > Now we have rfc2307 attributes that give us that for free and you want
> > to throw it away ? No way.
> 
> No, I want to keep it. As a service for external users (member
> server), but not using it for the DC.

I totally don't get it, if it is good for member servers why should it
be different for DCs ??? It matters exclusively in the 'file serving'
part of the DC which uses the same file server as member servers... you
totally lost me here.

Simo.

-- 
Simo Sorce
Samba Team GPL Compliance Officer <simo at samba.org>
Principal Software Engineer at Red Hat, Inc. <simo at redhat.com>