Idmap changes in 3.6

Fri Feb 4 07:22:19 MST 2011

List,

This email contains a proposal for changes to the idmap
configuration for the samba 3.6 release.
It is a long email since I felt I had to explain a lot
about id mapping as it was and is now in order to justify
the proposal.

I would be happy if someone really read the mail and
gave feedback. The important bits start at
"Configuration inconsistencies ..." and especially
at "Proposal".  ;-)

Here we go:

Some weeks ago, I suggested on the mailing lists a small
change of the configuration to complete my previous rewrite of
the idmap system. The idea of this change was to make
the idmap configuration more systematic and more coherent,
namely to deprecate the "idmap uid" and "idmap gid"
parameters and introduce a single "idmap range" parameter,
making the default idmap config more similar to the "idmap
config <DOMAIN> : range = ..." configuration of individual
domains. Under the hood, the uid and gid ranges are
assumed to be the same anyways, and the intersection is
taken if this is not the case.

Meanwhile I had a couple of discussions and further
experiences with idmap configuration in several situations,
and I would like to take a step back and propose a much more
radical change to the id mapping configuration in samba.

In order to explain why I am suggesting this, I need to
briefly review the past changes in the id mapping
subsystem and their consequences. (I may recall things
wrongly and may have misunderstood others. I might oversimplify
some things. Please correct me if I am wrong.)

Original ID Mapping
===================

Originally, there was just one default id mapping configuration,
configured with "idmap backend" and "idmap uid/gid".
Some backends like "rid" allowed for a multi domain
configuration in the backend parameter. These backends
like rid also automatically created mappings in the tdb
for domains not covered in the config line. I.e. the
tdb idmap backend was always a layer below the configured
idmap backend and potentially used as a fallback.

First Rewrite (Simo, 3.0.25)
============================

In 3.0.25 (late 2006/early 2007), Simo rewrote id mapping
to allow for multiple domains to be explicitly configured
with different backends and ranges. He deprecated the
"idmap backend" and related parameters and introduced
the "idmap config <DOMAIN> : backend/range/..." style
configuration. While this configuration scheme as such
is nice and pretty intuitive, the change had (imho) a
couple of problems:
1) There was only one "unix id allocator", not one
   in each idmap domain (that needed one). It was hence
   separated from the id mapping code and exposed in the
   configuration via parameters "idmap alloc backend",
   "idmap alloc config : range", "idmap alloc config : ..."
   etc. This was pretty confusing for a start. The fact
   that there was only one allocator also had a couple of
   limitations, namely that the allocator was effectively
   related to the "default" id mapping configuration/domain,
   i.e. the one config that should catch all idmaps for
   domains not otherwise explicitly configured.
2) The concept of the default idmap config was rather
   awkward in the new idmap scheme, namely you had to
   come up with a fake domain name that should not be
   a real domain name and mark the idmap config as default
   like this "idmap config FAKEDOMAIN : default = yes".

All this made the configuration rather complicated
and unintuitive in some ways. Another decision made here
was to not map sids from all domains by default but only
those from domains listed in the parameter "idmap domains".

Second Rewrite (3.3.0, Volker)
==============================

In Samba 3.3 Volker slightly rewrote the id mapping
again, after having despaired of the configuration
complexities. The goal was to make the configuration
simpler again. The basic changes were these:
1) "idmap domains" was removed, all domains were
   mapped again by default.
2) The "idmap backend/uid/gid" parameters were undeprecated
   and were now used for the default idmap configuration,
   hence "idmap config <domain> : default" was removed.
3) "idmap alloc config : range" was removed; the idmap
   default range (idmap uid/gid) was always used instead,
   which had already been the default of the alloc range.

The practical consequence of this (which had implicitly
already been the case before) was that there could only
ever be one "allocating idmap config" in a setup, and
it had to be the default config. I.e. the basic idea
was this: use tdb or ldap (or tdb2) as "idmap backend"
and define a large enough range, and configure any
explicit domains with "read only" backends such as
rid, ad, and others.

The basic configuration has been simplified in this
rewrite. The complexity of the alloc config was reduced
but it was still there. One major goal was to reestablish
the old "idmap backend = ..." style configuration as
the default configuration and integrate it with the
extended per-domain configuration.

Third Rewrite (started 2009, Michael)
=====================================

One major point I criticised in the design of the first
and also the second rewrite was that the alloc system
was separated from the id mapping and exposed to the
outside at all. It could be configured in smb.conf,
and it was unclear to people whether they had to configure it
and which combinations were possible and which were not.

At the code- and api-level, the separation of the alloc
system from the idmap system had the consequence that
the operations for creating ids and storing and deleting
id mappings were exposed at a very high level (winbindd API).

My feeling was that this allocation should be transparent
and completely hidden beneath the mapping code.
I.e. the id mapping operation sids_to_unixids should
just create ids and mappings under the hood. Instead
winbindd operated like this: try to read the mapping from
the db; if it does not exist, allocate a new unix id and
then store the mapping.
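
To make the difference concrete, here is a toy sketch in Python
(not Samba code: the class ToyIdmapBackend, its in-memory dict
and the function old_style_caller are made up for this mail, and
it ignores the uid/gid distinction). The old style lets the caller
drive lookup, allocation and storing; the new style hides all of
that behind sids_to_unixids:

```python
from typing import Dict, Iterable, Optional


class ToyIdmapBackend:
    """Toy stand-in for an allocating backend; purely illustrative."""

    def __init__(self, low: int, high: int) -> None:
        self.high = high
        self.next_id = low                  # simple high-water mark
        self.sid_to_id: Dict[str, int] = {}

    # Old style: three low-level operations visible to the caller.
    def lookup(self, sid: str) -> Optional[int]:
        return self.sid_to_id.get(sid)

    def allocate_id(self) -> int:
        if self.next_id > self.high:
            raise RuntimeError("idmap range exhausted")
        new_id = self.next_id
        self.next_id += 1
        return new_id

    def store(self, sid: str, unix_id: int) -> None:
        self.sid_to_id[sid] = unix_id

    # New style: one mapping call, allocation happens under the hood.
    def sids_to_unixids(self, sids: Iterable[str]) -> Dict[str, int]:
        result: Dict[str, int] = {}
        for sid in sids:
            unix_id = self.sid_to_id.get(sid)
            if unix_id is None:
                unix_id = self.allocate_id()
                self.sid_to_id[sid] = unix_id
            result[sid] = unix_id
        return result


def old_style_caller(backend: ToyIdmapBackend, sid: str) -> int:
    """What the high-level caller had to do before: read, allocate, store."""
    unix_id = backend.lookup(sid)
    if unix_id is None:
        unix_id = backend.allocate_id()
        backend.store(sid, unix_id)
    return unix_id
```

With the second form the caller never sees allocate_id or store,
which is the reduction of the api I am after.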

I gave a talk about that and a proposed (and started)
rewrite at sambaXP 2009.
(http://samba.org/~obnox/presentations/sambaXP-2009/)
The goal was to reduce the idmap api to the
sids_to_unixids and unixids_to_sids operations and
hide the allocation in the backends that needed it
and to remove the low level operations from the high level
(winbindd api).

I did not complete the rewrite then since I hit the
complication that the idmap unixid allocator was not
only used in the id mapping code but also used in other
places as samba's general unixid pool: group mapping
and ldapsam:editposix rely on it. Due to lack of time
the rewrite was not pursued further then.

Third Rewrite (for 3.6, Michael)
================================

In 2010, I picked up the rewrite and completed it.
This was needed to make the creation of id mappings atomic
and thereby reduce the number of (tdb) transactions needed
for each mapping to one, since in the clustered setup
with ctdb, such transactions are especially expensive.

The current state of the rewrite is this:

* Unixid allocation has been banned from the surface
  (from the configuration as well as from the API).
  - I.e. all the "idmap alloc" related parameters have vanished.
  - The set_uid/gid_mapping/hwm methods have vanished
    from winbindd and from the idmap api.
  A lot of code has been unified and cleaned up.

* The allocate_unixid method now calls into the
  default idmap config. The allocate_id method
  has been kept at the surface for this reason
  (it is still used in group mapping and ldapsam:editposix).

* The sids_to_unixids calls are now atomic
  (per backend).

* In principle it is now possible to
  use more than one allocating idmap domain.
  But the tdb and tdb2 backends have not been
  extended in their database format. Only the
  ldap backend is ready to do so database-wise,
  although this is not activated yet (for the
  sake of consistency with the tdb backends).
  But it can easily be done.

All this has been done and has simplified the
configuration further (basically by removing the alloc
parameters). It has removed some flexibility of the
previous versions (the flexibility to use one
backend as an allocator and another to store the
mappings, a feature of very questionable
practical use). It has increased the flexibility
in another direction: it has made it possible
in principle to configure multiple "allocating"
idmap domains (as said above: this only needs to
be activated).

Configuration inconsistencies to be resolved
============================================

There are still inconsistencies with the configuration
that could be addressed and removed in changes
that I am about to propose.

The first thing to observe is that
the default config uses "idmap uid" and "idmap gid"
in contrast to "idmap config <domain> : range".

Under the hood, the uid and gid ranges are
intersected anyways to build a common idmap range.
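
As a rough illustration of that intersection (a minimal Python
sketch, not the actual Samba code; the names parse_range and
intersect_ranges are made up here, and ranges are taken to be
inclusive "low-high" strings as in the config):

```python
from typing import Optional, Tuple

Range = Tuple[int, int]  # inclusive (low, high)


def parse_range(value: str) -> Range:
    """Parse a "low-high" string such as "10000-20000"."""
    low_text, high_text = value.split("-", 1)
    low, high = int(low_text), int(high_text)
    if low > high:
        raise ValueError("range low must not exceed range high")
    return low, high


def intersect_ranges(uid_range: Range, gid_range: Range) -> Optional[Range]:
    """Return the common part of both ranges, or None if they are disjoint."""
    low = max(uid_range[0], gid_range[0])
    high = min(uid_range[1], gid_range[1])
    return (low, high) if low <= high else None
```

For example, a uid range of 10000-20000 and a gid range of
15000-30000 leave only 15000-20000 as the effective idmap range,
which is easy to overlook with two separate parameters.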

So my first suggestion on the mailing lists
some weeks ago was to introduce an "idmap range"
parameter instead and deprecate the uid and gid
range parameters. There have not been objections
to this.

But I have observed later on that the problem is
more severe than just this little superficial
inconsistency.

For instance for the ldap backend code, there is
a horrible logic mixture between the parametric
options and fallbacks to various standard ldap
options, especially the selection of ldap secrets.
Fortunately, this has probably never really bitten
anyone so far, since starting with 3.3, the ldap
backend was only useful in the default config anyways.

The most severe problem I am seeing here is that
the backwards compatibility to the original idmap
configuration is incomplete and, in a subtle way,
deceiving:

For the tdb backend all is fine. But for instance
the rid backend can not really be useful as the
default backend any more, since there is no fallback
to storing tdb mappings in the newer code.
Hence all the older configs of the form
"idmap backend = rid:<DOMAIN>=xxxx-yyy..." do not work
as expected any more. There have even been bug reports
about this (e.g. https://bugzilla.samba.org/show_bug.cgi?id=7788).
There are more examples. I have spent quite some time
in the past fixing idmap configurations that were
based on misconceptions.

This failure to be really backwards compatible and all
those misunderstandings have led to the idea of making
a more radical change to the idmap configuration in 3.6.
This idea came up lately when I was discussing yet another
idmap problem with Björn.

Proposal
========

The idea is this:

We should drop all the old configuration options
and have new options for the configuration of the
default idmap domain that integrate nicely and systematically
with the per-domain idmap config (idmap config <domain> : abc).

This way users are made aware that something has changed,
and they are not lulled into the deceptive safety of
backwards compatibility.

It is also a chance to unify and simplify the code for
parsing the options.

One idea is to use the domain name '*' to configure
the default idmap config, i.e.

"idmap config * : backend = tdb" (the default)
"idmap config * : range = 10000-20000"

and so on.

This is appealing to me for the following reasons:

1. It is completely consistent with the remaining
   configuration.

2. The patch (which I will present soon) will consist
   mainly of removals.

3. Internally "*" is used as a name for the default
   idmap config anyways.

4. The backends have a dedicated place to look for their
   options also for the default config. Especially for the
   ldap backend this will clarify a lot.
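
To show how uniform the option handling could become with '*'
as the default domain, here is a small Python sketch (only an
illustration, not the patch: parse_idmap_config and
config_for_domain are hypothetical names, EXAMPLE is a made-up
domain, and domain names are matched exactly for simplicity):

```python
from typing import Dict, Iterable

IdmapConfigs = Dict[str, Dict[str, str]]


def parse_idmap_config(lines: Iterable[str]) -> IdmapConfigs:
    """Collect "idmap config <DOMAIN> : option = value" lines per domain."""
    prefix = "idmap config "
    configs: IdmapConfigs = {}
    for raw in lines:
        line = raw.strip()
        if not line.lower().startswith(prefix):
            continue
        key, sep, value = line[len(prefix):].partition("=")
        domain, colon, option = key.partition(":")
        if not sep or not colon:
            continue  # not of the expected form, ignore it in this sketch
        configs.setdefault(domain.strip(), {})[option.strip()] = value.strip()
    return configs


def config_for_domain(configs: IdmapConfigs, domain: str) -> Dict[str, str]:
    """Use the explicit config of a domain, else fall back to '*'."""
    return configs.get(domain, configs.get("*", {}))


example = [
    "idmap config * : backend = tdb",
    "idmap config * : range = 10000-20000",
    "idmap config EXAMPLE : backend = rid",
    "idmap config EXAMPLE : range = 30000-40000",
]
configs = parse_idmap_config(example)
```

Here config_for_domain(configs, "EXAMPLE") yields the rid settings,
while any other domain name ends up with the '*' settings. The
default config needs no special parameters and no special parsing.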

Opinions?

Out of respect for the people waiting (including me) for the
release of Samba 3.6, I would like to apologize for the long
time it has taken me to write this email. I also apologize for
the length of this mail, but I have really thought a lot
about the id mapping in the last year and I felt I had
to share some of these thoughts with you as the reasoning
behind the proposed changes.
(The length of the mail is also a reason for the delay
 in writing it... :-)

Cheers - Michael

Andrew Bartlett wrote:
> On Thu, 2011-01-27 at 01:09 +0100, Michael Adam wrote:
> 
> > I would like to suggest a radical but very consistent subsequent
> > change to idmapping that will get rid of the old and often
> > confusing configuration that keeps questions popping up on the
> > mailing lists and in bugzilla because it creates an illusion of
> > a backwards compatibility that is in many cases deceiving.
> > 
> > It is to late for me right now to write down the details.
> > I will send a follow-up mail soon.
> 
> Michael, 
> 
> As we seem to be in a mood for discussing parameter changes, I wondered
> if you had found time to detail your plans are for idmap?
> 
> Andrew Bartlett
> 
> -- 
> Andrew Bartlett                                http://samba.org/~abartlet/
> Authentication Developer, Samba Team           http://samba.org
> Samba Developer, Cisco Inc.
> 

-- 

i.A. Michael Adam

-- 
Michael Adam <ma at sernet.de>
SerNet GmbH, Bahnhofsallee 1b, 37081 Göttingen
phone: +49-551-370000-0, fax: +49-551-370000-9
AG Göttingen, HRB 2816, GF: Dr. Johannes Loxen
http://www.SerNet.DE, mailto: Info @ SerNet.DE