samba_kcc overview and current state information

Dave Craft wimberosa at
Wed Jan 11 14:55:35 MST 2012

I'm preparing to do an overview of the samba_kcc code with some team mates
and to that end I've produced an outline of samba KCC information and where we
stand.   The following note is part of that overview and is posted
here mostly for

Warning: this is a very long note, applicable only to people who really want to
know where we are with the samba kcc python implementation and what the code
looks like:

The Samba KCC implementations currently consist of the older C code
present in

 directory and the newer Python implementation in

The older implementation is still in use and primarily produces a set of N to N
NTDSConnections for every DC (even across sites).   Thus every DC must check
with every other DC to retrieve updates to database records.   This type of
connection production is the maximum set of connections that could ever be
needed and thus guarantees that replica updates are always seen.
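
To make the cost of that approach concrete, here is a minimal sketch (not taken
from the C code; the function name and DC names are made up for illustration)
that enumerates the in-bound connections a full N to N mesh implies:

```python
from itertools import permutations


def full_mesh_connections(dcs):
    """Return every (from_server, to_server) pair for an N-to-N mesh."""
    return [(src, dst) for src, dst in permutations(dcs, 2)]


dcs = ["DC1", "DC2", "DC3", "DC4"]
conns = full_mesh_connections(dcs)
# The mesh grows as N * (N - 1) directed connections.
print(len(conns), len(dcs) * (len(dcs) - 1))
```

The quadratic growth is the reason a subset-based topology is attractive once
the number of DCs gets large.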

The newer implementation attempts to follow the MS-Tech ref fairly
closely and thus
implements two separate topology algorithms.   The first (intra-site) mimics the
MS-Tech ref for intra-site very closely.   As such intra-site produces
for a subset of DCs that are determined should be in-bound
(fromServer) to the local
DC.   This means that the local DC need only consult a subset of all
DCs in the site
to get proper replica information.   There are some cardinal rules in
this algorithm that
are outlined in the tech-ref but the most obvious are things like:

    in-bound full NC replicas should be replicated via connections to
other DCs that also
    have full replicas

    a read/write replica should only be replicated thru a connection
to another DC with
    a read/write replica (ie. a replica to a read write DC should not
pass thru a RODC)


Note that the older implementation violates some of these rules.
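
As a rough illustration of the two rules quoted above, the check could be
sketched like this.   The Replica class and the eligible_source() function are
hypothetical names invented for this note, not samba_kcc code, and the real
rules in the tech ref have more cases than these two:

```python
from dataclasses import dataclass


@dataclass
class Replica:
    """Hypothetical stand-in for one NC replica held by a DC."""
    dc: str
    full: bool       # True for a full (not partial) replica of the NC
    writable: bool   # True for read/write, False for a replica on an RODC


def eligible_source(local, candidate):
    """Apply the two quoted rules to a candidate fromServer replica."""
    if local.full and not candidate.full:
        return False   # full replicas replicate only from full replicas
    if local.writable and not candidate.writable:
        return False   # read/write replicas never source from read-only
    return True
```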

You can find an overview of these types of rules at this URL:
or search for "Knowledge Consistency Checker Overview"

Under the MS Tech-Ref you will find an outline of the steps of the KCC and note
that there are about 7 major steps which include both intra-site and inter-site
replication algorithms.   The major steps to the algorithm are outlined in
subsections below the overview URL:

    Refresh kCCFailedLinks and kCCFailedConnections
    Intrasite Connection Creation
    Intersite Connection Creation
    Removing Unnecessary Connections
    Connection Translation
    Remove Unneeded kCCFailedLinks and kCCFailedConnections Tuples
    Updating the RODC NTFRS Connections

The newer samba KCC implementation mimics the above flow with a few absent
function implementations stubbed with (XXX - implementation needed).
The stubbed
functions are,, and    These functions are
currently no-ops.

The newer samba KCC implementation also implements the second major part of the
KCC topology, which is "inter-site".   The algorithm we implement does not,
however, adhere as closely to the MS-Tech ref as the "intra-site" algorithm
does.   This is mainly an artifact of time and current do-ability.   The
implemented algorithm already covers quite a lot of the spec (with more to
come), including the major pieces, such that inter-site replication mostly gets
the correct answer.   Notably the flow is almost exactly the same as the
published spec and the computation of ISTG, bridgeheads, and the color of graph
vertices is present.   We do not, however, produce a minimum cost spanning
tree.   Instead the newer implementation currently computes
bridgeheads for every site and makes NTDS Connections between the local site
and ALL other sites.   That means that we are producing a super-set of
instead of a set of NTDSConnections in a multi-site minimum cost spanning tree.
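
For comparison, here is a small sketch of what a minimum cost spanning tree
over site links looks like.   It uses plain Kruskal's algorithm, which is a
swapped-in textbook technique and not the algorithm in the MS-Tech ref (that
one also deals with vertex colors, transports, and schedules).   The site names
and costs are invented:

```python
def min_cost_spanning_tree(sites, links):
    """Kruskal's algorithm; links is a list of (cost, site_a, site_b)."""
    parent = {s: s for s in sites}

    def find(s):
        # Walk up to the set representative, compressing the path as we go.
        while parent[s] != s:
            parent[s] = parent[parent[s]]
            s = parent[s]
        return s

    tree = []
    for cost, a, b in sorted(links):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:          # edge joins two separate components
            parent[root_a] = root_b
            tree.append((a, b, cost))
    return tree


sites = ["A", "B", "C", "D"]
links = [(10, "A", "B"), (20, "B", "C"), (100, "A", "C"),
         (30, "C", "D"), (200, "A", "D"), (150, "B", "D")]
tree = min_cost_spanning_tree(sites, links)
# All-pairs uses every link; the tree needs only len(sites) - 1 of them.
print(len(links), len(tree))
```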

Now I stated "mostly gets the correct answer" in the preceding paragraph
because there is an
important missing function that must be filled in next and that is the
(within the daemon) of kccFailedConnections and kccFailedLinks.
These are actually
runtime tuples that the samba kcc should consume in order to select
fromServers and
bridgeheads.    Currently we assume that all DCs are alive and
available to be targets
of the NTDSConnections.   This is not correct because, while we are producing a
superset of NTDSConnections to other sites, we may be selecting dead
bridgeheads in those sites.   That means that we may not be receiving replica
information from other sites whose bridgeheads (the ones we've selected) are
down.

Note that I have not investigated much in regards to the production (in the
daemon) and consumption of these failure tuples by the samba_kcc.   I believe
they are retrievable by some defined RPC but I am not totally sure.   As an
aside it would probably be useful to be able to give these tuples to the
samba_kcc python script via some input file as well as via RPC.   I say this
because currently the samba_kcc can run standalone against an abbreviated
database (see --exportldif / --importldif) retrieved from some (possibly) third
party set of DCs (e.g. a customer environment) and there may not be a DC to
communicate with via RPC when running this way.   Hence in addition to
--importldif we may need an --importtuple option (or whatever we want to name
it) to also simulate the failed links/connections information that is used in
bridgehead selection.
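
A sketch of how such an input file might be consumed follows.   Everything
here is an assumption: no such option exists yet, the one-DN-per-line file
format is invented, and each failure tuple is reduced to just the DN of the
failed DSA, which is a simplification:

```python
import io


def load_failed_dsas(stream):
    """Read one DSA DN per line; blank lines and '#' comments are skipped."""
    failed = set()
    for line in stream:
        line = line.strip()
        if line and not line.startswith("#"):
            failed.add(line.lower())   # compare DNs case-insensitively
    return failed


def pick_bridgehead(candidates, failed):
    """Return the first candidate DN not marked as failed, else None."""
    for dn in candidates:
        if dn.lower() not in failed:
            return dn
    return None


# Hypothetical file contents standing in for an imported tuple file.
tuple_file = io.StringIO("# failed DSAs\nCN=DC1,CN=Servers,CN=SiteB\n")
failed = load_failed_dsas(tuple_file)
chosen = pick_bridgehead(["CN=DC1,CN=Servers,CN=SiteB",
                          "CN=DC2,CN=Servers,CN=SiteB"], failed)
print(chosen)
```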

The newer samba_kcc implementation is invoked from the dsdb/kcc/kcc_periodic.c
code if the (kccsrv:samba_kcc=true) variable is set in smb.conf.
Usually for test
purposes you just run samba_kcc by hand or from test scripts as no
running daemon
is currently necessary.

The older samba implementation will eventually need to be pruned when we switch
over to samba_kcc.   Part of the code in dsdb/kcc is a file kcc_topology.c which
is a currently inoperable version of inter-site.   It is there for
example only but I don't
think anyone needs to examine it because the MS-Tech ref has very good
information.   Note that while the MS-Tech inter-site algorithm is
vastly more complicated
than the intra-site implementation, it is still vastly easier to
understand because the
control flow is outlined in clear pseudo code that translates to
python pretty easily.

On the flip side, the MS-Tech intra-site information is written in more of a
paragraph format and it's easy to get tripped up when trying to transcode to
any language.   Here's an example from the tech ref:

        If s and the local DC's nTDSDSA object are in the same
        site, cn!transportType has no value, or the RDN of
        cn!transportType is CN=IP:"

It could be slightly clearer to me if it had an "or" between each condition, as
the above statement should be interpreted as:

         IF (same-site) OR (no-value) OR (type-ip)

That's probably obvious to others but the point I'm making is that you can get
tripped up easily and you must interpret what the logical implementation is
as opposed to turning your brain off.
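
In Python the interpreted condition comes out as a plain chain of "or"s.   This
is only a sketch: the function and parameter names are made up, and the RDN is
extracted with a naive split on "," that does not handle escaped commas in a
DN:

```python
def transport_condition(same_site, transport_type_dn):
    """True if ANY of the three quoted conditions holds."""
    no_value = transport_type_dn is None
    type_ip = (not no_value and
               transport_type_dn.split(",")[0].strip().upper() == "CN=IP")
    return same_site or no_value or type_ip
```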

In some cases I've pointed out MS-Tech errors in the samba python code
with comments like:

        #  <description of what is wrong>

and there are a couple of instances where this occurs.

Now .... on to the code overview....

The samba_kcc file implements one major class called (KCC).   The
global main function
is at the bottom of this file and instantiates the KCC class and calls either
      export_ldif(),  import_ldif(), or run()

The names here should be mostly self evident but run() is the main
topology computation
entry point and run() calls each of the main steps outlined above
(e.g. intrasite(),
intersite(), translate_ntdsconn(), etc).
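
As a toy model of that flow (not the real class), the shape of run() is roughly
as follows.   Only intrasite(), intersite() and translate_ntdsconn() are names
taken from this note; the other four method names are placeholders invented
for the sketch, and each step here does nothing but record that it ran:

```python
class KCCSketch:
    """Toy model of the run() flow; each step only records its name."""

    def __init__(self):
        self.trace = []

    # Placeholder name, invented for this sketch.
    def refresh_failed_links_connections(self):
        self.trace.append("refresh_failed_links_connections")

    def intrasite(self):
        self.trace.append("intrasite")

    def intersite(self):
        self.trace.append("intersite")

    # Placeholder name, invented for this sketch.
    def remove_unneeded_ntdsconn(self):
        self.trace.append("remove_unneeded_ntdsconn")

    def translate_ntdsconn(self):
        self.trace.append("translate_ntdsconn")

    # Placeholder name, invented for this sketch.
    def remove_unneeded_failed_links_connections(self):
        self.trace.append("remove_unneeded_failed_links_connections")

    # Placeholder name, invented for this sketch.
    def update_rodc_connection(self):
        self.trace.append("update_rodc_connection")

    def run(self):
        """Call the seven steps in the order the tech ref lists them."""
        self.refresh_failed_links_connections()
        self.intrasite()
        self.intersite()
        self.remove_unneeded_ntdsconn()
        self.translate_ntdsconn()
        self.remove_unneeded_failed_links_connections()
        self.update_rodc_connection()
        return self.trace


trace = KCCSketch().run()
```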

Note that it is very important that if a new LDAP object or attribute
is needed by the
topology algorithm (e.g. if you add an attribute to a samdb.query())
then you absolutely
must add the attribute to the appropriate search in export_ldif().
If you don't do that
then a LDIF that you extract from a third party database won't have
the attribute that
the newly changed algorithm relies upon.   So the rule is....

          (new samdb.query attributes == change export_ldif())
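
One cheap way to keep that rule honest would be a helper that compares the
attribute list of a query against the list used by the export.   This is a
hypothetical helper, not something in samba_kcc today, and the attribute lists
below are just examples:

```python
def missing_from_export(query_attrs, export_attrs):
    """Return attributes the algorithm queries that the export omits."""
    # LDAP attribute names are case-insensitive, so compare lowercased.
    exported = {a.lower() for a in export_attrs}
    return sorted(a for a in query_attrs if a.lower() not in exported)


query_attrs = ["fromServer", "schedule", "transportType"]
export_attrs = ["fromserver", "transportType"]
missing = missing_from_export(query_attrs, export_attrs)
print(missing)
```

An empty result would mean the export covers everything the query asks for.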

The file implements other classes that are invoked and utilized
by the KCC class.   Running pydoc on these files will get you lots of
but here's some overview of the major classes in this file:

class Site - found in the configuration partition of the samdb.   The class
contains its dn as well as interSiteTopologyGenerator, interSiteTopologyFailover
attributes, and a table of all DirectoryServiceAgents (DCs) under the site.

class DirectoryServiceAgent - a DC within a Site. The class can be queried to
determine if the DC is a RODC or ISTG and contains a table of NTDSConnections
under the DC as well as current and needed NC Replica tables.    With the
list of NC replicas loaded you can query whether an NC "is present" or
"should be present" on the DC.   Note that both "is present" and
"should be present"
are defined very specifically in the MS-Tech reference.

class NamingContext - a class that contains a small amount of code and that is
inherited by class NCReplica and class Partition.   The class can enumerate
what type (domain, default, application, configuration, schema) it is.   For
NCs found in the rootDSE it is pretty easy to determine what type they are,
whereas types like (application NCs) and (non default domain NCs) are a bit
harder.

class Partition - a class that inherits from NamingContext.  The list
of partitions is
loaded from the configuration partition of the samdb.  The class
Partition has two lists
enumerating DNs for DCs that have read only and read write replicas of the NC.

class NCReplica - a class that inherits from NamingContext.   These
are the objects
that are per DC and enumerate if a particular NC is present as a
replica on the DC.
Lists of instances of this class can thus be found in each
DirectoryServiceAgent class.
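
The inheritance relationship between these three classes can be sketched as
below.   Only the class names come from the code; the constructor arguments and
attribute names are illustrative guesses, not the real definitions:

```python
class NamingContext:
    """Base class: knows its DN and (once determined) its NC type."""

    def __init__(self, nc_dn, nc_type=None):
        self.nc_dn = nc_dn
        self.nc_type = nc_type   # e.g. "domain", "configuration", "schema"


class Partition(NamingContext):
    """Loaded from the configuration partition; tracks replica holders."""

    def __init__(self, nc_dn, nc_type=None):
        super().__init__(nc_dn, nc_type)
        self.rw_location_list = []   # DNs of DCs with read/write replicas
        self.ro_location_list = []   # DNs of DCs with read-only replicas


class NCReplica(NamingContext):
    """Per-DC object recording whether this NC is present on that DC."""

    def __init__(self, dsa_dn, nc_dn, nc_type=None):
        super().__init__(nc_dn, nc_type)
        self.dsa_dn = dsa_dn
        self.present = False
```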

class GraphNode - used only in the intra-site topology algorithm.   This class
enumerates a list of edges that point (or should be pointing) at a particular
DC.   It manages the algorithm regarding the maximum / minimum number of edges
required in the intra-site topology graph.   This eventually correlates to the
number of NTDSConnections that are created for a particular DC.

class RepsFrom - this is a special class that manages the
repsFromToBlob in a manner
such that you don't have to understand the version.   It also handles
that python reference
problem that we have with NDR code.   Basically it keeps a python
reference for all
assignments to the repsFromToBlob NDR element.   If it didn't do that
then a very subtle
bug shows up.   This is enumerated in the (WARNING) comment at the top
of the class.

class SiteLink - enumerates the replication schedule, cost, etc. for
replication across sites and contains a list of sites that it is applicable
to.   You can take the dn of two sites and give them to
SiteLink.is_sitelink(dn1, dn2) and it will tell you if this is the proper site
link to use in topology computation.
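
A guess at the kind of membership test involved is sketched below.   The class
name, the connects() method name, and the attributes are all hypothetical and
chosen for illustration; the real method may check more than list membership:

```python
class SiteLinkSketch:
    """Hypothetical stand-in holding a cost and the sites a link covers."""

    def __init__(self, dn, cost, site_list):
        self.dn = dn
        self.cost = cost
        self.site_list = list(site_list)

    def connects(self, site1_dn, site2_dn):
        """True if both site DNs are members of this link."""
        return site1_dn in self.site_list and site2_dn in self.site_list


link = SiteLinkSketch("CN=LinkAB", 100, ["CN=SiteA", "CN=SiteB"])
```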

class Transport - the inter-site transport object class.  Contains the
as well as the transportAddressAttribute.

class NTDSConnection - one of the primary classes of the KCC.   Instances of
this class may be in various states in regard to being committed to the
database.   For these types of objects that may be changed in the database, you
will usually see attributes (to_be_deleted, to_be_added, and to_be_modified).
If the object has not yet been deleted, modified or added to the database
persistently then one of these flags will be set.   The commit function for a
class that can be modified is usually commit_<name>(), such as
commit_connections(), which will make the appropriate modification in the samdb
and clear these flags.   Usually the KCC flow will run a major portion of the
algorithm (e.g. intrasite()) and will then call commit_connections() on
modifications made to the "in-memory" NTDSConnection class state.   Note also
that you can run samba_kcc in --readonly mode, which causes only the
"in-memory" state to be modified while the database is not persistently
updated.
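
The flag-then-commit pattern can be sketched as follows.   This is a toy model
under stated assumptions: ConnSketch is an invented class, a plain dict stands
in for the samdb, and the choice to clear the flags even in read-only mode is
my guess at the behaviour rather than something checked against the code:

```python
class ConnSketch:
    """Invented stand-in for an in-memory NTDSConnection."""

    def __init__(self, dn):
        self.dn = dn
        self.to_be_added = False
        self.to_be_modified = False
        self.to_be_deleted = False


def commit_connections(conns, db, readonly=False):
    """Flush pending changes into db (a dict standing in for the samdb)."""
    for conn in list(conns):
        if conn.to_be_deleted:
            if not readonly:
                db.pop(conn.dn, None)
            conns.remove(conn)           # in-memory state always changes
        elif conn.to_be_added or conn.to_be_modified:
            if not readonly:
                db[conn.dn] = conn
        # Assumption: flags are cleared whether or not db was written.
        conn.to_be_added = conn.to_be_modified = conn.to_be_deleted = False


db = {}
conn = ConnSketch("CN=conn1")
conn.to_be_added = True
conns = [conn]
commit_connections(conns, db, readonly=True)   # db is left untouched
```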


Regards, Dave Craft
Cut the headlights and put it in neutral.
