[PATCHv3] Faster handling of linked attributes

Douglas Bagnall douglas.bagnall at catalyst.net.nz
Thu Feb 2 21:58:16 UTC 2017

This time I think I have it right.

As a side-effect of getting it right, we have also accidentally made it
faster, with a join of an (admittedly artificial) 5000 user DC now
taking one fifth of the time, compared to origin/master.

My original description of the work (quoted below) is still accurate.


On 12/01/17 15:44, Douglas Bagnall wrote:
> In AD, linked attributes are complementary pairs of attributes that
> describe a relationship between two objects. One of these attributes
> is called the "forward link"; adding this implies and automatically
> creates the corresponding "back link".
> The canonical example is group membership: when you add a user to a
> group, you add a "member" attribute to the group that points to the
> user, which magically creates a "memberOf" attribute on the user
> pointing to the group. Groups can have many members.
> If we didn't manage linked attributes in a vaguely sensible way, we
> would end up having to traverse all the existing members of a group
> for simple operations like adding or deleting users. For domain
> replication we would be performing this traversal for each user being
> replicated, meaning a group would have a potential (and often real)
> cost proportional to the square of the number of members.
> This hypothetical situation describes what we currently do. To
> compound our misfortune, we have to explode the extended DN of every
> linked object in order to find a GUID to compare against, which turns
> out to be far more expensive than the comparing the GUIDs or shifting
> the values about.
> The solution in these patches is to store the links in a sorted order
> in the database. When a new value is added we bisect the array to find
> the link or insertion point. This means we only have to explode the
> DNs of bisection pivots.
> The existing code already uses a binary search, but it can only do so
> after the expensive work of exploding, parsing, and sorting all of the
> DNs. The binsearch macro it uses is only gives an absence/presence
> answer, which means we end up searching multiple times to find the
> exact status of each member. Using a binary search that returns a
> pointer to the object found allows us to simplify the code in many
> places (by some measures of simplify).
> How do we change the database storage order? For existing databases,
> we don't. Only new databases (including e.g. those used in autobuild
> runs) maintain the links in sorted order. When we're setting up the
> indexes on a new database, we add a record called
> "@SAMBA_FEATURES_SUPPORTED" which lists features the database might be
> compatible or incompatible with. If we find the value "sortedLinks" in
> the compatible features attribute, we tell the replmd module to assume
> links are sorted.
> It should not be terribly tricky to make a dbcheck rule or something
> that upgraded existing databases, but we don't provide that here.
> Without the "sortedLinks" flag the new code will parse and sort the
> links as before, and will store them in sorted order. It seems to be
> slightly faster than before. With the flag the speed-up depends on how
> many links the object has. The ad dc performance tests
> (source4/dsdb/tests/python/ad_dc_performance.py) get up to perhaps a
> thousand links and see a 60% reduction in clock time for several
> tests. Overall for these tests the percentage of CPU time spent in the
> replmd_modify_la_* functions drops from 29.4% to 4.2%, with most of
> the remaining time spent on back link processing. The bit these
> patches deal with really is several orders of magnitude faster on
> large databases.
> For what its worth, we are now finding that nearly half the CPU time
> is spent in memcpy inside tdb. But even with TDB_NO_FSYNC, the elapsed
> time for these tests is divorcing itself from the CPU time.
> please review carefully!
> Douglas

-------------- next part --------------
A non-text attachment was scrubbed...
Name: replmd-sorted-links.patch
Type: text/x-diff
Size: 127860 bytes
Desc: not available
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20170203/88e4c12a/replmd-sorted-links-0001.diff>

More information about the samba-technical mailing list