[PATCH] LMDB full patch set
Stefan Metzmacher
metze at samba.org
Fri May 11 18:07:27 UTC 2018
Hi Andrew,
>>>> From reading the patches it seems that it will no longer possible
>>>> to build ldb or an AD DC without lmdb, e.g. 32bit support will be gone.
>>>> I don't think that's acceptable right now, maybe in a few years.
>>>
>>> Yes, you are correct we will require lmdb for the AD DC and ldb
>>> standlone builds.
>>>
>>> lmdb is shipped on 32 bit ubuntu. I've set up a test environment and
>>> will confirm the LDB tests pass on it. Howard Chu indicates earlier in
>>> the thread that it is fully functional, but limited in DB size to 2GB.
>>
>> Are we going to require lmdb at runtime or do you want to require it
>> at compile time?
>
> Just compile time for now, it is new and and want to test it.
>
>> Even for the standalone ldb build we have --disable-python.
>> I think a standalone ldb should be able to build without lmdb
>> it the user really wants this.
>
> The only reason --disable-python needs to remain is for the fileserver,
> which never uses python anyway. I support keeping that mode because it
> will make it easier to move to python3 (ironically) because we can be a
> little more aggressive on dependencies for the AD DC than for pure
> fileservers, which still need to work on AIX et al.
>
> For your example, it is actually what I want to avoid!
>
> In that case if we had a system ldb without python, we then go and
> build pyldb from the (samba) build tree, but call #include
> "ldb_private.h"! While it is the classic way to introduce new things,
> it creates combinations activated either accidentally or 'because I
> want to', it turns out to be both untested and actively harmful.
>
> For this particular one I'll be submitting a patch today to make that
> building a python-enabled Samba against a not-python ldb will just
> fail. (This mistake actually goes back to why python was itself the new
> untrustworthy and experimental thing).
I'm happy to require a system pyldb if we use a system libldb
as AD DC. But the difference is that pyldb is strictly required
at runtime for an AD DC, while the lmdb support is just optional.
>> For the AD DC I'm happy to require it (similar to the jansson library)if
>> selftest is enabled, but I really don't see why we need to require
>> it at build time (at least an explicit --without-ldb-lmdb should be
>> possible for the AD DC and standalone builds).
>
> The problem with this approach is that we create more untested
> combinations. Also, as I mentioned in my previous mail, we have less
> freedom compared with jansson because Samba, built for selftest, can be
> built against a system ldb.
Doesn't the ldb:// prefix already work fine without mdb support?
The default provision will just work fine without ldb support in
the system library.
> Now, like for python, we would have two different possible system ldb
> versions, one acceptable and one not acceptable.
>
>> Can you summaries how this is supposed to work at runtime?
>> (I didn't have time to play with this yet)
>> Will lmdb only used for the partition files or are all files,
>> e.g. secrets.ldb and wins.ldb.
>
> Just the sam.ldb backend for now. The choice is via yet another option
> at provision time.
>
>> In case lmdb is not activated by default (at runtime)
>> and you prepare a tmp patch to do that and check what autobuild says
>> about that?
>
> Likely nothing, as I've not forced an environment to use it yet. You
> ask below for a patch to enable it, and I'll do that today, so Samba's
> use of lmdb is covered in autobuild. (Gary's development testing was
> done with such a temp patch, it made everything use lmdb by default,
> but that isn't a step we want to take yet).
Here's my branch:
https://git.samba.org/?p=metze/samba/wip.git;a=shortlog;h=refs/heads/master4-lmdb-full
I just changed get_default_backend_store() to "mdb" and just got this:
PROVISIONING AD DC (NTVFS)...
WARNING: The "lsa over netlogon" option is deprecated
WARNING: The "server schannel" option is deprecated
WARNING: The "lsa over netlogon" option is deprecated
WARNING: The "server schannel" option is deprecated
Unable to determine the DomainSID, can not enforce uniqueness constraint
on local domainSIDs
Failed to connect to
'mdb:///data/tmp/samba-master4-franky/ad_dc_ntvfs/private/sam.ldb.d/DC=SAMBA,DC=EXAMPLE,DC=COM.ldb'
with backend 'mdb': Could not create MDB environment
/data/tmp/samba-master4-franky/ad_dc_ntvfs/private/sam.ldb.d/DC=SAMBA,DC=EXAMPLE,DC=COM.ldb:
Accessing a corrupted shared library
ERROR(ldb): uncaught exception - Could not create MDB environment
/data/tmp/samba-master4-franky/ad_dc_ntvfs/private/sam.ldb.d/DC=SAMBA,DC=EXAMPLE,DC=COM.ldb:
Accessing a corrupted shared library
>> In the configure check we also check for a specific function of lmdb
>> and only enable HAVE_LMDB on success (together with an unused
>> conf.env.ENABLE_MDB_BACKEND), but if it's not found we don't complain...
>
> I'll ensure it is handled. I expected such a discussion and so we
> hedged our bets a little.
It's used in one place, but should be replaced by
if bld.CONFIG_SET('HAVE_LMDB')
>>> The DB size issue is also the first issue I've found empirically, as we
>>> got this while trying to set the default DB size:
>>>
>>> [1295/4337] Compiling lib/ldb/ldb_mdb/ldb_mdb.c
>>> ../lib/ldb/ldb_mdb/ldb_mdb.c: In function lmdb_open_env:
>>> ../lib/ldb/ldb_mdb/ldb_mdb.c:726:2: error: large integer implicitly
>>> truncated to unsigned type [-Werror=overflow]
>>> ret = mdb_env_set_mapsize(*env, 16LL * GIGABYTE);
>>>
>>> As the operational side is both optional and not a major departure from
>>> the 4GB limit on TDB, I think this is a reasonable restriction. Those
>>> running a RaspberryPi DC won't be running hundreds of users in any
>>> case.
>>>
>>> To address the broader point:
>>>
>>> My strong preference is that the AD DC be able to rely on this feature
>>> being available. I don't mind if the fileserver-with-bundled-libraries
>>> has it optional, but for the AD DC I'm actively trying to reduce the
>>> possible combinations so we can keep everything tested.
>>
>> For autobuild I'm fine with the requirement, but we need to test the
>> code path without lmdb anyway and keep it working. That's why I
>> think we should have a --without-ldb-lmdb option.
>
> Yes, but that should be by having some selftest environments running
> with lmdb and some without. We don't need the complexity of builds
> with and without lmdb to cover that. Those extra configure options
> have a real cost! See the work I've had to do to ensure the pointless
> combination of --without-ldap remains supported.
> For each combination we add, someone uses it and then wonders why it
> doens't work and complains at best or suffers harm at worst. An AD DC
> is no small thing, we can reasonably choose to depend on widely
> available libraries.
We already have a number of different autobuild targets changing
one of them to --without-ldb-lmdb should be a huge problem.
>>> (Once this lands, I'll change some of our selftest environments to use lmdb).
>>
>> I want to see the patch that enables it first.
>
> OK.
>
>>> That is why 'samba' needs to require it for the AD DC. For standlone
>>> ldb, that reason I've not made it optional is that otherwise we need to
>>> declare to samba that the built-standalone 'system' ldb doesn't have
>>> lmdb, then have a reliable method to detect that. This seemed like all
>>> to much complexity for a feature we will likely soon strictly rely on.
>>
>> Why do we have to rely on this strictly very soon, that is every tricky
>> new code without any real world testing.
>
> It may or may not be tricky new code, but does that justify adding more
> tricky code to ensure we can build (but not test) with and without the
> tricky code?
What's so tricky about a few lines of simple python in the wscript
file, everything else needs to work anyway for the without-ad-dc case
anyway.
Maybe the situation would be different if we wouldn't require ldb
when using --without-ad-dc, but I'm not completely sure.
>>> (We have been actively removing the ability to run 'make test' without
>>> optional packages after we found parts of our testsuite silently
>>> skipped. I'll propose the same for the ldb package once this is
>>> agreed.)
>>
>> Yes, for autobuild that's fine, but there's a real world outside of
>> autobuild.
>
> There is. I actually think that those things 'required for --enable-
> selftest' should be just required, as untested code is broken code, and
> clearly if it is a combination that selftest requires then it isn't
> tested outside selftest.
>
> I realise there are limits, and many rightly feel it is still better to
> build untested than not build at all on some platforms, but that is the
> background motivation here.
>
> Remember also that we have since found there are no non-samba users of
> lmdb beyond sssd and zombie openchange sites. So we should not invest
> greatly in the fiction that ldb is installed for other reasons (unlike
> TDB/talloc/tevent which has been more broadly successful).
>
> Likewise with ldb is only built standalone for Samba/sssd use in
> distributions.
>
>>> Portability to i386 I can sort out, but can you confirm you are
>>> otherwise happy with the branch?
>>>
>>> https://gitlab.com/catalyst-samba/samba/commits/metze-master4-lmdb-full
>>
>> Not is this state sorry and I'd like to play with lmdb activated
>> and have a look at the files myself...
>>
>> Thanks for your patience...
>
> Can you please give me a timeframe here?
I tried today, but it failed as shown above (I'm testing on ubuntu
trusty amd64)
> You and I have made small and important improvements, but the patches I
> re-pushed yesterday are still essentially unchanged from the patches I
> sent you a month ago[1], and those that remain outside master are in
> turn essentially the patches that were posted two months ago.
>
> Could any remaining issues be dealt with once it lands in master?
>
> Could we delay the tagging of an ldb 1.4.0 release until closer to the
> 4.9.0 RC and get some feedback and improve your comfort with the code
> and configure/build behaviour in the meantime?
Why not have it optional for 1.4.0 and allow people to build master with
a released system ldb for package testing.
> I'm more than happy to keep working with you on this once it is in
> master, but I would like to know we are past the indefinite review
> stage[2].
If it would be optional for sometime and it would actually work
it would be much easier to agree to push it to master.
I just tried two private autobuilds one with the
get_default_backend_store() change and one without.
But while writing this the first already failed
the ldb standalone tests with similar failures
as above:
Could not run the test - check test fixtures
[ ERROR ] test_transaction_commit_across_fork
Failed to connect to 'mdb://apitest.ldb' with backend 'mdb': Could not
create MDB environment apitest.ldb: Accessing a corrupted shared library
Could not run the test - check test fixtures
[ ERROR ] test_lock_read_across_fork
Failed to connect to 'mdb://apitest.ldb' with backend 'mdb': Could not
create MDB environment apitest.ldb: Accessing a corrupted shared library
metze
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: OpenPGP digital signature
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20180511/0c30c334/signature.sig>
More information about the samba-technical
mailing list