[PATCH] LMDB full patch set

Mon Apr 16 15:11:54 UTC 2018

Am 16.04.2018 um 16:43 schrieb Howard Chu via samba-technical:
> Simo wrote:
>> On Mon, 2018-04-16 at 14:36 +0200, Volker Lendecke via samba-technical
>> wrote:
>>> On Fri, Apr 13, 2018 at 07:00:35AM +1200, Andrew Bartlett via
>>> samba-technical wrote:
>>>> It seems so, and there isn't much we can do.  If you think about a
>>>> fork() based memory model then cleaning up the internal state will only
>>>> dirty memory anyway.  They don't provide a 'clean up after fork()'
>>>> function, but do strictly require that you not touch an environment
>>>> after a fork().
>>>
>>> Asked Howard directly about fork handling. His reply contains:
>>>
>>>> After discussing this on -devel IRC, I think we could add an
>>>> mdb_env_postfork() call that the child can use to make its copy of
>>>> the env valid. It would have the same restrictions as
>>>> mdb_env_set_mapsize() - there must not be any active txns in the
>>>> parent process at the time of fork - they'd just be memleaks in the
>>>> child. The child must call this immediately, and cannot call any
>>>> other LMDB APIs until this is done. But assuming those conditions
>>>> are met, we can continue to use the existing file descriptors and
>>>> mmaps without needing to tear down and reopen.
>>>
>>> Would that help us?
>>
>> It would require to gate forks on the fact no transactions are open.
>> Can we guarantee no transactions in parents or block forks?
>>
>> Note: If we need to fork a utility within the transaction we do not
>> need to gate it as long as we have call semantics that enforce an exec
>> or a panic I guess.
> 
> Yes, this is missing a little prior context, in my earlier reply to Volker:
> 
>>> One question that came up: What to do
>>> with an open lmdb after fork? Samba does fork a lot, and with tdb we
>>> just reopen the fd and restart everything from scratch. What's the
>>> advice for lmdb?
>>
>> Does the parent process continue to use the DB after the fork?
>> If the child is just going to exec, there's not much needed. You will
>> leak a descriptor unless you close mdb_env_get_fd() before exec'ing.

Why isn't *CLOEXEC used for env->me_fd and env->me_mfd ?
For me_fd we could set it ourself using mdb_env_get_fd(), but
me_mfd is completely hidden.

>> If both processes are using the DB, you will need to env_close before
>> forking and reopen in both processes. Otherwise you'll mess up the
>> fcntl lock on the lockfile.

With tdb we just need to close and reopen in the child, why is that a
problem with fcntl locking?

>> If you don't close in the parent, and only open in the child, things
>> will work, but the child will leak the entire copy of the parent's
>> environment.

Without a pending transaction I'd guess mdb_env_close() would be all we
need in the child. But what would be the problems with that?

> I wrote the above before we had the discussion about adding a
> _postfork() API. If you don't care about leaking the memory from any
> active transactions, you could ignore that detail as well.
> 
> There's also the case where you fork and then the parent exits, but our
> proposed _postfork() would handle that as well.

What would the _postfork() do in detail compaired to mdb_env_close()?

metze

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: OpenPGP digital signature
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20180416/715d48b4/signature.sig>