[PATCH] LMDB full patch set

Howard Chu hyc at highlandsun.com
Mon Apr 16 14:43:10 UTC 2018


Simo wrote:
> On Mon, 2018-04-16 at 14:36 +0200, Volker Lendecke via samba-technical
> wrote:
>> On Fri, Apr 13, 2018 at 07:00:35AM +1200, Andrew Bartlett via samba-technical wrote:
>>> It seems so, and there isn't much we can do.  If you think about a
>>> fork() based memory model then cleaning up the internal state will only
>>> dirty memory anyway.  They don't provide a 'clean up after fork()'
>>> function, but do strictly require that you not touch an environment
>>> after a fork().
>>
>> Asked Howard directly about fork handling. His reply contains:
>>
>>> After discussing this on -devel IRC, I think we could add an
>>> mdb_env_postfork() call that the child can use to make its copy of
>>> the env valid. It would have the same restrictions as
>>> mdb_env_set_mapsize() - there must not be any active txns in the
>>> parent process at the time of fork - they'd just be memleaks in the
>>> child. The child must call this immediately, and cannot call any
>>> other LMDB APIs until this is done. But assuming those conditions
>>> are met, we can continue to use the existing file descriptors and
>>> mmaps without needing to tear down and reopen.
>>
>> Would that help us?
> 
> It would require gating forks on the fact that no transactions are open.
> Can we guarantee no transactions in parents or block forks?
> 
> Note: If we need to fork a utility within the transaction we do not
> need to gate it as long as we have call semantics that enforce an exec
> or a panic I guess.

Yes, this is missing a little prior context, in my earlier reply to Volker:

>> One question that came up: What to do
>> with an open lmdb after fork? Samba does fork a lot, and with tdb we
>> just reopen the fd and restart everything from scratch. What's the
>> advice for lmdb?
> 
> Does the parent process continue to use the DB after the fork?
> If the child is just going to exec, there's not much needed. You will
> leak a descriptor unless you close mdb_env_get_fd() before exec'ing.
> 
> If both processes are using the DB, you will need to env_close before
> forking and reopen in both processes. Otherwise you'll mess up the
> fcntl lock on the lockfile.
> 
> If you don't close in the parent, and only open in the child, things
> will work, but the child will leak the entire copy of the parent's
> environment.

I wrote the above before we had the discussion about adding a _postfork() API.
If you don't care about leaking the memory from any active transactions, you
could ignore the no-active-txns restriction as well.
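
Where the "no active txns in the parent at the time of fork" condition does
matter, the caller could enforce it with its own bookkeeping. The following is
only a sketch under assumptions: open_txn_count, txn_opened(), txn_closed()
and gated_fork() are hypothetical application-side names, not LMDB APIs, and
the counter assumes a single-threaded caller.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical application bookkeeping, not part of LMDB: the caller
 * bumps this counter around its own transaction begin/end calls.
 * Assumes a single-threaded process (no locking or atomics here). */
static int open_txn_count = 0;

static void txn_opened(void) { open_txn_count++; }

static void txn_closed(void)
{
    if (open_txn_count > 0)
        open_txn_count--;
}

/* fork() wrapper that refuses to fork while any transaction is open.
 * Returns -1 with errno set to EBUSY in that case; otherwise returns
 * whatever fork() returns. */
static pid_t gated_fork(void)
{
    if (open_txn_count > 0) {
        errno = EBUSY;
        return -1;
    }
    return fork();
}
```

A caller would invoke txn_opened() after a successful transaction begin and
txn_closed() after commit or abort, and use gated_fork() wherever it would
otherwise call fork() directly.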

There's also the case where you fork and then the parent exits, but our 
proposed _postfork() would handle that as well.
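
For background on the fcntl lock remark in the quoted reply: POSIX record
locks belong to the process that took them and are not inherited across
fork(). The sketch below shows only that generic behaviour on a scratch file;
it does not touch LMDB or its lockfile, and child_lacks_parent_lock() is a
hypothetical name chosen for illustration.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if a forked child finds the parent's fcntl() write lock held
 * by another process (so the child did not inherit it), 0 if the child
 * sees no conflicting lock, and -1 on error. */
static int child_lacks_parent_lock(void)
{
    char path[] = "/tmp/fcntl_fork_demo_XXXXXX";
    int fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path); /* scratch file only; the open fd keeps it alive */

    /* Parent takes a write lock on the whole file. */
    struct flock lk = {0};
    lk.l_type = F_WRLCK;
    lk.l_whence = SEEK_SET;
    lk.l_start = 0;
    lk.l_len = 0; /* 0 means "to end of file" */
    if (fcntl(fd, F_SETLK, &lk) == -1) {
        close(fd);
        return -1;
    }

    pid_t parent = getpid();
    pid_t pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child: ask whether it could place the same lock. F_GETLK
         * reports a conflicting lock only if another process holds it. */
        struct flock probe = {0};
        probe.l_type = F_WRLCK;
        probe.l_whence = SEEK_SET;
        if (fcntl(fd, F_GETLK, &probe) == -1)
            _exit(2);
        _exit(probe.l_type == F_WRLCK && probe.l_pid == parent ? 1 : 0);
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    close(fd);
    if (waited != pid || !WIFEXITED(status) || WEXITSTATUS(status) == 2)
        return -1;
    return WEXITSTATUS(status);
}
```

On a POSIX system the function is expected to return 1: the child shares the
descriptor but the lock is still owned by the parent alone.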

-- 
   -- Howard Chu
   CTO, Symas Corp.           http://www.symas.com
   Director, Highland Sun     http://highlandsun.com/hyc/
   Chief Architect, OpenLDAP  http://www.openldap.org/project/
