Understanding TDB behavior

Fri Jul 17 18:02:39 UTC 2015

On Fri, Jul 17, 2015 at 11:11 PM, Volker Lendecke <Volker.Lendecke at sernet.de> wrote:

> On Thu, Jul 16, 2015 at 05:39:17PM +1000, Amitay Isaacs wrote:
> > Hi,
> >
> > Recently I noticed an unexpected (at least to me) behavior in TDB.
> >
> > One process has created a database without mutex support and is still
> > running.
> > Second process tries to open the same database with mutex support.
> >
> > My expectation was that the tdb_open() in the second process should fail.
> > But the second process succeeds and converts the database to use
> > mutexes.
> >
> > Is this expected?
> >
> > To check this exhaustively, I wrote a new tdb test (attached) that tries
> > various create and open flags combinations.  And as a result here are the
> > findings:
> >
> > 1. Either TDB_ALLOW_NESTING or TDB_DISALLOW_NESTING is always set for
> > create and open.
> >
> > 2. Create and open with TDB_MUTEX_LOCKING | TDB_CLEAR_IF_FIRST always
> > adds TDB_INCOMPATIBLE_HASH.
>
> Yep, expected. We always want INCOMPATIBLE_HASH except for
> old compatibility uses. And we know users with MUTEX_LOCKING
> are new users. So set INCOMPATIBLE_HASH.
>
> > 3. Database created with TDB_MUTEX_LOCKING can only be opened without
> > TDB_MUTEX_LOCKING provided TDB_NOLOCK is specified, otherwise fails with
> > EINVAL.
>
> Yep.
>
> > 4. Open with TDB_MUTEX_LOCKING | TDB_CLEAR_IF_FIRST always converts the
> > existing database to use mutexes and incompatible hash.
>
> Ok, I think I see a possible misunderstanding.
>
> For TDB_CLEAR_IF_FIRST to function correctly, every
> potential opener must use this flag. Only a CLEAR_IF_FIRST
> opener indicates its existence via an fcntl RDLCK like this:
>
> fcntl(3, F_SETLK, {type=F_WRLCK, whence=SEEK_SET, start=4, len=1})
>
> So if process A opens a tdb without CLEAR_IF_FIRST, as for
> example in the line
>
> { TDB_INCOMPATIBLE_HASH, TDB_MUTEX_LOCKING | TDB_CLEAR_IF_FIRST, 0 }
>
> only the second opener will even attempt to claim its
> FIRSTness. It will succeed, because the first opener did not
> mark its claim. And CLEAR_IF_FIRST does the CLEAR part,
> because the second opener thinks it is first.
>
> During the initialization we do the mutex thing, "killing"
> access by the first opener.
>
> The same happens without mutexes. If the pure
> INCOMPATIBLE_HASH opener had put data into the tdb, that
> would be gone after the second CLEAR_IF_FIRST opener came
> in.
>
> If you test like this:
>
> { TDB_CLEAR_IF_FIRST, TDB_MUTEX_LOCKING | TDB_CLEAR_IF_FIRST, 0 }
>
> the resulting tdb does not have mutexes.
>
> Does this clear things up, or did I still get you wrong?
>

Thanks for the explanation.  I assumed that the first opener is always the
"first" irrespective of the CLEAR_IF_FIRST flag.  That was the confusion.

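To make the mechanism concrete, here is a minimal sketch of the "first
opener" probe using plain fcntl byte locks.  It is not TDB's code.  The
names claim_first(), second_opener_is_first() and SKETCH_ACTIVE_LOCK are
hypothetical; the offset 4 and the non-blocking F_WRLCK probe are taken
from the strace line quoted above, and keeping a read lock afterwards to
stay visible is an assumption made for this sketch.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

/* Byte offset of the probe lock; 4 matches start=4 in the strace line. */
#define SKETCH_ACTIVE_LOCK 4

/* Apply a one-byte fcntl lock of the given type at SKETCH_ACTIVE_LOCK. */
static int lock_active_byte(int fd, int cmd, short type)
{
    struct flock fl = { 0 };

    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = SKETCH_ACTIVE_LOCK;
    fl.l_len = 1;
    return fcntl(fd, cmd, &fl);
}

/*
 * What a CLEAR_IF_FIRST-style opener does in this sketch: probe with a
 * non-blocking write lock, then keep a read lock so that later openers
 * can see it.  Returns 1 if the probe succeeded ("I am first"),
 * 0 if somebody else already holds a lock, -1 on error.
 */
int claim_first(int fd)
{
    int first;

    assert(fd >= 0);
    first = (lock_active_byte(fd, F_SETLK, F_WRLCK) == 0);
    /* A real implementation would wipe the database here if first. */
    if (lock_active_byte(fd, F_SETLKW, F_RDLCK) != 0) {
        return -1;
    }
    return first;
}

/*
 * Hypothetical demo: the parent is the first opener and marks its
 * presence only if first_uses_flag is true.  A forked child then acts
 * as the second opener.  Returns the child's claim_first() result,
 * or -1 on failure.
 */
int second_opener_is_first(bool first_uses_flag)
{
    char path[] = "/tmp/first_sketch_XXXXXX";
    int fd = mkstemp(path);
    int status = 0;
    pid_t pid;

    if (fd < 0) {
        return -1;
    }
    unlink(path); /* the open descriptor keeps the file usable */
    if (first_uses_flag && claim_first(fd) != 1) {
        close(fd);
        return -1;
    }
    pid = fork();
    if (pid == 0) {
        /* fcntl locks are per process, so the child holds none yet. */
        int r = claim_first(fd);
        _exit(r < 0 ? 2 : r);
    }
    if (pid < 0 || waitpid(pid, &status, 0) != pid) {
        close(fd);
        return -1;
    }
    close(fd);
    if (!WIFEXITED(status) || WEXITSTATUS(status) > 1) {
        return -1;
    }
    return WEXITSTATUS(status);
}
```

Calling second_opener_is_first(false) models the case discussed here: the
first opener never marks its presence, so the second opener's probe
succeeds and it believes it is first.  With second_opener_is_first(true)
the first opener holds its lock and the second opener's probe fails.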
>
> If so, can you trim your test case to just the single case
> that we don't have a common understanding about?
>
>
The test case is based on CTDB's persistent and volatile databases.
Persistent databases are obviously neither created nor opened with the
TDB_CLEAR_IF_FIRST flag.  However, if you try to open such a database with
TDB_CLEAR_IF_FIRST | TDB_MUTEX_LOCKING flags, then the database is going to
get wiped clean without any protection.  Not only that, but once the
database is converted to use mutexes, CTDB cannot even open that persistent
database.  That is the real problem.

Another issue is that the TDB_MUTEX_LOCKING flag cannot be specified without
TDB_CLEAR_IF_FIRST.  If there were an option to try to open a database with
TDB_MUTEX_LOCKING, and the database were not using robust mutexes, then the
open could fail with EINVAL.

Since there is no such option, to prevent opening a persistent database
with the wrong flags, I need to first try to open the database with no
flags.  If it succeeds, then I know it's a persistent database, and if it
fails then it's a volatile database.
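
The decision part of that workaround could look roughly like the sketch
below.  It only interprets the outcome of the probe open; the probe itself
would be something like tdb_open(path, 0, TDB_DEFAULT, O_RDWR, 0), which is
shown only in a comment and not executed here.  The names classify_probe()
and enum db_kind are hypothetical, and treating only EINVAL as "volatile"
is an assumption based on finding 3 above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical classification of a database after a flagless probe. */
enum db_kind { DB_PERSISTENT, DB_VOLATILE, DB_UNDETERMINED };

/*
 * Interpret the outcome of a probe open done with no TDB flags, e.g.
 * (not executed in this sketch):
 *
 *     tdb = tdb_open(path, 0, TDB_DEFAULT, O_RDWR, 0);
 *
 * opened:     whether the probe returned a handle
 * open_errno: errno captured right after a failed probe, 0 on success
 */
enum db_kind classify_probe(bool opened, int open_errno)
{
    /* A successful probe must not carry an error code. */
    assert(!opened || open_errno == 0);

    if (opened) {
        return DB_PERSISTENT;
    }
    if (open_errno == EINVAL) {
        /* Assumed to mean: database uses mutexes, opener did not ask. */
        return DB_VOLATILE;
    }
    /* Any other failure (e.g. ENOENT, EACCES) says nothing about the type. */
    return DB_UNDETERMINED;
}
```

Since the EINVAL presumably comes from the mutex mismatch, this probe would
only tell apart databases that already use mutexes; a volatile database
created without them would likely open fine with no flags.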

Amitay.