[PATCH 00/14] tdb: Update pytdb API to match what is provided by libtdb

Jelmer Vernooij jelmer at samba.org
Tue Sep 28 02:05:22 MDT 2010


On Mon, 2010-09-27 at 12:35 +0400, Kirill Smelkov wrote:
> On Mon, Sep 20, 2010 at 01:16:49PM +0400, Kirill Smelkov wrote:
> > Hi Jelmer,
> > 
> > On Sun, Sep 19, 2010 at 10:16:18AM -0700, Jelmer Vernooij wrote:
> > > Hi Kirill,
> > > 
> > > On Sun, 2010-09-19 at 13:53 +0400, Kirill Smelkov wrote: 
> > > > Rusty, Jelmer,
> > > > 
> > > > The subject says it all. Not 100% complete, but near.
> > > Thanks for the patches. I've applied most of the Python ones. I'm not at
> > > all convinced we should match the C API in the Python API though, I
> > > rather think we should let the needs of our Python users drive what we
> > > expose. Some of the worst Python bindings I've seen were created by
> > > simply mapping every C function one on one to Python.
> > > 
> > > Is there any particular reason why some of these functions should be
> > > exposed? Why do you need low-level locking?
> > 
> > Thanks for applying some patches and sorry I've not described my context
> > initially...
> > 
> > In this case I myself am a tdb Python user - I use tdb in an embedded
> > system for an internal database to which many programs "connect"
> > simultaneously to read/write it.
> > 
> > That's why I need locking, and better, to avoid lock contention, the
> > chainlock_* family variants.
> > 
> > Also, sometimes it is not important to write data to the db immediately,
> > so to minimize latencies, apps keep a to-be-written queue internally until
> > they know they can write to some chain, or start a transaction - that's
> > why I need the *_nonblock variants.
> > 
> > Same for reading - once initially read, it's not that important to get
> > up-to-date values immediately, that's why I'd also use
> > tdb_chainlock_read_nonblock().
> > 
> > And to make life a bit more interesting, db is stored on compact flash
> > -- various types, from various vendors, so with various types of flash
> > translation layers (FTL) -- so inevitably with bugs in FTL with respect
> > to sudden power failures, so I'm preparing to have a corrupt tdb one day.
> > 
> > http://ozlabs.org/~rusty/index.cgi/tech/2009-10-20.html
> > http://lwn.net/Articles/349970/
> > 
> > That's why I'd also like to have debugging routines (dump_all,
> > print_freelist, etc.), and tdb_check (not yet done, should I?), and
> > also tdb_fd and tdb_repack come for completeness (doesn't tdb_repack
> > complement tdb_wipe_all() which has python bindings?).
> > 
> > And we don't have shutdown sequence - normal shutdown is poweroff...
> > 
> > 
> > Hope this clarifies my rationale about why we should expose more
> > functionality in pytdb.
> 
> Silence...
> 
> Jelmer, others, what am I maybe doing wrong here? I just wanted to use
> tdb from python without major constraints compared to the C version.
Sorry, as Andrew mentioned, most of us were at a conference last week, so
I haven't had much time to look at your patches again.

With regard to the chainlock functions; I can see the use in exposing
these, but am not convinced mapping them one to one from the C functions
is necessarily the best idea. The header warns to use the chainlock
functions with care; have the bindings for them been tested extensively?
Does using these functions from Python not cause unexpected segfaults in
some situations? With some more unit tests I'd be happy to accept those
patches.
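
A rough sketch of one alternative to a one-to-one mapping: wrap the lock and
unlock pair in a context manager so the lock is always released. This is an
illustration only, not the pytdb API. `FakeDb` is a hypothetical stand-in, and
its `chainlock`/`chainunlock` method names are assumptions that mirror the C
function names.

```python
import contextlib


class FakeDb:
    """Hypothetical stand-in that records lock calls; not a real pytdb object."""

    def __init__(self):
        self.calls = []

    def chainlock(self, key):
        # Assumed method name, mirroring the C tdb_chainlock().
        self.calls.append(("lock", key))

    def chainunlock(self, key):
        # Assumed method name, mirroring the C tdb_chainunlock().
        self.calls.append(("unlock", key))


@contextlib.contextmanager
def chain_locked(db, key):
    """Hold the chain lock for `key` for the duration of a with-block."""
    db.chainlock(key)
    try:
        yield db
    finally:
        # Always release the lock, even if the body raises.
        db.chainunlock(key)


db = FakeDb()
try:
    with chain_locked(db, b"some-key"):
        raise RuntimeError("simulated failure while holding the lock")
except RuntimeError:
    pass
```

A wrapper of this shape is also easy to unit test, since the test can check
that every lock call is paired with an unlock, including on the error path.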

With regard to some of the other functions, I don't think completeness
is a valid reason for adding bindings per se. I really don't see why
tdb_fd would need to be exposed on the Python level (or at all) for
example. The more functions are exposed the harder it becomes to find
something by browsing the API and the harder it becomes to change that
API. You mention corruption, but that would be due to bugs in tdb. I
don't understand why we'd need to expose any other function than
tdb_check in this regard.

See my reply about dump_all() etc. writing to stdout. I think if we
expose these functions in Python we should at least make them write to a
file-like object, not to stdout. That will also make testing easier.
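
To illustrate the file-like object suggestion, here is a minimal sketch. It is
not the pytdb API: `dump_all` below is a hypothetical helper, its output
format is made up, and a plain dict of bytes stands in for an open database.

```python
import io
import sys


def dump_all(records, out=None):
    """Write every key/value pair in `records` to `out`.

    `records` is a plain dict of bytes standing in for an open tdb
    (an assumption for illustration). `out` is any object with a
    write() method and defaults to sys.stdout.
    """
    if out is None:
        out = sys.stdout
    for key in sorted(records):
        out.write("key(%d) = %r\n" % (len(key), key))
        out.write("data(%d) = %r\n" % (len(records[key]), records[key]))


# In a test, capture the dump in memory instead of scraping stdout.
buf = io.StringIO()
dump_all({b"foo": b"bar"}, buf)
dump_text = buf.getvalue()
```

Callers that want the old behaviour can omit the second argument, while tests
can pass an in-memory buffer and compare the result directly.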

Cheers,

Jelmer