[PATCHES] Port tdb to Python 3
pviktori at redhat.com
Mon Jun 29 07:38:24 MDT 2015
On 06/19/2015 04:17 PM, Petr Viktorin wrote:
> On 06/12/2015 01:27 AM, Andrew Bartlett wrote:
>> On Thu, 2015-06-11 at 14:55 +0200, Petr Viktorin wrote:
>>> Here are initial patches porting tdb & ldb to Python 3. These libraries
>>> deal a lot with text-like data, so I expect some discussion around them.
>>> I'm not as familiar as others on the list with the use of these
>>> libraries, and I didn't find relevant documentation, so I'll write out
>>> some of my assumptions here; please correct me if I'm wrong.
>>> tdb is routinely used to store binary data, and in my understanding
>>> that's its primary use case, so it should primarily have a "bytes"-based
>>> interface. That's what these patches add.
>>> If a text-based interface is needed in more than a few cases (i.e. if
>>> manual encode/decode is expected to be a big pain), it can be added (see
>> This seems reasonable. TDB tries to be a pure key-value store, and
>> while the values stored are often strings, and the keys are almost
>> always strings, these are always cast to data/length pairs in the C
>> interface, and the interface is of bytes.
> OK. Here are the TDB patches with a text (unicode) based interface on
> top. These let you do e.g. `tdb.text['key']` to work with Unicode strings.
> The text wrapper has the same API as a Tdb, so if you work purely with
> text, you can open the database with `db = tdb.open(...).text`, and just
> use the wrapper. The original bytes-based Tdb object is then in `db.raw`.
> I wrote the text wrapper in Python, so as to not repeat all the code.
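To make the shape of the wrapper concrete, here is a minimal sketch of the idea, not the code from the patches. `BytesStore` is a hypothetical dict-backed stand-in for a bytes-only store (it is not the tdb API), and `TextView` is an assumed name for the wrapper; only the `.text` and `.raw` attributes come from the description above. UTF-8 is assumed as the encoding.

```python
class BytesStore:
    """Hypothetical stand-in for a bytes-only key-value store (not the tdb API)."""

    def __init__(self):
        self._data = {}

    @staticmethod
    def _check(obj):
        # Reject anything that is not bytes, as a bytes-based interface would.
        if not isinstance(obj, bytes):
            raise TypeError("expected bytes, got %s" % type(obj).__name__)
        return obj

    def __getitem__(self, key):
        return self._data[self._check(key)]

    def __setitem__(self, key, value):
        self._data[self._check(key)] = self._check(value)

    def __delitem__(self, key):
        del self._data[self._check(key)]

    def __contains__(self, key):
        return self._check(key) in self._data

    @property
    def text(self):
        # Text view over the same underlying data.
        return TextView(self)


class TextView:
    """Same mapping interface as the store, but with str keys and values."""

    def __init__(self, raw, encoding="utf-8"):
        self.raw = raw              # the original bytes-based object
        self._encoding = encoding   # assumption: UTF-8 by default

    def __getitem__(self, key):
        return self.raw[key.encode(self._encoding)].decode(self._encoding)

    def __setitem__(self, key, value):
        self.raw[key.encode(self._encoding)] = value.encode(self._encoding)

    def __delitem__(self, key):
        del self.raw[key.encode(self._encoding)]

    def __contains__(self, key):
        return key.encode(self._encoding) in self.raw


store = BytesStore()
db = store.text                     # work purely with text
db["key"] = "café"
print(db["key"])                    # str
print(db.raw[b"key"])               # the same value as bytes
```

Because the view only encodes on the way in and decodes on the way out, both interfaces see the same data, and the bytes-based object stays the single source of truth.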
>>> ldb, on the other hand, stores text, and I remember someone on this list
>>> mentioned that LDAP is a text-only protocol. Is that really the case?
>> No, it isn't. Yes, LDAP is primarily used for text, and when text is
>> transferred over LDAP it is UTF-8 encoded (if the server enforces that),
>> but LDB doesn't enforce that in the default configuration.
>> However, binary data is routinely transferred. The C interface is of
>> bytes. The strings stored are not null terminated (but a trailing \0
>> after the length is typically added for safety in the libraries, and is
>> sadly assumed in too many places).
>> The way to know if an attribute is text or binary in proper LDAP is to
>> consult the schema, but by default LDB is schema-less.
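As a rough illustration of the schema point, the sketch below decodes a value only when the caller says the attribute is text, and otherwise hands back the bytes untouched. Everything here is hypothetical: `TEXT_ATTRIBUTES` stands in for whatever a real schema lookup would provide, `decode_value` is not an ldb function, and UTF-8 is assumed for text values.

```python
# Hypothetical stand-in for a schema: attribute names known to hold text.
TEXT_ATTRIBUTES = frozenset({"cn", "description"})


def decode_value(attr, raw, text_attributes=TEXT_ATTRIBUTES):
    """Return str for attributes declared as text, bytes for everything else."""
    # LDAP attribute names are case-insensitive, so compare in lower case.
    if attr.lower() in text_attributes:
        return raw.decode("utf-8")
    # Without schema information the value is treated as opaque binary data.
    return raw


print(decode_value("cn", b"Alice"))
print(decode_value("objectGUID", b"\x00\xff"))
```

With an empty set of text attributes, which is roughly the schema-less case, every value stays bytes and the caller decodes by hand.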
> OK, that's what I thought. A scheme similar to what I did with tdb here
> should work.
When you get the chance, could you look at this code?