[PATCHES] Port tdb to Python 3

Petr Viktorin pviktori at redhat.com
Wed Jul 8 11:07:47 UTC 2015


On 06/29/2015 03:38 PM, Petr Viktorin wrote:
> On 06/19/2015 04:17 PM, Petr Viktorin wrote:
>> On 06/12/2015 01:27 AM, Andrew Bartlett wrote:
>>> On Thu, 2015-06-11 at 14:55 +0200, Petr Viktorin wrote:
>>>> Hello,
>>>>
>>>> Here are initial patches porting tdb & ldb to Python 3. These libraries
>>>> deal a lot with text-like data, so I expect some discussion around them.
>>>> I'm not as familiar as others on the list with the use of these
>>>> libraries, and I didn't find relevant documentation, so I'll write out
>>>> some of my assumptions here; please correct me if I'm wrong.
>>>>
>>>> tdb is routinely used to store binary data, and in my understanding
>>>> that's its primary use case, so it should primarily have a "bytes"-based
>>>> interface. That's what these patches add.
>>>> If a text-based interface is needed in more than a few cases (i.e. if
>>>> manual encode/decode is expected to be a big pain), it can be added (see
>>>> below).
>>>
>>> This seems reasonable.  TDB tries to be a pure key-value store, and
>>> while the values stored are often strings, and the keys are almost
>>> always strings, these are always cast to data/length pairs in the C
>>> interface, which operates purely on bytes.
>>
>> OK. Here are the TDB patches with a text (unicode) based interface on
>> top. These let you do e.g. `tdb.text['key']` to work with Unicode strings.
>> The text wrapper has the same API as a Tdb, so if you work purely with
>> text, you can open the database with `db = tdb.open(...).text`, and just
>> use the wrapper. The original bytes-based Tdb object is then in `db.raw`.
>>
>> I wrote the text wrapper in Python, so as to not repeat all the code.
>>
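For readers without the patches at hand, here is a minimal sketch of the
wrapper idea described above. It is not the code from the patches:
`BytesStore` is a hypothetical dict-backed stand-in for a bytes-only Tdb,
`TextView` is a hypothetical name for the wrapper, and the UTF-8 default is
an assumption. Only the `.text` and `.raw` attribute names come from the
description above.

```python
class BytesStore:
    """Hypothetical stand-in for a bytes-only key-value store (not the real Tdb)."""

    def __init__(self):
        self._data = {}

    @staticmethod
    def _check(*items):
        # Mimic a bytes-only interface: reject str keys and values.
        for item in items:
            if not isinstance(item, bytes):
                raise TypeError("expected bytes, got %s" % type(item).__name__)

    def __getitem__(self, key):
        self._check(key)
        return self._data[key]

    def __setitem__(self, key, value):
        self._check(key, value)
        self._data[key] = value

    def __delitem__(self, key):
        self._check(key)
        del self._data[key]

    def __contains__(self, key):
        self._check(key)
        return key in self._data

    @property
    def text(self):
        # Text view over the same data; the store itself stays bytes-only.
        return TextView(self)


class TextView:
    """Hypothetical text wrapper: same mapping API, str in and str out."""

    def __init__(self, raw, encoding="utf-8"):  # encoding is an assumption
        self.raw = raw  # the underlying bytes-based object
        self.encoding = encoding

    def _enc(self, s):
        return s.encode(self.encoding)

    def __getitem__(self, key):
        return self.raw[self._enc(key)].decode(self.encoding)

    def __setitem__(self, key, value):
        self.raw[self._enc(key)] = self._enc(value)

    def __delitem__(self, key):
        del self.raw[self._enc(key)]

    def __contains__(self, key):
        return self._enc(key) in self.raw


db = BytesStore().text        # work purely with text
db["key"] = "value"
stored = db.raw[b"key"]       # the bytes that actually got stored
```

Because the wrapper only encodes and decodes at the boundary, every
operation still goes through the bytes-based object, which is why it can be
written in Python without repeating the underlying code.
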
>>>> ldb, on the other hand, stores text, and I remember someone on this list
>>>> mentioned that LDAP is a text-only protocol. Is that really the case?
>>>
>>> No, it isn't.  Yes, LDAP is primarily used for text, and when text is
>>> transferred over LDAP it is UTF8 encoded (if the server enforces that),
>>> but LDB doesn't enforce that in the default configuration. 
>>>
>>> However, binary data is routinely transferred.  The C interface is of
>>> bytes.  The strings stored are not null terminated (but a trailing \0
>>> after the length is typically added for safety in the libraries, and is
>>> sadly assumed in too many places).
>>>
>>> The way to know if an attribute is text or binary in proper LDAP is to
>>> consult the schema, but by default LDB is schema-less. 
>>
>> OK, that's what I thought. A scheme similar to what I did with tdb here
>> should work.
>>
> 
> Hello,
> When you get the chance, could you look at this code?

Ping.
Is there anything I can do to help this get a review?


-- 
Petr Viktorin

