[PATCHES] Build pytalloc for two Python versions at once, port to py3

Wed Mar 4 05:26:52 MST 2015

On 03/03/2015 11:18 PM, Andrew Bartlett wrote:
> On Tue, 2015-03-03 at 22:09 +0100, Jelmer Vernooij wrote:
>> On Mon, Mar 02, 2015 at 12:09:25PM +0100, Petr Viktorin wrote:
>>> On 02/26/2015 11:36 PM, Andrew Bartlett wrote:
>>>> On Thu, 2015-02-26 at 12:23 +0100, Petr Viktorin wrote:
>>>>> On 02/25/2015 07:53 PM, Andrew Bartlett wrote:
>>>>> I'd really like to have the buildsystem changes vetted before I tackle
>>>>> the split, so I (and also the future reviewers) can focus on the porting
>>>>> itself.
>>>>
>>>> That's understandable.  I'm hoping Jelmer might post some thoughts.  He
>>>> and I came up with a reasonable set of expectations in some private
>>>> mails going back and forth on this.
>>>
>>> I'm quite interested in these thoughts. The last time we discussed this on
>>> the list, I got the impression that there was general agreement that the
>>> standalone libraries (like talloc) could be ported before the rest of Samba,
>>> supporting both versions of Python for a while.
>>> If there are other expectations floating around in private, I would really
>>> like to hear them.
>>
>> I think it would be reasonable to support both Python 2 and Python 3
>> in the standalone libraries for some time, especially since they are used
>> by more projects than just Samba.
>>
>> These Python modules are also fairly small, they have relatively many tests
>> and the overhead of supporting two Python versions is limited. To
>> prevent regressions, we should build with both versions on the
>> buildfarm and be able to build simultaneously against both versions
>> (to make life easy on both Samba developers and packagers). Because of
>> some of the new features to ease portability, it would be nice if we
>> could require Python3 >= 3.4.
>
> I agree, we shouldn't be both trying to merge and support older python3
> versions at the same time.

+1

>> As for the Samba Python modules, the balance is different there. Samba
>> has *a lot* more Python code and C Python bindings. There are fewer
>> unit tests for that code but many more integration tests for Samba
>> overall. We don't want "make test" to run the entire testsuite twice,
>> once with each Python version, and on top of that we are less
>> likely to catch Python compatibility issues just by running the Python
>> module unit tests.
>>
>> There are also fewer external users of the Samba Python modules; at
>> this point I'm only aware of OpenChange and FreeIPA.
>>
>> It would require a non-trivial extra amount of effort on part of the Samba
>> developers to maintain support for two Python versions; see the
>> earlier thread for a more detailed explanation of why I think supporting two
>> Python versions would require extra resources and why I think this is
>> not worth it.
>>
>> That said, if somebody wants to put the effort in to port Samba to
>> another major Python version (e.g.: 3), I would like to accommodate
>> them if possible. However, such an effort should minimize the burden
>> on the rest of Samba development. I would suggest the following
>> constraints on support for additional major Python versions:
>>
>>   * It shouldn't clutter the code with lots of compatibility wrappers
>>     that make the code unreadable.
>
> I'm sorry to hear earlier in the thread that there are no efforts to
> have a compatibility API for the C interface, because it seems that in
> Samba, we already do have a good separation between bytes and unicode,
> because we care to be null-termination safe.
>
> That is, length-limited strings are byte arrays, null terminated strings
> are utf8 strings.  Within those rules, and within Jelmer's stipulation
> above, I would like to see a library that implements the Python3 C API
> in terms of python2.  Then we could port Samba to that, one module at a
> time.  It wouldn't make the code unreadable, because it would be the
> Python3 API.  I might permit a small number of #ifdefs, but these too
> make the code unreadable.

That sounds good.
I can make a compatibility layer. Especially after I started looking at 
ldb, it's clear that it's the way to go.

Python 2.6+ supports many features from Python3, from small things like 
Py_TYPE to larger things like rich comparison. The major differences are 
the str/bytes split and module initialization, and a compatibility layer 
can reduce a lot of the boilerplate.
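
To make the str/bytes split concrete, here is a rough Python-side sketch of 
the rule Andrew describes above (length-limited data stays bytes, 
NUL-terminated strings become UTF-8 text). The helper names from_c_string 
and from_c_buffer are made up for illustration; they are not part of 
pytalloc or of any existing compat layer.

```python
# Hypothetical helpers, for illustration only; not pytalloc API.

def from_c_string(buf):
    """Treat `buf` as a NUL-terminated C string and decode it as UTF-8 text."""
    end = buf.find(b"\x00")
    if end != -1:
        buf = buf[:end]
    return buf.decode("utf-8")


def from_c_buffer(buf, length):
    """Treat `buf` as a length-limited buffer: keep it as bytes, NULs included."""
    return bytes(buf[:length])


raw = b"caf\xc3\xa9\x00ignored"
name = from_c_string(raw)       # text; stops at the first NUL
blob = from_c_buffer(raw, 8)    # bytes; the embedded NUL is preserved
```

On the C side, a compat layer would presumably make the same choice once, 
instead of each binding repeating it under #ifdefs.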

However, I like to generalize from several examples – to separate what's 
talloc-specific and what is useful to more libraries. Also, setting up a 
compat layer is orthogonal to the main thing in these patches, namely 
the buildsystem improvements.
So to prevent the patchset (and its review) from becoming unmanageably 
large, I would like to have the buildsystem changes vetted before 
setting up a shared compatibility layer.

I'll prepare updated patches with a more distinct "compat layer" 
section, which I'll then pull out of pytalloc in another patchset, when 
I tackle pyldb. Does that sound like a good plan?

> Now, I also wish I had a time machine to explain to those who embarked
> on the python3 fork, what the full implications would be for large users
> of the ABI like ourselves.  We in Samba walked a similar road, and had
> to spend some considerable effort to drag the project back from the
> brink.

Well, you're not alone wishing for a time machine :)

>>   * Samba officially supports only a single major python version,
>>     the others are only supported on a best-effort basis and by
>>     the people that care about them.
>>   * Other Samba developers should be able to develop against a single
>>     major Python version without having to worry about breaking
>>     other major Python versions, or being responsible for keeping
>>     Samba working with any major Python versions but the default
>>     one (Python 2 at the moment).
>
> This I think is the key.  It won't stop (say) arch linux turning it on,
> and then filing bugs when it won't work, but it is the only reasonable
> way in the short term.

I agree; Python3 support should be experimental until Samba switches, 
and I'll be happy to fix regressions (which will further encourage me to 
make this all as seamless as possible).
I also think the py3 support needs to live in the master branch. One 
reason is that I'm likely to make improvements to the Python2 version as 
well (see my recent pyldb patches). Another reason is that FreeIPA could 
move forward with Python 3 support – which would also be experimental, 
but they could then switch at the same time as Samba.

> I'm still very cautious here.
>
> In another space, we had Samba's ldb take an early move to async
> programming, but end up with an interface that not only encouraged data
> corruption bugs (because it didn't deal with transactions), it wasn't
> compatible with the async model the rest of Samba ended up using in the
> long term.  A large amount of code was converted, and even more was
> written with great complexity, yet little benefit was obtained.
>
> After that, I'm cautious about large API change efforts - we sent some
> back to the drawing board, even from folks as experienced as metze and
> rusty.

Some of the changes here (especially to the buildsystem) are complex, 
but I don't think they come near the risks of choosing an async model.
Python 3 has been here for 7 years; it's a known, stationary target.

Here are the API-related changes in these patches:
- str(...) returns a Unicode string (required by Python3)
- objects of disparate types are uncomparable by default (this is normal 
in Python3)
- the largely unused pytalloc_CObject_FromTallocPtr is removed for py3 
(rather than replaced with something that would need API design)
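
The first two points are ordinary Python 3 behaviour rather than anything 
specific to pytalloc. A small sketch, using a made-up Handle class as a 
stand-in (it is not a pytalloc type), shows what callers would observe 
under Python 3:

```python
# Illustration only: Handle is a hypothetical stand-in, not a pytalloc type.
class Handle(object):
    def __init__(self, name):
        self.name = name

    def __repr__(self):
        return "<Handle %s>" % self.name


def can_order(a, b):
    """Return True if `a < b` is defined, False if Python raises TypeError."""
    try:
        a < b
    except TypeError:
        return False
    return True


h = Handle("ctx")
text = str(h)                          # falls back to __repr__
is_text = isinstance(text, type(u""))  # True on Python 3: str() gives text
orderable = can_order(h, 42)           # False on Python 3: TypeError is raised
equal = (h == 42)                      # equality is still defined; just False
```

Note that only ordering (<, >, and so on) fails between unrelated types; 
== and != keep working and fall back to identity.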

Really, the only API decision I've made here is "support Python 3". 
These patches are not about API changes, but about preparing 
infrastructure to make the API changes possible in the future.

It's possible I made bad choices in the changes to Samba's internal 
buildsystem – but unlike bad API, those can be fixed as they're found.
I can (and will) reduce the number of #ifdefs with a compat layer, but that 
won't affect the pytalloc API.

Your concerns will become very relevant soon, but if we agree that 
Python 3 support will need to be added, I'd like to get infrastructure 
issues out of the way first.

> Our preference and practice is for things that can be done slowly,
> incrementally, and where the result during the journey is better than
> where we start, and just as acceptable as the result at the end (minus
> the feature being accomplished, naturally).

Then we're on the same page. This was not clear to me from your previous 
mails; thank you for clearing it up.

Thanks to this discussion, I see how I can do a better job on this 
front. I'll get to work.

-- 
Petr Viktorin