[PATCHES] Build pytalloc for two Python versions at once, port to py3

Petr Viktorin pviktori at redhat.com
Tue Mar 17 05:16:03 MDT 2015

On 03/16/2015 08:51 PM, Andrew Bartlett wrote:
> On Mon, 2015-03-16 at 12:14 +0100, Petr Viktorin wrote:
>> On 03/14/2015 10:17 AM, Andrew Bartlett wrote:
>>> On Fri, 2015-03-13 at 14:00 +0100, Petr Viktorin wrote:
>> [...]
>>>> I believe that it's impossible to do while still keeping things clear.
>>>> I did try, however, to keep the deviations from Python 3 minimal. Here
>>>> they are:
>>>> * PyStr (to use either PyUnicode or PyBytes depending on the version)
>>> Can't we code to the Python3 API, and have it map both to python2
>>> strings, either as raw bytes or pushing (presumed internal mapping) UTF8
>>> to UTF8?  (Yes, we really should use the unix charset, but these parts
>>> of samba without unix charset == utf8 is just broken).
>> Unfortunately, no. A lot of the Python 3 API is also available in Python
>> 2, and using it would bring Python 3 semantics to the Python 2 version.
>> Specifically, to use the py3 API we'd have to use PyUnicode for all
>> human-readable strings.
>> That leaves two choices for py2 behavior:
>> - aliasing PyUnicode to PyString, which would not work where PyUnicode
>> is also needed in Python 2. Not to mention that aliasing an existing type
>> to something else would be extremely confusing.
>> - using PyUnicode everywhere, which won't work either. Even if all of
>> Samba's py2 code didn't care if it was getting str or unicode, things
>> like tp_repr (repr() implementation) fail when you return unicode in py2.
> If it helps, the only place I can see PyUnicode being used is that we
> accept, but do not emit, PyUnicode in the pidl DCE/RPC generated code.
> I may not be grepping on the right things, but I also see no other
> references to unicode, except for some .encode('utf-16-le') calls
> creating password values.

That just means Samba hasn't made the bytes/unicode split yet.
Everything is a string.

PyUnicode already has a meaning in Python 2. Given the choices here, I'm 
much more comfortable introducing a new name, which clearly needs to be 
looked up if things go wrong, than hiding new semantics under an old name.
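
To make the tp_repr point quoted above concrete, here is a minimal
sketch, not code from the patches. It assumes Python 3, and the class
and helper names are hypothetical. In Python 3 repr() accepts only the
native (unicode) str; the quoted text describes the corresponding
constraint in Python 2, where the native str is a byte string. A name
like PyStr is meant to stand for "whatever the native str is" on each
version.

```python
# Hypothetical illustration, assuming Python 3: __repr__ must return the
# interpreter's native str type, so one fixed string type cannot serve
# both Python 2 and Python 3.

class NativeRepr(object):
    def __repr__(self):
        # Native str (unicode on Python 3): accepted by repr().
        return "<NativeRepr>"


class BytesRepr(object):
    def __repr__(self):
        # bytes is not the native str on Python 3: repr() rejects it.
        return b"<BytesRepr>"


def safe_repr(obj):
    """Return repr(obj), or a short description of the TypeError raised."""
    try:
        return repr(obj)
    except TypeError as exc:
        return "TypeError: %s" % exc


print(safe_repr(NativeRepr()))
print(safe_repr(BytesRepr()))
```

The second call reports a TypeError rather than a repr string, which is
the kind of failure a version-dependent native string name is meant to
avoid.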

> (This may help explain our ambivalence about the whole reason for
> python3 in the first place)
> When I get some head-space, I'm going to write about the difficulty the
> Samba Team faces with this kind of project.  I spoke before about the
> async ldb example, but after ntdb came up, I realise we have been here
> before on other efforts.  I'm quite concerned to ensure we don't waste
> your efforts, and that we are all agreed on the risks and benefits.

I'm looking forward to that.
The key difference I see here is that ldb/tdb can continue working
indefinitely. Python 2 cannot: once 2.7's 10-year upstream support
ends, Samba would most likely need to take over maintaining its own
copy.

Petr Viktorin
