Porting Samba's CPython extensions to Python 3

Petr Viktorin pviktori at redhat.com
Fri Aug 28 10:57:06 UTC 2015


Hello,
Sorry for this long mail: a lot has happened since the last discussions,
and I need to refresh some points buried in the e-mail thread here:
https://lists.samba.org/archive/samba-technical/2015-March/106177.html


In previous discussions, we agreed on a strategy for porting Samba to
Python 3. The stand-alone libraries would get a supported Python 3 port.
Patches for the rest of Samba would be tolerated if they do not
inconvenience other developers, and they would be unsupported (if it
breaks, it's on whoever cares about Python 3 to fix it).

With the patches for the last stand-alone library reviewed, I think it's
time to revive that discussion, to get a better idea of how porting
Samba to Python 3 should work.
Specifically, I'd like to understand what would least inconvenience
you, while allowing some kind of progress on this front.

In the mentioned thread, there is an idea that there is no rush to port
– Python 2 will be around for another five years.
But, while five years is a lot of time, if we spend time waiting there
*will* be a rush later. I'm trying to avoid that. If five years is an
absolute deadline for porting to py3, testing, and removing support for
py2, I think it does make sense to start.
In particular, waiting until enterprise Linux distributions switch to
Python 3 creates a Catch-22 that would most likely result in everyone
waiting till the last possible moment, and then rushing wildly. Like
Samba, a distribution wants to switch all at once; but to do that the
code must be ready.


Moving from the "when" to the "how":

Generally, there is opposition to a bespoke compatibility layer,
which could not be tested well and would not get much use beyond Samba.
As with any code written by one external developer, if I got hit by a
bus, the compatibility layer could bitrot.

However, *some* kind of a compatibility layer is needed.
The string type in Python 2 was split into "bytes" and "unicode", and
there is a need to either differentiate these two, or use unicode
everywhere in Python 2 (which would change the semantics of the Python 2
version, which is not practical for a project of Samba's size).
So, my approach is to differentiate between three kinds of strings:
- bytes (PyBytes; called "str" in py2, "bytes" in py3)
- native ("PyStr"; UTF-8 encoded "str" in py2; "str" in py3)
- text ("PyUnicode"; called "unicode" in py2, "str" in py3)
This string split is *the* difficult part of porting C extensions.
Compared to this, other decisions are fairly trivial: either use the py2
spelling or the py3 spelling of the same thing, and choose a point on
the spectrum between shared macros or inline #ifdefs.
Correspondingly, aside from the bytes/text split, the rest of the
porting process is largely mechanical.
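
To make the bytes/text distinction concrete, here is a minimal sketch
in plain C (no Python headers needed). The helper name utf8_codepoints
is made up for this illustration and is not part of Samba or py3c; it
assumes valid UTF-8 input. It counts characters in a UTF-8 buffer, which
is what a text object measures, whereas strlen() counts bytes, which is
what a py2 native string measures.

```c
#include <stddef.h>

/* Hypothetical helper: count the Unicode code points in a
 * NUL-terminated UTF-8 string by skipping continuation bytes
 * (those of the form 10xxxxxx). Assumes the input is valid UTF-8. */
size_t utf8_codepoints(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}
```

For "caf\xc3\xa9" (the word "café" encoded as UTF-8), strlen() reports
five bytes while utf8_codepoints() reports four characters, so code that
treats the two kinds of string interchangeably gets lengths and indexes
wrong as soon as non-ASCII data shows up.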


The ideal solution for Samba would be if a compatibility layer was
distributed with Python itself. Unfortunately, this can't really work:
no features are added to Python 2.7 any more, and even if they were,
they couldn't be present in older 2.7 releases.

Realistically, I see three options for Samba, if it decides to start
porting:

1) Include relevant macros in the files that need them. This is used in
the stand-alone libraries (which typically have one Python module each).
This makes the code clear to anyone who knows the C-API for Python 2 or
3; but adding new macros requires some care to keep them consistent.

2) Put all compatibility macros in a shared header. This obscures the
code somewhat, with an additional header to know about, but ensures that
the set of macros is the same throughout the project, and allows
documenting them fairly easily.

3) Use a third-party library for the compatibility macros. This way, the
compat layer can be shared with other projects; it also makes it easier
to keep it tested and documented.


Regardless of which option is chosen, I have a pretty good idea about
what a compatibility layer would look like.
I have written a tested, documented library called py3c [0] that
contains all the necessary macros. To ensure consistency, this is where
I've been pulling macros from when porting the stand-alone libraries.
The library is not officially recognized by Python upstream (their first
suggestion nowadays would probably be to port to Cython or CFFI). But, I
am in the process of absorbing parts of Python's C Porting Howto [1].

A superset of the macros I'd need for Samba is at:
https://github.com/encukou/py3c/blob/master/include/py3c/compat.h

The first part is specific to the porting strategy I use for Samba; it
boils down to "use PyStr for native strings":

* PyStr_* maps to PyString_* or PyUnicode_*
* Python 2: PyBytes_* maps to PyString_*
(You can ignore the static function PyStr_Concat, this wart is not
needed for Samba.)
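
As a rough illustration of the two mappings above, the following sketch
shows the renaming pattern for two functions only. It is not a copy of
compat.h. So that it compiles on its own it does not include Python.h:
the fallback definition of PY_MAJOR_VERSION is a stand-in, and
EXPANDS_TO and the two *_target functions are hypothetical helpers that
only report which real API name a compat name resolves to.

```c
/* Normally provided by <Python.h>; defined here only so that the
 * sketch compiles by itself. Build with -DPY_MAJOR_VERSION=2 to
 * exercise the Python 2 branch. */
#ifndef PY_MAJOR_VERSION
#define PY_MAJOR_VERSION 3
#endif

#if PY_MAJOR_VERSION >= 3
/* Python 3: native strings are text (unicode) objects. */
#define PyStr_FromString PyUnicode_FromString
#define PyStr_Check      PyUnicode_Check
#else
/* Python 2: native strings are byte strings. */
#define PyStr_FromString   PyString_FromString
#define PyStr_Check        PyString_Check
/* Python 2: the bytes spelling also maps to PyString_*. */
#define PyBytes_FromString PyString_FromString
#define PyBytes_Check      PyString_Check
#endif

/* Hypothetical helper: two-step stringification, so that the
 * argument is macro-expanded before it is turned into a string. */
#define EXPANDS_TO_(name) #name
#define EXPANDS_TO(name)  EXPANDS_TO_(name)

/* Report the real API name behind each compat name. */
const char *pystr_fromstring_target(void)
{
    return EXPANDS_TO(PyStr_FromString);
}

const char *pybytes_fromstring_target(void)
{
    return EXPANDS_TO(PyBytes_FromString);
}
```

With the default (Python 3) setting, PyStr_FromString resolves to
PyUnicode_FromString and PyBytes_FromString is left alone, since it is
already the real Python 3 function. Extension code can then be written
once against the PyStr_* and PyBytes_* spellings.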

The rest emulates py2 or py3 API in the other Python.
(Unfortunately I can't use a single Python's API for both.)

* Python 3: PyInt_* maps to PyLong_*
* Module initialization uses the py3 syntax (except the function
declaration – "MODULE_INIT_FUNC(name)" instead of "static PyObject
*PyInit_name(void)").
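
The sketch below shows one way a MODULE_INIT_FUNC macro along these
lines could be written; it is an illustration under stated assumptions,
not necessarily how py3c defines it. To compile without Python.h it uses
stand-ins: the PyObject struct, PyMODINIT_FUNC and the fallback
PY_MAJOR_VERSION are placeholders for what Python.h provides, and the
module name "demo" is made up.

```c
/* --- Stand-ins, only so this sketch compiles without <Python.h>. --- */
#ifndef PY_MAJOR_VERSION
#define PY_MAJOR_VERSION 3
#endif
typedef struct { const char *m_name; } PyObject;  /* placeholder type */
#if PY_MAJOR_VERSION >= 3
#define PyMODINIT_FUNC PyObject *
#else
#define PyMODINIT_FUNC void
#endif
/* --- End of stand-ins. --- */

#if PY_MAJOR_VERSION >= 3
/* Python 3: the py2 integer spelling maps to PyLong_*. */
#define PyInt_FromLong PyLong_FromLong
/* Python 3 looks for PyInit_<name>, which returns the module. */
#define MODULE_INIT_FUNC(name) \
    PyMODINIT_FUNC PyInit_ ## name(void); \
    PyMODINIT_FUNC PyInit_ ## name(void)
#else
/* Python 2 looks for init<name>, which returns nothing, so wrap the
 * py3-style function in an entry point with the py2 name. */
#define MODULE_INIT_FUNC(name) \
    static PyObject *PyInit_ ## name(void); \
    PyMODINIT_FUNC init ## name(void); \
    PyMODINIT_FUNC init ## name(void) { PyInit_ ## name(); } \
    static PyObject *PyInit_ ## name(void)
#endif

/* The module body is written once, in Python 3 style. */
MODULE_INIT_FUNC(demo)
{
    /* Stands in for "return PyModule_Create(&moduledef);". */
    static PyObject module = { "demo" };
    return &module;
}
```

The point of the macro is that only the declaration line differs from
plain Python 3 code; the function body, including the return of the
module object, stays the same for both versions.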


I have gone through Samba's C extensions and am reasonably sure this is
a superset of the compat layer needed to port them all. (Two exceptions
– PyFile_AsFile and PyCObject – are better dealt with individually.)


I'm attaching draft patches that port "samba.netbios" using options 1
(inline macros) and 2 (shared header). (For the shared header,
additional buildsystem integration would be needed, and possibly a
better location for the header.)


Let me know if you have any thoughts on this matter. And, thank you for
your continued patience.


[0] http://py3c.readthedocs.org
[1] http://bugs.python.org/issue24937
[2] http://py3c.readthedocs.org/en/latest/defs.html

-- 
Petr Viktorin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: py3netbios-header.patch
Type: text/x-patch
Size: 10391 bytes
Desc: not available
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20150828/1a202f4f/py3netbios-header.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: py3netbios-inline.patch
Type: text/x-patch
Size: 8002 bytes
Desc: not available
URL: <http://lists.samba.org/pipermail/samba-technical/attachments/20150828/1a202f4f/py3netbios-inline.bin>

