| 1 | +HPy API |
| 2 | +======= |
| 3 | + |
| 4 | +.. warning:: |
| 5 | + HPy is still in the early stages of development and as such the API is |
| 6 | + subject to change. |
| 7 | + |
| 8 | +Handles |
| 9 | +------- |
| 10 | + |
| 11 | +The "H" in HPy stands for **handle**, which is a central concept: handles are |
| 12 | +used to hold a C reference to Python objects, and they are represented by the |
| 13 | +C ``HPy`` type. They play the same role as ``PyObject *`` in the Python/C |
| 14 | +API, albeit with some important differences which are detailed below. |
| 15 | + |
| 16 | +When they are no longer needed, handles must be closed by calling |
| 17 | +``HPy_Close``, which plays more or less the same role as ``Py_DECREF``. |
| 18 | +Similarly, if you need a new handle for an existing object, you can duplicate |
| 19 | +it by calling ``HPy_Dup``, which plays more or less the same role as |
| 20 | +``Py_INCREF``. |
| 21 | + |
| 22 | +The concept of handles is certainly not unique to HPy. Other examples include |
| 23 | +Unix file descriptors, where you have ``dup()`` and ``close()``, and Windows' |
| 24 | +``HANDLE``, where you have ``DuplicateHandle()`` and ``CloseHandle()``. |
| 25 | + |
| 26 | + |
| 27 | +Handles vs ``PyObject *`` |
| 28 | +~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 29 | + |
| 30 | +.. XXX I don't like this sentence, but I can't come up with anything better |
| 31 | + right now. Please rephrase/rewrite :) |
| 32 | + |
| 33 | +The biggest difference is that in the old Python/C API, multiple ``PyObject |
| 34 | +*`` references to the same object are completely equivalent to each other, |
| 35 | +and they can be passed to Python/C API functions interchangeably. In |
| 36 | +particular, it does not matter which particular reference you call |
| 37 | +``Py_INCREF`` and ``Py_DECREF`` on, as long as the total number of increfs and |
| 38 | +decrefs to the underlying object is the same at the end of the object |
| 39 | +lifetime. |
| 40 | + |
| 41 | +For example, the following is a perfectly valid piece of Python/C code:: |
| 42 | + |
| 43 | + void foo(void) |
| 44 | + { |
| 45 | + PyObject *x = PyLong_FromLong(42); // implicit INCREF on x |
| 46 | + PyObject *y = x; |
| 47 | + Py_INCREF(y); // INCREF on y |
| 48 | + /* ... */ |
| 49 | + Py_DECREF(x); |
| 50 | + Py_DECREF(x); // two DECREF on x |
| 51 | + } |
| 52 | + |
| 53 | +In HPy, each handle must be closed independently. The example above becomes:: |
| 54 | + |
| 55 | + void foo(HPyContext ctx) |
| 56 | + { |
| 57 | + HPy x = HPyLong_FromLong(ctx, 42); |
| 58 | + HPy y = HPy_Dup(ctx, x); |
| 59 | + /* ... */ |
| 60 | + // we need to close x and y independently |
| 61 | + HPy_Close(ctx, x); |
| 62 | + HPy_Close(ctx, y); |
| 63 | + } |
| 64 | + |
| 65 | +Calling any HPy function on a closed handle is an error. Calling |
| 66 | +``HPy_Close()`` on the same handle twice is an error. Forgetting to call |
| 67 | +``HPy_Close()`` on a handle results in a memory leak. When running in |
| 68 | +:ref:`debug mode`, HPy actively checks that you don't close a handle |
| 69 | +twice and that you don't forget to close any. |
| 70 | + |
| 71 | + |
| 72 | +.. note:: |
| 73 | + The debug mode is a good example of how powerful it is to decouple the |
| 74 | + lifetime of handles and the lifetime of objects. If you find a memory |
| 75 | + leak on CPython, you know that you are missing a ``Py_DECREF`` somewhere but |
| 76 | + the only way to find the corresponding ``Py_INCREF`` is to manually and |
| 77 | + carefully study the source code. On the other hand, if you forget to call |
| 78 | + ``HPy_Close()``, the HPy debug mode is able to tell the precise code |
| 79 | + location which created the unclosed handle. Similarly, if you try to |
| 80 | + operate on a closed handle, it will tell you the precise code locations |
| 81 | + which created and closed it. |
| 82 | + |
| 83 | + |
| 84 | +The other important difference is that Python/C guarantees that multiple |
| 85 | +references to the same object result in the very same ``PyObject *`` pointer. |
| 86 | +Thus, it is possible to compare C pointers by equality to check whether they |
| 87 | +point to the same object:: |
| 88 | + |
| 89 | + int is_same_object(PyObject *x, PyObject *y) |
| 90 | + { |
| 91 | + return x == y; |
| 92 | + } |
| 93 | + |
| 94 | +On the other hand, in HPy, each handle is independent and it is common to have |
| 95 | +two different handles which point to the same underlying object, so comparing |
| 96 | +two handles directly is ill-defined. To prevent this kind of common error |
| 97 | +(especially when porting existing code to HPy), the ``HPy`` C type is opaque |
| 98 | +and the C compiler actively forbids comparisons between them. To check for |
| 99 | +identity, you can use ``HPy_Is()``:: |
| 100 | + |
| 101 | + int is_same_object(HPyContext ctx, HPy x, HPy y) |
| 102 | + { |
| 103 | + // return x == y; // compilation error! |
| 104 | + return HPy_Is(ctx, x, y); |
| 105 | + } |
| 106 | + |
| 107 | +.. note:: |
| 108 | + The main benefit of the semantics of handles is that it allows |
| 109 | + implementations to use very different models of memory management. On |
| 110 | + CPython, implementing handles is trivial because ``HPy`` is basically |
| 111 | + ``PyObject *`` in disguise, and ``HPy_Dup()`` and ``HPy_Close()`` are just |
| 112 | + aliases for ``Py_INCREF`` and ``Py_DECREF``. |
| 113 | + |
| 114 | + Unlike CPython, PyPy does not use reference counting for memory |
| 115 | + management: instead, it uses a *moving GC*, which means that the address of |
| 116 | + an object might change during its lifetime. This makes it hard to implement |
| 117 | + semantics like ``PyObject *``'s where the address is directly exposed to |
| 118 | + the user. HPy solves this problem: handles are integers which represent |
| 119 | + indices into a list, which is itself managed by the GC. When an object |
| 120 | + moves, the GC updates the address stored in the list, without having to touch all |
| 121 | + the handles which have been passed to C. |
| 122 | + |
| 123 | + |
| 124 | +HPyContext |
| 125 | +----------- |
| 126 | + |
| 127 | +All HPy function calls take a ``HPyContext`` as a first argument, which |
| 128 | +represents the Python interpreter all the handles belong to. Strictly |
| 129 | +speaking, it would be possible to design the HPy API without using |
| 130 | +``HPyContext``: after all, all HPy function calls are ultimately mapped to |
| 131 | +Python/C function calls, where there is no notion of context. |
| 132 | + |
| 133 | +One of the reasons to include ``HPyContext`` from day one is to be |
| 134 | +future-proof: it is conceivable to use it to hold the interpreter or the |
| 135 | +thread state in the future, in particular when there will be support for |
| 136 | +sub-interpreters. Another possible usage could be to embed different versions |
| 137 | +or implementations of Python inside the same process. |
| 138 | + |
| 139 | +Moreover, ``HPyContext`` is used by the :term:`HPy Universal ABI` to contain a |
| 140 | +sort of virtual function table which is used by the C extensions to call back |
| 141 | +into the Python interpreter. |
| 142 | + |
| 143 | + |
| 144 | +A simple example |
| 145 | +----------------- |
| 146 | + |
| 147 | +In this section, we will see how to write a simple C extension using HPy. It |
| 148 | +is assumed that you are already familiar with the existing Python/C API, so we |
| 149 | +will underline the similarities and the differences with it. |
| 150 | + |
| 151 | +We want to create a function named ``myabs`` which takes a single argument and |
| 152 | +computes its absolute value:: |
| 153 | + |
| 154 | + #include "hpy.h" |
| 155 | + |
| 156 | + HPy_DEF_METH_O(myabs) |
| 157 | + static HPy myabs_impl(HPyContext ctx, HPy self, HPy obj) |
| 158 | + { |
| 159 | + return HPy_Absolute(ctx, obj); |
| 160 | + } |
| 161 | + |
| 162 | +There are a few points which are worth noting: |
| 163 | + |
| 164 | + * We use the macro ``HPy_DEF_METH_O`` to declare we are going to define a |
| 165 | + HPy function called ``myabs``, which uses the ``METH_O`` calling |
| 166 | + convention. As in Python/C, ``METH_O`` means that the function receives a |
| 167 | + single argument. |
| 168 | + |
| 169 | + * The actual C function which implements ``myabs`` is called ``myabs_impl``. |
| 170 | + |
| 171 | + * It receives two arguments of type ``HPy``. These handles are |
| 172 | + guaranteed to be valid and are automatically closed by the caller, so |
| 173 | + there is no need to call ``HPy_Close`` on them. |
| 174 | + |
| 175 | + * It returns a handle, which has to be closed by the caller. |
| 176 | + |
| 177 | + * ``HPy_Absolute`` is the equivalent of ``PyNumber_Absolute`` and obviously |
| 178 | + computes the absolute value of the given argument. |
| 179 | + |
| 180 | +The usage of the macro is needed to maintain compatibility with CPython. On |
| 181 | +CPython, C functions and methods have a C signature which is different from |
| 182 | +the one used by HPy: they don't receive a ``HPyContext`` and their arguments |
| 183 | +have the type ``PyObject *`` instead of ``HPy``. The macro automatically |
| 184 | +generates a trampoline function whose signature is appropriate for CPython and |
| 185 | +which calls ``myabs_impl``. |
| 186 | + |
| 187 | +Now, we can define our module:: |
| 188 | + |
| 189 | + static HPyMethodDef SimpleMethods[] = { |
| 190 | + {"myabs", myabs, HPy_METH_O, "Compute the absolute value of the given argument"}, |
| 191 | + {NULL, NULL, 0, NULL} |
| 192 | + }; |
| 193 | + |
| 194 | + static HPyModuleDef moduledef = { |
| 195 | + HPyModuleDef_HEAD_INIT, |
| 196 | + .m_name = "simple", |
| 197 | + .m_doc = "HPy Example", |
| 198 | + .m_size = -1, |
| 199 | + .m_methods = SimpleMethods |
| 200 | + }; |
| 201 | + |
| 202 | +This part is very similar to the one you would write in Python/C. Note that |
| 203 | +we specify ``myabs`` (and **not** ``myabs_impl``) in the method table, and |
| 204 | +that we have to indicate the calling convention again. This is a deliberate |
| 205 | +choice, to minimize the changes needed to port existing extensions, and to |
| 206 | +make it easier to support hybrid extensions in which some of the methods are |
| 207 | +still written using the Python/C API. |
| 208 | + |
| 209 | +Finally, ``HPyModuleDef`` is basically the same as the old ``PyModuleDef``. |
| 210 | + |
| 211 | +Building the module |
| 212 | +~~~~~~~~~~~~~~~~~~~~ |
| 213 | + |
| 214 | +.. note:: |
| 215 | + The integration with distutils/setuptools is probably going to change. |
| 216 | + The recipe shown here is provisional and might stop working in the |
| 217 | + future. |
| 218 | + |
| 219 | +Let's write a ``setup.py`` to build our extension: |
| 220 | + |
| 221 | +.. code-block:: python |
| 222 | + |
| 223 | + from setuptools import setup, Extension |
| 224 | + import hpy.devel |
| 225 | + setup( |
| 226 | + name="hpy-example", |
| 227 | + ext_modules=[ |
| 228 | + Extension( |
| 229 | + 'simple', ['simple.c'] + hpy.devel.get_sources(), |
| 230 | + include_dirs=[hpy.devel.get_include()], |
| 231 | + ), |
| 232 | + ], |
| 233 | + ) |
| 234 | + |
| 235 | +You need ``hpy.devel`` to be available in your path to run |
| 236 | +it. ``hpy.devel.get_sources()`` returns a list of additional C files which |
| 237 | +contain HPy support functions. ``hpy.devel.get_include()`` returns the |
| 238 | +directory in which to find ``hpy.h``. |
| 239 | + |
| 240 | +We can now build the extension by running ``python setup.py build_ext -i``. On |
| 241 | +CPython, it will target the :term:`CPython ABI` by default, so you will end up with |
| 242 | +a file named e.g. ``simple.cpython-37m-x86_64-linux-gnu.so`` which can be |
| 243 | +imported directly on CPython with no dependency on HPy. |
| 244 | + |
| 245 | +VARARGS calling convention |
| 246 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 247 | + |
| 248 | +If we want to receive more than one argument, we need the |
| 249 | +``HPy_METH_VARARGS`` calling convention. Let's add a function ``add_ints`` |
| 250 | +which adds two integers:: |
| 251 | + |
| 252 | + HPy_DEF_METH_VARARGS(add_ints) |
| 253 | + static HPy add_ints_impl(HPyContext ctx, HPy self, HPy *args, HPy_ssize_t nargs) |
| 254 | + { |
| 255 | + long a, b; |
| 256 | + if (!HPyArg_Parse(ctx, args, nargs, "ll", &a, &b)) |
| 257 | + return HPy_NULL; |
| 258 | + return HPyLong_FromLong(ctx, a+b); |
| 259 | + } |
| 260 | + |
| 261 | +There are a few things to note: |
| 262 | + |
| 263 | + * The C signature is different from the corresponding Python/C |
| 264 | + ``METH_VARARGS``: in particular, instead of taking a ``PyObject *args``, |
| 265 | + we take an array of ``HPy`` and its size. This allows e.g. PyPy to do a |
| 266 | + call more efficiently, because you don't need to create a tuple just to |
| 267 | + pass the arguments. |
| 268 | + |
| 269 | + * We call ``HPyArg_Parse`` to parse the arguments. Unlike almost all |
| 270 | + the other HPy functions, this is **not** a thin wrapper around |
| 271 | + ``PyArg_ParseTuple`` because as stated above we don't have a tuple to pass |
| 272 | + to it, although the idea is to mimic its behavior as closely as |
| 273 | + possible. The parsing logic is implemented from scratch inside HPy, and as |
| 274 | + such there might be missing features during the early stages of HPy |
| 275 | + development. |
| 276 | + |
| 277 | + * In case of error, we return ``HPy_NULL``: we cannot simply ``return NULL`` |
| 278 | + because ``HPy`` is not a pointer type. |
| 279 | + |
| 280 | +Once we write our function, we can add it to the ``SimpleMethods[]`` table, |
| 281 | +which now becomes:: |
| 282 | + |
| 283 | + static HPyMethodDef SimpleMethods[] = { |
| 284 | + {"myabs", myabs, HPy_METH_O, "Compute the absolute value of the given argument"}, |
| 285 | + {"add_ints", add_ints, HPy_METH_VARARGS, "Add two integers"}, |
| 286 | + {NULL, NULL, 0, NULL} |
| 287 | + }; |