Hardcopy of pybind11 instead of using git submodules (#552)

* removed pybind as submodule
* added hardcopy of pybind11 2.10.0
* rename pybind11 folder to avoid conflicts when changing branch

Co-authored-by: Dhanya Thattil <dhanya.thattil@psi.ch>
This commit is contained in:
Erik Fröjdh
2022-09-09 10:42:43 +02:00
committed by GitHub
parent 236f00c810
commit c0c8c8e21a
218 changed files with 57649 additions and 7 deletions

View File

@ -0,0 +1,81 @@
Chrono
======
When including the additional header file :file:`pybind11/chrono.h` conversions
from C++11 chrono datatypes to python datetime objects are automatically enabled.
This header also enables conversions of python floats (often from sources such
as ``time.monotonic()``, ``time.perf_counter()`` and ``time.process_time()``)
into durations.
An overview of clocks in C++11
------------------------------
A point of confusion when using these conversions is the differences between
clocks provided in C++11. There are three clock types defined by the C++11
standard and users can define their own if needed. Each of these clocks have
different properties and when converting to and from python will give different
results.
The first clock defined by the standard is ``std::chrono::system_clock``. This
clock measures the current date and time. However, this clock changes with to
updates to the operating system time. For example, if your time is synchronised
with a time server this clock will change. This makes this clock a poor choice
for timing purposes but good for measuring the wall time.
The second clock defined in the standard is ``std::chrono::steady_clock``.
This clock ticks at a steady rate and is never adjusted. This makes it excellent
for timing purposes, however the value in this clock does not correspond to the
current date and time. Often this clock will be the amount of time your system
has been on, although it does not have to be. This clock will never be the same
clock as the system clock as the system clock can change but steady clocks
cannot.
The third clock defined in the standard is ``std::chrono::high_resolution_clock``.
This clock is the clock that has the highest resolution out of the clocks in the
system. It is normally a typedef to either the system clock or the steady clock
but can be its own independent clock. This is important as when using these
conversions as the types you get in python for this clock might be different
depending on the system.
If it is a typedef of the system clock, python will get datetime objects, but if
it is a different clock they will be timedelta objects.
Provided conversions
--------------------
.. rubric:: C++ to Python
- ``std::chrono::system_clock::time_point````datetime.datetime``
System clock times are converted to python datetime instances. They are
in the local timezone, but do not have any timezone information attached
to them (they are naive datetime objects).
- ``std::chrono::duration````datetime.timedelta``
Durations are converted to timedeltas, any precision in the duration
greater than microseconds is lost by rounding towards zero.
- ``std::chrono::[other_clocks]::time_point````datetime.timedelta``
Any clock time that is not the system clock is converted to a time delta.
This timedelta measures the time from the clocks epoch to now.
.. rubric:: Python to C++
- ``datetime.datetime`` or ``datetime.date`` or ``datetime.time````std::chrono::system_clock::time_point``
Date/time objects are converted into system clock timepoints. Any
timezone information is ignored and the type is treated as a naive
object.
- ``datetime.timedelta````std::chrono::duration``
Time delta are converted into durations with microsecond precision.
- ``datetime.timedelta````std::chrono::[other_clocks]::time_point``
Time deltas that are converted into clock timepoints are treated as
the amount of time from the start of the clocks epoch.
- ``float````std::chrono::duration``
Floats that are passed to C++ as durations be interpreted as a number of
seconds. These will be converted to the duration using ``duration_cast``
from the float.
- ``float````std::chrono::[other_clocks]::time_point``
Floats that are passed to C++ as time points will be interpreted as the
number of seconds from the start of the clocks epoch.

View File

@ -0,0 +1,93 @@
Custom type casters
===================
In very rare cases, applications may require custom type casters that cannot be
expressed using the abstractions provided by pybind11, thus requiring raw
Python C API calls. This is fairly advanced usage and should only be pursued by
experts who are familiar with the intricacies of Python reference counting.
The following snippets demonstrate how this works for a very simple ``inty``
type that that should be convertible from Python types that provide a
``__int__(self)`` method.
.. code-block:: cpp
struct inty { long long_value; };
void print(inty s) {
std::cout << s.long_value << std::endl;
}
The following Python snippet demonstrates the intended usage from the Python side:
.. code-block:: python
class A:
def __int__(self):
return 123
from example import print
print(A())
To register the necessary conversion routines, it is necessary to add an
instantiation of the ``pybind11::detail::type_caster<T>`` template.
Although this is an implementation detail, adding an instantiation of this
type is explicitly allowed.
.. code-block:: cpp
namespace pybind11 { namespace detail {
template <> struct type_caster<inty> {
public:
/**
* This macro establishes the name 'inty' in
* function signatures and declares a local variable
* 'value' of type inty
*/
PYBIND11_TYPE_CASTER(inty, const_name("inty"));
/**
* Conversion part 1 (Python->C++): convert a PyObject into a inty
* instance or return false upon failure. The second argument
* indicates whether implicit conversions should be applied.
*/
bool load(handle src, bool) {
/* Extract PyObject from handle */
PyObject *source = src.ptr();
/* Try converting into a Python integer value */
PyObject *tmp = PyNumber_Long(source);
if (!tmp)
return false;
/* Now try to convert into a C++ int */
value.long_value = PyLong_AsLong(tmp);
Py_DECREF(tmp);
/* Ensure return code was OK (to avoid out-of-range errors etc) */
return !(value.long_value == -1 && !PyErr_Occurred());
}
/**
* Conversion part 2 (C++ -> Python): convert an inty instance into
* a Python object. The second and third arguments are used to
* indicate the return value policy and parent object (for
* ``return_value_policy::reference_internal``) and are generally
* ignored by implicit casters.
*/
static handle cast(inty src, return_value_policy /* policy */, handle /* parent */) {
return PyLong_FromLong(src.long_value);
}
};
}} // namespace pybind11::detail
.. note::
A ``type_caster<T>`` defined with ``PYBIND11_TYPE_CASTER(T, ...)`` requires
that ``T`` is default-constructible (``value`` is first default constructed
and then ``load()`` assigns to it).
.. warning::
When using custom type casters, it's important to declare them consistently
in every compilation unit of the Python extension module. Otherwise,
undefined behavior can ensue.

View File

@ -0,0 +1,310 @@
Eigen
#####
`Eigen <http://eigen.tuxfamily.org>`_ is C++ header-based library for dense and
sparse linear algebra. Due to its popularity and widespread adoption, pybind11
provides transparent conversion and limited mapping support between Eigen and
Scientific Python linear algebra data types.
To enable the built-in Eigen support you must include the optional header file
:file:`pybind11/eigen.h`.
Pass-by-value
=============
When binding a function with ordinary Eigen dense object arguments (for
example, ``Eigen::MatrixXd``), pybind11 will accept any input value that is
already (or convertible to) a ``numpy.ndarray`` with dimensions compatible with
the Eigen type, copy its values into a temporary Eigen variable of the
appropriate type, then call the function with this temporary variable.
Sparse matrices are similarly copied to or from
``scipy.sparse.csr_matrix``/``scipy.sparse.csc_matrix`` objects.
Pass-by-reference
=================
One major limitation of the above is that every data conversion implicitly
involves a copy, which can be both expensive (for large matrices) and disallows
binding functions that change their (Matrix) arguments. Pybind11 allows you to
work around this by using Eigen's ``Eigen::Ref<MatrixType>`` class much as you
would when writing a function taking a generic type in Eigen itself (subject to
some limitations discussed below).
When calling a bound function accepting a ``Eigen::Ref<const MatrixType>``
type, pybind11 will attempt to avoid copying by using an ``Eigen::Map`` object
that maps into the source ``numpy.ndarray`` data: this requires both that the
data types are the same (e.g. ``dtype='float64'`` and ``MatrixType::Scalar`` is
``double``); and that the storage is layout compatible. The latter limitation
is discussed in detail in the section below, and requires careful
consideration: by default, numpy matrices and Eigen matrices are *not* storage
compatible.
If the numpy matrix cannot be used as is (either because its types differ, e.g.
passing an array of integers to an Eigen parameter requiring doubles, or
because the storage is incompatible), pybind11 makes a temporary copy and
passes the copy instead.
When a bound function parameter is instead ``Eigen::Ref<MatrixType>`` (note the
lack of ``const``), pybind11 will only allow the function to be called if it
can be mapped *and* if the numpy array is writeable (that is
``a.flags.writeable`` is true). Any access (including modification) made to
the passed variable will be transparently carried out directly on the
``numpy.ndarray``.
This means you can write code such as the following and have it work as
expected:
.. code-block:: cpp
void scale_by_2(Eigen::Ref<Eigen::VectorXd> v) {
v *= 2;
}
Note, however, that you will likely run into limitations due to numpy and
Eigen's difference default storage order for data; see the below section on
:ref:`storage_orders` for details on how to bind code that won't run into such
limitations.
.. note::
Passing by reference is not supported for sparse types.
Returning values to Python
==========================
When returning an ordinary dense Eigen matrix type to numpy (e.g.
``Eigen::MatrixXd`` or ``Eigen::RowVectorXf``) pybind11 keeps the matrix and
returns a numpy array that directly references the Eigen matrix: no copy of the
data is performed. The numpy array will have ``array.flags.owndata`` set to
``False`` to indicate that it does not own the data, and the lifetime of the
stored Eigen matrix will be tied to the returned ``array``.
If you bind a function with a non-reference, ``const`` return type (e.g.
``const Eigen::MatrixXd``), the same thing happens except that pybind11 also
sets the numpy array's ``writeable`` flag to false.
If you return an lvalue reference or pointer, the usual pybind11 rules apply,
as dictated by the binding function's return value policy (see the
documentation on :ref:`return_value_policies` for full details). That means,
without an explicit return value policy, lvalue references will be copied and
pointers will be managed by pybind11. In order to avoid copying, you should
explicitly specify an appropriate return value policy, as in the following
example:
.. code-block:: cpp
class MyClass {
Eigen::MatrixXd big_mat = Eigen::MatrixXd::Zero(10000, 10000);
public:
Eigen::MatrixXd &getMatrix() { return big_mat; }
const Eigen::MatrixXd &viewMatrix() { return big_mat; }
};
// Later, in binding code:
py::class_<MyClass>(m, "MyClass")
.def(py::init<>())
.def("copy_matrix", &MyClass::getMatrix) // Makes a copy!
.def("get_matrix", &MyClass::getMatrix, py::return_value_policy::reference_internal)
.def("view_matrix", &MyClass::viewMatrix, py::return_value_policy::reference_internal)
;
.. code-block:: python
a = MyClass()
m = a.get_matrix() # flags.writeable = True, flags.owndata = False
v = a.view_matrix() # flags.writeable = False, flags.owndata = False
c = a.copy_matrix() # flags.writeable = True, flags.owndata = True
# m[5,6] and v[5,6] refer to the same element, c[5,6] does not.
Note in this example that ``py::return_value_policy::reference_internal`` is
used to tie the life of the MyClass object to the life of the returned arrays.
You may also return an ``Eigen::Ref``, ``Eigen::Map`` or other map-like Eigen
object (for example, the return value of ``matrix.block()`` and related
methods) that map into a dense Eigen type. When doing so, the default
behaviour of pybind11 is to simply reference the returned data: you must take
care to ensure that this data remains valid! You may ask pybind11 to
explicitly *copy* such a return value by using the
``py::return_value_policy::copy`` policy when binding the function. You may
also use ``py::return_value_policy::reference_internal`` or a
``py::keep_alive`` to ensure the data stays valid as long as the returned numpy
array does.
When returning such a reference of map, pybind11 additionally respects the
readonly-status of the returned value, marking the numpy array as non-writeable
if the reference or map was itself read-only.
.. note::
Sparse types are always copied when returned.
.. _storage_orders:
Storage orders
==============
Passing arguments via ``Eigen::Ref`` has some limitations that you must be
aware of in order to effectively pass matrices by reference. First and
foremost is that the default ``Eigen::Ref<MatrixType>`` class requires
contiguous storage along columns (for column-major types, the default in Eigen)
or rows if ``MatrixType`` is specifically an ``Eigen::RowMajor`` storage type.
The former, Eigen's default, is incompatible with ``numpy``'s default row-major
storage, and so you will not be able to pass numpy arrays to Eigen by reference
without making one of two changes.
(Note that this does not apply to vectors (or column or row matrices): for such
types the "row-major" and "column-major" distinction is meaningless).
The first approach is to change the use of ``Eigen::Ref<MatrixType>`` to the
more general ``Eigen::Ref<MatrixType, 0, Eigen::Stride<Eigen::Dynamic,
Eigen::Dynamic>>`` (or similar type with a fully dynamic stride type in the
third template argument). Since this is a rather cumbersome type, pybind11
provides a ``py::EigenDRef<MatrixType>`` type alias for your convenience (along
with EigenDMap for the equivalent Map, and EigenDStride for just the stride
type).
This type allows Eigen to map into any arbitrary storage order. This is not
the default in Eigen for performance reasons: contiguous storage allows
vectorization that cannot be done when storage is not known to be contiguous at
compile time. The default ``Eigen::Ref`` stride type allows non-contiguous
storage along the outer dimension (that is, the rows of a column-major matrix
or columns of a row-major matrix), but not along the inner dimension.
This type, however, has the added benefit of also being able to map numpy array
slices. For example, the following (contrived) example uses Eigen with a numpy
slice to multiply by 2 all coefficients that are both on even rows (0, 2, 4,
...) and in columns 2, 5, or 8:
.. code-block:: cpp
m.def("scale", [](py::EigenDRef<Eigen::MatrixXd> m, double c) { m *= c; });
.. code-block:: python
# a = np.array(...)
scale_by_2(myarray[0::2, 2:9:3])
The second approach to avoid copying is more intrusive: rearranging the
underlying data types to not run into the non-contiguous storage problem in the
first place. In particular, that means using matrices with ``Eigen::RowMajor``
storage, where appropriate, such as:
.. code-block:: cpp
using RowMatrixXd = Eigen::Matrix<double, Eigen::Dynamic, Eigen::Dynamic, Eigen::RowMajor>;
// Use RowMatrixXd instead of MatrixXd
Now bound functions accepting ``Eigen::Ref<RowMatrixXd>`` arguments will be
callable with numpy's (default) arrays without involving a copying.
You can, alternatively, change the storage order that numpy arrays use by
adding the ``order='F'`` option when creating an array:
.. code-block:: python
myarray = np.array(source, order="F")
Such an object will be passable to a bound function accepting an
``Eigen::Ref<MatrixXd>`` (or similar column-major Eigen type).
One major caveat with this approach, however, is that it is not entirely as
easy as simply flipping all Eigen or numpy usage from one to the other: some
operations may alter the storage order of a numpy array. For example, ``a2 =
array.transpose()`` results in ``a2`` being a view of ``array`` that references
the same data, but in the opposite storage order!
While this approach allows fully optimized vectorized calculations in Eigen, it
cannot be used with array slices, unlike the first approach.
When *returning* a matrix to Python (either a regular matrix, a reference via
``Eigen::Ref<>``, or a map/block into a matrix), no special storage
consideration is required: the created numpy array will have the required
stride that allows numpy to properly interpret the array, whatever its storage
order.
Failing rather than copying
===========================
The default behaviour when binding ``Eigen::Ref<const MatrixType>`` Eigen
references is to copy matrix values when passed a numpy array that does not
conform to the element type of ``MatrixType`` or does not have a compatible
stride layout. If you want to explicitly avoid copying in such a case, you
should bind arguments using the ``py::arg().noconvert()`` annotation (as
described in the :ref:`nonconverting_arguments` documentation).
The following example shows an example of arguments that don't allow data
copying to take place:
.. code-block:: cpp
// The method and function to be bound:
class MyClass {
// ...
double some_method(const Eigen::Ref<const MatrixXd> &matrix) { /* ... */ }
};
float some_function(const Eigen::Ref<const MatrixXf> &big,
const Eigen::Ref<const MatrixXf> &small) {
// ...
}
// The associated binding code:
using namespace pybind11::literals; // for "arg"_a
py::class_<MyClass>(m, "MyClass")
// ... other class definitions
.def("some_method", &MyClass::some_method, py::arg().noconvert());
m.def("some_function", &some_function,
"big"_a.noconvert(), // <- Don't allow copying for this arg
"small"_a // <- This one can be copied if needed
);
With the above binding code, attempting to call the the ``some_method(m)``
method on a ``MyClass`` object, or attempting to call ``some_function(m, m2)``
will raise a ``RuntimeError`` rather than making a temporary copy of the array.
It will, however, allow the ``m2`` argument to be copied into a temporary if
necessary.
Note that explicitly specifying ``.noconvert()`` is not required for *mutable*
Eigen references (e.g. ``Eigen::Ref<MatrixXd>`` without ``const`` on the
``MatrixXd``): mutable references will never be called with a temporary copy.
Vectors versus column/row matrices
==================================
Eigen and numpy have fundamentally different notions of a vector. In Eigen, a
vector is simply a matrix with the number of columns or rows set to 1 at
compile time (for a column vector or row vector, respectively). NumPy, in
contrast, has comparable 2-dimensional 1xN and Nx1 arrays, but *also* has
1-dimensional arrays of size N.
When passing a 2-dimensional 1xN or Nx1 array to Eigen, the Eigen type must
have matching dimensions: That is, you cannot pass a 2-dimensional Nx1 numpy
array to an Eigen value expecting a row vector, or a 1xN numpy array as a
column vector argument.
On the other hand, pybind11 allows you to pass 1-dimensional arrays of length N
as Eigen parameters. If the Eigen type can hold a column vector of length N it
will be passed as such a column vector. If not, but the Eigen type constraints
will accept a row vector, it will be passed as a row vector. (The column
vector takes precedence when both are supported, for example, when passing a
1D numpy array to a MatrixXd argument). Note that the type need not be
explicitly a vector: it is permitted to pass a 1D numpy array of size 5 to an
Eigen ``Matrix<double, Dynamic, 5>``: you would end up with a 1x5 Eigen matrix.
Passing the same to an ``Eigen::MatrixXd`` would result in a 5x1 Eigen matrix.
When returning an Eigen vector to numpy, the conversion is ambiguous: a row
vector of length 4 could be returned as either a 1D array of length 4, or as a
2D array of size 1x4. When encountering such a situation, pybind11 compromises
by considering the returned Eigen type: if it is a compile-time vector--that
is, the type has either the number of rows or columns set to 1 at compile
time--pybind11 converts to a 1D numpy array when returning the value. For
instances that are a vector only at run-time (e.g. ``MatrixXd``,
``Matrix<float, Dynamic, 4>``), pybind11 returns the vector as a 2D array to
numpy. If this isn't want you want, you can use ``array.reshape(...)`` to get
a view of the same data in the desired dimensions.
.. seealso::
The file :file:`tests/test_eigen.cpp` contains a complete example that
shows how to pass Eigen sparse and dense data types in more detail.

View File

@ -0,0 +1,109 @@
Functional
##########
The following features must be enabled by including :file:`pybind11/functional.h`.
Callbacks and passing anonymous functions
=========================================
The C++11 standard brought lambda functions and the generic polymorphic
function wrapper ``std::function<>`` to the C++ programming language, which
enable powerful new ways of working with functions. Lambda functions come in
two flavors: stateless lambda function resemble classic function pointers that
link to an anonymous piece of code, while stateful lambda functions
additionally depend on captured variables that are stored in an anonymous
*lambda closure object*.
Here is a simple example of a C++ function that takes an arbitrary function
(stateful or stateless) with signature ``int -> int`` as an argument and runs
it with the value 10.
.. code-block:: cpp
int func_arg(const std::function<int(int)> &f) {
return f(10);
}
The example below is more involved: it takes a function of signature ``int -> int``
and returns another function of the same kind. The return value is a stateful
lambda function, which stores the value ``f`` in the capture object and adds 1 to
its return value upon execution.
.. code-block:: cpp
std::function<int(int)> func_ret(const std::function<int(int)> &f) {
return [f](int i) {
return f(i) + 1;
};
}
This example demonstrates using python named parameters in C++ callbacks which
requires using ``py::cpp_function`` as a wrapper. Usage is similar to defining
methods of classes:
.. code-block:: cpp
py::cpp_function func_cpp() {
return py::cpp_function([](int i) { return i+1; },
py::arg("number"));
}
After including the extra header file :file:`pybind11/functional.h`, it is almost
trivial to generate binding code for all of these functions.
.. code-block:: cpp
#include <pybind11/functional.h>
PYBIND11_MODULE(example, m) {
m.def("func_arg", &func_arg);
m.def("func_ret", &func_ret);
m.def("func_cpp", &func_cpp);
}
The following interactive session shows how to call them from Python.
.. code-block:: pycon
$ python
>>> import example
>>> def square(i):
... return i * i
...
>>> example.func_arg(square)
100L
>>> square_plus_1 = example.func_ret(square)
>>> square_plus_1(4)
17L
>>> plus_1 = func_cpp()
>>> plus_1(number=43)
44L
.. warning::
Keep in mind that passing a function from C++ to Python (or vice versa)
will instantiate a piece of wrapper code that translates function
invocations between the two languages. Naturally, this translation
increases the computational cost of each function call somewhat. A
problematic situation can arise when a function is copied back and forth
between Python and C++ many times in a row, in which case the underlying
wrappers will accumulate correspondingly. The resulting long sequence of
C++ -> Python -> C++ -> ... roundtrips can significantly decrease
performance.
There is one exception: pybind11 detects case where a stateless function
(i.e. a function pointer or a lambda function without captured variables)
is passed as an argument to another C++ function exposed in Python. In this
case, there is no overhead. Pybind11 will extract the underlying C++
function pointer from the wrapped function to sidestep a potential C++ ->
Python -> C++ roundtrip. This is demonstrated in :file:`tests/test_callbacks.cpp`.
.. note::
This functionality is very useful when generating bindings for callbacks in
C++ libraries (e.g. GUI libraries, asynchronous networking libraries, etc.).
The file :file:`tests/test_callbacks.cpp` contains a complete example
that demonstrates how to work with callbacks and anonymous functions in
more detail.

View File

@ -0,0 +1,43 @@
.. _type-conversions:
Type conversions
################
Apart from enabling cross-language function calls, a fundamental problem
that a binding tool like pybind11 must address is to provide access to
native Python types in C++ and vice versa. There are three fundamentally
different ways to do this—which approach is preferable for a particular type
depends on the situation at hand.
1. Use a native C++ type everywhere. In this case, the type must be wrapped
using pybind11-generated bindings so that Python can interact with it.
2. Use a native Python type everywhere. It will need to be wrapped so that
C++ functions can interact with it.
3. Use a native C++ type on the C++ side and a native Python type on the
Python side. pybind11 refers to this as a *type conversion*.
Type conversions are the most "natural" option in the sense that native
(non-wrapped) types are used everywhere. The main downside is that a copy
of the data must be made on every Python ↔ C++ transition: this is
needed since the C++ and Python versions of the same type generally won't
have the same memory layout.
pybind11 can perform many kinds of conversions automatically. An overview
is provided in the table ":ref:`conversion_table`".
The following subsections discuss the differences between these options in more
detail. The main focus in this section is on type conversions, which represent
the last case of the above list.
.. toctree::
:maxdepth: 1
overview
strings
stl
functional
chrono
eigen
custom

View File

@ -0,0 +1,170 @@
Overview
########
.. rubric:: 1. Native type in C++, wrapper in Python
Exposing a custom C++ type using :class:`py::class_` was covered in detail
in the :doc:`/classes` section. There, the underlying data structure is
always the original C++ class while the :class:`py::class_` wrapper provides
a Python interface. Internally, when an object like this is sent from C++ to
Python, pybind11 will just add the outer wrapper layer over the native C++
object. Getting it back from Python is just a matter of peeling off the
wrapper.
.. rubric:: 2. Wrapper in C++, native type in Python
This is the exact opposite situation. Now, we have a type which is native to
Python, like a ``tuple`` or a ``list``. One way to get this data into C++ is
with the :class:`py::object` family of wrappers. These are explained in more
detail in the :doc:`/advanced/pycpp/object` section. We'll just give a quick
example here:
.. code-block:: cpp
void print_list(py::list my_list) {
for (auto item : my_list)
std::cout << item << " ";
}
.. code-block:: pycon
>>> print_list([1, 2, 3])
1 2 3
The Python ``list`` is not converted in any way -- it's just wrapped in a C++
:class:`py::list` class. At its core it's still a Python object. Copying a
:class:`py::list` will do the usual reference-counting like in Python.
Returning the object to Python will just remove the thin wrapper.
.. rubric:: 3. Converting between native C++ and Python types
In the previous two cases we had a native type in one language and a wrapper in
the other. Now, we have native types on both sides and we convert between them.
.. code-block:: cpp
void print_vector(const std::vector<int> &v) {
for (auto item : v)
std::cout << item << "\n";
}
.. code-block:: pycon
>>> print_vector([1, 2, 3])
1 2 3
In this case, pybind11 will construct a new ``std::vector<int>`` and copy each
element from the Python ``list``. The newly constructed object will be passed
to ``print_vector``. The same thing happens in the other direction: a new
``list`` is made to match the value returned from C++.
Lots of these conversions are supported out of the box, as shown in the table
below. They are very convenient, but keep in mind that these conversions are
fundamentally based on copying data. This is perfectly fine for small immutable
types but it may become quite expensive for large data structures. This can be
avoided by overriding the automatic conversion with a custom wrapper (i.e. the
above-mentioned approach 1). This requires some manual effort and more details
are available in the :ref:`opaque` section.
.. _conversion_table:
List of all builtin conversions
-------------------------------
The following basic data types are supported out of the box (some may require
an additional extension header to be included). To pass other data structures
as arguments and return values, refer to the section on binding :ref:`classes`.
+------------------------------------+---------------------------+-----------------------------------+
| Data type | Description | Header file |
+====================================+===========================+===================================+
| ``int8_t``, ``uint8_t`` | 8-bit integers | :file:`pybind11/pybind11.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``int16_t``, ``uint16_t`` | 16-bit integers | :file:`pybind11/pybind11.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``int32_t``, ``uint32_t`` | 32-bit integers | :file:`pybind11/pybind11.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``int64_t``, ``uint64_t`` | 64-bit integers | :file:`pybind11/pybind11.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``ssize_t``, ``size_t`` | Platform-dependent size | :file:`pybind11/pybind11.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``float``, ``double`` | Floating point types | :file:`pybind11/pybind11.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``bool`` | Two-state Boolean type | :file:`pybind11/pybind11.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``char`` | Character literal | :file:`pybind11/pybind11.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``char16_t`` | UTF-16 character literal | :file:`pybind11/pybind11.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``char32_t`` | UTF-32 character literal | :file:`pybind11/pybind11.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``wchar_t`` | Wide character literal | :file:`pybind11/pybind11.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``const char *`` | UTF-8 string literal | :file:`pybind11/pybind11.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``const char16_t *`` | UTF-16 string literal | :file:`pybind11/pybind11.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``const char32_t *`` | UTF-32 string literal | :file:`pybind11/pybind11.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``const wchar_t *`` | Wide string literal | :file:`pybind11/pybind11.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::string`` | STL dynamic UTF-8 string | :file:`pybind11/pybind11.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::u16string`` | STL dynamic UTF-16 string | :file:`pybind11/pybind11.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::u32string`` | STL dynamic UTF-32 string | :file:`pybind11/pybind11.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::wstring`` | STL dynamic wide string | :file:`pybind11/pybind11.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::string_view``, | STL C++17 string views | :file:`pybind11/pybind11.h` |
| ``std::u16string_view``, etc. | | |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::pair<T1, T2>`` | Pair of two custom types | :file:`pybind11/pybind11.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::tuple<...>`` | Arbitrary tuple of types | :file:`pybind11/pybind11.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::reference_wrapper<...>`` | Reference type wrapper | :file:`pybind11/pybind11.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::complex<T>`` | Complex numbers | :file:`pybind11/complex.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::array<T, Size>`` | STL static array | :file:`pybind11/stl.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::vector<T>`` | STL dynamic array | :file:`pybind11/stl.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::deque<T>`` | STL double-ended queue | :file:`pybind11/stl.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::valarray<T>`` | STL value array | :file:`pybind11/stl.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::list<T>`` | STL linked list | :file:`pybind11/stl.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::map<T1, T2>`` | STL ordered map | :file:`pybind11/stl.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::unordered_map<T1, T2>`` | STL unordered map | :file:`pybind11/stl.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::set<T>`` | STL ordered set | :file:`pybind11/stl.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::unordered_set<T>`` | STL unordered set | :file:`pybind11/stl.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::optional<T>`` | STL optional type (C++17) | :file:`pybind11/stl.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::experimental::optional<T>`` | STL optional type (exp.) | :file:`pybind11/stl.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::variant<...>`` | Type-safe union (C++17) | :file:`pybind11/stl.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::filesystem::path<T>`` | STL path (C++17) [#]_ | :file:`pybind11/stl/filesystem.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::function<...>`` | STL polymorphic function | :file:`pybind11/functional.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::chrono::duration<...>`` | STL time duration | :file:`pybind11/chrono.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``std::chrono::time_point<...>`` | STL date/time | :file:`pybind11/chrono.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``Eigen::Matrix<...>`` | Eigen: dense matrix | :file:`pybind11/eigen.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``Eigen::Map<...>`` | Eigen: mapped memory | :file:`pybind11/eigen.h` |
+------------------------------------+---------------------------+-----------------------------------+
| ``Eigen::SparseMatrix<...>`` | Eigen: sparse matrix | :file:`pybind11/eigen.h` |
+------------------------------------+---------------------------+-----------------------------------+
.. [#] ``std::filesystem::path`` is converted to ``pathlib.Path`` and
``os.PathLike`` is converted to ``std::filesystem::path``.

View File

@ -0,0 +1,249 @@
STL containers
##############
Automatic conversion
====================
When including the additional header file :file:`pybind11/stl.h`, conversions
between ``std::vector<>``/``std::deque<>``/``std::list<>``/``std::array<>``/``std::valarray<>``,
``std::set<>``/``std::unordered_set<>``, and
``std::map<>``/``std::unordered_map<>`` and the Python ``list``, ``set`` and
``dict`` data structures are automatically enabled. The types ``std::pair<>``
and ``std::tuple<>`` are already supported out of the box with just the core
:file:`pybind11/pybind11.h` header.
The major downside of these implicit conversions is that containers must be
converted (i.e. copied) on every Python->C++ and C++->Python transition, which
can have implications on the program semantics and performance. Please read the
next sections for more details and alternative approaches that avoid this.
.. note::
Arbitrary nesting of any of these types is possible.
.. seealso::
The file :file:`tests/test_stl.cpp` contains a complete
example that demonstrates how to pass STL data types in more detail.
.. _cpp17_container_casters:
C++17 library containers
========================
The :file:`pybind11/stl.h` header also includes support for ``std::optional<>``
and ``std::variant<>``. These require a C++17 compiler and standard library.
In C++14 mode, ``std::experimental::optional<>`` is supported if available.
Various versions of these containers also exist for C++11 (e.g. in Boost).
pybind11 provides an easy way to specialize the ``type_caster`` for such
types:
.. code-block:: cpp
// `boost::optional` as an example -- can be any `std::optional`-like container
namespace pybind11 { namespace detail {
template <typename T>
struct type_caster<boost::optional<T>> : optional_caster<boost::optional<T>> {};
}}
The above should be placed in a header file and included in all translation units
where automatic conversion is needed. Similarly, a specialization can be provided
for custom variant types:
.. code-block:: cpp
// `boost::variant` as an example -- can be any `std::variant`-like container
namespace pybind11 { namespace detail {
template <typename... Ts>
struct type_caster<boost::variant<Ts...>> : variant_caster<boost::variant<Ts...>> {};
// Specifies the function used to visit the variant -- `apply_visitor` instead of `visit`
template <>
struct visit_helper<boost::variant> {
template <typename... Args>
static auto call(Args &&...args) -> decltype(boost::apply_visitor(args...)) {
return boost::apply_visitor(args...);
}
};
}} // namespace pybind11::detail
The ``visit_helper`` specialization is not required if your ``name::variant`` provides
a ``name::visit()`` function. For any other function name, the specialization must be
included to tell pybind11 how to visit the variant.
.. warning::
When converting a ``variant`` type, pybind11 follows the same rules as when
determining which function overload to call (:ref:`overload_resolution`), and
so the same caveats hold. In particular, the order in which the ``variant``'s
alternatives are listed is important, since pybind11 will try conversions in
this order. This means that, for example, when converting ``variant<int, bool>``,
the ``bool`` variant will never be selected, as any Python ``bool`` is already
an ``int`` and is convertible to a C++ ``int``. Changing the order of alternatives
(and using ``variant<bool, int>``, in this example) provides a solution.
.. note::
pybind11 only supports the modern implementation of ``boost::variant``
which makes use of variadic templates. This requires Boost 1.56 or newer.
.. _opaque:
Making opaque types
===================
pybind11 heavily relies on a template matching mechanism to convert parameters
and return values that are constructed from STL data types such as vectors,
linked lists, hash tables, etc. This even works in a recursive manner, for
instance to deal with lists of hash maps of pairs of elementary and custom
types, etc.
However, a fundamental limitation of this approach is that internal conversions
between Python and C++ types involve a copy operation that prevents
pass-by-reference semantics. What does this mean?
Suppose we bind the following function
.. code-block:: cpp
void append_1(std::vector<int> &v) {
v.push_back(1);
}
and call it from Python, the following happens:
.. code-block:: pycon
>>> v = [5, 6]
>>> append_1(v)
>>> print(v)
[5, 6]
As you can see, when passing STL data structures by reference, modifications
are not propagated back the Python side. A similar situation arises when
exposing STL data structures using the ``def_readwrite`` or ``def_readonly``
functions:
.. code-block:: cpp
/* ... definition ... */
class MyClass {
std::vector<int> contents;
};
/* ... binding code ... */
py::class_<MyClass>(m, "MyClass")
.def(py::init<>())
.def_readwrite("contents", &MyClass::contents);
In this case, properties can be read and written in their entirety. However, an
``append`` operation involving such a list type has no effect:
.. code-block:: pycon
>>> m = MyClass()
>>> m.contents = [5, 6]
>>> print(m.contents)
[5, 6]
>>> m.contents.append(7)
>>> print(m.contents)
[5, 6]
Finally, the involved copy operations can be costly when dealing with very
large lists. To deal with all of the above situations, pybind11 provides a
macro named ``PYBIND11_MAKE_OPAQUE(T)`` that disables the template-based
conversion machinery of types, thus rendering them *opaque*. The contents of
opaque objects are never inspected or extracted, hence they *can* be passed by
reference. For instance, to turn ``std::vector<int>`` into an opaque type, add
the declaration
.. code-block:: cpp
PYBIND11_MAKE_OPAQUE(std::vector<int>);
before any binding code (e.g. invocations to ``class_::def()``, etc.). This
macro must be specified at the top level (and outside of any namespaces), since
it adds a template instantiation of ``type_caster``. If your binding code consists of
multiple compilation units, it must be present in every file (typically via a
common header) preceding any usage of ``std::vector<int>``. Opaque types must
also have a corresponding ``class_`` declaration to associate them with a name
in Python, and to define a set of available operations, e.g.:
.. code-block:: cpp
py::class_<std::vector<int>>(m, "IntVector")
.def(py::init<>())
.def("clear", &std::vector<int>::clear)
.def("pop_back", &std::vector<int>::pop_back)
.def("__len__", [](const std::vector<int> &v) { return v.size(); })
.def("__iter__", [](std::vector<int> &v) {
return py::make_iterator(v.begin(), v.end());
}, py::keep_alive<0, 1>()) /* Keep vector alive while iterator is used */
// ....
.. seealso::
The file :file:`tests/test_opaque_types.cpp` contains a complete
example that demonstrates how to create and expose opaque types using
pybind11 in more detail.
.. _stl_bind:
Binding STL containers
======================
The ability to expose STL containers as native Python objects is a fairly
common request, hence pybind11 also provides an optional header file named
:file:`pybind11/stl_bind.h` that does exactly this. The mapped containers try
to match the behavior of their native Python counterparts as much as possible.
The following example showcases usage of :file:`pybind11/stl_bind.h`:
.. code-block:: cpp
// Don't forget this
#include <pybind11/stl_bind.h>
PYBIND11_MAKE_OPAQUE(std::vector<int>);
PYBIND11_MAKE_OPAQUE(std::map<std::string, double>);
// ...
// later in binding code:
py::bind_vector<std::vector<int>>(m, "VectorInt");
py::bind_map<std::map<std::string, double>>(m, "MapStringDouble");
When binding STL containers pybind11 considers the types of the container's
elements to decide whether the container should be confined to the local module
(via the :ref:`module_local` feature). If the container element types are
anything other than already-bound custom types bound without
``py::module_local()`` the container binding will have ``py::module_local()``
applied. This includes converting types such as numeric types, strings, Eigen
types; and types that have not yet been bound at the time of the stl container
binding. This module-local binding is designed to avoid potential conflicts
between module bindings (for example, from two separate modules each attempting
to bind ``std::vector<int>`` as a python type).
It is possible to override this behavior to force a definition to be either
module-local or global. To do so, you can pass the attributes
``py::module_local()`` (to make the binding module-local) or
``py::module_local(false)`` (to make the binding global) into the
``py::bind_vector`` or ``py::bind_map`` arguments:
.. code-block:: cpp
py::bind_vector<std::vector<int>>(m, "VectorInt", py::module_local(false));
Note, however, that such a global binding would make it impossible to load this
module at the same time as any other pybind module that also attempts to bind
the same container type (``std::vector<int>`` in the above example).
See :ref:`module_local` for more details on module-local bindings.
.. seealso::
The file :file:`tests/test_stl_binders.cpp` shows how to use the
convenience STL container wrappers.

View File

@ -0,0 +1,292 @@
Strings, bytes and Unicode conversions
######################################
Passing Python strings to C++
=============================
When a Python ``str`` is passed from Python to a C++ function that accepts
``std::string`` or ``char *`` as arguments, pybind11 will encode the Python
string to UTF-8. All Python ``str`` can be encoded in UTF-8, so this operation
does not fail.
The C++ language is encoding agnostic. It is the responsibility of the
programmer to track encodings. It's often easiest to simply `use UTF-8
everywhere <http://utf8everywhere.org/>`_.
.. code-block:: c++
m.def("utf8_test",
[](const std::string &s) {
cout << "utf-8 is icing on the cake.\n";
cout << s;
}
);
m.def("utf8_charptr",
[](const char *s) {
cout << "My favorite food is\n";
cout << s;
}
);
.. code-block:: pycon
>>> utf8_test("🎂")
utf-8 is icing on the cake.
🎂
>>> utf8_charptr("🍕")
My favorite food is
🍕
.. note::
Some terminal emulators do not support UTF-8 or emoji fonts and may not
display the example above correctly.
The results are the same whether the C++ function accepts arguments by value or
reference, and whether or not ``const`` is used.
Passing bytes to C++
--------------------
A Python ``bytes`` object will be passed to C++ functions that accept
``std::string`` or ``char*`` *without* conversion. In order to make a function
*only* accept ``bytes`` (and not ``str``), declare it as taking a ``py::bytes``
argument.
Returning C++ strings to Python
===============================
When a C++ function returns a ``std::string`` or ``char*`` to a Python caller,
**pybind11 will assume that the string is valid UTF-8** and will decode it to a
native Python ``str``, using the same API as Python uses to perform
``bytes.decode('utf-8')``. If this implicit conversion fails, pybind11 will
raise a ``UnicodeDecodeError``.
.. code-block:: c++
m.def("std_string_return",
[]() {
return std::string("This string needs to be UTF-8 encoded");
}
);
.. code-block:: pycon
>>> isinstance(example.std_string_return(), str)
True
Because UTF-8 is inclusive of pure ASCII, there is never any issue with
returning a pure ASCII string to Python. If there is any possibility that the
string is not pure ASCII, it is necessary to ensure the encoding is valid
UTF-8.
.. warning::
Implicit conversion assumes that a returned ``char *`` is null-terminated.
If there is no null terminator a buffer overrun will occur.
Explicit conversions
--------------------
If some C++ code constructs a ``std::string`` that is not a UTF-8 string, one
can perform a explicit conversion and return a ``py::str`` object. Explicit
conversion has the same overhead as implicit conversion.
.. code-block:: c++
// This uses the Python C API to convert Latin-1 to Unicode
m.def("str_output",
[]() {
std::string s = "Send your r\xe9sum\xe9 to Alice in HR"; // Latin-1
py::str py_s = PyUnicode_DecodeLatin1(s.data(), s.length());
return py_s;
}
);
.. code-block:: pycon
>>> str_output()
'Send your résumé to Alice in HR'
The `Python C API
<https://docs.python.org/3/c-api/unicode.html#built-in-codecs>`_ provides
several built-in codecs.
One could also use a third party encoding library such as libiconv to transcode
to UTF-8.
Return C++ strings without conversion
-------------------------------------
If the data in a C++ ``std::string`` does not represent text and should be
returned to Python as ``bytes``, then one can return the data as a
``py::bytes`` object.
.. code-block:: c++
m.def("return_bytes",
[]() {
std::string s("\xba\xd0\xba\xd0"); // Not valid UTF-8
return py::bytes(s); // Return the data without transcoding
}
);
.. code-block:: pycon
>>> example.return_bytes()
b'\xba\xd0\xba\xd0'
Note the asymmetry: pybind11 will convert ``bytes`` to ``std::string`` without
encoding, but cannot convert ``std::string`` back to ``bytes`` implicitly.
.. code-block:: c++
m.def("asymmetry",
[](std::string s) { // Accepts str or bytes from Python
return s; // Looks harmless, but implicitly converts to str
}
);
.. code-block:: pycon
>>> isinstance(example.asymmetry(b"have some bytes"), str)
True
>>> example.asymmetry(b"\xba\xd0\xba\xd0") # invalid utf-8 as bytes
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xba in position 0: invalid start byte
Wide character strings
======================
When a Python ``str`` is passed to a C++ function expecting ``std::wstring``,
``wchar_t*``, ``std::u16string`` or ``std::u32string``, the ``str`` will be
encoded to UTF-16 or UTF-32 depending on how the C++ compiler implements each
type, in the platform's native endianness. When strings of these types are
returned, they are assumed to contain valid UTF-16 or UTF-32, and will be
decoded to Python ``str``.
.. code-block:: c++
#define UNICODE
#include <windows.h>
m.def("set_window_text",
[](HWND hwnd, std::wstring s) {
// Call SetWindowText with null-terminated UTF-16 string
::SetWindowText(hwnd, s.c_str());
}
);
m.def("get_window_text",
[](HWND hwnd) {
const int buffer_size = ::GetWindowTextLength(hwnd) + 1;
auto buffer = std::make_unique< wchar_t[] >(buffer_size);
::GetWindowText(hwnd, buffer.data(), buffer_size);
std::wstring text(buffer.get());
// wstring will be converted to Python str
return text;
}
);
Strings in multibyte encodings such as Shift-JIS must transcoded to a
UTF-8/16/32 before being returned to Python.
Character literals
==================
C++ functions that accept character literals as input will receive the first
character of a Python ``str`` as their input. If the string is longer than one
Unicode character, trailing characters will be ignored.
When a character literal is returned from C++ (such as a ``char`` or a
``wchar_t``), it will be converted to a ``str`` that represents the single
character.
.. code-block:: c++
m.def("pass_char", [](char c) { return c; });
m.def("pass_wchar", [](wchar_t w) { return w; });
.. code-block:: pycon
>>> example.pass_char("A")
'A'
While C++ will cast integers to character types (``char c = 0x65;``), pybind11
does not convert Python integers to characters implicitly. The Python function
``chr()`` can be used to convert integers to characters.
.. code-block:: pycon
>>> example.pass_char(0x65)
TypeError
>>> example.pass_char(chr(0x65))
'A'
If the desire is to work with an 8-bit integer, use ``int8_t`` or ``uint8_t``
as the argument type.
Grapheme clusters
-----------------
A single grapheme may be represented by two or more Unicode characters. For
example 'é' is usually represented as U+00E9 but can also be expressed as the
combining character sequence U+0065 U+0301 (that is, the letter 'e' followed by
a combining acute accent). The combining character will be lost if the
two-character sequence is passed as an argument, even though it renders as a
single grapheme.
.. code-block:: pycon
>>> example.pass_wchar("é")
'é'
>>> combining_e_acute = "e" + "\u0301"
>>> combining_e_acute
'é'
>>> combining_e_acute == "é"
False
>>> example.pass_wchar(combining_e_acute)
'e'
Normalizing combining characters before passing the character literal to C++
may resolve *some* of these issues:
.. code-block:: pycon
>>> example.pass_wchar(unicodedata.normalize("NFC", combining_e_acute))
'é'
In some languages (Thai for example), there are `graphemes that cannot be
expressed as a single Unicode code point
<http://unicode.org/reports/tr29/#Grapheme_Cluster_Boundaries>`_, so there is
no way to capture them in a C++ character type.
C++17 string views
==================
C++17 string views are automatically supported when compiling in C++17 mode.
They follow the same rules for encoding and decoding as the corresponding STL
string type (for example, a ``std::u16string_view`` argument will be passed
UTF-16-encoded data, and a returned ``std::string_view`` will be decoded as
UTF-8).
References
==========
* `The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!) <https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/>`_
* `C++ - Using STL Strings at Win32 API Boundaries <https://msdn.microsoft.com/en-ca/magazine/mt238407.aspx>`_