
how to find PythonExtensions #966

Open
ofloveandhate opened this issue Jan 10, 2025 · 11 comments
@ofloveandhate

I'm trying to use scikit-build-core for a C++ project that generates Python bindings, and I'm stuck at an early step. I've tried to follow the tutorial, and I can't even get that to work because I can't get CMake to find FindPythonExtensions.cmake. I'm writing to ask for help making sure this .cmake file is available.


I've seen a number of projects online that use your tool, and their CMakeLists.txt files contain the line

find_package(PythonExtensions REQUIRED)

I just cannot get this step to succeed. All I can get is

        By not providing "FindPythonExtensions.cmake" in CMAKE_MODULE_PATH this
        project has asked CMake to find a package configuration file provided by
        "PythonExtensions", but CMake did not find one.
      
        Could not find a package configuration file provided by "PythonExtensions"
        with any of the following names:
      
          PythonExtensionsConfig.cmake
          pythonextensions-config.cmake
      
        Add the installation prefix of "PythonExtensions" to CMAKE_PREFIX_PATH or
        set "PythonExtensions_DIR" to a directory containing one of the above
        files.  If "PythonExtensions" provides a separate development package or
        SDK, be sure it has been installed.

I can see the necessary file FindPythonExtensions.cmake as part of my scikit-build installation at site-packages/skbuild/resources/cmake. But I do not know how to programmatically get the location of this file to add it to my CMAKE_MODULE_PATH. To that end, I've tried running a Python script via EXECUTE_PROCESS a la this forum post, but I cannot get it to succeed for the life of me.


Am I supposed to take a copy of FindPythonExtensions.cmake into my repo? Why doesn't scikit-build-core find it when I invoke pip install . to build and install my library?

@henryiii
Collaborator

henryiii commented Jan 10, 2025

PythonExtensions is the old tool that is part of scikit-build (classic) and is based on the deprecated (and partially removed in CMake 3.27+) FindPythonInterp/FindPythonLibs modules in CMake. Scikit-build-core is designed to be used with the modern FindPython, which doesn't need the extra helpers in PythonExtensions.

If you use scikit-build (classic), then that tutorial will work, and PythonExtensions will automatically be available. If you use scikit-build-core (following https://scikit-build-core.readthedocs.io/en/latest/getting_started.html), then you don't need additional modules and can just use what CMake provides.
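To illustrate the modern route, here is a minimal CMakeLists.txt sketch using FindPython with scikit-build-core; the project name, module name, source path, and version range are placeholders, not taken from this thread:

```cmake
cmake_minimum_required(VERSION 3.15...3.30)
project(example LANGUAGES CXX)

# Modern FindPython replaces the deprecated FindPythonInterp/FindPythonLibs
# modules that PythonExtensions was built on.
find_package(Python REQUIRED COMPONENTS Interpreter Development.Module)

# python_add_library comes from FindPython itself; no helper modules needed.
python_add_library(_core MODULE src/main.cpp WITH_SOABI)

install(TARGETS _core DESTINATION .)
```

With this layout there is nothing for find_package(PythonExtensions) to do, so the original error simply never arises.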

@ofloveandhate
Author

ofloveandhate commented Jan 12, 2025

Ok, thank you.

Where can I find documentation on the CMake features that scikit-build and scikit-build-core provide? I want to be as independent as possible! My goal is to redo the build system for a C++/Python library using Boost.Python (so I can finally, hopefully, distribute pre-compiled binaries, a goal that has long challenged me), and I'm working on a minimal example, since the existing examples don't cover this case. My next question is about linking needed libraries to my python_add_library. I get an error on the next line, where I have

```cmake
python_add_library(slartibartfast MODULE ${SOURCES} ${HEADERS} WITH_SOABI)
# ok, cool, I'm making a Python library
target_link_libraries(slartibartfast ${Boost_LIBRARIES})  # error
```

The error I get is

        The keyword signature for target_link_libraries has already been used with
        the target "slartibartfast".  All uses of target_link_libraries with a
        target must be either all-keyword or all-plain.
      
        The uses of the keyword signature are here:
      
         * /opt/homebrew/share/cmake/Modules/FindPython/Support.cmake:4249 (target_link_libraries)

I can get it to work by adding the PRIVATE keyword to the target_link_libraries call, which I knew to do from pybind11's example, not from documentation. How could I have known to use PRIVATE to get around this?
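For reference, a sketch of the working form (slartibartfast and the variables come from the snippet above; the comment states why the keyword form is required):

```cmake
# FindPython's python_add_library already links Python::Module with the
# keyword signature (target_link_libraries(<target> PRIVATE ...)), which is
# exactly the Support.cmake call the error message cites. Once a target has
# been linked with keywords, every later target_link_libraries call on it
# must also use PRIVATE/PUBLIC/INTERFACE rather than the plain signature.
python_add_library(slartibartfast MODULE ${SOURCES} ${HEADERS} WITH_SOABI)
target_link_libraries(slartibartfast PRIVATE ${Boost_LIBRARIES})
```

PRIVATE is the usual choice for a Python extension module, since nothing links against the module itself.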

Another thing vexing me is how to get scikit-build-core / CMake / pip to install both a built C++/.so library AND a pure-python library that imports the .so and makes calls into it. I haven't found an example of that yet. Is there one in the documentation for scikit-build-core?

Thanks for your time!

@LecrisUT
Collaborator

scikit-build-core does not provide any special CMake functions, other than backporting FindPython when needed 1. You can (and should) design the build using scikit-build-core as you would a plain CMake project; on top of that, you just add packaging instructions, e.g. install().

Regarding dependencies, though: you can bundle them and download them as needed using FetchContent (I believe Boost is actually FetchContent-compliant, though I'm not sure about all components), but you need to take care of bundling all the necessary dependencies, either statically linking them all or setting an appropriate RPATH. Usually cibuildwheel can help with such bundling.
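A hedged sketch of the FetchContent route follows; the Boost git tag and the BOOST_INCLUDE_LIBRARIES variable reflect recent Boost releases that ship CMake support, and both should be verified against the release you actually pin:

```cmake
include(FetchContent)

# Assumption: the pinned Boost release provides usable CMake support for the
# components you need; coverage differs between releases and components.
FetchContent_Declare(
  Boost
  GIT_REPOSITORY https://github.com/boostorg/boost.git
  GIT_TAG        boost-1.84.0  # pin a concrete release
)
set(BOOST_INCLUDE_LIBRARIES python)  # build only the needed components
FetchContent_MakeAvailable(Boost)

target_link_libraries(slartibartfast PRIVATE Boost::python)
```

Static linking of the fetched components sidesteps the RPATH bundling question entirely, at the cost of larger wheels.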

Another thing vexing me is how to get scikit-build-core / CMake / pip to install both a built C++/.so library AND a pure-python library that imports the .so and makes calls into it. I haven't found an example of that yet. Is there one in the documentation for scikit-build-core?

This is a rather more complex design question, and it can be handled in various ways depending on what your requirements are. I generally ask: where do you want to package and support your project (PyPI, conda, Spack, distro packages)? Each of these puts constraints on the design and guides you toward good practices for that environment. Another important aspect is the intended user experience after pip install, e.g. do you want the project to try to use some system libraries, or always use the bundled libraries?

As for reference projects, I do not believe we have any official ones to recommend, but maybe @henryiii has some good references. My design in spglib tries to address some of your issues, but it is slightly blocked by #880. At some point, though, I will post a hello-world project derived from it. The main aspect I tackle there is integrating compatibility for distro packaging, PyPI distribution, and FetchContent consumption.

Footnotes

  1. https://github.com/scikit-build/scikit-build-core/tree/main/src/scikit_build_core/resources/find_python

@ofloveandhate
Author

Thank you for your thorough reply. I already have a working CMake build setup for my library, but it still feels piecemeal, in that pip-installing from the repo isn't connected to compilation (yet). My tool consists of three pieces:

  • C++ core library, can be installed without Python bindings
  • Boost.Python bindings, depend on core being installed
  • Pure Python wrapper around the compiled bindings

Getting all three built and installed currently takes three steps that feel disconnected, and that's a barrier to people using my tools. I desperately need it to be pip- or conda-installable.

I certainly would welcome a complete model example for packaging and distribution of a library built using C++ and Python; not just building, but how to get it all the way into the distribution pipeline. My highest goal is pip over conda, but both would be best (together with homebrew). I think galsim seems pretty close to what I'm doing.

I think we're close to the end of this thread, which has now wandered from its title. To conclude: do you have advice or resources for closing the gap between a public repo on GitHub and installation via pip, for a library like the one I'm working with? That's what brought me to scikit-build in the first place. I keep finding examples that do one step or another, but nothing end-to-end (maybe such an example would be too specific, but here I am, needing to solve this problem, and it continues to challenge me).

Thanks again, I really appreciate your time.

@ofloveandhate
Author

Following up a bit for myself to help bring this nearer to closure: python_add_library comes from find_package(Python ...), which is documented in CMake's FindPython module.

@ofloveandhate
Author

Progress updates

Ok, I'm much further along than last time. I'm working on a complete example repo that solves all of the bottlenecks I've been working through.

I now have the ability to compile the pieces, and I'm back to a scikit-build question, I think.

My pieces:

  • a core C++ library, and a command-line executable
  • compiled bindings using Boost.Python
  • a pure-python wrapper

From the top level, I can build the non-Python pieces, and they install correctly. The Python bindings build at the same time, which is a nice improvement. But the bindings don't install to the correct place, and the pure-Python part isn't tied to the rest.


My scikit-build questions are...

For a directory structure that looks like

```
/
/core/              (a C++ library and an executable)
/python_bindings/   (Boost.Python, depends on core)
/python_wrapper/    (pure Python; imports the bindings and does more before the user finally interacts)
```
  1. Where should the scikit-build pieces go?
    • By this I mean pyproject.toml and setup.py
    • I think at top level, /. Or is it in /python_wrapper/? If it's in python_wrapper, how do I expose it from top level?
  2. How do I make the bindings get installed in the correct site-packages folder? I've been using this (non-scikit-build-core) example, and I think I'm close
  3. How do I get pip install to install each of the components I've listed above, using scikit-build-core? Like, how do I express the dependency of python_wrapper on core and python_bindings?

@henryiii
Collaborator

It's best to put pyproject.toml at the top level; that will work best. You don't have a setup.py: that's a setuptools file. You do have a CMakeLists.txt file, which can go in either place, but I'd guess the top level is best there too, since you also want to build /core.

The default for install(...) is to install into site-packages. You can globally add a package folder to all installs using tool.scikit-build.wheel.install-dir = "<folder>".

If you use cmake targets and install commands, all the components should get installed.
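To make that concrete, here is a sketch reusing the earlier example's names (slartibartfast and python_wrapper/ are placeholders): with tool.scikit-build.wheel.install-dir = "slartibartfast" set in pyproject.toml, plain install() destinations resolve inside that package folder in site-packages:

```cmake
# With wheel.install-dir set in pyproject.toml, DESTINATION "." resolves
# to site-packages/slartibartfast/ inside the built wheel.
install(TARGETS slartibartfast DESTINATION .)

# Pure-Python sources can be shipped into the same package folder.
install(DIRECTORY python_wrapper/ DESTINATION .)
```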

@LecrisUT
Collaborator

LecrisUT commented Jan 15, 2025

Here is an excessive answer, but I think the design that you want is similar to mine so I am sharing some organization notes.


  1. Where should the scikit-build pieces go?

To add to Henry's answer: you are constrained by which sources you need when building the project, i.e. starting from the folder that contains pyproject.toml, the CMakeLists.txt must be buildable without sources from outside it. One design option is to split your project into multiple sub-projects:

core (foo-sys)

The main C++ project built as-is. Examples: cmake, ninja.

The advantage of having this as an independent package is that you can offer more flexibility in which core library to choose, and packaging-wise it is simple for downstream distros to override.

Note that there is still a lot of controversy over whether this design is advisable, and we have not fully fleshed it out. foo-sys is an established design pattern in Rust packages, but even there, there is no strong recommendation on how to split, and oftentimes the -sys package also includes the minimal bindings.

python_bindings (py-foo?)

Just the Python bindings, built and packaged.

The advantage of having this as an independent package is that it does not need to be updated as often, and it minimizes package sizes. When it is independent of foo-sys, you can take advantage of having the RPATH point to files packaged as part of foo-sys instead, so:

  • When foo-sys is not installed (system mode), the system library loader tries to find a compatible libfoo (determined by SONAME/SOVERSION) to load, sometimes even picking up appropriately optimized libraries based on hardware capabilities (e.g. AVX-512).
  • When foo-sys is installed (bundled mode), the RPATH takes precedence.
    The design is a bit trickier on Windows, where the .dll file needs to sit next to the bindings files.
  • When no foo-sys is available at all, you can easily point the user to install the bundled library or look for a system installation.

Depending on the compatibility you want to support, you may still want to bundle this with either the foo-sys package or the pure-Python one.

python_wrapper (foo)

Pure python files.

The advantage of having this as an independent package is that it offers quicker development, smaller wheel sizes, fewer rebuilds and downstream test runs, etc.

PS: They do not have to be sibling sub-directories; they can be nested in one another. One layout is to have the main project at the top level and /python for the Python bindings/wrapper, but I'm not sure how to accommodate that in the full 3-way split.


  2. How do I make the bindings get installed in the correct site-packages folder? I've been using this (non-scikit-build-core) example, and I think I'm close

As Henry answered: wheel.install-dir plus install( ./pkg/...), where . expands to site-packages. If you go with the foo-sys split, you can just use the default CMAKE_INSTALL_*DIR path structure and have a vanilla CMake project there, which is much easier for distro packagers to pick up.
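For the core piece, the vanilla layout might be sketched as follows (foo_core is a placeholder target name; include/ is an assumed header directory):

```cmake
include(GNUInstallDirs)

# Standard relocatable layout: distro packagers get the usual lib/,
# include/, and bin/ paths without any Python-specific assumptions.
install(TARGETS foo_core
        LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR}
        ARCHIVE DESTINATION ${CMAKE_INSTALL_LIBDIR}
        RUNTIME DESTINATION ${CMAKE_INSTALL_BINDIR})
install(DIRECTORY include/ DESTINATION ${CMAKE_INSTALL_INCLUDEDIR})
```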


  3. How do I get pip install to install each of the components I've listed above, using scikit-build-core? Like, how do I express the dependency of python_wrapper on core and python_bindings?

pip install can work with git URLs that point to the subdirectory of a sub-package 1

```
$ pip install 'foo[foo-sys] @ git+https://git.repo/foo@branch#subdirectory=python_wrapper'
```

(I am not sure whether it works if you combine foo and foo-sys in the same CLI call.)

Within each of them, though, you can have a plain dependency in pyproject.toml with the compatibility range you want to support.

Footnotes

  1. https://pip.pypa.io/en/stable/cli/pip_install/

@ofloveandhate
Author

If you go with the core-sys split, you can just use the default CMAKE_INSTALL_*DIR path structure and have a vanilla CMake project there that is much easier for distro packagers to pick up.

Perfect, this is what I will do.

ofloveandhate added a commit to ofloveandhate/scikit-build_boost-python that referenced this issue Jan 16, 2025
Thanks to help in scikit-build/scikit-build-core#966, this example now installs the bindings to the same folder as the pure-Python wrapper. It can now successfully run

```
import example
print(example.foo())
```

where `example.foo` is a function bound via Boost.Python which calls `example::core_function` from the pure C++ core library
@ofloveandhate
Author

Now something that feels unrelated, but is also part of the design choice we're talking about.

From the packaging.python.org documentation,

When a field is dynamic, it is the build backend’s responsibility to fill it. Consult your build backend’s documentation to learn how it does it.


Another question follows, particularly for @LecrisUT: how should I do versioning, since the three pieces are separate? In the "real" library I will apply all these lessons to, I have separate version numbers, and I would appreciate advice here, since updates in the various pieces need to reflect each other's compatibility. Or should I not worry about that?

I ask because pyproject.toml has a spot for a version number, but the multiple components should each get their own somehow: I want to keep the core separately usable, and bumps on the Python side shouldn't cause version-number changes to the core when no actual changes are made there.


I think:

  • I use the pure-Python package's version number as my top-level version number, and
  • the versions for the core and the bindings are what they are.

Changes happen, and as they affect higher layers, I just increase version numbers using semantic versioning.

Is this right?

@LecrisUT
Collaborator

Another question follows, particularly for @LecrisUT : how to do versioning, since the three pieces are separate.

Good question. One way is to regex the version out of CMakeLists.txt and control it statically there. I don't like this approach because it loses git information. A second, more involved way is using setuptools_scm, but it is not designed for different versions under one version control, at least out of the box. If you have them all in the same repo, then you have to tag them with a prefix or suffix, e.g. core-v0.1.0, and match it in the setuptools_scm configuration (I think the field is tag_regex), plus .git_archival.txt for extra credit. For the CMake-only part, I have a CMake module intended to be compatible with the setuptools-scm workflow, but I think it still lacks a few pieces of logic.

The matching version dependency, though, is harder, and we don't have a dynamic dependencies field in scikit-build-core (yet?). That would have to be a manual process across pyproject.toml and find_package. Note that CMake's find_package version filtering is backwards, i.e. the package's version file decides which requested versions are compatible, not the user (the caller of find_package) 🙁.
