Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Switch from setup.cfg + setup.py to pyproject.toml only #708

Closed
wants to merge 12 commits into from

Conversation

baggiponte
Copy link
Contributor

As discussed today during the sprint, here is the PR to move to using pyproject.toml only. This can be done with ease because you do not need to compile any C-extension or anything. You can try out installing skrub configured with pyproject.toml only by running the following command in a fresh new env:

pip install -e "git+https://github.com/baggiponte/skrub@switch-to-pyproject#egg=skrub[dev]"

(if you use @... you can tell git to install from any commit/tag/branch; the #egg=skrub tells pip where the package is so that it can build the wheel; the [dev] at the end installs the package and dev dependencies. this will raise a deprecation warning: do not worry, it's just because you are installing from a branch in my fork, because I did not want to push the changes that I made to main)

The fact that the package is installed means that wheels can be built, i.e. the package can be published and distributed.

I simply ported everything inside setup.cfg following the official pyproject specification here as well as how to configure setuptools with pyproject.

I changed flake8 to ruff because flake8 cannot be configured with pyproject; ruff is a drop in replacement written in rust, very fast, extremely widely adopted (pandas, scipy, apache airflow, and many many more all use it). If you don't like the new tool, we can just leave the flake8 section in setup.cfg or in a .flake8 file.

I also edited the precommit config accordingly. I realised that while you configured codespell, it was not running in precommit, so I added that too. precommit runs fine; please do tell me of other ways to test whether something is broken! 😃

@Vincent-Maladiere
Copy link
Member

Hey @baggiponte, thanks for this PR!
I agree that moving to pyproject.toml is an improvement, since we don't have compiled files, nor intend to have some.

I have no opinion on whether we should use ruff or not, despite its popularity.

However, for the sake of passing the CI, you might want to fix the ruff errors, since many flake8 ignored rules are not mirrored. See the error in the CI.
Here is the list of ruff rules.

Using pyproject.toml with setuptools as its backend, it's good to mention that we can still generate bdist_wheel and sdist using either wheel:

pip wheel . --no-deps -w my_dist

or build (might need to be pip install):

python -m build

What do the other maintainers think? cc @GaelVaroquaux who might be interested in this.

@baggiponte
Copy link
Contributor Author

Hi Vincent!

Thank you for your review. I would have gotten it done this weekend but I was busy. I shall get back to it tomorrow and fix it asap 💪🏼

@Vincent-Maladiere
Copy link
Member

Don't worry, there's no rush :)
Thanks for staying motivated!

@Vincent-Maladiere
Copy link
Member

Hey @baggiponte, do you need help on this? :)

@jeromedockes
Copy link
Member

One thing worth considering is that when using setuptools, to include package data (non-python files that need to be shipped with the code) ATM with setuptools if we don't use setup.py, according to the documentation the 2 options are (i) list the data files in a MANIFEST.in (a bit tedious and there is the risk of forgetting some) or (ii) use the setuptools-scm plugin to include files that are tracked by git.

@baggiponte
Copy link
Contributor Author

baggiponte commented Sep 16, 2023

Ciao @Vincent-Maladiere, sorry if it took me so long to get back to this.

  1. I fetched the latest updates and merged in my working branch
  2. Removed the E24 altogether. My understanding is that if you say E24 flake8 will understand "every rule that starts with E24", which comprises E241 and E242, which denote multiple spaces and tabs after commas. This was ignored, I guess because black solves this beforehand.

UPDATE 1
Okay! Now there are no more errors related to my configuration. The error raised at this line is for a ruff rule that is not respected; the other errors below are from codespell: please tell me how I should proceed here :)

CircleCI fails because it expects setup.py to be there:

python: can't open file '/home/circleci/project/setup.py': [Errno 2] No such file or directory
Exited with code exit status 2

I shall check the build_docs.sh script next!

UPDATE 2

Now this error is raised in CircleCI. My understanding is that is due to something else with sphinx, but I can't really figure out why. Does it normally work? Do you understand what might be going on? Error is here.

Extension error (sphinx.ext.autosummary):
Handler <function process_generate_options at 0x7f71f98e0280> for event 'builder-inited' threw an exception (exception: no module named skrub.datasets)

@Vincent-Maladiere
Copy link
Member

Ciao @baggiponte! @jeromedockes raises a crucial point regarding package data (non-python files) and setup.py. We need to address this point before moving forward with pyproject.toml!

Switching packaging tools can take time since there are a lot of things to consider. In the meantime, feel free to tackle another issue :)

@juanitorduz
Copy link
Contributor

Ciao @baggiponte! @jeromedockes raises a crucial point regarding package data (non-python files) and setup.py. We need to address this point before moving forward with pyproject.toml!

Switching packaging tools can take time since there are a lot of things to consider. In the meantime, feel free to tackle another issue :)

There is a nice cookie-cutter based on pyproject.toml which could help as a guide if needed :) https://github.com/woltapp/wolt-python-package-cookiecutter/tree/master

@baggiponte
Copy link
Contributor Author

baggiponte commented Oct 1, 2023

Ciao @baggiponte! @jeromedockes raises a crucial point regarding package data (non-python files) and setup.py. We need to address this point before moving forward with pyproject.toml!

Ciao @Vincent-Maladiere, thanks for the review. I saw @jeromedockes comment indeed, but did not talk about it as I wanted to wait for your reply before opening a new thread.

This makes a lot of sense. IIUC, there's only one non-Python file to include, which is VERSION.txt. Am I wrong? If that were the only one, we can simply write the version to __version__ instead of reading the text file here.

Currently, VERSION.txt is read in the sphinx conf.py but it would suffice to add an import statement import skrub and a call to skrub.__version__ - e.g., on line 117 simply calling release = skrub.__version__.

I don't mean to push my idea, just trying to write out all possible alternatives. Up to you to decide the road to take :) I'll gladly look at the other issues.

@jeromedockes
Copy link
Member

jeromedockes commented Oct 2, 2023 via email

@Vincent-Maladiere
Copy link
Member

Hi @baggiponte, after discussing with other skrub members, we think it's a good idea to go forward with the pyproject.toml! We're fine with ruff and having a small MANIFEST.in :)

@baggiponte
Copy link
Contributor Author

Great! Do let me know how can I be of help to finish implementing this. I guess I should:

  1. Merge master and update
  2. Write the small MANIFEST.in

Something I forgot?

@Vincent-Maladiere
Copy link
Member

Vincent-Maladiere commented Oct 3, 2023

Finish the migration from flake8 to ruff, or is it done already?

@baggiponte
Copy link
Contributor Author

Let me double check but I recall it was done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants