GH-37929: [Python] begin moving static settings to pyproject.toml #41041

Merged
merged 26 commits into from
Jun 5, 2024

Conversation

anjakefala
Collaborator

@anjakefala anjakefala commented Apr 5, 2024

Rationale for this change

This change moves Arrow toward modern Python packaging standards; see PEP-517 and PEP-518.

This PR focuses on migrating the static settings, the metadata and version, to pyproject.toml. Future PRs will migrate more of the build process to pyproject.toml.

@anjakefala anjakefala requested review from raulcd and assignUser April 5, 2024 17:37

github-actions bot commented Apr 5, 2024

⚠️ GitHub issue #37929 has been automatically assigned in GitHub to PR creator.

@anjakefala
Collaborator Author

I will try to investigate how other projects handle development versions with pyproject.toml!

@anjakefala
Collaborator Author

I found this! https://setuptools.pypa.io/en/latest/userguide/pyproject_config.html#dynamic-metadata I'll get to work updating.

@anjakefala anjakefala force-pushed the kef/pyproject branch 2 times, most recently from 006ce35 to b0aa7e6 Compare April 5, 2024 21:12
@anjakefala
Collaborator Author

I am actively trying to figure out how to fix what's coming up in the build failures!

python/setup.py Outdated Show resolved Hide resolved
@github-actions github-actions bot added awaiting committer review and removed awaiting review labels Apr 9, 2024
@anjakefala anjakefala force-pushed the kef/pyproject branch 2 times, most recently from ab076e8 to f9072a4 Compare April 9, 2024 19:37
@anjakefala
Collaborator Author

This is the current challenge for this PR.

To migrate the static metadata to the pyproject.toml, we need to set a version. In PyArrow, the version is set dynamically using setuptools_scm.

setuptools_scm only lets you configure callables in setup.py. We set two of them: one for parse and one for version_scheme.

def parse_git(root, **kwargs):
    """
    Parse function for setuptools_scm that ignores tags for non-C++
    subprojects, e.g. apache-arrow-js-XXX tags.
    """
    from setuptools_scm.git import parse
    kwargs['describe_command'] =\
        'git describe --dirty --tags --long --match "apache-arrow-[0-9]*.*"'
    return parse(root, **kwargs)


def guess_next_dev_version(version):
    if version.exact:
        return version.format_with('{tag}')
    else:
        def guess_next_version(tag_version):
            # default_version is defined elsewhere in setup.py (not shown here)
            return default_version.replace('-SNAPSHOT', '')
        return version.format_next_version(guess_next_version)

As they currently are, we cannot configure these in pyproject.toml, because it will not accept a Python callable.
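To make the behaviour of the version_scheme callable concrete, here is a minimal sketch that runs guess_next_dev_version against a stand-in object. FakeScmVersion and the value of default_version are assumptions made up for illustration; the real object is supplied by setuptools_scm and the real default_version lives in setup.py.

```python
from dataclasses import dataclass

# Assumed value for illustration only; the real one is defined in setup.py.
default_version = "16.0.0-SNAPSHOT"


@dataclass
class FakeScmVersion:
    """Hypothetical stand-in for the version object setuptools_scm passes in."""
    tag: str
    distance: int
    exact: bool

    def format_with(self, fmt):
        # Fill a template with the fields of this stand-in.
        return fmt.format(tag=self.tag, distance=self.distance)

    def format_next_version(self, guess_next, fmt="{guessed}.dev{distance}"):
        # Ask the callback for the next release, then append a dev suffix.
        guessed = guess_next(self.tag)
        return fmt.format(guessed=guessed, distance=self.distance)


def guess_next_dev_version(version):
    # Exactly on a tag: use the tag itself as the version.
    if version.exact:
        return version.format_with("{tag}")

    # Otherwise ignore the tag and derive the next release from default_version.
    def guess_next_version(tag_version):
        return default_version.replace("-SNAPSHOT", "")

    return version.format_next_version(guess_next_version)


on_tag = FakeScmVersion(tag="15.0.0", distance=0, exact=True)
after_tag = FakeScmVersion(tag="15.0.0", distance=453, exact=False)
print(guess_next_dev_version(on_tag))
print(guess_next_dev_version(after_tag))
```

Under these assumptions, the second call ignores the tag entirely and builds the development version from default_version plus the commit distance, which is why this logic needs a Python callable rather than a static string.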

The next part of the challenge is that if you move the version metadata to pyproject.toml, none of the configuration in setup.py is picked up. That is why the build is failing. So you cannot put the static settings in pyproject.toml and still pass the Python callables into setup.py via use_scm_version. It is an all-or-nothing migration.

I'm thinking that the next step is to contact the maintainers of setuptools_scm, and see if they have any advice.

@jorisvandenbossche
Member

The setuptools_scm docs have an example of passing a callable in setup.py while using pyproject.toml: https://setuptools-scm.readthedocs.io/en/latest/customizing/#providing-project-local-version-schemes
So based on that it seems this should be possible?

@jorisvandenbossche
Member

Also, it seems that we use the parse_git callable to run a custom git describe invocation. Nowadays, though, setuptools_scm also has an option to directly override which describe command is used (git_describe_command), so that part can probably be turned into a static configuration instead of a callable.

@anjakefala
Collaborator Author

So based on that it seems this should be possible?

I found open issues for the behaviour I am noticing:

pypa/setuptools-scm#827
pypa/setuptools-scm#1011

@anjakefala
Collaborator Author

Locally, the test seems to be behaving decently:

(base) ~/git/arrow/python kef/pyproject $ ls dist 
pyarrow-16.0.0.dev453+g51a3831e4.d20240416.tar.gz

The open question is why it is failing in CI.
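For readers unfamiliar with the format, the sdist name from the local build can be pulled apart as follows. This is a small sketch using only the standard library; the regular expression and the field names are my own, and the interpretation (commit distance, abbreviated commit hash, date appended for a dirty working tree) assumes setuptools_scm's default local version scheme.

```python
import re

sdist_name = "pyarrow-16.0.0.dev453+g51a3831e4.d20240416.tar.gz"

# Hypothetical pattern for illustration; group names are not setuptools_scm terms.
pattern = re.compile(
    r"^(?P<name>[a-z]+)-"            # distribution name
    r"(?P<release>\d+\.\d+\.\d+)"    # guessed next release
    r"\.dev(?P<distance>\d+)"        # commits since the matched tag
    r"\+g(?P<commit>[0-9a-f]+)"      # abbreviated git commit hash
    r"(?:\.d(?P<dirty_date>\d{8}))?" # date, present when the tree is dirty
    r"\.tar\.gz$"
)

parts = pattern.match(sdist_name).groupdict()
for key, value in parts.items():
    print(f"{key}: {value}")
```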

@pitrou
Member

pitrou commented Apr 17, 2024

Note that we're not married to setuptools_scm. If we find out that something else would work better for us, then we can switch to it.

Found this comparison using a quick search: jwodder/versioningit#46 (comment)

@anjakefala
Collaborator Author

setuptools_scm does have logging. I'm going to see if the logging helps reveal anything. If not, I'll explore the alternatives!

@anjakefala anjakefala force-pushed the kef/pyproject branch 7 times, most recently from 9b0f5b4 to b83e36a Compare April 24, 2024 21:13
@github-actions github-actions bot added awaiting change review, awaiting changes and removed awaiting changes, awaiting change review labels Jun 3, 2024
@raulcd
Member

raulcd commented Jun 3, 2024

I've rebased on main to fix conflicts

@raulcd
Member

raulcd commented Jun 3, 2024

@github-actions crossbow submit -g python


github-actions bot commented Jun 3, 2024

Revision: 50a35c1

Submitted crossbow builds: ursacomputing/crossbow @ actions-77bf8dcbea

Task Status
example-python-minimal-build-fedora-conda GitHub Actions
example-python-minimal-build-ubuntu-venv GitHub Actions
test-conda-python-3.10 GitHub Actions
test-conda-python-3.10-cython2 GitHub Actions
test-conda-python-3.10-hdfs-2.9.2 GitHub Actions
test-conda-python-3.10-hdfs-3.2.1 GitHub Actions
test-conda-python-3.10-pandas-latest GitHub Actions
test-conda-python-3.10-pandas-nightly GitHub Actions
test-conda-python-3.10-spark-v3.5.0 GitHub Actions
test-conda-python-3.10-substrait GitHub Actions
test-conda-python-3.11 GitHub Actions
test-conda-python-3.11-dask-latest GitHub Actions
test-conda-python-3.11-dask-upstream_devel GitHub Actions
test-conda-python-3.11-hypothesis GitHub Actions
test-conda-python-3.11-pandas-upstream_devel GitHub Actions
test-conda-python-3.11-spark-master GitHub Actions
test-conda-python-3.12 GitHub Actions
test-conda-python-3.8 GitHub Actions
test-conda-python-3.8-pandas-1.0 GitHub Actions
test-conda-python-3.8-spark-v3.5.0 GitHub Actions
test-conda-python-3.9 GitHub Actions
test-conda-python-3.9-pandas-latest GitHub Actions
test-cuda-python GitHub Actions
test-debian-12-python-3-amd64 GitHub Actions
test-debian-12-python-3-i386 GitHub Actions
test-fedora-39-python-3 GitHub Actions
test-ubuntu-20.04-python-3 GitHub Actions
test-ubuntu-22.04-python-3 GitHub Actions

@jorisvandenbossche
Member

@github-actions crossbow submit example-python-minimal-build-ubuntu-venv

@github-actions github-actions bot added awaiting change review and removed awaiting changes labels Jun 4, 2024

github-actions bot commented Jun 4, 2024

Revision: 24054ef

Submitted crossbow builds: ursacomputing/crossbow @ actions-c870c684c4

Task Status
example-python-minimal-build-ubuntu-venv GitHub Actions

@github-actions github-actions bot added awaiting changes and removed awaiting change review labels Jun 4, 2024
Member

@raulcd raulcd left a comment


I think this can be merged now, @jorisvandenbossche @pitrou any other concerns?

@github-actions github-actions bot added awaiting merge and removed awaiting changes labels Jun 4, 2024
Member

@jorisvandenbossche jorisvandenbossche left a comment


Yes, I would also say, let's merge this ;)

@anjakefala
Collaborator Author

Thanks everyone! This turned out to be such a surprising doozy.

@assignUser assignUser merged commit ad897bb into apache:main Jun 5, 2024
19 of 20 checks passed
@assignUser assignUser removed the awaiting merge label Jun 5, 2024
vibhatha pushed a commit to vibhatha/arrow that referenced this pull request Jun 5, 2024
…ml (apache#41041)

### Rationale for this change

To migrate Arrow to modern Python packaging standards, see [PEP-517](https://peps.python.org/pep-0517/) and [PEP-518](https://peps.python.org/pep-0518/). 
* GitHub Issue: apache#37929

This PR focuses on migrating the static settings, the metadata and version, to pyproject.toml. Future PRs will migrate more of the build process to pyproject.toml.

Lead-authored-by: anjakefala <[email protected]>
Co-authored-by: Raúl Cumplido <[email protected]>
Co-authored-by: Joris Van den Bossche <[email protected]>
Signed-off-by: Jacob Wujciak-Jens <[email protected]>

After merging your PR, Conbench analyzed the 6 benchmarking runs that have been run so far on merge-commit ad897bb.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details. It also includes information about 2 possible false positives for unstable benchmarks that are known to sometimes produce them.

5 participants