Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove upper caps on dependencies #452

Merged
merged 2 commits into from
Dec 5, 2024

Conversation

arredond
Copy link
Contributor

@arredond arredond commented Oct 7, 2024

What

Like many Poetry-managed projects, this package sets upper caps on dependencies both via caret (^) and lesser-than signs (<=). This PR relaxes these requirements to just establishing lower bounds, not upper ones.

Why

Poetry heavily believes in SemVer but many projects do not strictly follow it. For instance, pyarrow has bumped major versions eight times in the last 2 years without really making breaking changes for the vast majority of users.

Many well-regarded devs in the Python ecosystem are calling against upper-bounds in projects due to the complications it generates for end users (here's an excellent discussion on the topic from a PyPa member).

Personally, I find myself in "dependency hell" in a project where we use SQLAlchemy to interface with many kinds of databases (the area where SQLAlchemy excels, for sure). The latest dialects we're trying to support require SQLAlchemy >= 2.0, but our current databricks-sql-connectorversion of 2.9.3 doesn't allow it. We also can't upgrade to the latest 3.4.0 because an upper limit on pyarrow has been set (which didn't exist before). A similar experience happens with Numpy >= 2.0.

Related issues / PRs

Many versions of this issue have popped up before for this package (especially for pyarrow and numpy):
#427
#55
#428
#74
#88
#432

Note

If this PR seems too ambitious, I'm happy to provide another one that only removes or, at least, bumps the upper cap for pyarrow and numpy. However, I believe removing upper caps in a general sense is the way to go. Major version bumps in Python are very frequently backwards compatible, and it's a much better experience for users to fix problems when they occur than to prevent them from upgrading (and eventually ditching your package altogether).

Here's a quote in the linked article from Brett Cannon, Python Steering Council Member and packaging expert:

Libraries/packages should be setting a floor, and if necessary excluding known buggy versions, but otherwise don’t cap the maximum version as you can’t predict future compatibility

Or another quote from the author of the article himself:

If you absolutely must set upper limits, you should release a new version as soon as possible with a higher cap when a dependency updates (ideally before the dependency releases the update). If you are committing to this, why not just quickly release a patch release with caps only after an actual conflict happens? It will be less common, and will help you quickly sort out and fix incompatibilities, rather than hiding your true compatibilities and delaying updates. You want users to use the latest versions of your libraries if there’s a problem, so why can’t you offer the same consideration to the libraries you depend on and use?

@arredond
Copy link
Contributor Author

@andrefurlan-db @benc-db @jackyhu-db @kravets-levko @rcypher-databricks @yunbodeng-db pinging any of you in case this has gone unseen

Copy link
Contributor

@susodapop susodapop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change LGTM and the motivation is a solid one.

@hauntsaninja
Copy link

@jackyhu-db any chance you could take a look at this? I'm interested in removing the preemptive cap for numpy 2 (see also #432 )

@jackyhu-db
Copy link
Collaborator

@arredond thanks for making the change. Can you reduce the scope of the change? Only change the pyarrow and numpy?

@arredond
Copy link
Contributor Author

arredond commented Dec 5, 2024

Of course, I'll make a change to this PR reducing the scope.

However, at the risk of repeating myself and being annoying, please consider removing all upper caps in the future. Numpy and pyarrow have been the problem this time but it's very likely that in the near future you'll be seeing this same issue with, say, requests.

Once again copying that quote from Brett Cannon in the PR description, since these well-established folks make the case much better than I do:

Libraries/packages should be setting a floor, and if necessary excluding known buggy versions, but otherwise don’t cap the maximum version as you can’t predict future compatibility

@jackyhu-db jackyhu-db requested review from jprakash-db and removed request for kravets-levko December 5, 2024 17:38
@jackyhu-db jackyhu-db merged commit d690516 into databricks:main Dec 5, 2024
10 of 11 checks passed
@arredond arredond deleted the remove-pyarrow-cap branch December 8, 2024 18:12
BaukJ pushed a commit to BaukJ/databricks-sql-python that referenced this pull request Dec 16, 2024
* Remove upper caps on numpy and pyarrow versions
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants