Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use 'confidence' instead of deprecated 'alpha' for scipy.stats.t.interval #403

Merged
merged 2 commits into from
Oct 16, 2022

Conversation

tetsuok
Copy link
Contributor

@tetsuok tetsuok commented Aug 24, 2022

Reduced heavy logging uncovered buried DeprecationWarnings in tests. We get the following DeprecationWarning in the tests that invoke scipy.stats.t.interval method:

test_hits (explainaboard.tests.test_metric.TestMetric) ... /home/runner/work/ExplainaBoard/ExplainaBoard/explainaboard/metrics/metric.py:338: DeprecationWarning: Use of keyword argument `alpha` for method `interval` is deprecated. Use first positional argument or keyword argument `confidence` instead.

This PR fixes the warning as the warning suggests.

@tetsuok
Copy link
Contributor Author

tetsuok commented Aug 24, 2022

It seems better not to fix this DeprecationWarning since the warning is generated since scipy 1.9.0 (latest version). Before 1.9.0, alhpa was used. This can be a breaking change unless we set the minimum version of scipy to support. Currently, setup.py doesn't specify the minimum version of scipy.

@tetsuok tetsuok marked this pull request as draft August 24, 2022 15:48
@pfliu-nlp pfliu-nlp requested a review from neubig August 29, 2022 19:26
@neubig
Copy link
Contributor

neubig commented Aug 29, 2022

This is a hard decision, I'm not sure if we want to limit ourselves to only being compatible with the most recent version of scikit-learn. But also DeprecationWarnings would be good to get rid of as well.

@odashi
Copy link
Contributor

odashi commented Aug 29, 2022

@tetsuok @neubig At least we need to fix a version of libraries we rely on. Setting deprecation warnings is a common process to remove a feature from the library, but it usually takes 2-3 minor releases to remove the feature completely. It shouldn't be a problem if we set appropriate versions.

@neubig
Copy link
Contributor

neubig commented Sep 4, 2022

That's a good point @odashi .

I'm a bit conflicted about this though. If we completely fix the version of libraries that we rely on (a == dependency), then this can cause the libraries to get more and more out of date. Because of this I've preferred specifying a minimum version (a >= dependency). We've also been a bit lenient on the version number (allowing slightly older versions) to reduce the number of conflicts that ExplainaBoard causes with other libraries that may be installed in the environment.

I don't have the best answer, but I think we should carefully choose a policy and implement it.

@odashi
Copy link
Contributor

odashi commented Sep 4, 2022

>= indicates that we require the newest version (unless other restrictions, such as <=), and it should be used only when we know that every new versions never break public interfaces or such change doesn't affect our library.

According to semvar that many libraries follow, I think the recommended policy is:

  • use ~=N.x or ~=N.x.y if the library has a major version (N>=1). It automatically increases both minor/patch numbers (~=N.x) or only the patch number (~=N.x.y) that may not involve breaking changes.
  • use == if the library is beta (0.x.y) since increasing the version may have any kind of changes.

Updating versions of depended libraries needs explicit maintenance (it basically can not be done automatically) so it is our responsibility.

@odashi
Copy link
Contributor

odashi commented Sep 4, 2022

According to the SciPy's log (scipy/scipy#16389) it attempts to introduce breaking changes in the minor version (1.11.0). So we need to fix minor versions too for this library (~=1.N.0)

@odashi odashi mentioned this pull request Sep 6, 2022
@neubig
Copy link
Contributor

neubig commented Sep 10, 2022

Thanks @odashi !

I agree that we should avoid breaking anything, but at the same time == or ~= dependencies may result in our code requiring out of date libraries. This is problematic for two reasons:

  1. The older libraries may be slower, less secure, or less featureful.
  2. Specifying a specific library version (especially an older one) can make it harder to find a set of libraries that doesn't have any conflicts with other libraries in the environment.

Because of these issues, I've tended to go with >= dependencies and then fix errors when they occur, but I agree that this is not the best for library stability. Do you have a good method to overcome the issues of libraries going out of date while using == or ~= dependencies? If one exists I'd be happy to go with that.

@odashi
Copy link
Contributor

odashi commented Sep 10, 2022

@neubig The ~=N.x specifier (fixing the major version, auto-update minor/patch versions) does never cause any breakages on libraries that follows semantic versioning. Others are case-by-case (no silver-bullet solution): we have to investigate how each library is managed to determine appropriate version specifiers.

@odashi
Copy link
Contributor

odashi commented Sep 10, 2022

The best solution here is to choose a version specifier that won't involve breaking changes. Fixing the major version is enough for semvar-based libraries. We realized that the minor version should also be fixed for SciPy as discussed above.

Security updates are basically applied by increasing a weaker version number unless it requires overall refactoring, so ~= specifier should cover most cases of security updates.
There are also GitHub-official bot to detect serious security issues. Since the bot covers most of major libraries, following alerts generated by this bot can prevent most of known issues.

For efficiency/features, we need to treat these kind of updates as a usual feature request to this library.

@neubig
Copy link
Contributor

neubig commented Sep 10, 2022

OK, I agree with this strategy. I guess we should refactor the setup.py/requirements files appropriately then. I'll create an issue.

@tetsuok
Copy link
Contributor Author

tetsuok commented Oct 15, 2022

Blocked by #472

@tetsuok tetsuok marked this pull request as ready for review October 15, 2022 17:31
@tetsuok tetsuok requested a review from odashi as a code owner October 15, 2022 17:31
@tetsuok
Copy link
Contributor Author

tetsuok commented Oct 16, 2022

This is ready for review because #563 pinned the version of scipy using the '~=' operator.

@tetsuok tetsuok merged commit a44e151 into neulab:main Oct 16, 2022
@tetsuok tetsuok deleted the fix-deprecation-warnings branch October 16, 2022 23:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants