Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FIX] Fixes for scikit-learn 1.0 #5608

Merged
merged 2 commits into from
Sep 29, 2021
Merged

Conversation

janezd
Copy link
Contributor

@janezd janezd commented Sep 27, 2021

Issue

Tests fail after release of scikit-learn 1.0.

  • Scikit's fit checks whether the object has columns to distinguish between arrays and pandas frame. Orange.data.Table also has columns, but with different content, which caused a problem when (incorrectly) passed to scikit.
  • ROC has tests for tooltips, which use test results for Titanic with kNN. Scikit changed something in kNN so that all returned probabilities are 0 or 1 (when k=5, this is not entirely unexpected).
Changes
  • I fixed the call to kNN to pass an array X, not the whole Table.
  • The offending test for ROC now uses artificial test results, not results from an actual model.

To be considered: Orange.data.Table's property columns, which contains an instance of this:

class Columns:
is unknown, strange, useless and to my knowledge unused. I assume it was there for easier access to columns when working with Orange interactively. I would just remove it without any deprecation period.

Includes
  • Code changes
  • Tests
  • Documentation N/A

@codecov
Copy link

codecov bot commented Sep 27, 2021

Codecov Report

Merging #5608 (9d8758f) into master (fde224f) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master    #5608   +/-   ##
=======================================
  Coverage   85.96%   85.96%           
=======================================
  Files         313      313           
  Lines       65466    65466           
=======================================
+ Hits        56275    56277    +2     
+ Misses       9191     9189    -2     

@janezd
Copy link
Contributor Author

janezd commented Sep 27, 2021

I fixed the failing tests, but can't reproduce core dumps on Mac. Help wanted.

@markotoplak markotoplak merged commit 35b7a3a into biolab:master Sep 29, 2021
@markotoplak
Copy link
Member

I merged as is.

I tried reproducing the segfaults locally. At first, I could not, but after I installed xgboost (an optional Orange dependency) it reliably segfaults on my 10.14 Mac. I just need to do the following in a fresh virtualenv:

pip install orange3
pip install xgboost
python -m unittest Orange.tests

For xgboost to work, you also need libomp. Before trying anything, do brew install libomp.

@janezd, can you replicate?

@janezd
Copy link
Contributor Author

janezd commented Sep 29, 2021

I can. test_all_models_work_after_unpickling segfaults after installing xgboost (and libomp).

@janezd
Copy link
Contributor Author

janezd commented Sep 29, 2021

Actually, the learner just crashes, always.

>>> XGBLearner()(Orange.data.Table("iris"))
Segmentation fault: 11

@janezd
Copy link
Contributor Author

janezd commented Sep 29, 2021

We may be experiencing dmlc/xgboost#7156: the crash happens at the same point, but if during debugging I change the value of variable threads from -1 to 1, it does not crash here (but crashes later, perhaps for a related reason).

Xgboost 1.5.0rc was released 4 days ago, and release is planned for Oct 11. I suggest we wait it out; till then, we can add a (time-bombed) exclusion for xgboost tests on MacOs.

@markotoplak
Copy link
Member

That was a got find! Let's hope that 1.5 fixes this segfault and perhaps also #5599.

markotoplak added a commit to markotoplak/orange3 that referenced this pull request Oct 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants