-
-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[ENH] FreeViz script #2563
[ENH] FreeViz script #2563
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2563 +/- ##
==========================================
+ Coverage 75.65% 75.73% +0.07%
==========================================
Files 333 334 +1
Lines 58422 58654 +232
==========================================
+ Hits 44202 44422 +220
- Misses 14220 14232 +12 |
@janezd: Rebased. |
@janezd : Is this ok to merge? |
Orange/projection/freeviz.py
Outdated
X = data.X | ||
Y = data.Y | ||
if X is None or Y is None: | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return None, None, None
Orange/projection/freeviz.py
Outdated
def __init__(self, weights=None, center=True, scale=True, dim=2, p=1, | ||
initial=None, maxiter=500, alpha=0.1, atol=1e-5, preprocessors=None): | ||
super().__init__(preprocessors=preprocessors) | ||
self.params = vars() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We use this monstrosity only in SKL wrappers. I think FreeViz
does not need params
.
Orange/projection/freeviz.py
Outdated
|
||
def __call__(self, data): | ||
if data and data.domain.class_var.is_discrete: | ||
self.is_class_discrete = True |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same instance of FreeViz
can be called multiple times and with different data sets, hence it can't count that is_class_discrete
is False
(as set in the constructor`.
if data is not None:
self.is_class_discrete = data.domain.class_var.is_discrete
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
def transform(self, X): | ||
EX = np.dot(X, self.components_) | ||
return EX | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@markotoplak, the code from the beginning of the file and up to here is adapted from PCA. I know the idea, but you're an expert here. Can you take a quick look, please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems OK.
I think it would be useful to add a test, where we would
- open data, apply some preprocessor, splits the data into two parts, use FreeViz on the first part, and then transform the second part,
- open data, split into two parts, apply the same preprocessor and FreeViz only on the first part, and then transform the second part,
where the transformed second part in (1) and (2) has to be the same.
Orange/projection/freeviz.py
Outdated
return self.freeviz.transform(data.X) | ||
|
||
|
||
class _LinearCombination: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lanzagar, could Orange.projection.pca._LinearCombination
be moved to Orange.projection.base.LinearCombination`, so it could be reused here?
Also, PCAModel
is just a specific case of linear model. Could the common functionality be extracted into a base class? Could you and @jerneju discuss this? I don't insist it should be done, I'm just suggesting considering it. And it doesn't have to be in this PR, either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Could and should.
- Could. Probably should? Definitely should be considered.
We'll take a look.
Orange/projection/freeviz.py
Outdated
self.transformed = None | ||
|
||
def compute(self, data, freeviz_space): | ||
if self.feature >= freeviz_space.shape[1]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How could this happen at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. Fixed.
def transform(self, X): | ||
EX = np.dot(X, self.components_) | ||
return EX | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems OK.
I think it would be useful to add a test, where we would
- open data, apply some preprocessor, splits the data into two parts, use FreeViz on the first part, and then transform the second part,
- open data, split into two parts, apply the same preprocessor and FreeViz only on the first part, and then transform the second part,
where the transformed second part in (1) and (2) has to be the same.
Orange/projection/freeviz.py
Outdated
from Orange.data.util import SharedComputeValue | ||
from Orange.projection import Projector, Projection | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Define __all__
here and pull FreeViz
up into Orange.projection.
Tests and documentation should then import from there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@markotoplak, can you check and approve the changes?
@jerneju, can you commit --amend & push -f again to restart travis?
__all__ = ["LinearCombinationSql", "Projector", "Projection", "SklProjector"] | ||
|
||
|
||
class LinearCombinationSql: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know what this actually does, but I suppose that this is a general linear combination, and it also supports SQL -- and not a linear combination that requires SQL, right? If so, the name is misleading. @lanzagar, could you help here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the sql equivalent of the compute_value. It only works for sql, as in-memory data is processed with a different object - Projector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, OK, I see. Thanks.
Orange/tests/test_freeviz.py
Outdated
|
||
def test_transform_changed_domain(self): | ||
""" | ||
Thest where: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thest -> test
Orange/tests/test_freeviz.py
Outdated
model = freeviz(ndata) | ||
result_2 = model(data[100:]) | ||
|
||
self.assertTrue(np.allclose(result_1, result_2)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not np.testing.assert_almost_equal
?
@markotoplak, the test is otherwise OK, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
======================================================================
ERROR: test_transform_changed_domain (Orange.tests.test_freeviz.TestFreeviz)Traceback (most recent call last):
File "/home/travis/build/biolab/orange3/build/travis-test/Orange/tests/test_freeviz.py", line 101, in test_transform_changed_domain
np.testing.assert_almost_equal(result_1, result_2)
File "/home/travis/virtualenv/python3.6.3/lib/python3.6/site-packages/numpy/testing/utils.py", line 568, in assert_almost_equal
if not (gisfinite(desired) and gisfinite(actual)):
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe
numpy.testing.assert_allclose
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test is almost OK.
To make results repeatable, init_random should be fixed so that it uses rstate = np.random.RandomState(0) if not given. Then, inititalization should be removed from the test.
The comment on the test is superfluous - the code and two in code comments say it all.
Issue
No FreeViz projection.
Description of changes
FreeViz projection scripting part similar as MDS, PCA.
Code coverage is 98%.
Includes