Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] FreeViz script #2563

Merged
merged 8 commits into from
Oct 13, 2017
Merged

[ENH] FreeViz script #2563

merged 8 commits into from
Oct 13, 2017

Conversation

jerneju
Copy link
Contributor

@jerneju jerneju commented Sep 7, 2017

Issue

No FreeViz projection.

Description of changes

FreeViz projection scripting part similar as MDS, PCA.

from Orange.projection.freeviz import FreeViz
from Orange.data import Table
data = Table("iris")
freeviz = FreeViz()
model = freeviz(data)
new_data = model(data)

Code coverage is 98%.

Includes
  • Code changes
  • Tests
  • Documentation

@jerneju jerneju changed the title [ENH] Freeviz script [ENH] FreeViz script Sep 7, 2017
@codecov-io
Copy link

codecov-io commented Sep 7, 2017

Codecov Report

Merging #2563 into master will increase coverage by 0.07%.
The diff coverage is 93.85%.

@@            Coverage Diff             @@
##           master    #2563      +/-   ##
==========================================
+ Coverage   75.65%   75.73%   +0.07%     
==========================================
  Files         333      334       +1     
  Lines       58422    58654     +232     
==========================================
+ Hits        44202    44422     +220     
- Misses      14220    14232      +12

@jerneju
Copy link
Contributor Author

jerneju commented Sep 14, 2017

@janezd

@janezd
Copy link
Contributor

janezd commented Sep 14, 2017

Could you rebase #2512 to this, so I can observe this PR in actual action? You'll have to do it some day anyway.

#2512 does not have to be finished, I just want to see how the projection function works.

@jerneju
Copy link
Contributor Author

jerneju commented Sep 19, 2017

@janezd: Rebased.

@jerneju jerneju added this to the 3.6 milestone Sep 29, 2017
@astaric astaric removed this from the 3.6 milestone Sep 29, 2017
@jerneju jerneju mentioned this pull request Oct 6, 2017
3 tasks
@jerneju
Copy link
Contributor Author

jerneju commented Oct 9, 2017

@janezd : Is this ok to merge?

X = data.X
Y = data.Y
if X is None or Y is None:
return
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return None, None, None

def __init__(self, weights=None, center=True, scale=True, dim=2, p=1,
initial=None, maxiter=500, alpha=0.1, atol=1e-5, preprocessors=None):
super().__init__(preprocessors=preprocessors)
self.params = vars()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use this monstrosity only in SKL wrappers. I think FreeViz does not need params.


def __call__(self, data):
if data and data.domain.class_var.is_discrete:
self.is_class_discrete = True
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same instance of FreeViz can be called multiple times and with different data sets, hence it can't count that is_class_discrete is False (as set in the constructor`.

if data is not None:
    self.is_class_discrete = data.domain.class_var.is_discrete

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

def transform(self, X):
EX = np.dot(X, self.components_)
return EX

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@markotoplak, the code from the beginning of the file and up to here is adapted from PCA. I know the idea, but you're an expert here. Can you take a quick look, please?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems OK.

I think it would be useful to add a test, where we would

  1. open data, apply some preprocessor, splits the data into two parts, use FreeViz on the first part, and then transform the second part,
  2. open data, split into two parts, apply the same preprocessor and FreeViz only on the first part, and then transform the second part,

where the transformed second part in (1) and (2) has to be the same.

return self.freeviz.transform(data.X)


class _LinearCombination:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lanzagar, could Orange.projection.pca._LinearCombination be moved to Orange.projection.base.LinearCombination`, so it could be reused here?

Also, PCAModel is just a specific case of linear model. Could the common functionality be extracted into a base class? Could you and @jerneju discuss this? I don't insist it should be done, I'm just suggesting considering it. And it doesn't have to be in this PR, either.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Could and should.
  2. Could. Probably should? Definitely should be considered.

We'll take a look.

self.transformed = None

def compute(self, data, freeviz_space):
if self.feature >= freeviz_space.shape[1]:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How could this happen at all?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. Fixed.

def transform(self, X):
EX = np.dot(X, self.components_)
return EX

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems OK.

I think it would be useful to add a test, where we would

  1. open data, apply some preprocessor, splits the data into two parts, use FreeViz on the first part, and then transform the second part,
  2. open data, split into two parts, apply the same preprocessor and FreeViz only on the first part, and then transform the second part,

where the transformed second part in (1) and (2) has to be the same.

from Orange.data.util import SharedComputeValue
from Orange.projection import Projector, Projection


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Define __all__ here and pull FreeViz up into Orange.projection.
Tests and documentation should then import from there.

Copy link
Contributor

@janezd janezd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@markotoplak, can you check and approve the changes?

@jerneju, can you commit --amend & push -f again to restart travis?

__all__ = ["LinearCombinationSql", "Projector", "Projection", "SklProjector"]


class LinearCombinationSql:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know what this actually does, but I suppose that this is a general linear combination, and it also supports SQL -- and not a linear combination that requires SQL, right? If so, the name is misleading. @lanzagar, could you help here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the sql equivalent of the compute_value. It only works for sql, as in-memory data is processed with a different object - Projector.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, OK, I see. Thanks.


def test_transform_changed_domain(self):
"""
Thest where:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thest -> test

model = freeviz(ndata)
result_2 = model(data[100:])

self.assertTrue(np.allclose(result_1, result_2))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not np.testing.assert_almost_equal?

@markotoplak, the test is otherwise OK, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

======================================================================
ERROR: test_transform_changed_domain (Orange.tests.test_freeviz.TestFreeviz)

Traceback (most recent call last):
File "/home/travis/build/biolab/orange3/build/travis-test/Orange/tests/test_freeviz.py", line 101, in test_transform_changed_domain
np.testing.assert_almost_equal(result_1, result_2)
File "/home/travis/virtualenv/python3.6.3/lib/python3.6/site-packages/numpy/testing/utils.py", line 568, in assert_almost_equal
if not (gisfinite(desired) and gisfinite(actual)):
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Copy link
Contributor Author

@jerneju jerneju Oct 13, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe

numpy.testing.assert_allclose

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test is almost OK.

To make results repeatable, init_random should be fixed so that it uses rstate = np.random.RandomState(0) if not given. Then, inititalization should be removed from the test.

The comment on the test is superfluous - the code and two in code comments say it all.

@janezd janezd merged commit 6e04e28 into biolab:master Oct 13, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants