You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Bug Description It appears that it is possible for the visualizer to fail when it finds a "perfectly bad" split of the test and training data.
Expected Behavior @nbokulich says that this is very unlikely to happen, but we should look into reproducing and raising a different error. If a user runs into this, they should be able to just re-run their command.
Screenshots
Test failure/traceback
_____________________ EstimatorsTests.test_maturity_index ______________________
self = <q2_sample_classifier.tests.test_classifier.EstimatorsTests testMethod=test_maturity_index>
def test_maturity_index(self):
maturity_index(self.temp_dir.name, self.table_ecam_fp, self.md_ecam_fp,
category='month', group_by='delivery', n_jobs=1,
> control='Vaginal', test_size=0.4)
test-env/lib/python3.5/site-packages/q2_sample_classifier/tests/test_classifier.py:250:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
test-env/lib/python3.5/site-packages/q2_sample_classifier/classify.py:145: in maturity_index
accuracy, output_dir, maz_stats=maz_stats)
test-env/lib/python3.5/site-packages/q2_sample_classifier/utilities.py:412: in _visualize_maturity_index
g = _clustermap_from_dataframe(top, metadata, group_by, category)
test-env/lib/python3.5/site-packages/q2_sample_classifier/visuals.py:69: in _clustermap_from_dataframe
row_cluster=False)
test-env/lib/python3.5/site-packages/seaborn/matrix.py:1301: in clustermap
**kwargs)
test-env/lib/python3.5/site-packages/seaborn/matrix.py:1131: in plot
row_linkage=row_linkage, col_linkage=col_linkage)
test-env/lib/python3.5/site-packages/seaborn/matrix.py:1032: in plot_dendrograms
axis=1, ax=self.ax_col_dendrogram, linkage=col_linkage)
test-env/lib/python3.5/site-packages/seaborn/matrix.py:746: in dendrogram
label=label, rotate=rotate)
test-env/lib/python3.5/site-packages/seaborn/matrix.py:567: in __init__
self.dendrogram = self.calculate_dendrogram()
test-env/lib/python3.5/site-packages/seaborn/matrix.py:644: in calculate_dendrogram
color_threshold=-np.inf)
test-env/lib/python3.5/site-packages/scipy/cluster/hierarchy.py:2296: in dendrogram
is_valid_linkage(Z, throw=True, name='Z')
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
Z = array([], shape=(0, 4), dtype=float64), warning = False, throw = True
name = 'Z'
def is_valid_linkage(Z, warning=False, throw=False, name=None):
"""
Checks the validity of a linkage matrix.
A linkage matrix is valid if it is a two dimensional array (type double)
with :math:`n` rows and 4 columns. The first two columns must contain
indices between 0 and :math:`2n-1`. For a given row ``i``, the following
two expressions have to hold:
.. math::
0 \\leq \\mathtt{Z[i,0]} \\leq i+n-1
0 \\leq Z[i,1] \\leq i+n-1
I.e. a cluster cannot join another cluster unless the cluster being joined
has been generated.
Parameters
----------
Z : array_like
Linkage matrix.
warning : bool, optional
When True, issues a Python warning if the linkage
matrix passed is invalid.
throw : bool, optional
When True, throws a Python exception if the linkage
matrix passed is invalid.
name : str, optional
This string refers to the variable name of the invalid
linkage matrix.
Returns
-------
b : bool
True if the inconsistency matrix is valid.
"""
Z = np.asarray(Z, order='c')
valid = True
name_str = "%r " % name if name else ''
try:
if type(Z) != np.ndarray:
raise TypeError('Passed linkage argument %sis not a valid array.' %
name_str)
if Z.dtype != np.double:
raise TypeError('Linkage matrix %smust contain doubles.' % name_str)
if len(Z.shape) != 2:
raise ValueError('Linkage matrix %smust have shape=2 (i.e. be '
'two-dimensional).' % name_str)
if Z.shape[1] != 4:
raise ValueError('Linkage matrix %smust have 4 columns.' % name_str)
if Z.shape[0] == 0:
> raise ValueError('Linkage must be computed on at least two '
'observations.')
E ValueError: Linkage must be computed on at least two observations.
test-env/lib/python3.5/site-packages/scipy/cluster/hierarchy.py:1459: ValueError
==================== 1 failed, 12 passed in 150.50 seconds =====================
Comments
This is probably going to be difficult to fix.
PR #60 is in to set the seed @jairideout mentioned. It is possible we won't be able to revert it (if say the solution is to just raise an explanatory error, but we'll learn more later).
One possibility is to have the error message print out a histogram of this perfectly bad split -- there is a chance the user specified the wrong column
Bug Description
It appears that it is possible for the visualizer to fail when it finds a "perfectly bad" split of the test and training data.
Expected Behavior
@nbokulich says that this is very unlikely to happen, but we should look into reproducing and raising a different error. If a user runs into this, they should be able to just re-run their command.
Screenshots
Test failure/traceback
Comments
This is probably going to be difficult to fix.
References
It appears
The text was updated successfully, but these errors were encountered: