Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix NaNs in --quick-test scores #78

Merged
merged 12 commits into from
Oct 26, 2017
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -6,3 +6,4 @@ rampwf.egg-info
.coverage
coverage.xml
ramp_workflow.egg-info
.cache
7 changes: 3 additions & 4 deletions rampwf/score_types/detection/base.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
from __future__ import division

from ..base import BaseScoreType
from .util import _filter_y_pred


class DetectionBaseScoreType(BaseScoreType):
Expand All @@ -16,10 +17,8 @@ class implements `detection_score`.
def __call__(self, y_true, y_pred, conf_threshold=None):
if conf_threshold is None:
conf_threshold = self.conf_threshold
y_pred_above_confidence = [
[detected_object[1:] for detected_object in single_detection
if detected_object[0] > conf_threshold]
for single_detection in y_pred]
y_pred_above_confidence = _filter_y_pred(y_pred, conf_threshold)

return self.detection_score(y_true, y_pred_above_confidence)


Expand Down
19 changes: 15 additions & 4 deletions rampwf/score_types/detection/precision_recall.py
Original file line number Diff line number Diff line change
Expand Up @@ -155,7 +155,13 @@ def mad_radius(y_true, y_pred, matches=None, iou_threshold=0.5,
loc_true, loc_pred = _locate_matches(
y_true, y_pred, matches, iou_threshold=iou_threshold)

return np.abs((loc_pred[:, 2] - loc_true[:, 2]) / loc_true[:, 2]).mean()
_, _, rad_true = loc_true.T
_, _, rad_pred = loc_pred.T
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this are actually arrays, instead of unpacking I would just index (rad_true = loc_true[:, 2])

(but good clean-up to give it names!)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it create a view instead of a copy ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think both are views (the slicing certainly)


if not rad_true:
return np.inf

return np.abs((rad_pred - rad_true) / rad_true).mean()


def mad_center(y_true, y_pred, matches=None, iou_threshold=0.5,
Expand Down Expand Up @@ -189,7 +195,12 @@ def mad_center(y_true, y_pred, matches=None, iou_threshold=0.5,
loc_true, loc_pred = _locate_matches(
y_true, y_pred, matches, iou_threshold=iou_threshold)

d = np.sqrt((loc_pred[:, 0] - loc_true[:, 0]) ** 2 + (
loc_pred[:, 1] - loc_true[:, 1]) ** 2)
x_true, y_true, rad_true = loc_true.T
x_pred, y_pred, _ = loc_pred.T

if not x_true:
return np.inf
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering a bit, is this better than NaN ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just wanted to conform to the definition of the score and its 'maximum value'.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, but starting from that definition: [0 - np.inf] is in principle the range of valid scores (the lower the bigger the difference in radius, you go towards infinity), while here there is no actual value to calculate. It is just that this score is not defined when there are no matches. And not defined is not necessarily the same as np.inf

But anyhow, doesn't really matter, as both mean there is something really wrong with the model :-)


d = np.sqrt((x_pred - x_true) ** 2 + (y_pred - y_true) ** 2)

return np.abs(d / loc_true[:, 2]).mean()
return np.abs(d / rad_true).mean()
21 changes: 18 additions & 3 deletions rampwf/score_types/detection/util.py
Original file line number Diff line number Diff line change
Expand Up @@ -168,7 +168,22 @@ def _locate_matches(y_true, y_pred, matches, iou_threshold=0.5):


def _filter_y_pred(y_pred, conf_threshold):
return [[detected_object[1:] for detected_object in y_pred_patch
"""
Given a list of list of predicted craters return those
with a confidence value above given threshold

Parameters
----------
y_pred : list of list of tuples
conf_threshold : float

Returns
-------
y_pred_filtered : list of list of tuples

"""
return [[detected_object[1:]
for detected_object in y_pred_patch
if detected_object[0] > conf_threshold]
for y_pred_patch in y_pred]

Expand All @@ -182,7 +197,7 @@ def mask_detection_curve(y_true, y_pred, conf_thresholds):
y_true : list of list of tuples
Tuples are of form (x, y, radius).
y_pred : list of list of tuples
Tuples are of form (x, y, radius, confidence).
Tuples are of form (confidence, x, y, radius).
conf_thresholds : array-like
The confidence threshold for which to calculate the
precision and recall.
Expand Down Expand Up @@ -212,7 +227,7 @@ def ospa_curve(y_true, y_pred, conf_thresholds):
y_true : list of list of tuples
Tuples are of form (x, y, radius).
y_pred : list of list of tuples
Tuples are of form (x, y, radius, confidence).
Tuples are of form (confidence, x, y, radius).
conf_thresholds : array-like
The confidence threshold for which to calculate the
precision and recall.
Expand Down
40 changes: 21 additions & 19 deletions rampwf/utils/testing.py
Original file line number Diff line number Diff line change
Expand Up @@ -106,9 +106,10 @@ def assert_submission(ramp_kit_dir='.', ramp_data_dir='.',
X_train, y_train, X_test, y_test = assert_data(ramp_kit_dir, ramp_data_dir)
cv = assert_cv(ramp_kit_dir, ramp_data_dir)
score_types = assert_score_types(ramp_kit_dir)
print('Training {}/submissions/{} ...'.format(
ramp_kit_dir, submission))

module_path = join(ramp_kit_dir, 'submissions', submission)
print('Training {} ...'.format(module_path))

train_train_scoress = np.empty((len(cv), len(score_types)))
train_valid_scoress = np.empty((len(cv), len(score_types)))
test_scoress = np.empty((len(cv), len(score_types)))
Expand Down Expand Up @@ -164,23 +165,24 @@ def assert_submission(ramp_kit_dir='.', ramp_data_dir='.',
score_type.name, round(score, score_type.precision)))

print('----------------------------')
means = train_train_scoress.mean(axis=0)
stds = train_train_scoress.std(axis=0)
for mean, std, score_type in zip(means, stds, score_types):
print('train {} = {} ± {}'.format(
score_type.name, round(mean, score_type.precision),
round(std, score_type.precision + 1)))

means = train_valid_scoress.mean(axis=0)
stds = train_valid_scoress.std(axis=0)
for mean, std, score_type in zip(means, stds, score_types):
print('valid {} = {} ± {}'.format(
score_type.name, round(mean, score_type.precision),
round(std, score_type.precision + 1)))
_print_result(train_train_scoress, score_types, 'train')
_print_result(train_valid_scoress, score_types, 'valid')
_print_result(test_scoress, score_types, 'test')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice clean-up!



means = test_scoress.mean(axis=0)
stds = test_scoress.std(axis=0)
def _print_result(scores, score_types, step):
means = scores.mean(axis=0)
stds = scores.std(axis=0)
for mean, std, score_type in zip(means, stds, score_types):
print('test {} = {} ± {}'.format(
score_type.name, round(mean, score_type.precision),
round(std, score_type.precision + 1)))
# If std is a NaN
if std != std:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this needed?
It feels a bit a way to hide bugs, as the std can never really be NaN when using actual data?

Copy link
Member

@jorisvandenbossche jorisvandenbossche Oct 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(ah no, it is NaN if your scores are exactly the same, eg all 0 because of using a dummy model ?)

result = '{step} {name} = {val}'.format(
step=step, name=score_type.name, val=mean)
else:
result = '{step} {name} = {val} ± {std}'.format(
step=step,
name=score_type.name,
val=round(mean, score_type.precision),
std=round(std, score_type.precision + 1))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the plus one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not change a thing of what was there.
I suppose it is to have 'precision' digits after the coma.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see, it is how it was before.
I would say, as you are changing, clean it up to remove the '+1'. But I suppose the original author had a reason for it ;-)

(I personally think a standard deviation with such a high precision does not make much sense)

print(result)