Add `mean_squared_error` functional metric #515
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,4 @@ | ||
import warnings | ||
from enum import Enum | ||
from functools import partial | ||
from typing import Optional | ||
|
@@ -41,6 +42,52 @@ | |
assert_never(multioutput_enum) | ||
|
||
|
||
def mse_with_missing_handling(y_true: ArrayLike, y_pred: ArrayLike, multioutput: str = "joint") -> ArrayLike: | ||
We should discuss naming and the fact that sklearn's
We can name it
If we have both We could remove sklearn's version; we haven't mentioned it in our public documentation (but have it in I'm also a bit worried that by replacing sklearn's mse with our mse we changed the list of available kwargs.
||
"""Mean squared error with missing values handling. | ||
|
||
`Wikipedia entry on the Mean squared error | ||
<https://en.wikipedia.org/wiki/Mean_squared_error>`_ | ||
I don't think it is a good idea to reference Wikipedia. Better to replace it with a more reputable source (e.g. Hyndman Forecasting) or to remove it completely.
Here I just repeated what we have done with
||
|
||
The nans are ignored during computation. | ||
It would be helpful to also note what is returned if all values in the segment are NaN.
||
|
||
Parameters | ||
---------- | ||
y_true: | ||
array-like of shape (n_samples,) or (n_samples, n_outputs) | ||
|
||
Ground truth (correct) target values. | ||
|
||
y_pred: | ||
array-like of shape (n_samples,) or (n_samples, n_outputs) | ||
|
||
Estimated target values. | ||
|
||
multioutput: | ||
Defines aggregating of multiple output values | ||
(see :py:class:`~etna.metrics.functional_metrics.FunctionalMetricMultioutput`). | ||
|
||
Returns | ||
------- | ||
: | ||
A non-negative floating point value (the best value is 0.0), or an array of floating point values, | ||
one for each individual target. | ||
""" | ||
y_true_array, y_pred_array = np.asarray(y_true), np.asarray(y_pred) | ||
|
||
if len(y_true_array.shape) != len(y_pred_array.shape): | ||
raise ValueError("Shapes of the labels must be the same") | ||
|
||
axis = _get_axis_by_multioutput(multioutput) | ||
with warnings.catch_warnings(): | ||
# this helps to prevent warning in case of all nans | ||
warnings.filterwarnings( | ||
message="Mean of empty slice", | ||
action="ignore", | ||
) | ||
result = np.nanmean((y_true_array - y_pred_array) ** 2, axis=axis) | ||
return result | ||
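On the all-NaN question raised in the review, a minimal standalone sketch may help. It mirrors the `np.nanmean` call from the diff but is not the PR's code: `nan_aware_mse` is a hypothetical name, and it takes `axis` directly instead of going through `_get_axis_by_multioutput`, whose mapping is not shown here. The sketch assumes `axis=None` corresponds to a joint aggregate and `axis=0` to one value per output.

```python
import warnings

import numpy as np


def nan_aware_mse(y_true, y_pred, axis=None):
    """Hypothetical helper: MSE that skips NaN entries.

    axis=None averages over all entries; axis=0 gives one value per column.
    """
    y_true_array = np.asarray(y_true, dtype=float)
    y_pred_array = np.asarray(y_pred, dtype=float)
    with warnings.catch_warnings():
        # np.nanmean warns with "Mean of empty slice" when a slice is all NaN
        warnings.filterwarnings(action="ignore", message="Mean of empty slice")
        return np.nanmean((y_true_array - y_pred_array) ** 2, axis=axis)


# The first output has one missing value, the second is missing entirely.
y_true = np.array([[1.0, np.nan], [2.0, np.nan], [np.nan, np.nan]])
y_pred = np.array([[1.0, 1.0], [4.0, 2.0], [3.0, 3.0]])

per_output = nan_aware_mse(y_true, y_pred, axis=0)
joint = nan_aware_mse(y_true, y_pred)

print(per_output)  # first entry is 2.0, second is nan (all-NaN column)
print(joint)       # 2.0, since only the two non-NaN squared errors are averaged
```

With this approach an all-NaN slice silently yields NaN, which is probably worth stating in the docstring.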
|
||
|
||
def mape(y_true: ArrayLike, y_pred: ArrayLike, eps: float = 1e-15, multioutput: str = "joint") -> ArrayLike: | ||
"""Mean absolute percentage error. | ||
|
||
|
We should consider setting a plain string here instead of an enum value.

You suggest using just "per-segment"? I suppose we should do this in all other places too?
Why do you think we should consider that?
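For context on the string-versus-enum question, here is a small sketch of how a `str`-based `Enum` treats plain strings. `AggregationMode` and `resolve_mode` are hypothetical names chosen for illustration and are not taken from this PR. Only the `"per-segment"` value comes from the comment above.

```python
from enum import Enum


class AggregationMode(str, Enum):
    """Hypothetical enum that also subclasses str."""

    per_segment = "per-segment"
    macro = "macro"


def resolve_mode(mode):
    """Accept either a plain string or an enum member and return the member."""
    return AggregationMode(mode)


# A plain string and the enum member resolve to the same object.
from_string = resolve_mode("per-segment")
from_member = resolve_mode(AggregationMode.per_segment)
print(from_string is from_member)

# Because of the str mixin, the member compares equal to the plain string.
print(AggregationMode.per_segment == "per-segment")

# An unknown string is rejected with ValueError.
try:
    resolve_mode("per_segment")
except ValueError as error:
    print(f"rejected: {error}")
```

Under these assumptions, a plain string default stays interchangeable with the enum member as long as the function normalizes its argument through the enum constructor.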