-
Notifications
You must be signed in to change notification settings - Fork 87
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change default action for invalid target data check #4131
Change default action for invalid target data check #4131
Conversation
Codecov Report
@@ Coverage Diff @@
## main #4131 +/- ##
=======================================
+ Coverage 99.7% 99.7% +0.1%
=======================================
Files 349 349
Lines 37740 37752 +12
=======================================
+ Hits 37623 37635 +12
Misses 117 117
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
data_check_name="InvalidTargetDataCheck", | ||
parameters={ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are we removing all the params?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is testing the default action which is now drop rows, and the drop rows action doesn't have any of the parameters required by the impute column action
data_check = InvalidTargetDataCheck("binary", "Log Loss Binary") | ||
data_checks_output = data_check.validate(None, y) | ||
|
||
action_pipeline = make_pipeline_from_data_check_output("binary", data_checks_output) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will be interesting to see how we fit these components or "datacheck pipelines" into our search pipelines!
@@ -82,7 +82,7 @@ def validate(self, X, y): | |||
|
|||
>>> X = pd.DataFrame({"col": [1, 2, 3, 1]}) | |||
>>> y = pd.Series(["cat_1", "cat_2", "cat_1", "cat_2"]) | |||
>>> target_check = InvalidTargetDataCheck("regression", "R2") | |||
>>> target_check = InvalidTargetDataCheck("regression", "R2", null_strategy="impute") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be nice to keep the default behavior in the docstring. Can you file an issue for it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For sure!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Filed here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cheers!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM just some followup comments.
Fixes: #4130
After creating the pull request: in order to pass the release_notes_updated check you will need to update the "Future Release" section of
docs/source/release_notes.rst
to include this pull request by adding :pr:123
.