
Implementation of #129: Allow more tuning of the model hyperparameters. #130

Open · wants to merge 2 commits into base: develop

Conversation

sandervh14
Contributor

====> Do not merge yet! See below for explanation. <=====

Story Title

#129 Allow more tuning of the model hyperparameters.

Changes made

Altered the constructors of both LogisticRegressionModel and LinearRegressionModel to support overriding the default hyperparameters by passing kwargs.
Documented the change in the docstrings of both models.
See the issue description for a more extensive explanation of why one would want this.
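A minimal sketch of what the described change might look like (the class name is taken from this PR; the wrapped attribute name and the defaults shown are assumptions for illustration, not necessarily Cobra's actual values):

```python
from sklearn.linear_model import LogisticRegression

class LogisticRegressionModel:
    """Wrapper whose constructor now accepts hyperparameter overrides via kwargs."""

    def __init__(self, **kwargs):
        # Assumed defaults for illustration; caller-supplied kwargs take precedence.
        model_kwargs = dict(fit_intercept=True, solver="liblinear")
        model_kwargs.update(kwargs)
        self.logit = LogisticRegression(**model_kwargs)

# e.g. override the inverse regularization strength:
model = LogisticRegressionModel(C=0.1)
```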

**Do not merge this with the develop branch yet.** The unit tests have not been adapted yet. This was a speedy commit of an idea I was playing with in a notebook while investigating how to further tune the models for a client. I'll take the time next week to adapt the unit tests.

How does the solution address the problem

This PR allows a data scientist to pass argument overrides to the constructor of the scikit-learn model used behind the scenes in Cobra, in order to tune a model's performance further than Cobra's default settings allow.

Linked issues

Resolves #129

@sandervh14 sandervh14 linked an issue Apr 8, 2022 that may be closed by this pull request
@sandervh14 sandervh14 changed the base branch from master to develop April 8, 2022 08:55
@sandervh14 sandervh14 requested a review from sborms April 8, 2022 08:55
@sandervh14 sandervh14 self-assigned this Apr 8, 2022
@sandervh14 sandervh14 added the enhancement New feature or request label Apr 8, 2022
@sandervh14 sandervh14 added this to the v1.2.0 milestone Apr 8, 2022
@sborms
Contributor

sborms commented Apr 13, 2022

Looks good Sander! Code is clear to me.

Two remarks:

  • We have a get_intercept() function in the model classes - what will the result be if no intercept is fitted? This is a barely used helper function, but better to show a graceful error if there is no intercept to retrieve.
  • Agree that for linear models the extra parameters are already taken care of by Cobra, but let's allow for the flexibility. Maybe add a caveat in one of the classes (so at least we have it written down somewhere).

Comment on lines 292 to 296
default_kwargs = dict(fit_intercept=True)
for kwarg, val in default_kwargs.items():
    if kwarg not in kwargs:
        kwargs[kwarg] = val
self.linear = LinearRegression(**kwargs)
Contributor

This for loop can be replaced by a one-liner using .update() (see https://www.programiz.com/python-programming/methods/dictionary/update). In that case, default_kwargs is best renamed to something like model_kwargs.

Suggested change
-default_kwargs = dict(fit_intercept=True)
-for kwarg, val in default_kwargs.items():
-    if kwarg not in kwargs:
-        kwargs[kwarg] = val
-self.linear = LinearRegression(**kwargs)
+model_kwargs = dict(fit_intercept=True)
+model_kwargs.update(kwargs)
+self.linear = LinearRegression(**model_kwargs)
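For illustration, the .update() pattern gives caller-supplied kwargs precedence over the defaults while letting any extra keywords pass through untouched; a minimal standalone sketch (the helper name is hypothetical):

```python
# Minimal demonstration of the suggested .update() pattern: caller kwargs
# override the defaults, and unknown extras pass through unchanged.
def build_model_kwargs(**kwargs):
    model_kwargs = dict(fit_intercept=True)  # assumed default
    model_kwargs.update(kwargs)              # caller overrides win
    return model_kwargs

print(build_model_kwargs())
# {'fit_intercept': True}
print(build_model_kwargs(fit_intercept=False, n_jobs=2))
# {'fit_intercept': False, 'n_jobs': 2}
```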

Contributor Author

Nice Jano! I was going to commit it straight away via the GitHub interface, but then it wouldn't be applied to LogisticRegression as well. I'll have a look at it this afternoon and will include Sam's remarks.

Contributor

An idea to resolve duplicate code (and documentation to some extent) is to create a BaseModel class from which both the LinearRegression and LogisticRegression classes inherit.
But this is of course out of scope for this PR and should be considered in a separate PR.
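One possible shape for that refactoring, purely illustrative (class names, attribute names, and defaults are assumptions, not Cobra's actual API):

```python
from sklearn.linear_model import LinearRegression, LogisticRegression

class BaseModel:
    """Shared constructor plumbing for both model wrappers."""

    # Subclasses set the wrapped estimator class and its assumed defaults.
    estimator_cls = None
    default_kwargs = {}

    def __init__(self, **kwargs):
        model_kwargs = dict(self.default_kwargs)
        model_kwargs.update(kwargs)  # caller overrides win
        self.model = self.estimator_cls(**model_kwargs)

class LinearRegressionModel(BaseModel):
    estimator_cls = LinearRegression
    default_kwargs = {"fit_intercept": True}

class LogisticRegressionModel(BaseModel):
    estimator_cls = LogisticRegression
    default_kwargs = {"fit_intercept": True}
```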

Contributor Author

An idea to resolve duplicate code (and documentation to some extent) is to create a BaseModel class from which both the LinearRegression and LogisticRegression class inherit.

I remembered this comment while fixing #126; I also found quite some duplication there, and since I had the time, I've done this today along with the solution to #126, yippee! :-). For details, see the explanation in #128 (comment).

But this is of course out of scope for this PR and should be considered in a separate PR.

I took the liberty of including the superclassing abstraction in #126 anyway, instead of opening a new issue & PR dedicated to it, since the evaluate() I was fixing unit tests for takes up a BIG chunk of the code of both LinearRegressionModel and LogisticRegressionModel.

@sandervh14
Contributor Author

Hey Sam:

  • Agree that for linear models the extra parameters are already taken care of by Cobra, but let's allow for the flexibility.

You mean only linear regression right? Do you agree with me that for logistic regression it's certainly a good idea? (most of all for tuning regularization C?)

@sborms
Contributor

sborms commented Apr 15, 2022

Hey Sam:

  • Agree that for linear models the extra parameters are already taken care of by Cobra, but let's allow for the flexibility.

You mean only linear regression right? Do you agree with me that for logistic regression it's certainly a good idea? (most of all for tuning regularization C?)

Indeed, linear regression. It's definitely a good idea; it's fine to have the flexibility for both logistic & linear regression.
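As a concrete illustration of the use case discussed, tuning C (the inverse regularization strength) is exactly the kind of override the kwargs pass-through enables; plain scikit-learn, independent of Cobra, with assumed defaults:

```python
from sklearn.linear_model import LogisticRegression

# Assumed Cobra-style defaults; the caller tightens regularization via C.
defaults = dict(fit_intercept=True)
overrides = dict(C=0.1)  # smaller C = stronger regularization
defaults.update(overrides)
clf = LogisticRegression(**defaults)
print(clf.C)  # 0.1
```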

@sandervh14
Contributor Author

sandervh14 commented Jun 1, 2022

Done. Advice: merge #126 first, before this one! I think that'll make merging easier.

@sandervh14 sandervh14 modified the milestones: 2023-03, 2023-04 Apr 7, 2023
@sandervh14 sandervh14 modified the milestones: 2023-04, 2023-05 May 24, 2023