Rachaelj #68

Open · narnij wants to merge 25 commits into master
Conversation

@narnij (Contributor) commented Feb 9, 2021

Upload .py versions of linear regression, Lasso, and met2gcp.

for j in range(Ytrain.shape[1]):
    x = linear_model.Lasso(alpha=a).fit(Xtrain, Ytrain[:, j])
    mdl.append(x)
ypred = mdl[i].predict(Xtest)
@ScottCampit (Member) commented Feb 9, 2021

I would move this code into the inner loop and predict with x instead. The reason is that if you evaluate with mdl[i], you're not evaluating all the models you just trained; if you're on the first alpha, you'll only evaluate one model, mdl[0].
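A minimal sketch of that fix, assuming the surrounding outer loop over alphas (not shown in the diff) provides a, and that mdl and error are lists initialized before the loops:

    from sklearn import linear_model
    from sklearn.metrics import mean_squared_error

    for j in range(Ytrain.shape[1]):
        x = linear_model.Lasso(alpha=a).fit(Xtrain, Ytrain[:, j])
        mdl.append(x)
        # Predict with the model just trained for target j, not mdl[i]
        ypred = x.predict(Xtest)
        error.append(mean_squared_error(Ytest[:, j], ypred))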

x = linear_model.Lasso(alpha=a).fit(Xtrain, Ytrain[:, j])
mdl.append(x)
ypred = mdl[i].predict(Xtest)
error.append(mean_squared_error(ypred, Ytest[:, j]))
@ScottCampit (Member):

Same thing here for the same exact reason.

@ScottCampit (Member):

Then, to save on memory, write some code to update your model list. The logic is the following (a rough sketch follows the list):

  1. Evaluate models with first alpha
  2. On the next iteration, save your models and errors in another object
  3. Write code that will see if the MSE in the current alpha level is less than the MSE in the previous alpha level
  4. If it is, replace the model object in your original list, and update the alpha associated with this model
  5. Otherwise, don't update and continue
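A rough sketch of that update logic; best_models, best_errors, and best_alphas are hypothetical names introduced here, and the alphas/Xtrain/Ytrain/Xtest/Ytest names are assumed from the code above:

    import numpy as np
    from sklearn import linear_model
    from sklearn.metrics import mean_squared_error

    n_targets = Ytrain.shape[1]
    best_models = [None] * n_targets    # best model per target variable
    best_errors = [np.inf] * n_targets  # lowest MSE seen so far per target
    best_alphas = [None] * n_targets    # alpha associated with each best model

    for a in alphas:
        for j in range(n_targets):
            x = linear_model.Lasso(alpha=a).fit(Xtrain, Ytrain[:, j])
            mse = mean_squared_error(Ytest[:, j], x.predict(Xtest))
            # Only keep the new model if this alpha improves on the previous best
            if mse < best_errors[j]:
                best_models[j] = x
                best_errors[j] = mse
                best_alphas[j] = a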

print(error)
minmse_index = error.index(min(error))
best_alpha = alphas[minmse_index]
print(best_alpha)
@ScottCampit (Member):

This is why all of your models have a single alpha value. But remember - each model corresponding to each target variable should have a unique "best" alpha. The comments above should resolve this.


GCP2MET_models = []
for i in range(Ytrain.shape[1]):
    mdl_G2M = linear_model.Lasso(alpha=best_alpha).fit(Xtrain, Ytrain[:, i])
@ScottCampit (Member):

In the case described above, you will need to run this code with the argument alpha=best_alpha[i] to get the correct values.
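For example, using the hypothetical per-target list from the earlier sketch (called best_alphas there, best_alpha[i] in the comment above):

    GCP2MET_models = []
    for i in range(Ytrain.shape[1]):
        # Use the alpha selected for target i rather than one global value
        mdl_G2M = linear_model.Lasso(alpha=best_alphas[i]).fit(Xtrain, Ytrain[:, i])
        GCP2MET_models.append(mdl_G2M)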

# In[7]:


kf = KFold(n_splits=3)
@ScottCampit (Member):

So I had you run hold-out previously just to get the concept down. In practice, you don't have to run hold-out; you can just report the results from k-fold CV.
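As one way to do that, a sketch using scikit-learn's cross_val_score with the kf object above (best_alphas is the hypothetical per-target list from the earlier sketch, and linear_model is assumed imported):

    from sklearn.model_selection import cross_val_score

    # Report cross-validated MSE per target instead of a separate hold-out split
    for i in range(Ytrain.shape[1]):
        scores = cross_val_score(
            linear_model.Lasso(alpha=best_alphas[i]),
            Xtrain, Ytrain[:, i],
            cv=kf, scoring="neg_mean_squared_error",
        )
        print(f"target {i}: mean CV MSE = {-scores.mean():.4f}")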


r, p_value = pearsonr(Ypred, Ytest[:, i])
# Compute P-value based on r-value
pval = norm.sf(abs(r))*2
@ScottCampit (Member):

Ah, this may be where things are going wrong. I would run this code once you get a vector of r values out of the loop. I think this will return different values because your distribution will be different. I may be wrong, but it's worth a shot.
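A sketch of that reading, with r_values as a hypothetical accumulator and GCP2MET_models as the per-target models from above; the p-value formula is applied to the whole vector after the loop:

    import numpy as np
    from scipy.stats import pearsonr, norm

    r_values = []
    for i in range(Ytest.shape[1]):
        Ypred = GCP2MET_models[i].predict(Xtest)
        r, _ = pearsonr(Ypred, Ytest[:, i])
        r_values.append(r)

    # Compute p-values from the full vector of r values outside the loop
    pvals = norm.sf(np.abs(np.array(r_values))) * 2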

@ScottCampit (Member) left a comment

I just made inline comments on your linear regression and LASSO code. Take a look and let me know if you have any questions; if you want to pitch additional ideas, I'm happy to hear those too.
