the sum of explained_variance_ratio_ is not 1 in python #32

Open
SabriQ opened this issue Jan 17, 2021 · 4 comments

@SabriQ

SabriQ commented Jan 17, 2021

Hi. Is it normal for the sum of all explained_variance_ratio_ values in dPCA (Python) to be much less than 1 (about 0.5)? If not, where might I have made a mistake?

@wielandbrendel
Collaborator

Sure, if the number of components is much smaller than the data dimension, then you cannot explain all the variance in your data. Same as in PCA.
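
For the plain PCA case this is easy to check numerically. The snippet below is only a minimal sketch on synthetic data (the array shapes and variable names are made up for illustration, and it uses a NumPy SVD rather than the dPCA package): the ratios of the first few components sum to less than 1, and only the full set sums to 1.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 20))   # synthetic data: 200 samples, 20 features
Xc = X - X.mean(axis=0)          # center each feature

# Squared singular values are proportional to the variance along each principal axis.
s = np.linalg.svd(Xc, compute_uv=False)
ratio = s**2 / np.sum(s**2)      # per-component explained variance ratio

sum_first_5 = ratio[:5].sum()    # keep only 5 of the 20 components
sum_all = ratio.sum()            # keep all 20 components
print(f"first 5 components: {sum_first_5:.3f}, all 20 components: {sum_all:.3f}")
```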

@SabriQ SabriQ closed this as completed Jan 26, 2021
@SabriQ SabriQ reopened this Mar 19, 2021
@SabriQ
Author

SabriQ commented Mar 19, 2021

Sorry to disturb you again. I have two questions. First, could replacing the NaN values in the matrix with 0 lead to a much lower explained_variance_ratio_? I found that the dPCA explained_variance_ratio_ of the first 10 components is much smaller than in PCA (~1% versus ~70% in my case). I'm wondering whether this is caused by the way I replaced the NaN values.
Second, is the explained variance ratio calculated differently in the MATLAB and Python code? They organize the matrix in different dimension orders.

@SabriQ
Author

SabriQ commented Mar 27, 2021

It might be due to the big difference between dimensions.

@sean-metzger

I think the explained variance ratio in the Python code is wrong.

Looking at the Python code, it only uses the variance of the projection of the original data onto each decoder dimension to calculate the explained variance ratio. While this would technically work for PCA, since the encoder and decoder matrices are the same and each vector has a 2-norm of 1, that's not true here, so it doesn't work.

I think the original MATLAB code had it right: you have to reproject using the encoding matrix, then calculate the variance explained.
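
To illustrate the difference, here is a rough sketch on synthetic data. It is not the code from either the Python or the MATLAB package: `projection_ratio`, `reconstruction_ratio`, and the matrices `D` (decoder) and `F` (encoder) are hypothetical names, and the non-orthonormal pair is constructed artificially by rescaling PCA axes so that `F @ D.T` gives the same reconstruction.

```python
import numpy as np

rng = np.random.default_rng(0)
n_features, n_samples, k = 10, 500, 3

# Synthetic data (features x samples) with unequal variance per feature, then centered.
X = rng.normal(size=(n_features, n_samples)) * np.arange(1, n_features + 1)[:, None]
X -= X.mean(axis=1, keepdims=True)

def projection_ratio(D, X):
    """Variance of the data projected onto the decoder axes, over total variance."""
    Z = D.T @ X
    return np.sum(Z**2) / np.sum(X**2)

def reconstruction_ratio(F, D, X):
    """1 - residual / total, after reprojecting with the encoder F."""
    X_hat = F @ (D.T @ X)
    return 1.0 - np.sum((X - X_hat)**2) / np.sum(X**2)

# PCA case: encoder == decoder, with orthonormal columns.
U = np.linalg.svd(X, full_matrices=False)[0][:, :k]
proj_pca = projection_ratio(U, X)
recon_pca = reconstruction_ratio(U, U, X)

# Assumed non-orthonormal pair: halve the decoder, double the encoder.
# F @ D.T is unchanged, so the reconstruction is identical to the PCA one.
D, F = 0.5 * U, 2.0 * U
proj_scaled = projection_ratio(D, X)
recon_scaled = reconstruction_ratio(F, D, X)

print(f"PCA:    projection={proj_pca:.3f}  reconstruction={recon_pca:.3f}")
print(f"scaled: projection={proj_scaled:.3f}  reconstruction={recon_scaled:.3f}")
```

In this toy setup the two measures agree for PCA, but once the decoder vectors no longer have unit norm the projection-based number shrinks while the reconstruction-based one stays the same.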


3 participants