Hi there, this looks like a great package. I'm particularly interested in the ability to fit LRMs to datasets with missing data (or in my case, outliers that need to be masked). I have a quick question that may be pretty basic, but an answer would help me apply the code to my own data. Apologies if I've missed something in the documentation. I'm also fairly new to Julia.
If I fit a PCA model to a set of training data A (following your example):
loss = QuadLoss()
r = ZeroReg()
n_comp = 1
glrm = GLRM(A,loss,r,r,n_comp)
X,Y,ch = fit!(glrm)
how do I then apply the same model to a new set of data B? I would like to keep X fixed and obtain new values Y_b that give the best fit of X to B. That is, I would like to project the observations in B onto the PCA components found from A.
There are other PCA packages in Julia that will do this (e.g., the reconstruct function in MultivariateStats), but they don't seem to be able to handle missing data or sparse arrays.
Thanks in advance! Any help is appreciated!
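I want to first clarify the intent of the question. Let's say A is a matrix (or DataFrame/sparse matrix) of m rows by n columns. The GLRM (assuming real-valued or boolean-valued data for simplicity) produces a matrix X of m rows by k columns, and a matrix Y of k rows by n columns, where k is the rank. (As far as I know, LowRankModels actually returns X transposed, as k rows by m columns, so that A is approximated by X'Y; that is why the code below indexes both X and Y by column.)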
It sounds like you have another dataset B, of size p rows by n columns. B's projection onto the PCA components from A would be a matrix of size p rows by k columns. In PCA with no missing values and centered data, this would be a matrix multiplication (B * Y' * <a diagonal matrix>). However, that shortcut does not carry over to GLRMs in general, because the formula is only correct for a quadratic (least-squares) loss function with every entry observed.
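To spell out where that matrix product comes from, here is a short derivation for the quadratic-loss, fully observed case only. It writes the projection as X_b with p rows and k columns (the untransposed convention used in the description above) and assumes Y has full row rank; the symbol X_b is just notation for this sketch.

```latex
\begin{aligned}
X_b &= \arg\min_{X \in \mathbb{R}^{p \times k}} \; \lVert B - X Y \rVert_F^2 \\
    &= B \, Y^{\top} \left( Y Y^{\top} \right)^{-1}
\end{aligned}
```

When the rows of Y are mutually orthogonal, as PCA components are, Y Y' is diagonal, which is the diagonal matrix mentioned above. With missing entries or a non-quadratic loss there is no such closed form over the whole matrix, so the fit has to be done numerically.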
With LowRankModels, the easiest way to do this is to fit another GLRM while holding Y constant. You can do this like so:
loss = QuadLoss() # Or whatever loss you chose before
r_x = ZeroReg() # Or whichever regularizer you desired on X
r_y = [FixedLatentFeaturesConstraint(Y[:, i]) for i=1:size(Y, 2)]
n_comp = 1
glrm_b = GLRM(B, loss, r_x, r_y, n_comp)
X_b, Y, ch = fit!(glrm_b)
If you want to calculate a new Y matrix instead of a new X matrix, keep r_y as whatever you used for r, and define r_x = [FixedLatentFeaturesConstraint(X[:, i]) for i=1:size(X, 2)].
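As a sanity check on what "holding Y constant" means, the quadratic-loss case can be worked by hand: with Y fixed, each row of B is an independent least-squares problem over its observed columns. The sketch below uses only the LinearAlgebra standard library. The helper `project_rows` and the `observed` mask are hypothetical names for illustration, not part of LowRankModels, and this is not how the package itself solves the problem. It assumes each row has at least k observed entries.

```julia
using LinearAlgebra

# Hypothetical helper (not part of LowRankModels): fit each row of B to a
# fixed k×n factor Y under quadratic loss, using only the observed entries.
# `observed` has the same size as B; `false` marks masked or missing entries.
function project_rows(B::AbstractMatrix, Y::AbstractMatrix,
                      observed::AbstractMatrix{Bool})
    p, n = size(B)
    k = size(Y, 1)
    @assert size(Y, 2) == n "Y must have one column per column of B"
    Xb = zeros(k, p)                    # stored k×p, so that B ≈ Xb' * Y
    for i in 1:p
        obs = findall(observed[i, :])   # columns observed in row i
        # Least-squares solution of Y[:, obs]' * x ≈ B[i, obs]
        Xb[:, i] = permutedims(Y[:, obs]) \ B[i, obs]
    end
    return Xb
end

# Toy data: one component, three columns, one masked entry (stored as NaN).
Y = [1.0 2.0 3.0]
B = [2.0 4.0 6.0;
     1.0 NaN 3.0]
observed = .!isnan.(B)
Xb = project_rows(B, Y, observed)
```

For a quadratic loss and no regularization on X, the X_b returned by the fixed-Y GLRM above should approximately agree with this direct solve, up to the solver's convergence tolerance. For any other loss the GLRM fit is the one to trust.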