rSLDS fitting changes significantly each time I initialize #167

Open
juliagorman opened this issue May 13, 2024 · 4 comments

Comments

@juliagorman

Even if I set up my SSM model with gaussian_id emissions as follows, I get very different results each time I run the code:

rslds = ssm.SLDS(D_obs, K, D_latent, transitions="recurrent_only", dynamics="gaussian", emissions="gaussian_id", single_subspace=True)

Is there a way to reduce this?
Below are the results from two separate runs of the same code:
[Image: download-1]
[Image: download]

@slinderman
Collaborator

Thanks for the question, @juliagorman. The rSLDS inference algorithm has to solve a nonconvex, combinatorial optimization problem with many local optima, so it's not surprising that you would get different answers from one run to the next. One way to mitigate this issue is to use a heuristic to initialize the model. I believe the SSM default initialization is to first estimate the continuous latent states using PCA (or in the case of gaussian_id emissions, just initialize x to the observations), then fit an ARHMM to initialize the discrete states.
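Because the objective has many local optima, one common complement to a good initialization is to run several seeded restarts and keep the fit with the highest final ELBO. The sketch below is library-agnostic and is not SSM's own mechanism: `fit_fn` is a hypothetical user-supplied callable that would seed the random number generator, build and fit a fresh model, and return the fitted model together with its final ELBO. The toy function and its numbers are made up purely to show the call pattern.

```python
from typing import Any, Callable, Iterable, List, Tuple


def best_of_restarts(
    fit_fn: Callable[[int], Tuple[Any, float]],
    seeds: Iterable[int],
) -> Tuple[int, Any, float, List[float]]:
    """Run fit_fn once per seed and keep the run with the highest final ELBO.

    fit_fn is a hypothetical callable (an assumption, not an SSM function):
    it should seed the RNG with the given seed, fit a fresh model, and
    return (fitted_model, final_elbo).
    """
    best = None
    all_elbos: List[float] = []
    for seed in seeds:
        model, elbo = fit_fn(seed)
        all_elbos.append(elbo)
        # Keep the first run, then any run that strictly improves the ELBO.
        if best is None or elbo > best[2]:
            best = (seed, model, elbo)
    if best is None:
        raise ValueError("seeds must contain at least one value")
    return best[0], best[1], best[2], all_elbos


# Stand-in for a real fit; the ELBO values are invented for illustration.
def toy_fit(seed: int) -> Tuple[str, float]:
    fake_elbos = {0: -120.0, 1: -95.5, 2: -101.2}
    return f"model-{seed}", fake_elbos[seed]


best_seed, best_model, best_elbo, elbos = best_of_restarts(toy_fit, [0, 1, 2])
print(best_seed, best_elbo)
```

In a real `fit_fn`, fixing the seed before constructing the model should make each individual run reproducible, assuming the library draws its random numbers from NumPy's global generator (worth verifying for your installed version). The spread of `elbos` also gives a rough sense of how sensitive the fit is to initialization.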

Your example may be especially problematic, as it appears that the true latents follow a linear dynamical system (equivalently, an rSLDS with one discrete state), but you're fitting the rSLDS with K=5 states. I would recommend doing some model selection (e.g., based on held-out ELBOs), which would presumably show that K=1 is best.
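A minimal sketch of such a held-out sweep over K follows. It is an illustration under assumptions, not SSM code: `score_fn` is a hypothetical callable that would fit a model with `k` states on the training trials using the given seed and return the ELBO (or likelihood) on the held-out trials. Averaging over several seeds per K helps separate the effect of K from run-to-run initialization noise. The toy scoring function is invented so the example runs on its own.

```python
from statistics import mean, stdev
from typing import Callable, Dict, List, Sequence, Tuple


def split_trials(trials: Sequence, held_out_every: int = 5) -> Tuple[List, List]:
    """Hold out every n-th trial (indices 0, n, 2n, ...); the rest is training data."""
    train = [t for i, t in enumerate(trials) if i % held_out_every != 0]
    test = [t for i, t in enumerate(trials) if i % held_out_every == 0]
    return train, test


def sweep_k(
    score_fn: Callable[[int, int, List, List], float],
    candidate_ks: Sequence[int],
    seeds: Sequence[int],
    train: List,
    test: List,
) -> Dict[int, Tuple[float, float]]:
    """Return {K: (mean, std)} of the held-out score across seeds."""
    summary = {}
    for k in candidate_ks:
        scores = [score_fn(k, seed, train, test) for seed in seeds]
        spread = stdev(scores) if len(scores) > 1 else 0.0
        summary[k] = (mean(scores), spread)
    return summary


# Invented stand-in: pretends K=1 is the true model and that the seed adds noise.
def toy_score(k: int, seed: int, train: List, test: List) -> float:
    return -10.0 * abs(k - 1) - float(seed) - float(len(test))


trials = list(range(10))  # placeholder for a list of trial arrays
train, test = split_trials(trials, held_out_every=5)
summary = sweep_k(toy_score, [1, 2, 3], [0, 1, 2], train, test)
best_k = max(summary, key=lambda k: summary[k][0])
print(summary, best_k)
```

If the standard deviation across seeds is comparable to the difference between neighbouring values of K, the sweep cannot distinguish them, and more restarts or more held-out data would be needed.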

@juliagorman
Author

Hello,

Thank you for your response. I generated these circular dynamics with some noise so I could experiment with some of the parameters before testing on my own data. Cross-validation with ELBOs showed me that K=7 was best, but next I will try the explained-variance approach you described in your response to another question. When I use my own data in the future, are there any parameters that will help with this problem? Performing parameter sweeps is currently challenging, since I can't tell whether a change comes from a different initialization or from the parameter affecting the model. Also, I don't know if this is useful information, but when using my own data I am currently performing dimensionality reduction first and then fitting the model on the latent trajectories. Thank you for the help!

@sabinary98

Hi, and thank you for the previous explanation.
Building on @juliagorman's question, I'm curious about one specific aspect: do you have any advice on how to set up an rSLDS with gaussian_id emissions when we don't want to specify the number of states K beforehand? Is there a way to let the model infer K dynamically, or would you recommend an alternative approach for cases where the optimal number of states isn't clear from the start?

Thanks in advance for any guidance you can provide, and I really appreciate the resources shared so far!

@bantin
Collaborator

bantin commented Jan 28, 2025

Hi @sabinary98 -- Scott can chime in here, but I don't believe SSM has a way of fitting K (doing so is actually quite difficult). The standard way of choosing K would be held-out cross-validation: look at the likelihood (or ELBO) on some data that the model hasn't seen. Typically this value should start to saturate, or even decrease, as K increases. Does this help?
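One way to turn "start to saturate" into a concrete rule is to pick the smallest K whose mean held-out score is within one spread of the best score. This is a one-standard-error-style heuristic chosen here for illustration; it is an assumption, not something SSM implements or the thread prescribes. The sketch expects a hypothetical summary of the form `{K: (mean_score, spread)}`, and the numbers below are made up.

```python
from typing import Dict, Tuple


def smallest_k_near_best(summary: Dict[int, Tuple[float, float]]) -> int:
    """Pick the smallest K whose mean held-out score is within one spread of the best.

    summary maps K -> (mean held-out score, spread across runs); higher is better.
    """
    if not summary:
        raise ValueError("summary must contain at least one K")
    best_k = max(summary, key=lambda k: summary[k][0])
    best_mean, best_spread = summary[best_k]
    threshold = best_mean - best_spread
    # Among all K that are "close enough" to the best, prefer the simplest model.
    return min(k for k, (m, _) in summary.items() if m >= threshold)


# Invented held-out ELBO summary: large gains up to K=3, then a plateau.
example_summary = {
    1: (-500.0, 5.0),
    2: (-310.0, 6.0),
    3: (-302.0, 7.0),
    4: (-300.0, 8.0),
    5: (-304.0, 9.0),
}
chosen_k = smallest_k_near_best(example_summary)
print(chosen_k)
```

Here the best mean is at K=4, but K=3 lies within one spread of it, so the simpler model is preferred.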
