
Question about pdf outcomes #194

Closed
hviolaphd opened this issue Apr 14, 2023 · 3 comments
Labels
user question User question about a specific dataset

Comments

@hviolaphd

I cannot find any documentation describing the ideal shape of the ELBO plot in the outcomes PDF. I saw that it should "converge" but what should we do if we aren't getting good convergence of train and test curves? Does that mean it didn't converge? How do we find out why and how do we adjust to fix it?

I attached a plot that seems like it looks good (6348) and one that seems like it looks bad (6333). Can you confirm whether my assessment is correct, and perhaps suggest which parameters might need tuning? I can provide more information if needed. This is a single-nucleus sequencing assay.

Thanks,
6348.pdf
6333.pdf

Hannah

@sjfleming
Member

Hi Hannah,

Yes, you're right: there isn't enough documentation on that. Part of the problem is that judging these plots is a bit more subjective than I'd like it to be.

But your example 6348 is a perfect example of a run that looks great. I've hardly ever seen an ELBO plot that looks better! :)

You are also correct that 6333 is an example of a bad run. Somehow during training, some things went a bit "off the rails" in the middle there, and that should not happen. There are two competing interests during training: getting the right answer, and not taking forever to do it. The desire for speed makes us want to push the "learning rate" as high as we can, but eventually we run into problems with accuracy, where stochastic gradient descent cannot find a near-optimal solution, or gets thrown into some local minimum that is sub-optimal.

In CellBender, we've chosen a default learning rate that works for most datasets we've tested on; however, there are datasets like 6333 where the default is too large.

The recommendation would be to reduce the learning rate by including the input argument --learning-rate 5e-5. If that doesn't work, you could try --learning-rate 2e-5 or --learning-rate 1e-5. Hopefully one of those fixes the issue for you.
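For concreteness, a lowered-learning-rate run might look like the sketch below. The input and output filenames here are placeholders (substitute your own), and any other flags you normally pass (e.g. --cuda, --expected-cells) should stay the same; only the learning rate changes.

```shell
# Re-run remove-background with a reduced learning rate.
# File names below are hypothetical examples, not defaults.
cellbender remove-background \
    --input raw_feature_bc_matrix.h5 \
    --output 6333_cellbender_out.h5 \
    --learning-rate 5e-5
```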

The learning curve doesn't need to be as perfect as it looks in 6348. In particular, the train and test ELBO values can be different, and follow different curves. That is not a big problem. But you do want to see both curves generally moving upward, and generally approaching a pretty stable value near the end of training where it "looks like it wouldn't change much if you kept training". The curves can be noisier than they are in 6348, and that's okay. But you don't want to see the kind of wild dips that are present in 6333.
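As a rough rule of thumb, the "generally moving upward, stable near the end, no wild dips" criterion can be checked numerically. The sketch below is my own heuristic, not anything built into CellBender, and the window and tolerance values are arbitrary assumptions you would tune by eye against plots like 6348 and 6333.

```python
def elbo_looks_converged(elbo, window=10, rel_tol=0.01, dip_tol=0.05):
    """Heuristic check on a list of per-epoch ELBO values.

    Returns True if (a) the mean of the last `window` values is within
    `rel_tol` (fractionally) of the mean of the preceding `window` values,
    i.e. the curve has plateaued, and (b) no value ever dropped more than
    `dip_tol` (fractionally) below the running maximum, i.e. no wild dips
    like the ones in 6333.
    """
    if len(elbo) < 2 * window:
        return False  # not enough epochs to judge a plateau

    recent = elbo[-window:]
    prior = elbo[-2 * window:-window]
    mean_recent = sum(recent) / window
    mean_prior = sum(prior) / window
    plateaued = abs(mean_recent - mean_prior) <= rel_tol * abs(mean_recent)

    # Scan for dips well below the best ELBO seen so far.
    run_max = elbo[0]
    for v in elbo:
        if run_max - v > dip_tol * abs(run_max):
            return False  # a sudden "off the rails" dip
        run_max = max(run_max, v)

    return plateaued
```

A smooth curve that rises and flattens passes; the same curve with a single deep dip in the middle fails, even though it ends at the same value.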

I will be including an automated output HTML report in version 0.3.0 that will have some automatic warnings and commentary about the learning curve that will hopefully help users in the future.

Thanks,
Stephen

@sjfleming sjfleming added the user question User question about a specific dataset label Apr 14, 2023
@hviolaphd
Author

Thank you for such a quick and detailed response! I will try again with modified parameters. -Hannah

@sjfleming
Member

New output reports are in v0.3.0, and some fixes in that release should make the learning curves less likely to run into these problems.

Closed by #238
