
early nn stuff #89

Draft · wants to merge 10 commits into main

Conversation

@jmsull (Collaborator) commented Jun 15, 2023

No description provided.

jmsull commented Jun 15, 2023

Round 1 of very simple Adam opt plots on cdm at fixed background:

delta and v:
[figure: deltac_learning_v1_multnoise0.1_Adam80_1.0]
[figure: vc_learning_v1_multnoise0.1_Adam80_1.0]

reconstructed delta', v':
[figure: deltacprime_learning_v1_multnoise0.1_Adam80_1.0]
[figure: vcprime_learning_v1_multnoise0.1_Adam80_1.0]
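For reference, the shape of a plain Adam loop like the one used in this round (80 iterations at $\eta = 1.0$) can be sketched in a few lines. This is my stand-in, not the actual training code, and it uses a toy quadratic in place of the real ODE-solve loss:

```python
import numpy as np

def adam(grad, theta0, eta=1.0, n_iters=80, beta1=0.9, beta2=0.999, eps=1e-8):
    """Plain Adam loop; 80 iterations at eta = 1.0 matches the run above."""
    theta = np.array(theta0, dtype=float)
    m, v = np.zeros_like(theta), np.zeros_like(theta)
    for t in range(1, n_iters + 1):
        g = grad(theta)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        mhat = m / (1 - beta1 ** t)   # bias-corrected first moment
        vhat = v / (1 - beta2 ** t)   # bias-corrected second moment
        theta = theta - eta * mhat / (np.sqrt(vhat) + eps)
    return theta

# toy stand-in objective: a quadratic bowl instead of the real ODE-solve loss
target = np.array([3.0, -1.0])
fit = adam(lambda th: 2.0 * (th - target), np.zeros(2), eta=0.05, n_iters=2000)
```

Note that near a minimum, Adam's effective step size stays on the order of $\eta$, which is one reason a large fixed $\eta$ can plateau and a schedule of decreasing rates helps.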

jmsull commented Jun 15, 2023

v and v' look pretty bad; lots of room to improve.

Sorry, the title label on the second-to-last plot is wrong; it should say delta'.

jmsull commented Jun 15, 2023

BTW this is a really long-wavelength mode, $k \sim 0.003$

jmsull commented Jun 16, 2023

Now training with 50 iterations of Adam at $\eta=1$, followed by 50 at $\eta=0.1$, 20 at $\eta=0.01$, and finally 10 iterations of BFGS (default hyperparameters). It looks a little better, especially in the solutions, though maybe not so much in the reconstructions of $u'$.
[figure: deltac_learning_v1_multnoise0.1_Adam50_50_20_1.0_0.1_0.01_bfgs]
[figure: vc_learning_v1_multnoise0.1_Adam50_50_20_1.0_0.1_0.01_bfgs]

Reconstruction:
[figure: deltacprime_learning_v1_multnoise0.1_Adam50_50_20_1.0_0.1_0.01_bfgs]
[figure: vcprime_learning_v1_multnoise0.1_Adam50_50_20_1.0_0.1_0.01_bfgs]
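The staged schedule above (Adam at decreasing rates, then a BFGS polish) can be sketched in numpy. The real run presumably goes through the Julia optimizer stack; this is just the shape of the procedure, with a textbook BFGS and a toy quadratic standing in for the ODE-solve loss:

```python
import numpy as np

def adam_stage(grad, theta, eta, n_iters, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam stage at a fixed learning rate eta."""
    m, v = np.zeros_like(theta), np.zeros_like(theta)
    for t in range(1, n_iters + 1):
        g = grad(theta)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g ** 2
        theta = theta - eta * (m / (1 - beta1 ** t)) / (np.sqrt(v / (1 - beta2 ** t)) + eps)
    return theta

def bfgs(f, grad, x, n_iters=10):
    """Textbook BFGS with backtracking Armijo line search."""
    H = np.eye(x.size)                 # inverse-Hessian approximation
    g = grad(x)
    for _ in range(n_iters):
        p = -H @ g
        t, fx = 1.0, f(x)
        while f(x + t * p) > fx + 1e-4 * t * (g @ p) and t > 1e-12:
            t *= 0.5                   # backtrack until Armijo condition holds
        s = t * p
        x_new = x + s
        g_new = grad(x_new)
        y = g_new - g
        sy = s @ y
        if sy > 1e-12:                 # curvature condition; skip update otherwise
            rho, I = 1.0 / sy, np.eye(x.size)
            H = (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
                + rho * np.outer(s, s)
        x, g = x_new, g_new
    return x

# toy quadratic loss standing in for the ODE-solve loss
c = np.array([2.0, -3.0, 0.5])
f = lambda th: np.sum((th - c) ** 2)
grad = lambda th: 2.0 * (th - c)

theta = np.zeros(3)
for eta, n in [(1.0, 50), (0.1, 50), (0.01, 20)]:   # the Adam schedule above
    theta = adam_stage(grad, theta, eta, n)
theta = bfgs(f, grad, theta, n_iters=10)            # final BFGS polish
```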

jmsull commented Jun 16, 2023

The loss curve:

[figure: loss_learning_v1_multnoise0.1_Adam50_50_20_1.0_0.1_0.01_bfgs]

It looks like maybe BFGS is just starting to turn the loss down? But the BFGS iterations are super expensive (I suppose due to the Hessian approximation, even with forward-mode differentiation, which I assume it is using for that).
We should perhaps run this on something with more oomph than my laptop...
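For scale (my own back-of-envelope numbers, not from a profile): BFGS carries a dense n×n inverse-Hessian approximation, so its per-iteration update is O(n²) in the parameter count versus O(n) for Adam, though in practice the cost of extra gradient/function evaluations in the line search may dominate. For the 37->8->8->2 network mentioned below this is still tiny, but it grows fast with width:

```python
# parameter count for a dense MLP: each layer has fan_in*fan_out weights + fan_out biases
def n_params(layers):
    return sum(a * b + b for a, b in zip(layers, layers[1:]))

small = n_params([37, 8, 8, 2])     # the current network
wide = n_params([37, 64, 64, 2])    # a hypothetical wider variant
# BFGS's dense inverse-Hessian approximation stores n**2 entries
small_hessian, wide_hessian = small ** 2, wide ** 2
```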

jmsull commented Jun 16, 2023

Some other observations:

  • The solutions for $\delta$ and $v$ look way better with more optimization, which is encouraging.
  • Both solutions are badly wrong initially, which is perhaps an implementation error in removing the neutrinos from the ICs for this simplified example? I will check on this.
  • Otherwise, what the optimization is doing makes sense: it focuses on the last part of the evolution because the solution is biggest there, so it can afford to do much worse in the initial part of the evolution. We may want to try some of the scaling tricks we talked about today (which are also in the stiff neural ODE paper), or something hackier.
  • The step-like behavior in the $u'$ function is pretty interesting. Maybe I am not using enough weights here: the input is $u$, which has size 37 in this case, and I am only using a 37->8->8->2 network. Going wider will almost certainly help with this, so I can try that.
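On the scaling point: one standard trick, in the spirit of the scalings discussed in the stiff neural-ODE literature, is to weight the residual by the local magnitude of the target so early times are not drowned out by the late-time growth. A toy numpy illustration with a synthetic growing-mode target (my own stand-in, not the actual loss):

```python
import numpy as np

# toy target: a growing mode, largest at late times (stand-in for delta(a))
a = np.linspace(1e-3, 1.0, 200)
target = a.copy()                                  # delta ~ a in matter domination
pred = target * (1 + 0.1 * np.sin(50 * a))        # ~10% relative error everywhere

raw = (pred - target) ** 2                         # dominated by late times
weighted = ((pred - target) / (np.abs(target) + 1e-3)) ** 2  # relative error: all epochs count

raw_loss, weighted_loss = raw.mean(), weighted.mean()
```

With the raw squared error, the last quarter of the evolution dominates the loss; the magnitude-normalized version spreads the penalty roughly evenly across epochs, so the optimizer can no longer "afford" to be wrong early on.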

jmsull commented Jun 16, 2023

Another thing I'm eager to try is adding more data and batching over $k$, which will be closer to what we want to do eventually...
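Structurally, batching over $k$ just means treating each mode as a row of the training set and taking the loss over a minibatch of modes per step. A minimal sketch of that bookkeeping (the "data" here is a made-up array, shapes only, not physics):

```python
import numpy as np

rng = np.random.default_rng(0)

# toy "data": one solution curve per k mode (hypothetical targets, shapes only)
ks = np.logspace(-3, -1, 16)          # batch of k modes
a = np.linspace(1e-2, 1.0, 100)
data = ks[:, None] * a[None, :]       # shape (n_k, n_a)

def batch_loss(pred, batch_idx):
    """Mean squared error over a minibatch of k modes (rows of the data array)."""
    return np.mean((pred[batch_idx] - data[batch_idx]) ** 2)

# one epoch of minibatching over k, batch size 4
idx = rng.permutation(len(ks))
for start in range(0, len(ks), 4):
    batch = idx[start:start + 4]
    loss = batch_loss(np.zeros_like(data), batch)   # gradient step would go here
```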
