
How to set the hyperparameters of the AdamW optimizer? How to use global vectors? #68

Open
Yang-Changhui opened this issue Mar 17, 2024 · 4 comments


@Yang-Changhui

When I train on the ICAR-ENSO dataset, I found that the AdamW hyperparameter settings in the paper differ from the settings in the earthformer_enso_v1.yaml file. Which one should be used? And if I want to use global vectors, should I modify those parameters? Thank you.

@gaozhihan
Contributor

Thank you for bringing up this issue. Your observation is correct regarding the default config on ICAR-ENSO, which slightly differs from that of SEVIR, N-body MNIST, and Moving MNIST. We have noticed that using lr=1e-3 and num_global_vectors=8 for training is rather unstable. Therefore, we have released a more stable config for better reproducibility.
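To make the discussion above concrete, here is a minimal numpy sketch of a single decoupled-weight-decay Adam (AdamW) update step. This is only an illustration of how the optimizer's hyperparameters (lr, betas, eps, weight_decay) enter the update; the default values below are common PyTorch-style defaults, not the paper's or the repo's actual settings.

```python
import numpy as np

def adamw_step(param, grad, m, v, t, lr=1e-3, betas=(0.9, 0.999),
               eps=1e-8, weight_decay=1e-2):
    """One AdamW update. Hyperparameter defaults are illustrative only."""
    m = betas[0] * m + (1 - betas[0]) * grad          # first-moment EMA
    v = betas[1] * v + (1 - betas[1]) * grad ** 2     # second-moment EMA
    m_hat = m / (1 - betas[0] ** t)                   # bias correction
    v_hat = v / (1 - betas[1] ** t)
    # Decoupled weight decay: applied directly to the parameter,
    # not folded into the gradient as in classic L2 regularization.
    param = param - lr * (m_hat / (np.sqrt(v_hat) + eps) + weight_decay * param)
    return param, m, v

# One step on a scalar parameter with a positive gradient moves it down.
p, m, v = adamw_step(1.0, 0.5, m=0.0, v=0.0, t=1)
print(p < 1.0)  # True
```

The key point for tuning is that lr scales both the adaptive gradient term and the weight-decay term, so changing lr (e.g., away from 1e-3) also changes the effective regularization strength.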

@Yang-Changhui
Author

Thank you. May I ask: when num_global_vectors=0 in earthformer_enso_v1.yaml, are global vectors not applied? The corresponding metrics should then be those in the first row of the results table [image: results table], correct? And how can I train on the ENSO dataset with global vectors so that I can reproduce the second row of results?

@gaozhihan
Contributor

Thank you for your question. Yes, num_global_vectors=0 means global vectors are not used.
Training Earthformer with global vectors (num_global_vectors=8) on ICAR-ENSO is rather unstable. To help alleviate this, I would recommend initializing some parameters to zero, similar to what we did in PreDiff, which also builds on Earthformer.
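The idea behind the zero-initialization suggestion can be sketched as follows. This is a hypothetical numpy toy, not the actual Earthformer or PreDiff code: a residual branch mixes global-vector information back into local features, and zero-initializing its output projection makes the branch a no-op at step 0, so training starts from the stable no-global-vector behavior.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_global_block(d_model, num_global_vectors, zero_init=True):
    """Hypothetical sketch of a global-vector residual branch:
    x_out = x + mix(x, g) @ W_out. With W_out zero-initialized,
    the block is exactly the identity at initialization."""
    g = rng.standard_normal((num_global_vectors, d_model))       # global vectors
    W_mix = rng.standard_normal((d_model, num_global_vectors))   # local -> global scores
    W_out = (np.zeros((d_model, d_model)) if zero_init
             else rng.standard_normal((d_model, d_model)))       # output projection

    def forward(x):
        scores = x @ W_mix        # (..., num_global_vectors)
        pooled = scores @ g       # mix global info back: (..., d_model)
        return x + pooled @ W_out # residual update
    return forward

block = make_global_block(d_model=16, num_global_vectors=8, zero_init=True)
x = rng.standard_normal((4, 16))
print(np.allclose(block(x), x))  # True: zero-init => identity at init
```

As training proceeds, gradients move W_out away from zero, so the global-vector branch is learned gradually instead of injecting large random contributions at the start.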

@Yang-Changhui
Author

Thank you.
