Multihead replay finetuning converges more slowly than regular training #626
Unanswered
lucasdekam
asked this question in
Q&A
-
Hello, can you please share the log files for the two training runs so I can help you? For example, I need to look at the initial loss to see whether there is a potential problem.
-
Hi, I'd like to share my experience so far with multihead finetuning and ask for ideas.
I'm training MACE on a dataset of ~80 platinum-water interface structures with ~400 atoms each; energies and forces are computed with VASP using the RPBE functional. I've tried two methods: multihead replay finetuning (starting from the mace-mp0b agnesi small model) and naive finetuning (starting from the standard small model). The training parameters can be found below. The multihead finetuning converges much more slowly, and the model struggles to bring the errors on the replayed data (pt_head) back down. I've not been able to get the force error much below 100 meV/Å, although perhaps this could be achieved by longer training.
Because I wanted the forces to converge faster, I increased the forces weight by a factor of 10, which gave this result. With equal weights, the forces converge even more slowly.
On the other hand, naive finetuning converges quickly to a rather low force RMSE. For me, the resulting model also seems very stable, so there's no "catastrophic forgetting" (at least not that I've noticed).
I'm still interested in using multihead training, as it might improve the generalizability of my model. My question: what could cause the slow convergence of multihead training? Is this already known? Which parameters can one tune to achieve better convergence (should I increase the forces weight even more, etc.)? I'd be happy to hear about any insights :)
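To make the forces-weight question concrete, here is a minimal sketch of how a weighted energy/forces loss responds to that weight. It is a simplified illustration for a single structure, not MACE's actual loss implementation; the function name `weighted_loss` and its arguments are hypothetical, and I'm assuming a plain squared-error form with energies in eV and forces in eV/Å.

```python
def weighted_loss(e_pred, e_ref, f_pred, f_ref, n_atoms,
                  energy_weight=1.0, forces_weight=1.0):
    """Simplified weighted energy/forces loss for one structure.

    Illustrative only: assumes a squared error on the energy per atom
    plus a mean squared error over all force components.
    """
    # Squared error of the energy per atom
    energy_term = ((e_pred - e_ref) / n_atoms) ** 2

    # Mean squared error over every Cartesian force component
    sq_diffs = [
        (fp - fr) ** 2
        for atom_pred, atom_ref in zip(f_pred, f_ref)
        for fp, fr in zip(atom_pred, atom_ref)
    ]
    forces_term = sum(sq_diffs) / len(sq_diffs)

    return energy_weight * energy_term + forces_weight * forces_term


# Toy two-atom example (made-up numbers, only to show the scaling)
f_pred = [[0.1, 0.0, 0.0], [0.0, 0.0, 0.0]]
f_ref = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]

equal = weighted_loss(-10.0, -10.2, f_pred, f_ref, n_atoms=2)
boosted = weighted_loss(-10.0, -10.2, f_pred, f_ref, n_atoms=2,
                        forces_weight=10.0)
print(equal, boosted)
```

In this form, raising the forces weight only rescales the force contribution relative to the energy term. It does not change how the loss is shared between heads, so it may not address slow convergence on the replayed data by itself.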
Training parameters:
Multihead
Naive