Replies: 4 comments 7 replies
-
I should also add that I am using the stress ModelBuilder, if that is an issue in this case.
-
@tgmaxson can you post the actual OOM error? You may not be running OOM on model evaluation but elsewhere…
-
For Allegro, due to its locality, this is possible, but it is much more complicated and not something we support. I've been writing some helpers for unit tests that you might be able to adapt to do this with CPU offloading, but it's unfortunately not a priority on my end… let me know if you'd want to try.
Our approach to multi-GPU is data-parallel, so if a single example won't fit on one GPU, it will unfortunately not help. Practically, and for the moment, have you checked whether you can get away with a smaller …
-
OK, I found it @tgmaxson !! 🎉 🎉 It's … Check out the Caveats:
(If you really feel like it, or have issues, you could benchmark a comparison with …)
-
I have a 512-atom system that runs out of memory even with a batch size of 1 on A100 GPUs. Is there any way to split the structure into multiple parts for evaluation, so that I effectively do a 0.5 batch? My understanding is that the NN is evaluated with respect to each atom rather than the structure as a whole, so this seems possible in theory, but maybe it is not implemented? Is there anything else I can do in this case other than evaluating on the CPU?
Solutions involving multiple GPUs could work as well, but I think that is not available right now.
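The per-atom-evaluation idea above can be sketched for any strictly local (finite-cutoff) model: partition the atoms into chunks, carry along a cutoff-width buffer of neighbors for each chunk, and sum per-atom energies only over the owned atoms. This is a minimal toy sketch (a hypothetical pairwise `peratom_energy`, not Allegro's actual architecture or API) showing that chunked evaluation reproduces the full-system energy:

```python
import numpy as np

def peratom_energy(i, pos, cutoff=3.0):
    # Toy local model: the energy of atom i depends only on
    # neighbors strictly within the cutoff radius.
    d = np.linalg.norm(pos - pos[i], axis=1)
    mask = (d > 0.0) & (d < cutoff)
    return float(np.sum(np.exp(-d[mask])))

def total_energy(pos, cutoff=3.0):
    # Full-system evaluation: sum of per-atom energies.
    return sum(peratom_energy(i, pos, cutoff) for i in range(len(pos)))

def chunked_energy(pos, n_chunks=4, cutoff=3.0):
    # Evaluate the structure in pieces. Each pass only needs the chunk's
    # "owned" atoms plus a buffer of atoms within the cutoff, so each pass
    # could in principle run with a smaller device memory footprint.
    total = 0.0
    for owned in np.array_split(np.arange(len(pos)), n_chunks):
        # Buffer: keep every atom within the cutoff of an owned atom so
        # that owned atoms see their complete neighbor environments.
        dists = np.linalg.norm(pos[:, None] - pos[owned][None], axis=2)
        keep = np.where(dists.min(axis=1) < cutoff)[0]
        sub = pos[keep]
        # Map global indices into the subsystem, evaluate owned atoms only.
        local = {g: l for l, g in enumerate(keep)}
        total += sum(peratom_energy(local[i], sub, cutoff) for i in owned)
    return total
```

Forces would need the same care at the chunk boundaries (contributions to buffer atoms must be scattered back), which is part of why this is more complicated in practice than it looks for energies alone.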