got Out of GPU memory when learning #202

Open
A-Cepheus opened this issue Sep 8, 2023 · 6 comments

Comments

@A-Cepheus

image
The first iteration seems to complete, but an out-of-memory (OOM) error occurs during the second iteration. Any idea why?

@A-Cepheus
Author

Maybe I should keep reducing the batch size?

@jonathan-laurent
Owner

You should indeed probably reduce the batch size.

@A-Cepheus
Author

image
Now I get a new error.

@jonathan-laurent
Owner

Out of memory errors are often shown as other errors. I would reduce the batch size and/or network size even further.
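To illustrate the "reduce it even further" advice, here is a minimal, language-agnostic sketch written in Python. It is not part of this project's API: `largest_fitting_batch_size` and `fake_step` are hypothetical names, and the memory limit is simulated. Since out-of-memory failures may surface as other errors, a real version would probably need to catch more than one exception type.

```python
from typing import Callable


def largest_fitting_batch_size(
    try_step: Callable[[int], None], start: int, minimum: int = 1
) -> int:
    """Halve the batch size until `try_step` stops raising MemoryError."""
    batch_size = start
    while batch_size >= minimum:
        try:
            try_step(batch_size)  # attempt one training step at this size
            return batch_size
        except MemoryError:
            batch_size //= 2  # back off and retry with half the batch
    raise MemoryError(f"no batch size >= {minimum} fits in memory")


def fake_step(batch_size: int, limit: int = 300) -> None:
    # Hypothetical stand-in for a training step: pretends that the device
    # can hold at most `limit` samples at once.
    if batch_size > limit:
        raise MemoryError("simulated out-of-memory")


chosen = largest_fitting_batch_size(fake_step, start=1024)
print(chosen)
```

Starting from 1024, the simulated step fails at 1024 and 512 and then succeeds at 256, which is the value the sketch settles on.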

@A-Cepheus
Author

I think the OOM problem does indeed appear as the size of mem_buff increases.

@A-Cepheus
Author

A-Cepheus commented Oct 2, 2023

This is a possible cause that I am looking into: FluxML/FluxTraining.jl#148
