
How to train model using mixed precision fp16? #3

Open
Bhavay-2001 opened this issue Jul 30, 2023 · 2 comments

@Bhavay-2001

Hello author, I tried training with the original DiT model but ran into an out-of-memory issue. I found your repository, which implements DiT with reduced memory requirements. The README mentions a mixed_precision argument, but I couldn't find it anywhere in the code. I'd like to copy the model architecture file and adapt it to my own implementation. Could you please clarify which model architecture uses less memory? It's a bit confusing to me.

@chuanyangjin
Owner

Hello @Bhavay-2001,

The "--mixed_precision" argument is automatically taken care of by the accelerate library. Therefore, there's no direct reference to it in the code.

Gradient checkpointing significantly reduces memory usage. You can find the relevant implementation in models.py, lines 233-237 and line 251.
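The pattern there is the standard torch.utils.checkpoint one; a minimal sketch with a placeholder module (not the repo's actual DiT blocks):

```python
import torch
from torch.utils.checkpoint import checkpoint

class Blocks(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.blocks = torch.nn.ModuleList(torch.nn.Linear(64, 64) for _ in range(4))

    def forward(self, x):
        for block in self.blocks:
            # Instead of block(x): don't store this block's activations on the
            # forward pass; recompute them during backward (compute for memory).
            x = checkpoint(block, x, use_reentrant=False)
        return x

x = torch.randn(8, 64, requires_grad=True)
Blocks()(x).sum().backward()
```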

@Bhavay-2001
Author

Hey, thanks for the reply.

I'm trying your model implementation for medical image anomaly detection on the BraTS21 dataset. Currently the input image size is 96 and patch_size (another parameter) is 48. If I run with these configs, I get an out-of-memory error.

However, if I reduce the input image and patch sizes, the code runs well for a few epochs until it crashes with an error caused by the changed patch_size. I believe the code would run fine with the default settings (image_size = 96, patch_size = 48). How can I convert the model to fp16? Any ideas?
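To make the question concrete, what I'm after is roughly the PyTorch-native AMP pattern (a minimal sketch with a placeholder model, not this repo's code):

```python
import torch
from torch.cuda.amp import GradScaler, autocast

model = torch.nn.Linear(16, 1).cuda()  # placeholder for the DiT model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
scaler = GradScaler()

for _ in range(10):
    x = torch.randn(8, 16, device="cuda")
    y = torch.randn(8, 1, device="cuda")
    optimizer.zero_grad()
    with autocast():  # run the forward pass in fp16 where safe
        loss = torch.nn.functional.mse_loss(model(x), y)
    scaler.scale(loss).backward()  # scale the loss to avoid fp16 underflow
    scaler.step(optimizer)
    scaler.update()
```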

Thanks
