
Memory issue when try to build customised model #326

Closed
CWYuan08 opened this issue Jan 10, 2023 · 3 comments

Comments

@CWYuan08

Hi, I encountered this error when trying to run bonito basecaller to build our own model:

RuntimeError: CUDA out of memory. Tried to allocate 1.95 GiB (GPU 0; 15.74 GiB total capacity; 2.37 GiB already allocated; 1.70 GiB free; 2.42 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

I don't really understand this message. Should I add max_split_size_mb to the command? Thank you very much!

Best,
CW
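A note on the question above: PYTORCH_CUDA_ALLOC_CONF is an environment variable, not a command-line flag, and PyTorch's caching allocator reads it when CUDA is first initialised. So it has to be exported in the shell that launches bonito, or set in Python before torch is imported. A minimal sketch (the value 100 is just an example, not a recommendation):

```python
import os

# Must be set before torch initialises its CUDA caching allocator --
# either `export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:100` in the
# shell before running bonito, or in Python before importing torch.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:100"

# import torch  # only import torch *after* the variable is set
```

Note that max_split_size_mb only helps when the failure is caused by fragmentation (reserved memory much larger than allocated memory); it cannot create memory that other processes are already holding.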

@CWYuan08
Author

Hi, I have set export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:100 and export CUDA_LAUNCH_BLOCKING=1, but I am still getting "RuntimeError: CUDA error: out of memory". Do I need to change anything in the serialization.py file?

Thank you!

@CWYuan08
Author

The GPU we are trying to run it on is an NVIDIA A4000 with 16 GB of VRAM.
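The figures in the original error message are worth a quick sanity check against that 16 GB card. Plugging them in (the numbers below come from the traceback above; the interpretation that the remainder was held outside PyTorch, e.g. by another process or a stale previous run, is a plausible reading rather than something confirmed in this thread):

```python
# Figures reported by the CUDA OOM error, in GiB
total     = 15.74  # total GPU capacity
allocated = 2.37   # memory in live PyTorch tensors
reserved  = 2.42   # PyTorch's caching-allocator pool (includes allocated)
free      = 1.70   # memory still free on the device
request   = 1.95   # size of the allocation that failed

# Memory unaccounted for by PyTorch's pool or free space:
outside_pytorch = total - reserved - free
print(f"{outside_pytorch:.2f} GiB held outside PyTorch")  # about 11.62 GiB
```

Since reserved (2.42 GiB) is barely larger than allocated (2.37 GiB), fragmentation is not the problem here, which is consistent with max_split_size_mb not helping; most of the card's memory was simply unavailable to this process.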

@CWYuan08
Author

This post: #247 solved my problem.
