RTX 4090 finetune with CUDA_OUT_OF_memory #18

Closed
clpoz opened this issue Oct 30, 2024 · 2 comments
Comments


clpoz commented Oct 30, 2024

As the title says, I think fine-tuning should be fine with 24 GB of CUDA memory: when I start fine-tuning it only takes about 18 GB, but about 3 hours later it fails with a CUDA out-of-memory error. Does anyone have an idea why?


clpoz commented Oct 30, 2024

OK, I found it: after running a validation pass and saving a checkpoint, VRAM usage grows by about 3 GB and never drops back. How can that happen, and which settings should I change to fix it?
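
A quick way to confirm where the extra allocation appears (a sketch, assuming this is a PyTorch / Hugging Face Trainer run; the callback class name is illustrative, not part of this repo):

```python
import torch
from transformers import TrainerCallback

class VRAMLogCallback(TrainerCallback):  # illustrative name
    def _log(self, tag):
        # allocated = memory held by live tensors; reserved = what PyTorch's
        # caching allocator holds from the driver and does not return automatically.
        alloc = torch.cuda.memory_allocated() / 2**30
        reserved = torch.cuda.memory_reserved() / 2**30
        print(f"{tag}: allocated={alloc:.2f} GiB, reserved={reserved:.2f} GiB")

    def on_evaluate(self, args, state, control, **kwargs):
        self._log("after eval")

    def on_save(self, args, state, control, **kwargs):
        self._log("after save")
```

Registering it with `trainer.add_callback(VRAMLogCallback())` before training should show whether the jump comes from the validation pass or from checkpoint saving.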

little51 (Contributor) commented

Try reducing block_size, or lowering per_device_train_batch_size and per_device_eval_batch_size.
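
If the script builds its configuration with Hugging Face TrainingArguments, a minimal sketch of lowering those values (the output path and accumulation step count are illustrative, not this repo's defaults):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./finetune-output",   # hypothetical output path
    per_device_train_batch_size=1,    # smaller training batches lower peak VRAM
    per_device_eval_batch_size=1,     # the validation pass allocates VRAM too
    gradient_accumulation_steps=8,    # keep the effective batch size roughly the same
)
```

block_size (the maximum token length of each training example) is usually a data/preprocessing argument rather than a TrainingArguments field; reducing it also shrinks activation memory during both training and evaluation.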

clpoz closed this as completed Jan 7, 2025