WIP: Keep results on CPU to avoid OOM #148

cvl01 · 2024-11-07T09:57:25Z

To fix out-of-memory (OOM) errors with large audio files, I changed the code so that the large tensors are kept on the CPU instead of the GPU.

This implementation of course introduces a slight amount of overhead because of CPU-GPU transfers. It shouldn't be too much though.
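For illustration only, here is a minimal sketch of the general idea, not the actual diff in this PR. It assumes the audio is processed in chunks; `run_chunked_on_cpu`, the stand-in `Linear` model, and the chunk shapes are all hypothetical. Each chunk is moved to the GPU, processed, and its output is moved back to the CPU straight away, so GPU memory use stays roughly constant while the accumulated result lives in RAM.

```python
import torch


def run_chunked_on_cpu(model, chunks, device):
    """Run `model` chunk by chunk, keeping every output on the CPU."""
    outputs = []
    with torch.no_grad():
        for chunk in chunks:
            # Only the current chunk is moved to the accelerator.
            out = model(chunk.to(device))
            # Move the result to host memory right away so GPU memory
            # does not grow with the length of the audio.
            outputs.append(out.cpu())
    # The concatenated result is a CPU tensor.
    return torch.cat(outputs, dim=0)


# Hypothetical stand-ins: a tiny model and random "feature" chunks.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(80, 16).to(device).eval()
chunks = torch.randn(10, 32, 80).unbind(0)  # 10 chunks of shape (32, 80)
result = run_chunked_on_cpu(model, chunks, device)
```

The extra cost is one device-to-host copy per chunk, which is the transfer overhead mentioned above.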

Please let me know what you think. If you agree with this approach, it could be extended to the other model architectures too.

Fixes #120, #109 and other similar OOM issues.

cvl01 · 2024-11-07T10:06:02Z

Note that large audio files will of course need a lot of RAM. As most people have more RAM than VRAM available, I don't think this will be a big issue.

Using the updated code, I was able to transcribe a 4-hour-long file successfully on a setup with an RTX 4090 (24 GB VRAM) and 64 GB RAM.

@Ylw2014 could you test this and share your feedback?

Keep results on CPU to avoid OOM

4e4f026
