WIP: Keep results on CPU to avoid OOM #148

Open
wants to merge 1 commit into base: main

Conversation

cvl01 (Contributor) commented Nov 7, 2024

To fix OOM issues with large audio files, I made some changes so that the large result tensors are kept on the CPU rather than on the GPU.

This implementation of course introduces a small amount of overhead from the CPU-GPU transfers, but it shouldn't be too much.
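
For reference, a minimal sketch of the idea (assuming a PyTorch model and chunked inference; `model`, `chunks`, and `transcribe_chunks` are illustrative names, not the actual code in this PR):

```python
import torch

def transcribe_chunks(model, chunks, device="cuda"):
    """Run each chunk on the GPU but accumulate the results on the CPU."""
    results = []
    with torch.no_grad():
        for chunk in chunks:
            out = model(chunk.to(device))  # forward pass on the GPU
            results.append(out.cpu())      # move the (large) output to RAM right away
    # Concatenating the full result happens on the CPU, so VRAM usage
    # stays bounded by a single chunk instead of the whole file.
    return torch.cat(results, dim=0)
```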

Please let me know what you think. If you agree with this approach, it could be extended to the other model architectures too.

Fixes #120, #109 and other similar OOM issues.

cvl01 (Contributor, Author) commented Nov 7, 2024

It should be noted that large audio files of course require a lot of RAM. Since most people have more RAM than VRAM available, I don't think this will be a big issue.

Using the updated code, I was able to successfully transcribe a 4-hour-long file on a setup with an RTX 4090 (24 GB VRAM) and 64 GB of RAM.

@Ylw2014 could you perhaps test this and share your feedback?

Successfully merging this pull request may close these issues.

Killed for the long recording