
Resource estimation #35

Open
henkela opened this issue Nov 22, 2024 · 2 comments
Assignees
Labels
enhancement New feature or request

Comments


henkela commented Nov 22, 2024

Hi,
I've been running olah for a while now and tried pulling larger models such as Llama-70B (relevant files between 5 GB and 17 GB). huggingface-cli stopped at some point with incomplete-message errors. I dug into it and found that the olah server went OOM and was killed.
The VM it was running on was Red Hat 9.4 with 4 vCPUs, 16 GB RAM, and a 500 GB SSD for the local cache. huggingface-cli usually pulls with 8 workers, hence 8 concurrent HTTP requests, and it looks like olah tries to cache all of them in RAM while sending the responses. Maybe the chunk size could be made dynamic with respect to the available RAM?
For now I've increased the VM's memory to 64 GB. You might also want to add some documentation on RAM considerations, or on sizing the caching server in general.
Best
Andreas
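The bounded-memory behavior suggested above can be illustrated with a generator that reads a file in fixed-size chunks instead of loading it fully into RAM before responding. This is a minimal sketch, not olah's actual code; `stream_file` and `CHUNK_SIZE` are hypothetical names chosen for this example:

```python
import os
import tempfile

CHUNK_SIZE = 8 * 1024 * 1024  # 8 MiB per read; peak memory stays bounded


def stream_file(path, chunk_size=CHUNK_SIZE):
    """Yield a file's contents chunk by chunk. A web framework can send
    each yielded chunk to the client before the next one is read, so RAM
    use is ~chunk_size per request rather than the full file size."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk


# Demo: a 1 MiB file streamed in 256 KiB chunks
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * (1024 * 1024))
    path = tmp.name

chunks = list(stream_file(path, chunk_size=256 * 1024))
print(len(chunks), sum(len(c) for c in chunks))  # 4 chunks, 1048576 bytes total
os.remove(path)
```

With 8 concurrent workers, a scheme like this would cap buffering at roughly 8 × chunk_size instead of 8 × file size.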


henkela commented Nov 22, 2024

Actually, for a single huggingface-cli download of Llama-70B to succeed I needed even more than 64 GB RAM on the olah server: it failed while downloading 8 files of 17.4 GB each. I have now set it to 128 GB.
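A back-of-envelope calculation shows why even 128 GB could be borderline, assuming (as observed above) that each in-flight file is buffered fully in RAM:

```python
# Worst-case RAM estimate if every in-flight file is fully buffered.
# Figures come from the report above; the "fully buffered" model is an
# assumption based on the observed behavior, not confirmed olah internals.
workers = 8        # huggingface-cli default concurrency
file_gb = 17.4     # largest shard size reported
peak_gb = workers * file_gb
print(round(peak_gb, 1))  # 139.2 GB worst case, above even 128 GB
```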

@jstzwj jstzwj self-assigned this Nov 24, 2024
@jstzwj jstzwj added the enhancement New feature or request label Dec 2, 2024
@carlsonp

I've also seen some OOM issues; increasing the available RAM seemed to help.

Projects
None yet
Development

No branches or pull requests

3 participants