Is it possible to use GPU? #141
Comments
Yes, but I need to use a GPU-compatible backend. Backends are very easy to build: all you have to do is look at the llamacpp backend example, create another folder in the backends folder, and implement another backend using a library that supports GPU. I have one using Hugging Face transformers, but it requires a beast of a PC, since it doesn't run quantized models and the smallest of the non-quantized models weighs 24 GB.
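For context, here is a minimal sketch of what such a transformers-based GPU backend involves. This is not the repo's actual code; the model name is a placeholder, and fp16 already halves memory use versus fp32:

```python
# Sketch: loading an unquantized model on GPU with Hugging Face transformers.
# "some-org/some-7b-model" is a placeholder, not a real checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "some-org/some-7b-model"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16  # fp16 roughly halves the footprint
).to("cuda")

inputs = tokenizer("Hello, how are you?", return_tensors="pt").to("cuda")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```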
How does oobabooga do it, then? With their repo you can cap the amount of VRAM used: for instance, you have 12 GB but allocate only 10 GB of GPU VRAM, and the rest spills into CPU RAM or swap. That's how I was able to load larger models there, but it's not reliable; sometimes models take twice as much RAM as they occupy on disk. Either way, I'm looking forward to GPU usage, because generating is way faster :) than on CPU.
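That kind of VRAM cap with CPU spill-over is what transformers' accelerate-backed offloading provides; a sketch of the mechanism follows, where the 10 GiB / 32 GiB caps and the model name are placeholders:

```python
# Sketch: cap GPU 0 at 10 GiB and offload remaining layers to CPU RAM.
# Requires the accelerate package; all values are illustrative only.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "some-org/some-13b-model",                 # placeholder model name
    device_map="auto",                         # let accelerate place layers
    max_memory={0: "10GiB", "cpu": "32GiB"},   # per-device memory caps
)
```

Note that layers pushed to CPU still execute on CPU, so this trades speed for fitting the model at all; the cap only keeps the GPU from running out of memory.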
Yes, they have a file that contains all that juicy loading logic, but I don't want to copy their code. Someone could create a repo with the backend adaptation and I can add it as a possible backend to the UI. There are literally three functions to implement.
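The actual interface isn't spelled out in this thread, so the shape below is a guess: a hypothetical three-function backend skeleton whose names (load, generate, unload) are assumptions, not the repo's real API:

```python
# Hypothetical backend skeleton -- function names and signatures are
# assumptions for illustration, not this project's actual interface.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

_model = None
_tokenizer = None

def load(model_id: str) -> None:
    """Load tokenizer and model, placing the weights on the GPU."""
    global _model, _tokenizer
    _tokenizer = AutoTokenizer.from_pretrained(model_id)
    _model = AutoModelForCausalLM.from_pretrained(
        model_id, torch_dtype=torch.float16
    ).to("cuda")

def generate(prompt: str, max_new_tokens: int = 128) -> str:
    """Run one generation pass and return the decoded completion."""
    inputs = _tokenizer(prompt, return_tensors="pt").to("cuda")
    output_ids = _model.generate(**inputs, max_new_tokens=max_new_tokens)
    return _tokenizer.decode(output_ids[0], skip_special_tokens=True)

def unload() -> None:
    """Drop references and release cached GPU memory."""
    global _model, _tokenizer
    _model = None
    _tokenizer = None
    torch.cuda.empty_cache()
```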
I really wish I had the skills or knew how to do this, because I believe it would make these tasks much faster. As of now, my computer specs are: However, even with these specs, I'm still experiencing delays and hang-ups in many responses, which is a bit frustrating.
Well, there is always time to start gaining knowledge, learning new things, and contributing to the community: Google, ChatGPT, build, test, and commit.
It seems I missed all the fun, but I'm having eye problems and am still in the middle of recovering. On oobabooga: I had seen their code and tested it, and I have to say it had many issues getting it to run, because of the max memory size on certain computers and GPUs. Yes, they have a GPU and a non-GPU version, but it's pretty good. I may need to pause for a moment until my eyes have recovered; sorry for the inconvenience, it came at the wrong time. The code I placed in the previous issues is just a backup: if anyone wants it early for HTTPS and proxy support, I attached the code there, but the frontend .js still needs some work, and attention is needed to add that HTTPS and proxy handling. If we didn't use socketio, the fast way would be https://pypi.org/project/waitress/, which would be easy for small or medium-sized requests; since we are using socketio, we have to serve it with socketio's own server instead (a sketch follows below). [This was before the latest backend change.]
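A minimal sketch of the two serving approaches being contrasted, assuming a small Flask app; the host and port are illustrative:

```python
# Sketch: serving a Flask app with waitress (plain WSGI) versus the
# Flask-SocketIO runner. App, host, and port are illustrative.
from flask import Flask
from flask_socketio import SocketIO
from waitress import serve

app = Flask(__name__)
socketio = SocketIO(app)

USE_SOCKETIO = True  # this project uses socketio, so WebSockets are needed

if __name__ == "__main__":
    if USE_SOCKETIO:
        # Flask-SocketIO brings its own server loop (backed by eventlet
        # or gevent when installed) so WebSocket connections work.
        socketio.run(app, host="0.0.0.0", port=8080)
    else:
        # Plain WSGI: waitress is a simple production server, fine for
        # small or medium request loads.
        serve(app, host="0.0.0.0", port=8080)
```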