
Is it possible to use GPU? #141

Open
senkron24 opened this issue May 3, 2023 · 6 comments

@senkron24

Is it possible to use a GPU?

@ParisNeo
Owner

ParisNeo commented May 4, 2023

Yes, but I need to use a GPU-compatible backend. Backends are very easy to build. All you have to do is look at the llamacpp backend example, create another folder in the backends folder, and implement another backend using a library that supports GPU. I have one using Hugging Face transformers, but it requires a beast of a PC because it doesn't run quantized models, and the smallest of the non-quantized models weighs 24 GB.
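(For context, a minimal sketch of what such a transformers GPU load looks like; the model id below is a placeholder, not the one the backend actually uses:)

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "some-org/some-7b-model"  # placeholder, not the real backend model

tokenizer = AutoTokenizer.from_pretrained(model_id)
# Even in fp16 an unquantized model needs ~2 bytes per parameter,
# so a large model easily eats tens of GB of memory.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",  # requires the accelerate package
)
```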

@andzejsp
Contributor

andzejsp commented May 4, 2023

How does oobabooga do it then? With his repo you can cap the amount of VRAM you use; for instance, you have 12 GB but allocate only 10 GB of GPU VRAM, and the rest spills into CPU RAM or swap. That's how I was able to load larger models there, but it's not reliable; sometimes a model takes twice as much RAM as it does on disk. Either way, looking forward to GPU usage because it's waaay faster :) to generate than on CPU
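(The VRAM cap described above corresponds to the `max_memory` argument in transformers/accelerate; this is a generic sketch with a placeholder model id, not oobabooga's actual code:)

```python
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "some-org/some-13b-model",  # placeholder model id
    device_map="auto",
    # Cap GPU 0 at 10 GiB even on a 12 GiB card; layers that
    # don't fit are offloaded to CPU RAM.
    max_memory={0: "10GiB", "cpu": "24GiB"},
)
```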

@ParisNeo
Owner

ParisNeo commented May 4, 2023

Yes, they have a file that contains the loading of all that juicy stuff.

But I don't want to copy their code. Someone can create a repo with the backend adaptation and I can add it as a possible backend to the ui.

There are literally three functions to implement.
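(A rough sketch of what such a backend adapter could look like; the class and method names below are illustrative guesses, not the repo's actual interface — the real signatures come from the llamacpp backend example:)

```python
# Illustrative only: mirror the llamacpp backend in the backends folder.
class GPUBackend:
    def __init__(self, config: dict) -> None:
        # Load the model onto the GPU with your library of choice.
        self.config = config
        self.model = None

    def tokenize(self, prompt: str) -> list[int]:
        # Text -> token ids for the loaded model.
        raise NotImplementedError

    def generate(self, prompt: str, n_predict: int = 128, callback=None) -> None:
        # Produce tokens and stream each one back to the UI via callback.
        raise NotImplementedError
```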

@senkron24
Author

I really wish I had the skills or knew how to do this because I believe it would make these tasks much faster. As of now, my computer specs are:
CPU: 13th Gen Intel(R) Core(TM) i7-13700F 2.10 GHz
RAM: 32.0 GB
GPU: 1080 TI

However, even with these specs, I'm still experiencing delays and hang-ups in many responses, which is a bit frustrating.

@andzejsp
Contributor

andzejsp commented May 4, 2023

Well, there is always time to start gaining knowledge, learning new stuff, and contributing to the community. Google, ChatGPT, build, test, and commit.

@chongy076
Contributor

chongy076 commented May 5, 2023

It seems I missed all the fun, but I am having eye problems and am still in the middle of recovering.

oobabooga: I have seen their code and tested it. I have to say it had many issues getting it to run, due to the max memory size on certain computers and GPUs. Yes, they have GPU and non-GPU versions, but it's pretty good.

I may need to pause for a moment until my eyes have recovered. Sorry for the inconvenience; it came at the wrong time.

The code I placed in the previous issues is just a backup. If anyone wants it early for HTTPS and proxy support, I have attached the code.

But the front-end .js still needs some work and attention to add that HTTPS and proxy support.

If we weren't using socketio, the fast way would be https://pypi.org/project/waitress/.

That would be easy for small or medium-sized requests, but since we are using socketio we have to use the code below.

[this is from before the latest backend change]
app.txt
https_main.txt
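(For reference, the waitress path mentioned above is only a couple of lines; this sketch assumes a plain Flask app, which is exactly why it doesn't cover socketio's long-lived websocket connections:)

```python
from flask import Flask
from waitress import serve  # pip install waitress

app = Flask(__name__)

@app.route("/")
def index():
    return "hello"

if __name__ == "__main__":
    # Production-grade WSGI serving without Flask's dev server;
    # not suitable for socketio's websocket traffic.
    serve(app, host="0.0.0.0", port=8080)
```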
