server : add OAI compat for /v1/completions #10974
Conversation
Would this make llama-server compatible with this client? https://github.com/open-webui/open-webui If yes, can we please get this in? 😄 I'm also curious, for anyone in the know: it seems like a lot of the OpenAI clients (like open-webui) expect to be able to switch models per request. Does llama-server support this, and if not, roughly what would be the effort to add that?
This is not supported atm. But this logic seems like something more suitable for a proxy/routing layer rather than implementing it in `llama-server`.
@ericcurtin I have no idea if they support a 3rd-party OpenAI-compatible server or not. Judging from their README, they kinda support it via a configurable OpenAI endpoint. In either case, I think they rely on the OAI-compatible `/v1` endpoints.
I wrote llama-swap for just this purpose. It's a transparent proxy that will swap llama-server instances based on the model name in the API call. It's a single Go binary with no dependencies, so it is easy to deploy.
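For illustration only, here is a minimal sketch of the proxy/routing-layer idea discussed above (this is not llama-swap's actual implementation): a hypothetical proxy reads the `model` field from an OAI-style request body and forwards the call to whichever llama-server instance is configured for that model. The model names and ports are made up.

```python
# Hypothetical model-routing proxy sketch (NOT llama-swap's real code):
# route an OAI-style request to a per-model llama-server upstream.
import json
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Assumed mapping from model name to a running llama-server instance.
UPSTREAMS = {
    "llama-3-8b": "http://127.0.0.1:8081",
    "qwen-2.5-7b": "http://127.0.0.1:8082",
}

class ModelRouter(BaseHTTPRequestHandler):
    def do_POST(self):
        body = self.rfile.read(int(self.headers.get("Content-Length", 0)))
        model = json.loads(body).get("model", "")
        upstream = UPSTREAMS.get(model)
        if upstream is None:
            self.send_error(404, f"unknown model: {model}")
            return
        # Forward the request unchanged to the selected llama-server.
        req = urllib.request.Request(
            upstream + self.path,
            data=body,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            self.send_response(resp.status)
            self.send_header("Content-Type", "application/json")
            self.end_headers()
            self.wfile.write(resp.read())

if __name__ == "__main__":
    HTTPServer(("127.0.0.1", 8080), ModelRouter).serve_forever()
```

A real implementation would also need to handle streaming responses and start/stop the upstream server processes, which is what makes a dedicated tool like llama-swap useful.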
Supersedes #10645
Ref documentation: https://platform.openai.com/docs/api-reference/completions/object
The `/v1/completions` endpoint can now be OAI-compatible (not to be confused with the `/completion` endpoint, without the `/v1` prefix).

Also regrouped the docs into 2 dedicated sections: one for the OAI-compat API and one for the non-OAI API.
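For reference, a minimal sketch of what an OAI-style request to this endpoint could look like, assuming llama-server is running locally on its default port 8080; the request and response fields follow the OpenAI completions reference linked above:

```python
# Minimal example call to the OAI-compatible /v1/completions endpoint.
# Assumes llama-server is running locally on the default port 8080.
import json
import urllib.request

payload = {
    "model": "any",  # llama-server serves the single loaded model
    "prompt": "The capital of France is",
    "max_tokens": 16,
    "temperature": 0.0,
}
req = urllib.request.Request(
    "http://127.0.0.1:8080/v1/completions",
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    result = json.load(resp)

# Per the OpenAI completions object, the generated text is in choices[0].text.
print(result["choices"][0]["text"])
```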
TODO: