The Feature
I am using Ollama as a backend for my models.
In Continue.dev I want to use Qwen2.5 1.5B to autocomplete my code.
This works perfectly if I set up the config to talk directly to the Ollama API under http://ollamahostip:11434/api/generate.
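For illustration, this is roughly what my working Continue.dev config.json looks like when talking to Ollama directly (the exact model tag is an assumption on my part, and the host is a placeholder):

```json
{
  "tabAutocompleteModel": {
    "title": "Qwen2.5 1.5B",
    "provider": "ollama",
    "model": "qwen2.5-coder:1.5b",
    "apiBase": "http://ollamahostip:11434"
  }
}
```

With this config, Continue.dev sends its autocomplete requests as POST http://ollamahostip:11434/api/generate.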
I never got it to work talking directly to the LiteLLM API (using the Mistral or OpenAI API format), so I tried the pass-through function, which finally worked. However, I have two PCs running the same model for redundancy, and with a pass-through only one server is utilized.
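This is roughly how my pass-through is set up in the LiteLLM config.yaml, a sketch based on LiteLLM's pass_through_endpoints setting (hostnames are placeholders). Because target is a single fixed host, the second PC never receives any traffic:

```yaml
general_settings:
  pass_through_endpoints:
    - path: "/api/generate"                            # what Continue.dev calls on the proxy
      target: "http://ollamahostip:11434/api/generate" # pinned to ONE Ollama backend
```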
I also use Langfuse for monitoring the requests, and when using the pass-through the API user is not visible. My question: are there any plans to implement /api/generate?
Thank you very much!
Best regards, Robert
Motivation, pitch
I want to route all of my AI API requests through LiteLLM, so it would be great if the /api/generate endpoint could be implemented.
Twitter / LinkedIn details
No response
Hi Krish, thank you for the quick reply.
I was not able to find /api/generate in the LiteLLM Swagger docs (https://litellm-api.up.railway.app/).
Continue.dev tries to contact url:port/api/generate directly when Ollama is selected as the provider (I added the LiteLLM url:4000 as the base URL to handle the requests).
They mention that they do not support the OpenAI API for autocomplete because OpenAI does not support FIM, so only the Ollama and Mistral APIs are supported
(see: https://docs.continue.dev/autocomplete/model-setup).
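For reference, this is the setup I am trying to get working, a sketch with placeholder hostnames: Continue.dev's Ollama provider pointed at the LiteLLM proxy, which would require LiteLLM to answer /api/generate natively (and could then load-balance across both of my Ollama backends):

```json
{
  "tabAutocompleteModel": {
    "title": "Qwen2.5 1.5B via LiteLLM",
    "provider": "ollama",
    "model": "qwen2.5-coder:1.5b",
    "apiBase": "http://litellmhostip:4000"
  }
}
```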