
[Feature]: Implement /api/generate for Continue.dev FIM / autocompletion with Ollama? #6900

Open
deliciousbob opened this issue Nov 25, 2024 · 2 comments

@deliciousbob

The Feature

I am using Ollama as a backend for my models.
In Continue.dev I want to use Qwen2.5 1.5B to autocomplete my code.
This works perfectly if I set up the config to talk directly to the Ollama API at http://ollamahostip:11434/api/generate.
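
For reference, a minimal sketch of the kind of request that ends up hitting Ollama's native /api/generate endpoint in this setup (the model tag, prompt, and use of the `suffix` field for FIM below are only illustrative placeholders, not my actual config):

```python
# Illustrative only: a direct FIM-style call to Ollama's native /api/generate.
# Model tag, prompt, and suffix are placeholders.
import requests

resp = requests.post(
    "http://ollamahostip:11434/api/generate",
    json={
        "model": "qwen2.5-coder:1.5b",             # assumed Ollama tag for a Qwen2.5 1.5B coder model
        "prompt": "def add(a, b):\n    return ",   # code before the cursor
        "suffix": "\n\nprint(add(1, 2))",          # code after the cursor (fill-in-the-middle)
        "stream": False,
    },
    timeout=60,
)
print(resp.json()["response"])
```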

I never got it to work by talking directly to the LiteLLM API (using the Mistral or OpenAI API format), so I tried the pass-through function, and that finally worked. However, I have two PCs running the same model for redundancy, and with a pass-through only one of the servers would be utilized.

I also use Langfuse for monitoring the requests, and when using pass-through the API user is not visible.
My question: are there any plans to implement /api/generate?

Thank you very much!
Best regards, Robert

Motivation, pitch

I want to use LiteLLM for all my AI API requests, so it would be great if the /api/generate endpoint could be implemented.


@deliciousbob added the enhancement label on Nov 25, 2024
@krrishdholakia
Contributor

Ollama's /api/generate is already supported:

if api_base.endswith("/api/generate"):

Can you share a sample request to reproduce the issue? For FIM tasks we recommend using the /completions endpoint, not /chat/completions.
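
For example, a minimal sketch of an FIM-style request against the proxy's /completions route (the proxy URL, API key, model alias, and Qwen FIM tokens below are assumptions, not taken from this thread):

```python
# Rough sketch: FIM through the LiteLLM proxy's OpenAI-compatible /completions
# route. Base URL, API key, model alias, and FIM tokens are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://litellm-host:4000", api_key="sk-1234")

resp = client.completions.create(
    model="qwen2.5-coder-1.5b",  # assumed model alias configured on the proxy
    prompt="<|fim_prefix|>def add(a, b):\n    return <|fim_suffix|>\n<|fim_middle|>",
    max_tokens=32,
)
print(resp.choices[0].text)
```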

@deliciousbob
Author

Hi Krish, thank you for the quick reply.
I was not able to find /api/generate in the Swagger docs of LiteLLM (https://litellm-api.up.railway.app/).
Continue.dev tries to contact url:port/api/generate directly when Ollama is selected as the provider (I added the LiteLLM URL on port 4000 as the base URL to handle the requests).
They do not support the OpenAI API for autocomplete because, as they mention, OpenAI does not support FIM, so only the Ollama or Mistral API can be used
(see: https://docs.continue.dev/autocomplete/model-setup).
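
To make the mismatch concrete, this is roughly what Continue.dev ends up sending when I point its Ollama provider at the LiteLLM proxy (hostnames and model tag are placeholders); since the proxy does not expose this route, the request has nowhere to go:

```python
# Sketch of the request Continue.dev issues with provider "ollama" when the
# LiteLLM proxy is set as the base URL. /api/generate is not a proxy route,
# which is exactly the gap this issue is about.
import requests

requests.post(
    "http://litellm-host:4000/api/generate",   # LiteLLM proxy, not Ollama
    json={"model": "qwen2.5-coder:1.5b", "prompt": "def add(a, b):", "stream": False},
)
```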
