
[Feature]: Elasticsearch Inference API support #6889

Open
stevapple opened this issue Nov 24, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@stevapple

The Feature

Let’s support Elasticsearch as an LLM provider. This includes utilizing the Elasticsearch Inference APIs to perform embedding, reranking, and chat completion requests.
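
For illustration, here is a hypothetical sketch of what the integration could look like on the LiteLLM side. The `elasticsearch/` provider prefix, the inference endpoint name `my-embedding`, and the cluster URL are all made up for this example; none of this exists in LiteLLM today:

```python
import litellm

# Hypothetical: route an embedding request through a (not yet existing)
# "elasticsearch" provider to an Elasticsearch inference endpoint.
response = litellm.embedding(
    model="elasticsearch/my-embedding",  # hypothetical provider prefix + endpoint id
    input=["What is a sparse vector?"],
    api_base="http://localhost:9200",    # assumed local Elasticsearch cluster
)
print(response)
```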

Motivation, pitch

Elasticsearch provides a set of Inference APIs for running inference against self-hosted or external models. These include (see the sketch after this list):

  • Embedding models, with support for both dense and sparse vectors;
  • Completion models, with streaming support;
  • Reranking models.
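
To make the request shapes concrete, below is a minimal sketch of calling these three task types directly with Python’s requests library. It assumes a local cluster reachable without authentication and pre-created inference endpoints named my-embedding, my-llm, and my-reranker (all hypothetical names):

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster, security disabled

# Text embedding: POST /_inference/text_embedding/<inference_id>
emb = requests.post(
    f"{ES}/_inference/text_embedding/my-embedding",
    json={"input": ["What is a sparse vector?"]},
)
print(emb.json())

# Completion: POST /_inference/completion/<inference_id>
comp = requests.post(
    f"{ES}/_inference/completion/my-llm",
    json={"input": "Explain reranking in one sentence."},
)
print(comp.json())

# Rerank: POST /_inference/rerank/<inference_id>
rr = requests.post(
    f"{ES}/_inference/rerank/my-reranker",
    json={"query": "vector search", "input": ["first document", "second document"]},
)
print(rr.json())
```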

It would be great for LiteLLM to support Elasticsearch as an LLM backend. In this scenario, Elasticsearch effectively acts as an LLM proxy gateway.

Alternatively, Elasticsearch provides a set of Trained Models APIs, which are dedicated to self-hosted models and not restricted to LLM scenarios. While they offer extra capabilities such as NER, classification, and mask filling, these are mostly unrelated to LiteLLM, so we can focus on the new Inference APIs.

Twitter / LinkedIn details

No response

@stevapple added the enhancement (New feature or request) label Nov 24, 2024