
[Feature]: Elasticsearch Inference API support #6889

Open
stevapple opened this issue Nov 24, 2024 · 0 comments
Labels
enhancement New feature or request

Comments

@stevapple

The Feature

Let’s support Elasticsearch as an LLM provider. This includes utilizing the Elasticsearch Inference APIs to perform embedding, reranking, and chat completion requests.
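
For illustration, here is a hypothetical sketch of what the integration could look like on the LiteLLM side. The `elasticsearch/` provider prefix, the inference endpoint name `my-embedding`, and the cluster URL are all made up for this example; none of this exists in LiteLLM today:

```python
import litellm

# Hypothetical: route an embedding request through a (not yet existing)
# "elasticsearch" provider to an Elasticsearch inference endpoint.
response = litellm.embedding(
    model="elasticsearch/my-embedding",  # hypothetical provider prefix + endpoint id
    input=["What is a sparse vector?"],
    api_base="http://localhost:9200",    # assumed local Elasticsearch cluster
)
print(response)
```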

Motivation, pitch

Elasticsearch provides a set of Inference APIs for running inference against self-hosted or external models. These include (see the sketch after this list):

  • Embedding models, with support for both dense and sparse vectors;
  • Completion models, with streaming support;
  • Reranking models.
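
To make the request shapes concrete, below is a minimal sketch of calling these three task types directly with Python’s requests library. It assumes a local cluster reachable without authentication and pre-created inference endpoints named my-embedding, my-llm, and my-reranker (all hypothetical names):

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster, security disabled

# Text embedding: POST /_inference/text_embedding/<inference_id>
emb = requests.post(
    f"{ES}/_inference/text_embedding/my-embedding",
    json={"input": ["What is a sparse vector?"]},
)
print(emb.json())

# Completion: POST /_inference/completion/<inference_id>
comp = requests.post(
    f"{ES}/_inference/completion/my-llm",
    json={"input": "Explain reranking in one sentence."},
)
print(comp.json())

# Rerank: POST /_inference/rerank/<inference_id>
rr = requests.post(
    f"{ES}/_inference/rerank/my-reranker",
    json={"query": "vector search", "input": ["first document", "second document"]},
)
print(rr.json())
```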

It would be great for LiteLLM to support Elasticsearch as an LLM backend. In this scenario, Elasticsearch effectively acts as an LLM proxy gateway.

Alternatively, Elasticsearch provides a set of Trained Models APIs, which are dedicated to self-hosted models and not restricted to LLM scenarios. While they offer extra capabilities such as NER, classification, and mask filling, these are mostly unrelated to LiteLLM, so we can focus on the new Inference APIs.

Twitter / LinkedIn details

No response

@stevapple added the enhancement (New feature or request) label Nov 24, 2024