Text Generation Inference server #5934
-
Are there any plans to add a custom prompt node integration for https://github.com/huggingface/text-generation-inference by Hugging Face? I have used it to host the ✨ Falcon-40B-Instruct model. I checked out LangChain and it has such a solution, but there is not one by deepset Haystack, and this might force me to switch to LangChain, which I really don't want to do.
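In the meantime, TGI can be called directly over its REST API without any framework integration. The sketch below is a minimal, hedged example: the server URL and port are assumptions (TGI's default when launched with `text-generation-launcher`), and the helper name is hypothetical.

```python
# Minimal sketch of querying a running TGI server over HTTP.
# Assumptions: TGI is serving on http://localhost:8080 (its default port);
# build_tgi_payload is a hypothetical helper, not part of any library.


def build_tgi_payload(prompt: str, max_new_tokens: int = 200) -> dict:
    """Build the JSON body for TGI's /generate endpoint."""
    return {
        "inputs": prompt,
        "parameters": {"max_new_tokens": max_new_tokens},
    }


if __name__ == "__main__":
    import requests  # only needed when actually sending the request

    payload = build_tgi_payload("Explain retrieval-augmented generation.")
    resp = requests.post(
        "http://localhost:8080/generate", json=payload, timeout=120
    )
    print(resp.json()["generated_text"])
```

A custom node could wrap a call like this so the rest of a Haystack pipeline stays unchanged.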
Replies: 1 comment, 3 replies
-
Hello, @Chance-Obondo! We are definitely looking at TGI with interest. Other options:
- We don't know yet if TGI will be supported in Haystack 2.0, but we are considering it (see #5625).
- Falcon-40b-instruct seems to be supported by vLLM (its architecture should be `FalconForCausalLM`).
- The article talks about Falcon-40b-instruct because at that time it was one of the best open-source LLMs. Today there are probably better open-source models (see the Open LLM Leaderboard), which are also supported by vLLM.

I hope one of these options can help you...
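For the vLLM route, a short sketch of what querying it could look like: vLLM can expose an OpenAI-compatible REST server, and the request body can be built as a plain dict. The URL, port, and model name below are assumptions for illustration, and `build_request` is a hypothetical helper.

```python
# Hedged sketch: calling a vLLM server through its OpenAI-compatible
# /v1/completions endpoint. Assumptions: the server runs on
# http://localhost:8000 and serves tiiuae/falcon-40b-instruct.


def build_request(
    prompt: str,
    model: str = "tiiuae/falcon-40b-instruct",
    max_tokens: int = 128,
) -> dict:
    """Build the JSON body for the /v1/completions endpoint."""
    return {"model": model, "prompt": prompt, "max_tokens": max_tokens}


if __name__ == "__main__":
    import requests  # only needed when actually sending the request

    body = build_request("What is Haystack?")
    resp = requests.post(
        "http://localhost:8000/v1/completions", json=body, timeout=60
    )
    print(resp.json()["choices"][0]["text"])
```

Because the API is OpenAI-compatible, swapping in a different model from the leaderboard is mostly a matter of changing the `model` field.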