
RAG support #56

Open
sak12009cb opened this issue Nov 6, 2023 · 1 comment

Comments

@sak12009cb

I would like to implement retrieval-augmented generation (RAG) using Llama deployed through Truss. I would also like to understand whether connecting to vector databases and providing the retrieved context to the LLM is supported.

@squidarth
Contributor

Hi @sak12009cb -- you certainly can! The way to think about it is: Truss lets you deploy a model (like Llama) and get back an API endpoint you can run inference against. You can then integrate that endpoint into your RAG workflow.
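As a rough sketch of that workflow (the document set, retrieval logic, endpoint URL, and request/response format below are all hypothetical placeholders -- a real setup would use embeddings against a vector DB and your actual deployed endpoint):

```python
import json
import re
import urllib.request

# Toy in-memory corpus; a real RAG setup would query a vector database.
DOCUMENTS = [
    "Truss lets you package and deploy ML models behind an API endpoint.",
    "Llama 2 is an open-weight large language model from Meta.",
    "Vector databases store embeddings for similarity search.",
]

def _words(text: str) -> set[str]:
    """Lowercase, punctuation-free token set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents sharing the most words with the query.
    Stand-in for an embedding similarity search."""
    q = _words(query)
    ranked = sorted(DOCUMENTS, key=lambda d: len(q & _words(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Prepend retrieved context to the user's question."""
    ctx = "\n".join(f"- {c}" for c in context)
    return f"Use the following context to answer.\nContext:\n{ctx}\n\nQuestion: {query}"

def call_model(prompt: str, endpoint: str, api_key: str) -> str:
    """POST the augmented prompt to a deployed model endpoint.
    The URL, auth header, and JSON shape here are illustrative only."""
    req = urllib.request.Request(
        endpoint,
        data=json.dumps({"prompt": prompt}).encode(),
        headers={"Authorization": f"Api-Key {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["output"]

query = "How do I deploy a model with Truss?"
prompt = build_prompt(query, retrieve(query))
print(prompt)
```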

This post doesn't cover the vector DB aspect, but should get you started w/ using Langchain w/ models on Baseten: https://www.baseten.co/blog/build-a-chatbot-with-llama-2-and-langchain/

Truss also supports arbitrary Python code, so you could do parts of this (e.g. connecting to your vector DB) inside your Truss if you wanted (see the docs for more on how to write Trusses: https://truss.baseten.co/learn/intro).
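For that second option, a Truss packages a `model.py` whose `load()` hook runs once at startup and whose `predict()` hook runs per request. A minimal sketch of doing retrieval inside the Truss might look like this (the in-memory list and keyword match are placeholders for a real vector DB client and similarity search, which you would initialize in `load()`):

```python
# model/model.py inside a Truss -- a sketch, not a complete Truss.

class Model:
    def __init__(self, **kwargs):
        self._docs = None  # stand-in for a vector DB client

    def load(self):
        # In a real Truss you would connect to your vector database here,
        # e.g. initializing a hosted client from secrets passed in kwargs.
        self._docs = [
            "Truss supports arbitrary Python in model.py.",
            "Retrieved context is prepended to the prompt before generation.",
        ]

    def _retrieve(self, query: str) -> list[str]:
        # Placeholder keyword match; a real implementation would embed the
        # query and run a similarity search against the vector DB.
        words = query.lower().split()
        hits = [d for d in self._docs if any(w in d.lower() for w in words)]
        return hits or self._docs[:1]

    def predict(self, model_input: dict) -> dict:
        query = model_input["prompt"]
        context = self._retrieve(query)
        augmented = "\n".join(context) + "\n\n" + query
        # Here you would run the LLM on the augmented prompt; the sketch
        # returns the prompt directly so it stays self-contained.
        return {"augmented_prompt": augmented}

if __name__ == "__main__":
    m = Model()
    m.load()
    print(m.predict({"prompt": "How does Truss handle context?"}))
```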
