
RAG support #56

Open
sak12009cb opened this issue Nov 6, 2023 · 1 comment

Comments

@sak12009cb

I would like to implement retrieval-augmented generation (RAG) using Llama deployed through Truss. I would also like to understand whether connecting to vector databases and providing the retrieved context to the LLM is supported.

@squidarth
Contributor

Hi @sak12009cb -- you certainly can! The way to think about it is: Truss lets you deploy a model (like Llama) and get back an API endpoint you can run inference against. You can then integrate that endpoint into your RAG workflow.
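As a rough sketch of that workflow (the document set, retrieval logic, endpoint URL, and request/response format below are all hypothetical placeholders -- a real setup would use embeddings against a vector DB and your actual deployed endpoint):

```python
import json
import re
import urllib.request

# Toy in-memory corpus; a real RAG setup would query a vector database.
DOCUMENTS = [
    "Truss lets you package and deploy ML models behind an API endpoint.",
    "Llama 2 is an open-weight large language model from Meta.",
    "Vector databases store embeddings for similarity search.",
]

def _words(text: str) -> set[str]:
    """Lowercase, punctuation-free token set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k documents sharing the most words with the query.
    Stand-in for an embedding similarity search."""
    q = _words(query)
    ranked = sorted(DOCUMENTS, key=lambda d: len(q & _words(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, context: list[str]) -> str:
    """Prepend retrieved context to the user's question."""
    ctx = "\n".join(f"- {c}" for c in context)
    return f"Use the following context to answer.\nContext:\n{ctx}\n\nQuestion: {query}"

def call_model(prompt: str, endpoint: str, api_key: str) -> str:
    """POST the augmented prompt to a deployed model endpoint.
    The URL, auth header, and JSON shape here are illustrative only."""
    req = urllib.request.Request(
        endpoint,
        data=json.dumps({"prompt": prompt}).encode(),
        headers={"Authorization": f"Api-Key {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["output"]

query = "How do I deploy a model with Truss?"
prompt = build_prompt(query, retrieve(query))
print(prompt)
```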

This post doesn't cover the vector DB aspect, but should get you started w/ using Langchain w/ models on Baseten: https://www.baseten.co/blog/build-a-chatbot-with-llama-2-and-langchain/

Truss also supports arbitrary Python code, so you could do parts of this (e.g. connecting to your vector DB) inside your Truss if you wanted (see the docs for more on how to write Trusses: https://truss.baseten.co/learn/intro).
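For that second option, a Truss packages a `model.py` whose `load()` hook runs once at startup and whose `predict()` hook runs per request. A minimal sketch of doing retrieval inside the Truss might look like this (the in-memory list and keyword match are placeholders for a real vector DB client and similarity search, which you would initialize in `load()`):

```python
# model/model.py inside a Truss -- a sketch, not a complete Truss.

class Model:
    def __init__(self, **kwargs):
        self._docs = None  # stand-in for a vector DB client

    def load(self):
        # In a real Truss you would connect to your vector database here,
        # e.g. initializing a hosted client from secrets passed in kwargs.
        self._docs = [
            "Truss supports arbitrary Python in model.py.",
            "Retrieved context is prepended to the prompt before generation.",
        ]

    def _retrieve(self, query: str) -> list[str]:
        # Placeholder keyword match; a real implementation would embed the
        # query and run a similarity search against the vector DB.
        words = query.lower().split()
        hits = [d for d in self._docs if any(w in d.lower() for w in words)]
        return hits or self._docs[:1]

    def predict(self, model_input: dict) -> dict:
        query = model_input["prompt"]
        context = self._retrieve(query)
        augmented = "\n".join(context) + "\n\n" + query
        # Here you would run the LLM on the augmented prompt; the sketch
        # returns the prompt directly so it stays self-contained.
        return {"augmented_prompt": augmented}

if __name__ == "__main__":
    m = Model()
    m.load()
    print(m.predict({"prompt": "How does Truss handle context?"}))
```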
