ADR: Vector store for RAG #168
Merged
2 commits
@@ -5,3 +5,10 @@ dictionary.dic

# python virtualenv
venv

# Emacs
*~
\#*\#
.\#*
.projectile
.dir-locals.el
@@ -116,6 +116,7 @@ Kumar
Langchain
Langgraph
leaderboard
lifecycle
lignment
LLM
LLMs
@@ -0,0 +1,31 @@
# Architecture Decision Records

An Architecture Decision Record (ADR) is a lightweight format for capturing an individual architecturally important decision. ADRs are meant to be easy to write, taking 10 minutes or less. They should be stored in the codebase they affect, go through peer review, and have a commit history.

This simple format, described in the template linked below, serves a surprising number of functions:

* **Decision-making process**: Going through peer review includes the entire team and gives all perspectives a chance to be heard. The process has a clear lifecycle: once an ADR meets whatever approval criteria the team chooses, it is merged and the decision is made. If new information later causes the team to reconsider the decision, that is simply a new ADR.
* **Institutional knowledge and transparency**: Not everyone will comment on every ADR, but the transparency of the mechanism should keep everyone informed and put tribal knowledge in writing. This also builds resilience: decision making should ideally never be blocked by someone being sick or on vacation, so the team can always make significant decisions.
* **Distributed design authority**: As a team becomes familiar and comfortable with the ADR mechanism, every team member has an equal tool for bringing design decisions to the team. This encourages autonomy, accountability, and ownership.
* **Onboarding and training material**: Because ADRs are easy to write and writing them becomes a habit, new team members can onboard simply by reading the record of ADRs.
* **Knowledge sharing**: The peer review phase allows team members to share expertise.
* **Fewer meetings**: As decision making becomes asynchronous and the team forms its social norms around the process, less meeting time should be required.

## When to write an ADR

* A decision is being made that requires discussion between two or more people.
* A decision is being made that requires significant investigation.
* A decision is being proposed for feedback or discussion.
* A decision is being proposed that affects multiple teams.

## Template

See [template.md](template.md).

## Related Reading

* [Suggestions for writing good ADRs](https://github.com/joelparkerhenderson/architecture-decision-record?tab=readme-ov-file#suggestions-for-writing-good-adrs)
* [ADRs at Red Hat](https://www.redhat.com/architect/architecture-decision-records)
* [ADRs at Amazon](https://docs.aws.amazon.com/prescriptive-guidance/latest/architectural-decision-records/adr-process.html)
* [ADRs at GitHub](https://adr.github.io/)
* [ADRs at Google](https://cloud.google.com/architecture/architecture-decision-records)
@@ -0,0 +1,26 @@
# Initial InstructLab Vector Store

## Context

One of the first choices in implementing retrieval-augmented generation (RAG) is which vector store to develop against initially. Although frameworks like LangChain or Haystack make it easy to swap vector databases, we need a working end-to-end RAG implementation that is tested against a specific store and available to install with InstructLab. There are many options (see Haystack's [guide to choosing a document store](https://docs.haystack.deepset.ai/docs/choosing-a-document-store)).

Our main long-term requirements are that the chosen store support fully developed document updates (and thus some notion of a primary key), that it scale to cluster size, and that it have a permissive license (Apache, MIT, or similar). Among the available choices, [Milvus](https://milvus.io/) offers a strategic advantage because of [watsonx's investment in it](https://www.ibm.com/new/announcements/ibm-watsonx-data-vector-database-ai-ready-data-management).

Milvus can run in-process ([Milvus Lite](https://milvus.io/docs/milvus_lite.md)), on a single node ([Milvus](https://milvus.io/docs/prerequisite-docker.md)), or at cluster scale ([Milvus Distributed](https://milvus.io/docs/prerequisite-helm.md)).

## Decision

InstructLab will initially integrate with and use Milvus Lite for vector storage and retrieval-augmented generation.

## Status

Accepted

## Consequences

* Users will have a clear [upgrade path](https://milvus.io/docs/upgrade_milvus_cluster-operator.md) from the laptop use case to cluster scale.
* We should have access to Milvus expertise via IBM.
* The laptop use case of InstructLab will have a minimally resource-intensive option for prototyping.
* Since Milvus is used in watsonx, we can have confidence that it can meet expected scaling requirements.
* Document updates can be accommodated using well-established [primary key functionality](https://milvus.io/docs/primary-field.md) and [partition keys](https://milvus.io/docs/use-partition-key.md).
* There is a risk that developing against a mature vector store leads us to rely on functionality that is not available in another vector store a potential customer requires.
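
The last two consequences (updates keyed on a primary key, and the risk of relying on store-specific features) can be illustrated with a small sketch. This is not Milvus or InstructLab code: `Document`, `VectorStore`, and `InMemoryStore` are hypothetical names introduced here, and the toy store only mimics upsert-by-primary-key semantics with a brute-force cosine search. The assumption is that application code depends on a narrow interface like this one, so that a Milvus-backed class could later be replaced by another backend.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Protocol


@dataclass(frozen=True)
class Document:
    doc_id: str                 # primary key: at most one stored entry per doc_id
    vector: tuple[float, ...]   # embedding (tiny 2-d vectors in this demo)
    text: str


class VectorStore(Protocol):
    """Hypothetical minimal interface the application would code against."""

    def upsert(self, doc: Document) -> None: ...

    def search(self, query: tuple[float, ...], limit: int = 1) -> list[Document]: ...


def _cosine(a: tuple[float, ...], b: tuple[float, ...]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryStore:
    """Toy stand-in for a real backend (assumption: not how Milvus works internally)."""

    def __init__(self) -> None:
        self._docs: dict[str, Document] = {}

    def upsert(self, doc: Document) -> None:
        # Writing an existing primary key replaces the old entry instead of duplicating it.
        self._docs[doc.doc_id] = doc

    def search(self, query: tuple[float, ...], limit: int = 1) -> list[Document]:
        # Rank every stored document by similarity to the query, best first.
        ranked = sorted(
            self._docs.values(),
            key=lambda d: _cosine(query, d.vector),
            reverse=True,
        )
        return ranked[:limit]


# Demo: re-inserting "doc-1" updates the stored text rather than adding a copy.
store: VectorStore = InMemoryStore()
store.upsert(Document("doc-1", (1.0, 0.0), "first draft"))
store.upsert(Document("doc-2", (0.0, 1.0), "unrelated"))
store.upsert(Document("doc-1", (1.0, 0.0), "revised text"))
hits = store.search((1.0, 0.0), limit=1)
```

Keeping callers on `upsert` and `search` alone is one way to limit the lock-in risk named above; any backend-specific feature used outside such an interface would have to be reimplemented when switching stores.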
@@ -0,0 +1,17 @@
# Succinct title

## Context

_What is the context of this decision? What are the technical, social, and political factors? For example, the decision to use a particular library might be simply because most of the team is familiar with it; that is a social context. A political factor might be influences from other teams or executive decisions._

## Decision

_A single decision statement, written in active voice, stated in a single sentence._

## Status

[Proposed | Accepted | Rejected]

## Consequences

_A bulleted list; this might be the most important section. What are the consequences of this decision? Does it introduce design constraints into a codebase? Does it require further decisions or investigations to be made? Will it require training/onboarding for team members? Does it impact performance? What about cost? Does it impact development processes? What else? As a rule of thumb, there should usually be 4-6 identified consequences._