feat(rag): Support RAG SDK #1322

Merged (6 commits) on Mar 22, 2024

fangyinc
Collaborator

Description

Release the RAG SDK.

How Has This Been Tested?

Install the SDK:

pip install "dbgpt[rag]>=0.5.2rc0" --upgrade

For specific steps, see docs/docs/awel/cookbook/first_rag_with_awel.md.
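
After installing, it can help to confirm which version of the distribution is actually present in the environment. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical and not part of DB-GPT, and it assumes the distribution is named `dbgpt` as in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "dbgpt" is the distribution name taken from the pip command above.
    print(f"dbgpt version: {installed_version('dbgpt')}")
```

If this prints `None`, the package is not installed in the interpreter that ran the script, which usually points to a virtual environment mismatch.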

Snapshots:

Include snapshots for easier review.

Checklist:

  • My code follows the style guidelines of this project
  • I have already rebased the commits and made the commit messages conform to the project standard.
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • Any dependent changes have been merged and published in downstream modules

@github-actions github-actions bot added the enhancement New feature or request label Mar 22, 2024
@Aries-ckt
Collaborator

Aries-ckt commented Mar 22, 2024

Test AWEL RAG Local Embedding Example

Assemble Task

import asyncio
import shutil

from dbgpt.core.awel import DAG
from dbgpt.rag import ChunkParameters
from dbgpt.rag.embedding import DefaultEmbeddingFactory
from dbgpt.rag.knowledge import KnowledgeType
from dbgpt.rag.operators import EmbeddingAssemblerOperator, KnowledgeOperator
from dbgpt.storage.vector_store.chroma_store import ChromaVectorConfig
from dbgpt.storage.vector_store.connector import VectorStoreConnector

# 1. Initialize the embedding factory from a local model path
embeddings = DefaultEmbeddingFactory.default("/data/models/text2vec-large-chinese")

# Delete the old vector store directory so the example starts clean
shutil.rmtree("/tmp/awel_rag_test_vector_store", ignore_errors=True)

# 2. Initialize the vector store connector
vector_connector = VectorStoreConnector.from_default(
    "Chroma",
    vector_store_config=ChromaVectorConfig(
        name="test_vstore",
        persist_path="/tmp/awel_rag_test_vector_store",
    ),
    embedding_fn=embeddings
)

# 3. Build the AWEL DAG that loads the knowledge and embeds its chunks
with DAG("load_knowledge_dag") as knowledge_dag:
    # Load knowledge from URL
    knowledge_task = KnowledgeOperator(knowledge_type=KnowledgeType.URL.name)
    assembler_task = EmbeddingAssemblerOperator(
        vector_store_connector=vector_connector,
        chunk_parameters=ChunkParameters(chunk_strategy="CHUNK_BY_SIZE")
    )
    knowledge_task >> assembler_task

chunks = asyncio.run(assembler_task.call("https://docs.dbgpt.site/docs/latest/awel/"))
print(f"Chunk length: {len(chunks)}")
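
To make the idea behind size-based chunking concrete, here is a minimal standalone sketch. It is not DB-GPT's implementation of `CHUNK_BY_SIZE`; the function name `chunk_by_size`, its character-based splitting, and the `chunk_size` and `overlap` parameters are assumptions chosen for illustration only.

```python
from typing import List


def chunk_by_size(text: str, chunk_size: int = 512, overlap: int = 0) -> List[str]:
    """Split text into fixed-size character windows with optional overlap.

    Illustrative only: the real splitter may count tokens or respect separators.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


if __name__ == "__main__":
    sample = "AWEL is a workflow expression language for LLM applications."
    for piece in chunk_by_size(sample, chunk_size=20, overlap=5):
        print(repr(piece))
```

A non-zero overlap repeats the tail of one chunk at the head of the next, which can help retrieval when an answer straddles a chunk boundary, at the cost of storing more text.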

Retrieve Task

from dbgpt.core.awel import MapOperator
from dbgpt.rag.operators import EmbeddingRetrieverOperator

with DAG("retriever_dag") as retriever_dag:
    retriever_task = EmbeddingRetrieverOperator(
        top_k=3,
        vector_store_connector=vector_connector,
    )
    content_task = MapOperator(lambda cks: "\n".join(c.content for c in cks))
    retriever_task >> content_task

# content_task returns the retrieved chunk contents joined into one string
content = asyncio.run(content_task.call("What is the AWEL?"))
print(content)
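
Conceptually, an embedding retriever with `top_k=3` ranks stored chunks by similarity to the query embedding and keeps the best three. The sketch below shows that idea in plain Python. It is not how `EmbeddingRetrieverOperator` or Chroma is implemented; the names `cosine_similarity` and `top_k_contents`, the choice of cosine similarity, and the toy two-dimensional vectors are all assumptions for illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_contents(
    query_vec: Sequence[float],
    indexed: Sequence[Tuple[str, Sequence[float]]],
    k: int = 3,
) -> List[str]:
    """Return the contents of the k stored chunks most similar to the query."""
    ranked = sorted(
        indexed,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [content for content, _ in ranked[:k]]


if __name__ == "__main__":
    # Toy (content, embedding) pairs; real embeddings have hundreds of dimensions.
    store = [("chunk a", [1.0, 0.0]), ("chunk b", [0.0, 1.0]), ("chunk c", [1.0, 1.0])]
    # Join the hits with newlines, mirroring the MapOperator in the retrieve task.
    print("\n".join(top_k_contents([1.0, 0.0], store, k=2)))
```

A real vector store replaces the full sort with an approximate nearest-neighbor index, but the ranking criterion is the same in spirit.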

Collaborator

@Aries-ckt Aries-ckt left a comment

LGTM!

Collaborator

@csunny csunny left a comment

r+

@csunny csunny merged commit 8a17099 into eosphoros-ai:main Mar 22, 2024
4 checks passed
@fangyinc fangyinc deleted the rag-sdk branch March 22, 2024 07:58
Hopshine pushed a commit to Hopshine/DB-GPT that referenced this pull request Sep 10, 2024