feat: much improved RAG, added LLM post-processing of results #435

Merged
merged 10 commits into from
Feb 5, 2025

Conversation

Miyou
Contributor

@Miyou Miyou commented Feb 4, 2025

Does the following:

  • Retrieves RAG context only for the last user message instead of all user messages. This keeps RAG search calls from spiraling out of control: currently the number of calls grows by one with each new user message.
  • Adds typing to RAG config.
  • Adds the option to post-process the extracted context with an LLM. This adds latency, but in my testing it improves output by keeping only the relevant context chunks, which makes the main LLM much less likely to be confused by unrelated chunks returned from the RAG search.
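
The first point can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the `Message` class, the `enhance_last_user_message` helper, and the `search` callable are hypothetical names, and placing the context in a system message before the user message is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Message:
    """Hypothetical stand-in for a chat message."""
    role: str
    content: str


def last_user_index(messages: List[Message]) -> Optional[int]:
    """Return the index of the most recent user message, or None."""
    for i in range(len(messages) - 1, -1, -1):
        if messages[i].role == "user":
            return i
    return None


def enhance_last_user_message(
    messages: List[Message], search: Callable[[str], str]
) -> List[Message]:
    """Run one search for the last user message and attach its context."""
    idx = last_user_index(messages)
    if idx is None:
        return list(messages)
    # Exactly one search call per invocation, however long the history is.
    context = search(messages[idx].content)
    if not context:
        return list(messages)
    # Assumption: context goes in a system message right before the query.
    context_msg = Message("system", f"Relevant context:\n{context}")
    return messages[:idx] + [context_msg] + messages[idx:]
```

With this shape, a conversation holding many user messages still triggers a single search per call, because only the final user message is used as the query.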

Important

Introduces RagConfig for RAG settings, adds post-processing prompt, and updates RAG tool for LLM post-processing support.

  • Configuration:
    • Introduces RagConfig class in gptme/config.py for RAG settings.
    • Adds default_post_process_prompt for context post-processing.
    • Updates ProjectConfig to use RagConfig.
  • RAG Tool:
    • Updates rag_enhance_messages in rag.py to support post-processing with LLM using post_process_model and post_process_prompt.
    • Modifies rag_search to use --format full --print-relevance.
  • Misc:
    • Adds logging for post-processing duration in rag_enhance_messages.
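
A rough sketch of how the pieces in this summary could fit together. Only the names `RagConfig`, `post_process_model`, and `post_process_prompt` come from the summary above; the `enabled` and `post_process` fields, the placeholder prompt text, the `post_process_context` helper, and the `complete` callable are assumptions for illustration and may not match gptme/config.py or rag.py.

```python
import logging
import time
from dataclasses import dataclass
from typing import Callable, Optional

logger = logging.getLogger(__name__)

# Placeholder text; the actual default prompt in the PR is not reproduced here.
DEFAULT_POST_PROCESS_PROMPT = (
    "Keep only the context chunks that are relevant to the user's query."
)


@dataclass
class RagConfig:
    enabled: bool = False        # assumed field
    post_process: bool = False   # assumed field
    post_process_model: Optional[str] = None
    post_process_prompt: str = DEFAULT_POST_PROCESS_PROMPT


def post_process_context(
    context: str,
    query: str,
    config: RagConfig,
    complete: Callable[[str, Optional[str]], str],
) -> str:
    """Filter retrieved context with an LLM and log how long it took.

    `complete(prompt, model)` is a hypothetical LLM call supplied by the caller.
    """
    if not config.post_process:
        return context
    start = time.monotonic()
    prompt = (
        f"{config.post_process_prompt}\n\n"
        f"Query: {query}\n\nContext:\n{context}"
    )
    result = complete(prompt, config.post_process_model)
    logger.info("RAG post-processing took %.2fs", time.monotonic() - start)
    return result
```

Passing the LLM call in as a parameter keeps the sketch independent of any particular client library and makes the extra latency easy to measure in isolation.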

This description was created by Ellipsis for 318099c. It will automatically update as commits are pushed.

Contributor

@ellipsis-dev ellipsis-dev bot left a comment

❌ Changes requested. Reviewed everything up to 318099c in 2 minutes and 3 seconds

More details
  • Looked at 185 lines of code in 2 files
  • Skipped 0 files when reviewing.
  • Skipped posting 4 drafted comments based on config settings.
1. gptme/config.py:91
  • Draft comment:
    Consider clarifying the union type usage in isinstance(self.rag, dict | Container).
  • Reason this comment was not posted:
    Confidence changes required: 33%
    None
2. gptme/config.py:91
  • Draft comment:
    Use a tuple in the isinstance check to avoid runtime errors; change 'dict | Container' to '(dict, Container)'.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable:
    The comment appears to be incorrect. The | syntax for union types was introduced in Python 3.10 and works in isinstance checks. The code base appears to be using modern Python features (given the presence of type hints and union types elsewhere). There's no strong evidence that this would cause runtime errors.
    I could be wrong if this codebase needs to support older Python versions. The tuple syntax would work in all Python versions.
    Even if older Python versions need to be supported, this should be caught by the build/CI system running tests on the supported Python versions. We don't need a comment for this.
    The comment should be deleted as it suggests changing valid modern Python syntax without clear evidence of a problem, and version compatibility issues would be caught by the build system.
3. gptme/config.py:43
  • Draft comment:
    Insert a missing space between 'relevant' and 'context' in the prompt string.
  • Reason this comment was not posted:
    Marked as duplicate.
4. gptme/tools/rag.py:113
  • Draft comment:
    Split CLI flags into separate arguments using cmd.extend rather than appending a single concatenated string.
  • Reason this comment was not posted:
    Marked as duplicate.
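
Draft comments 2 and 4 can be illustrated with a short sketch. It assumes `collections.abc.Container` as a stand-in (the `Container` used in gptme/config.py may be a different class), and `rag-cli` is a hypothetical program name; only the flags `--format full --print-relevance` come from the PR summary.

```python
import sys
from collections.abc import Container  # stand-in; the PR's Container may differ


def is_dict_or_container(value) -> bool:
    """Tuple form: accepted by isinstance on every Python 3 version."""
    return isinstance(value, (dict, Container))


def is_dict_or_container_union(value) -> bool:
    """Union form: isinstance accepts `X | Y` only on Python 3.10+."""
    if sys.version_info < (3, 10):
        raise RuntimeError("X | Y in isinstance requires Python 3.10 or newer")
    return isinstance(value, dict | Container)


def build_search_cmd(query: str) -> list:
    """Build an argv list with each flag and value as its own element."""
    cmd = ["rag-cli", "search", query]  # hypothetical program name
    # cmd.append("--format full --print-relevance") would hand the program a
    # single argument containing spaces, which most CLI parsers reject.
    cmd.extend(["--format", "full", "--print-relevance"])
    return cmd
```

The two `isinstance` forms behave the same on Python 3.10 and later, so the choice only matters if older interpreters have to be supported.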

Workflow ID: wflow_Pqbj1snQtE3EQeqT


gptme/config.py (outdated review thread, resolved)
gptme/tools/rag.py (outdated review thread, resolved)
@codecov-commenter

codecov-commenter commented Feb 4, 2025

Codecov Report

Attention: Patch coverage is 37.03704% with 34 lines in your changes missing coverage. Please review.

Project coverage is 56.87%. Comparing base (b9a6457) to head (46e2c4c).
Report is 3 commits behind head on master.

Files with missing lines    Patch %    Lines
gptme/tools/rag.py          10.81%     33 Missing ⚠️
gptme/config.py             93.75%     1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master     #435       +/-   ##
===========================================
- Coverage   67.20%   56.87%   -10.34%     
===========================================
  Files          70       70               
  Lines        6187     6226       +39     
===========================================
- Hits         4158     3541      -617     
- Misses       2029     2685      +656     
Flag                                 Coverage Δ
anthropic/claude-3-haiku-20240307    56.87% <37.03%> (-9.08%) ⬇️
openai/gpt-4o-mini                   56.87% <37.03%> (-8.56%) ⬇️

@Miyou Miyou force-pushed the miyou/improve-rag branch from 1f02733 to 0b528aa Compare February 4, 2025 15:25
gptme/config.py (outdated review thread, resolved)
gptme/tools/rag.py (outdated review thread, resolved)
gptme/tools/rag.py (outdated review thread, resolved)
@ErikBjare ErikBjare changed the title from "Improve RAG" to "feat: improved/fixing RAG, added LLM post-processing of results" Feb 5, 2025
@ErikBjare ErikBjare changed the title from "feat: improved/fixing RAG, added LLM post-processing of results" to "feat: much improved RAG, added LLM post-processing of results" Feb 5, 2025
gptme/tools/rag.py (outdated review thread, resolved)
gptme/tools/rag.py (outdated review thread, resolved)
@ErikBjare ErikBjare merged commit f4b6f02 into ErikBjare:master Feb 5, 2025
6 of 7 checks passed
3 participants