feat: add retrieval_top_n to config in env #11132

Open
wants to merge 3 commits into
base: main

ProseGuys (Contributor) commented Nov 26, 2024

Summary

Solution: Add a new retrieval-related parameter, top_n, configured through an environment variable. During retrieval, the default top_k parameter is still used to fetch the top_k most relevant slices. During reranking, top_n determines how many slices are returned.
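
The two-stage flow described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the `Slice` class, the `retrieve_then_rerank` function, and the precomputed toy scores are all hypothetical stand-ins for the real retriever and rerank model.

```python
from dataclasses import dataclass


@dataclass
class Slice:
    """Hypothetical stand-in for a retrieved document slice."""
    text: str
    retrieval_score: float  # score from the first-stage retriever
    rerank_score: float     # score a rerank model would assign (toy value)


def retrieve_then_rerank(slices, top_k, top_n):
    """Keep top_k slices by retrieval score, then return top_n after reranking."""
    # Stage 1: retrieval keeps the top_k most relevant slices.
    candidates = sorted(slices, key=lambda s: s.retrieval_score, reverse=True)[:top_k]
    # Stage 2: reranking reorders only those candidates and returns top_n of them.
    reranked = sorted(candidates, key=lambda s: s.rerank_score, reverse=True)
    return reranked[:top_n]


slices = [
    Slice("a", 0.9, 0.1),
    Slice("b", 0.8, 0.7),
    Slice("c", 0.7, 0.9),
    Slice("d", 0.1, 1.0),  # dropped in stage 1, so its rerank score never matters
]
result = retrieve_then_rerank(slices, top_k=3, top_n=2)
print([s.text for s in result])
```

Keeping the two limits separate lets retrieval cast a wider net (top_k) while the reranker returns a shorter, higher-precision list (top_n).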

Resolves #11068

Checklist

Important

Please review the checklist below before submitting your pull request.

  • This change requires a documentation update, included: Dify Document
  • I understand that this PR may be closed in case there was no previous discussion or issues. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.
  • I ran dev/reformat(backend) and cd web && npx lint-staged(frontend) to appease the lint gods

@dosubot (bot) added labels on Nov 26, 2024:
  • size:S (This PR changes 10-29 lines, ignoring generated files.)
  • 💪 enhancement (New feature or request)
  • 📚 documentation (Improvements or additions to documentation)

api/.env.example (outdated)
CREATE_TIDB_SERVICE_JOB_ENABLED=false

RETRIEVAL_TOP_N=0
Contributor Author

If RETRIEVAL_TOP_N is 0, dataset retrieval uses the original top_k parameter.
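
A minimal sketch of that fallback rule, assuming the value is read directly from the process environment. The helper name `effective_top_n` is hypothetical and this is not the implementation in the PR. Treating a missing or negative value the same as 0 is also an assumption; the comment above only specifies the 0 case.

```python
import os


def effective_top_n(top_k, env=None):
    """Return how many slices reranking should return.

    Hypothetical helper: RETRIEVAL_TOP_N=0 means "use the original top_k".
    """
    env = os.environ if env is None else env
    # Assumption: an unset variable behaves like the default of 0.
    top_n = int(env.get("RETRIEVAL_TOP_N", "0"))
    # Assumption: negative values are treated like 0 (fall back to top_k).
    return top_n if top_n > 0 else top_k


print(effective_top_n(4, {"RETRIEVAL_TOP_N": "0"}))  # falls back to top_k
print(effective_top_n(4, {"RETRIEVAL_TOP_N": "2"}))  # uses top_n
```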

Development

Successfully merging this pull request may close these issues.

The Rerank model in RAG needs to support independent score_threshold and top_k
2 participants