The update process for the dataset_keyword_tables table is extremely slow. #11146

Open
5 tasks done
imaben opened this issue Nov 26, 2024 · 1 comment
Labels
💪 enhancement New feature or request

imaben commented Nov 26, 2024

Self Checks

  • This is only for bug report, if you would like to ask a question, please head to Discussions.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please be sure to submit your issue in English, otherwise it will be closed. Thank you! :)
  • Please do not modify this template :) and fill in all the required fields.

Dify version

0.10

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

image

As the number of documents in the knowledge base grows, indexing becomes exceptionally slow. After investigating, I traced the main bottleneck to updates of the dataset_keyword_tables table. My initial findings suggest the cause is that each update is a full update rather than an incremental one. Is it possible to disable write operations to this table, or are there any plans to improve this in a future release?

✔️ Expected Behavior

No response

❌ Actual Behavior

No response


dosubot bot commented Nov 26, 2024

Hey @imaben! 👋 I'm here to help you with any bugs, questions, or contributions you have in mind. Let's work together to resolve this issue.

I found a similar unsolved discussion titled How to disable keywords?, which might be relevant to your issue [1].

Regarding your question about disabling write operations to the dataset_keyword_tables table, I couldn't find any existing configurations or settings in Dify that allow for this. Additionally, there are no specific details available about planned updates or improvements addressing performance issues related to this table.

If you are looking for optimizations, you might consider reviewing the indexing process, which involves keyword extraction, updating the keyword table, and saving it back to the database. This process uses a Redis lock to ensure thread safety, which might be a point to investigate for performance improvements [2].


@dosubot dosubot bot added the 💪 enhancement New feature or request label Nov 26, 2024