DeleteFlow: Do not always rebuild flow index from scratch #4016

Merged (2 commits) on Jan 20, 2025

Conversation

hillu
Contributor

@hillu hillu commented Jan 16, 2025

Filter out the deleted flow id, write index back to storage.

(If the index can't be opened, still rebuild from scratch.)

Close #4015
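
A minimal sketch of the approach described above, not the code from this PR. `FlowIndex`, `IndexStore`, `memStore`, and `RemoveFromIndex` are hypothetical names invented for illustration and do not mirror Velociraptor's real API. The index is loaded, the deleted flow id is filtered out, and the result is written back. If the index cannot be opened, the sketch falls back to a full rebuild.

```go
package main

import (
	"errors"
	"fmt"
)

// FlowIndex is a hypothetical stand-in for the stored flow index:
// an ordered list of flow ids.
type FlowIndex []string

// IndexStore is a hypothetical storage interface (an assumption,
// not Velociraptor's API).
type IndexStore interface {
	Load() (FlowIndex, error)
	Save(FlowIndex) error
	Rebuild() (FlowIndex, error) // full scan of the flows still on disk
}

// RemoveFromIndex drops flowID from the stored index. If the index
// cannot be opened, it rebuilds the index from scratch instead.
func RemoveFromIndex(store IndexStore, flowID string) error {
	index, err := store.Load()
	if err != nil {
		rebuilt, rerr := store.Rebuild()
		if rerr != nil {
			return fmt.Errorf("load failed (%v), rebuild failed: %w", err, rerr)
		}
		return store.Save(rebuilt)
	}
	filtered := make(FlowIndex, 0, len(index))
	for _, id := range index {
		if id != flowID {
			filtered = append(filtered, id)
		}
	}
	return store.Save(filtered)
}

// memStore is an in-memory IndexStore used only for illustration.
type memStore struct {
	index  FlowIndex // currently stored index
	onDisk FlowIndex // what a full rebuild would find
	broken bool      // simulates an index that cannot be opened
}

func (m *memStore) Load() (FlowIndex, error) {
	if m.broken {
		return nil, errors.New("index unreadable")
	}
	return m.index, nil
}

func (m *memStore) Save(idx FlowIndex) error {
	m.index, m.broken = idx, false
	return nil
}

func (m *memStore) Rebuild() (FlowIndex, error) { return m.onDisk, nil }
```

With a store holding `F.1`, `F.2`, `F.3`, calling `RemoveFromIndex(store, "F.2")` leaves `F.1` and `F.3` in the index without scanning storage.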

@scudette
Contributor

This is great! I was actually planning to debounce this operation using the notification service.

This will ensure that rebuilding the index occurs at most once every few seconds. Usually there are two use cases for deleting flows:

  1. The user deletes the flow in the GUI. This needs to be synchronous because once the delete operation completes, the GUI will refresh the flows list and needs to see the flow removed from it.
  2. A user deletes a lot of flows programmatically (e.g. in VQL). In this case we can delay the index rebuild to some future time when all the flows have been deleted.
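
To illustrate the debounce idea for the bulk-deletion case, here is a rough sketch. It uses a plain timer rather than the notification service, and `Debouncer`, `NewDebouncer`, and `countRebuilds` are hypothetical names, not existing Velociraptor code. Many triggers in quick succession collapse into a single deferred rebuild.

```go
package main

import (
	"sync"
	"sync/atomic"
	"time"
)

// Debouncer coalesces a burst of Trigger calls into one call of run,
// fired `delay` after the first Trigger of the burst.
type Debouncer struct {
	mu      sync.Mutex
	pending bool
	delay   time.Duration
	run     func()
}

func NewDebouncer(delay time.Duration, run func()) *Debouncer {
	return &Debouncer{delay: delay, run: run}
}

// Trigger schedules run unless a call is already pending.
func (d *Debouncer) Trigger() {
	d.mu.Lock()
	defer d.mu.Unlock()
	if d.pending {
		return
	}
	d.pending = true
	time.AfterFunc(d.delay, func() {
		// Clear the flag first so deletions arriving during the
		// rebuild schedule another one.
		d.mu.Lock()
		d.pending = false
		d.mu.Unlock()
		d.run()
	})
}

// countRebuilds simulates deleting n flows in a tight loop and
// reports how many index rebuilds actually ran.
func countRebuilds(n int) int32 {
	var rebuilds int32
	d := NewDebouncer(100*time.Millisecond, func() {
		atomic.AddInt32(&rebuilds, 1) // stand-in for the real rebuild
	})
	for i := 0; i < n; i++ {
		d.Trigger() // one trigger per deleted flow
	}
	time.Sleep(500 * time.Millisecond) // wait for the pending rebuild
	return atomic.LoadInt32(&rebuilds)
}
```

The GUI case would bypass the debouncer and update the index before returning, since the refreshed flow list must not show the deleted flow.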

This solution is good because it removes one flow at a time, so it is synchronous, but it might have issues with concurrent deletion because there is no lock. Rebuilding the index always produces a correct result because it reflects the files that are on disk at the time the index is rebuilt.
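
One way to picture the locking concern is the sketch below. It assumes a single process and an in-memory index, and `LockedIndex` and `removeConcurrently` are hypothetical names. Holding a mutex across the whole read, filter, and write sequence keeps two concurrent deletions from overwriting each other's result. A real deployment with several processes sharing storage would need a lock that spans them, which this sketch does not attempt.

```go
package main

import (
	"fmt"
	"sync"
)

// LockedIndex is a hypothetical in-memory flow index guarded by a mutex.
type LockedIndex struct {
	mu  sync.Mutex
	ids []string
}

// Remove filters flowID out of the index. The lock is held across the
// whole read-filter-write sequence to prevent lost updates.
func (l *LockedIndex) Remove(flowID string) {
	l.mu.Lock()
	defer l.mu.Unlock()
	kept := make([]string, 0, len(l.ids))
	for _, id := range l.ids {
		if id != flowID {
			kept = append(kept, id)
		}
	}
	l.ids = kept
}

// Len returns the number of flow ids currently in the index.
func (l *LockedIndex) Len() int {
	l.mu.Lock()
	defer l.mu.Unlock()
	return len(l.ids)
}

// removeConcurrently deletes n flows from n goroutines and returns
// how many index entries remain afterwards.
func removeConcurrently(n int) int {
	idx := &LockedIndex{}
	for i := 0; i < n; i++ {
		idx.ids = append(idx.ids, fmt.Sprintf("F.%d", i))
	}
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			idx.Remove(fmt.Sprintf("F.%d", i))
		}(i)
	}
	wg.Wait()
	return idx.Len()
}
```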

There is probably a crossover point: deleting a lot of flows will benefit from a delayed rebuild of the index, while deleting a few flows might be faster by rewriting the index incrementally. Ideally this decision can be propagated to the caller. Maybe replacing the "really do it" bool with an options struct that also specifies whether to delete synchronously or incrementally would allow us to be smart about how to delete in the most optimal way?
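
As a sketch of what such an options struct might look like: all names here (`IndexUpdateMode`, `DeleteFlowOptions`, `OptionsForBatch`) are hypothetical, and the `crossover` threshold is an assumed, unmeasured value that the caller supplies.

```go
package main

// IndexUpdateMode says how the flow index is brought up to date
// after a flow is deleted.
type IndexUpdateMode int

const (
	// IndexIncremental rewrites the index without the deleted flow
	// before the delete call returns.
	IndexIncremental IndexUpdateMode = iota
	// IndexDeferred leaves the index to a later, debounced rebuild.
	IndexDeferred
)

// DeleteFlowOptions is a hypothetical replacement for the bool flag.
type DeleteFlowOptions struct {
	ReallyDoIt  bool
	IndexUpdate IndexUpdateMode
}

// OptionsForBatch picks an index update mode from the batch size.
// crossover is an assumed threshold; its real value would have to be
// measured.
func OptionsForBatch(batchSize, crossover int, reallyDoIt bool) DeleteFlowOptions {
	mode := IndexIncremental
	if batchSize >= crossover {
		mode = IndexDeferred
	}
	return DeleteFlowOptions{ReallyDoIt: reallyDoIt, IndexUpdate: mode}
}
```

A GUI delete of a single flow would get `IndexIncremental`, while a bulk delete above the threshold would get `IndexDeferred`.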

hillu and others added 2 commits January 20, 2025 07:23
Filter out the deleted flow id, write index back to storage.

(If the index can't be opened, still rebuild from scratch.)

Close Velocidex#4015
This supports the mass-deletion pattern, with indexes rebuilt
periodically but not very frequently.
@scudette scudette force-pushed the optimize-delete-flows branch from 4bc336f to d754d0b January 20, 2025 01:11
@scudette scudette merged commit 24a9b57 into Velocidex:master Jan 20, 2025
3 checks passed
Linked issue: Deleting flows is unusably slow
2 participants