Speed up the replace non-lifecycled classification queries #3292

Open
camallen opened this issue Apr 6, 2020 · 1 comment

Comments

camallen commented Apr 6, 2020

https://github.com/zooniverse/Panoptes/blob/00cd74003b99458447feb0442b87e1b5e5e32c79/app/workers/requeue_classifications_worker.rb#L24

This query is timing out and needs optimization work. According to the planner, the query uses the partial index on `lifecycled_at IS NULL`, and when run without ANALYZE the planner's estimates look good. Running it with ANALYZE is very slow.

We could limit how much of the table the query has to search by using a known end point in the table (a classification id) up to which all records have already been lifecycled. We could compute this last known lifecycled id offset without forcing a full scan of the table, perhaps every hour (or as part of this worker job), and store it in a db record for future use.
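A rough sketch of the offset idea, using plain Ruby with an in-memory array standing in for the classifications table. The `Classification` struct and the `advance_watermark` and `non_lifecycled_ids` helpers are hypothetical names for illustration, not existing Panoptes code; in the real worker these would presumably be ActiveRecord queries bounded by `id > offset`.

```ruby
# Hypothetical in-memory stand-in for a row of the classifications table.
Classification = Struct.new(:id, :lifecycled_at)

# Advance the stored offset: walk rows above the previous offset in id order
# and stop at the first one that has not been lifecycled. Every row with an
# id at or below the returned value is known to be lifecycled.
def advance_watermark(rows, watermark)
  candidates = rows.select { |row| row.id > watermark }.sort_by(&:id)
  candidates.each do |row|
    break if row.lifecycled_at.nil?
    watermark = row.id
  end
  watermark
end

# Find non-lifecycled rows, looking only at ids above the stored offset
# instead of searching the whole table.
def non_lifecycled_ids(rows, watermark)
  rows.select { |row| row.id > watermark && row.lifecycled_at.nil? }.map(&:id)
end

rows = [
  Classification.new(1, Time.now),
  Classification.new(2, Time.now),
  Classification.new(3, nil),
  Classification.new(4, Time.now),
  Classification.new(5, nil)
]

watermark = advance_watermark(rows, 0)       # would be persisted in a db record
pending = non_lifecycled_ids(rows, watermark)
puts "offset=#{watermark} pending=#{pending.inspect}"
```

Because each run starts from the previously stored offset, the cost of advancing it should depend on the number of newly lifecycled rows rather than on the size of the table.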

camallen commented Sep 8, 2020

Another option is to run this query on the db replica and ensure the worker returns quickly if the classification is already lifecycled (the replica can be behind the primary, so some of the ids it returns will be stale).
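A minimal sketch of that guard, assuming two plain hashes of `id => lifecycled_at` as stand-ins for the replica and the primary. The `stale_candidate_ids` and `lifecycle_if_needed` helpers are hypothetical and only illustrate the control flow; they are not existing worker code.

```ruby
# Candidate ids come from the (possibly lagging) replica, so some of them
# may already have been lifecycled on the primary.
def stale_candidate_ids(replica)
  replica.select { |_id, lifecycled_at| lifecycled_at.nil? }.keys
end

# The per-classification step re-checks the primary and returns early when
# there is nothing left to do, so stale ids from the replica stay cheap.
def lifecycle_if_needed(primary, id, now: Time.now)
  return :skipped unless primary.key?(id)
  return :skipped unless primary[id].nil? # already lifecycled on the primary

  primary[id] = now
  :lifecycled
end

primary = { 1 => Time.now, 2 => nil }
replica = { 1 => nil, 2 => nil } # the replica has not yet seen id 1 lifecycled

stale_candidate_ids(replica).each do |id|
  puts "#{id}: #{lifecycle_if_needed(primary, id)}"
end
```

The early return also makes the step safe to run more than once for the same id, which matters if the same stale id is picked up by consecutive runs.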
