Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of Active Storage lookups #2826 #3428

Merged
merged 6 commits into from
Apr 22, 2024

Conversation

jefferya
Copy link
Contributor

@jefferya jefferya commented Mar 27, 2024

Address, partially, #2826

Active Storage attachments database table appearing in the slow query log.

Requires: a bin/rails db:migration

The change adds an index on the record_id of the active_storage_attachments table to improve performance. The index is added via a DB migration that concurrently added the index while not locking the DB table. Quoting the PostgreSQL documentation:
"CONCURRENTLY When this option is used, PostgreSQL will build the index without taking any locks that prevent concurrent inserts, updates, or deletes on the table; whereas a standard index build locks out writes (but not reads) on the table until it's done. There are several caveats to be aware of when using this option — see Building Indexes Concurrently"

When the active_storage_attachments contains a significant number of records (10k), queries on the table by record_id are causing a sequential DB scan on the table. By adding an index, performance improves on queries like the following from the PostgreSQL slow query log:

explain analyze 
  SELECT "active_storage_attachments".* 
  FROM "active_storage_attachments" 
  WHERE
    "active_storage_attachments"."record_type" = 'Item'
    AND "active_storage_attachments"."name" = 'files'
    AND "active_storage_attachments"."record_id" = '0625726d-cc85-40cd-8c54-7e7e65d06df6';

#2826 (comment)

* add index on active_storage_attachments.record_id
@@ -0,0 +1,6 @@
class ActiveStorageAttachments < ActiveRecord::Migration[7.1]
disable_ddl_transaction!
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disable transaction as incompatible with "concurrent" index creation.

disable_ddl_transaction!
def change
add_index :active_storage_attachments, :record_id, algorithm: :concurrently, if_not_exists: true
end
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uses "CREATE INDEX CONCURRENTLY" to avoid locking the DB table. For details: https://www.postgresql.org/docs/current/sql-createindex.html#SQL-CREATEINDEX-CONCURRENTLY

ConnorSheremeta
ConnorSheremeta previously approved these changes Apr 2, 2024
Copy link
Contributor

@ConnorSheremeta ConnorSheremeta left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! I added the schema changes from the migration and did some reformatting in the changelog. I can confirm that the addition of this index has had an impact on queries that were previously slow to complete. I ran a batch ingest locally with 1000 theses and logged sql queries that were not performant through the auto_explain module setting the min_duration threshold, I then added this index and noted that there were no queries in the logs from auto_explain using the same min_duration threshold after the batch ingest had completed a second time (ensured caches were cleared.) Will leave this up for a couple more days just in case there is another review on this and then will merge

lagoan
lagoan previously approved these changes Apr 4, 2024
Copy link
Contributor

@lagoan lagoan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ConnorSheremeta ConnorSheremeta dismissed stale reviews from lagoan and themself via e7246cc April 19, 2024 22:19
@jefferya jefferya changed the title [WIP] Improve performance of Active Storage lookups #2826 Improve performance of Active Storage lookups #2826 Apr 22, 2024
Copy link
Contributor

@lagoan lagoan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@ConnorSheremeta ConnorSheremeta merged commit 9ed276d into master Apr 22, 2024
4 checks passed
@ConnorSheremeta ConnorSheremeta deleted the 2826_db_performance branch April 22, 2024 18:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants