Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improved performance fetching posts #20460

Merged
merged 7 commits into from
Jun 26, 2024
Merged

Improved performance fetching posts #20460

merged 7 commits into from
Jun 26, 2024

Conversation

9larsons
Copy link
Contributor

ref https://linear.app/tryghost/issue/ONC-111

  • added composite index to posts_tags for post_id,tag_id for faster lookup
  • added composite index to posts for updated_at; this is commonly used by get helpers on the front end to display data like the latest posts

In testing, this provided a very dramatic improvement for simple get helper requests like 'filter="id:-{{post.id}}+tag:sampleTag" limit="3"' which are by default sorted by updated_at desc. I'm not entirely clear why when sorting by published_at we do not need a composite index - so far it doesn't seem to be necessary. This should cover the primary cases for get helpers - the latest posts with a given tag or set of tags.

ref https://linear.app/tryghost/issue/ONC-111
- added composite index to posts_tags for post_id,tag_id for faster lookup
- added composite index to posts for updated_at; this is commonly used by get helpers on the front end to display data like the latest posts

In testing, this provided a very dramatic improvement for simple get helper requests like 'filter="id:-{{post.id}}+tag:sampleTag" limit="3"' which are by default sorted by updated_at desc. I'm not entirely clear why when sorting by published_at we do not need a composite index - so far it doesn't seem to be necessary. This should cover the primary cases for get helpers - the latest posts with a given tag or set of tags.
@9larsons 9larsons requested a review from mike182uk June 25, 2024 13:00
@github-actions github-actions bot added the migration [pull request] Includes migration for review label Jun 25, 2024
Copy link
Contributor

It looks like this PR contains a migration 👀
Here's the checklist for reviewing migrations:

General requirements

  • Satisfies idempotency requirement (both up() and down())
  • Does not reference models
  • Filename is in the correct format (and correctly ordered)
  • Targets the next minor version
  • All code paths have appropriate log messages
  • Uses the correct utils
  • Contains a minimal changeset
  • Does not mix DDL/DML operations

Schema changes

  • Both schema change and related migration have been implemented
  • For index changes: has been performance tested for large tables
  • For new tables/columns: fields use the appropriate predefined field lengths
  • For new tables/columns: field names follow the appropriate conventions
  • Does not drop a non-alpha table outside of a major version

Data changes

  • Mass updates/inserts are batched appropriately
  • Does not loop over large tables/datasets
  • Defends against missing or invalid data
  • For settings updates: follows the appropriate guidelines

@9larsons
Copy link
Contributor Author

@daniellockyer @mike182uk I removed the previous PR's type, status index as it's redundant with the new type, status,updated_at. I was unable to find why MySQL's optimizer seems to be ok leveraging the type, status index with the published_at index but will not do the same for the updated_at index. Removing that migration should be fine as it doesn't look to have been released.

This led to the following get helper for 125k posts going from 12s to 20ms when the query was run on the db. See here for more details. The overall get helper time is not that fast because there's a lot of other queries run at the same time (relations, member check, etc).

{{#get "posts" filter="tags:{{primary_tag.slug}}+id:-{{id}}" limit="3" as |related_posts|}}

@9larsons
Copy link
Contributor Author

I'm not sure why the test is failing with the following. I can add and drop the index locally just fine within TablePlus.

Error: alter table posts_tags drop index posts_tags_post_id_tag_id_index - Cannot drop index 'posts_tags_post_id_tag_id_index': needed in a foreign key constraint

@9larsons
Copy link
Contributor Author

Ok, the issue is that MySQL will change the foreign key-generated index to the composite index, likely as a sort of optimization, but it is not an irreversible procedure. We can create an index on posts_id on the down function of the migration (for rollback) to avoid the trouble, though this gives us another problem as our schema doesn't indicate that the post_id column is indexed.

@ErisDS
Copy link
Member

ErisDS commented Jun 26, 2024

Do we have a sense of how long this will take to run on a large table?

@9larsons
Copy link
Contributor Author

@ErisDS I ran this locally with 125k rows in posts and ~500k rows in posts_tags and it only took ~5s. I can run this on a staging site w/ that data and check the timing if you'd like before merging this.

@9larsons
Copy link
Contributor Author

@ErisDS This only took 8s on a staging site 👍. Same large dataset.

@9larsons
Copy link
Contributor Author

Confirmed this index is helping with get helpers on staging after running.

@9larsons 9larsons merged commit 2e593eb into main Jun 26, 2024
22 checks passed
@9larsons 9larsons deleted the improve-get-helpers branch June 26, 2024 21:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
migration [pull request] Includes migration for review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants