Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add side_by_side mode to update_materialized_view #430

Merged
merged 2 commits into from
Jan 15, 2025
Merged

Conversation

calebhearth
Copy link
Contributor

This adds a side_by_side kwarg to the update_materialized_view method, which builds the new view alongside the old one and then atomically swaps them to reduce downtime at the cost of increasing disk usage. It is plumbed through to migrations as a hash value for the materialized kwarg of update_view.

Fixes #387

@mjankowski
Copy link

Does usage/docs/README/generator need to list this option?

I was able to figure out how/where to add the option from looking at the diff here, might not be obvious to the general public of materialized view operators?

Comment on lines 180 to 182
rename_materialized_view(name, drop_name)
rename_materialized_view(new_name, name)
drop_materialized_view(drop_name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we:

  1. rename existing view
  2. rename new view
  3. drop existing view

instead of:

  1. drop existing view
  2. rename new view

In the first case, the view is no longer available to the application anyway. Is it that drop is potentially slow?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure. OP may know; I was just rearranging

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To guarantee that if you're outside a transaction, there's no potential outage window where the view is missing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to be inside a transaction. Really not wild about the state we'd leave someone in if this wasn't inside a transaction. But maybe I'm completely wrong. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed - we should be in transactions when performing side_by_side migrations.

Postgres supports SAVEPOINT but it appears that we'd still need to be inside a transaction for it to be useful, so I'm thinking that doesn't help here.

lib/scenic/index.rb Outdated Show resolved Hide resolved
lib/scenic/index.rb Outdated Show resolved Hide resolved
lib/scenic/index.rb Outdated Show resolved Hide resolved
@derekprior
Copy link
Contributor

I've made a very sizable refactor to something I'm happier to maintain. Don't absolutely love the object names introduced here, but like the factoring. Commit message explains much of my thinking:

The initial implementation of side_by_side materialized view creation
worked but had a couple of issues that needed to be resolved and I
wanted to refactor the code for better maintainability.

  • We had postgres-specific things in the Scenic::Index class, which is
    not part of the adapter API. The code was refactored to not rely on
    adding the schema name to this object.
  • Index migration is different from index reapplication, and it felt
    like we were reusing IndexReapplication just to get access to the
    SAVEPOINT functionality in that class. I extracted IndexCreation
    which is now used by IndexReapplication and our new class,
    IndexMigration.
  • Side-by-side logic was moved to a class of its own, SideBySide, for
    encapsulation purposes.
  • Instead of conditionally hashing the view name in the case where the
    temporary name overflows the postgres identifier limit, we now always
    hash the temporary object names. This just keeps the code simpler and
    observed behavior from the outside identical no matter identifier
    length. This behavior is tested in the new TemporaryName class.
  • Removed rename_materialized_view from the public API on the adapter,
    as I'd like to make sure that's something we want separate from this
    before we do something like that.
  • Added connection to the public adapter UI for ease of use from our
    helper objects. Documented as internal use only.
  • I added a number of tests for new and previously existing code.

Still to do:

  1. I think we should consider enforcing that side_by_side updates are
    done in a transaction. Feels like it would be really bad if we failed
    somewhere after renaming indexes on the current view.
  2. We should consider adding a final check to make sure no indexes or
    views are left behind that have our temporary name prefix. But I
    think that's probably being paranoid?
  3. README documentation and generator support

@calebhearth @Roguelazer @mjankowski - Any thoughts? I haven't tried this in a real app yet, so another to do is to make sure the output when doing this is sufficient and looks good.

@derekprior
Copy link
Contributor

derekprior commented Jan 2, 2025

I've improved test coverage, added generator support. Still to do:

  • Ensure we're in a transaction for this migration unless anyone thinks this is a bad idea?
  • Update README
  • Test in a real app

@calebhearth
Copy link
Contributor Author

calebhearth commented Jan 2, 2025 via email

@renchap
Copy link

renchap commented Jan 2, 2025

Don’t all migrations run in transactions

You can disable transactions for some migrations, and this is sometimes very useful if you have migrations that can lock a table for a long time. We do this in Mastodon for example.

@derekprior
Copy link
Contributor

@renchap would you want to disable it with side_by_side materialized view migrations?

I view this as an atomic operation. Failing in the middle could leave you, for example, with nonsense index names on the original view that eventually find their way into your schema.rb, but not your coworkers. I don't want those support requests and I think the fact that we're now refreshing before we do anything that requires locking should alleviate the transaction concern.

My options are:

  1. Raise an error if it's called without an open transaction present already. This leaves it to the user to either remove disable_ddl_transaction! or to manually start a transaction in the part of the migration that updates the materialized view.
  2. Silently start a transaction if there isn't one started already.

@renchap
Copy link

renchap commented Jan 2, 2025

I would prefer your first solution. It would fail with an explicit error message on development so the developer can understand what happens and adjust the migration. Starting a transaction silently can very easily create production outages in my experience.

@derekprior
Copy link
Contributor

This is now ready. I updated the output when running side by side updates, made a change to require an open transaction, added an end-to-end test, and more. I also ran this in my own sample app in several different ways.

@renchap @mjankowski any chance either of you could give it a whirl on Mastodon and let me know how it works out?

Would love to get this merged with some more confidence.

@mjankowski
Copy link

Updated my branch to use latest from this PR, added new migration with SIDE BY SIDE enabled, see output like:

== 20250103141121 UpdateGlobalFollowRecommendationsToVersion3: migrating ======
-- update_view(:global_follow_recommendations, {:version=>3, :revert_to_version=>2, :materialized=>{:side_by_side=>true}})
   -> temporary materialized view _scenic_sbs_53c2caf36ce95257a84e529f90957dc0d96c27c5 has been created
   -> indexes on 'global_follow_recommendations' have been renamed to avoid collisions
   -> index 'index_global_follow_recommendations_on_account_id' on '_scenic_sbs_53c2caf36ce95257a84e529f90957dc0d96c27c5' has been created
   -> materialized view global_follow_recommendations has been dropped
   -> temporary materialized view _scenic_sbs_53c2caf36ce95257a84e529f90957dc0d96c27c5 has been renamed to global_follow_recommendations
   -> 0.0056s
== 20250103141121 UpdateGlobalFollowRecommendationsToVersion3: migrated (0.0056s) 

(and tailing full log shows operations happening -- create new view with unique name, rename index, create index, drop old view, rename new view to old view name).

Still have not run with any meaningful large table size or contemplated how changing code and changing view defs around same time would have to interact with deployments/etc - but thats slightly out of scope for the narrow feature here.

@derekprior
Copy link
Contributor

Thanks! I updated the logging to make sure we're quoting table names consistently. But I think this is ready for a squash and merge. I'll probably get to that this evening.

Then it's a matter of release. I kind of think the tsort changes merged a while back but unreleased changes might necessitate a 2.0 release, in which case there are other things I'd like to get in too. 🤔

Still have not run with any meaningful large table size or contemplated how changing code and changing view defs around same time would have to interact with deployments/etc - but thats slightly out of scope for the narrow feature here.

Yeah, I was thinking about that yesterday. I don't know if this changes anything meaningfully around migration/deployment practices. But if you come up with anything, let me know.

This adds a `side_by_side` kwarg to the `update_materialized_view`
method, which builds the new view alongside the old one and then
atomically swaps them to reduce downtime at the cost of increasing disk
usage. It is plumbed through to migrations as a hash value for the
`materialized` kwarg of `update_view`.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot reviewed 7 out of 22 changed files in this pull request and generated no comments.

Files not reviewed (15)
  • lib/generators/scenic/view/templates/db/migrate/update_view.erb: Evaluated as low risk
  • spec/scenic/statements_spec.rb: Evaluated as low risk
  • lib/scenic/adapters/postgres.rb: Evaluated as low risk
  • lib/scenic/adapters/postgres/errors.rb: Evaluated as low risk
  • lib/scenic/adapters/postgres/index_creation.rb: Evaluated as low risk
  • lib/scenic/adapters/postgres/index_migration.rb: Evaluated as low risk
  • lib/scenic/adapters/postgres/index_reapplication.rb: Evaluated as low risk
  • lib/scenic/adapters/postgres/side_by_side.rb: Evaluated as low risk
  • lib/scenic/adapters/postgres/temporary_name.rb: Evaluated as low risk
  • spec/scenic/command_recorder_spec.rb: Evaluated as low risk
  • spec/generators/scenic/view/view_generator_spec.rb: Evaluated as low risk
  • lib/scenic/statements.rb: Evaluated as low risk
  • spec/acceptance/user_manages_views_spec.rb: Evaluated as low risk
  • lib/generators/scenic/materializable.rb: Evaluated as low risk
  • spec/scenic/adapters/postgres_spec.rb: Evaluated as low risk
Comments suppressed due to low confidence (2)

spec/scenic/adapters/postgres/side_by_side_spec.rb:5

  • The test coverage for the side_by_side mode should include edge cases such as failure in temporary view creation, index migration failure, and inability to drop the original view.
describe Postgres::SideBySide, :db, :silence do

spec/scenic/adapters/postgres/index_migration_spec.rb:18

  • Add an assertion to check if the original indexes are renamed to avoid collisions.
expect(indexes_for_original.first.index_name).not_to eq "hi_greeting_idx"

Tip: Turn on automatic Copilot reviews for this repository to get quick feedback on every pull request. Learn more

Copy link
Contributor Author

@calebhearth calebhearth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great overall. A couple of minor nits but I think this could ship as-is if you're feeling good about it.

lib/generators/scenic/materializable.rb Show resolved Hide resolved
lib/generators/scenic/materializable.rb Show resolved Hide resolved
lib/scenic/adapters/postgres/errors.rb Outdated Show resolved Hide resolved
lib/scenic/adapters/postgres/side_by_side.rb Show resolved Hide resolved
The initial implementation of side_by_side materialized view creation
worked but had a couple of issues that needed to be resolved and I
wanted to refactor the code for better maintainability.

* We had postgres-specific things in the `Scenic::Index` class, which is
  not part of the adapter API. The code was refactored to not rely on
  adding the schema name to this object.
* Index migration is different from index reapplication, and it felt
  like we were reusing `IndexReapplication` just to get access to the
  `SAVEPOINT` functionality in that class. I extracted `IndexCreation`
  which is now used by `IndexReapplication` and our new class,
  `IndexMigration`.
* Side-by-side logic was moved to a class of its own, `SideBySide`, for
  encapsulation purposes.
* Instead of conditionally hashing the view name in the case where the
  temporary name overflows the postgres identifier limit, we now always
  hash the temporary object names. This just keeps the code simpler and
  observed behavior from the outside identical no matter identifier
  length. This behavior is tested in the new `TemporaryName` class.
* Removed `rename_materialized_view` from the public API on the adapter,
  as I'd like to make sure that's something we want separate from this
  before we do something like that.
* Added `connection` to the public adapter UI for ease of use from our
  helper objects. Documented as internal use only.
* Require a transaction in order to use `side_by_side`. This prevents
  leaving the user in a weird state that would be difficult to recover
  from.
* Added `--side-by-side` (and `--side_by_side`) support to the
  `scenic:view` generator. Also added `--no-data` as an alias for the
  existing `--no_data` while I was at it.
* I added a number of tests for new and previously existing code
  throughout, including an acceptance level test for `side_by_side`. Our
  test coverage should be much improved.
* Updated README with information on `side_by_side`.

Here's a sample of the output from running a `side_by_side` update:

```
== 20250102191533 UpdateSearchesToVersion3: migrating =========================
-- update_view(:searches, {version: 3, revert_to_version: 2, materialized: {side_by_side: true}})
   -> temporary materialized view _scenic_sbs_8a03f467c615b126f59617cc510d2abd41296834 has been created
   -> indexes on 'searches' have been renamed to avoid collisions
   -> index 'index_searches_on_content' on '_scenic_sbs_8a03f467c615b126f59617cc510d2abd41296834' has been created
   -> index 'index_searches_on_user_id' on '_scenic_sbs_8a03f467c615b126f59617cc510d2abd41296834' has been created
   -> materialized view searches has been dropped
   -> temporary materialized view _scenic_sbs_8a03f467c615b126f59617cc510d2abd41296834 has been renamed to searches
   -> 0.0299s
== 20250102191533 UpdateSearchesToVersion3: migrated (0.0300s) ================
```
@derekprior derekprior merged commit a9f2cff into main Jan 15, 2025
7 checks passed
@derekprior derekprior deleted the less-locks branch January 15, 2025 15:38
@cpb
Copy link

cpb commented Feb 1, 2025

FWIW I trialed this in a staging environment and on a new view serving 2 req/s perceived no downtime for rematerializations that took 30 sec or 45 minutes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants