Refactor side_by_side materialized view creation
The initial implementation of side_by_side materialized view creation
worked, but it had a couple of issues that needed to be resolved, and I
wanted to refactor the code for better maintainability.

* We had postgres-specific things in the `Scenic::Index` class, which is
  not part of the adapter API. The code was refactored to not rely on
  adding the schema name to this object.
* Index migration is different from index reapplication, and it felt
  like we were reusing `IndexReapplication` just to get access to the
  `SAVEPOINT` functionality in that class. I extracted `IndexCreation`
  which is now used by `IndexReapplication` and our new class,
  `IndexMigration`.
* Side-by-side logic was moved to a class of its own, `SideBySide`, for
  encapsulation purposes.
* Instead of hashing the view name only when the temporary name would
  overflow the postgres identifier limit, we now always hash the temporary
  object names. This keeps the code simpler and makes the behavior observed
  from the outside identical regardless of identifier length. This behavior
  is tested in the new `TemporaryName` class.
* Removed `rename_materialized_view` from the public API on the adapter.
  I'd like to decide separately from this change whether a public rename
  method is something we want before adding one.
* Added `connection` to the public adapter API for ease of use from our
  helper objects. Documented as internal use only.
* Require a transaction in order to use `side_by_side`. This prevents
  leaving the user in a weird state that would be difficult to recover
  from.
* Added `--side-by-side` (and `--side_by_side`) support to the
  `scenic:view` generator. Also added `--no-data` as an alias for the
  existing `--no_data` while I was at it.
* I added a number of tests for new and previously existing code
  throughout, including an acceptance level test for `side_by_side`. Our
  test coverage should be much improved.
* Updated README with information on `side_by_side`.
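
The always-hash naming mentioned in the list above can be sketched roughly as
follows. This is an illustration under assumptions, not the code from this
commit: `TemporaryNameSketch` is a made-up name, the `_scenic_sbs_` prefix is
taken from the sample output in this message, and SHA1 is only a guess based on
the 40-character hex suffix seen there.

```ruby
require "digest/sha1"

# Hypothetical sketch; the real TemporaryName class may differ.
class TemporaryNameSketch
  PREFIX = "_scenic_sbs_"    # prefix seen in the sample migration output
  MAX_IDENTIFIER_LENGTH = 63 # postgres identifier limit

  def initialize(name)
    @name = name.to_s
  end

  # Always hash the original name, no matter how short or long it is, so
  # the temporary name has the same shape and length for every view.
  def to_s
    PREFIX + Digest::SHA1.hexdigest(@name)
  end
end

short_name = TemporaryNameSketch.new("searches").to_s
long_name = TemporaryNameSketch.new("a" * 200).to_s

# Both names are 52 characters long (12 for the prefix, 40 for the digest),
# which stays under the 63 character limit.
puts short_name.length
puts long_name.length
```

Because the digest has a fixed length, no conditional on the length of the
original name is needed.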

Here's a sample of the output from running a `side_by_side` update:

```
== 20250102191533 UpdateSearchesToVersion3: migrating =========================
-- update_view(:searches, {version: 3, revert_to_version: 2, materialized: {side_by_side: true}})
   -> temporary materialized view _scenic_sbs_8a03f467c615b126f59617cc510d2abd41296834 has been created
   -> indexes on 'searches' have been renamed to avoid collisions
   -> index 'index_searches_on_content' on '_scenic_sbs_8a03f467c615b126f59617cc510d2abd41296834' has been created
   -> index 'index_searches_on_user_id' on '_scenic_sbs_8a03f467c615b126f59617cc510d2abd41296834' has been created
   -> materialized view searches has been dropped
   -> temporary materialized view _scenic_sbs_8a03f467c615b126f59617cc510d2abd41296834 has been renamed to searches
   -> 0.0299s
== 20250102191533 UpdateSearchesToVersion3: migrated (0.0300s) ================
```
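
The order of operations in that output, together with the transaction
requirement, can be sketched as below. This is a simplified illustration and
not the `SideBySide` class from this commit: `RecordingConnection` and
`side_by_side_update` are hypothetical names, the SQL is abbreviated, and index
renaming and re-creation are reduced to a comment.

```ruby
# Hypothetical stand-in for a database connection that only records SQL.
class RecordingConnection
  attr_reader :statements

  def initialize(in_transaction:)
    @in_transaction = in_transaction
    @statements = []
  end

  def transaction_open?
    @in_transaction
  end

  def execute(sql)
    @statements << sql
  end
end

# Sketch of the side-by-side flow: build the new view under a temporary
# name, then drop the original and rename the temporary view into place.
def side_by_side_update(connection, name:, temporary_name:, definition:)
  # Refuse to run outside a transaction so that a failure part way through
  # cannot leave the database with a half-swapped view.
  unless connection.transaction_open?
    raise "side_by_side requires an open transaction"
  end

  connection.execute("CREATE MATERIALIZED VIEW #{temporary_name} AS #{definition}")
  # Index renaming and re-creation on the temporary view would happen here.
  connection.execute("DROP MATERIALIZED VIEW #{name}")
  connection.execute("ALTER MATERIALIZED VIEW #{temporary_name} RENAME TO #{name}")
end

connection = RecordingConnection.new(in_transaction: true)
side_by_side_update(
  connection,
  name: "searches",
  temporary_name: "_scenic_sbs_tmp",
  definition: "SELECT 1 AS id"
)
puts connection.statements
```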
derekprior committed Jan 13, 2025
1 parent cc92ba6 commit a2a9be5
Showing 23 changed files with 644 additions and 180 deletions.
73 changes: 55 additions & 18 deletions README.md
@@ -91,7 +91,8 @@ hierarchies of dependent views.

Scenic offers a `replace_view` schema statement, resulting in a `CREATE OR
REPLACE VIEW` SQL query which will update the supplied view in place, retaining
all dependencies. Materialized views cannot be replaced in this fashion.
all dependencies. Materialized views cannot be replaced in this fashion, though
the `side_by_side` update strategy may yield similar results (see below).

You can generate a migration that uses the `replace_view` schema statement by
passing the `--replace` option to the `scenic:view` generator:
@@ -137,7 +138,7 @@ end
```

Scenic even provides a `scenic:model` generator that is a superset of
`scenic:view`. It will act identically to the Rails `model` generator except
`scenic:view`. It will act identically to the Rails `model` generator except
that it will create a Scenic view migration rather than a table migration.

There is no special base class or mixin needed. If desired, any code the model
@@ -185,6 +186,44 @@ you would need to refresh view B first, then right after refresh view A. If you
would like this cascading refresh of materialized views, set `cascade: true`
when you refresh your materialized view.

## Can I update the definition of a materialized view without dropping it?

No, but Scenic can help you approximate this behavior with its `side_by_side`
update strategy.

Generally, changing the definition of a materialized view requires dropping it
and recreating it, either without data or with a non-concurrent refresh. The
materialized view will be locked for selects during the refresh process, which
can cause problems in your application if the refresh is not fast.

The `side_by_side` update strategy prepares the new version of the view under a
temporary name. This includes copying the indexes from the original view and
refreshing the data. Once prepared, the original view is dropped and the new
view is renamed to the original view's name. This process minimizes the time the
view is locked for selects at the cost of additional disk space.

You can generate a migration that uses the `side_by_side` strategy by passing
the `--side-by-side` option to the `scenic:view` generator:

```sh
$ rails generate scenic:view search_results --materialized --side-by-side
create db/views/search_results_v02.sql
create db/migrate/[TIMESTAMP]_update_search_results_to_version_2.rb
```

The migration will look something like this:

```ruby
class UpdateSearchResultsToVersion2 < ActiveRecord::Migration
  def change
    update_view :search_results,
      version: 2,
      revert_to_version: 1,
      materialized: { side_by_side: true }
  end
end
```

## I don't need this view anymore. Make it go away.

Scenic gives you `drop_view` too:
@@ -234,18 +273,18 @@ It's our experience that maintaining a library effectively requires regular use
of its features. We're not in a good position to support MySQL, SQLite or other
database users.

Scenic *does* support configuring different database adapters and should be
Scenic _does_ support configuring different database adapters and should be
extendable with adapter libraries. If you implement such an adapter, we're happy
to review and link to it. We're also happy to make changes that would better
accommodate adapter gems.

We are aware of the following existing adapter libraries for Scenic which may
meet your needs:

* [`scenic_sqlite_adapter`](<https://github.com/pdebelak/scenic_sqlite_adapter>)
* [`scenic-mysql_adapter`](<https://github.com/EmpaticoOrg/scenic-mysql_adapter>)
* [`scenic-sqlserver-adapter`](<https://github.com/ClickMechanic/scenic_sqlserver_adapter>)
* [`scenic-oracle_adapter`](<https://github.com/cdinger/scenic-oracle_adapter>)
- [`scenic_sqlite_adapter`](https://github.com/pdebelak/scenic_sqlite_adapter)
- [`scenic-mysql_adapter`](https://github.com/EmpaticoOrg/scenic-mysql_adapter)
- [`scenic-sqlserver-adapter`](https://github.com/ClickMechanic/scenic_sqlserver_adapter)
- [`scenic-oracle_adapter`](https://github.com/cdinger/scenic-oracle_adapter)

Please note that the maintainers of Scenic make no assertions about the
quality or security of the above adapters.
@@ -255,26 +294,24 @@ quality or security of the above adapters.
### Used By

Scenic is used by some popular open source Rails apps:
[Mastodon](<https://github.com/mastodon/mastodon/>),
[Code.org](<https://github.com/code-dot-org/code-dot-org>), and
[Lobste.rs](<https://github.com/lobsters/lobsters/>).
[Mastodon](https://github.com/mastodon/mastodon/),
[Code.org](https://github.com/code-dot-org/code-dot-org), and
[Lobste.rs](https://github.com/lobsters/lobsters/).

### Related projects

- [`fx`](<https://github.com/teoljungberg/fx>) Versioned database functions and
- [`fx`](https://github.com/teoljungberg/fx) Versioned database functions and
triggers for Rails


### Media

Here are a few posts we've seen discussing Scenic:

- [Announcing Scenic - Versioned Database Views for Rails](<https://thoughtbot.com/blog/announcing-scenic--versioned-database-views-for-rails>) by Derek Prior for thoughtbot
- [Effectively Using Materialized Views in Ruby on Rails](<https://pganalyze.com/blog/materialized-views-ruby-rails>) by Leigh Halliday for pganalyze
- [Optimizing String Concatenation in Ruby on Rails](<https://dev.to/pimp_my_ruby/from-slow-to-lightning-fast-optimizing-string-concatenation-in-ruby-on-rails-28nk>)
- [Materialized Views In Ruby On Rails With Scenic](<https://www.ideamotive.co/blog/materialized-views-ruby-rails-scenic>) by Dawid Karczewski for Ideamotive
- [Using Scenic and SQL views to aggregate data](<https://dev.to/weareredlight/using-scenic-and-sql-views-to-aggregate-data-226k>) by André Perdigão for Redlight Software

- [Announcing Scenic - Versioned Database Views for Rails](https://thoughtbot.com/blog/announcing-scenic--versioned-database-views-for-rails) by Derek Prior for thoughtbot
- [Effectively Using Materialized Views in Ruby on Rails](https://pganalyze.com/blog/materialized-views-ruby-rails) by Leigh Halliday for pganalyze
- [Optimizing String Concatenation in Ruby on Rails](https://dev.to/pimp_my_ruby/from-slow-to-lightning-fast-optimizing-string-concatenation-in-ruby-on-rails-28nk)
- [Materialized Views In Ruby On Rails With Scenic](https://www.ideamotive.co/blog/materialized-views-ruby-rails-scenic) by Dawid Karczewski for Ideamotive
- [Using Scenic and SQL views to aggregate data](https://dev.to/weareredlight/using-scenic-and-sql-views-to-aggregate-data-226k) by André Perdigão for Redlight Software

### Maintainers

28 changes: 27 additions & 1 deletion lib/generators/scenic/materializable.rb
@@ -14,7 +14,14 @@ module Materializable
type: :boolean,
required: false,
desc: "Adds WITH NO DATA when materialized view creates/updates",
default: false
default: false,
aliases: ["--no-data"]
class_option :side_by_side,
type: :boolean,
required: false,
desc: "Uses side-by-side strategy to update materialized view",
default: false,
aliases: ["--side-by-side"]
class_option :replace,
type: :boolean,
required: false,
@@ -35,6 +42,25 @@ def replace_view?
def no_data?
options[:no_data]
end

def side_by_side?
options[:side_by_side]
end

def materialized_view_update_options
set_options = {no_data: no_data?, side_by_side: side_by_side?}
.select { |_, v| v }

if set_options.empty?
"true"
else
string_options = set_options.reduce("") do |memo, (key, value)|
memo + "#{key}: #{value}, "
end

"{ #{string_options.chomp(", ")} }"
end
end
end
end
end
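
To make the strings produced for the migration template concrete, here is a
standalone approximation of `materialized_view_update_options`. It is a sketch
rather than the generator code itself: the flags are passed in as keyword
arguments instead of being read from `options`, and `map`/`join` is used in
place of `reduce`/`chomp`, which should produce the same strings.

```ruby
# Standalone sketch of the option formatting used by the generator.
# The keyword arguments stand in for the no_data? and side_by_side? helpers.
def materialized_view_update_options(no_data:, side_by_side:)
  # Keep only the options that were actually switched on.
  set_options = {no_data: no_data, side_by_side: side_by_side}
    .select { |_, value| value }

  # With no options set, the template simply gets `materialized: true`.
  return "true" if set_options.empty?

  pairs = set_options.map { |key, value| "#{key}: #{value}" }
  "{ #{pairs.join(", ")} }"
end

puts materialized_view_update_options(no_data: false, side_by_side: false)
# prints: true
puts materialized_view_update_options(no_data: true, side_by_side: false)
# prints: { no_data: true }
puts materialized_view_update_options(no_data: false, side_by_side: true)
# prints: { side_by_side: true }
puts materialized_view_update_options(no_data: true, side_by_side: true)
# prints: { no_data: true, side_by_side: true }
```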
@@ -5,7 +5,7 @@ class <%= migration_class_name %> < <%= activerecord_migration_class %>
<%= method_name %> <%= formatted_plural_name %>,
version: <%= version %>,
revert_to_version: <%= previous_version %>,
materialized: <%= no_data? ? "{ no_data: true }" : true %>
materialized: <%= materialized_view_update_options %>
<%- else -%>
<%= method_name %> <%= formatted_plural_name %>, version: <%= version %>, revert_to_version: <%= previous_version %>
<%- end -%>
56 changes: 15 additions & 41 deletions lib/scenic/adapters/postgres.rb
@@ -4,6 +4,10 @@
require_relative "postgres/indexes"
require_relative "postgres/views"
require_relative "postgres/refresh_dependencies"
require_relative "postgres/side_by_side"
require_relative "postgres/index_creation"
require_relative "postgres/index_migration"
require_relative "postgres/temporary_name"

module Scenic
# Scenic database adapters.
@@ -22,8 +26,6 @@ module Adapters
# The methods are documented here for insight into specifics of how Scenic
# integrates with Postgres and the responsibilities of {Adapters}.
class Postgres
MAX_IDENTIFIER_LENGTH = 63

# Creates an instance of the Scenic Postgres adapter.
#
# This is the default adapter for Scenic. Configuring it via
@@ -169,17 +171,9 @@ def update_materialized_view(name, sql_definition, no_data: false, side_by_side:
raise_unless_materialized_views_supported

if side_by_side
session_id = Time.now.to_i
new_name = generate_name name, "new_#{session_id}"
drop_name = generate_name name, "drop_#{session_id}"
IndexReapplication.new(connection: connection).on_side_by_side(
name, new_name, session_id
) do
create_materialized_view(new_name, sql_definition, no_data: no_data)
end
rename_materialized_view(name, drop_name)
rename_materialized_view(new_name, name)
drop_materialized_view(drop_name)
SideBySide
.new(adapter: self, name: name, definition: sql_definition)
.update
else
IndexReapplication.new(connection: connection).on(name) do
drop_materialized_view(name)
@@ -202,20 +196,6 @@ def drop_materialized_view(name)
execute "DROP MATERIALIZED VIEW #{quote_table_name(name)};"
end

# Renames a materialized view from {name} to {new_name}
#
# @param name The existing name of the materialized view in the database.
# @param new_name The new name to which it should be renamed
# @raise [MaterializedViewsNotSupportedError] if the version of Postgres
# in use does not support materialized views.
#
# @return [void]
def rename_materialized_view(name, new_name)
raise_unless_materialized_views_supported
execute "ALTER MATERIALIZED VIEW #{quote_table_name(name)} " \
"RENAME TO #{quote_table_name(new_name)};"
end

# Refreshes a materialized view from its SQL schema.
#
# This is typically called from application code via {Scenic.database}.
@@ -286,15 +266,19 @@ def populated?(name)
end
end

# A decorated ActiveRecord connection object with some Scenic-specific
# methods. Not intended for direct use outside of the Postgres adapter.
#
# @api private
def connection
Connection.new(connectable.connection)
end

private

attr_reader :connectable
delegate :execute, :quote_table_name, to: :connection

def connection
Connection.new(connectable.connection)
end

def raise_unless_materialized_views_supported
unless connection.supports_materialized_views?
raise MaterializedViewsNotSupportedError
@@ -315,16 +299,6 @@ def refresh_dependencies_for(name, concurrently: false)
concurrently: concurrently
)
end

def generate_name(base, suffix)
candidate = "#{base}_#{suffix}"
if candidate.size <= MAX_IDENTIFIER_LENGTH
candidate
else
digest_length = MAX_IDENTIFIER_LENGTH - suffix.size - 1
"#{Digest::SHA256.hexdigest(base)[0...digest_length]}_#{suffix}"
end
end
end
end
end
68 changes: 68 additions & 0 deletions lib/scenic/adapters/postgres/index_creation.rb
@@ -0,0 +1,68 @@
module Scenic
module Adapters
class Postgres
# Used to resiliently create indexes on a materialized view. If the index
# cannot be applied to the view (e.g. the columns don't exist any longer),
# we log that information and continue rather than raising an error. It is
# left to the user to judge whether the index is necessary and recreate
# it.
#
# Used when updating a materialized view to ensure the new version has all
# appropriate indexes.
#
# @api private
class IndexCreation
# Creates the index creation object.
#
# @param connection [Connection] The connection to execute SQL against.
# @param speaker [#say] (ActiveRecord::Migration) The object used for
# logging the results of creating indexes.
def initialize(connection:, speaker: ActiveRecord::Migration.new)
@connection = connection
@speaker = speaker
end

# Creates the provided indexes. If an index cannot be created, it is
# logged and the process continues.
#
# @param indexes [Array<Scenic::Index>] The indexes to create.
#
# @return [void]
def try_create(indexes)
Array(indexes).each(&method(:try_index_create))
end

private

attr_reader :connection, :speaker

def try_index_create(index)
success = with_savepoint(index.index_name) do
connection.execute(index.definition)
end

if success
say "index '#{index.index_name}' on '#{index.object_name}' has been created"
else
say "index '#{index.index_name}' on '#{index.object_name}' is no longer valid and has been dropped."
end
end

def with_savepoint(name)
connection.execute("SAVEPOINT #{name}")
yield
connection.execute("RELEASE SAVEPOINT #{name}")
true
rescue
connection.execute("ROLLBACK TO SAVEPOINT #{name}")
false
end

def say(message)
subitem = true
speaker.say(message, subitem)
end
end
end
end
end
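
The savepoint handling in `IndexCreation` can be exercised outside of a
database with a small fake. The sketch below is illustrative only:
`FakeConnection` is a hypothetical stand-in that records SQL and raises on one
chosen statement, and `with_savepoint` is rewritten as a free-standing method
that takes the connection as an argument.

```ruby
# Hypothetical connection double: records every statement and raises when
# asked to execute the statement configured as failing.
class FakeConnection
  attr_reader :log

  def initialize(failing_sql: nil)
    @failing_sql = failing_sql
    @log = []
  end

  def execute(sql)
    @log << sql
    raise "simulated database error" if sql == @failing_sql
  end
end

# Runs the block inside a savepoint. Returns true when the block succeeds
# and false when it raises, rolling back to the savepoint so the enclosing
# transaction remains usable.
def with_savepoint(connection, name)
  connection.execute("SAVEPOINT #{name}")
  yield
  connection.execute("RELEASE SAVEPOINT #{name}")
  true
rescue StandardError
  connection.execute("ROLLBACK TO SAVEPOINT #{name}")
  false
end

good = FakeConnection.new
good_result = with_savepoint(good, "idx") { good.execute("CREATE INDEX ok") }

bad = FakeConnection.new(failing_sql: "CREATE INDEX broken")
bad_result = with_savepoint(bad, "idx") { bad.execute("CREATE INDEX broken") }

puts good_result
puts bad_result
```

Rolling back to the savepoint rather than aborting is what lets one invalid
index be skipped while the rest of the migration continues in the same
transaction.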