Skip to content

Commit

Permalink
add continuous aggregate refresh policies (#49)
Browse files Browse the repository at this point in the history
* add continuous aggregate refresh policies

* adding comment on migration to explain how often cagg refresh policies are run and how far they look back
yuenmichelle1 authored Nov 7, 2023
1 parent a8a100f commit 20a7ec2
Showing 2 changed files with 52 additions and 1 deletion.
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
# frozen_string_literal: true

# These refresh policies have a start_offset of 3 days, because that is the smallest possible starting offset (anything smaller will get a `refresh window is too small` db error)
# These refresh policies will look back at the associated tables (classification_events, comment_events, or classification_user_groups) for the past 3 days through the last hour of when the refresh job is run, and re-compute the continuous aggregate.

# These refresh policies run everyday, if we want to change the frequency refresh policies are run, we will need to update the `scheduling_interval` of the refresh policy

# Eg.
#
# Refresh Continuous Aggregate Job is run today Nov 6th 1:47PM CST, the job will look at the classification_events table for the past 3 days (Nov 3rd) through the past hour (Nov 6th 12:47 PM CST), and will check for any changes in the classification_events table that have NOT been calculated in previous refresh jobs and then update the caggs with updated calculations.

# More on refresh policies for continuous aggregates found here:
# https://docs.timescale.com/api/latest/continuous-aggregates/add_continuous_aggregate_policy/

class AddContinuousAggregateRefreshPolicies < ActiveRecord::Migration[7.0]
# we have to disable the migration transaction because creating materialized views within it is not allowed.
disable_ddl_transaction!
def change
execute <<~SQL
SELECT add_continuous_aggregate_policy('daily_classification_count', start_offset => INTERVAL '3 days', end_offset => INTERVAL '1 hour', schedule_interval => INTERVAL '1 hour');
SELECT add_continuous_aggregate_policy('daily_classification_count_per_project', start_offset => INTERVAL '3 days', end_offset => INTERVAL '1 hour', schedule_interval => INTERVAL '1 day');
SELECT add_continuous_aggregate_policy('daily_classification_count_per_workflow', start_offset => INTERVAL '3 days', end_offset => INTERVAL '1 hour', schedule_interval => INTERVAL '1 day');
SELECT add_continuous_aggregate_policy('daily_comment_count', start_offset => INTERVAL '3 days', end_offset => INTERVAL '1 hour', schedule_interval => INTERVAL '1 day');
SELECT add_continuous_aggregate_policy('daily_comment_count_per_project_and_user', start_offset => INTERVAL '3 days', end_offset => INTERVAL '1 hour', schedule_interval => INTERVAL '1 day');
SELECT add_continuous_aggregate_policy('daily_comment_count_per_user', start_offset => INTERVAL '3 days', end_offset => INTERVAL '1 hour', schedule_interval => INTERVAL '1 day');
SELECT add_continuous_aggregate_policy('daily_group_classification_count_and_time', start_offset => INTERVAL '3 days', end_offset => INTERVAL '1 hour', schedule_interval => INTERVAL '1 day');
SELECT add_continuous_aggregate_policy('daily_group_classification_count_and_time_per_project', start_offset => INTERVAL '3 days', end_offset => INTERVAL '1 hour', schedule_interval => INTERVAL '1 day');
SELECT add_continuous_aggregate_policy('daily_group_classification_count_and_time_per_user', start_offset => INTERVAL '3 days', end_offset => INTERVAL '1 hour', schedule_interval => INTERVAL '1 day');
SELECT add_continuous_aggregate_policy('daily_group_classification_count_and_time_per_user_per_project', start_offset => INTERVAL '3 days', end_offset => INTERVAL '1 hour', schedule_interval => INTERVAL '1 day');
SELECT add_continuous_aggregate_policy('daily_group_classification_count_and_time_per_user_per_workflow', start_offset => INTERVAL '3 days', end_offset => INTERVAL '1 hour', schedule_interval => INTERVAL '1 day');
SELECT add_continuous_aggregate_policy('daily_group_classification_count_and_time_per_workflow', start_offset => INTERVAL '3 days', end_offset => INTERVAL '1 hour', schedule_interval => INTERVAL '1 day');
SELECT add_continuous_aggregate_policy('daily_user_classification_count_and_time', start_offset => INTERVAL '3 days', end_offset => INTERVAL '1 hour', schedule_interval => INTERVAL '1 day');
SELECT add_continuous_aggregate_policy('daily_user_classification_count_and_time_per_project', start_offset => INTERVAL '3 days', end_offset => INTERVAL '1 hour', schedule_interval => INTERVAL '1 day');
SELECT add_continuous_aggregate_policy('daily_user_classification_count_and_time_per_workflow', start_offset => INTERVAL '3 days', end_offset => INTERVAL '1 hour', schedule_interval => INTERVAL '1 day');
SQL
end
end
2 changes: 1 addition & 1 deletion db/schema.rb
Original file line number Diff line number Diff line change
@@ -10,7 +10,7 @@
#
# It's strongly recommended that you check this file into your version control system.

ActiveRecord::Schema[7.0].define(version: 2023_10_23_161237) do
ActiveRecord::Schema[7.0].define(version: 2023_11_06_170457) do
# These are extensions that must be enabled in order to support this database
enable_extension "plpgsql"
enable_extension "timescaledb"

0 comments on commit 20a7ec2

Please sign in to comment.