# Replica Expiration

- Associated issues/PRs: [#26029](https://github.com/MaterializeInc/materialize/issues/26029), [#29587](https://github.com/MaterializeInc/materialize/pull/29587)

## The Problem

Temporal filters currently require Materialize to maintain all future retractions
of data that is currently visible. For long windows, the retractions could be at
timestamps beyond our next scheduled restart, for example during our weekly
DB releases.

For instance, in the example below, the temporal filter in the `last_30_sec`
view causes two diffs to be generated for every row inserted into `events`: the
row itself and a retraction 30 seconds later. However, if the replica is
restarted within the next 30 seconds, the retraction diff is never processed,
making it unnecessary to generate and store that extra diff.

```sql
-- Create a table of timestamped events.
CREATE TABLE events (
    content TEXT,
    event_ts TIMESTAMP
);
-- Create a view of events from the last 30 seconds.
CREATE VIEW last_30_sec AS
SELECT event_ts, content
FROM events
WHERE mz_now() <= event_ts + INTERVAL '30s';

INSERT INTO events VALUES ('hello', now());

COPY (SUBSCRIBE (SELECT event_ts, content FROM last_30_sec)) TO STDOUT;
1686868190714 1 2023-06-15 22:29:50.711 hello -- now()
1686868220712 -1 2023-06-15 22:29:50.711 hello -- now() + 30s
```

## Success Criteria

Diffs associated with timestamps beyond a set expiration time (mainly the
retractions generated by temporal filters) are dropped without affecting
correctness, resulting in lower memory utilization.

## Out of Scope

- Dataflows whose timeline type is not `Timeline::EpochMillis`. We rely on the
frontier timestamp being comparable to the wall-clock time of the replica.

## Solution Proposal

We introduce a new LaunchDarkly feature flag that allows us to configure a
replica expiration time per environment. When the feature flag is enabled
(non-zero offset), replicas filter out diffs whose timestamps lie beyond the
expiration time. To ensure correctness, a replica panics if its frontier
exceeds the expiration time before the replica is restarted.
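
As a rough, hypothetical sketch of how the expiration time could be derived
from the configured offset (illustrative names only, not the actual Materialize
code):

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

// Wall-clock time as milliseconds since the Unix epoch, matching the
// `Timeline::EpochMillis` timestamp domain.
fn now_millis() -> u64 {
    SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .expect("system clock is before the Unix epoch")
        .as_millis() as u64
}

// Derive the replica expiration timestamp from the configured offset.
// A zero offset means the feature is disabled and no expiration applies.
fn replica_expiration(offset: Duration) -> Option<u64> {
    if offset.is_zero() {
        None
    } else {
        Some(now_millis() + offset.as_millis() as u64)
    }
}
```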

More concretely, we make the following changes:

* Introduce a new dyncfg `compute_replica_expiration` to set an offset `Duration`.
* If the offset is configured with a non-zero value, compute the
`replica_expiration` time as `now() + offset`. This value specifies the maximum
time for which the replica is expected to be running. Consequently, diffs
associated with timestamps beyond this limit do not have to be stored and can
be dropped.
* Only consider dataflows with timeline type `Timeline::EpochMillis`. For
these dataflows, propagate `replica_expiration` to the existing `until`
checks in `mfp.evaluate()`, so that any data beyond the expiration time is
filtered out (a simplified sketch follows this list).
* To ensure correctness, add checks in `Context::export_index` that panic and
stop the replica if its frontier exceeds the expiration time. This prevents
the replica from serving results that may be missing data dropped due to
expiration.
* If the expiration time is exceeded, the replica panics and restarts with a
new expiration limit computed as the offset from the new start time. After
the restart, data whose timestamps fall within the new limit is not filtered,
so correctness is maintained.
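
The list above can be made concrete with a simplified, hypothetical sketch of
the two enforcement points: tightening the dataflow's `until` bound with the
expiration time, and panicking once the frontier reaches that time. Timestamps
are modeled as plain `u64` epoch milliseconds rather than Materialize's actual
frontier types, and `effective_until`, `keep_diff`, and `check_expiration` are
illustrative names, not the real functions.

```rust
// The effective `until` is the stricter (smaller) of the dataflow's own bound
// and the replica expiration time; `None` means "unbounded".
fn effective_until(dataflow_until: Option<u64>, expiration: Option<u64>) -> Option<u64> {
    match (dataflow_until, expiration) {
        (Some(u), Some(e)) => Some(u.min(e)),
        (Some(u), None) => Some(u),
        (None, e) => e,
    }
}

// Keep only diffs whose timestamp is strictly below the effective `until`;
// diffs at or beyond it (e.g. far-future retractions) are dropped.
fn keep_diff(ts: u64, until: Option<u64>) -> bool {
    until.map_or(true, |u| ts < u)
}

// Once the frontier reaches the expiration time, the replica may be missing
// dropped retractions, so it panics to force a restart instead of serving
// potentially incorrect results.
fn check_expiration(frontier_ts: u64, expiration: Option<u64>) {
    if let Some(e) = expiration {
        assert!(
            frontier_ts < e,
            "replica expiration ({e}) exceeded by frontier ({frontier_ts})"
        );
    }
}

fn main() {
    // Hypothetical expiration timestamp in epoch milliseconds.
    let expiration = Some(1_700_000_000_000_u64);
    let until = effective_until(None, expiration);

    // A retraction scheduled past the expiration is dropped; earlier data is kept.
    assert!(!keep_diff(1_700_000_100_000, until));
    assert!(keep_diff(1_699_999_999_999, until));

    // While the frontier stays below the expiration, the replica keeps running.
    check_expiration(1_699_000_000_000, expiration);
    println!("expiration checks passed");
}
```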

## Alternatives

-

## Open Questions

- What is the appropriate default expiration time?
  - Given that we currently restart replicas every week as part of the DB
    release, and leaving some buffer for a skipped release, 3 weeks (plus a
    1-day margin) seems like a good limit to start with.
