-
Notifications
You must be signed in to change notification settings - Fork 465
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feature-benchmark: Add MaterializedViewSink #30762
base: main
Are you sure you want to change the base?
Conversation
If the append operator discards a batch because it can already not be appended anymore, we should call `delete()` on the batch before dropping it. This doesn't actually change any behavior, since batch deletion is disable in production currently, but it avoids triggering the WARN log in `Batch::drop`.
This commit makes the MV sink always perform a consolidation of (the relevant parts of) the correction buffer after it has received all updates for the snapshot, both from `desired` and `persist`. We know that these updates will cancel out, so we have an opportunity to shed a large amount of memory by forcing a consolidation at this time. This is especially important in read-only mode when the MV sink does not produce any batches, which is the process that normally forces consolidation to happen.
misc/python/materialize/feature_benchmark/scenarios/benchmark_main.py
Outdated
Show resolved
Hide resolved
misc/python/materialize/feature_benchmark/scenarios/benchmark_main.py
Outdated
Show resolved
Hide resolved
> CREATE MATERIALIZED VIEW sink1_check_v AS SELECT COUNT(*) FROM sink1_check_tbl; | ||
|
||
> SELECT * FROM sink1_check_v |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is just to have a way to know when the sink has finished writing the data, and not to test MV performance, is there a reason to create the sink1_check_v
? Couldn't we just SELECT from the sink1_check_tbl
directly instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then it's what the ExactlyOnce
scenario is already doing.
Local run: NAME | TYPE | THIS | OTHER | UNIT | THRESHOLD | Regression? | 'THIS' is -------------------------------------------------------------------------------------------------------------------------------------------------------- MaterializedViewSink | wallclock | 11.270 | 31.620 | s | 10% | no | better: 2.8 times faster MaterializedViewSink | memory_mz | 3699.303 | 3870.010 | MB | 20% | no | better: 4.4% less MaterializedViewSink | memory_clusterd | 258.636 | 294.971 | MB | 50% | no | better: 12.3% less
b8525d2
to
5038023
Compare
@teskje Does this look reasonable for you? Only look at the last commit, it's based on #30758
Checklist
$T ⇔ Proto$T
mapping (possibly in a backwards-incompatible way), then it is tagged with aT-proto
label.