feature-benchmark: Add MaterializedViewSink #30762

Draft
wants to merge 3 commits into
base: main


@def- def- commented Dec 6, 2024

    NAME                                | TYPE            |      THIS       |      OTHER      |  UNIT  | THRESHOLD  |  Regression?  | 'THIS' is
    --------------------------------------------------------------------------------------------------------------------------------------------------------
    MaterializedViewSink                | wallclock       |          11.270 |          31.620 |   s    |    10%     |      no       | better:  2.8 times faster
    MaterializedViewSink                | memory_mz       |        3699.303 |        3870.010 |   MB   |    20%     |      no       | better:  4.4% less
    MaterializedViewSink                | memory_clusterd |         258.636 |         294.971 |   MB   |    50%     |      no       | better: 12.3% less
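
The "'THIS' is" column can be reproduced from the THIS and OTHER values. The following is a minimal sketch, not the feature-benchmark framework's actual code: the function names and the regression rule (lower is better, and THIS counts as a regression only if it exceeds OTHER by more than the threshold) are assumptions made for illustration.

```python
# Hypothetical helpers; names and the regression rule are assumptions,
# not taken from the feature-benchmark framework.

def times_faster(this: float, other: float) -> float:
    """How many times smaller THIS is than OTHER (lower is better)."""
    return other / this


def percent_less(this: float, other: float) -> float:
    """Relative reduction of THIS compared to OTHER, in percent."""
    return (other - this) / other * 100.0


def is_regression(this: float, other: float, threshold: float) -> bool:
    """Assumed rule: THIS is worse than OTHER by more than `threshold`."""
    return this > other * (1.0 + threshold)


# Values copied from the table above.
wallclock = (11.270, 31.620, 0.10)
memory_mz = (3699.303, 3870.010, 0.20)
memory_clusterd = (258.636, 294.971, 0.50)

print(f"wallclock: {times_faster(*wallclock[:2]):.1f} times faster")
print(f"memory_mz: {percent_less(*memory_mz[:2]):.1f}% less")
print(f"memory_clusterd: {percent_less(*memory_clusterd[:2]):.1f}% less")
```

Under these assumptions, none of the three rows is flagged, since THIS is lower than OTHER in every case.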

@teskje Does this look reasonable to you? Only look at the last commit; this PR is based on #30758.

Checklist

  • This PR has adequate test coverage / QA involvement has been duly considered. (trigger-ci for additional test/nightly runs)
  • This PR has an associated up-to-date design doc, is a design doc (template), or is sufficiently small to not require a design.
  • If this PR evolves an existing $T ⇔ Proto$T mapping (possibly in a backwards-incompatible way), then it is tagged with a T-proto label.
  • If this PR will require changes to cloud orchestration or tests, there is a companion cloud PR to account for those changes that is tagged with the release-blocker label (example).
  • If this PR includes major user-facing behavior changes, I have pinged the relevant PM to schedule a changelog post.

If the append operator discards a batch because it can no longer be
appended, we should call `delete()` on the batch before dropping it.
This doesn't actually change any behavior, since batch deletion is
currently disabled in production, but it avoids triggering the WARN log
in `Batch::drop`.

This commit makes the MV sink always perform a consolidation of (the
relevant parts of) the correction buffer after it has received all
updates for the snapshot, both from `desired` and `persist`. We know
that these updates will cancel out, so we have an opportunity to shed a
large amount of memory by forcing a consolidation at this time. This is
especially important in read-only mode, when the MV sink does not produce
any batches, which is the process that normally forces consolidation to
happen.
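
To illustrate why consolidating after the snapshot frees memory, here is a minimal sketch of consolidation over `(data, time, diff)` updates. It is not the MV sink's actual correction buffer: the `consolidate` function, the tuple layout, and the assumption that `persist` updates enter the buffer with negated diffs are all illustrative.

```python
from collections import defaultdict


def consolidate(updates):
    """Sum the diffs of updates with equal (data, time); drop zero sums."""
    acc = defaultdict(int)
    for data, time, diff in updates:
        acc[(data, time)] += diff
    return sorted(
        (data, time, diff) for (data, time), diff in acc.items() if diff != 0
    )


# Hypothetical snapshot: `desired` holds what the view should contain,
# `persist` holds what has already been written to the output shard.
desired = [("a", 0, 1), ("b", 0, 1), ("c", 0, 1)]
persist = [("a", 0, 1), ("b", 0, 1), ("c", 0, 1)]

# Assumption: persist updates are added with negated diffs, so that the
# buffer represents the difference between `desired` and `persist`.
buffer = desired + [(data, time, -diff) for data, time, diff in persist]

print(len(buffer))          # 6 unconsolidated updates
print(consolidate(buffer))  # [] once matching updates have cancelled out
```

Until consolidation runs, the buffer holds both copies of the snapshot; afterwards only the genuine differences remain, which is none when the two sides agree.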
@def- def- requested a review from teskje December 6, 2024 17:39
Comment on lines +1582 to +1583
> CREATE MATERIALIZED VIEW sink1_check_v AS SELECT COUNT(*) FROM sink1_check_tbl;

> SELECT * FROM sink1_check_v
Contributor

If this is just to have a way to know when the sink has finished writing the data, and not to test MV performance, is there a reason to create the sink1_check_v? Couldn't we just SELECT from the sink1_check_tbl directly instead?

Contributor Author

Then it would be the same as what the ExactlyOnce scenario is already doing.

Local run:

NAME                                | TYPE            |      THIS       |      OTHER      |  UNIT  | THRESHOLD  |  Regression?  | 'THIS' is
--------------------------------------------------------------------------------------------------------------------------------------------------------
MaterializedViewSink                | wallclock       |          11.270 |          31.620 |   s    |    10%     |      no       | better:  2.8 times faster
MaterializedViewSink                | memory_mz       |        3699.303 |        3870.010 |   MB   |    20%     |      no       | better:  4.4% less
MaterializedViewSink                | memory_clusterd |         258.636 |         294.971 |   MB   |    50%     |      no       | better: 12.3% less
@def- def- force-pushed the pr-test-mv-sink-v2 branch from b8525d2 to 5038023 December 10, 2024 15:48
2 participants