Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AQUMV] Answer Aggregation Query Directly. #705

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Commits on Nov 21, 2024

  1. [AQUMV] Answer Aggregation Query directly.

    This commits enable answer query which has aggregation directly. Use the
    results of view has aggregations to avoid compute those from origin
    table.
    This may lead to significant efficiency gains if the SQL has a large
    amount of data.
    AQUMV will always return results immediately.
    
    If we have a valid view like:
    create materialized view mv as
      select sum(c1) as mc1, count(c2) as mc2, avg(c3) as mc3, count(*) as
    mc4
      from t where c1 > 90;
    
    SQL:
    select count(*), sum(c1), count(c2), avg(c3), abs(count(*) - 21) from t
    where c1 > 90;
    
    Could be rewritten to:
    
    select mc4, mc1, mc2, mc3, abs((mc4 - 21)) from mv;
    
    Plan:
    explain(verbose, costs off)
    select count(*), sum(c1), count(c2), avg(c3), abs(count(*) - 21) from t
    where c1 > 90;
                                        QUERY PLAN
    ------------------------------------------------------------------------
     Gather Motion 3:1  (slice1; segments: 3)
       Output: mc4, mc1, mc2, mc3, (abs((mc4 - 21)))
       ->  Seq Scan on mv
             Output: mc4, mc1, mc2, mc3, abs((mc4 - 21))
     Settings: enable_answer_query_using_materialized_views = 'on',
    optimizer = 'off'
     Optimizer: Postgres query optimizer
    (6 rows)
    
    View query with Group By is not supported yet.
    
    If some HAVING quals only exist in origin query and they could be
    computed from view query's target list, then we could keep them like
    post_quals.But as the view has aggregations, the additional quals should
    be moved to WHERE instead of HAVING.
    
    create table t(c1 int, c2 int, c3 int, c4 int);
    create materialized view mv as
      select sum(c1) as mc1, count(c2) as mc2, avg(c3) as mc3, count(*) as
    mc4
      from t where c1 > 90;
    
    SQL:
    select count(*), sum(c1) from t where c1 > 90 having abs(count(*) - 21)
    > 0 and 2 > 1 and avg(c3) > 97;
    
    Could be rewritten to (The HAVING clause has been rewritten to WHERE
    clause):
    
    select mc4, mc1 from mv where mc3 > 97 and abs(mc4 - 21) > 0;
    
    Plan:
    explain(verbose, costs off)
    select count(*), sum(c1) from t where c1 > 90 having abs(count(*) - 21)
    > 0 and 2 > 1 and avg(c3) > 97;
                                        QUERY PLAN
    ------------------------------------------------------------------------
     Gather Motion 3:1  (slice1; segments: 3)
       Output: mc4, mc1
       ->  Seq Scan on aqumv.mv
             Output: mc4, mc1
             Filter: ((mv.mc3 > '97'::numeric) AND (abs((mv.mc4 - 21)) > 0))
     Optimizer: Postgres query optimizer
    (7 rows)
    
    There are two additional HAVING quals:
    Expression: 2 > 1 (would be eliminated during planner). Expression:
    abs(count(*) - 21) > 0, it could be computed from view as:
    	abs(mc4 - 21) > 0
    
    And the new one is put to WHERE clause and acts as a Filter finally.
    
    There is a trick for ORDER BY for both origin query and view query. As
    we has no Groupy By curretly, the aggregation results would be either
    one or zero rows that make the Order By clause pointless.
    We could avoid considering the sort columns if it's a junk for view
    matching.
    
    As we have no group by for view with aggs now, the final result would be
    either one or zero row.
    LIMIT, OFFSET clause of origin query could be applied to view if there
    are consts.
    
    create incremental materialized view mv as
      select sum(c1) as mc1, count(c2) as mc2, avg(c3) as mc3, count(*) as
    mc4
      from t where c1 > 90;
    
    Query:
      select count(*), sum(c1) from t where c1 > 90 limit 2;
    
    Could be rewritten to:
      select mc4, mc1 from mv limit 2;
    
    Authored-by: Zhang Mingli [email protected]
    avamingli committed Nov 21, 2024
    Configuration menu
    Copy the full SHA
    a3b07d7 View commit details
    Browse the repository at this point in the history