ESQL: Refactor STATS substitution optimizer rules #110345
Labels: :Analytics/ES|QL, >refactoring, Team:Analytics
In the substitutions batch of our `LogicalPlanOptimizer`, there are 4 rules that take an expression like
`| STATS foo = avg(x*x) + 2`
and turn it into a simple aggregation with enclosing `EVAL`s; in this example, this becomes (essentially)
`| EVAL $$x = x*x | STATS $$foo_sum = sum($$x), $$foo_count = count($$x) | EVAL $$foo = $$foo_sum/$$foo_count | EVAL foo = $$foo + 2`
This is becoming complicated and harder to reason about because the substitutions happen in 4 rules; let's see if we can do with just 2 rules.
More specifically,
1. `ReplaceStatsNestedExpressionWithEval` turns `STATS foo = avg(x*x) + 2` into `EVAL $$x = x*x | STATS foo = avg($$x) + 2`.
2. `ReplaceStatsAggExpressionWithEval` then turns `| STATS foo = avg($$x) + 2` into `| STATS $$foo = avg($$x) | EVAL foo = $$foo + 2`.
3. `SubstituteSurrogates` replaces `| STATS $$foo = avg($$x)` by `| STATS $$foo_sum = sum($$x), $$foo_count = count($$x) | EVAL $$foo = $$foo_sum/$$foo_count`.
4. `ReplaceStatsNestedExpressionWithEval` runs again to account for changes made in `TranslateMetricsAggregate`.
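The chain of rewrites above can be sketched with a toy model. This is not the Elasticsearch implementation: the tuple-based expression encoding, the function names, and the synthetic field names (`$$arg0`, `$$agg0`) are all assumptions made for illustration, and the second run of `ReplaceStatsNestedExpressionWithEval` after `TranslateMetricsAggregate` is not modelled.

```python
# Toy model (assumption): an expression is a field name (str), a literal (int),
# or a tuple: (agg_name, arg) for aggregates, (op, left, right) for arithmetic.
AGGS = {"avg", "sum", "count"}

def is_agg(e):
    return isinstance(e, tuple) and e[0] in AGGS

def show(e):
    # Minimal renderer; it does not add parentheses for nested arithmetic.
    if not isinstance(e, tuple):
        return str(e)
    if is_agg(e):
        return f"{e[0]}({show(e[1])})"
    op, left, right = e
    return f"{show(left)} {op} {show(right)}"

def pull_nested_into_eval(stats):
    """Step 1: move non-trivial agg arguments into an EVAL before the STATS."""
    before = {}
    def rewrite(e):
        if not isinstance(e, tuple):
            return e
        if is_agg(e) and isinstance(e[1], tuple):
            name = f"$$arg{len(before)}"  # hypothetical synthetic name
            before[name] = e[1]
            return (e[0], name)
        return (e[0],) + tuple(rewrite(a) for a in e[1:])
    new_stats = {name: rewrite(e) for name, e in stats.items()}
    return before, new_stats

def extract_aggs(e, out):
    """Replace each agg call inside e by a synthetic field recorded in out."""
    if is_agg(e):
        tmp = f"$$agg{len(out)}"  # hypothetical synthetic name
        out[tmp] = e
        return tmp
    if isinstance(e, tuple):
        return (e[0],) + tuple(extract_aggs(a, out) for a in e[1:])
    return e

def eval_after_stats(stats):
    """Step 2: keep bare aggs in STATS, compute the rest in an EVAL after it."""
    new_stats, after = {}, {}
    for name, e in stats.items():
        if is_agg(e):
            new_stats[name] = e
        else:
            after[name] = extract_aggs(e, new_stats)
    return new_stats, after

def substitute_surrogates(stats):
    """Step 3: avg(f) becomes sum(f) and count(f) plus a dividing EVAL."""
    new_stats, after = {}, {}
    for name, e in stats.items():
        if e[0] == "avg":
            new_stats[f"{name}_sum"] = ("sum", e[1])
            new_stats[f"{name}_count"] = ("count", e[1])
            after[name] = ("/", f"{name}_sum", f"{name}_count")
        else:
            new_stats[name] = e
    return new_stats, after

def optimize(stats):
    before, stats = pull_nested_into_eval(stats)
    stats, after_expr = eval_after_stats(stats)
    stats, after_surrogate = substitute_surrogates(stats)
    plan = [("EVAL", before), ("STATS", stats),
            ("EVAL", after_surrogate), ("EVAL", after_expr)]
    return [(cmd, fields) for cmd, fields in plan if fields]

def render(plan):
    return " | ".join(
        cmd + " " + ", ".join(f"{n} = {show(e)}" for n, e in fields.items())
        for cmd, fields in plan
    )

query = {"foo": ("+", ("avg", ("*", "x", "x")), 2)}  # STATS foo = avg(x*x) + 2
print(render(optimize(query)))
```

For the example query this prints `EVAL $$arg0 = x * x | STATS $$agg0_sum = sum($$arg0), $$agg0_count = count($$arg0) | EVAL $$agg0 = $$agg0_sum / $$agg0_count | EVAL foo = $$agg0 + 2`, which has the same shape as the plan described in the list. Note that two of the three functions each emit their own `EVAL` after the `STATS`, which is the duplication this issue is about.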
It makes sense that there's 1 rule that creates `EVAL`s after the aggregation (`ReplaceStatsAggExpressionWithEval`) and one that pulls nested expressions out of agg functions into an `EVAL` before the aggregation (`ReplaceStatsNestedExpressionWithEval`). `SubstituteSurrogates` should only substitute and let `ReplaceStatsAggExpressionWithEval` handle creating the `EVAL` after the `STATS`.
`ReplaceStatsNestedExpressionWithEval` after `TranslateMetricsAggregate`
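One way to picture the proposed split is a surrogate substitution that only rewrites expressions in place, followed by a single rule that owns the `EVAL` after the `STATS`. The sketch below is a toy model under assumed names (`substitute_only`, `eval_after_stats`, `$$agg0`). It is not the optimizer's API, and the ordering of the two steps is an assumption rather than something this issue specifies.

```python
# Toy model (assumption): expressions are field names (str), literals (int),
# or tuples: (agg_name, arg) for aggregates, (op, left, right) for arithmetic.
AGGS = {"avg", "sum", "count"}

def is_agg(e):
    return isinstance(e, tuple) and e[0] in AGGS

def substitute_only(e):
    """Replace avg(f) by sum(f) / count(f) in place; never creates an EVAL."""
    if not isinstance(e, tuple):
        return e
    if e[0] == "avg":
        return ("/", ("sum", e[1]), ("count", e[1]))
    return (e[0],) + tuple(substitute_only(a) for a in e[1:])

def eval_after_stats(stats):
    """The single rule that creates the EVAL after the STATS."""
    new_stats, after = {}, {}

    def extract(e):
        if is_agg(e):
            tmp = f"$$agg{len(new_stats)}"  # hypothetical synthetic name
            new_stats[tmp] = e
            return tmp
        if isinstance(e, tuple):
            return (e[0],) + tuple(extract(a) for a in e[1:])
        return e

    for name, e in stats.items():
        if is_agg(e):
            new_stats[name] = e  # bare aggregates stay in the STATS
        else:
            after[name] = extract(e)
    return new_stats, after

# STATS foo = avg($$x) + 2, with the nested x*x already pulled out into $$x
stats = {"foo": ("+", ("avg", "$$x"), 2)}
substituted = {name: substitute_only(e) for name, e in stats.items()}
new_stats, after = eval_after_stats(substituted)
print(new_stats)
print(after)
```

Here the substitution leaves `foo = sum($$x) / count($$x) + 2` inside the `STATS`, and only `eval_after_stats` splits it into two aggregates plus one trailing `EVAL`, so the division and the `+ 2` end up in the same `EVAL` instead of two.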