Use CTEs instead of subqueries #1523
siljamardla
started this conversation in
Ideas
Replies: 0 comments
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
-
Idea
At the moment MF queries are fully based on subqueries.
To make the queries more efficient and easier to read, they should start by CTEs to read the data that will be needed.
For example:
Impact
Real world example: I have a saved query with 200 metrics that depends on 22 tables, but the compiled SQL has about 630 places where it reads data from a table.
By using the CTEs:
Sample compiled SQL
Here's an example where we have two metrics that are otherwise similar, but the filter value is different. And another metric that comes from another table.
currently compiles to
Notice how we query the rides order table twice.
This could instead compile to something like this (keeping the logic of filtering the whole subquery):
The example I'm giving here with different filters is making it more difficult to merge the metrics subqueries (there's
COUNT_IF (...)
available as a function, but couldn't find a correspondingSUM_IF
, at least not in Databricks). For use cases when we just need multiple columns from the same table, but used in different downstream derived metrics, it would be more obvious. I can come back here later and provide examples.Beta Was this translation helpful? Give feedback.
All reactions