[CORE] Basic runnable version of ACBO (Advanced CBO) #5058
I will keep filling in the code documentation during further development.
For now, I'll make some notes on the PR.
.github/workflows/velox_be.yml
- name: TPC-H SF1.0 && TPC-DS SF10.0 Parquet local spark3.2 with advanced CBO
  run: |
    $PATH_TO_GLUTEN_TE/$OS_IMAGE_NAME/gha/gha-checkout/exec.sh 'cd /opt/gluten/tools/gluten-it && \
    mvn clean install -Pspark-3.2 \
    && GLUTEN_IT_JVM_ARGS=-Xmx5G sbin/gluten-it.sh queries-compare \
      --local --preset=velox --benchmark-type=h --error-on-memleak --off-heap-size=10g -s=1.0 --threads=16 --iterations=1 \
      --extra-conf=spark.gluten.sql.advanced.cbo.enabled=true \
    && GLUTEN_IT_JVM_ARGS=-Xmx20G sbin/gluten-it.sh queries-compare \
      --local --preset=velox --benchmark-type=ds --error-on-memleak --off-heap-size=40g -s=10.0 --threads=32 --iterations=1 \
      --extra-conf=spark.gluten.sql.advanced.cbo.enabled=true'
CI Job for ACBO + Velox + TPC-H SF1 + TPC-DS SF10
}
}

class Cbo[T <: AnyRef] private (
Cbo is a stateless optimization context consisting of configs and utilities.
assert(!notThrew, message)
}

private def validateModels(): Unit = {
Validates the user's API implementations.
}
}

trait CboCluster[T <: AnyRef] {
CboCluster is a set of nodes sharing the same context (the so-called "logical properties") in the original input plan. One cluster can derive its own set of CboGroups. Nodes in one CboGroup share the same ("physical") properties.
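The cluster/group relationship described above can be sketched roughly as follows. This is only an illustration of one reading of the comment: `PropertySet`, `Node`, and `ToyCluster` are hypothetical names that do not appear in the PR, and the real `CboCluster` / `CboGroup` traits may be organized differently.

```scala
object ClusterSketch {
  // Hypothetical stand-in for a set of "physical" properties (e.g. ordering).
  final case class PropertySet(props: Set[String])

  // Hypothetical plan node; `logicalKey` models the shared logical context.
  final case class Node(name: String, logicalKey: String, properties: PropertySet)

  // A toy cluster: every node shares one logical key.
  final case class ToyCluster(logicalKey: String, nodes: Set[Node]) {
    require(
      nodes.forall(_.logicalKey == logicalKey),
      "all nodes must share the cluster's logical context")

    // Derive one group per distinct property set found in the cluster.
    def groups: Map[PropertySet, Set[Node]] = nodes.groupBy(_.properties)
  }

  def example(): Map[PropertySet, Set[Node]] = {
    val sorted = PropertySet(Set("sorted"))
    val any = PropertySet(Set.empty)
    val cluster = ToyCluster(
      "scan(t)",
      Set(
        Node("SortedScan", "scan(t)", sorted),
        Node("PlainScan", "scan(t)", any),
        Node("IndexScan", "scan(t)", sorted)))
    cluster.groups
  }
}
```

In this toy model, the three scans form one cluster because they are logically equivalent, and they fall into two groups because only two distinct property sets occur.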
Add comments?
I will raise an independent PR to add code documentation. The code is under frequent modification as of now.
case class CboConfig(
  plannerType: PlannerType = PlannerType.Dp
)

object CboConfig {
  sealed trait PlannerType
  object PlannerType {
    case object Exhaustive extends PlannerType
    case object Dp extends PlannerType
  }
Dp is the default planner implementation, while Exhaustive is currently used only for testing. We expect that parallelized optimization will be comparatively easier to implement on the exhaustive planner than on the dp planner in the future.
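As a small illustration of how the default could be observed, the sketch below repeats the shape of `CboConfig` from the diff inside a wrapper object and adds a `describe` helper. The helper and the wrapper object `PlannerTypeSketch` are hypothetical and exist only for this example.

```scala
object PlannerTypeSketch {
  sealed trait PlannerType
  object PlannerType {
    case object Exhaustive extends PlannerType
    case object Dp extends PlannerType
  }

  // Same shape as the CboConfig in the diff: Dp is the default planner type.
  final case class CboConfig(plannerType: PlannerType = PlannerType.Dp)

  // Hypothetical helper: reports which planner a given config would select.
  def describe(config: CboConfig): String = config.plannerType match {
    case PlannerType.Dp => "dp planner (default)"
    case PlannerType.Exhaustive => "exhaustive planner (currently test-only)"
  }
}
```

Because `PlannerType` is a sealed trait, the compiler can warn if a future planner type is added without updating the match.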
import io.glutenproject.cbo.memo.MemoStore

trait CboGroup[T <: AnyRef] {
A set of nodes that share the same property set in the same cluster.
}
}

trait CanonicalNode[T <: AnyRef] extends CboNode[T] {
A canonical node is a node with all children replaced by resident groups.
extends CanonicalNode[T]
}

trait GroupNode[T <: AnyRef] extends CboNode[T] {
A node that exactly represents a group.
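The two node kinds described in the comments above can be pictured with a toy plan type. This is a sketch under assumptions: `Plan`, `Leaf`, `Unary`, `GroupRef`, and `canonicalize` are hypothetical names, and the `groupOf` lookup stands in for whatever the memo actually uses to map a child to its group.

```scala
object CanonicalNodeSketch {
  // Hypothetical toy plan type standing in for the type parameter T.
  sealed trait Plan
  final case class Leaf(name: String) extends Plan
  final case class Unary(op: String, child: Plan) extends Plan
  // Plays the role of a group node: a placeholder that represents a whole group.
  final case class GroupRef(groupId: Int) extends Plan

  // Canonicalize one node: replace its direct child with a reference to the child's group.
  def canonicalize(plan: Plan, groupOf: Plan => Int): Plan = plan match {
    case Unary(op, child) => Unary(op, GroupRef(groupOf(child)))
    case other => other // leaves and group references have no children to replace
  }

  // A node is canonical here if every direct child is a group reference.
  def isCanonical(plan: Plan): Boolean = plan match {
    case Unary(_, _: GroupRef) => true
    case Unary(_, _) => false
    case _ => true
  }
}
```

For example, canonicalizing `Unary("Filter", Leaf("t"))` with a lookup that maps the leaf to group 7 yields `Unary("Filter", GroupRef(7))`.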
*/
package io.glutenproject.cbo

trait CboNode[T <: AnyRef] {
A single immutable node wrapper that hides the tree structure from it.
For representing tree structure, use CboPath.
trait Best[T <: AnyRef] {
  import Best._
  def rootGroupId(): Int
  def bestNodes(): Set[InGroupNode[T]]
  def winnerNodes(): Set[InGroupNode[T]]
  def costs(): InGroupNode[T] => Option[Cost]
  def path(allGroups: Int => CboGroup[T]): KnownCostPath[T]
}
Best is basically the output of one shot of planning.
trait PlannerState[T <: AnyRef] {
  def cbo(): Cbo[T]
  def memoState(): MemoState[T]
  def rootGroupId(): Int
  def best(): Best[T]
}
The immutable dump of the planner.
import scala.collection.mutable

class ForwardMemoTable[T <: AnyRef] private (override val cbo: Cbo[T])
The memo table that handles cluster merging / forwarding internally.
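One common way to implement merging with forwarding is to keep a table of redirects and resolve ids by following them. The sketch below shows that general idea only; it is an assumption about what "forwarding" means here, not the PR's `ForwardMemoTable` implementation, and `ToyForwardTable`, `resolve`, and `merge` are hypothetical names.

```scala
import scala.annotation.tailrec
import scala.collection.mutable

object ForwardingSketch {
  final class ToyForwardTable {
    // Maps a merged-away cluster id to the cluster id it now forwards to.
    private val forwards = mutable.Map.empty[Int, Int]

    // Follows forwarding entries until reaching an id that is not forwarded.
    @tailrec
    final def resolve(id: Int): Int = forwards.get(id) match {
      case Some(next) => resolve(next)
      case None => id
    }

    // Merges the cluster containing `from` into the cluster containing `to`.
    // Only unforwarded (root) ids are redirected, so no forwarding cycle can form.
    def merge(from: Int, to: Int): Unit = {
      val f = resolve(from)
      val t = resolve(to)
      if (f != t) forwards(f) = t
    }
  }
}
```

After `merge(1, 2)` followed by `merge(2, 3)`, resolving cluster 1 follows both redirects and lands on cluster 3, while an id that was never merged resolves to itself.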
def defineEquiv(node: CanonicalNode[T], newNode: T): Unit
}

trait Memo[T <: AnyRef] extends Closure[T] {
Memo is the basic structure that stores the whole search space of the planner. All the nodes stored in it are canonized.
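To make the idea of a memo holding only canonized nodes concrete, here is a deliberately simplified sketch. It assumes a hypothetical toy plan type and gives every canonized node its own group, whereas a real memo keeps several equivalent nodes per group; `ToyMemo`, `Canonical`, and `memorize` are illustrative names, not the PR's API.

```scala
import scala.collection.mutable

object MemoSketch {
  // Hypothetical toy plan type standing in for the type parameter T.
  sealed trait Plan
  final case class Leaf(name: String) extends Plan
  final case class Binary(op: String, left: Plan, right: Plan) extends Plan

  // A canonized node: an operator label plus the ids of its child groups.
  final case class Canonical(label: String, childGroups: List[Int])

  final class ToyMemo {
    private val groupOfNode = mutable.LinkedHashMap.empty[Canonical, Int]

    // Inserts a plan bottom-up and returns the group id of its root.
    // Structurally identical subtrees canonize to the same node and share a group.
    def memorize(plan: Plan): Int = {
      val canonical = plan match {
        case Leaf(name) => Canonical(name, Nil)
        case Binary(op, l, r) => Canonical(op, List(memorize(l), memorize(r)))
      }
      // New nodes get the next free group id; known nodes return their existing id.
      groupOfNode.getOrElseUpdate(canonical, groupOfNode.size)
    }

    def size: Int = groupOfNode.size
  }
}
```

Memorizing a self-join of the same leaf stores only two entries: one for the leaf and one for the join whose two children point at the same group.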
See proposal #5057.
This is the first runnable version of ACBO, with TPC-H SF 1.0 and TPC-DS SF 10.0 passing.
After this patch, one can set spark.gluten.sql.advanced.cbo.enabled=true to enable ACBO. It is disabled by default.
Issues:
TransformPreOverrides() with a rough cost model to do fallback;
The following improvements are on the way:
The required facilities for the above have already been added but are not enabled yet. They will be enabled and tested in separate PRs.