3671dd5 to 1bda13c
@clee704 Could you review the PR?
@@ -171,6 +171,17 @@ class Hyperspace(spark: SparkSession) {
    indexManager.index(indexName)
  }

  def whyNot(df: DataFrame, indexName: String = "", extended: Boolean = false)(
nit: indexName: Option[String] would make a better interface.
But this way, users need to wrap the name in Some(...), which does not look intuitive:
hyperspace.whyNot(query(leftDf, rightDf)(), Some("leftDfJoinIndex"), extended = true)
Agreed. By the way, how about supporting multiple index names? indexNames: Seq[String] = Nil
We could add additional APIs later on demand, such as one that returns a DataFrame instead of printing.
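The trade-off discussed in this thread can be sketched as follows. This is a minimal illustration, not the Hyperspace implementation: the `Request` type and the three method names are hypothetical, the `DataFrame` argument and second parameter list are left out so the snippet runs without Spark, and treating an empty name as "all indexes" is an assumption.

```scala
// Hypothetical stand-ins for the three signatures discussed in the review.
object WhyNotSignatures {
  // What a call asks for: an empty `targets` is assumed to mean "all indexes".
  final case class Request(targets: Seq[String], extended: Boolean)

  // Variant in the diff: a String with "" as the default.
  def withDefault(indexName: String = "", extended: Boolean = false): Request =
    Request(if (indexName.isEmpty) Nil else Seq(indexName), extended)

  // Reviewer's nit: Option[String]. Callers must write Some("name").
  def withOption(indexName: Option[String] = None, extended: Boolean = false): Request =
    Request(indexName.toSeq, extended)

  // Follow-up suggestion: several names, with Nil as the default.
  def withSeq(indexNames: Seq[String] = Nil, extended: Boolean = false): Request =
    Request(indexNames, extended)
}
```

All three variants can express the same request. They differ only at the call site: `withDefault("leftDfJoinIndex")` against `withOption(Some("leftDfJoinIndex"))` against `withSeq(Seq("leftDfJoinIndex"))`.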
src/main/scala/com/microsoft/hyperspace/index/covering/JoinIndexRule.scala
src/test/scala/com/microsoft/hyperspace/index/rules/ScoreBasedIndexPlanOptimizerTest.scala
src/main/scala/com/microsoft/hyperspace/index/plananalysis/FilterReason.scala
deed490 to 37c0bee
thanks!
What is the context for this pull request?
What changes were proposed in this pull request?
Introduce a whyNot API that explains why each index was not applied to a specific sub-plan of the given DataFrame.
The following is an example definition of a disqualification reason.
These "FilterReason"s will be collected for each (index entry, sub plan) pair, and the result will be printed like the following:
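The example definition and the printed output from the original description are not preserved on this page. The sketch below only illustrates the idea of collecting one reason per (index entry, sub-plan) pair and printing the result. Every name in it (`ColumnMissing`, `NoEligibleOperator`, `Entry`, `render`, the reason codes, and the column layout) is hypothetical and does not reproduce the actual `FilterReason` API.

```scala
// Hypothetical sketch; names, codes, and output layout are assumptions.
object WhyNotSketch {
  // A reason why an index was disqualified for a sub-plan.
  sealed trait FilterReason {
    def code: String
    def message: String
  }

  final case class ColumnMissing(column: String) extends FilterReason {
    val code = "COL_MISSING"
    def message: String = s"Column '$column' is not covered by the index."
  }

  final case class NoEligibleOperator(operator: String) extends FilterReason {
    val code = "NO_ELIGIBLE_OPERATOR"
    def message: String = s"Operator '$operator' is not supported by the index rule."
  }

  // One collected entry per (index, sub-plan) pair.
  final case class Entry(indexName: String, subPlan: String, reason: FilterReason)

  // Render the entries as text; an empty indexName is assumed to select every index.
  def render(entries: Seq[Entry], indexName: String = ""): String = {
    val selected =
      if (indexName.isEmpty) entries else entries.filter(_.indexName == indexName)
    selected
      .map(e => s"${e.indexName} | ${e.subPlan} | ${e.reason.code} | ${e.reason.message}")
      .mkString("\n")
  }
}
```

Modelling the reasons as a sealed trait keeps the set of disqualification reasons closed, so a renderer can pattern-match on them exhaustively if it needs reason-specific formatting.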
Follow-up PRs
Does this PR introduce any user-facing change?
Yes, a new API is introduced.
How was this patch tested?