Add a strategy to fall back to Vanilla Spark shuffle manager #1047
base: main
…ect#978) * [NSE-927] Add macro __AVX512BW__ check for different CPU architecture (oap-project#975) * Add __AVX512BW__ check * Fix cFormat * [NSE-126] set default codegen opt to O1 for branch-1.4
Thanks for opening a pull request! Could you open an issue for this pull request on Github Issues? https://github.com/oap-project/native-sql-engine/issues Then could you also rename commit message and pull request title in the following format?
Assuming the first two commits are not relevant to your patch, please do NOT include them. If your work depends on these commits, it would be better to open a dedicated PR to port them to the main branch.
# Conflicts: # native-sql-engine/core/src/main/scala/com/intel/oap/expression/ConverterUtils.scala
@@ -83,6 +83,11 @@ class GazellePluginConfig(conf: SQLConf) extends Logging {
  val enableColumnarShuffledHashJoin: Boolean =
    conf.getConfString("spark.oap.sql.columnar.shuffledhashjoin", "true").toBoolean && enableCpu

  // enable or disable fallback shuffle manager
  val enableFallbackShuffle: Boolean = conf
Can you please also add a short note on how to use this feature? Also, please make this default to false.
Added that to the PR description.
Disabled the configuration "spark.oap.sql.columnar.enableFallbackShuffle" by default.
What changes were proposed in this pull request?
Add a strategy to fall back to the vanilla Spark shuffle manager.
o Add a configuration that enables the fallback shuffle, and reuse ColumnarShuffleExchangeExec.
o Initialize the splitter iterator in the shuffle dependency, and transform its output into an RDD of Product2[Int, ColumnarBatch].
o Serialize the record batches to the shuffle writer of vanilla Spark.
How does this patch work?
When submitting an application, the native SQL engine is used with the default ColumnarShuffleManager configuration:
--conf spark.shuffle.manager=org.apache.spark.shuffle.sort.ColumnarShuffleManager
However, in some situations we want to specify a custom or different shuffle manager. To enable the vanilla Spark shuffle manager, use:
--conf spark.shuffle.manager=org.apache.spark.shuffle.sort.SortShuffleManager --conf spark.oap.sql.columnar.enableFallbackShuffle=true
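The interaction between the two settings above can be sketched in a few lines of Scala. This is only an illustration under stated assumptions, not the plugin's actual code: it models the configuration as a plain Map instead of SQLConf, the object FallbackShuffleSketch and the helper usesFallbackShuffle are hypothetical names, and the rule "fallback is taken only when the flag is true and the configured manager is not ColumnarShuffleManager" is an assumption inferred from the description. The flag is assumed to default to false, as requested in review.

```scala
// Sketch only: models the flag handling with a plain Map instead of SQLConf.
object FallbackShuffleSketch {
  // Keys and class names as they appear in the PR description.
  val ShuffleManagerKey = "spark.shuffle.manager"
  val FallbackKey = "spark.oap.sql.columnar.enableFallbackShuffle"
  val ColumnarManager = "org.apache.spark.shuffle.sort.ColumnarShuffleManager"
  val SortManager = "org.apache.spark.shuffle.sort.SortShuffleManager"

  // Hypothetical helper: true when the fallback path would be taken.
  // Assumption: the flag defaults to false, and an unset shuffle manager
  // is treated as the sort-based manager for the purpose of this sketch.
  def usesFallbackShuffle(conf: Map[String, String]): Boolean = {
    val fallbackEnabled = conf.getOrElse(FallbackKey, "false").toBoolean
    val manager = conf.getOrElse(ShuffleManagerKey, SortManager)
    fallbackEnabled && manager != ColumnarManager
  }

  def main(args: Array[String]): Unit = {
    val columnar = Map(ShuffleManagerKey -> ColumnarManager)
    val fallback = Map(ShuffleManagerKey -> SortManager, FallbackKey -> "true")
    println(usesFallbackShuffle(columnar)) // prints "false"
    println(usesFallbackShuffle(fallback)) // prints "true"
  }
}
```

With this model, leaving the flag unset keeps the existing behavior, so existing deployments would not change unless they opt in explicitly.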
How was this patch tested?
(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)