Enable CI Test on Scala 2.13 and support custom or spark-core extracted Scala version for Spark's engine #5196
Conversation
Curiously, Spark on Scala 2.12 and 2.13 produce the column lineage output in different orders here. Fixing this by ignoring the output order of column lineages, for two reasons: 1. the column order is not guaranteed to be strictly stable in the logical plan (as long as it is aligned with its children plans); 2. there is no major impact on end users of lineage plugins.
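A minimal sketch of such an order-insensitive assertion (the `ColumnLineage` shape follows the suite output quoted later in this thread; the helper names are hypothetical, not the PR's actual code):

```scala
object LineageAssertions {
  // Hypothetical shape, matching the suite output quoted in this thread.
  case class ColumnLineage(column: String, originalColumns: Set[String])

  // Compare lineage results while ignoring ordering: the column order in the
  // logical plan is not strictly stable across Scala 2.12/2.13 builds.
  def assertLineageIgnoringOrder(
      actual: Seq[ColumnLineage],
      expected: Seq[ColumnLineage]): Unit =
    assert(actual.toSet == expected.toSet, s"$actual did not equal $expected")
}
```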
```diff
@@ -62,7 +63,7 @@ trait ProcBuilder {
   def mainResource: Option[String] = {
     // 1. get the main resource jar for user specified config first
     // TODO use SPARK_SCALA_VERSION instead of SCALA_COMPILE_VERSION
-    val jarName = s"${module}_$SCALA_COMPILE_VERSION-$KYUUBI_VERSION.jar"
+    val jarName = s"${module}_$scalaRuntimeSemanticVersion-$KYUUBI_VERSION.jar"
```
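For context, a rough sketch of the jar name this interpolation produces (the constant values below are illustrative, not taken from the PR):

```scala
// Illustrative values only; the real constants come from Kyuubi's build.
val module = "kyuubi-spark-sql-engine"
val scalaRuntimeSemanticVersion = "2.13" // Scala binary version of the target engine
val KYUUBI_VERSION = "1.8.0"             // hypothetical Kyuubi release

val jarName = s"${module}_$scalaRuntimeSemanticVersion-$KYUUBI_VERSION.jar"
// => "kyuubi-spark-sql-engine_2.13-1.8.0.jar"
```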
The engine's Scala runtime version is independent of the server, e.g. it allows building a server with Scala 2.12, but using Spark with Scala 2.13.
No problem, this change is intended to make the tests pass on the same Scala version first, e.g. server on Scala 2.13 + Spark on Scala 2.13.
For the dynamic combinations, we could handle those in follow-ups.
The quick idea in my mind is extracting `SPARK_SCALA_VERSION` from the `$SPARK_HOME/jars/spark-core_{scala_binary_version}-{spark_version}.jar` filename.
Adopted.
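A minimal sketch of that extraction, assuming a standard Spark layout (object and method names hypothetical; the pattern is compiled once, in the spirit of the later follow-up commit that prevents repeated regex compilation):

```scala
import java.io.File
import scala.util.matching.Regex

object SparkScalaVersionSketch {
  // Matches e.g. "spark-core_2.13-3.4.1.jar" and captures "2.13".
  // Precompiled once rather than per call.
  private val SparkCoreJarName: Regex = """^spark-core_(\d+\.\d+)-.*\.jar$""".r

  // Hypothetical helper: derive the Scala binary version from $SPARK_HOME/jars.
  def extractScalaVersion(sparkHome: String): Option[String] = {
    val jars = Option(new File(sparkHome, "jars").listFiles()).getOrElse(Array.empty)
    jars.map(_.getName).collectFirst { case SparkCoreJarName(v) => v }
  }
}
```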
Codecov Report

```
@@ Coverage Diff @@
##           master   #5196   +/-  ##
======================================
  Coverage    0.00%    0.00%
======================================
  Files         588      588
  Lines       33406    33425    +19
  Branches     4388     4393     +5
======================================
- Misses      33406    33425    +19
```
One blocker issue remains for the Spark Hive connector plugin in CI tests. The test case "DROP NAMESPACE using V1 catalog V1 command: drop non-empty namespace with a non-cascading mode" fails on Spark 3.3/3.4 with Scala 2.13, while it passes on Spark 3.2 with Scala 2.13 and on Spark 3.1-3.4 with Scala 2.12. I think this case is meant to ensure that dropping a non-empty namespace is prevented, and both exceptions show enough evidence of that (a sketch of the scenario follows the notes below). Test log:
Related info from the driver log:
Other related facts:
Remaining doubts:
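For readers without the suite at hand, the scenario that the failing test exercises looks roughly like this (a sketch with illustrative names and a plain `Try`, not the actual test code):

```scala
import org.apache.spark.sql.SparkSession

// Sketch only: illustrative namespace/table names, local session.
val spark = SparkSession.builder().master("local[1]")
  .enableHiveSupport().getOrCreate()

spark.sql("CREATE NAMESPACE testns")
spark.sql("CREATE TABLE testns.tbl (id INT) USING parquet")

// Dropping a non-empty namespace without CASCADE is expected to fail,
// whichever concrete exception type the catalog throws.
val result = scala.util.Try(spark.sql("DROP NAMESPACE testns"))
assert(result.isFailure)
```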
I need a hand with this blocker issue in the Hive Spark connector, given the mixed runtime conditions and varying test results. Do you have time to take a look? @yikf
Hi @yikf, thanks for your support in fixing the problem in the Hive Spark connector module. Well done; now all the tests pass on all Spark versions on Scala 2.13. Could you consider moving the changes in the Hive Spark connector module to a separate PR, as the adaptation is more about the Spark version than about the Scala version? After the offline discussion with @pan3793, we would like to do the following:
Hi, all the modules (except Flink and Flink IT) now pass all the CI tests on Scala 2.13.
The PR title/description no longer matches the change; please update it before merging.
Could you take a second look at this PR? @pan3793
LGTM, with one minor comment.
Enable CI Test on Scala 2.13 and support custom or spark-core extracted Scala version for Spark's engine

### _Why are the changes needed?_

- Enable CI tests on Scala 2.13 for all modules except the Flink SQL engine
- For testing, choose an available Spark engine home in the `download` module by the `SCALA_COMPILE_VERSION` of the Kyuubi server
- Choose the Scala version of the Spark engine main resource jar in the following order:
  1. the `SPARK_SCALA_VERSION` system env
  2. the Scala version extracted from the Spark home's `spark-core` jar filename
- Fixed one assertion error in the kyuubi-spark-lineage module, as Spark on Scala 2.12 and 2.13 produce different orders of column lineage output in the `MergeIntoTable` UT

```
SparkSQLLineageParserHelperSuite:
- columns lineage extract - MergeIntoTable *** FAILED ***
  inputTables(List(v2_catalog.db.source_t))
  outputTables(List(v2_catalog.db.target_t))
  columnLineage(List(ColumnLineage(v2_catalog.db.target_t.name,Set(v2_catalog.db.source_t.name)), ColumnLineage(v2_catalog.db.target_t.price,Set(v2_catalog.db.source_t.price)), ColumnLineage(v2_catalog.db.target_t.id,Set(v2_catalog.db.source_t.id))))
  did not equal
  inputTables(List(v2_catalog.db.source_t))
  outputTables(List(v2_catalog.db.target_t))
  columnLineage(List(ColumnLineage(v2_catalog.db.target_t.id,Set(v2_catalog.db.source_t.id)), ColumnLineage(v2_catalog.db.target_t.name,Set(v2_catalog.db.source_t.name)), ColumnLineage(v2_catalog.db.target_t.price,Set(v2_catalog.db.source_t.price)))) (SparkSQLLineageParserHelperSuite.scala:182)
```

- Fixed other tests relying on Scala scripting results

### _How was this patch tested?_

- [ ] Add some test cases that check the changes thoroughly, including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before making a pull request

### _Was this patch authored or co-authored using generative AI tooling?_

Closes #5196 from bowenliang123/scala213-test.

Closes #5196

97fafac [liangbowen] prevent repeated compilation for regrex pattern
76b99d4 [Bowen Liang] test on scala-2.13

Lead-authored-by: Bowen Liang <[email protected]>
Co-authored-by: liangbowen <[email protected]>
Signed-off-by: Bowen Liang <[email protected]>
(cherry picked from commit e33df9c)
Signed-off-by: Bowen Liang <[email protected]>
…respect engine env

### _Why are the changes needed?_

Only extract the spark-core Scala version if the `SPARK_SCALA_VERSION` env is empty, and respect the engine env.

### _How was this patch tested?_

- [ ] Add some test cases that check the changes thoroughly, including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before making a pull request

### _Was this patch authored or co-authored using generative AI tooling?_

No.

Closes #5434 from turboFei/lazy_scala_version.

Closes #5196

fdccef7 [fwang12] lazy extract spark core scala version

Authored-by: fwang12 <[email protected]>
Signed-off-by: fwang12 <[email protected]>
(cherry picked from commit c60f5b7)
Signed-off-by: fwang12 <[email protected]>
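A hedged sketch of the resolution order this follow-up describes (object and method names hypothetical): respect the engine's `SPARK_SCALA_VERSION` env first, and only fall back to scanning the Spark home's jars when it is absent.

```scala
import java.io.File

object EngineScalaVersionSketch {
  // Hypothetical fallback: read the Scala binary version from the
  // spark-core jar filename under $SPARK_HOME/jars, as sketched earlier.
  private val SparkCoreJarName = """^spark-core_(\d+\.\d+)-.*\.jar$""".r

  private def fromSparkCoreJar(sparkHome: String): Option[String] = {
    val jars = Option(new File(sparkHome, "jars").listFiles()).getOrElse(Array.empty)
    jars.map(_.getName).collectFirst { case SparkCoreJarName(v) => v }
  }

  // Respect the engine env first; the jar scan only runs when the env is empty.
  def resolve(engineEnv: Map[String, String], sparkHome: String): Option[String] =
    engineEnv.get("SPARK_SCALA_VERSION").filter(_.nonEmpty)
      .orElse(fromSparkCoreJar(sparkHome))
}
```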
### Why are the changes needed?

- Enable CI tests on Scala 2.13 for all modules except the Flink SQL engine
- For testing, choose an available Spark engine home in the `download` module by the `SCALA_COMPILE_VERSION` of the Kyuubi server
- Choose the Scala version of the Spark engine main resource jar in the following order:
  1. the `SPARK_SCALA_VERSION` system env
  2. the Scala version extracted from the Spark home's `spark-core` jar filename
- Fixed one assertion error in the kyuubi-spark-lineage module, as Spark on Scala 2.12 and 2.13 produce different orders of column lineage output in the `MergeIntoTable` UT
- Fixed other tests relying on Scala scripting results

### How was this patch tested?

- [ ] Add some test cases that check the changes thoroughly, including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before making a pull request

### Was this patch authored or co-authored using generative AI tooling?