Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[KYUUBI #5380][UT] Create PySpark batch jobs tests for RESTful API #5498

Closed

Conversation

weixi62961
Copy link
Contributor

@weixi62961 weixi62961 commented Oct 23, 2023

Why are the changes needed?

To close #5380.

As PySpark jobs become popular approach for data exploring and processing, we need to create tests for creating PySpark jobs.

According the existing Spark Jar unit tests, two PySpark job unit test were added, they are all simple PI computing jobs from Spark examples.

case1, "pyspark submit - basic batch rest client with existing resource file"

It's almost same with the spark jar job test case, except the following two points:

  1. param batchType should be set to PYSPARK, not SPARK. please refer to [SPARK] Support pyspark batch job by restful api #3836 for detailed information.
  2. For PySpark job,param className is useless, should be set to null

case2, "pyspark submit - basic batch rest client with uploading resource file"

Through the two test cases, simple PySpark jobs can be submitted normally.

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible
  • Add screenshots for manual tests if appropriate
  • Run test locally before make a pull request

Was this patch authored or co-authored using generative AI tooling?

No

@codecov-commenter
Copy link

codecov-commenter commented Oct 23, 2023

Codecov Report

Merging #5498 (dc8b6bf) into master (59c25b9) will not change coverage.
Report is 10 commits behind head on master.
The diff coverage is n/a.

❗ Current head dc8b6bf differs from pull request most recent head b693efc. Consider uploading reports for the commit b693efc to get more accurate results

@@          Coverage Diff           @@
##           master   #5498   +/-   ##
======================================
  Coverage    0.00%   0.00%           
======================================
  Files         588     588           
  Lines       33466   33480   +14     
  Branches     4401    4405    +4     
======================================
- Misses      33466   33480   +14     

see 2 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

Copy link
Contributor

@bowenliang123 bowenliang123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall. Some comments to follow.

val requestObj = newBatchRequest(
batchType = PySparkJobPI.batchType,
resource = PySparkJobPI.resource.get,
className = null,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in PR descriptions, it's a key point to set the className here to null. Is it applicable to all pyspark jobs ? If so , whether we could have a follow-up PR to skip sending className the pyspark batch type?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bowenliang123
Copy link
Contributor

@weixi62961 Thank you for your contribution and patience.
Merged to master.

@weixi62961 weixi62961 deleted the unittest-batchapi-pyspark-simple branch October 25, 2023 01:13
@bowenliang123 bowenliang123 modified the milestones: v1.9.0, v1.8.0 Oct 25, 2023
bowenliang123 added a commit that referenced this pull request Oct 25, 2023
### _Why are the changes needed?_

To close #5380.

As PySpark jobs become popular approach for data exploring and processing, we need to create tests for creating PySpark jobs.

According the existing Spark Jar unit tests, two PySpark job unit test were added, they are all simple PI computing jobs from Spark examples.

#### case1, "pyspark submit - basic batch rest client with existing resource file"
It's almost same with the spark jar job test case, except the following two points:
1. param `batchType` should be set to `PYSPARK`, not `SPARK`.  please refer to #3836 for detailed information.
2. For PySpark job,param `className` is useless, should be set to null

#### case2, "pyspark submit - basic batch rest client with uploading resource file"

Through the two test cases, simple PySpark jobs can be submitted normally.

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request

### _Was this patch authored or co-authored using generative AI tooling?_

No

Closes #5498 from weixi62961/unittest-batchapi-pyspark-simple.

Closes #5380

b693efc [Bowen Liang] simplify sparkBatchTestResource
72a92b5 [Bowen Liang] Update kyuubi-server/src/test/scala/org/apache/kyuubi/server/rest/client/PySparkBatchRestApiSuite.scala
b2035a3 [weixi] remove no necessary wrapper object "PySparkJobPI"
27d12e8 [weixi] rename from BatchRestApiPySparkSuite to PySparkBatchRestApiSuite
e680e60 [weixi] Create a dedicated batch API suite for PySpark jobs.
dc8b6bf [weixi] add 2 test cases for pyspark batch job submit.

Lead-authored-by: weixi <[email protected]>
Co-authored-by: Bowen Liang <[email protected]>
Co-authored-by: Bowen Liang <[email protected]>
Signed-off-by: liangbowen <[email protected]>
(cherry picked from commit 5cff4fb)
Signed-off-by: liangbowen <[email protected]>
@bowenliang123
Copy link
Contributor

Backported this PR to branch-1.8 (1.8.0) as well. No feature change is included.

davidyuan1223 pushed a commit to davidyuan1223/kyuubi that referenced this pull request Oct 26, 2023
### _Why are the changes needed?_

To close apache#5380.

As PySpark jobs become popular approach for data exploring and processing, we need to create tests for creating PySpark jobs.

According the existing Spark Jar unit tests, two PySpark job unit test were added, they are all simple PI computing jobs from Spark examples.

#### case1, "pyspark submit - basic batch rest client with existing resource file"
It's almost same with the spark jar job test case, except the following two points:
1. param `batchType` should be set to `PYSPARK`, not `SPARK`.  please refer to apache#3836 for detailed information.
2. For PySpark job,param `className` is useless, should be set to null

#### case2, "pyspark submit - basic batch rest client with uploading resource file"

Through the two test cases, simple PySpark jobs can be submitted normally.

### _How was this patch tested?_
- [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
- [ ] Add screenshots for manual tests if appropriate
- [x] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request

### _Was this patch authored or co-authored using generative AI tooling?_

No

Closes apache#5498 from weixi62961/unittest-batchapi-pyspark-simple.

Closes apache#5380

b693efc [Bowen Liang] simplify sparkBatchTestResource
72a92b5 [Bowen Liang] Update kyuubi-server/src/test/scala/org/apache/kyuubi/server/rest/client/PySparkBatchRestApiSuite.scala
b2035a3 [weixi] remove no necessary wrapper object "PySparkJobPI"
27d12e8 [weixi] rename from BatchRestApiPySparkSuite to PySparkBatchRestApiSuite
e680e60 [weixi] Create a dedicated batch API suite for PySpark jobs.
dc8b6bf [weixi] add 2 test cases for pyspark batch job submit.

Lead-authored-by: weixi <[email protected]>
Co-authored-by: Bowen Liang <[email protected]>
Co-authored-by: Bowen Liang <[email protected]>
Signed-off-by: liangbowen <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[TASK][EASY] Create PySpark batch jobs tests for RESTful API
3 participants