Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[KYUUBI #5328] Batch supports priority scheduling #5352

Closed
wants to merge 16 commits into from
Closed

[KYUUBI #5328] Batch supports priority scheduling #5352

wants to merge 16 commits into from

Conversation

zwangsheng
Copy link
Contributor

@zwangsheng zwangsheng commented Oct 7, 2023

Why are the changes needed?

Follow #5329 and close #5328:

  1. Add new config kyuubi.metadata.store.jdbc.priority.enabled to control whether enable priority scheduling, due to users may experience performance issues when using MySQL5.7 as metastore backend and enabling kyuubi batch v2 priority feature.
  2. When priority scheduling is enabled, KyuubiBatchService picks metadata job with ORDER BY priority DESC, create_time ASC.
  3. Insert metadata with priority field, default priority value is 10.
  4. Add new config kyuubi.batch.priority for each batch priority.

How was this patch tested?

  • Add some test cases that check the changes thoroughly including negative and positive cases if possible

  • Add screenshots for manual tests if appropriate

  • Run test locally before make a pull request

Was this patch authored or co-authored using generative AI tooling?

No

@@ -26,6 +26,7 @@ object KyuubiReservedKeys {
final val KYUUBI_SESSION_USER_SIGN = "kyuubi.session.user.sign"
final val KYUUBI_SESSION_REAL_USER_KEY = "kyuubi.session.real.user"
final val KYUUBI_SESSION_CONNECTION_URL_KEY = "kyuubi.session.connection.url"
final val KYUUBI_SESSION_PRIORITY = "kyuubi.session.priority"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kyuubi.batch.priority

@codecov-commenter
Copy link

codecov-commenter commented Oct 8, 2023

Codecov Report

Merging #5352 (687ed1e) into master (2690d6d) will not change coverage.
Report is 12 commits behind head on master.
The diff coverage is 0.00%.

@@          Coverage Diff           @@
##           master   #5352   +/-   ##
======================================
  Coverage    0.00%   0.00%           
======================================
  Files         590     588    -2     
  Lines       33493   33434   -59     
  Branches     4424    4392   -32     
======================================
+ Misses      33493   33434   -59     
Files Coverage Δ
.../org/apache/kyuubi/config/KyuubiReservedKeys.scala 0.00% <ø> (ø)
...g/apache/kyuubi/server/metadata/api/Metadata.scala 0.00% <0.00%> (ø)
...org/apache/kyuubi/session/KyuubiBatchSession.scala 0.00% <0.00%> (ø)
...g/apache/kyuubi/session/KyuubiSessionManager.scala 0.00% <0.00%> (ø)
...yuubi/server/metadata/jdbc/JDBCMetadataStore.scala 0.00% <0.00%> (ø)
...i/server/metadata/jdbc/JDBCMetadataStoreConf.scala 0.00% <0.00%> (ø)

... and 9 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

}
}

override def pickMetadata(kyuubiInstance: String): Option[Metadata] = synchronized {
JdbcUtils.executeQueryWithRowMapper(
s"""SELECT identifier FROM $METADATA_TABLE
|WHERE state=?
|ORDER BY create_time ASC LIMIT 1
|ORDER BY priority DESC, create_time ASC LIMIT 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as we discussed before, there would cause significant performance issues on MySQL 5.7(which is adopted widely at the current stage), we should provide a switch to allow user disable this feature

@pan3793 pan3793 changed the title [KYUUBI #5328][V2] Kyuubi Server Pick Metadata job with priority [KYUUBI #5328][Batch] Kyuubi Server Pick Metadata job with priority Oct 9, 2023
@pan3793 pan3793 changed the title [KYUUBI #5328][Batch] Kyuubi Server Pick Metadata job with priority [KYUUBI #5328][BATCH] Kyuubi Server Pick Metadata job with priority Oct 9, 2023
@pan3793
Copy link
Member

pan3793 commented Oct 9, 2023

As this involves a basic algorithm, we'd better add UT at the MetadataManager level to make sure the priority-based FIFO algorithm works as expected.

@zwangsheng
Copy link
Contributor Author

As this involves a basic algorithm, we'd better add UT at the MetadataManager level to make sure the priority-based FIFO algorithm works as expected.

Sure, i'll go on with schedule UT.

@zwangsheng zwangsheng requested a review from pan3793 October 9, 2023 06:20
Comment on lines 99 to 103
.doc("Whether to pick up job from metadata store order by priority." +
"If disabled this, set kyuubi.batch.priority wont take effect." +
"We recommend to enabled this when you already have a metadata store, " +
"which can support mix order index(such as Mysql8)." +
"See why recommend this in https://github.com/apache/kyuubi/pull/5329")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the text need to polish, again, "MySQL"

@github-actions github-actions bot added the kind:documentation Documentation is a feature! label Oct 9, 2023
Copy link
Contributor

@bowenliang123 bowenliang123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall.
Some optional comments to follow.

@@ -78,6 +78,7 @@ case class Metadata(
engineState: String = null,
engineError: Option[String] = None,
endTime: Long = 0L,
priority: Int = 10,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a magic code here. Default priority is worth to be extracted to explicit static value , and further mentioned in config docs. Or it's confusing for user and developers reviewing the UTs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's at least document it should be consistent with the metadata table DDL definition

Comment on lines 153 to 177
val mockBatchJob1 = Metadata(
identifier = "mock_batch_job_1",
sessionType = SessionType.BATCH,
realUser = "kyuubi",
username = "kyuubi",
engineType = EngineType.SPARK_SQL.toString,
state = OperationState.INITIALIZED.toString,
priority = 20,
createTime = time + 10000)
val mockBatchJob2 = Metadata(
identifier = "mock_batch_job_2",
sessionType = SessionType.BATCH,
realUser = "kyuubi",
username = "kyuubi",
engineType = EngineType.SPARK_SQL.toString,
state = OperationState.INITIALIZED.toString,
createTime = time)
val mockBatchJob3 = Metadata(
identifier = "mock_batch_job_3",
sessionType = SessionType.BATCH,
realUser = "kyuubi",
username = "kyuubi",
engineType = EngineType.SPARK_SQL.toString,
state = OperationState.INITIALIZED.toString,
createTime = time + 10000)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be extracted to a dedicate method for generating mocked batch job objects?
And make sure same job objects are the same in positive and negative tests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure.

Reused and modified newMetadata method to build mock batch job and keep use same mock object in both positive and negative case.

@pan3793
Copy link
Member

pan3793 commented Oct 11, 2023

Could you please update the PR description to match the real change?

@zwangsheng
Copy link
Contributor Author

Could you please update the PR description to match the real change?

Updated, mentioned new changes.

Comment on lines 99 to 103
.doc("Whether to enable picking up jobs from the metadata store in order of priority." +
"If this feature is disabled, setting kyuubi.batch.priority will not take effect." +
"However, it is recommended to enable this feature if you already have a metadata store " +
"that can support mixed order indexing(such as MySQL 8)." +
"See why recommend this in https://github.com/apache/kyuubi/pull/5329")
Copy link
Member

@pan3793 pan3793 Oct 12, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggested change:

Whether to enable the priority scheduling for batch impl v2. When false, ignore kyuubi.batch.priority and use the FIFO ordering strategy for batch job scheduling. Note: it's recommended to use MySQL 8.0 as metadata backend when enabling this feature, since it may cause significant performance issues on MySQL 5.7 due to the lack of support for mixed order index. See more details at KYUUBI #5329.

Also, please take care of the space between each line.

@pan3793 pan3793 changed the title [KYUUBI #5328][BATCH] Kyuubi Server Pick Metadata job with priority [KYUUBI #5328][BATCH] Batch supports priority scheduling Oct 12, 2023
@pan3793 pan3793 changed the title [KYUUBI #5328][BATCH] Batch supports priority scheduling [KYUUBI #5328] Batch supports priority scheduling Oct 12, 2023
Copy link
Member

@pan3793 pan3793 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, only a suggestion of configuration docs

@pan3793 pan3793 added this to the v1.8.0 milestone Oct 16, 2023
@pan3793 pan3793 closed this in b24d94e Oct 16, 2023
pan3793 added a commit that referenced this pull request Oct 16, 2023
### _Why are the changes needed?_

Follow #5329 and close #5328:

1. Add new config `kyuubi.metadata.store.jdbc.priority.enabled` to control whether enable priority scheduling, due to users may experience performance issues when using MySQL5.7 as metastore backend and enabling kyuubi batch v2 priority feature.
2. When priority scheduling is enabled, `KyuubiBatchService` picks metadata job with `ORDER BY priority DESC, create_time ASC`.
3. Insert metadata with priority field, default priority value is `10`.
4. Add new config `kyuubi.batch.priority` for each batch priority.

### _How was this patch tested?_
- [x] Add some test cases that check the changes thoroughly including negative and positive cases if possible

- [ ] Add screenshots for manual tests if appropriate

- [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before make a pull request

### _Was this patch authored or co-authored using generative AI tooling?_

No

Closes #5352 from zwangsheng/KYUUBI#5328.

Closes #5328

687ed1e [Cheng Pan] Update kyuubi-server/src/main/scala/org/apache/kyuubi/server/metadata/jdbc/JDBCMetadataStore.scala
58621b5 [zwangsheng] fix comments
1bf81e7 [zwangsheng] fix style
7ed2551 [zwangsheng] update default priority desc & improve UT
21ceccb [zwangsheng] fix doc
27fc5e8 [zwangsheng] enrich desc
c0bbc0d [zwangsheng] fix style
6b8d0f0 [zwangsheng] fix comment
67eb252 [zwangsheng] fix comment
e1705c3 [zwangsheng] Add config to control whether pick order by priority or not
129a467 [zwangsheng] Add unit test for pickBatchForSubmitting
fcaf85d [zwangsheng] Fix unit test
f7ca221 [zwangsheng] Fix unit test
8d4b276 [wangsheng] fix code style
4c6b990 [wangsheng] fix comments
654ad84 [zwangsheng] [KYUUBI #5328][V2] Kyuubi Server Pick Metadata job with priority

Lead-authored-by: zwangsheng <[email protected]>
Co-authored-by: wangsheng <[email protected]>
Co-authored-by: Cheng Pan <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
(cherry picked from commit b24d94e)
Signed-off-by: Cheng Pan <[email protected]>
@@ -167,9 +169,10 @@ class JDBCMetadataStore(conf: KyuubiConf) extends MetadataStore with Logging {
|request_args,
|create_time,
|engine_type,
|cluster_manager
|cluster_manager,
|priority
Copy link
Member

@turboFei turboFei Oct 16, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not see the metadata table schema change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for review, metadata table schema changed in #5329

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @zwangsheng

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Subtask][V2] Kyuubi Server Pick Metadata job with priority
5 participants