-
Notifications
You must be signed in to change notification settings - Fork 919
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Exited spark-submit process should not block batch submit queue #6028
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
ulysses-you
reviewed
Jan 30, 2024
kyuubi-server/src/main/scala/org/apache/kyuubi/server/KyuubiBatchService.scala
Show resolved
Hide resolved
ulysses-you
approved these changes
Jan 30, 2024
BruceKellan
pushed a commit
to BruceKellan/kyuubi
that referenced
this pull request
Jan 30, 2024
…r/web-ui Bumps [axios](https://github.com/axios/axios) from 0.27.2 to 1.6.0. <details> <summary>Release notes</summary> <p><em>Sourced from <a href="https://github.com/axios/axios/releases">axios's releases</a>.</em></p> <blockquote> <h2>Release v1.6.0</h2> <h2>Release notes:</h2> <h3>Bug Fixes</h3> <ul> <li><strong>CSRF:</strong> fixed CSRF vulnerability CVE-2023-45857 (<a href="https://redirect.github.com/axios/axios/issues/6028">#6028</a>) (<a href="https://github.com/axios/axios/commit/96ee232bd3ee4de2e657333d4d2191cd389e14d0">96ee232</a>)</li> <li><strong>dns:</strong> fixed lookup function decorator to work properly in node v20; (<a href="https://redirect.github.com/axios/axios/issues/6011">#6011</a>) (<a href="https://github.com/axios/axios/commit/5aaff532a6b820bb9ab6a8cd0f77131b47e2adb8">5aaff53</a>)</li> <li><strong>types:</strong> fix AxiosHeaders types; (<a href="https://redirect.github.com/axios/axios/issues/5931">#5931</a>) (<a href="https://github.com/axios/axios/commit/a1c8ad008b3c13d53e135bbd0862587fb9d3fc09">a1c8ad0</a>)</li> </ul> <h3>PRs</h3> <ul> <li>CVE 2023 45857 ( <a href="https://api.github.com/repos/axios/axios/pulls/6028">#6028</a> )</li> </ul> <pre><code>⚠️ Critical vulnerability fix. See https://security.snyk.io/vuln/SNYK-JS-AXIOS-6032459 </code></pre> <h3>Contributors to this release</h3> <ul> <li> <a href="https://github.com/DigitalBrainJS" title="+449/-114 ([apache#6032](axios/axios#6032) [apache#6021](axios/axios#6021) [apache#6011](axios/axios#6011) [apache#5932](axios/axios#5932) [apache#5931](axios/axios#5931) )">Dmitriy Mozgovoy</a></li> <li> <a href="https://github.com/valentin-panov" title="+4/-4 ([apache#6028](axios/axios#6028) )">Valentin Panov</a></li> <li> <a href="https://github.com/therealrinku" title="+1/-1 ([apache#5889](axios/axios#5889) )">Rinku Chaudhari</a></li> </ul> <h2>Release v1.5.1</h2> <h2>Release notes:</h2> <h3>Bug Fixes</h3> <ul> <li><strong>adapters:</strong> improved adapters loading logic to have clear error messages; (<a href="https://redirect.github.com/axios/axios/issues/5919">#5919</a>) (<a href="https://github.com/axios/axios/commit/e4107797a7a1376f6209fbecfbbce73d3faa7859">e410779</a>)</li> <li><strong>formdata:</strong> fixed automatic addition of the <code>Content-Type</code> header for FormData in non-browser environments; (<a href="https://redirect.github.com/axios/axios/issues/5917">#5917</a>) (<a href="https://github.com/axios/axios/commit/bc9af51b1886d1b3529617702f2a21a6c0ed5d92">bc9af51</a>)</li> <li><strong>headers:</strong> allow <code>content-encoding</code> header to handle case-insensitive values (<a href="https://redirect.github.com/axios/axios/issues/5890">#5890</a>) (<a href="https://redirect.github.com/axios/axios/issues/5892">#5892</a>) (<a href="https://github.com/axios/axios/commit/4c89f25196525e90a6e75eda9cb31ae0a2e18acd">4c89f25</a>)</li> <li><strong>types:</strong> removed duplicated code (<a href="https://github.com/axios/axios/commit/9e6205630e1c9cf863adf141c0edb9e6d8d4b149">9e62056</a>)</li> </ul> <h3>Contributors to this release</h3> <ul> <li> <a href="https://github.com/DigitalBrainJS" title="+89/-18 ([apache#5919](axios/axios#5919) [apache#5917](axios/axios#5917) )">Dmitriy Mozgovoy</a></li> <li> <a href="https://github.com/DavidJDallas" title="+11/-5 ()">David Dallas</a></li> <li> <a href="https://github.com/fb-sean" title="+2/-8 ()">Sean Sattler</a></li> <li> <a href="https://github.com/0o001" title="+4/-4 ()">Mustafa Ateş Uzun</a></li> <li> <a href="https://github.com/sfc-gh-pmotacki" title="+2/-1 ([apache#5892](axios/axios#5892) )">Przemyslaw Motacki</a></li> <li> <a href="https://github.com/Cadienvan" title="+1/-1 ()">Michael Di Prisco</a></li> </ul> <h2>Release v1.5.0</h2> <h2>Release notes:</h2> <h3>Bug Fixes</h3> <ul> <li><strong>adapter:</strong> make adapter loading error more clear by using platform-specific adapters explicitly (<a href="https://redirect.github.com/axios/axios/issues/5837">#5837</a>) (<a href="https://github.com/axios/axios/commit/9a414bb6c81796a95c6c7fe668637825458e8b6d">9a414bb</a>)</li> <li><strong>dns:</strong> fixed <code>cacheable-lookup</code> integration; (<a href="https://redirect.github.com/axios/axios/issues/5836">#5836</a>) (<a href="https://github.com/axios/axios/commit/b3e327dcc9277bdce34c7ef57beedf644b00d628">b3e327d</a>)</li> <li><strong>headers:</strong> added support for setting header names that overlap with class methods; (<a href="https://redirect.github.com/axios/axios/issues/5831">#5831</a>) (<a href="https://github.com/axios/axios/commit/d8b4ca0ea5f2f05efa4edfe1e7684593f9f68273">d8b4ca0</a>)</li> <li><strong>headers:</strong> fixed common Content-Type header merging; (<a href="https://redirect.github.com/axios/axios/issues/5832">#5832</a>) (<a href="https://github.com/axios/axios/commit/8fda2766b1e6bcb72c3fabc146223083ef13ce17">8fda276</a>)</li> </ul> <h3>Features</h3> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Changelog</summary> <p><em>Sourced from <a href="https://github.com/axios/axios/blob/v1.x/CHANGELOG.md">axios's changelog</a>.</em></p> <blockquote> <h1><a href="https://github.com/axios/axios/compare/v1.5.1...v1.6.0">1.6.0</a> (2023-10-26)</h1> <h3>Bug Fixes</h3> <ul> <li><strong>CSRF:</strong> fixed CSRF vulnerability CVE-2023-45857 (<a href="https://redirect.github.com/axios/axios/issues/6028">#6028</a>) (<a href="https://github.com/axios/axios/commit/96ee232bd3ee4de2e657333d4d2191cd389e14d0">96ee232</a>)</li> <li><strong>dns:</strong> fixed lookup function decorator to work properly in node v20; (<a href="https://redirect.github.com/axios/axios/issues/6011">#6011</a>) (<a href="https://github.com/axios/axios/commit/5aaff532a6b820bb9ab6a8cd0f77131b47e2adb8">5aaff53</a>)</li> <li><strong>types:</strong> fix AxiosHeaders types; (<a href="https://redirect.github.com/axios/axios/issues/5931">#5931</a>) (<a href="https://github.com/axios/axios/commit/a1c8ad008b3c13d53e135bbd0862587fb9d3fc09">a1c8ad0</a>)</li> </ul> <h3>PRs</h3> <ul> <li>CVE 2023 45857 ( <a href="https://api.github.com/repos/axios/axios/pulls/6028">#6028</a> )</li> </ul> <pre><code>⚠️ Critical vulnerability fix. See https://security.snyk.io/vuln/SNYK-JS-AXIOS-6032459 </code></pre> <h3>Contributors to this release</h3> <ul> <li> <a href="https://github.com/DigitalBrainJS" title="+449/-114 ([apache#6032](axios/axios#6032) [apache#6021](axios/axios#6021) [apache#6011](axios/axios#6011) [apache#5932](axios/axios#5932) [apache#5931](axios/axios#5931) )">Dmitriy Mozgovoy</a></li> <li> <a href="https://github.com/valentin-panov" title="+4/-4 ([apache#6028](axios/axios#6028) )">Valentin Panov</a></li> <li> <a href="https://github.com/therealrinku" title="+1/-1 ([apache#5889](axios/axios#5889) )">Rinku Chaudhari</a></li> </ul> <h2><a href="https://github.com/axios/axios/compare/v1.5.0...v1.5.1">1.5.1</a> (2023-09-26)</h2> <h3>Bug Fixes</h3> <ul> <li><strong>adapters:</strong> improved adapters loading logic to have clear error messages; (<a href="https://redirect.github.com/axios/axios/issues/5919">#5919</a>) (<a href="https://github.com/axios/axios/commit/e4107797a7a1376f6209fbecfbbce73d3faa7859">e410779</a>)</li> <li><strong>formdata:</strong> fixed automatic addition of the <code>Content-Type</code> header for FormData in non-browser environments; (<a href="https://redirect.github.com/axios/axios/issues/5917">#5917</a>) (<a href="https://github.com/axios/axios/commit/bc9af51b1886d1b3529617702f2a21a6c0ed5d92">bc9af51</a>)</li> <li><strong>headers:</strong> allow <code>content-encoding</code> header to handle case-insensitive values (<a href="https://redirect.github.com/axios/axios/issues/5890">#5890</a>) (<a href="https://redirect.github.com/axios/axios/issues/5892">#5892</a>) (<a href="https://github.com/axios/axios/commit/4c89f25196525e90a6e75eda9cb31ae0a2e18acd">4c89f25</a>)</li> <li><strong>types:</strong> removed duplicated code (<a href="https://github.com/axios/axios/commit/9e6205630e1c9cf863adf141c0edb9e6d8d4b149">9e62056</a>)</li> </ul> <h3>Contributors to this release</h3> <ul> <li> <a href="https://github.com/DigitalBrainJS" title="+89/-18 ([apache#5919](axios/axios#5919) [apache#5917](axios/axios#5917) )">Dmitriy Mozgovoy</a></li> <li> <a href="https://github.com/DavidJDallas" title="+11/-5 ()">David Dallas</a></li> <li> <a href="https://github.com/fb-sean" title="+2/-8 ()">Sean Sattler</a></li> <li> <a href="https://github.com/0o001" title="+4/-4 ()">Mustafa Ateş Uzun</a></li> <li> <a href="https://github.com/sfc-gh-pmotacki" title="+2/-1 ([apache#5892](axios/axios#5892) )">Przemyslaw Motacki</a></li> <li> <a href="https://github.com/Cadienvan" title="+1/-1 ()">Michael Di Prisco</a></li> </ul> <h3>PRs</h3> <ul> <li>CVE 2023 45857 ( <a href="https://api.github.com/repos/axios/axios/pulls/6028">#6028</a> )</li> </ul> <pre><code>⚠️ Critical vulnerability fix. See https://security.snyk.io/vuln/SNYK-JS-AXIOS-6032459 </code></pre> <h1><a href="https://github.com/axios/axios/compare/v1.4.0...v1.5.0">1.5.0</a> (2023-08-26)</h1> </blockquote> <p>... (truncated)</p> </details> <details> <summary>Commits</summary> <ul> <li><a href="https://github.com/axios/axios/commit/f7adacdbaa569281253c8cfc623ad3f4dc909c60"><code>f7adacd</code></a> chore(release): v1.6.0 (<a href="https://redirect.github.com/axios/axios/issues/6031">#6031</a>)</li> <li><a href="https://github.com/axios/axios/commit/9917e67cbb6c157382863bad8c741de58e3f3c2b"><code>9917e67</code></a> chore(ci): fix release-it arg; (<a href="https://redirect.github.com/axios/axios/issues/6032">#6032</a>)</li> <li><a href="https://github.com/axios/axios/commit/96ee232bd3ee4de2e657333d4d2191cd389e14d0"><code>96ee232</code></a> fix(CSRF): fixed CSRF vulnerability CVE-2023-45857 (<a href="https://redirect.github.com/axios/axios/issues/6028">#6028</a>)</li> <li><a href="https://github.com/axios/axios/commit/7d45ab2e2ad6e59f5475e39afd4b286b1f393fc0"><code>7d45ab2</code></a> chore(tests): fixed tests to pass in node v19 and v20 with <code>keep-alive</code> enabl...</li> <li><a href="https://github.com/axios/axios/commit/5aaff532a6b820bb9ab6a8cd0f77131b47e2adb8"><code>5aaff53</code></a> fix(dns): fixed lookup function decorator to work properly in node v20; (<a href="https://redirect.github.com/axios/axios/issues/6011">#6011</a>)</li> <li><a href="https://github.com/axios/axios/commit/a48a63ad823fc20e5a6a705f05f09842ca49f48c"><code>a48a63a</code></a> chore(docs): added AxiosHeaders docs; (<a href="https://redirect.github.com/axios/axios/issues/5932">#5932</a>)</li> <li><a href="https://github.com/axios/axios/commit/a1c8ad008b3c13d53e135bbd0862587fb9d3fc09"><code>a1c8ad0</code></a> fix(types): fix AxiosHeaders types; (<a href="https://redirect.github.com/axios/axios/issues/5931">#5931</a>)</li> <li><a href="https://github.com/axios/axios/commit/2ac731d60545ba5c4202c25fd2e732ddd8297d82"><code>2ac731d</code></a> chore(docs): update readme.md (<a href="https://redirect.github.com/axios/axios/issues/5889">#5889</a>)</li> <li><a href="https://github.com/axios/axios/commit/88fb52b5fad7aabab0532e7ad086c5f1b0178905"><code>88fb52b</code></a> chore(release): v1.5.1 (<a href="https://redirect.github.com/axios/axios/issues/5920">#5920</a>)</li> <li><a href="https://github.com/axios/axios/commit/e4107797a7a1376f6209fbecfbbce73d3faa7859"><code>e410779</code></a> fix(adapters): improved adapters loading logic to have clear error messages; ...</li> <li>Additional commits viewable in <a href="https://github.com/axios/axios/compare/v0.27.2...v1.6.0">compare view</a></li> </ul> </details> <br /> [![Dependabot compatibility score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=axios&package-manager=npm_and_yarn&previous-version=0.27.2&new-version=1.6.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores) Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `dependabot rebase`. [//]: # (dependabot-automerge-start) Dependabot will merge this PR once CI passes on it, as requested by yaooqinn. [//]: # (dependabot-automerge-end) --- <details> <summary>Dependabot commands and options</summary> <br /> You can trigger Dependabot actions by commenting on this PR: - `dependabot rebase` will rebase this PR - `dependabot recreate` will recreate this PR, overwriting any edits that have been made to it - `dependabot merge` will merge this PR after your CI passes on it - `dependabot squash and merge` will squash and merge this PR after your CI passes on it - `dependabot cancel merge` will cancel a previously requested merge and block automerging - `dependabot reopen` will reopen this PR if it is closed - `dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually - `dependabot show <dependency name> ignore conditions` will show all of the ignore conditions of the specified dependency - `dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself) - `dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself) - `dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself) You can disable automated security fix PRs for this repo from the [Security Alerts page](https://github.com/apache/kyuubi/network/alerts). </details> Closes apache#5671 from dependabot[bot]/dependabot/npm_and_yarn/kyuubi-server/web-ui/axios-1.6.0. Closes apache#5671 243d0c6 [Cheng Pan] license 2fa3217 [Cheng Pan] nit c21a873 [Cheng Pan] nit ae46d5d [Cheng Pan] Update kyuubi-server/web-ui/pnpm-lock.yaml 4adccd2 [dependabot[bot]] Bump axios from 0.27.2 to 1.6.0 in /kyuubi-server/web-ui Lead-authored-by: Cheng Pan <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Signed-off-by: Cheng Pan <[email protected]>
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #6028 +/- ##
============================================
- Coverage 61.11% 61.07% -0.04%
Complexity 23 23
============================================
Files 623 623
Lines 37136 37140 +4
Branches 5031 5031
============================================
- Hits 22695 22685 -10
- Misses 11997 12005 +8
- Partials 2444 2450 +6 ☔ View full report in Codecov by Sentry. |
cxzl25
approved these changes
Jan 30, 2024
pan3793
added a commit
that referenced
this pull request
Jan 30, 2024
…mit queue While enabling batch implementation V2 with the following configurations ``` kyuubi.batch.impl.version=2 kyuubi.batch.submitter.enabled=true kyuubi.batch.submitter.threads=48 spark.master=yarn spark.submit.deployMode=cluster spark.yarn.submit.waitAppCompletion=false ``` I found that the batch jobs will be blocked in the DB queue once a YARN queue has no resources, this brings an issue, the subsequential batch jobs that are going to be submitted to another YARN queue also be queued in DB, rather than YARN queue. ``` mysql> select state, engine_state, count(1) from metadata where state in ('INITIALIZED', 'PENDING', 'RUNNING') group by state, engine_state; +-------------+--------------+----------+ | state | engine_state | count(1) | +-------------+--------------+----------+ | INITIALIZED | NULL | 166 | | PENDING | NULL | 1 | | RUNNING | PENDING | 148 | | RUNNING | RUNNING | 415 | +-------------+--------------+----------+ ``` The submitter queue whose size is controlled by `kyuubi.batch.submitter.threads` is designed to address the `spark-submit` process concurrency issue, too many `spark-submit` processes may run out of the Kyuubi server's node CPU/memory resources and eventually crash the service. For Spark YARN cluster mode, if set `spark.yarn.submit.waitAppCompletion=false`, the local `spark-submit` process exits immediately once the Application goes ACCEPTED status, even no resource could be allocated for the AM container, we should not block such case in submitter queue. - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) Pass GA, and roll out into internal cluster. --- - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) **Be nice. Be informative.** Closes #6028 from pan3793/batch-submit. Closes #6028 05fcc75 [Cheng Pan] Exited spark-submit process should not block batch submit queue Authored-by: Cheng Pan <[email protected]> Signed-off-by: Cheng Pan <[email protected]> (cherry picked from commit 208354c) Signed-off-by: Cheng Pan <[email protected]>
Thanks, merged to master/1.8 |
zhaohehuhu
pushed a commit
to zhaohehuhu/incubator-kyuubi
that referenced
this pull request
Feb 5, 2024
…ch submit queue # 🔍 Description ## Issue References 🔗 While enabling batch implementation V2 with the following configurations ``` kyuubi.batch.impl.version=2 kyuubi.batch.submitter.enabled=true kyuubi.batch.submitter.threads=48 spark.master=yarn spark.submit.deployMode=cluster spark.yarn.submit.waitAppCompletion=false ``` I found that the batch jobs will be blocked in the DB queue once a YARN queue has no resources, this brings an issue, the subsequential batch jobs that are going to be submitted to another YARN queue also be queued in DB, rather than YARN queue. ``` mysql> select state, engine_state, count(1) from metadata where state in ('INITIALIZED', 'PENDING', 'RUNNING') group by state, engine_state; +-------------+--------------+----------+ | state | engine_state | count(1) | +-------------+--------------+----------+ | INITIALIZED | NULL | 166 | | PENDING | NULL | 1 | | RUNNING | PENDING | 148 | | RUNNING | RUNNING | 415 | +-------------+--------------+----------+ ``` ## Describe Your Solution 🔧 The submitter queue whose size is controlled by `kyuubi.batch.submitter.threads` is designed to address the `spark-submit` process concurrency issue, too many `spark-submit` processes may run out of the Kyuubi server's node CPU/memory resources and eventually crash the service. For Spark YARN cluster mode, if set `spark.yarn.submit.waitAppCompletion=false`, the local `spark-submit` process exits immediately once the Application goes ACCEPTED status, even no resource could be allocated for the AM container, we should not block such case in submitter queue. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Pass GA, and roll out into internal cluster. --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) **Be nice. Be informative.** Closes apache#6028 from pan3793/batch-submit. Closes apache#6028 05fcc75 [Cheng Pan] Exited spark-submit process should not block batch submit queue Authored-by: Cheng Pan <[email protected]> Signed-off-by: Cheng Pan <[email protected]>
pan3793
added a commit
that referenced
this pull request
Feb 22, 2024
…se notes # 🔍 Description ## Issue References 🔗 Currently, we use a rather primitive way to manually write release notes from scratch, and some of the mechanical and repetitive work can be simplified by the scripts. ## Describe Your Solution 🔧 Adds a script to simplify the process of creating release notes. Note: it just simplifies some processes, the release manager still needs to tune the outputs by hand. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 ``` RELEASE_TAG=v1.8.1 PREVIOUS_RELEASE_TAG=v1.8.0 build/release/pre_gen_release_notes.py ``` ``` $ head build/release/commits-v1.8.1.txt [KYUUBI #5981] Deploy Spark Hive connector with Scala 2.13 to Maven Central [KYUUBI #6058] Make Jetty server stop timeout configurable [KYUUBI #5952][1.8] Disconnect connections without running operations after engine maxlife time graceful period [KYUUBI #6048] Assign serviceNode and add volatile for variables [KYUUBI #5991] Error on reading Atlas properties composed of multi values [KYUUBI #6045] [REST] Sync the AdminRestApi with the AdminResource Apis [KYUUBI #6047] [CI] Free up disk space [KYUUBI #6036] JDBC driver conditional sets fetchSize on opening session [KYUUBI #6028] Exited spark-submit process should not block batch submit queue [KYUUBI #6018] Speed up GetTables operation for Spark session catalog ``` ``` $ head build/release/contributors-v1.8.1.txt * Shaoyun Chen -- [KYUUBI #5857][KYUUBI #5720][KYUUBI #5785][KYUUBI #5617] * Chao Chen -- [KYUUBI #5750] * Flyangz -- [KYUUBI #5832] * Pengqi Li -- [KYUUBI #5713] * Bowen Liang -- [KYUUBI #5730][KYUUBI #5802][KYUUBI #5767][KYUUBI #5831][KYUUBI #5801][KYUUBI #5754][KYUUBI #5626][KYUUBI #5811][KYUUBI #5853][KYUUBI #5765] * Paul Lin -- [KYUUBI #5799][KYUUBI #5814] * Senmiao Liu -- [KYUUBI #5969][KYUUBI #5244] * Xiao Liu -- [KYUUBI #5962] * Peiyue Liu -- [KYUUBI #5331] * Junjie Ma -- [KYUUBI #5789] ``` --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) **Be nice. Be informative.** Closes #6074 from pan3793/release-script. Closes #6074 3d5ec20 [Cheng Pan] credits 1765279 [Cheng Pan] Add a script to simplify the process of creating release notes Authored-by: Cheng Pan <[email protected]> Signed-off-by: Cheng Pan <[email protected]>
zhaohehuhu
pushed a commit
to zhaohehuhu/incubator-kyuubi
that referenced
this pull request
Mar 21, 2024
…ch submit queue # 🔍 Description ## Issue References 🔗 While enabling batch implementation V2 with the following configurations ``` kyuubi.batch.impl.version=2 kyuubi.batch.submitter.enabled=true kyuubi.batch.submitter.threads=48 spark.master=yarn spark.submit.deployMode=cluster spark.yarn.submit.waitAppCompletion=false ``` I found that the batch jobs will be blocked in the DB queue once a YARN queue has no resources, this brings an issue, the subsequential batch jobs that are going to be submitted to another YARN queue also be queued in DB, rather than YARN queue. ``` mysql> select state, engine_state, count(1) from metadata where state in ('INITIALIZED', 'PENDING', 'RUNNING') group by state, engine_state; +-------------+--------------+----------+ | state | engine_state | count(1) | +-------------+--------------+----------+ | INITIALIZED | NULL | 166 | | PENDING | NULL | 1 | | RUNNING | PENDING | 148 | | RUNNING | RUNNING | 415 | +-------------+--------------+----------+ ``` ## Describe Your Solution 🔧 The submitter queue whose size is controlled by `kyuubi.batch.submitter.threads` is designed to address the `spark-submit` process concurrency issue, too many `spark-submit` processes may run out of the Kyuubi server's node CPU/memory resources and eventually crash the service. For Spark YARN cluster mode, if set `spark.yarn.submit.waitAppCompletion=false`, the local `spark-submit` process exits immediately once the Application goes ACCEPTED status, even no resource could be allocated for the AM container, we should not block such case in submitter queue. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 Pass GA, and roll out into internal cluster. --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) **Be nice. Be informative.** Closes apache#6028 from pan3793/batch-submit. Closes apache#6028 05fcc75 [Cheng Pan] Exited spark-submit process should not block batch submit queue Authored-by: Cheng Pan <[email protected]> Signed-off-by: Cheng Pan <[email protected]>
zhaohehuhu
pushed a commit
to zhaohehuhu/incubator-kyuubi
that referenced
this pull request
Mar 21, 2024
… release notes # 🔍 Description ## Issue References 🔗 Currently, we use a rather primitive way to manually write release notes from scratch, and some of the mechanical and repetitive work can be simplified by the scripts. ## Describe Your Solution 🔧 Adds a script to simplify the process of creating release notes. Note: it just simplifies some processes, the release manager still needs to tune the outputs by hand. ## Types of changes 🔖 - [ ] Bugfix (non-breaking change which fixes an issue) - [ ] New feature (non-breaking change which adds functionality) - [ ] Breaking change (fix or feature that would cause existing functionality to change) ## Test Plan 🧪 ``` RELEASE_TAG=v1.8.1 PREVIOUS_RELEASE_TAG=v1.8.0 build/release/pre_gen_release_notes.py ``` ``` $ head build/release/commits-v1.8.1.txt [KYUUBI apache#5981] Deploy Spark Hive connector with Scala 2.13 to Maven Central [KYUUBI apache#6058] Make Jetty server stop timeout configurable [KYUUBI apache#5952][1.8] Disconnect connections without running operations after engine maxlife time graceful period [KYUUBI apache#6048] Assign serviceNode and add volatile for variables [KYUUBI apache#5991] Error on reading Atlas properties composed of multi values [KYUUBI apache#6045] [REST] Sync the AdminRestApi with the AdminResource Apis [KYUUBI apache#6047] [CI] Free up disk space [KYUUBI apache#6036] JDBC driver conditional sets fetchSize on opening session [KYUUBI apache#6028] Exited spark-submit process should not block batch submit queue [KYUUBI apache#6018] Speed up GetTables operation for Spark session catalog ``` ``` $ head build/release/contributors-v1.8.1.txt * Shaoyun Chen -- [KYUUBI apache#5857][KYUUBI apache#5720][KYUUBI apache#5785][KYUUBI apache#5617] * Chao Chen -- [KYUUBI apache#5750] * Flyangz -- [KYUUBI apache#5832] * Pengqi Li -- [KYUUBI apache#5713] * Bowen Liang -- [KYUUBI apache#5730][KYUUBI apache#5802][KYUUBI apache#5767][KYUUBI apache#5831][KYUUBI apache#5801][KYUUBI apache#5754][KYUUBI apache#5626][KYUUBI apache#5811][KYUUBI apache#5853][KYUUBI apache#5765] * Paul Lin -- [KYUUBI apache#5799][KYUUBI apache#5814] * Senmiao Liu -- [KYUUBI apache#5969][KYUUBI apache#5244] * Xiao Liu -- [KYUUBI apache#5962] * Peiyue Liu -- [KYUUBI apache#5331] * Junjie Ma -- [KYUUBI apache#5789] ``` --- # Checklist 📝 - [x] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html) **Be nice. Be informative.** Closes apache#6074 from pan3793/release-script. Closes apache#6074 3d5ec20 [Cheng Pan] credits 1765279 [Cheng Pan] Add a script to simplify the process of creating release notes Authored-by: Cheng Pan <[email protected]> Signed-off-by: Cheng Pan <[email protected]>
4 tasks
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
🔍 Description
Issue References 🔗
While enabling batch implementation V2 with the following configurations
I found that the batch jobs will be blocked in the DB queue once a YARN queue has no resources, this brings an issue, the subsequential batch jobs that are going to be submitted to another YARN queue also be queued in DB, rather than YARN queue.
Describe Your Solution 🔧
The submitter queue whose size is controlled by
kyuubi.batch.submitter.threads
is designed to address thespark-submit
process concurrency issue, too manyspark-submit
processes may run out of the Kyuubi server's node CPU/memory resources and eventually crash the service. For Spark YARN cluster mode, if setspark.yarn.submit.waitAppCompletion=false
, the localspark-submit
process exits immediately once the Application goes ACCEPTED status, even no resource could be allocated for the AM container, we should not block such case in submitter queue.Types of changes 🔖
Test Plan 🧪
Pass GA, and roll out into internal cluster.
Checklist 📝
Be nice. Be informative.