[KYUUBI #6070] Improve perf on assembling row-based TRowSet #6077
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6077     +/-   ##
============================================
- Coverage   61.12%   61.08%   -0.04%
  Complexity     23       23
============================================
  Files         623      623
  Lines       37206    37200       -6
  Branches     5041     5040       -1
============================================
- Hits        22741    22725      -16
+ Misses      12012    12011       -1
- Partials     2453     2464      +11
Thanks, merged to master
Was TColumnGenerator.getColumnToList missed?
# 🔍 Description
## Issue References 🔗
This pull request fixes apache#6070
## Describe Your Solution 🔧
https://issues.apache.org/jira/browse/SPARK-47085
## Types of changes 🔖
- [ ] Bugfix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to change)
## Test Plan 🧪
#### Behavior Without This Pull Request ⚰️
#### Behavior With This Pull Request 🎉
#### Related Unit Tests
---
# Checklist 📝
- [ ] This patch was not authored or co-authored using [Generative Tooling](https://www.apache.org/legal/generative-tooling.html)
**Be nice. Be informative.**
Closes apache#6077 from Kwafoor/kyuubi_6070.
Closes apache#6070
84114f3 [wangjunbo] fix
90a1256 [wangjunbo] fix
97db3c9 [wangjunbo] fix
5442296 [wangjunbo] [KYUUBI apache#6070] Performance Improvement for converting rows to thrift rows
Authored-by: wangjunbo <[email protected]>
Signed-off-by: Cheng Pan <[email protected]>
I'm sorry to bother you, but this issue does not seem to be fully resolved. The toRowBasedSet logic has been fixed, but in toColumnBasedSet the getColumnToList method still indexes into a non-IndexedSeq with val row = rows(idx). On a linked Seq each such lookup walks from the head, so serialization slows down significantly when Hive JDBC statements use a large fetchSize (> 300).
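To illustrate the access pattern described in this comment, here is a minimal Scala sketch. It is not the Kyuubi code: the object `IndexedAccessSketch` and both method names are hypothetical, and rows are modelled as a plain `Seq[Seq[Any]]` purely for illustration. The first method looks rows up by position, which costs a walk from the head on every call when the `Seq` is a `List`; the second makes one pass with an iterator and returns the same values.

```scala
// Hypothetical sketch of the access pattern under discussion; names are
// illustrative and are not taken from the Kyuubi code base.
object IndexedAccessSketch {

  // Positional lookup: on a linked Seq such as List, rows(idx) walks from
  // the head each time, so extracting one column is quadratic in row count.
  def columnByIndex(rows: Seq[Seq[Any]], ordinal: Int): Seq[Any] = {
    val out = Seq.newBuilder[Any]
    val size = rows.length
    var idx = 0
    while (idx < size) {
      val row = rows(idx)
      out += row(ordinal)
      idx += 1
    }
    out.result()
  }

  // Single pass: the iterator visits each row once, whatever the
  // concrete Seq implementation is.
  def columnByIterator(rows: Seq[Seq[Any]], ordinal: Int): Seq[Any] = {
    val out = Seq.newBuilder[Any]
    val it = rows.iterator
    while (it.hasNext) {
      out += it.next()(ordinal)
    }
    out.result()
  }
}
```

Both methods return the same column for the same input; only the traversal cost differs. Converting the input once with `rows.toIndexedSeq` before positional access would be another way to avoid the repeated walks, at the cost of one copy.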
@hh-cn Can you send a PR for that issue?
Can you create an issue then? I'm fully booked and don't have time to fix this. But I'm sure someone will be interested in fixing it.
🔍 Description
Issue References 🔗
This pull request fixes #6070
Describe Your Solution 🔧
https://issues.apache.org/jira/browse/SPARK-47085
Types of changes 🔖
Test Plan 🧪
Behavior Without This Pull Request ⚰️
Behavior With This Pull Request 🎉
Related Unit Tests
Checklist 📝
Be nice. Be informative.