[CORE] Pullout pre-project for ExpandExec #5066

Merged: 1 commit, Mar 21, 2024

@liujiayi771 (Contributor) commented Mar 21, 2024

What changes were proposed in this pull request?

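Support pulling out a pre-project for ExpandExec, so that expressions in Expand's projections can be computed in a project placed before the Expand.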
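To illustrate the idea, here is a minimal sketch of what pulling out a pre-project could look like, written against a toy expression model. The classes `Attr`, `Lit`, `GetField` and the function `pull_out_pre_project` are hypothetical names made up for this example; they are not Spark or Gluten APIs, and the actual rule in this PR may differ in details such as alias naming and which expressions are left in place (the sketch assumes attributes and literals stay in the Expand).

```python
from dataclasses import dataclass
from typing import Union


# Toy expression model: hypothetical classes, NOT Spark or Gluten APIs.
@dataclass(frozen=True)
class Attr:
    name: str


@dataclass(frozen=True)
class Lit:
    value: object


@dataclass(frozen=True)
class GetField:
    child: "Expr"
    field: str


Expr = Union[Attr, Lit, GetField]


def pull_out_pre_project(projections, child_output):
    """Return (pre_project, rewritten_projections).

    pre_project is a list of (output_name, expression) pairs. Each expression
    that is neither an attribute nor a literal is computed once there under a
    generated alias, and the Expand projections are rewritten to reference
    that alias instead.
    """
    aliases = {}  # pulled-out expression -> Attr that replaces it
    # Pass the child's columns through unchanged.
    pre_project = [(attr.name, attr) for attr in child_output]
    rewritten = []
    for projection in projections:
        new_row = []
        for expr in projection:
            if isinstance(expr, (Attr, Lit)):
                new_row.append(expr)  # assumed to be safe to keep in Expand
                continue
            if expr not in aliases:
                alias = Attr(f"_pre_{len(aliases)}")
                aliases[expr] = alias
                pre_project.append((alias.name, expr))
            new_row.append(aliases[expr])
        rewritten.append(new_row)
    return pre_project, rewritten


# Example: a nested field access used in both Expand projections.
child_output = [Attr("contact"), Attr("id")]
name_first = GetField(GetField(Attr("contact"), "name"), "first")
projections = [
    [name_first, Attr("id"), Lit(0)],
    [name_first, Lit(None), Lit(1)],
]
pre_project, rewritten = pull_out_pre_project(projections, child_output)
```

In this example the nested field access is evaluated once in the pre-project as `_pre_0`, and both rewritten projections contain only attribute references and literals.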

How was this patch tested?

The following test cases have expressions in Expand's projections.

  • GlutenParquetV1SchemaPruningSuite
    • Spark vectorized reader - without partition data column - select nested field in Expand
    • Spark vectorized reader - with partition data column - select nested field in Expand
    • Non-vectorized reader - without partition data column - select nested field in Expand
    • Non-vectorized reader - with partition data column - select nested field in Expand
  • GlutenParquetV2SchemaPruningSuite
    • Spark vectorized reader - without partition data column - select nested field in Expand
    • Spark vectorized reader - with partition data column - select nested field in Expand
    • Non-vectorized reader - without partition data column - select nested field in Expand
    • Non-vectorized reader - with partition data column - select nested field in Expand

Thanks for opening a pull request!

Could you open an issue for this pull request on GitHub Issues?

https://github.com/apache/incubator-gluten/issues

Then could you also rename the commit message and pull request title to the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

Run Gluten Clickhouse CI

@liujiayi771 marked this pull request as ready for review March 21, 2024 06:24

@liujiayi771 (Contributor Author) commented:

@ulysses-you Could you help to review?

@ulysses-you merged commit 3ee108f into apache:main Mar 21, 2024
19 of 21 checks passed
@liujiayi771 deleted the expand-pullout branch March 21, 2024 11:59

@GlutenPerfBot (Contributor) commented:

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

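| query | log/native_5066_time.csv | log/native_master_03_20_2024_e1f0c01a7_time.csv | difference | percentage |
|-------|--------------------------|--------------------------------------------------|------------|------------|
| q1    | 36.41   | 38.53   | 2.115  | 105.81% |
| q2    | 23.85   | 24.00   | 0.149  | 100.62% |
| q3    | 36.84   | 37.50   | 0.657  | 101.78% |
| q4    | 36.65   | 38.61   | 1.962  | 105.35% |
| q5    | 69.79   | 71.63   | 1.840  | 102.64% |
| q6    | 5.87    | 7.37    | 1.495  | 125.48% |
| q7    | 84.75   | 83.96   | -0.792 | 99.07%  |
| q8    | 84.35   | 84.46   | 0.106  | 100.13% |
| q9    | 119.95  | 125.15  | 5.199  | 104.33% |
| q10   | 44.58   | 45.58   | 0.999  | 102.24% |
| q11   | 20.03   | 20.52   | 0.487  | 102.43% |
| q12   | 28.38   | 28.80   | 0.426  | 101.50% |
| q13   | 46.78   | 48.70   | 1.924  | 104.11% |
| q14   | 16.96   | 22.25   | 5.284  | 131.15% |
| q15   | 31.93   | 31.14   | -0.791 | 97.52%  |
| q16   | 14.60   | 15.42   | 0.819  | 105.61% |
| q17   | 100.02  | 102.47  | 2.450  | 102.45% |
| q18   | 143.03  | 143.60  | 0.569  | 100.40% |
| q19   | 13.71   | 13.64   | -0.069 | 99.50%  |
| q20   | 26.77   | 27.02   | 0.254  | 100.95% |
| q21   | 228.16  | 225.52  | -2.639 | 98.84%  |
| q22   | 14.07   | 13.96   | -0.114 | 99.19%  |
| total | 1227.48 | 1249.81 | 22.330 | 101.82% |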
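
The report does not define its last two columns. Judging from the numbers, "difference" appears to be the master baseline time minus this PR's time, and "percentage" the baseline time divided by this PR's time, so values above 100% mean the PR run was faster. A small sketch that checks this reading against the "total" row (the function name `compare` is made up for this example):

```python
def compare(pr_time, baseline_time):
    """Return (difference, percentage) under the assumed column definitions.

    difference = baseline - PR; percentage = baseline / PR * 100.
    These definitions are inferred from the reported values, not documented.
    """
    difference = baseline_time - pr_time
    percentage = baseline_time / pr_time * 100.0
    return difference, percentage


# "total" row: PR run vs. master baseline.
diff, pct = compare(1227.48, 1249.81)
print(f"{diff:.2f} {pct:.2f}%")
```

Per-query results recomputed this way from the two-decimal times can differ from the report in the last digit (for example q1), presumably because the report was generated from unrounded timings.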
