Proposal for non-static encodings #2066

Open
wants to merge 1 commit into
base: develop

gatesn
Contributor

@gatesn gatesn commented Jan 23, 2025

Still some more changes to make here, but how do we feel about Arc<dyn EncodingVTable>? Internally, arrays have a LazyLock for cloning the same Arc'd value.

This will allow us to heap allocate encodings with WASM implementations.
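A minimal sketch of the shape being proposed, assuming a much smaller trait than the real one. `PrimitiveEncoding`, `PrimitiveArray`, `DynamicEncoding` and the id strings are hypothetical names made up for illustration; only `Arc<dyn EncodingVTable>` and the `LazyLock` idea come from the description above.

```rust
use std::sync::{Arc, LazyLock};

// Hypothetical, heavily reduced stand-in for the real vtable trait.
trait EncodingVTable: Send + Sync {
    fn id(&self) -> &str;
}

// A built-in encoding with no runtime state.
struct PrimitiveEncoding;

impl EncodingVTable for PrimitiveEncoding {
    fn id(&self) -> &str {
        "example.primitive"
    }
}

// One shared Arc per built-in encoding, created on first use.
static PRIMITIVE_ENCODING: LazyLock<Arc<dyn EncodingVTable>> = LazyLock::new(|| {
    let encoding: Arc<dyn EncodingVTable> = Arc::new(PrimitiveEncoding);
    encoding
});

// Hypothetical array type that hands out clones of the shared Arc.
struct PrimitiveArray {
    values: Vec<i32>,
}

impl PrimitiveArray {
    fn len(&self) -> usize {
        self.values.len()
    }

    // Cloning only bumps a reference count; every array sees the same value.
    fn encoding(&self) -> Arc<dyn EncodingVTable> {
        Arc::clone(&PRIMITIVE_ENCODING)
    }
}

// An encoding built at runtime (for example, one backed by a loaded module).
// It carries owned state, so it cannot live in a plain `&'static` constant.
struct DynamicEncoding {
    id: String,
}

impl EncodingVTable for DynamicEncoding {
    fn id(&self) -> &str {
        &self.id
    }
}
```

With this shape, a static encoding and a heap-allocated one are both just `Arc<dyn EncodingVTable>` to the rest of the code, which is the property the WASM use case would need.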

@gatesn gatesn requested a review from lwwmanning January 23, 2025 21:32
@gatesn gatesn added the benchmark Run benchmarks on this branch label Jan 23, 2025
@github-actions github-actions bot removed the benchmark Run benchmarks on this branch label Jan 23, 2025
Contributor

Benchmarks: random_access

Table of Results
| name | PR 976da07 | base 1d36132 | ratio (PR/base) | unit |
| --- | --- | --- | --- | --- |
| random-access/vortex-tokio-local-disk | 2.50528e+06 | 2.62011e+06 | 0.956174 | ns |
| random-access/vortex-local-fs | 3.19049e+06 | 3.30901e+06 | 0.964185 | ns |
| random-access/parquet-tokio-local-disk | 2.16088e+08 | 2.21792e+08 | 0.974284 | ns |

Contributor

Benchmarks: datafusion

Table of Results
| name | PR 976da07 | base 1d36132 | ratio (PR/base) | unit |
| --- | --- | --- | --- | --- |
| arrow/planning | 957705 | 955104 | 1.00272 | ns |
| arrow/exec | 1.99503e+06 | 2.01344e+06 | 0.990856 | ns |
| vortex-compressed/planning | 587858 | 586488 | 1.00234 | ns |
| vortex-compressed/exec | 2.70319e+06 | 2.71034e+06 | 0.997364 | ns |
| vortex-uncompressed/planning | 587814 | 585229 | 1.00442 | ns |
| vortex-uncompressed/exec | 1.55511e+06 | 1.55866e+06 | 0.997721 | ns |

Contributor

Benchmarks: TPC-H

Table of Results
| name | PR 976da07 | base 1d36132 | ratio (PR/base) | unit |
| --- | --- | --- | --- | --- |
| tpch_q01/arrow | 542256970 | 5.49082e+08 | 0.987571 | ns |
| tpch_q01/parquet | 765404301 | 7.72487e+08 | 0.990831 | ns |
| tpch_q01/vortex-file-compressed | 522802534 | 5.3619e+08 | 0.975032 | ns |
| tpch_q02/arrow | 138400328 | 1.42769e+08 | 0.969403 | ns |
| tpch_q02/parquet | 173189015 | 1.77336e+08 | 0.976615 | ns |
| tpch_q02/vortex-file-compressed | 144032765 | 1.50431e+08 | 0.95747 | ns |
| tpch_q03/arrow | 174526554 | 1.75393e+08 | 0.995058 | ns |
| tpch_q03/parquet | 374102734 | 3.76328e+08 | 0.994087 | ns |
| tpch_q03/vortex-file-compressed | 231941977 | 2.36388e+08 | 0.981191 | ns |
| tpch_q04/arrow | 177610974 | 1.76527e+08 | 1.00614 | ns |
| tpch_q04/parquet | 216488527 | 2.24674e+08 | 0.963568 | ns |
| tpch_q04/vortex-file-compressed | 172979661 | 1.75805e+08 | 0.98393 | ns |
| tpch_q05/arrow | 321223473 | 3.2567e+08 | 0.986345 | ns |
| tpch_q05/parquet | 496943929 | 5.05141e+08 | 0.983773 | ns |
| tpch_q05/vortex-file-compressed | 374304790 | 3.78238e+08 | 0.989602 | ns |
| tpch_q06/arrow | 25235751 | 2.59607e+07 | 0.972076 | ns |
| tpch_q06/parquet | 146600834 | 1.50801e+08 | 0.972149 | ns |
| tpch_q06/vortex-file-compressed | 65802927 | 6.52746e+07 | 1.00809 | ns |
| tpch_q07/arrow | 611176835 | 6.31055e+08 | 0.9685 | ns |
| tpch_q07/parquet | 747916675 | 7.70841e+08 | 0.970261 | ns |
| tpch_q07/vortex-file-compressed | 629353845 | 6.50893e+08 | 0.966908 | ns |
| tpch_q08/arrow | 269843650 | 2.73412e+08 | 0.986949 | ns |
| tpch_q08/parquet | 543059802 | 5.57444e+08 | 0.974195 | ns |
| tpch_q08/vortex-file-compressed | 364562535 | 3.61058e+08 | 1.00971 | ns |
| tpch_q09/arrow | 470162602 | 4.82854e+08 | 0.973716 | ns |
| tpch_q09/parquet | 779187175 | 7.98464e+08 | 0.975858 | ns |
| tpch_q09/vortex-file-compressed | 596334588 | 6.0579e+08 | 0.984391 | ns |
| tpch_q10/arrow | 260358334 | 2.60901e+08 | 0.997919 | ns |
| tpch_q10/parquet | 504062661 | 5.09425e+08 | 0.989473 | ns |
| tpch_q10/vortex-file-compressed | 284320584 | 2.83051e+08 | 1.00448 | ns |
| tpch_q11/arrow | 136000577 | 1.38797e+08 | 0.979853 | ns |
| tpch_q11/parquet | 142734519 | 1.47443e+08 | 0.968065 | ns |
| tpch_q11/vortex-file-compressed | 131250809 | 1.34036e+08 | 0.979219 | ns |
| tpch_q12/arrow | 182761684 | 1.82964e+08 | 0.998896 | ns |
| tpch_q12/parquet | 324536722 | 3.25584e+08 | 0.996783 | ns |
| tpch_q12/vortex-file-compressed | 257282929 | 2.60808e+08 | 0.986482 | ns |
| tpch_q13/arrow | 165446329 | 1.68035e+08 | 0.984593 | ns |
| tpch_q13/parquet | 298702327 | 3.1406e+08 | 0.951099 | ns |
| tpch_q13/vortex-file-compressed | 179748448 | 1.79007e+08 | 1.00414 | ns |
| tpch_q14/arrow | 35182794 | 3.75003e+07 | 0.938201 | ns |
| tpch_q14/parquet | 228224916 | 2.36181e+08 | 0.966312 | ns |
| tpch_q14/vortex-file-compressed | 76292148 | 7.9851e+07 | 0.955432 | ns |
| tpch_q15/arrow | 66459772 | 6.82557e+07 | 0.973689 | ns |
| tpch_q15/parquet | 322292369 | 3.27624e+08 | 0.983726 | ns |
| tpch_q15/vortex-file-compressed | 130925826 | 1.34481e+08 | 0.973564 | ns |
| tpch_q16/arrow | 96080296 | 9.88606e+07 | 0.971876 | ns |
| tpch_q16/parquet | 112051513 | 1.11792e+08 | 1.00232 | ns |
| tpch_q16/vortex-file-compressed | 103233590 | 1.03176e+08 | 1.00056 | ns |
| tpch_q17/arrow | 591017500 | 6.20673e+08 | 0.952221 | ns |
| tpch_q17/parquet | 668982089 | 7.0725e+08 | 0.945892 | ns |
| tpch_q17/vortex-file-compressed | 596292710 | 6.19487e+08 | 0.962559 | ns |
| tpch_q18/arrow | 1079814242 | 1.13064e+09 | 0.95505 | ns |
| tpch_q18/parquet | 1311590100 | 1.33451e+09 | 0.982829 | ns |
| tpch_q18/vortex-file-compressed | 1155818842 | 1.17145e+09 | 0.986659 | ns |
| tpch_q19/arrow | 150001597 | 1.50214e+08 | 0.998584 | ns |
| tpch_q19/parquet | 412735867 | 4.18875e+08 | 0.985344 | ns |
| tpch_q19/vortex-file-compressed | 149803231 | 1.50768e+08 | 0.9936 | ns |
| tpch_q20/arrow | 174748436 | 1.80828e+08 | 0.966377 | ns |
| tpch_q20/parquet | 313072198 | 3.19009e+08 | 0.98139 | ns |
| tpch_q20/vortex-file-compressed | 207111636 | 2.16685e+08 | 0.95582 | ns |
| tpch_q21/arrow | 975965777 | 9.96145e+08 | 0.979743 | ns |
| tpch_q21/parquet | 1100913974 | 1.12573e+09 | 0.977957 | ns |
| tpch_q21/vortex-file-compressed | 970764020 | 9.90732e+08 | 0.979845 | ns |
| tpch_q22/arrow | 76628314 | 7.7492e+07 | 0.988855 | ns |
| tpch_q22/parquet | 107577414 | 1.08957e+08 | 0.987337 | ns |
| tpch_q22/vortex-file-compressed | 82741134 | 8.50379e+07 | 0.972991 | ns |

@robert3005
Member

I think this is preferable. The static thing always felt like a short-term workaround.

Contributor

Benchmarks: Clickbench

Table of Results
| name | PR 976da07 | base 1d36132 | ratio (PR/base) | unit |
| --- | --- | --- | --- | --- |
| clickbench_q00/parquet | 1912147 | 1.84216e+06 | 1.03799 | ns |
| clickbench_q01/parquet | 59710082 | 6.10587e+07 | 0.977913 | ns |
| clickbench_q02/parquet | 115819202 | 1.17718e+08 | 0.983873 | ns |
| clickbench_q03/parquet | 84122980 | 8.19706e+07 | 1.02626 | ns |
| clickbench_q04/parquet | 648252280 | 6.64052e+08 | 0.976207 | ns |
| clickbench_q05/parquet | 836881308 | 8.30202e+08 | 1.00805 | ns |
| clickbench_q06/parquet | 1989473 | 1.94516e+06 | 1.02278 | ns |
| clickbench_q07/parquet | 62541496 | 6.26823e+07 | 0.997754 | ns |
| clickbench_q08/parquet | 747259418 | 7.45145e+08 | 1.00284 | ns |
| clickbench_q09/parquet | 1047563979 | 1.04093e+09 | 1.00638 | ns |
| clickbench_q10/parquet | 251626963 | 2.53184e+08 | 0.993851 | ns |
| clickbench_q11/parquet | 301946647 | 3.05789e+08 | 0.987435 | ns |
| clickbench_q12/parquet | 835365368 | 8.15683e+08 | 1.02413 | ns |
| clickbench_q13/parquet | 1089672688 | 1.06036e+09 | 1.02764 | ns |
| clickbench_q14/parquet | 850394197 | 8.40992e+08 | 1.01118 | ns |
| clickbench_q15/parquet | 766391106 | 7.73472e+08 | 0.990845 | ns |
| clickbench_q16/parquet | 1716181099 | 1.65755e+09 | 1.03537 | ns |
| clickbench_q17/parquet | 1428751950 | 1.43566e+09 | 0.995189 | ns |
| clickbench_q18/parquet | 3030405805 | 3.00494e+09 | 1.00847 | ns |
| clickbench_q19/parquet | 66025654 | 6.41428e+07 | 1.02935 | ns |
| clickbench_q20/parquet | 1256811202 | 1.19311e+09 | 1.05339 | ns |
| clickbench_q21/parquet | 1416508228 | 1.42357e+09 | 0.995041 | ns |
| clickbench_q22/parquet | 2412245037 | 2.44051e+09 | 0.988418 | ns |
| clickbench_q23/parquet | 8268689700 | 8.32308e+09 | 0.993465 | ns |
| clickbench_q24/parquet | 532939702 | 5.30821e+08 | 1.00399 | ns |
| clickbench_q25/parquet | 514041485 | 5.12806e+08 | 1.00241 | ns |
| clickbench_q26/parquet | 601315192 | 5.90394e+08 | 1.0185 | ns |
| clickbench_q27/parquet | 1617566444 | 1.61425e+09 | 1.00205 | ns |
| clickbench_q28/parquet | 11486084235 | 1.15588e+10 | 0.993705 | ns |
| clickbench_q29/parquet | 428636531 | 4.37618e+08 | 0.979476 | ns |
| clickbench_q30/parquet | 768644118 | 7.82011e+08 | 0.982907 | ns |
| clickbench_q31/parquet | 805708077 | 8.33563e+08 | 0.966584 | ns |
| clickbench_q32/parquet | 2768350449 | 2.8165e+09 | 0.982904 | ns |
| clickbench_q33/parquet | 2838446645 | 2.88288e+09 | 0.984589 | ns |
| clickbench_q34/parquet | 2833271064 | 2.81636e+09 | 1.006 | ns |
| clickbench_q35/parquet | 853487287 | 8.61993e+08 | 0.990133 | ns |
| clickbench_q36/parquet | 171007417 | 1.75881e+08 | 0.972292 | ns |
| clickbench_q37/parquet | 87177787 | 8.66685e+07 | 1.00588 | ns |
| clickbench_q38/parquet | 112358154 | 1.14515e+08 | 0.981165 | ns |
| clickbench_q39/parquet | 328359111 | 3.23325e+08 | 1.01557 | ns |
| clickbench_q40/parquet | 49718652 | 5.106e+07 | 0.97373 | ns |
| clickbench_q41/parquet | 49188102 | 4.98389e+07 | 0.986942 | ns |
| clickbench_q42/parquet | 67198987 | 6.78308e+07 | 0.990685 | ns |
| clickbench_q00/vortex-file-compressed | 2061306 | 2.04523e+06 | 1.00786 | ns |
| clickbench_q01/vortex-file-compressed | 29578922 | 2.77991e+07 | 1.06402 | ns |
| clickbench_q02/vortex-file-compressed | 91914958 | 8.96519e+07 | 1.02524 | ns |
| clickbench_q03/vortex-file-compressed | 82567411 | 8.04188e+07 | 1.02672 | ns |
| clickbench_q04/vortex-file-compressed | 625199310 | 6.3389e+08 | 0.98629 | ns |
| clickbench_q05/vortex-file-compressed | 650419953 | 6.45317e+08 | 1.00791 | ns |
| clickbench_q06/vortex-file-compressed | 2147397 | 2.11042e+06 | 1.01752 | ns |
| clickbench_q07/vortex-file-compressed | 60089550 | 5.80498e+07 | 1.03514 | ns |
| clickbench_q08/vortex-file-compressed | 735808813 | 7.59196e+08 | 0.969195 | ns |
| clickbench_q09/vortex-file-compressed | 946062062 | 9.59783e+08 | 0.985704 | ns |
| clickbench_q10/vortex-file-compressed | 286873848 | 2.54635e+08 | 1.12661 | ns |
| clickbench_q11/vortex-file-compressed | 343358644 | 3.09823e+08 | 1.10824 | ns |
| clickbench_q12/vortex-file-compressed | 588696753 | 5.90131e+08 | 0.997569 | ns |
| clickbench_q13/vortex-file-compressed | 914780987 | 9.07128e+08 | 1.00844 | ns |
| clickbench_q14/vortex-file-compressed | 610657627 | 6.00085e+08 | 1.01762 | ns |
| clickbench_q15/vortex-file-compressed | 753764997 | 7.40349e+08 | 1.01812 | ns |
| clickbench_q16/vortex-file-compressed | 1451209163 | 1.40387e+09 | 1.03372 | ns |
| clickbench_q17/vortex-file-compressed | 1334665666 | 1.30415e+09 | 1.0234 | ns |
| clickbench_q18/vortex-file-compressed | 2970242121 | 2.93385e+09 | 1.0124 | ns |
| clickbench_q19/vortex-file-compressed | 46986414 | 4.3393e+07 | 1.08281 | ns |
| clickbench_q20/vortex-file-compressed | 496415639 | 5.0538e+08 | 0.982263 | ns |
| clickbench_q21/vortex-file-compressed | 762308666 | 7.71493e+08 | 0.988095 | ns |
| clickbench_q22/vortex-file-compressed | 1844472938 | 1.9305e+09 | 0.95544 | ns |
| clickbench_q23/vortex-file-compressed | 3975814229 | 4.00298e+09 | 0.993214 | ns |
| clickbench_q24/vortex-file-compressed | 363572514 | 3.59923e+08 | 1.01014 | ns |
| clickbench_q25/vortex-file-compressed | 322441034 | 3.22661e+08 | 0.999317 | ns |
| clickbench_q26/vortex-file-compressed | 422349536 | 4.18232e+08 | 1.00985 | ns |
| clickbench_q27/vortex-file-compressed | 1373843874 | 1.40692e+09 | 0.976489 | ns |
| clickbench_q28/vortex-file-compressed | 10710365575 | 1.07256e+10 | 0.998577 | ns |
| clickbench_q29/vortex-file-compressed | 696337853 | 6.78528e+08 | 1.02625 | ns |
| clickbench_q30/vortex-file-compressed | 583279254 | 5.9261e+08 | 0.984255 | ns |
| clickbench_q31/vortex-file-compressed | 619511521 | 6.20059e+08 | 0.999117 | ns |
| clickbench_q32/vortex-file-compressed | 2810488214 | 2.79847e+09 | 1.00429 | ns |
| clickbench_q33/vortex-file-compressed | 2196408568 | 2.22569e+09 | 0.986844 | ns |
| clickbench_q34/vortex-file-compressed | 2198647797 | 2.21627e+09 | 0.992047 | ns |
| clickbench_q35/vortex-file-compressed | 943898166 | 9.46139e+08 | 0.997632 | ns |
| clickbench_q36/vortex-file-compressed | 52564192 | 4.57112e+07 | 1.14992 | ns |
| clickbench_q37/vortex-file-compressed | 41124588 | 4.25824e+07 | 0.965765 | ns |
| clickbench_q38/vortex-file-compressed | 41673250 | 3.84227e+07 | 1.0846 | ns |
| clickbench_q39/vortex-file-compressed | 83812430 | 7.27625e+07 | 1.15186 | ns |
| clickbench_q40/vortex-file-compressed | 28242321 | 2.88393e+07 | 0.9793 | ns |
| clickbench_q41/vortex-file-compressed | 28912431 | 3.0271e+07 | 0.955121 | ns |
| clickbench_q42/vortex-file-compressed | 34644137 | 3.35341e+07 | 1.0331 | ns |

@@ -523,18 +523,20 @@ impl IntoCanonical for ArrayData {
         if !self.is_canonical() && self.len() > 1 {
             log::trace!("Canonicalizing array with encoding {:?}", self.encoding());
         }
-        self.encoding().into_canonical(self)
+        self.encoding().clone().into_canonical(self)
Contributor

We can save a bunch of these clones by changing into_canonical/into_arrow to take self: Arc<Self>.
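One way to read this suggestion, as a rough sketch rather than the PR's actual code: with a `self: Arc<Self>` receiver the callee takes over the caller's handle, so an array that owns its encoding can move the `Arc` out instead of cloning it. Everything here is hypothetical and simplified (`IdentityEncoding`, the `Canonical` struct, and passing the values separately instead of the whole array); `handles_seen` exists only to make the reference count observable.

```rust
use std::sync::Arc;

// Hypothetical result type; `handles_seen` records the Arc strong count
// observed inside the call so the two call styles can be compared.
struct Canonical {
    encoding_id: &'static str,
    values: Vec<i64>,
    handles_seen: usize,
}

// Reduced stand-in for the vtable trait, with an `Arc<Self>` receiver.
trait EncodingVTable: Send + Sync {
    fn id(&self) -> &'static str;

    // Takes ownership of the caller's Arc handle.
    fn into_canonical(self: Arc<Self>, values: Vec<i64>) -> Canonical;
}

struct IdentityEncoding;

impl EncodingVTable for IdentityEncoding {
    fn id(&self) -> &'static str {
        "example.identity"
    }

    fn into_canonical(self: Arc<Self>, values: Vec<i64>) -> Canonical {
        Canonical {
            encoding_id: self.id(),
            values,
            handles_seen: Arc::strong_count(&self),
        }
    }
}

// Hypothetical array that owns its encoding handle.
struct ArrayData {
    encoding: Arc<dyn EncodingVTable>,
    values: Vec<i64>,
}

impl ArrayData {
    // Destructure and move the Arc into the call: no refcount traffic.
    fn into_canonical(self) -> Canonical {
        let ArrayData { encoding, values } = self;
        encoding.into_canonical(values)
    }

    // Clone-first variant, for contrast: the array's own handle stays
    // alive during the call, so the callee sees one extra reference.
    fn into_canonical_cloning(self) -> Canonical {
        let encoding = Arc::clone(&self.encoding);
        encoding.into_canonical(self.values)
    }
}
```

In the real code `into_canonical` receives the whole array, so whether the clone can actually be dropped depends on how the encoding handle is stored and moved; the sketch only shows the receiver mechanics.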

3 participants