Dk/alp validity in encoded only #2053

Draft: danking wants to merge 13 commits into `develop`

@danking danking commented Jan 22, 2025

Patches are now always non-nullable.

This required `PrimitiveArray::patch` to gracefully handle non-nullable patches when the array
being patched is nullable.
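
A minimal sketch of the patching behaviour described above, not the actual `PrimitiveArray::patch` implementation. `NullableF64` and this free-standing `patch` function are hypothetical stand-ins, and the sketch assumes patches only ever target positions that are already valid, so the array's validity can be left untouched:

```rust
/// Hypothetical stand-in for a nullable primitive array: values plus an
/// optional per-element validity (`None` means every element is valid).
#[derive(Debug, Clone, PartialEq)]
pub struct NullableF64 {
    pub values: Vec<f64>,
    pub validity: Option<Vec<bool>>,
}

/// Overwrite `array.values` at `indices` with `patch_values`.
///
/// The patches are non-nullable, so they carry no validity of their own.
/// Assumption: the array's validity is simply carried over unchanged.
pub fn patch(mut array: NullableF64, indices: &[usize], patch_values: &[f64]) -> NullableF64 {
    assert_eq!(indices.len(), patch_values.len(), "one value per patch index");
    for (&index, &value) in indices.iter().zip(patch_values) {
        array.values[index] = value;
    }
    array
}
```

Because the patches have no validity, the nullable and non-nullable cases share the same loop; only the values buffer is written.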

I modified the benchmarks to include patch manipulation time, but note that the test data has no
patches, so the benchmarks effectively measure the overhead of `is_valid`. If we had test data
where the invalid positions contained exceptional values, I would expect a modest improvement in
both decompression and compression time.
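
A sketch of how such test data could be generated; this is not part of the PR's benchmark code. `BenchInput`, `make_input`, and the `filler` parameter are hypothetical names, and the small xorshift generator is only there so the sketch needs no external crates:

```rust
/// Hypothetical benchmark input: values plus a per-element validity flag.
pub struct BenchInput {
    pub values: Vec<f64>,
    pub validity: Vec<bool>,
}

/// Build `n` values where roughly `fraction_valid` of the positions are valid.
/// Valid positions hold short two-decimal values; invalid positions hold
/// `filler`, which the caller picks to be a value that would be exceptional.
pub fn make_input(n: usize, fraction_valid: f64, filler: f64, seed: u64) -> BenchInput {
    let mut state = seed.max(1); // xorshift must not start at zero
    let mut values = Vec::with_capacity(n);
    let mut validity = Vec::with_capacity(n);
    for i in 0..n {
        // xorshift64: a tiny deterministic PRNG.
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        // Top 53 bits mapped to a float in [0, 1).
        let u = (state >> 11) as f64 / (1u64 << 53) as f64;
        let valid = u < fraction_valid;
        validity.push(valid);
        values.push(if valid { (i % 1000) as f64 / 100.0 } else { filler });
    }
    BenchInput { values, validity }
}
```

For example, `make_input(100_000, 0.25, std::f64::consts::PI, 42)` would leave roughly three quarters of the positions invalid and filled with a value that is assumed not to round-trip under the exponents chosen for the two-decimal data.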

As discussed [in Slack](https://spiraldb.slack.com/archives/C07BV3GKAJ2/p1736894376100079),
`is_valid` is expensive for two reasons: (a) creation of a Scalar and (b) conversion to an Arrow
BooleanBuffer. @gatesn is working on a FilterMask change that should subsume all our boolean
compression strategies into a new "BoolMask". With that in place, we can require `Validity::Array`
to hold a "BoolMask" and thus check validity without incurring either cost.
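
To illustrate why a mask-backed validity check can avoid both costs, here is a minimal sketch of a bit-packed mask. `BitMask` is a hypothetical stand-in written for this example; it is not the planned "BoolMask" API, whose design is not shown in this PR:

```rust
/// Hypothetical bit-packed validity mask (one bit per element).
pub struct BitMask {
    words: Vec<u64>,
    len: usize,
}

impl BitMask {
    /// Pack a slice of booleans into 64-bit words, least significant bit first.
    pub fn from_bools(bools: &[bool]) -> Self {
        let mut words = vec![0u64; (bools.len() + 63) / 64];
        for (i, &b) in bools.iter().enumerate() {
            if b {
                words[i / 64] |= 1u64 << (i % 64);
            }
        }
        Self { words, len: bools.len() }
    }

    /// A single-element check is one shift and one AND: no scalar is built
    /// and no buffer is converted.
    pub fn is_valid(&self, index: usize) -> bool {
        assert!(index < self.len, "index out of bounds");
        (self.words[index / 64] >> (index % 64)) & 1 == 1
    }

    /// Number of valid elements, counted a word at a time.
    pub fn valid_count(&self) -> usize {
        self.words.iter().map(|w| w.count_ones() as usize).sum()
    }
}
```

With a representation along these lines, a per-position validity check during patch handling would be a plain bit test.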

This PR
-------

```
Timer precision: 41 ns
alp_compress               fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ compress_alp                          │               │               │               │         │
│  ├─ f32                                │               │               │               │         │
│  │  ├─ (100000, 0.25)    170.2 µs      │ 442.1 µs      │ 170.7 µs      │ 173.6 µs      │ 100     │ 100
│  │  ├─ (100000, 0.95)    169.3 µs      │ 202.2 µs      │ 170.4 µs      │ 171.7 µs      │ 100     │ 100
│  │  ├─ (100000, 1.0)     136.3 µs      │ 147.1 µs      │ 136.9 µs      │ 137.2 µs      │ 100     │ 100
│  │  ├─ (10000000, 0.25)  13.6 ms       │ 16.46 ms      │ 14.39 ms      │ 14.23 ms      │ 100     │ 100
│  │  ├─ (10000000, 0.95)  13.64 ms      │ 17.7 ms       │ 14.44 ms      │ 14.38 ms      │ 100     │ 100
│  │  ╰─ (10000000, 1.0)   13.55 ms      │ 14.97 ms      │ 14.35 ms      │ 14.23 ms      │ 100     │ 100
│  ╰─ f64                                │               │               │               │         │
│     ├─ (100000, 0.25)    240.7 µs      │ 385.7 µs      │ 247.8 µs      │ 249.1 µs      │ 100     │ 100
│     ├─ (100000, 0.95)    240.2 µs      │ 253.2 µs      │ 243.5 µs      │ 244.4 µs      │ 100     │ 100
│     ├─ (100000, 1.0)     172.6 µs      │ 184.4 µs      │ 175.2 µs      │ 175.5 µs      │ 100     │ 100
│     ├─ (10000000, 0.25)  15.95 ms      │ 24.24 ms      │ 16.61 ms      │ 17.25 ms      │ 100     │ 100
│     ├─ (10000000, 0.95)  15.95 ms      │ 21.34 ms      │ 16.39 ms      │ 16.85 ms      │ 100     │ 100
│     ╰─ (10000000, 1.0)   15.92 ms      │ 23.41 ms      │ 16.46 ms      │ 17.04 ms      │ 100     │ 100
╰─ decompress_alp                        │               │               │               │         │
   ├─ f32                                │               │               │               │         │
   │  ├─ (100000, 0.25)    12.2 µs       │ 34.7 µs       │ 12.29 µs      │ 12.52 µs      │ 100     │ 100
   │  ├─ (100000, 0.95)    12.12 µs      │ 12.74 µs      │ 12.35 µs      │ 12.37 µs      │ 100     │ 100
   │  ├─ (100000, 1.0)     12.16 µs      │ 12.95 µs      │ 12.37 µs      │ 12.4 µs       │ 100     │ 100
   │  ├─ (10000000, 0.25)  2.117 ms      │ 4.544 ms      │ 2.637 ms      │ 2.674 ms      │ 100     │ 100
   │  ├─ (10000000, 0.95)  2.085 ms      │ 4.458 ms      │ 2.362 ms      │ 2.504 ms      │ 100     │ 100
   │  ╰─ (10000000, 1.0)   2.097 ms      │ 3.875 ms      │ 2.229 ms      │ 2.338 ms      │ 100     │ 100
   ╰─ f64                                │               │               │               │         │
      ├─ (100000, 0.25)    23.41 µs      │ 25.16 µs      │ 23.66 µs      │ 23.68 µs      │ 100     │ 100
      ├─ (100000, 0.95)    23.2 µs       │ 24.7 µs       │ 24.06 µs      │ 24.05 µs      │ 100     │ 100
      ├─ (100000, 1.0)     22.79 µs      │ 26.08 µs      │ 22.91 µs      │ 22.95 µs      │ 100     │ 100
      ├─ (10000000, 0.25)  4.216 ms      │ 6.862 ms      │ 4.416 ms      │ 4.568 ms      │ 100     │ 100
      ├─ (10000000, 0.95)  4.242 ms      │ 7.647 ms      │ 4.59 ms       │ 4.827 ms      │ 100     │ 100
      ╰─ (10000000, 1.0)   4.236 ms      │ 8.129 ms      │ 4.377 ms      │ 4.507 ms      │ 100     │ 100
```

Develop
-------

I patched develop with this PR's benchmark code so that both runs measure the same thing.

```
Timer precision: 41 ns
alp_compress               fastest       │ slowest       │ median        │ mean          │ samples │ iters
├─ compress_alp                          │               │               │               │         │
│  ├─ f32                                │               │               │               │         │
│  │  ├─ (100000, 0.25)    136 µs        │ 279.7 µs      │ 136.8 µs      │ 138.5 µs      │ 100     │ 100
│  │  ├─ (100000, 0.95)    136.2 µs      │ 149.7 µs      │ 136.9 µs      │ 137.6 µs      │ 100     │ 100
│  │  ├─ (100000, 1.0)     136 µs        │ 148.5 µs      │ 136.6 µs      │ 137.1 µs      │ 100     │ 100
│  │  ├─ (10000000, 0.25)  13.68 ms      │ 15.06 ms      │ 14.07 ms      │ 14.14 ms      │ 100     │ 100
│  │  ├─ (10000000, 0.95)  13.67 ms      │ 19.4 ms       │ 14.05 ms      │ 14.18 ms      │ 100     │ 100
│  │  ╰─ (10000000, 1.0)   13.66 ms      │ 14.73 ms      │ 13.87 ms      │ 14.04 ms      │ 100     │ 100
│  ╰─ f64                                │               │               │               │         │
│     ├─ (100000, 0.25)    167.7 µs      │ 301.4 µs      │ 172.7 µs      │ 173.2 µs      │ 100     │ 100
│     ├─ (100000, 0.95)    167.7 µs      │ 187.7 µs      │ 170.7 µs      │ 171.1 µs      │ 100     │ 100
│     ├─ (100000, 1.0)     167.6 µs      │ 183.7 µs      │ 170.9 µs      │ 171.1 µs      │ 100     │ 100
│     ├─ (10000000, 0.25)  15.54 ms      │ 21.9 ms       │ 15.92 ms      │ 16.05 ms      │ 100     │ 100
│     ├─ (10000000, 0.95)  15.61 ms      │ 16.74 ms      │ 15.97 ms      │ 16.03 ms      │ 100     │ 100
│     ╰─ (10000000, 1.0)   15.64 ms      │ 18.85 ms      │ 16.1 ms       │ 16.23 ms      │ 100     │ 100
╰─ decompress_alp                        │               │               │               │         │
   ├─ f32                                │               │               │               │         │
   │  ├─ (100000, 0.25)    12.37 µs      │ 85.49 µs      │ 12.49 µs      │ 13.22 µs      │ 100     │ 100
   │  ├─ (100000, 0.95)    12.29 µs      │ 12.74 µs      │ 12.43 µs      │ 12.44 µs      │ 100     │ 100
   │  ├─ (100000, 1.0)     11.7 µs       │ 11.95 µs      │ 11.83 µs      │ 11.81 µs      │ 100     │ 100
   │  ├─ (10000000, 0.25)  2.081 ms      │ 3.003 ms      │ 2.175 ms      │ 2.263 ms      │ 100     │ 100
   │  ├─ (10000000, 0.95)  2.082 ms      │ 2.228 ms      │ 2.109 ms      │ 2.124 ms      │ 100     │ 100
   │  ╰─ (10000000, 1.0)   2.082 ms      │ 2.904 ms      │ 2.13 ms       │ 2.202 ms      │ 100     │ 100
   ╰─ f64                                │               │               │               │         │
      ├─ (100000, 0.25)    23.66 µs      │ 25.66 µs      │ 23.83 µs      │ 23.86 µs      │ 100     │ 100
      ├─ (100000, 0.95)    23.16 µs      │ 24.62 µs      │ 24.04 µs      │ 23.89 µs      │ 100     │ 100
      ├─ (100000, 1.0)     22.87 µs      │ 23.49 µs      │ 23.04 µs      │ 23.03 µs      │ 100     │ 100
      ├─ (10000000, 0.25)  4.221 ms      │ 5.59 ms       │ 4.326 ms      │ 4.469 ms      │ 100     │ 100
      ├─ (10000000, 0.95)  4.242 ms      │ 5.319 ms      │ 4.545 ms      │ 4.536 ms      │ 100     │ 100
      ╰─ (10000000, 1.0)   4.228 ms      │ 5.652 ms      │ 4.342 ms      │ 4.519 ms      │ 100     │ 100
```

danking added the `benchmark` (Run benchmarks on this branch) label Jan 22, 2025
github-actions bot removed the `benchmark` (Run benchmarks on this branch) label Jan 22, 2025

danking commented Jan 22, 2025

previously: #1951

Benchmarks: random_access

Table of Results

| name | PR 47ddb6a | base ee7abec | ratio (PR/base) | unit |
|---|---|---|---|---|
| random-access/vortex-tokio-local-disk | 2.27409e+06 | 2.0574e+06 | 1.10532 | ns |
| random-access/vortex-local-fs | 2.80325e+06 | 2.4822e+06 | 1.12934 | ns |
| random-access/parquet-tokio-local-disk | 2.40006e+08 | 2.20467e+08 | 1.08862 | ns |

Benchmarks: datafusion

Table of Results

| name | PR 47ddb6a | base ee7abec | ratio (PR/base) | unit |
|---|---|---|---|---|
| arrow/planning | 941690 | 946100 | 0.995339 | ns |
| arrow/exec | 1.99437e+06 | 1.97353e+06 | 1.01056 | ns |
| vortex-pushdown-compressed/planning | 576362 | 577093 | 0.998733 | ns |
| vortex-pushdown-compressed/exec | 2.71224e+06 | 2.72758e+06 | 0.994379 | ns |
| vortex-pushdown-uncompressed/planning | 573850 | 585435 | 0.980211 | ns |
| vortex-pushdown-uncompressed/exec | 1.55374e+06 | 1.57173e+06 | 0.988554 | ns |
| vortex-nopushdown-compressed/planning | 949597 | 946758 | 1.003 | ns |
| vortex-nopushdown-compressed/exec | 3.10029e+06 | 3.24769e+06 | 0.954612 | ns |
| vortex-nopushdown-uncompressed/planning | 949927 | 955033 | 0.994653 | ns |
| vortex-nopushdown-uncompressed/exec | 5.18226e+06 | 5.1997e+06 | 0.996646 | ns |

Benchmarks: Clickbench

Table of Results

| name | PR 47ddb6a | base ee7abec | ratio (PR/base) | unit |
|---|---|---|---|---|
| clickbench_q00/parquet | 1842798 | 2.22916e+06 | 0.826677 | ns |
| clickbench_q01/parquet | 63075960 | 6.72286e+07 | 0.938231 | ns |
| clickbench_q02/parquet | 118515311 | 1.27739e+08 | 0.927792 | ns |
| clickbench_q03/parquet | 85854826 | 8.76762e+07 | 0.979226 | ns |
| clickbench_q04/parquet | 694157718 | 6.95585e+08 | 0.997948 | ns |
| clickbench_q05/parquet | 863553932 | 8.48193e+08 | 1.01811 | ns |
| clickbench_q06/parquet | 1944455 | 1.9152e+06 | 1.01528 | ns |
| clickbench_q07/parquet | 64348473 | 6.58015e+07 | 0.977918 | ns |
| clickbench_q08/parquet | 784103235 | 7.75254e+08 | 1.01141 | ns |
| clickbench_q09/parquet | 1095136232 | 1.07099e+09 | 1.02254 | ns |
| clickbench_q10/parquet | 262424750 | 2.62154e+08 | 1.00103 | ns |
| clickbench_q11/parquet | 307331900 | 3.11871e+08 | 0.985447 | ns |
| clickbench_q12/parquet | 864086174 | 8.63919e+08 | 1.00019 | ns |
| clickbench_q13/parquet | 1161506122 | 1.17411e+09 | 0.989267 | ns |
| clickbench_q14/parquet | 872684960 | 8.66189e+08 | 1.0075 | ns |
| clickbench_q15/parquet | 811117118 | 7.68233e+08 | 1.05582 | ns |
| clickbench_q16/parquet | 1741356461 | 1.70598e+09 | 1.02074 | ns |
| clickbench_q17/parquet | 1477164961 | 1.50937e+09 | 0.978665 | ns |
| clickbench_q18/parquet | 3054220387 | 3.15249e+09 | 0.968827 | ns |
| clickbench_q19/parquet | 64169428 | 6.6442e+07 | 0.965796 | ns |
| clickbench_q20/parquet | 1239762688 | 1.22519e+09 | 1.01189 | ns |
| clickbench_q21/parquet | 1397237235 | 1.45289e+09 | 0.961695 | ns |
| clickbench_q22/parquet | 2444405515 | 2.46858e+09 | 0.990209 | ns |
| clickbench_q23/parquet | 8289417943 | 8.46343e+09 | 0.979439 | ns |
| clickbench_q24/parquet | 535194187 | 5.33741e+08 | 1.00272 | ns |
| clickbench_q25/parquet | 522428558 | 5.08943e+08 | 1.0265 | ns |
| clickbench_q26/parquet | 591874189 | 5.95383e+08 | 0.994107 | ns |
| clickbench_q27/parquet | 1642537148 | 1.65142e+09 | 0.994623 | ns |
| clickbench_q28/parquet | 11500316221 | 1.13015e+10 | 1.0176 | ns |
| clickbench_q29/parquet | 426936947 | 4.30876e+08 | 0.990859 | ns |
| clickbench_q30/parquet | 783053726 | 7.80748e+08 | 1.00295 | ns |
| clickbench_q31/parquet | 808928733 | 8.13262e+08 | 0.994671 | ns |
| clickbench_q32/parquet | 2776089570 | 2.86576e+09 | 0.968711 | ns |
| clickbench_q33/parquet | 2909274909 | 2.91797e+09 | 0.99702 | ns |
| clickbench_q34/parquet | 2825298108 | 2.75928e+09 | 1.02392 | ns |
| clickbench_q35/parquet | 860467593 | 8.47991e+08 | 1.01471 | ns |
| clickbench_q36/parquet | 174704145 | 1.75245e+08 | 0.996914 | ns |
| clickbench_q37/parquet | 86428609 | 8.65908e+07 | 0.998127 | ns |
| clickbench_q38/parquet | 115545777 | 1.13534e+08 | 1.01772 | ns |
| clickbench_q39/parquet | 350876082 | 3.28117e+08 | 1.06936 | ns |
| clickbench_q40/parquet | 55706859 | 4.95501e+07 | 1.12425 | ns |
| clickbench_q41/parquet | 49439108 | 4.93016e+07 | 1.00279 | ns |
| clickbench_q42/parquet | 69636692 | 6.96685e+07 | 0.999544 | ns |
| clickbench_q00/vortex-file-compressed | 2009884 | 1.92749e+06 | 1.04275 | ns |
| clickbench_q01/vortex-file-compressed | 28563955 | 2.71911e+07 | 1.05049 | ns |
| clickbench_q02/vortex-file-compressed | 82216453 | 8.64883e+07 | 0.950608 | ns |
| clickbench_q03/vortex-file-compressed | 80412382 | 7.89193e+07 | 1.01892 | ns |
| clickbench_q04/vortex-file-compressed | 628318210 | 6.45918e+08 | 0.972752 | ns |
| clickbench_q05/vortex-file-compressed | 655190977 | 6.58859e+08 | 0.994433 | ns |
| clickbench_q06/vortex-file-compressed | 2090151 | 2.12875e+06 | 0.981868 | ns |
| clickbench_q07/vortex-file-compressed | 56935279 | 5.45219e+07 | 1.04426 | ns |
| clickbench_q08/vortex-file-compressed | 761953447 | 7.69171e+08 | 0.990617 | ns |
| clickbench_q09/vortex-file-compressed | 938015968 | 9.4969e+08 | 0.987707 | ns |
| clickbench_q10/vortex-file-compressed | 275819271 | 2.64699e+08 | 1.04201 | ns |
| clickbench_q11/vortex-file-compressed | 322406282 | 3.20442e+08 | 1.00613 | ns |
| clickbench_q12/vortex-file-compressed | 604144099 | 5.94991e+08 | 1.01538 | ns |
| clickbench_q13/vortex-file-compressed | 937704013 | 9.19434e+08 | 1.01987 | ns |
| clickbench_q14/vortex-file-compressed | 610114298 | 6.05295e+08 | 1.00796 | ns |
| clickbench_q15/vortex-file-compressed | 774528673 | 7.67532e+08 | 1.00912 | ns |
| clickbench_q16/vortex-file-compressed | 1438646246 | 1.42747e+09 | 1.00783 | ns |
| clickbench_q17/vortex-file-compressed | 1322132304 | 1.38722e+09 | 0.953079 | ns |
| clickbench_q18/vortex-file-compressed | 2855611260 | 2.89589e+09 | 0.986092 | ns |
| clickbench_q19/vortex-file-compressed | 40917902 | 4.57227e+07 | 0.894915 | ns |
| clickbench_q20/vortex-file-compressed | 505600179 | 5.14954e+08 | 0.981836 | ns |
| clickbench_q21/vortex-file-compressed | 767860322 | 7.70213e+08 | 0.996945 | ns |
| clickbench_q22/vortex-file-compressed | 1894356847 | 1.90139e+09 | 0.996299 | ns |
| clickbench_q23/vortex-file-compressed | 3717772552 | 3.73403e+09 | 0.995645 | ns |
| clickbench_q24/vortex-file-compressed | 374107342 | 3.76427e+08 | 0.993838 | ns |
| clickbench_q25/vortex-file-compressed | 332298312 | 3.32238e+08 | 1.00018 | ns |
| clickbench_q26/vortex-file-compressed | 425414657 | 4.65382e+08 | 0.914119 | ns |
| clickbench_q27/vortex-file-compressed | 1417967474 | 1.43135e+09 | 0.990651 | ns |
| clickbench_q28/vortex-file-compressed | 10861323993 | 1.08821e+10 | 0.998094 | ns |
| clickbench_q29/vortex-file-compressed | 706476934 | 6.93902e+08 | 1.01812 | ns |
| clickbench_q30/vortex-file-compressed | 591605992 | 5.91772e+08 | 0.999719 | ns |
| clickbench_q31/vortex-file-compressed | 614489036 | 6.11026e+08 | 1.00567 | ns |
| clickbench_q32/vortex-file-compressed | 2787199239 | 2.88495e+09 | 0.966117 | ns |
| clickbench_q33/vortex-file-compressed | 2217570266 | 2.33551e+09 | 0.949502 | ns |
| clickbench_q34/vortex-file-compressed | 2223051777 | 2.31571e+09 | 0.959986 | ns |
| clickbench_q35/vortex-file-compressed | 955634008 | 9.91634e+08 | 0.963696 | ns |
| clickbench_q36/vortex-file-compressed | 75478588 | 3.99229e+07 | 1.89061 | ns |
| clickbench_q37/vortex-file-compressed | 60033474 | 3.5985e+07 | 1.66829 | ns |
| clickbench_q38/vortex-file-compressed | 53437882 | 3.33813e+07 | 1.60083 | ns |
| clickbench_q39/vortex-file-compressed | 130890602 | 5.92301e+07 | 2.20987 | ns |
| clickbench_q40/vortex-file-compressed | 30841602 | 2.64551e+07 | 1.16581 | ns |
| clickbench_q41/vortex-file-compressed | 31369537 | 2.64928e+07 | 1.18408 | ns |
| clickbench_q42/vortex-file-compressed | 45551424 | 3.0369e+07 | 1.49993 | ns |

let (encoded, exceptional_positions) = T::chunked_encode(values.as_slice::<T>(), exponents);

let encoded_array = PrimitiveArray::new(encoded, values.validity()).into_array();
let exceptional_positions = match values.logical_validity() {

Could we add 1-2 comments in this section to make it clear what is going on?
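
For readers following along: the `match` arms are not shown in the excerpt above, so the following is only a guess at the intent, based on the PR description's statement that patches are always non-nullable. It sketches dropping exceptional positions that point at invalid elements. `LogicalValidity` here is a hypothetical three-variant stand-in and `valid_exceptional_positions` is an invented name, not code from this PR:

```rust
/// Hypothetical stand-in for the shapes a logical validity can take.
pub enum LogicalValidity {
    AllValid,
    AllInvalid,
    Mask(Vec<bool>),
}

/// Keep only the exceptional positions whose element is valid, so that the
/// resulting patches never need a validity of their own.
pub fn valid_exceptional_positions(
    positions: Vec<u64>,
    validity: &LogicalValidity,
) -> Vec<u64> {
    match validity {
        // Every element is valid: every exceptional position needs a patch.
        LogicalValidity::AllValid => positions,
        // No element is valid: nothing needs to be patched.
        LogicalValidity::AllInvalid => Vec::new(),
        // Mixed: drop positions whose element is null.
        LogicalValidity::Mask(mask) => positions
            .into_iter()
            .filter(|&p| mask[p as usize])
            .collect(),
    }
}
```

Comments of roughly this shape on each arm would probably address the request.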
