fix: Optimize subscript and array/map filter in favor of memory #11608
Conversation
✅ Deploy Preview for meta-velox canceled.
This pull request was exported from Phabricator. Differential Revision: D66253163
Force-pushed fix: Optimize subscript and array/map filter in favor of memory (facebookincubator#11608) from e7a06a0 to 15a05dc
Linux build failed due to unrelated test. Filed an issue for it: #11619
This pull request has been merged in 8d91c1c.
Conbench analyzed the 1 benchmark run on this commit. There were no benchmark performance regressions. 🎉 The full Conbench report has more details.
Summary:
Currently, these functions wrap the underlying elements vector of the
map/array in a dictionary layer whose indices point to the
selected/remaining elements. If the elements vector is large and only
a small subset of its elements is selected, this creates
counterproductive dictionaries whose base is much larger than the
dictionary itself. Large amounts of memory are then passed up the
execution pipeline and held onto. Furthermore, expression evaluation
can peel these dictionaries and operate on the large bases, creating
intermediate vectors of similarly large size. The added memory
pressure can push queries over their memory limits, resulting in
failures or spills.
With this change, the affected functions instead flatten the elements
vector whenever the dictionary would be less than 1/8th the size of
its base (the elements vector).
In one particular query, this reduced memory usage in the
FilterProject operator from 2.6GB to 380MB.
Differential Revision: D66253163
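
To make the size-ratio heuristic concrete, here is a minimal, self-contained C++ sketch. It deliberately does not use Velox types or APIs; `ElementsVector`, `Result`, `wrapOrFlatten`, and `kFlattenRatio` are illustrative assumptions that only model the 1/8th check described above, not the actual implementation.

```cpp
// Simplified, self-contained model of the heuristic described above.
// None of these types are Velox's; they only illustrate the size-ratio check.
#include <cstdint>
#include <iostream>
#include <memory>
#include <vector>

struct ElementsVector {
  std::vector<int64_t> values; // stands in for the (possibly huge) base
};

struct Result {
  // Either a dictionary (indices into a shared base) or a flat copy.
  std::shared_ptr<const ElementsVector> base; // set when dictionary-encoded
  std::vector<int32_t> indices;               // dictionary indices
  std::vector<int64_t> flat;                  // set when flattened
  bool isDictionary() const { return base != nullptr; }
};

// Assumed threshold from the summary: flatten when the selection is smaller
// than 1/8th of the base.
constexpr size_t kFlattenRatio = 8;

Result wrapOrFlatten(
    const std::shared_ptr<const ElementsVector>& base,
    std::vector<int32_t> selected) {
  Result out;
  if (selected.size() * kFlattenRatio < base->values.size()) {
    // Small selection over a large base: copy the selected elements so the
    // large base can be released instead of being kept alive by a dictionary.
    out.flat.reserve(selected.size());
    for (auto i : selected) {
      out.flat.push_back(base->values[i]);
    }
  } else {
    // Selection is a large fraction of the base: a dictionary wrap is cheap
    // and avoids copying.
    out.base = base;
    out.indices = std::move(selected);
  }
  return out;
}

int main() {
  auto base = std::make_shared<ElementsVector>();
  base->values.resize(1'000'000, 42);

  auto small = wrapOrFlatten(base, {1, 2, 3});                         // flattened copy
  auto large = wrapOrFlatten(base, std::vector<int32_t>(500'000, 0));  // dictionary

  std::cout << std::boolalpha << small.isDictionary() << " "
            << large.isDictionary() << "\n"; // prints: false true
}
```

The point of the flatten branch is that copying a handful of selected elements lets the large base be dropped, instead of being kept alive by a dictionary result that still references it.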