Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Optimize subscript and array/map filter in favor of memory #11608

Closed

Conversation

bikramSingh91
Copy link
Contributor

Summary:
Currently, these functions wrap the underlying element vectors of the
map/array with a dictionary layer where the indices point to the
selected/remaining elements. If the element vector is large and only
a small subset of elements are selected, then this can create
counterproductive dictionaries where the base is much larger than the
dictionary. This can then result in large amounts of memory being
passed up the execution pipeline, holding onto memory, and
furthermore, expression eval can peel these vectors and operate on
the large bases that can end up creating intermediate vectors of
similar large size. This can put further memory pressure on queries,
causing them to hit their memory limits, resulting in failures or
spills.

This change ensures that the aforementioned functions that can
generate these kinds of results would instead flatten the elements
vector if the size of the dictionary is less than 1/8th the size of
the base (elements vector).

This helped reduce memory usage in the filterProject operator in a
particular query from 2.6GB to 380MB.

Differential Revision: D66253163

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Nov 20, 2024
Copy link

netlify bot commented Nov 20, 2024

Deploy Preview for meta-velox canceled.

Name Link
🔨 Latest commit 15a05dc
🔍 Latest deploy log https://app.netlify.com/sites/meta-velox/deploys/673f7ecd55937f000894e110

@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D66253163

velox/vector/BaseVector.cpp Outdated Show resolved Hide resolved
…bookincubator#11608)

Summary:

Currently, these functions wrap the underlying element vectors of the
map/array with a dictionary layer where the indices point to the
selected/remaining elements. If the element vector is large and only
a small subset of elements are selected, then this can create
counterproductive dictionaries where the base is much larger than the
dictionary. This can then result in large amounts of memory being
passed up the execution pipeline, holding onto memory, and
furthermore, expression eval can peel these vectors and operate on
the large bases that can end up creating intermediate vectors of
similar large size. This can put further memory pressure on queries,
causing them to hit their memory limits, resulting in failures or
spills.

This change ensures that the aforementioned functions that can
generate these kinds of results would instead flatten the elements
vector if the size of the dictionary is less than 1/8th the size of
the base (elements vector).

This helped reduce memory usage in the filterProject operator in a
particular query from 2.6GB to 380MB.

Differential Revision: D66253163
@facebook-github-bot
Copy link
Contributor

This pull request was exported from Phabricator. Differential Revision: D66253163

@bikramSingh91 bikramSingh91 requested a review from Yuhta November 21, 2024 19:22
@bikramSingh91
Copy link
Contributor Author

Linux build failed due to unrelated test. Filed an issue for it: #11619

@facebook-github-bot
Copy link
Contributor

This pull request has been merged in 8d91c1c.

Copy link

Conbench analyzed the 1 benchmark run on commit 8d91c1cc.

There were no benchmark performance regressions. 🎉

The full Conbench report has more details.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. fb-exported Merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants