Introduce enable_expression_evaluation_cache query config #6898
Conversation
This pull request was exported from Phabricator. Differential Revision: D49922027
Summary:
Some Velox optimizations cache vectors for possible reuse later to reduce runtime overhead.
The expression evaluator also has a code path, evalWithMemo, that caches the base vector of
dictionary input to avoid unnecessary computation later. These caches are cleared when Tasks
are destroyed. An internal streaming use case, however, observed large memory usage by these
caches when the streaming pipeline takes large nested-complex-typed input vectors, has a large
number of operators, and runs for a very long time without Task destruction.
This diff introduces an optimize_for_memory query config to trade performance for memory.
When this flag is set to true, optimizations including VectorPool, ExecCtx::decodedVectorPool_,
ExecCtx::selectivityVectorPool_, and Expr::evalWithMemo are disabled.
Differential Revision: D49922027
@kagamiori Thanks for the change, modulo a few minor comments!
velox/core/QueryCtx.h
Outdated
  }

 private:
  // Pool for all Buffers for this thread.
  memory::MemoryPool* pool_;
  QueryCtx* queryCtx_;

  bool optimizeForLowMemory_;
Make these const, and the same for pool_ and queryCtx_? Thanks!
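A minimal sketch of what that suggestion could look like, based only on the members visible in the diff above (the exact declarations in the merged code may differ):

  // All three are set once at construction, so they can be const-qualified.
  memory::MemoryPool* const pool_;
  QueryCtx* const queryCtx_;
  const bool optimizeForLowMemory_;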
velox/core/QueryConfig.h
Outdated
@@ -277,6 +277,11 @@ class QueryConfig {
  static constexpr const char* kValidateOutputFromOperators =
      "debug.validate_output_from_operators";

  /// If true, trade performance for memory. Optimizations including VectorPool
  /// and Expr::evalWithMemo are disabled.
  static constexpr const char* kOptimizeForLowMemory =
Shall we just call it enable_expression_eval_cache?
I think VectorPool is not limited to expression evaluation. It's part of ExecCtx inside OperatorCtx, so many operators can have VectorPool.
Then how about enable_operator_buffer_cache? The kOptimizeForLowMemory naming is too broad.
velox/expression/EvalCtx.cpp
Outdated
@@ -49,6 +51,8 @@ EvalCtx::EvalCtx(core::ExecCtx* execCtx, ExprSet* exprSet, const RowVector* row)
 EvalCtx::EvalCtx(core::ExecCtx* execCtx)
     : execCtx_(execCtx), exprSet_(nullptr), row_(nullptr) {
   VELOX_CHECK_NOT_NULL(execCtx);

   optimizeForLowMemory_ = execCtx->optimizeForLowMemory();
Can we put this in the ctor initializer list and make it const? Thanks!
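For illustration, a sketch of the constructor with the flag initialized in the initializer list, assuming optimizeForLowMemory_ is declared const in the header (not necessarily the exact merged code):

  EvalCtx::EvalCtx(core::ExecCtx* execCtx)
      : execCtx_(execCtx),
        exprSet_(nullptr),
        row_(nullptr),
        // Read the flag once; a const member cannot be assigned in the body.
        optimizeForLowMemory_(execCtx->optimizeForLowMemory()) {
    VELOX_CHECK_NOT_NULL(execCtx);
  }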
velox/vector/VectorPool.h
Outdated
@@ -73,7 +73,7 @@ class VectorPool {
 /// the allocated vector back to vector pool on destruction.
 class VectorRecycler {
  public:
-  explicit VectorRecycler(VectorPtr& vector, VectorPool& pool)
+  explicit VectorRecycler(VectorPtr& vector, VectorPool* pool)
Drop explicit since there is more than one input argument.
-    pool_.release(vector_);
+    if (pool_) {
+      pool_->release(vector_);
+    }
   }

  private:
   VectorPtr& vector_;
VectorPool* const pool_;
VectorPtr& vector_;
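Putting these suggestions together, one possible shape for VectorRecycler, inferred from the diff fragments above (a sketch, not the exact merged code):

  /// Hands a vector from the pool to the caller and returns the allocated
  /// vector back to the vector pool on destruction.
  class VectorRecycler {
   public:
    // Two arguments, so 'explicit' is not needed. A null pool disables reuse
    // and makes the destructor a no-op.
    VectorRecycler(VectorPtr& vector, VectorPool* pool)
        : vector_(vector), pool_(pool) {}

    ~VectorRecycler() {
      if (pool_) {
        pool_->release(vector_);
      }
    }

   private:
    VectorPtr& vector_;
    VectorPool* const pool_;
  };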
Updated. Please let me know if you have further suggestions. Thanks! @mbasmanova, @xiaoxmeng.
velox/core/QueryConfig.h
Outdated
@@ -567,6 +575,10 @@ class QueryConfig {
    return get<bool>(kValidateOutputFromOperators, false);
  }

  bool enableExpressionEvaluationCache() const {
Rename to isExpressionEvaluationCacheEnabled to avoid giving the impression that this method enables the cache.
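A sketch of the renamed accessor; the constant name and the default of true are assumptions inferred from the summary, which says the caches are disabled only when the flag is set to false:

  bool isExpressionEvaluationCacheEnabled() const {
    return get<bool>(kEnableExpressionEvaluationCache, true);
  }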
This pull request has been merged in 36f9621.
Conbench analyzed the 1 benchmark run. There was 1 benchmark result indicating a performance regression.
The full Conbench report has more details.
Summary:
Some Velox optimizations cache vectors for possible reuse later to reduce runtime overhead.
Expression evaluator also has a code path evalWithMemo that caches the base vector of
dictionary input to avoid unnecessary computation later. These caches are cleared when
Tasks are destroyed. An internal streaming use case, however, observed large memory usage
by these caches when the streaming pipeline takes large nested-complex-typed input vectors,
has a large number of operators, and runs for a very long time without Task destruction.
This diff introduces an enable_expression_evaluation_cache query config. When this flag is set to false,
optimizations including VectorPool, DecodedVector pool, SelectivityVector pool, and Expr::evalWithMemo are disabled.
Differential Revision: D49922027
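For reference, a minimal usage sketch showing how a client might turn the caches off through the query config. The map-based QueryConfig constructor and the accessor name are assumptions based on the review discussion above; consult velox/core/QueryConfig.h for the exact API.

  #include <string>
  #include <unordered_map>

  #include "velox/core/QueryConfig.h"

  using facebook::velox::core::QueryConfig;

  int main() {
    // Disable the expression evaluation caches for a long-running streaming
    // pipeline whose Task is never destroyed.
    std::unordered_map<std::string, std::string> values{
        {"enable_expression_evaluation_cache", "false"}};
    QueryConfig config(std::move(values));

    // With the flag set to false, VectorPool, the DecodedVector and
    // SelectivityVector pools, and Expr::evalWithMemo are all disabled.
    return config.isExpressionEvaluationCacheEnabled() ? 1 : 0;
  }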