You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Partial aggregation is expected to reduce data size transmitted from mapper to reducer. So far Velox's partial aggregation flushes intermediate data out once memory usage reaches limit, so more or less the data reduction ratio will be subject to the frequency of flushing.
This issue would introduce the possibility of brining spilling support to partial aggregation, to make it be able to maintain the lowest data reduction ratio despites the memory resource limit. This could be useful when:
When reducer number is far less than mapper number
When network between mappers and reducers is bad
When shuffle/exchange cost between mappers and reducers is high
In the fix, we'd introduce an option which is by default false to enable spilling support for partial aggregation. When the option is true, disable flushing at the same time.
The text was updated successfully, but these errors were encountered:
Description
For more earlier discussion, refer to #7511
Partial aggregation is expected to reduce data size transmitted from mapper to reducer. So far Velox's partial aggregation flushes intermediate data out once memory usage reaches limit, so more or less the data reduction ratio will be subject to the frequency of flushing.
This issue would introduce the possibility of brining spilling support to partial aggregation, to make it be able to maintain the lowest data reduction ratio despites the memory resource limit. This could be useful when:
In the fix, we'd introduce an option which is by default false to enable spilling support for partial aggregation. When the option is true, disable flushing at the same time.
The text was updated successfully, but these errors were encountered: