Aggregations could reap unused key states #5840
Labels: 2023_unscheduled, breaking, core, feature request, query engine
As a user, I want to build aggregations of recent data without consuming memory for states that have since been removed.
When a state loses its last row and is removed from the result, we would move its output position to a free list. We need to take care not to remove a state and add a new state at the same output position on the same cycle.
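A minimal sketch of the free list idea, assuming a hypothetical `SlotAllocator` helper (not an existing engine class). To avoid removing and adding at the same position within one cycle, positions released during a cycle are parked in `pending_free` and only become reusable after `end_cycle()` is called. All names here are illustrative.

```python
class SlotAllocator:
    """Hands out output positions; freed positions are reused only in later cycles."""

    def __init__(self):
        self.next_position = 0   # first never-used output position
        self.free_list = []      # positions that are safe to reuse now
        self.pending_free = []   # positions released during the current cycle

    def allocate(self):
        # Prefer a previously freed position over growing the output.
        if self.free_list:
            return self.free_list.pop()
        position = self.next_position
        self.next_position += 1
        return position

    def release(self, position):
        # Defer reuse so a remove and an add never share a position in one cycle.
        self.pending_free.append(position)

    def end_cycle(self):
        # Positions released in this cycle become available to the next one.
        self.free_list.extend(self.pending_free)
        self.pending_free.clear()


alloc = SlotAllocator()
a = alloc.allocate()
b = alloc.allocate()
alloc.release(a)
c = alloc.allocate()   # position a is still pending, so a fresh position is used
alloc.end_cycle()
d = alloc.allocate()   # position a is reused on the following cycle
```

In this sketch the reused position is why initial encounter order would no longer hold for a reincarnated state: it lands wherever a free slot happens to be.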
Instead of leaving empty output positions, we should use a scheme like our incremental rehash credits to shift the output rows down to the unused slots. This would keep our incremental operations from needing to perform shifts linear in output size on a given cycle, but would necessitate additional data movement.
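One way to picture a credit-bounded compaction is sketched below. It assumes a hypothetical `compact_step` function that receives a fixed number of credits per cycle and spends one credit per row moved from the tail into an unused slot. As a simplification, this sketch does not preserve the relative order of the moved rows, and it is not the engine's rehash implementation. `None` marks an unused output position, and `holes` is the set of those positions.

```python
def compact_step(rows, holes, credits):
    """Move at most `credits` live rows from the tail into unused slots.

    rows    : list of output rows, where None marks an unused position
    holes   : set containing exactly the indices of the None entries in rows
    credits : maximum number of row moves allowed on this cycle
    Returns the (from_position, to_position) moves that were performed.
    """
    moves = []
    while credits > 0 and holes:
        last = len(rows) - 1
        if last in holes:
            # A trailing unused slot can be dropped without moving any data.
            holes.remove(last)
            rows.pop()
            continue
        # The tail row is live; move it into some unused slot below it.
        target = holes.pop()
        rows[target] = rows.pop()
        moves.append((last, target))
        credits -= 1
    return moves


rows = ["a", None, "b", None, "c", "d", None]
holes = {1, 3, 6}
first = compact_step(rows, holes, credits=1)    # one move on this cycle
second = compact_step(rows, holes, credits=1)   # one move on the next cycle
```

Each call does a bounded amount of data movement, so the cost of closing the gaps is spread across cycles rather than paid as one shift linear in the output size.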
Making this change would be breaking, because we would no longer preserve initial encounter order for reincarnated states.