Aggregations could reap unused key states #5840

cpwright · 2024-07-24T18:26:08Z

As a user, I want to build aggregations of recent data without consuming memory for states that have since been removed.

When a state loses its last row and is removed from the result, we would move its output position to a free list. We need to take care not to remove a state and add a new one at the same position on the same cycle.
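A minimal sketch of the free-list idea, not the engine's actual code. The class `SlotAllocator` and its method names are hypothetical. The assumption here is that positions released during a cycle are parked in a pending list and only become reusable once the cycle ends, so a removal and an add never share a position within one cycle.

```python
class SlotAllocator:
    """Hands out output positions for aggregation states, reusing freed ones.

    Hypothetical illustration only; names and structure are assumptions.
    """

    def __init__(self):
        self.next_position = 0   # next never-used output position
        self.free = []           # positions that are safe to reuse
        self.pending_free = []   # positions released during the current cycle

    def acquire(self):
        # Prefer a reclaimed position; otherwise grow the output.
        if self.free:
            return self.free.pop()
        position = self.next_position
        self.next_position += 1
        return position

    def release(self, position):
        # Defer reuse so the removal is published before any add
        # can appear at the same position.
        self.pending_free.append(position)

    def end_cycle(self):
        # Positions released this cycle become reusable on later cycles.
        self.free.extend(self.pending_free)
        self.pending_free.clear()


alloc = SlotAllocator()
a = alloc.acquire()
b = alloc.acquire()
alloc.release(a)
c = alloc.acquire()   # grows the output; position `a` is not reused this cycle
alloc.end_cycle()
d = alloc.acquire()   # reuses position `a` on a later cycle
```

Note that `d` lands at the position originally held by `a`, which is why a reincarnated key would no longer keep its initial encounter order.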

Instead of leaving empty output positions, we should use a scheme like our incremental rehash credits to shift the output rows down into the unused slots. This would keep our incremental operations from needing to perform shifts linear in output size on any given cycle, but would necessitate additional data movement.
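One way the credit-based compaction could look, as a rough sketch under stated assumptions. The function `compact_step` and the `credits` parameter are hypothetical. This version fills the lowest hole with the highest live row, which is a simplification chosen for brevity and does not preserve row order. Each call performs at most `credits` moves, so the work per cycle is bounded regardless of output size.

```python
def compact_step(slots, credits):
    """Move up to `credits` live rows from the tail into the lowest holes.

    `slots` is a list in which None marks an unused output position.
    Returns the (from_position, to_position) moves performed, which a
    real implementation would need to report downstream as shifts.
    """
    moves = []
    lo, hi = 0, len(slots) - 1
    while len(moves) < credits:
        # Find the lowest hole.
        while lo < hi and slots[lo] is not None:
            lo += 1
        # Find the highest live row.
        while hi > lo and slots[hi] is None:
            hi -= 1
        if lo >= hi:
            break  # no hole remains below a live row
        slots[lo], slots[hi] = slots[hi], None
        moves.append((hi, lo))
    # Drop trailing holes so the output actually shrinks.
    while slots and slots[-1] is None:
        slots.pop()
    return moves


slots = ["a", None, "b", None, None, "c", "d"]
first = compact_step(slots, credits=1)
after_first = list(slots)
second = compact_step(slots, credits=1)
third = compact_step(slots, credits=1)
```

With one credit per cycle, the seven-position output above becomes dense after two cycles, and the third call has nothing left to do.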

Making this change would be breaking, because we would no longer preserve initial encounter order for reincarnated states.

rcaudy · 2024-07-25T15:51:54Z

This is likely a performance loss with no gain for some cases (we need to track previous state for key columns, etc., and the code will be more complex). For long-running aggregations where buckets can go away, this may be a significant win in memory usage.

We have some engine tools that rely on states never moving, including AggregationRowLookup (used for tree lookup and data index lookup).

This will need to be configurable, at a minimum, which likely means adding a builder interface to aggBy.

cpwright added the feature request and triage labels Jul 24, 2024

rcaudy assigned rcaudy and cpwright Jul 25, 2024

rcaudy added the query engine, core, and breaking labels and removed the triage label Jul 25, 2024

rcaudy added this to the 4. Unscheduled milestone Jul 25, 2024

pete-petey added the 2023_unscheduled label Aug 26, 2024

pete-petey modified the milestones: 4. Unscheduled, 5. Backlog Aug 26, 2024
