Consider optimizations for DuplicateKeyTracker memory utilization #406

Open
@aiwilliams

Integrations that process millions of entities and relationships, and that have rather large _key values, can run out of memory because the DuplicateKeyTracker keeps the complete set of values. Tracking a DuplicateKeyTrackerGraphObjectMetadata for each value also has a large impact on memory: the per-value cost is the sum of the bytes to define that object, the reference to it, and the references it holds to the _key and _type strings.
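To make the per-value cost concrete, here is a simplified model of the kind of tracking described above. The names `SimpleKeyTracker` and `KeyMetadata` are hypothetical stand-ins for illustration, not the SDK's actual classes, and the real implementation may differ.

```typescript
// Hypothetical stand-in for the per-value metadata described in this issue.
interface KeyMetadata {
  _type: string;
  _key: string;
}

// Simplified model: every registered value keeps its full _key string
// (as the map key) plus a metadata object that references _key and _type.
class SimpleKeyTracker {
  private readonly metadataByKey = new Map<string, KeyMetadata>();

  registerKey(metadata: KeyMetadata): void {
    if (this.metadataByKey.has(metadata._key)) {
      throw new Error(`Duplicate _key detected: ${metadata._key}`);
    }
    this.metadataByKey.set(metadata._key, metadata);
  }

  // Keeping the metadata around is what makes lookups like this possible.
  getMetadata(_key: string): KeyMetadata | undefined {
    return this.metadataByKey.get(_key);
  }

  get size(): number {
    return this.metadataByKey.size;
  }
}
```

With millions of values, the map holds one entry, one metadata object, and one full-length _key string per value, so memory grows with both the number of values and the length of each _key.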

This is not a problem for most integrations, which process small sets of configuration information. Any optimization should be opt-in, particularly if it comes at the cost of losing features those integrations need and that depend on keeping this information around.
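One possible shape for an opt-in optimization, as a rough sketch only: store a fixed-size numeric fingerprint per _key instead of the full string and a metadata object. Everything here (`CompactDuplicateKeyTracker`, `fingerprint`, `fnv1aStyleHash`, the seeds) is hypothetical and not part of the SDK. The hash is an FNV-1a style function over UTF-16 code units, chosen only to keep the example dependency-free; a real implementation would likely want a stronger hash.

```typescript
// FNV-1a style 32-bit hash over UTF-16 code units, with a caller-supplied seed.
function fnv1aStyleHash(input: string, seed: number): number {
  let hash = seed >>> 0;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash;
}

// Combine two differently seeded hashes into one 53-bit integer,
// which fits exactly in a JavaScript number.
function fingerprint(key: string): number {
  const high = fnv1aStyleHash(key, 0x811c9dc5) & 0x1fffff; // 21 bits
  const low = fnv1aStyleHash(key, 0x9747b28c); // 32 bits
  return high * 0x100000000 + low;
}

// Hypothetical opt-in tracker: keeps only fingerprints, not keys or metadata.
class CompactDuplicateKeyTracker {
  private readonly fingerprints = new Set<number>();

  registerKey(_key: string): void {
    const fp = fingerprint(_key);
    if (this.fingerprints.has(fp)) {
      // A fingerprint collision would also land here (a false positive).
      throw new Error(`Duplicate _key detected (or fingerprint collision): ${_key}`);
    }
    this.fingerprints.add(fp);
  }

  hasKey(_key: string): boolean {
    return this.fingerprints.has(fingerprint(_key));
  }

  get size(): number {
    return this.fingerprints.size;
  }
}
```

The trade-offs are the kind this issue warns about: per-value memory no longer depends on _key length, but there is no way to look up the _type for a key, and two distinct keys could in principle collide and be reported as duplicates. That is why something like this would need to be opt-in rather than the default.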
