[Crowdstrike FDR] Scale host metadata enrichment #12822

Open
@chemamartinez

As part of #2816, a cache processor was added to enrich FDR events with host and user metadata at ingest-time.

Right now, the cached metadata is stored locally in each agent, so the enrichment doesn't work when agents are scaled horizontally: an agent can only enrich events for hosts whose metadata it has cached itself.

CrowdStrike delivers FDR events containing only an opaque host ID, so we cannot directly associate an event with a named host and its metadata, such as OS or IP. To do that, we must enrich the event with that metadata ourselves through a lookup.

The ingest-time host metadata enrichment that exists today was designed for deployments with a single Elastic Agent. We should evaluate making it work when Agent is scaled horizontally.
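To make the failure mode concrete, here is a minimal Python sketch; it is not the actual cache processor. It assumes that host metadata and regular events can land on different agents. The `Agent` class and the field names `host_id`, `hostname`, and `os` are hypothetical and chosen only for illustration.

```python
class Agent:
    """Hypothetical stand-in for one Elastic Agent with its own in-memory cache."""

    def __init__(self, name: str) -> None:
        self.name = name
        # Host ID -> host metadata. This dict is local to this agent only.
        self.cache: dict[str, dict] = {}

    def put_metadata(self, host_id: str, metadata: dict) -> None:
        # Remember metadata that this particular agent has seen.
        self.cache[host_id] = metadata

    def enrich(self, event: dict) -> dict:
        # Look up the opaque host ID; return the event unchanged on a miss.
        metadata = self.cache.get(event["host_id"])
        if metadata is None:
            return event
        return {**event, "host": metadata}


agent_a = Agent("a")
agent_b = Agent("b")

# The metadata record happens to be processed by agent A.
agent_a.put_metadata("abc123", {"hostname": "web-01", "os": "linux"})

event = {"host_id": "abc123", "action": "process_start"}

enriched_by_a = agent_a.enrich(event)  # gains a "host" field
enriched_by_b = agent_b.enrich(event)  # unchanged: agent B never saw the metadata
```

Whether an event gets enriched depends on which agent happens to receive it, which is the behavior this issue wants to avoid.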

Query-time enrichment with ES|QL and an enrich table is possible, but there are trade-offs.

Ideas

  • Support storing data in Memcached or Redis (something that is available as a service on CSPs). Make the existing cache processor "multi-layer", with read-through to the distributed cache when the local in-memory cache doesn't contain the key.

  • ES|QL is adding a new lookup join feature that could be used to perform the metadata join at query time. That would simplify the architecture as it doesn't require any changes on the agent side. See package-spec#873, "[Discuss] Supporting ES|QL LOOKUP JOIN on integration data".
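The "multi-layer" cache idea could look roughly like the Python sketch below. This is an illustration under stated assumptions, not a proposal for the processor's real interface: `RemoteStore`, `InMemoryRemote`, and `LayeredCache` are hypothetical names, the in-memory class merely stands in for a Memcached or Redis client, and writing through to the shared store on `put` is an assumption of this sketch (the idea above only specifies the read-through path). Expiry and eviction are left out.

```python
from typing import Optional, Protocol


class RemoteStore(Protocol):
    """Minimal interface a distributed cache client would need to provide."""

    def get(self, key: str) -> Optional[dict]: ...
    def set(self, key: str, value: dict) -> None: ...


class InMemoryRemote:
    """Stand-in for a shared Memcached/Redis instance (illustration only)."""

    def __init__(self) -> None:
        self.data: dict[str, dict] = {}
        self.reads = 0  # counts remote lookups, to show when read-through happens

    def get(self, key: str) -> Optional[dict]:
        self.reads += 1
        return self.data.get(key)

    def set(self, key: str, value: dict) -> None:
        self.data[key] = value


class LayeredCache:
    """Local in-memory layer in front of a shared remote layer."""

    def __init__(self, remote: RemoteStore) -> None:
        self.local: dict[str, dict] = {}
        self.remote = remote

    def put(self, key: str, value: dict) -> None:
        # Assumption: write to both layers so other agents can find the key.
        self.local[key] = value
        self.remote.set(key, value)

    def get(self, key: str) -> Optional[dict]:
        value = self.local.get(key)
        if value is not None:
            return value
        # Local miss: read through to the shared store and remember the result.
        value = self.remote.get(key)
        if value is not None:
            self.local[key] = value
        return value


shared = InMemoryRemote()
cache_a = LayeredCache(shared)  # cache inside agent A
cache_b = LayeredCache(shared)  # cache inside agent B

cache_a.put("abc123", {"hostname": "web-01"})
first = cache_b.get("abc123")   # local miss on B, served by the shared store
second = cache_b.get("abc123")  # now served from B's local layer
```

With this layout, only the first lookup of a host ID on each agent pays the network round trip; later lookups stay in local memory.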
