Incremental Result Streaming #69

@cawalch

Description

Problem

Current JMESPath implementations typically load entire datasets into memory, which is inefficient and often impossible for large data streams (e.g., logs, sensor data, large API responses). This leads to out-of-memory errors and performance bottlenecks in high-volume or resource-constrained environments.

Proposed Solution

Introduce a streaming JMESPath evaluation capability, for example jmespath.searchStream(expression, largeDataStream), where largeDataStream is a ReadableStream. This would allow data to be processed incrementally, with results emitted as they are evaluated, without buffering the entire dataset.
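To make the idea concrete, here is a minimal sketch of what such a function could look like in Node.js. It rests on assumptions that are not part of the proposal: the input is newline-delimited JSON (one record per line), the expression is applied to each record independently, and the caller passes in an `evaluate` callback instead of an expression string. The names `lines`, `searchRecords` and this `searchStream` signature are hypothetical. A single large JSON document would need a real streaming JSON parser instead of the line splitter.

```javascript
const { Readable } = require('node:stream');

// Hypothetical helper: re-chunks a stream of strings or bytes into lines.
async function* lines(source) {
  const decoder = new TextDecoder();
  let buffered = '';
  for await (const chunk of source) {
    buffered += typeof chunk === 'string'
      ? chunk
      : decoder.decode(chunk, { stream: true });
    let newline;
    while ((newline = buffered.indexOf('\n')) !== -1) {
      yield buffered.slice(0, newline);
      buffered = buffered.slice(newline + 1);
    }
  }
  buffered += decoder.decode(); // flush any bytes held back by the decoder
  if (buffered.length > 0) yield buffered;
}

// Parses one record per line and yields the non-null evaluation results.
async function* searchRecords(evaluate, source) {
  for await (const line of lines(source)) {
    if (line.trim() === '') continue;
    const result = evaluate(JSON.parse(line));
    // Readable.from() rejects null values, so "no match" results are skipped.
    if (result !== null && result !== undefined) yield result;
  }
}

// Sketch of the proposed entry point: returns an object-mode Readable.
function searchStream(evaluate, source) {
  return Readable.from(searchRecords(evaluate, source));
}
```

Only the current line and its parsed record are held in memory at any time. The `evaluate` callback could wrap the library's existing search call with a fixed expression; that wiring is left out here because the issue does not specify it.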

This approach supports:

  • Node.js environments, using Node's stream.Readable or the ReadableStream exported by stream/web.
  • Browser environments, using the Web Streams API (e.g., Response.body from fetch, File.stream()), which requires a streaming JSON parser.
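For the Web Streams case, the same per-record evaluation could be expressed as a `TransformStream`. This is only a sketch under stated assumptions: the input is newline-delimited JSON, `evaluate` is a caller-supplied callback, and `ndjsonSearchTransform` and `searchWebStream` are hypothetical names. It relies on the `TransformStream` and `TextDecoderStream` globals, which are available in current browsers and in Node.js 18 or later.

```javascript
// Hypothetical transform: decoded text chunks in, evaluation results out.
function ndjsonSearchTransform(evaluate) {
  let buffered = '';

  const handleLine = (line, controller) => {
    if (line.trim() === '') return;
    const result = evaluate(JSON.parse(line));
    // Skip null/undefined so "no match" records produce no output.
    if (result !== null && result !== undefined) controller.enqueue(result);
  };

  return new TransformStream({
    transform(text, controller) {
      buffered += text;
      const parts = buffered.split('\n');
      buffered = parts.pop(); // the last piece may be an incomplete line
      for (const line of parts) handleLine(line, controller);
    },
    flush(controller) {
      // Emit the final record if the input did not end with a newline.
      handleLine(buffered, controller);
    },
  });
}

// Sketch: byte stream in, stream of results out.
function searchWebStream(evaluate, byteStream) {
  return byteStream
    .pipeThrough(new TextDecoderStream())
    .pipeThrough(ndjsonSearchTransform(evaluate));
}
```

In a browser, `byteStream` could be the `Response.body` of a fetch call. The returned stream can be read with `getReader()`, or piped onward with `pipeTo()`.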

Example Usage

const stream = jmespath.searchStream(expression, largeDataReadableStream);
stream.on('data', (chunk) => {
  // Process results incrementally
  console.log(chunk);
});
stream.on('end', () => {
  console.log('Stream finished.');
});
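If the returned stream is a Node.js `Readable`, it could also be consumed with `for await...of` instead of event listeners. The sketch below assumes only that the result stream is async iterable; `consume` is a hypothetical helper and is not part of the proposed API.

```javascript
const { Readable } = require('node:stream');

// Hypothetical helper: processes results one at a time and reports how
// many were handled. Works with any async iterable, including Readable.
async function consume(resultStream, handle) {
  let count = 0;
  try {
    for await (const result of resultStream) {
      // The next result is not pulled until this handler settles,
      // so a slow consumer naturally slows the producer.
      await handle(result);
      count += 1;
    }
  } catch (err) {
    // Errors emitted by the stream or thrown by the handler end up here.
    throw new Error(`stream failed after ${count} results`, { cause: err });
  }
  return count;
}
```

For example, `consume(Readable.from([{ id: 1 }, { id: 2 }]), console.log)` logs both objects and resolves to 2. With the proposed API, the first argument would be the stream returned by `jmespath.searchStream(...)`.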

Benefits

  • Memory Efficiency: Drastically reduce memory footprint for large datasets.

  • Improved Performance: Process data as it arrives, enabling real-time or near real-time analysis.

  • Scalability: Better support for big data pipelines, log analysis, and API gateways in cybersecurity and other domains.

Labels: enhancement (New feature or request)
