Memory Limiter does not obey documented processor behaviour when used in multiple pipelines #11969

Open
pranavmarla opened this issue Dec 20, 2024 · 1 comment
Labels
bug Something isn't working

Describe the bug

Recently my team ran into an issue where the memory_limiter processor did not behave as expected when it was referenced in multiple pipelines. We believe this is because it does not actually follow the documented behaviour for processors in this situation.

Specifically, as per this documentation, when the same processor is referenced in multiple pipelines, each pipeline gets its own independent copy of that processor -- the processor is not "shared" across pipelines.

The same name of the processor can be referenced in the processors key of multiple pipelines. In this case, the same configuration is used for each of these processors, but each pipeline always gets its own instance of the processor. Each of these processors has its own state, and the processors are never shared between pipelines. For example, if batch processor is used in several pipelines, each pipeline has its own batch processor.

Based on this, when we referenced the same memory_limiter processor in multiple pipelines (e.g. A and B), we expected:

  1. The memory_limiter processor in pipeline A would only examine the memory used by pipeline A, and same for B.
  2. If pipeline A's memory usage went above the limit defined in the memory_limiter processor, only pipeline A would be halted -- pipeline B would not be impacted.

However, what we actually saw was that, as soon as pipeline A's memory usage breached the limit defined in the memory_limiter processor, both pipelines A and B were halted. This suggests that, contrary to what the documentation says should be the case, the memory_limiter processor is "shared" across pipelines -- i.e. the memory_limiter processor examines + limits the total memory usage of all the pipelines, not just its own individual pipeline.

This issue comment also implies that the memory_limiter does not obey the documented processor behaviour:

I also still confused why it is a processor :) you can only define once per instance right? it applies to all pipelines (or the least defined one wins for all pipelines)

Steps to reproduce

Configure an agent with one memory_limiter processor that is referenced in two different pipelines: a logs pipeline and a metrics pipeline. Generate a large volume of logs so that only the memory used by the logs pipeline increases, until it breaches the limit defined in the memory_limiter processor.

What did you expect to see?

Since only the logs pipeline's memory usage went over the limit defined in the memory_limiter processor, only the logs pipeline should get halted. The metrics pipeline should be unaffected and should continue sending metrics.

What did you see instead?

Both the logs pipeline and the metrics pipeline got halted, even though only the logs pipeline's memory usage was too high. We know the metrics pipeline was halted by the memory_limiter processor because, in the agent logs, the metrics pipeline generated the error message "data refused due to high memory usage", which is also present in the code for the memory_limiter processor.

What version did you use?

OTEL agent v0.102.1

Environment

Kubernetes

Suggested Solution

  1. One solution is to refactor the memory_limiter processor so it obeys the documented processor behaviour. However, I don't know how difficult this would be. I'm also not sure if the wider community would want its current behaviour to change.
  2. As I said, perhaps the current behaviour of the memory_limiter processor is actually desired. If that is the case, then the real "bug" here is just that its behaviour in this situation is not clearly documented. Regardless of whether or not we implement solution 1, I think we should at least document this anomalous behaviour of the memory_limiter processor to avoid such confusion in the future.
    Specifically:
    2a. In the general processor documentation, clearly note that the memory_limiter processor is an exception and behaves differently
    2b. The memory_limiter processor documentation currently only mentions its behaviour when referenced in a single pipeline. Instead, clearly document (with examples) how it behaves when referenced in multiple pipelines. In particular it should call out that it does not behave as per the current processor documentation.

Workaround?
Assuming the memory_limiter processor's current behaviour is not changed any time soon, is there any suggested workaround to get our desired behaviour (where, if the logs pipeline's memory usage goes too high, only the logs pipeline is halted)?
For example, would defining two different memory_limiter processors (e.g. memory_limiter/logs and memory_limiter/metrics), such that each pipeline gets a different memory_limiter processor, solve this issue? Or would it make no difference since, regardless of what they're named, both will examine the total memory usage of all the pipelines rather than just their own pipeline, which means each will always be impacted by the other pipeline?

dehaansa commented Jan 3, 2025

The memory_limiter processor does not examine the memory usage of a pipeline; it examines the memory usage of the collector host. While the documentation does not say this as explicitly as it could, it does not at any point mention the memory usage of a pipeline.

The behavior is consistent with the documentation: each pipeline gets its own instance, but each instance looks at total system memory, not pipeline memory. Are there specific changes to the documentation that you think would make this behavior clearer or more explicit?
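To make the distinction concrete, here is a minimal Go sketch of two independent limiter instances that both read the same global memory figure. It is not the collector's code: the `limiter` type, its fields, and `processHeapBytes` are hypothetical names, and reading `runtime.MemStats.Alloc` is an assumption about what "total memory" means here. Only the error message is taken from the issue.

```go
package main

import (
	"errors"
	"fmt"
	"runtime"
)

// errDataRefused reuses the message quoted in the issue; everything else
// in this file is hypothetical and not the collector's actual code.
var errDataRefused = errors.New("data refused due to high memory usage")

// limiter stands in for one memory_limiter instance in one pipeline.
type limiter struct {
	pipeline   string
	limitBytes uint64
	refused    int           // per-instance state, not shared
	readUsage  func() uint64 // source of the usage figure
}

// processHeapBytes reports heap usage for the whole process. It has no
// notion of which pipeline allocated the memory.
func processHeapBytes() uint64 {
	var ms runtime.MemStats
	runtime.ReadMemStats(&ms)
	return ms.Alloc
}

// check refuses data when the usage figure exceeds this instance's limit.
func (l *limiter) check() error {
	if l.readUsage() > l.limitBytes {
		l.refused++
		return errDataRefused
	}
	return nil
}

func main() {
	logs := &limiter{pipeline: "logs", limitBytes: 32 << 20, readUsage: processHeapBytes}
	metrics := &limiter{pipeline: "metrics", limitBytes: 32 << 20, readUsage: processHeapBytes}

	// Simulate only the logs pipeline buffering a lot of data.
	logBuffer := make([]byte, 64<<20)

	for _, l := range []*limiter{logs, metrics} {
		err := l.check()
		fmt.Printf("%s: err=%v refused=%d\n", l.pipeline, err, l.refused)
	}
	runtime.KeepAlive(logBuffer)
}
```

Each instance keeps its own `refused` counter, so the state really is per pipeline, yet both refuse data because the number they compare against the limit is global. Under the same assumption, giving the instances different names (memory_limiter/logs and memory_limiter/metrics) would change only their limits, not what they measure.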

If you're looking for a component to limit the memory usage of a single pipeline I would expect that would be a completely new component, or at minimum a completely different explicit mode of the memory_limiter processor (with comprehensive new documentation I hope!).
