Request more fine-grained control for remote sampling #6127

garrettlish · 2024-09-19T09:29:30Z

Problem Statement

The PerOperationSampler provides operation-level customized sampling probabilities but lacks support for more fine-grained control, such as adjusting sampling probabilities based on specific tag key-value pairs (see sampling.proto).

For instance, in a real-world scenario, we might want to enforce sampling for a particular user session by customizing sampling probabilities based on specific tag key-value pairs.

Proposed Solution

Introduce a tag key-value pair in OperationSamplingStrategy to enable fine-grained control for remote sampling.


dmathieu · 2024-09-20T07:57:39Z

cc @yurishkuro as codeowner of the jaegerremote sampler.

garrettlish · 2024-09-23T09:05:12Z

Thanks @dmathieu! @yurishkuro, the proposed schema change is shown below. You're right that a schema change alone is insufficient; we can also make corresponding adjustments to the OTEL SDKs to enable fine-grained control for remote sampling. What are your thoughts on this?

message Tag {
  string key = 1;
  string value = 2;
}

message TagBasedSamplingStrategy {
  repeated Tag matchingTags = 1;
  ProbabilisticSamplingStrategy probabilisticSampling = 2;
}

message OperationSamplingStrategy {
  string operation = 1;

  // Default sampling probability for the operation.
  ProbabilisticSamplingStrategy defaultSampling = 2;

  // Tag-based sampling customization, which overrides default sampling when matched.
  repeated TagBasedSamplingStrategy tagBasedSampling = 3;
}
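One possible reading of how an SDK could evaluate these messages is sketched below in Python. This is not taken from any SDK: the class and function names are hypothetical mirrors of the proto messages above, and two semantics the proposal does not spell out are assumed here, namely that a rule matches only when every one of its matchingTags is present on the span with an equal value, and that the first matching rule wins.

```python
from dataclasses import dataclass, field


@dataclass
class TagBasedSamplingStrategy:
    # Hypothetical mirror of the proposed message: tag key -> required value.
    matching_tags: dict
    sampling_rate: float


@dataclass
class OperationSamplingStrategy:
    operation: str
    default_rate: float
    tag_based: list = field(default_factory=list)


def rate_for_span(strategy, span_tags):
    """Return the tag-based rate if a rule matches, else the operation default."""
    for rule in strategy.tag_based:
        # Assumption: an empty rule never matches, and all listed tags must match.
        if rule.matching_tags and all(
            span_tags.get(key) == value for key, value in rule.matching_tags.items()
        ):
            return rule.sampling_rate
    return strategy.default_rate


# Example: force sampling for one user session on a single operation.
strategy = OperationSamplingStrategy(
    operation="GET /cart",
    default_rate=0.01,
    tag_based=[TagBasedSamplingStrategy({"session.id": "abc123"}, 1.0)],
)
```

With this sketch, a span on GET /cart carrying session.id=abc123 resolves to 1.0, while any other span on that operation falls back to 0.01.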

yurishkuro · 2024-09-28T16:55:37Z

The current implementation performs stratified sampling by dividing all requests into strata where each stratum corresponds to one of the endpoints of the service. In order to make sampling sensitive to tags the strata need to be redefined carefully, such that the overall space remains deterministically partitioned.
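For illustration, per-operation stratification can be sketched as a lookup followed by a deterministic decision on the trace ID. This is a minimal sketch and not the jaegerremote sampler's code: the function names, the example operations, and the 63-bit threshold comparison are all assumptions made for the example.

```python
def rate_for_operation(per_operation, default_rate, operation):
    """Each operation (endpoint) is one stratum with its own probability."""
    return per_operation.get(operation, default_rate)


def should_sample(trace_id, rate):
    """Deterministic decision: the same trace ID and rate always give the same answer."""
    low_bits = trace_id & ((1 << 63) - 1)  # keep the low 63 bits of the trace ID
    return low_bits < int(rate * (1 << 63))


# Hypothetical per-operation probabilities with a service-wide default of 0.01.
per_operation = {"GET /checkout": 0.5, "GET /health": 0.0}
rate = rate_for_operation(per_operation, 0.01, "GET /checkout")
decision = should_sample(0x1234, rate)
```

Because every request has exactly one operation name, each request lands in exactly one stratum, which is the property that a tag-aware design would need to preserve.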

I don't agree that the proposed schema change is the best partitioning, because it only does sub-partitioning within existing strata by the endpoint. But I can easily imagine someone wanting to say "sample all errors 100%" regardless of the endpoint, which is not really possible via this schema.

Fundamentally, I think this requires a large re-design where we treat all dimensions of a span equally. There is technically nothing special about the operation; we can consider it as yet another attribute. Then the sampling expression language becomes more uniform. It requires more complex house-keeping by the sampler, but I think the same complexity would be introduced by the proposed change anyway (going from one-level maps to two-level maps), only it will be a lot more rigid.
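To make the idea concrete, here is a rough sketch of a flat, ordered rule list in which the operation is just one more attribute. Nothing here is an existing Jaeger or OTEL data model: the Rule shape, the attribute names "operation" and "error", and the first-match-wins ordering are assumptions chosen so that every span still falls into exactly one stratum.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    match: dict  # attribute name -> required value (hypothetical shape)
    rate: float


def assign_stratum(rules, default_rate, attrs):
    """Return (stratum index, rate) for a span's attributes.

    The first matching rule wins, so the rules partition the span space
    deterministically. Index -1 denotes the default stratum.
    """
    for index, rule in enumerate(rules):
        if all(attrs.get(key) == value for key, value in rule.match.items()):
            return index, rule.rate
    return -1, default_rate


rules = [
    Rule({"error": "true"}, 1.0),               # sample all errors, any endpoint
    Rule({"operation": "GET /checkout"}, 0.5),  # operation treated as an attribute
]
```

The returned stratum index is what a sampler, or an adaptive sampling backend tracking the same strata, would use for its per-stratum bookkeeping in this sketch.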

There are a couple of proposals in the OTEL spec (I don't have time to look for them) that propose more generic configuration for samplers. We need to align with those proposals, not build something bespoke. Especially because changing the sampling definition schema means that Jaeger's adaptive sampling would also need to be changed accordingly (keeping track of the same strata as the SDK), and right now it's pretty much hard-coded with service/operation partitioning.

garrettlish · 2024-10-01T21:42:07Z

Thanks @yurishkuro for your reply. Your concerns are well-founded. Relying solely on endpoint-based partitioning does limit the flexibility needed for more advanced use cases. As you pointed out, redesigning the sampling approach to treat all dimensions of a span equally could enable fine-grained control in remote sampling.

While there are several proposals in the OTEL spec, most don’t directly address remote sampling strategies. Do you think it's worth iterating on our current approach to introduce a v3 version that supports fine-grained control in remote sampling?

yurishkuro · 2024-10-02T03:04:46Z

Yes, I would be supportive of designing a more flexible sampling strategy data model and gradually implementing it.

garrettlish added the "enhancement" (New feature or request) and "sampler: jaegerremote" labels Sep 19, 2024

garrettlish mentioned this issue Sep 19, 2024

[Feature]: Request more fine-grained control for remote sampling jaegertracing/jaeger-idl#106

Open

garrettlish mentioned this issue Oct 10, 2024

Add a more flexible sampling strategy data model jaegertracing/jaeger-idl#107

Open
