Skip to content

Do not instrument "hello" #4994

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

michaldo
Copy link

@michaldo michaldo commented Jun 4, 2025

When MongoObservationCommandListener is enabled according to documentation, there is a race in metrics registration:

The meter (MeterId{name='spring.data.mongodb.command.active', tags=[tag(db.name=test),tag(db.operation=hello),tag(db.system=mongodb),tag(net.peer.name=localhost),tag(net.peer.port=27017),tag(net.transport=IP.TCP),tag(spring.data.mongodb.cluster_id=6840b93dabcd29b2f0526362)]}) registration has failed: ...

Prometheus does not allow register metric with different tags. The problem is that healthcheck calls Mongo command "hello" which does not refer to any collection. Consequently, tag "db.mongodb.collection" is absent for "hello" but present in regular commands.

Event if regular commands always wins the race (and metrics always has tag "collection"), the race must be removed.

My proposal is skip "hello" instrumentation, like "admin" commands

  • You have read the Spring Data contribution guidelines.
  • You use the code formatters provided here and have them applied to your changes. Don’t submit any formatting related changes.
  • You submit test cases (unit or integration tests) that back your changes.
  • You added yourself as author in the headers of the classes you touched. Amend the date range in the Apache license header if needed. For new types, add the license header (copy from another file and set the current year only).

When MongoObservationCommandListener is enabled according to documentation, there is a race in metrics registration:

> The meter (MeterId{name='spring.data.mongodb.command.active', tags=[tag(db.name=test),tag(db.operation=hello),tag(db.system=mongodb),tag(net.peer.name=localhost),tag(net.peer.port=27017),tag(net.transport=IP.TCP),tag(spring.data.mongodb.cluster_id=6840b93dabcd29b2f0526362)]}) registration has failed: ...

Prometheus does not allow register metric with different tags. The problem is that healthcheck calls Mongo command "hello" which does not refer to any collection. Consequently, tag "db.mongodb.collection" is absent for "hello" but present in regular commands.

Event if regular commands always wins the race (and metrics always has tag "collection"), the race must be removed.

My proposal is skip "hello" instrumentation, like "admin" commands

Signed-off-by: michaldo <[email protected]>
@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged label Jun 4, 2025
@michaldo
Copy link
Author

michaldo commented Jun 5, 2025

BTW, micrometer has different solution for this problem. (Maybe better?)

io.micrometer.core.instrument.binder.mongodb.DefaultMongoCommandTagsProvider:

    private String getAndRemoveCollectionNameForCommand(CommandEvent event) {
        String collectionName = inFlightCommandCollectionNames.remove(event.getRequestId());
        return collectionName != null ? collectionName : "unknown";
    }

@mp911de
Copy link
Member

mp911de commented Jun 5, 2025

@mp911de
Copy link
Member

mp911de commented Jun 5, 2025

The collection may be absent for a wide number of commands. Clearly, a hello command might not make much sense in all arrangements. On the other hand, it is a metric over time how latency evolves.

I suggest that you implement a ObservationPredicate that checks MongoHandlerContext whether a collection name is associated with it to filter unwanted observations.

Paging @jonatan-ivanov for further guidance

@mp911de mp911de added the status: waiting-for-feedback We need additional information before we can continue label Jun 5, 2025
@jonatan-ivanov
Copy link
Member

jonatan-ivanov commented Jun 5, 2025

I'm not sure what you mean by "race" but this is not a race condition. Order does not matter here, if you try to create something with the same name but different set of tags (e.g.: a tag is sometimes there, sometimes not), this is happening in Prometheus.

The problem is that healthcheck calls Mongo command "hello" which does not refer to any collection. Consequently, tag "db.mongodb.collection" is absent for "hello" but present in regular commands.

I think if you want to solve this issue, the solution is not removing instrumentation but improving it. This means that if a value is missing, use a value that signals that it is missing, e.g. instead of:

if (collection != null) {
    observation.lowCardinalityKeyValue("collection", collection);
}

this can work:

observation.lowCardinalityKeyValue("collection", collection != null ? collection : KeyValue.NONE_VALUE);

I think, this is the interesting part for this exact issue:

if (!ObjectUtils.isEmpty(context.getCollectionName())) {
keyValues = keyValues
.and(LowCardinalityCommandKeyNames.MONGODB_COLLECTION.withValue(context.getCollectionName()));
}

I think a fix could be something like this?

if (!ObjectUtils.isEmpty(context.getCollectionName())) {
    keyValues = keyValues.and(MONGODB_COLLECTION.withValue(context.getCollectionName()));
}
else {
    keyValues = keyValues.and(MONGODB_COLLECTION.withValue(KeyValue.NONE_VALUE));
}

I think not instrumenting hello is a different question but regardless if it is valid or not, I think this might be a bug that could be fixed in patch versions while nit instrumenting hello can be a breaking change so it should rather happen in a minor version.

Also, other than @mp911de's suggestion to use an ObservationPredicate to ignore these Observations (the simplest solution), you can also:

  • Use an ObservationFilter or an ObservationConvention to add "none value" in case if it is missing from the Observation
  • Use a MeterFilter to do the same but only for metrics

@spring-projects-issues spring-projects-issues added status: feedback-provided Feedback has been provided and removed status: waiting-for-feedback We need additional information before we can continue labels Jun 5, 2025
@michaldo
Copy link
Author

michaldo commented Jun 5, 2025

@jonatan-ivanov thanks for clarification, I will update my PR

I'm not sure what you mean by "race" but this is not a race condition. Order does not matter here, if you try to create something with the same name but different set of tags (e.g.: a tag is sometimes there, sometimes not), this is happening in Prometheus.

By race I mean that code

Counter counter  = Metrics.globalRegistry.counter("t1");
counter.increment();

Counter counter2  = Metrics.globalRegistry.counter("t1", "tag-name", "tag-value");
counter2.increment();
counter2.increment();

if visible on http://localhost:8080/actuator/prometheus?includedNames=t1
as

t1_total 1.0

But code

Counter counter2  = Metrics.globalRegistry.counter("t1", "tag-name", "tag-value");
counter2.increment();
counter2.increment();

Counter counter  = Metrics.globalRegistry.counter("t1");
counter.increment();

shows

t1_total{tag_name="tag-value"} 2.0

It means who first registers metrics, decided which tags are present

If tags match, both counters work

 Counter counter2  = Metrics.globalRegistry.counter("t1", "tag-name", "tag-value");
 counter2.increment();
 counter2.increment();

 Counter counter  = Metrics.globalRegistry.counter("t1","tag-name", "other-value");
 counter.increment()

t1_total{tag_name="other-value"} 1.0
t1_total{tag_name="tag-value"} 2.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status: feedback-provided Feedback has been provided status: waiting-for-triage An issue we've not yet triaged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants