
[Bug] pulsar-recovery and pulsar-bookie components do not supply metrics in the standard Prometheus format. #555

Closed
2 of 3 tasks
JWebDev opened this issue Dec 1, 2024 · 6 comments · Fixed by #577

Comments

JWebDev commented Dec 1, 2024

Search before asking

  • I searched in the issues and found nothing similar.

Read release policy

  • I understand that unsupported versions don't get bug fixes. I will attempt to reproduce the issue on a supported version of Pulsar client and Pulsar broker.

Version

apachepulsar/pulsar-all:4.0.0

Minimal reproduce step

Is this a bug, or is it designed that way? Is there any solution for this? The two other components send metrics in the correct format, and Prometheus reads them without problems. I don't use kube-prometheus-stack; instead, I add annotations to each component so that my Prometheus server picks up the metrics.
For example:

autorecovery:
  replicaCount: 1
  podMonitor:
    enabled: false
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/path: "/metrics"
    prometheus.io/port: "8000"
broker:
  replicaCount: 1
  podMonitor:
    enabled: false
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/path: "/metrics"
    prometheus.io/port: "8080"

`/metrics` is visible in all components, so the endpoints are there.
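
For reference, this is roughly how my Prometheus server consumes those annotations (a minimal sketch of an annotation-driven scrape job; the job name and relabel rules are illustrative, not copied verbatim from my setup):

scrape_configs:
  - job_name: "kubernetes-pods"          # illustrative job name
    kubernetes_sd_configs:
      - role: pod
    relabel_configs:
      # Keep only pods annotated with prometheus.io/scrape: "true"
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Use prometheus.io/path as the metrics path, if present
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      # Rewrite the scrape address to the port from prometheus.io/port
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__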

What did you expect to see?

The same as in the broker and ZooKeeper:

pulsar-broker-0:/pulsar$ curl -I localhost:8080/metrics
HTTP/1.1 302 Found
Date: Sun, 01 Dec 2024 04:45:11 GMT
Location: http://localhost:8080/metrics/
Content-Length: 0
Server: Jetty(9.4.56.v20240826)

pulsar-zookeeper-0:/pulsar$ curl -I localhost:8000/metrics
HTTP/1.1 200 OK
Date: Sun, 01 Dec 2024 04:45:37 GMT
Content-Type: text/plain; version=0.0.4; charset=utf-8
Content-Length: 61427
Server: Jetty(9.4.56.v20240826)

What did you see instead?

pulsar-recovery-0:/pulsar$ curl -I localhost:8000/metrics
HTTP/1.1 405 Method Not Allowed

pulsar-bookie-0:/pulsar$ curl -I localhost:8000/metrics
HTTP/1.1 405 Method Not Allowed

Anything else?

No response

Are you willing to submit a PR?

  • I'm willing to submit a PR!
lhotari (Member) commented Dec 1, 2024

@JWebDev Do you use the Apache Pulsar Helm chart for deployment?

JWebDev (Author) commented Dec 2, 2024

Hi @lhotari , yes. https://artifacthub.io/packages/helm/apache/pulsar

lhotari transferred this issue from apache/pulsar on Dec 2, 2024
cizara commented Dec 19, 2024

Hi, I'm experiencing the same behavior. It seems to be related to Prometheus v3 being rolled out in the latest kube-prometheus-stack, which introduces a breaking change: the scrape fails if the Content-Type header is invalid or missing, as stated in https://prometheus.io/docs/prometheus/3.0/migration/#scrape-protocols:

Prometheus v3 is more strict concerning the Content-Type header received when scraping. Prometheus v2 would default to the standard Prometheus text protocol if the target being scraped did not specify a Content-Type header or if the header was unparsable or unrecognised. This could lead to incorrect data being parsed in the scrape. Prometheus v3 will now fail the scrape in such cases.

The Content-Type seems to be missing:

pulsar-bookie-0:/pulsar$ curl -X GET -v -I http://127.0.0.1:8000/metrics
*   Trying 127.0.0.1:8000...
* Connected to 127.0.0.1 (127.0.0.1) port 8000
* using HTTP/1.x
> GET /metrics HTTP/1.1
> Host: 127.0.0.1:8000
> User-Agent: curl/8.11.0
> Accept: */*
> 
* Request completely sent off
< HTTP/1.1 200 OK
HTTP/1.1 200 OK
< content-length: 264341
content-length: 264341
< 

* shutting down connection #0

lhotari (Member) commented Feb 18, 2025

This is fixed in BookKeeper PR apache/bookkeeper#4208. The change is pending a 4.17.2 release, since the PR didn't get included in the 4.17.0 release. That will happen in 1-2 months. It could possibly be included in the Pulsar 4.1.0 and 4.0.4 releases.

lhotari (Member) commented Mar 3, 2025

In Pulsar Helm chart version 4.0.0, the Prometheus version will be 3.2.1, and there's a workaround in #577 that specifies fallbackScrapeProtocol: PrometheusText0.0.4 in the PodMonitors.
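
A minimal sketch of what such a PodMonitor looks like with the fallback set, assuming the prometheus-operator CRDs shipped with kube-prometheus-stack 67.x.x (the metadata, selector, and port name below are illustrative, not the chart's exact output):

apiVersion: monitoring.coreos.com/v1
kind: PodMonitor
metadata:
  name: pulsar-bookie              # illustrative name
spec:
  # Tells Prometheus 3.x how to parse scrape responses that lack a Content-Type header
  fallbackScrapeProtocol: PrometheusText0.0.4
  selector:
    matchLabels:
      component: bookie            # illustrative selector
  podMetricsEndpoints:
    - port: http                   # illustrative container port name
      path: /metrics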

lhotari (Member) commented Mar 5, 2025

This issue will be solved in the Helm chart release 4.0.0 with fallbackScrapeProtocol: PrometheusText0.0.4 in the PodMonitors. That's something the CRDs in kube-prometheus-stack version 67.x.x support. Upgrade instructions are already available at https://github.com/apache/pulsar-helm-chart?tab=readme-ov-file#upgrading-from-helm-chart-version-3xx-to-400-version-and-above once the 4.0.0 release completes. The release is currently in the voting stage: https://lists.apache.org/thread/vftq7j4mk6gb45pvcj862txf9rd5fft9
