This repository has been archived by the owner on Aug 22, 2024. It is now read-only.

Troubles with patroni metrics #10

Open
dan-aksenov opened this issue Sep 23, 2021 · 5 comments

Comments

@dan-aksenov

dan-aksenov commented Sep 23, 2021

Describe the bug
Having troubles with reading patroni metrics:

{"level":"error","service":"pgscv","time":"2021-09-23T16:41:30+03:00","message":"patroni/common collector failed; Get \"http://[<nil>]:8008/liveness\": dial tcp: lookup <nil>: no such host"}

Steps to reproduce
sudo -u postgres ./pgscv --config-file pgscv.yaml
Also tried running the binary as root.
Also tried running with PATRONI_URL="https://hostname:8008" exported

Expected behavior
No errors and patroni metrics shown in grafana.

pgSCV startup options

listen_address: 0.0.0.0:9890
defaults:
    postgres_username: "postgres"
    postgres_password: "mypassword"

Errors and Logs

sudo ./pgscv --config-file pgscv.yaml
{"level":"info","service":"pgscv","time":"2021-09-23T16:55:03+03:00","message":"read configuration from pgscv.yaml"}
{"level":"info","service":"pgscv","time":"2021-09-23T16:55:03+03:00","message":"no-track disabled, for details check the documentation about 'no_track_mode' option."}
{"level":"info","service":"pgscv","time":"2021-09-23T16:55:03+03:00","message":"listen on http://0.0.0.0:9890"}
{"level":"info","service":"pgscv","time":"2021-09-23T16:55:03+03:00","message":"auto-discovery: service added [system:0]"}
{"level":"info","service":"pgscv","time":"2021-09-23T16:55:03+03:00","message":"auto-discovery [python3]: service added [patroni:8008]"}
{"level":"info","service":"pgscv","time":"2021-09-23T16:55:03+03:00","message":"auto-discovery [postgres]: service added [postgres:5432]"}
{"level":"warn","service":"pgscv","time":"2021-09-23T16:55:03+03:00","message":"service [patroni:8008] failed: tries remain 1/10"}
{"level":"error","service":"pgscv","time":"2021-09-23T16:55:30+03:00","message":"patroni/common collector failed; Get \"http://[<nil>]:8008/liveness\": dial tcp: lookup <nil>: no such host"}
{"level":"error","service":"pgscv","time":"2021-09-23T16:55:30+03:00","message":"patroni/common collector failed; Get \"http://[<nil>]:8008/liveness\": dial tcp: lookup <nil>: no such host"}
{"level":"warn","service":"pgscv","time":"2021-09-23T16:55:30+03:00","message":"get model for vda failed: open /sys/block/vda/device/model: no such file or directory; skip"}
{"level":"warn","service":"pgscv","time":"2021-09-23T16:55:30+03:00","message":"get model for vdb failed: open /sys/block/vdb/device/model: no such file or directory; skip"}

In Debug mode:

{"level":"debug","service":"pgscv","time":"2021-09-23T17:15:31+03:00","message":"auto-discovery: looking up for new services..."}
{"level":"debug","service":"pgscv","time":"2021-09-23T17:15:31+03:00","message":"auto-discovery [patroni]: analyzing process with pid 24042"}
{"level":"debug","service":"pgscv","time":"2021-09-23T17:15:31+03:00","message":"auto-discovery: patroni service has been found, pid 24042, available through [<nil>]:8008"}

Environment (please complete the following information):

  • Used OS (or Containers): [output of cat /etc/os-release]
    NAME="Ubuntu"
    VERSION="18.04.5 LTS (Bionic Beaver)"
  • pgSCV Version [output of pgscv --version]
    pgscv 0.7.5 d52f09925a44a67493bd8c598a42c7455e182d5e-release
  • PostgreSQL Version [output of psql -c 'select version()']
    PostgreSQL 11.8 on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0, 64-bit
    patroni 2.0.2
  • Are PostgreSQL and pgSCV running on the same host?
    Yes

Additional context
None

@dan-aksenov changed the title from "Got patroni/common collector failed" to "Troubles with patroni metrics" on Sep 23, 2021
@lesovsky
Owner

Please provide patroni.yml and the output of sudo ss -luntp | grep 8008

@dan-aksenov
Author

dan-aksenov commented Sep 23, 2021

patroni.yml

restapi:
  listen: my.host.fqdn:8008
  connect_address: my.host.fqdn:8008

I hope this is sufficient, since you only need the listen and certfile strings from the config file.

sudo ss -luntp |grep 8008
tcp   LISTEN   0        5                ip_address:8008          0.0.0.0:*      users:(("python3",pid=24042,fd=8))

also this:

curl -s  http://$(hostname):8008

works fine

@dan-aksenov
Author

dan-aksenov commented Sep 23, 2021

Changed listen from the FQDN to an IP address. It seems to be working;
at least I am getting:

curl -s http://ip:9890/metrics | grep patroni
# HELP patroni_up State of Patroni service: 1 is up, 0 otherwise.
# TYPE patroni_up gauge
patroni_up{service_id="patroni:8008"} 1
pgscv_services_registered_total{service="patroni",service_id="patroni:8008"} 1

But why can't I use an FQDN? In our IaC rules, FQDNs are preferred over IPs.

@lesovsky
Owner

There is no reason not to use FQDNs. I think this is just a bug/mistake; I will fix it.

@dan-aksenov
Author

Got some more errors for Patroni monitoring:

{"level":"error","service":"pgscv","time":"2021-09-24T11:07:00+03:00","message":"patroni/common collector failed; parse patroni postmaster_start_time string '2021-09-24 06:22:35.102 MSK' failed: parsing time \"2021-09-24 06:22:35.102 MSK\" as \"2006-01-02 15:04:05.999999Z07:00\": cannot parse \" MSK\" as \"Z07:00\""}

Not sure if it deserves a separate issue.
