Monitoring cluster via external tools #959

Closed
jnehlmeier opened this issue Oct 26, 2022 · 2 comments

Comments

@jnehlmeier

Are there any plans to add monitoring capabilities to pg_auto_failover? While the monitor knows the cluster state, it still feels like a black box, and you are not notified by mail, message, or any other channel when a failover occurs.

I know that a lot of commands have an option to output JSON, and I am fairly sure it exists so that monitoring tools can be built around it. But given that pg_autoctl already manages multiple processes, wouldn't it be great to also provide a simple HTTP endpoint that publishes the state the monitor sees? Or a script hook that is called whenever the state changes, so that a script can publish the new state somewhere else?

Otherwise the only possible solution is polling, which requires access to the pg_autoctl executable (or direct access to the monitor database).
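For illustration, the polling approach could look roughly like the sketch below. It shells out to `pg_autoctl show state --json` and reports nodes whose state differs between two polls. The JSON key names (`node_id`, `current_group_state`), the polling interval, and the `notify` callback are assumptions made for this example, not something confirmed in this thread; check them against the output of your pg_auto_failover version.

```python
import json
import subprocess
import time


def fetch_state(pgdata):
    """Run `pg_autoctl show state --json` and return the parsed JSON."""
    result = subprocess.run(
        ["pg_autoctl", "show", "state", "--json", "--pgdata", pgdata],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


def diff_states(previous, current,
                id_key="node_id", state_key="current_group_state"):
    """Return (node, old_state, new_state) for every node whose state changed.

    The default key names are assumptions about the JSON layout.
    """
    old = {node[id_key]: node.get(state_key) for node in previous}
    changes = []
    for node in current:
        node_id = node[id_key]
        new_state = node.get(state_key)
        if old.get(node_id) != new_state:
            changes.append((node_id, old.get(node_id), new_state))
    return changes


def watch(pgdata, interval=10.0, notify=print):
    """Poll forever and hand each state change to `notify` (hypothetical hook)."""
    previous = fetch_state(pgdata)
    while True:
        time.sleep(interval)
        current = fetch_state(pgdata)
        for change in diff_states(previous, current):
            notify(change)
        previous = current
```

Calling `watch("/path/to/pgdata")` on a machine that has the pg_autoctl executable would print one tuple per change; replacing `notify` with a function that sends mail or posts to a chat webhook would turn this into a basic alert.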

I would really like to have the cluster state visible in a dashboard (e.g. Grafana) and add alerting features on top of it. What is the best practice you have in mind?

@DimCitus
Collaborator

DimCitus commented Nov 2, 2022

See #958 for script hooks. Introducing an HTTP API would be nice too. Meanwhile, a cgi-bin script that calls the pg_autoctl binary with --json might be a good way to get it. Closing this one now because I believe the work in #958 covers it.
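As a rough sketch of that interim idea, a small HTTP wrapper around the binary might look like the following. It uses Python's standard `http.server` rather than a real cgi-bin setup, and the `/state` path, the `PGDATA` location, the port, and the 10 second timeout are all hypothetical values picked for the example.

```python
import json
import subprocess
from http.server import BaseHTTPRequestHandler, HTTPServer

PGDATA = "/var/lib/postgresql/monitor"  # hypothetical path, adjust to your setup


def run_show_state(pgdata=PGDATA):
    """Call the pg_autoctl binary and return (returncode, stdout, stderr)."""
    result = subprocess.run(
        ["pg_autoctl", "show", "state", "--json", "--pgdata", pgdata],
        capture_output=True,
        text=True,
        timeout=10,
    )
    return result.returncode, result.stdout, result.stderr


def build_response(returncode, stdout, stderr):
    """Map the command result to an (HTTP status, JSON body bytes) pair."""
    if returncode == 0:
        try:
            json.loads(stdout)  # only pass through output that parses as JSON
            return 200, stdout.encode("utf-8")
        except json.JSONDecodeError:
            pass
    message = stderr.strip() or "pg_autoctl returned no usable JSON"
    return 503, json.dumps({"error": message}).encode("utf-8")


class StateHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/state":
            self.send_error(404)
            return
        try:
            status, body = build_response(*run_show_state())
        except (OSError, subprocess.TimeoutExpired) as exc:
            # Binary missing or command hung: report it as unavailable.
            status, body = build_response(1, "", str(exc))
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8008):
    """Serve the state on localhost only; put a proxy in front if needed."""
    HTTPServer(("127.0.0.1", port), StateHandler).serve_forever()
```

Running `serve()` and requesting `http://127.0.0.1:8008/state` would return whatever JSON the command prints. Binding to localhost is a deliberate choice here, since the sketch has no authentication.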

We can revisit the HTTP idea later. Last time I had a look, I believe integrating with https://sqlite.org/althttpd/doc/trunk/althttpd.md seemed a good way forward. I would review a PR that integrates that lib (vendoring it in) and exposes the information in JSON format over HTTP.

@DimCitus DimCitus closed this as completed Nov 2, 2022
@s4ke

s4ke commented Nov 2, 2022

While I do think an embedded server would be quite nice and the way to go down the road, it is pretty straightforward to set something like this up yourself. We have a small script for this over at https://github.com/neuroforgede/pg_auto_failover_ansible/tree/master/tools/health_monitor

You would probably run this on the app server where your monitoring lives. Also, the tool in the link is quite configurable, and you can run arbitrary checks via HTTP.

If you want more granular monitoring, a Prometheus exporter for Postgres might be what you are looking for.
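To get the cluster state itself into Prometheus (and from there into a Grafana dashboard), the node list could be rendered in the Prometheus text exposition format. This is only a sketch: the metric name `pgaf_node_state` is made up for the example, and the JSON key names (`node_id`, `current_group_state`) are assumptions about what `pg_autoctl show state --json` returns.

```python
def escape_label(value):
    """Escape a label value for the Prometheus text exposition format."""
    return (
        str(value)
        .replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def render_metrics(nodes, id_key="node_id", state_key="current_group_state"):
    """Render one gauge sample per node, labelled with its current state.

    `nodes` is a list of dicts; the default key names are assumptions.
    """
    lines = [
        "# HELP pgaf_node_state Node state as seen by the monitor (always 1).",
        "# TYPE pgaf_node_state gauge",
    ]
    for node in nodes:
        lines.append(
            'pgaf_node_state{node="%s",state="%s"} 1'
            % (
                escape_label(node[id_key]),
                escape_label(node.get(state_key, "unknown")),
            )
        )
    return "\n".join(lines) + "\n"
```

Serving the returned text over HTTP with the content type `text/plain; version=0.0.4` would let Prometheus scrape it, and an alert rule could then fire when no node carries the `primary` state label.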
