grafana-agent doesn't support having multiple units running on the same machine #335

Open
Vultaire opened this issue Dec 6, 2024 · 0 comments


Vultaire commented Dec 6, 2024

Bug Description

I recently hit this during a cloud handover review. We had an environment with landscape-server and postgresql, both with grafana-agent subordinates related through the cos-agent relation.

It seems that grafana-agent does not support such deployments. In a 3-machine cluster, one machine had jobs for landscape-server while the other two had jobs for postgresql; none had jobs for both.

This is likely because only a single grafana-agent instance runs per machine, and both subordinates render the same config file: /etc/grafana-agent.yaml
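To illustrate the suspected failure mode, here is a minimal Python sketch of two writers that each render a complete config to one shared path. It is not the charm's actual code: `render_config` and `simulate_shared_config` are hypothetical names, the YAML layout is simplified, and the sketch uses a temporary file rather than the real /etc/grafana-agent.yaml. It only shows that whichever writer runs last determines the file's contents.

```python
import tempfile
from pathlib import Path


def render_config(path: Path, app: str) -> None:
    """Overwrite the shared config with scrape jobs for a single app.

    Hypothetical stand-in for one subordinate rendering its config;
    the real charm logic may differ.
    """
    path.write_text(f"scrape_configs:\n  - job_name: {app}\n")


def simulate_shared_config(apps):
    """Let each app's writer render to the same file, in order.

    Returns the final file contents after all writers have run.
    """
    with tempfile.TemporaryDirectory() as tmp:
        shared = Path(tmp) / "grafana-agent.yaml"
        for app in apps:
            render_config(shared, app)
        return shared.read_text()


# The second writer replaces the first writer's jobs entirely.
result = simulate_shared_config(["charmed-postgresql", "landscape-server"])
print(result)
```

If this hypothesis holds, the app whose jobs survive on a given machine depends only on which subordinate wrote the file last, which would match the mixed results seen across the 3 machines.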

Unfortunately, I didn't see a clear way of getting this to work where both apps' jobs would be included in grafana-agent.yaml. We needed to fall back to using the nrpe charm and cos-proxy to address the alerts for one of the apps.

To Reproduce

Not providing the bundle since it's for a customer, but it's pretty simple:

  • Deploy a 3 unit postgresql cluster. This was tested with the 14/stable channel, rev 468.
  • Deploy a 3 unit landscape-server cluster, onto the same machines as the above cluster. This was tested with the latest/stable channel, rev 121.
  • Deploy the grafana-agent charm. This was tested with the latest/stable channel, rev 223.
  • Relate the grafana-agent charm to both of the other charms.
  • Observe the rendered /etc/grafana-agent.yaml file on each of the 3 machines. None of the machines will have jobs for both of the apps; it's either one or the other.

Environment

This was tested on Juju 3.4.6 on Azure, although the cloud likely does not matter in this case.

Relevant log output

Just look at the /etc/grafana-agent.yaml file on each machine. You can grep for these patterns:

  "job_name: charmed-postgresql"
  "job_name: landscape-server"

Based on the reproducer on this ticket, only one of those patterns will return matches on any given machine, due to the race between the two grafana-agent subordinates running on the same machine.
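The same check can be scripted. The following is a rough Python sketch that does a plain substring match per line, much like grep, for the two patterns above. The helper names `matching_lines` and `missing_patterns` are made up for illustration, and the script assumes it runs on the machine itself with read access to the config file.

```python
from pathlib import Path

# The two patterns from this ticket.
PATTERNS = ("job_name: charmed-postgresql", "job_name: landscape-server")


def matching_lines(config_text, patterns=PATTERNS):
    """Return a dict mapping each pattern to the config lines containing it."""
    hits = {pattern: [] for pattern in patterns}
    for line in config_text.splitlines():
        for pattern in patterns:
            if pattern in line:
                hits[pattern].append(line.strip())
    return hits


def missing_patterns(config_text, patterns=PATTERNS):
    """Return the patterns that do not appear anywhere in the config."""
    hits = matching_lines(config_text, patterns)
    return [pattern for pattern, lines in hits.items() if not lines]


if __name__ == "__main__":
    config_path = Path("/etc/grafana-agent.yaml")
    try:
        text = config_path.read_text()
    except OSError as exc:
        print(f"could not read {config_path}: {exc}")
    else:
        missing = missing_patterns(text)
        if missing:
            print("missing:", ", ".join(missing))
        else:
            print("jobs for both apps are present")
```

On an affected machine, this would be expected to report exactly one of the two patterns as missing.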

Additional context

postgresql and rabbitmq-server were intentionally put on the same machines in order to reduce how many Azure VMs were needed for the project. Separating them would require additional VMs and thus likely additional cost. (If this were a MAAS/LXD cloud instead, splitting them apart into separate containers would be the obvious workaround.)

@Vultaire Vultaire changed the title grafana-agent doesn't support having units running on the same machine grafana-agent doesn't support having multiple units running on the same machine Dec 6, 2024