What?
We would like a way to track RDS metrics in CloudWatch, especially free disk space (which CloudWatch reports for RDS as `FreeStorageSpace`), and trigger a notification when a metric falls below a particular threshold.
Option 1: SNS Topic
We could create a new SNS Topic and route its notifications to our Zendesk inbox. Disk space notifications may be time-sensitive but are generally not urgent, so they can be addressed during business hours.
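As a rough illustration of Option 1, the sketch below builds the arguments for a CloudWatch alarm on the RDS `FreeStorageSpace` metric that notifies an SNS topic. It is only a sketch under assumptions: the instance identifier, topic ARN, 10 GiB threshold, period and evaluation count are all hypothetical placeholders, and `free_storage_alarm_params` and `create_alarm` are illustrative helper names, not anything that exists in our repositories. In practice this would more likely live in our infrastructure code than in a script.

```python
from typing import Any, Dict

GIB = 1024 ** 3  # FreeStorageSpace is reported in bytes


def free_storage_alarm_params(
    db_instance_id: str,
    topic_arn: str,
    threshold_gib: float = 10.0,  # hypothetical default, tune per instance
) -> Dict[str, Any]:
    """Build keyword arguments for CloudWatch's put_metric_alarm call."""
    return {
        "AlarmName": f"rds-free-storage-{db_instance_id}",
        "AlarmDescription": f"Free storage on {db_instance_id} is below {threshold_gib} GiB",
        "Namespace": "AWS/RDS",
        "MetricName": "FreeStorageSpace",
        "Dimensions": [{"Name": "DBInstanceIdentifier", "Value": db_instance_id}],
        "Statistic": "Minimum",
        "Period": 300,            # seconds per datapoint (assumed)
        "EvaluationPeriods": 3,   # require 3 consecutive breaches (assumed)
        "Threshold": threshold_gib * GIB,
        "ComparisonOperator": "LessThanThreshold",
        "AlarmActions": [topic_arn],  # SNS topic that forwards to the inbox
    }


def create_alarm(params: Dict[str, Any]) -> None:
    """Create or update the alarm. Needs boto3 and AWS credentials."""
    import boto3  # third-party; imported here so the builder works without it

    boto3.client("cloudwatch").put_metric_alarm(**params)
```

Calling `create_alarm(free_storage_alarm_params("example-db", topic_arn))` once per instance would register the alarm. Keeping the parameter builder separate from the AWS call makes the threshold logic easy to check without touching a real account.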
Option 2: AlertManager
We could find a way to get our metrics out of CloudWatch and into AlertManager, but this sounds more complex and would leave us with redundant data.
Why?
Once upon a time, we had an SNS Topic called `govuk-notifications` that was used for a multitude of different CloudWatch Alarms. An SQS Queue of the same name was subscribed to that SNS Topic, and we assume the Queue was then processed by a Lambda function. The vast majority of our metrics and alerts are now managed by Prometheus and AlertManager. However, RDS instances are not currently covered by this.
Seems like the easiest solution for this is to re-implement the SNS topic and email subscription. We can have it create Zendesk tickets so any issues can be actioned during working hours. For now, I have set up a new SNS topic with a subscription going to my inbox to see what kind of noise it generates (if any). I'll leave it running for a week or so before switching it to send notifications to Zendesk.
I think we should consider how to triage those tickets to the appropriate team(s) (we are using tags and triggers for this), given that 2nd line is transitioning to in-hours on-call and the aim is to have no tickets coming to the general Tech 2nd line queue.
Before launch we should also have some docs on what to do with those alerts.