(Re-)Implement RDS Disk Space Notifications #1441

Open
dj-maisy opened this issue Sep 6, 2024 · 2 comments
Labels: monitoring (Issues related to monitoring), RDS (AWS Relational Database Service)

Comments

@dj-maisy
Member

dj-maisy commented Sep 6, 2024

What?

We would like a way to track RDS metrics, especially free disk space (exposed in CloudWatch as "FreeStorageSpace"), and trigger a notification when a metric falls below a particular threshold.

Option 1: SNS Topic

We could create a new SNS Topic and route it to our Zendesk inbox. Disk space notifications can be time-sensitive but are generally not urgent, so they can be addressed during business hours.
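As a rough illustration of Option 1, the sketch below builds the arguments for a CloudWatch alarm on the RDS `FreeStorageSpace` metric (reported in bytes) that notifies an SNS topic. The function name, alarm name, instance identifier, topic ARN, 10 GiB threshold and evaluation settings are all placeholders chosen for the example, not values from this issue.

```python
GIB = 1024 ** 3  # FreeStorageSpace is reported in bytes


def build_free_storage_alarm(db_instance_id, topic_arn, threshold_gib=10):
    """Return keyword arguments for CloudWatch's put_metric_alarm.

    The threshold and evaluation settings are illustrative assumptions.
    """
    return {
        "AlarmName": f"rds-{db_instance_id}-low-free-storage",
        "AlarmDescription": (
            f"Free storage on {db_instance_id} is below {threshold_gib} GiB"
        ),
        "Namespace": "AWS/RDS",
        "MetricName": "FreeStorageSpace",
        "Dimensions": [
            {"Name": "DBInstanceIdentifier", "Value": db_instance_id}
        ],
        "Statistic": "Minimum",
        "Period": 300,            # seconds per datapoint
        "EvaluationPeriods": 3,   # alarm after three consecutive breaches
        "Threshold": float(threshold_gib * GIB),
        "ComparisonOperator": "LessThanThreshold",
        "AlarmActions": [topic_arn],  # SNS topic that receives the notification
        "TreatMissingData": "missing",
    }


if __name__ == "__main__":
    # Hypothetical instance name and topic ARN, for illustration only.
    params = build_free_storage_alarm(
        "example-postgres",
        "arn:aws:sns:eu-west-1:123456789012:example-rds-notifications",
    )
    print(params["AlarmName"], params["Threshold"])
    # To create the alarm for real (needs boto3 and AWS credentials):
    #   import boto3
    #   boto3.client("cloudwatch").put_metric_alarm(**params)
```

Keeping the parameters in a plain function makes the threshold easy to review per instance before anything is created in AWS. The same values would map onto whatever infrastructure-as-code tooling actually manages the alarms.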

Option 2: AlertManager

We could find a way to get our metrics out of CloudWatch and into AlertManager, but this sounds more complex and would leave us with redundant data.

Why?

Once upon a time, we had an SNS Topic called govuk-notifications that was used for a multitude of different CloudWatch Alarms. An SQS Queue of the same name was subscribed to that topic, and we assume the queue was then processed by a Lambda function. The majority of our metrics and alerts are now managed by Prometheus and AlertManager; however, RDS instances are not currently covered by this.

@dj-maisy added the RDS and monitoring labels Sep 6, 2024
@samsimpson1 self-assigned this Sep 25, 2024
@samsimpson1
Member

The easiest solution seems to be re-implementing the SNS topic with an email subscription. We can have it create Zendesk tickets so any issues can be actioned during working hours. For now, I have set up a new SNS topic with a subscription going to my inbox to see what kind of noise it generates (if any). I'll leave it running for a week or so before switching the notifications over to Zendesk.
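A minimal sketch of the topic and email subscription described above, assuming it were done through the SNS API rather than by hand. The helper names, topic name and email addresses are hypothetical. Note that SNS email subscriptions stay pending until the recipient confirms them, so the Zendesk inbox would need to action the confirmation message after the switch.

```python
def build_topic_request(topic_name):
    """Return keyword arguments for SNS create_topic."""
    return {"Name": topic_name}


def build_email_subscription(topic_arn, email_address):
    """Return keyword arguments for SNS subscribe using the email protocol."""
    if "@" not in email_address:
        raise ValueError(f"not an email address: {email_address!r}")
    return {
        "TopicArn": topic_arn,
        "Protocol": "email",
        "Endpoint": email_address,
        "ReturnSubscriptionArn": True,
    }


if __name__ == "__main__":
    # Hypothetical topic ARN and addresses, for illustration only.
    topic_arn = "arn:aws:sns:eu-west-1:123456789012:example-rds-notifications"

    # Trial phase: notifications go to a personal inbox to gauge noise.
    trial = build_email_subscription(topic_arn, "engineer@example.com")

    # Later: swap the endpoint for the address that creates Zendesk tickets.
    zendesk = build_email_subscription(topic_arn, "support@example.com")

    print(trial["Endpoint"], "->", zendesk["Endpoint"])
    # To apply for real (needs boto3 and AWS credentials):
    #   import boto3
    #   sns = boto3.client("sns")
    #   sns.create_topic(**build_topic_request("example-rds-notifications"))
    #   sns.subscribe(**trial)
```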

@AgaDufrat
Contributor

I think we should consider how to triage these tickets to the appropriate team(s) (we use tags and triggers for this), given that 2nd line is transitioning to in-hours on-call and the aim is to have no tickets coming to the general Tech 2nd line queue.

Before launch we should also have some docs on what to do with those alerts.
