Add runbooks for Nuxt response times alarms (#3012)
* Add Nuxt response time alarm runbooks

* Update documentation/meta/monitoring/runbooks/nuxt_p99_response_time_above_threshold.md

Co-authored-by: Krystle Salazar <[email protected]>

* Update documentation/meta/monitoring/runbooks/nuxt_avg_response_time_above_threshold.md

Co-authored-by: Krystle Salazar <[email protected]>

---------

Co-authored-by: Krystle Salazar <[email protected]>
obulat and krysal authored Sep 25, 2023
1 parent 5a2b45c commit 50005d1
Showing 3 changed files with 66 additions and 2 deletions.
6 changes: 4 additions & 2 deletions documentation/meta/monitoring/runbooks/index.md
@@ -12,15 +12,17 @@ that can be a good resource when writing a new one.
```{toctree}
:titlesonly:
api_request_count_above_threshold
api_http_2xx_under_threshold
api_http_5xx_above_threshold
api_request_count_above_threshold
api_avg_response_time_above_threshold
api_avg_response_time_anomaly
api_p99_response_time_above_threshold
api_p99_response_time_anomaly
nuxt_request_count
nuxt_2xx_under_threshold
nuxt_5xx_above_threshold
nuxt_request_count
nuxt_avg_response_time_above_threshold
nuxt_p99_response_time_above_threshold
unhealthy_ecs_hosts
```
31 changes: 31 additions & 0 deletions documentation/meta/monitoring/runbooks/nuxt_avg_response_time_above_threshold.md
@@ -0,0 +1,31 @@
# Run Book: Nuxt Production Average Response Time above threshold

```{admonition} Metadata
Status: **Unstable**
Maintainer: @obulat
Alarm link:
- <https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=us-east-1#alarmsV2:alarm/Nuxt+Production+Average+Response+Time+above+threshold?>
```

## Severity Guide

To identify the source of the slowdown, first check whether a recent
deployment may have introduced the problem. If so, roll back to the
previous version. Otherwise, check the following, in order:

1. Request count and general network activity. If abnormally high, refer to the
[traffic analysis run book][traffic_runbook] to identify whether there is
malicious traffic. If not, move on.
2. Check if dependencies like the API or Plausible analytics are constrained. If
stable, move on.

[traffic_runbook]:
/meta/monitoring/traffic/runbooks/identifying-and-blocking-traffic-anomalies.md
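
The order of checks above can be sketched as a small decision function. This
is only an illustration, not part of any existing tooling: the `Observations`
fields and the returned strings are hypothetical, and the actions for a
constrained dependency and for the fall-through case are assumptions, since
the run book does not spell them out.

```python
from dataclasses import dataclass


@dataclass
class Observations:
    """Hypothetical facts an on-call responder gathers by hand."""

    recent_deployment: bool         # a deployment happened shortly before the alarm
    traffic_abnormally_high: bool   # request count or network activity is unusual
    dependencies_constrained: bool  # the API or Plausible analytics is under pressure


def next_step(obs: Observations) -> str:
    """Return the first applicable action, in the order the run book lists them."""
    if obs.recent_deployment:
        return "roll back to the previous version"
    if obs.traffic_abnormally_high:
        return "follow the traffic analysis run book"
    if obs.dependencies_constrained:
        # Assumption: the run book only says to move on when dependencies are stable.
        return "investigate the constrained dependency"
    # Assumption: the run book lists no further checks.
    return "no listed cause matched; keep investigating"
```

For example, `next_step(Observations(False, True, False))` returns
`"follow the traffic analysis run book"`. A recent deployment takes priority
even when traffic is also high, which mirrors the "in order" wording.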

## Historical false positives

Nothing registered to date.

## Related incident reports

- 2023-06-13 at 03:50 UTC: Frontend increased response times (reason unknown)
31 changes: 31 additions & 0 deletions documentation/meta/monitoring/runbooks/nuxt_p99_response_time_above_threshold.md
@@ -0,0 +1,31 @@
# Run Book: Nuxt Production P99 Response Time above threshold

```{admonition} Metadata
Status: **Unstable**
Maintainer: @obulat
Alarm link:
- <https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=us-east-1#alarmsV2:alarm/Nuxt+Production+P99+Response+Time+above+threshold?>
```

## Severity Guide

To identify the source of the slowdown, first check whether a recent
deployment may have introduced the problem. If so, roll back to the
previous version. Otherwise, check the following, in order:

1. Request count and general network activity. If abnormally high, refer to the
[traffic analysis run book][traffic_runbook] to identify whether there is
malicious traffic. If not, move on.
2. Check if dependencies like the API or Plausible analytics are constrained. If
stable, move on.

[traffic_runbook]:
/meta/monitoring/traffic/runbooks/identifying-and-blocking-traffic-anomalies.md
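
When working through the checks above, it can help to remember that a P99
alarm reacts to a small share of slow requests that barely moves the average.
The sketch below illustrates this with made-up latencies and the nearest-rank
percentile definition. The helper name and the sample values are hypothetical,
and CloudWatch's own percentile computation may differ.

```python
import math
from statistics import mean


def nearest_rank_percentile(samples: list[float], pct: float) -> float:
    """Smallest sample such that at least pct percent of samples are <= it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based nearest rank
    return ordered[max(rank, 1) - 1]


# Made-up latencies in milliseconds: 98 fast requests and 2 slow ones.
latencies_ms = [100.0] * 98 + [3000.0] * 2

avg_ms = mean(latencies_ms)                          # 158.0
p99_ms = nearest_rank_percentile(latencies_ms, 99)   # 3000.0
```

Here the two slow requests raise the average only to 158 ms, while the P99
value jumps to 3000 ms, so this alarm can fire while the average response time
alarm stays quiet.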

## Historical false positives

Nothing registered to date.

## Related incident reports

- 2023-06-13 at 03:50 UTC: Frontend increased response times (reason unknown)
