Add Uptime Monitoring Documentation #10810

Merged
merged 17 commits into from
Aug 13, 2024
1 change: 1 addition & 0 deletions docs/organization/early-adopter-features/index.mdx
@@ -20,3 +20,4 @@ Limitations:
- [Issue Status](/product/issues/states-triage/) tags
- [Span Summary](/product/performance/transaction-summary/#span-summary)
- [Investigation Mode](/product/performance/retention-priorities/#investigation-mode) for retention priorities in Tracing
- [Uptime Monitoring](/product/alerts/uptime-monitoring/)
6 changes: 6 additions & 0 deletions docs/product/alerts/index.mdx
@@ -36,6 +36,12 @@ Create alerts to monitor metrics, such as:

You can find a full list of available metric alerts in [Metric Alerts](/product/alerts/alert-types/#metric-alerts).

## Uptime Monitoring Alerts

[Uptime alerts](/product/alerts/uptime-monitoring/) are triggered when an uptime HTTP check request fails to meet our
[uptime check criteria](/product/alerts/uptime-monitoring/#uptime-check-criteria).
You can use uptime alerts to make sure a specific URL is constantly available, even during periods of low or no traffic.

## Creating Alerts

When you create a new project in [sentry.io](https://sentry.io), you can select a default issue alert. However, you can also [create your own alerts](/product/alerts/create-alerts/) to suit your team’s needs, using these [best practices](/product/alerts/best-practices/) as a guide.
36 changes: 36 additions & 0 deletions docs/product/alerts/uptime-monitoring/automatic-detection.mdx
@@ -0,0 +1,36 @@
---
title: Automatic Detection
sidebar_order: 51
description: "Learn how automatic detection of uptime monitoring works."
---

<Include name="feature-stage-alpha-uptime.mdx" />

The automatic detection of uptime alerts sets up uptime alerts for the most frequently encountered
hostnames in all URLs of your error data. This helps ensure that critical hostnames are continuously monitored,
enhancing the reliability and availability of your web services.
Comment on lines +9 to +11
Contributor

Suggested change
Automatic detection of uptime monitoring alerts sets up alerts for the most frequently encountered
error data URL hostnames. This helps ensure that critical hostnames are continuously monitored,
enhancing the reliability and availability of your web services.

Let me know if I didn't get the meaning right :)

Member Author

Is this ok?

Suggested change
Automatic detection of uptime monitoring alerts sets up alerts for the most frequently encountered hostname URLs in error data. This helps ensure that critical hostnames are continuously monitored,
enhancing the reliability and availability of your web services.


## How It Works

We analyze all the URLs detected in your project's captured error data to find the hostname that appears most frequently. We then create an uptime alert if it passes our [uptime check criteria](/product/alerts/uptime-monitoring/#uptime-check-criteria).

To avoid creating flaky alerts, the hostname undergoes an "onboarding period" of three days. During this period, we send an HTTP request to the hostname every hour. If at least three of these requests fail, the hostname is dropped and re-evaluated after seven days.
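
The onboarding rule can be sketched as follows. This is only an illustration of the numbers stated above, not Sentry's implementation: the function name, the constants, and the `"onboarded"`/`"dropped"` labels are hypothetical, and it assumes that a hostname with fewer than three failures completes onboarding.

```python
from datetime import timedelta

# Values taken from the description above.
ONBOARDING_PERIOD = timedelta(days=3)
CHECK_INTERVAL = timedelta(hours=1)
FAILURE_LIMIT = 3
RETRY_DELAY = timedelta(days=7)


def onboarding_outcome(check_results):
    """Evaluate hourly check results (True = success) for one hostname.

    Returns a (status, retry_delay) tuple. The status labels are
    hypothetical names used only in this sketch.
    """
    max_checks = ONBOARDING_PERIOD // CHECK_INTERVAL  # 72 hourly checks
    failures = 0
    for passed in list(check_results)[:max_checks]:
        if not passed:
            failures += 1
            if failures >= FAILURE_LIMIT:
                # Dropped: re-evaluated after seven days.
                return "dropped", RETRY_DELAY
    # Assumption: fewer than three failures means onboarding succeeds.
    return "onboarded", None
```

For example, a hostname that fails three of its hourly checks is dropped with a seven-day retry delay, while one that fails only twice completes onboarding.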

<Alert level="info">
Sentry will execute uptime checks against the hostname root path of the most frequently seen URLs. For example, if the most frequently seen URL in your events is `GET https://www.example.com/docs/introduction`, the check will be `GET https://www.example.com/`.
</Alert>
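
As a rough illustration of the root-path rule in the note above, the following sketch derives the checked URL from a seen URL using Python's standard library. The function name `root_check_url` is hypothetical, and the sketch assumes the scheme and host are kept unchanged while the path, query, and fragment are dropped.

```python
from urllib.parse import urlsplit, urlunsplit


def root_check_url(seen_url):
    """Return the root-path URL for the hostname of `seen_url`."""
    parts = urlsplit(seen_url)
    # Keep scheme and host; replace the path with "/" and drop
    # the query string and fragment.
    return urlunsplit((parts.scheme, parts.netloc, "/", "", ""))


print(root_check_url("https://www.example.com/docs/introduction"))
# prints https://www.example.com/
```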

## Disabling Automatic Detection

Deleting an alert will disable automatic detection for the entire project linked to the host. You can also turn this feature off for the entire organization from the [organization settings](https://sentry.io/orgredirect/organizations/:orgslug/settings/organization) page.

Alternatively, the hostname's `robots.txt` can be updated to disallow Sentry:

```txt{tabTitle: Example}{filename: robots.txt}
User-agent: SentryUptimeBot
Disallow: *
```
Comment on lines +26 to +31
Member

Note that this will just disable detection at the moment; it won't disable an ongoing checker (even in onboarding mode).


## Current Limitations

In the current version, automatically detected uptime alerts can only be deleted, not edited. Support for editing
will be added in the future. Additionally, each organization is limited to one automatically detected host.
39 changes: 39 additions & 0 deletions docs/product/alerts/uptime-monitoring/index.mdx
@@ -0,0 +1,39 @@
---
title: Uptime Monitoring
sidebar_order: 50
description: "Learn how to help maintain uptime for your web services by monitoring relevant URLs with Sentry's Uptime Monitoring."
---

<Include name="feature-stage-alpha-uptime.mdx" />

Sentry's Uptime Monitoring lets you monitor the availability and reliability of your web services effortlessly.
In the current version, uptime is [automatically configured](/product/alerts/uptime-monitoring/automatic-detection/) as a new alert for only the most relevant URL detected in your organization. In future updates, you'll have the flexibility to add and monitor additional URLs.

## Uptime Check Criteria

Our uptime monitoring system verifies the availability of your URLs
by performing GET requests at regular 5-minute intervals.
For a URL to be considered up and running, the response must meet the following criteria:

1. **Successful Response (2xx Status Codes):**
The URL must return an HTTP status code in the 200–299 range, indicating a successful request.
2. **Automatic Handling of Redirects (3xx Status Codes):** Sentry will follow redirects for URLs returning an HTTP status code in the 300–399 range and verify that the final destination URL returns a successful response. This ensures that redirects won't falsely trigger downtime alerts.
3. **Timeout Setting:** Each request has a timeout threshold of 10 seconds.
Member

It's actually 5 seconds right now, isn't it, @wedamija?

Member

It's 10 seconds right now

If the server doesn't respond within this period, the check will be marked as a failure,
indicating potential downtime or performance issues.
4. **DNS Issue Detection:** Our monitoring also includes the detection of DNS resolution issues.
If a DNS issue is detected, the check will be marked as a failure,
allowing you to address the underlying connectivity problems.
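
To make the four criteria concrete, here is a minimal sketch of a single check written against Python's standard library. It is not Sentry's checker: `check_url`, the `opener` parameter (there so the function can be exercised without network access), the reason strings, and the placeholder `User-Agent` value are all assumptions made for this example.

```python
import socket
import urllib.error
import urllib.request

TIMEOUT_SECONDS = 10  # timeout threshold from criterion 3


def check_url(url, opener=urllib.request.urlopen):
    """Run one GET check and return (is_up, reason)."""
    request = urllib.request.Request(
        url,
        method="GET",
        # Placeholder value for this sketch, not Sentry's User-Agent.
        headers={"User-Agent": "example-uptime-sketch/0.1"},
    )
    try:
        # urlopen follows 3xx redirects by default, so the status read
        # here belongs to the final destination URL (criterion 2).
        with opener(request, timeout=TIMEOUT_SECONDS) as response:
            status = response.status
    except urllib.error.HTTPError as exc:
        # Non-2xx final responses surface as HTTPError (criterion 1).
        return False, f"HTTP {exc.code}"
    except urllib.error.URLError as exc:
        if isinstance(exc.reason, socket.gaierror):
            return False, "DNS resolution failed"  # criterion 4
        if isinstance(exc.reason, (socket.timeout, TimeoutError)):
            return False, "timeout"  # criterion 3
        return False, f"connection error: {exc.reason}"
    except (socket.timeout, TimeoutError):
        return False, "timeout"
    if 200 <= status <= 299:
        return True, f"HTTP {status}"
    return False, f"HTTP {status}"
```

Calling `check_url("https://www.example.com/")` performs a real request; passing a stub as `opener` lets you simulate timeouts or DNS failures locally.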

## Notifications

An uptime alert continuously monitors the configured URL against the criteria defined above. If a failure is detected,
a new [uptime issue](/product/issues/issue-details/uptime-issues/) is created with details of the failed check and any related errors.

To start getting notifications for a new downtime issue, [configure an issue alert](/product/alerts/create-alerts/issue-alert-config/) and choose the issue category "uptime". Then choose how you'd like to be notified (via email, Slack, and so on).

![Uptime issue alert rule configuration](./img/uptime-issue-alert-rule.png)

## Learn More About Uptime Monitoring

<PageGrid />
33 changes: 33 additions & 0 deletions docs/product/alerts/uptime-monitoring/troubleshooting.mdx
@@ -0,0 +1,33 @@
---
title: Troubleshooting
sidebar_order: 52
description: "Learn how to troubleshoot potential Uptime Monitoring problems."
---

<Include name="feature-stage-alpha-uptime.mdx" />

## Verify Feature Eligibility

Uptime alerts are only available for organizations that have early adopter features enabled. They must also have URLs that match our [auto detection criteria](/product/alerts/uptime-monitoring/automatic-detection/#how-it-works). In the current version, organizations are limited to a single uptime alert.

## Verify Firewall Configuration

Some hosting platforms can block incoming requests from Sentry's Uptime Bot, falsely triggering uptime alerts. We recommend verifying your firewall configuration to ensure incoming requests from Sentry are allowed.

If you need to configure your firewall allowlist to include Sentry's Uptime Bot, we recommend checking against our `User-Agent`, given that our IP addresses can change without notice.

### User Agent

Our uptime check requests use the following `User-Agent`:

```
Mozilla/5.0 (compatible; SentryUptimeBot/1.0; +http://docs.sentry.io/product/alerts/uptime-monitoring/)
```
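
If your allowlist logic lives in application code, a `User-Agent` match might look like the sketch below. The helper name `is_sentry_uptime_bot` is hypothetical, and matching on the `SentryUptimeBot` token rather than the full string is an assumption made so that the check keeps working if the version number changes. Keep in mind that any client can set this header, so treat a match as a routing hint rather than proof of origin.

```python
UPTIME_BOT_TOKEN = "SentryUptimeBot"  # product token from the User-Agent above


def is_sentry_uptime_bot(user_agent):
    """Return True if the User-Agent header contains the uptime bot token."""
    # Case-insensitive substring match; tolerates a missing header (None).
    return UPTIME_BOT_TOKEN.lower() in (user_agent or "").lower()


documented = (
    "Mozilla/5.0 (compatible; SentryUptimeBot/1.0; "
    "+http://docs.sentry.io/product/alerts/uptime-monitoring/)"
)
print(is_sentry_uptime_bot(documented))  # prints True
```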

### IP Addresses

See [IP Ranges](/security-legal-pii/security/ip-ranges/#uptime-monitoring) for a complete list of IP addresses used for uptime checks.

## Verify That Issue Alerts Match Downtime Issues

Uptime alerts create downtime issues. If you're not receiving notifications when downtimes are detected, make sure you've properly [configured an issue alert](/product/alerts/create-alerts/issue-alert-config/) with the issue category "uptime".
22 changes: 22 additions & 0 deletions docs/product/issues/issue-details/uptime-issues/index.mdx
@@ -0,0 +1,22 @@
---
title: Uptime Issues
sidebar_order: 40
description: "Learn how to use the information on the Issue Details page to debug an uptime issue."
---

<Include name="feature-stage-alpha-uptime.mdx" />

An uptime issue is a grouping of detected downtime events for a specific URL. A downtime event is generated by
active uptime alerts when HTTP requests fail to meet our
[uptime check criteria](/product/alerts/uptime-monitoring/#uptime-check-criteria).

![Uptime issue details](./img/uptime-issue-details.png)

## Traced Errors

Uptime checks made against web services configured with one of Sentry's supported SDKs contain a
[trace](/concepts/key-terms/tracing/) that can be used to track detected errors resulting from failed HTTP uptime checks. The trace navigator allows you to browse through potential root causes of your downtime and is a powerful tool for quickly identifying and resolving issues.

## Issue Lifecycle

Uptime issues are grouped by the monitored URL and created upon the first detected downtime. Sentry automatically resolves an ongoing uptime issue when the monitored URL returns to a healthy status and meets our [uptime check criteria](/product/alerts/uptime-monitoring/#uptime-check-criteria). If the URL experiences subsequent downtime, the issue's status will change to regressed.
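
The lifecycle described above can be read as a small state machine. The sketch below is one interpretation, not Sentry's implementation: `next_status` and the `"unresolved"` label for a newly created issue are hypothetical, while "resolved" and "regressed" follow the wording above.

```python
def next_status(current, check_passed):
    """Return the issue status for one monitored URL after a check.

    `current` is None (no issue yet), "unresolved", "resolved",
    or "regressed".
    """
    if check_passed:
        # A healthy check auto-resolves an ongoing issue.
        if current in ("unresolved", "regressed"):
            return "resolved"
        return current
    if current is None:
        return "unresolved"  # first detected downtime creates the issue
    if current == "resolved":
        return "regressed"  # downtime after a resolution
    return current


status = None
for passed in [True, False, False, True, False]:
    status = next_status(status, passed)
print(status)  # prints regressed
```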
16 changes: 16 additions & 0 deletions docs/security-legal-pii/security/ip-ranges.mdx
@@ -122,3 +122,19 @@ All email is delivered from [SendGrid](https://sendgrid.com/) from the following
```

These IP addresses are only for Sentry use.

## Uptime Monitoring

Sentry uses the following IP addresses for uptime checks:

US
```
34.123.33.225
34.41.121.171
```

EU
```
34.159.197.47
35.242.231.10
```
3 changes: 3 additions & 0 deletions includes/feature-stage-alpha-uptime.mdx
@@ -0,0 +1,3 @@
<Note>
This feature is only available if your organization has enabled [early adopter features](/organization/early-adopter-features/). Early adopter features are still in progress and may have bugs. We recognize the irony. If you’re interested in participating, enable early adopter features in [organization settings](https://sentry.io/orgredirect/organizations/:orgslug/settings/organization).
</Note>