Add Uptime Monitoring Documentation #10810

Merged
merged 17 commits into from
Aug 13, 2024
Changes from 9 commits
1 change: 1 addition & 0 deletions docs/organization/early-adopter-features/index.mdx
Original file line number Diff line number Diff line change
@@ -20,3 +20,4 @@ Limitations:
- [Issue Status](/product/issues/states-triage/) tags
- [Span Summary](/product/performance/transaction-summary/#span-summary)
- [Investigation Mode](/product/performance/retention-priorities/#investigation-mode) for retention priorities in Tracing
- [Uptime Monitoring](/product/alerts/uptime-monitoring/)
6 changes: 6 additions & 0 deletions docs/product/alerts/index.mdx
Original file line number Diff line number Diff line change
@@ -36,6 +36,12 @@ Create alerts to monitor metrics, such as:

You can find a full list of available metric alerts in [Metric Alerts](/product/alerts/alert-types/#metric-alerts).

## Uptime Alerts

[Uptime alerts](/product/alerts/uptime-monitoring/) are triggered when an uptime HTTP check request fails to meet our
[uptime check criteria](/product/alerts/uptime-monitoring/#uptime-check-criteria).
Use uptime alerts to ensure a specific URL is constantly available, even during periods of low or no traffic.

## Creating Alerts

When you create a new project in [sentry.io](https://sentry.io), you can select a default issue alert. However, you can also [create your own alerts](/product/alerts/create-alerts/) to suit your team’s needs, using these [best practices](/product/alerts/best-practices/) as a guide.
45 changes: 45 additions & 0 deletions docs/product/alerts/uptime-monitoring/automatic-detection.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,45 @@
---
title: Automatic Detection
sidebar_order: 51
description: "Learn how automatic detection of uptime monitoring works."
---

<Include name="feature-stage-alpha-uptime.mdx" />

The automatic detection of uptime alerts sets up uptime alerts for the most frequently encountered
hostnames in all URLs of your error data. This helps ensure that critical hostnames are continuously monitored,
enhancing the reliability and availability of your web services.
Comment on lines +9 to +11
Contributor

Suggested change
The automatic detection of uptime alerts sets up uptime alerts for the most frequently encountered
hostnames in all URLs of your error data. This helps ensure that critical hostnames are continuously monitored,
enhancing the reliability and availability of your web services.
Automatic detection of uptime monitoring alerts sets up alerts for the most frequently encountered
error data URL hostnames. This helps ensure that critical hostnames are continuously monitored,
enhancing the reliability and availability of your web services.

Let me know if I didn't get the meaning right :)

Member Author

Is this ok?

Suggested change
The automatic detection of uptime alerts sets up uptime alerts for the most frequently encountered
hostnames in all URLs of your error data. This helps ensure that critical hostnames are continuously monitored,
enhancing the reliability and availability of your web services.
Automatic detection of uptime monitoring alerts sets up alerts for the most frequently encountered hostname URLs in error data. This helps ensure that critical hostnames are continuously monitored,
enhancing the reliability and availability of your web services.


## How It Works

We analyze all URLs detected in your project's captured error data, finding the most frequently seen hostnames.
For the most frequently seen hostname, an uptime alert is created if it passes our
[uptime check criteria](/product/alerts/uptime-monitoring/#uptime-check-criteria).
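
The hostname analysis described above can be sketched roughly as follows. This is a minimal illustration, not Sentry's implementation: the function name `most_frequent_hostname` and the sample URLs are hypothetical, and the real detection may weigh URLs differently.

```python
from collections import Counter
from urllib.parse import urlsplit


def most_frequent_hostname(urls):
    """Return the hostname that appears most often in `urls`, or None."""
    hostnames = Counter()
    for url in urls:
        host = urlsplit(url).hostname  # None when the URL has no network location
        if host:
            hostnames[host] += 1
    if not hostnames:
        return None
    return hostnames.most_common(1)[0][0]


# Hypothetical URLs standing in for those found in captured error data.
sample_urls = [
    "https://www.example.com/docs/introduction",
    "https://www.example.com/pricing",
    "https://api.example.com/v1/items",
]
print(most_frequent_hostname(sample_urls))
```

Counting by hostname rather than by full URL means that different paths on the same site all contribute to one candidate.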

To prevent flaky alerts from being created, the hostname undergoes an
"onboarding period" of three days. During this period, we send an HTTP request to the hostname every hour. If the
request fails at least three times, the hostname is dropped and re-evaluated after seven days.
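
As a rough sketch of the onboarding rule, assuming one check per hour for three days (72 checks in total) and a cutoff of three failures, the decision could look like this. The function name and the returned labels are illustrative only and are not part of Sentry.

```python
ONBOARDING_DAYS = 3        # length of the onboarding period
CHECKS_PER_DAY = 24        # one HTTP request per hour
MAX_FAILURES = 3           # at this many failures the hostname is dropped
REEVALUATE_AFTER_DAYS = 7  # wait before the hostname is considered again


def onboarding_outcome(check_results):
    """Decide the outcome from a list of booleans (True = the check succeeded)."""
    failures = 0
    for succeeded in check_results:
        if not succeeded:
            failures += 1
        if failures >= MAX_FAILURES:
            return f"dropped; re-evaluate after {REEVALUATE_AFTER_DAYS} days"
    if len(check_results) < ONBOARDING_DAYS * CHECKS_PER_DAY:
        return "onboarding in progress"
    return "passed onboarding"
```

With these numbers, a hostname can fail at most two of its 72 onboarding checks and still pass.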

<Alert level="info">
Sentry will execute uptime checks against the hostname root path of the most frequently seen URLs. For example, if the
most frequently seen URL in your events is `GET https://www.example.com/docs/introduction`, the check will be made as
`GET https://www.example.com/`.
</Alert>
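
To illustrate the root-path rule from the note above, here is a small sketch that reduces a URL to the root of its hostname. The helper name `root_check_url` is hypothetical, and dropping the query string and fragment is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit


def root_check_url(url):
    """Keep the scheme and host, and replace the path with the root path."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, "/", "", ""))


print(root_check_url("https://www.example.com/docs/introduction"))
```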

## Disabling Automatic Detection

Automatically created uptime alerts can be deleted. Deleting an alert will disable automatic detection for the
entire project linked to the host. This feature can be turned off globally for the entire organization in the
[organization settings](https://sentry.io/orgredirect/organizations/:orgslug/settings/organization) page.

Alternatively, the hostname's `robots.txt` can be updated to disallow Sentry:

```txt{tabTitle: Example}{filename: robots.txt}
User-agent: Mozilla/5.0 (compatible; SentryUptimeBot/1.0; +http://docs.sentry.io/product/alerts/uptime-monitoring/)
Disallow: *
```

## Current Limitations

Automatically detected uptime alerts cannot be edited at this time; they can only be deleted. Support for editing
will be added in the future. Additionally, each organization is limited to one automatically detected host.
51 changes: 51 additions & 0 deletions docs/product/alerts/uptime-monitoring/index.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
---
title: Uptime Monitoring
sidebar_order: 50
description: "Learn how Sentry monitors relevant URLs to help maintain uptime for your web services."
---

<Include name="feature-stage-alpha-uptime.mdx" />

Sentry's Uptime Monitoring allows you to ensure the availability and reliability of your web services effortlessly.
Currently, uptime is
[automatically configured](/product/alerts/uptime-monitoring/automatic-detection/) as a new
alert for the most relevant URL detected in your organization. In future updates, you'll have the flexibility to add
and monitor additional URLs.

## Uptime Check Criteria

Our uptime monitoring system verifies the availability of your URLs
by performing GET requests at regular 5-minute intervals.
For a URL to be considered up and running, the response must meet the following criteria:

1. **Successful Response (2xx Status Codes):**
The URL must return an HTTP status code in the 200–299 range, indicating a successful request.
2. **Automatic Handling of Redirects (3xx Status Codes):** URLs returning an HTTP status code in the 300–399 range,
indicating a redirect,
will trigger Sentry to automatically follow the redirect
and verify the final destination URL returns a successful response.
This ensures that redirects don’t falsely trigger downtime alerts.
3. **Timeout Setting:** Each request has a timeout threshold of 10 seconds.
Member

It's actually 5 seconds right now isn't it @wedamija ?

Member

It's 10 seconds right now

If the server doesn't respond within this period, the check will be marked as a failure,
indicating potential downtime or performance issues.
4. **DNS Issue Detection:** Our monitoring also includes the detection of DNS resolution issues.
If a DNS issue is detected, the check will be marked as a failure,
allowing you to address the underlying connectivity problems.
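
The four criteria above can be approximated with the Python standard library. This is a sketch under stated assumptions, not Sentry's checker: the names `evaluate` and `run_check` and the failure labels are hypothetical, and the timeout passed to `urlopen` applies to each blocking socket operation rather than to the request as a whole.

```python
import socket
import urllib.error
import urllib.request

TIMEOUT_SECONDS = 10  # timeout threshold from the criteria above


def evaluate(status=None, timed_out=False, dns_failed=False):
    """Map one check outcome to 'success' or a failure reason (labels are illustrative)."""
    if dns_failed:
        return "failure: dns"
    if timed_out:
        return "failure: timeout"
    if status is not None and 200 <= status <= 299:
        return "success"
    return "failure: status"


def run_check(url):
    """Send one GET request; urlopen follows redirects before reporting a status."""
    request = urllib.request.Request(url, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=TIMEOUT_SECONDS) as response:
            return evaluate(status=response.status)
    except urllib.error.HTTPError as error:  # error status such as 4xx or 5xx
        return evaluate(status=error.code)
    except urllib.error.URLError as error:
        if isinstance(error.reason, socket.gaierror):  # hostname did not resolve
            return evaluate(dns_failed=True)
        if isinstance(error.reason, socket.timeout):
            return evaluate(timed_out=True)
        return "failure: connection"
    except socket.timeout:  # the server stopped responding mid-request
        return evaluate(timed_out=True)
```

Calling `run_check("https://www.example.com/")` performs a real network request, so the pure `evaluate` function is the easier piece to experiment with.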

## Notifications

An existing uptime alert continuously monitors the configured URL with the criteria defined above,
and upon any failures,
a new [uptime issue](/product/issues/issue-details/uptime-issues/) is created with details of the failed check and related errors.

To receive notifications for a new downtime issue,
[configure an issue alert](/product/alerts/create-alerts/issue-alert-config/)
that matches the issue's category as "uptime"
with the chosen actions
(such as sending an email, triggering Slack, etc.).

[TODO: Add image of issue alert conditions here]

## Learn More About Uptime Monitoring

<PageGrid />
35 changes: 35 additions & 0 deletions docs/product/alerts/uptime-monitoring/troubleshooting.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,35 @@
---
title: Troubleshooting
sidebar_order: 52
description: "Troubleshooting for Uptime Monitoring."
---

<Include name="feature-stage-alpha-uptime.mdx" />

## Verify Feature Eligibility

Uptime alerts are only available to organizations that have early adopter features enabled and that have relevant URLs
matching our [automatic detection criteria](/product/alerts/uptime-monitoring/automatic-detection/#how-it-works).
Currently, organizations are limited to a single uptime alert.

## Verify Firewall Configuration

Some hosting platforms can block incoming requests from Sentry's Uptime Bot, falsely triggering uptime alerts. We recommend verifying your firewall configuration to ensure incoming requests from Sentry are allowed.

If you need to configure your firewall allowlist to include Sentry's Uptime Bot, we recommend matching on our `User-Agent` rather than on IP addresses, because our IP addresses can change without notice.

### User Agent

Our uptime check requests use the following `User-Agent`:

```
Mozilla/5.0 (compatible; SentryUptimeBot/1.0; +http://docs.sentry.io/product/alerts/uptime-monitoring/)
```
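
If your allowlist logic lives in application code, a check on this header might look like the sketch below. Matching on the `SentryUptimeBot` token is an assumption about how you choose to recognize the bot, and the function name is hypothetical. Keep in mind that any client can send this header, so it identifies the bot but does not authenticate it.

```python
UPTIME_BOT_TOKEN = "SentryUptimeBot"  # product token from the User-Agent shown above


def is_sentry_uptime_request(headers):
    """Return True if the request's User-Agent mentions Sentry's Uptime Bot."""
    for name, value in headers.items():
        if name.lower() == "user-agent":  # header names are case-insensitive
            return UPTIME_BOT_TOKEN in value
    return False
```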

### IP addresses

See [IP Ranges](/security-legal-pii/security/ip-ranges/#uptime-monitoring) for a complete list of IP addresses used for uptime checks.

## Verify Issue Alerts Match Downtime Issues

Uptime alerts create downtime issues. If you are not receiving notifications when downtime is detected, make sure you have properly [configured an issue alert](/product/alerts/create-alerts/issue-alert-config/) that matches the issue's category as "uptime".
27 changes: 27 additions & 0 deletions docs/product/issues/issue-details/uptime-issues/index.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,27 @@
---
title: Uptime Issues
sidebar_order: 40
description: "Learn how to use the information on the Issue Details page to debug an uptime issue."
---

<Include name="feature-stage-alpha-uptime.mdx" />

An uptime issue is a grouping of detected downtime events for a specific URL. A downtime event is generated by
active uptime alerts when HTTP requests fail to meet our
[uptime check criteria](/product/alerts/uptime-monitoring/#uptime-check-criteria).

[TODO: add uptime issue screenshot]

## Traced Errors

Uptime checks made against web services configured with one of Sentry's supported SDKs contain a
[trace](/concepts/key-terms/tracing/) that can be used to track detected errors from failed HTTP uptime checks.
The trace navigator allows you to browse through potential root causes of your downtime, providing a powerful tool
to quickly identify and resolve issues.

## Issue Lifecycle

Uptime issues are grouped by the monitored URL, and created upon the first detected downtime. Sentry automatically
resolves an ongoing uptime issue when the monitored URL returns to a healthy status and meets our [uptime check
criteria](/product/alerts/uptime-monitoring/#uptime-check-criteria). If the URL experiences subsequent downtimes,
the issue's status changes to regressed.
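
The lifecycle above can be read as a small state machine. The sketch below is illustrative only: `next_status` is a hypothetical helper, and the label `unresolved` for a newly created issue is an assumption, since the text only names the resolved and regressed states.

```python
def next_status(current, check_passed):
    """Return the issue status after one uptime check.

    `current` is None (no issue yet), "unresolved", "resolved", or "regressed".
    """
    if check_passed:
        # A healthy check resolves an ongoing issue and changes nothing otherwise.
        return "resolved" if current in ("unresolved", "regressed") else current
    if current is None:
        return "unresolved"  # first detected downtime creates the issue
    if current == "resolved":
        return "regressed"   # downtime after a resolution
    return current           # still down; the status is unchanged


status = None
for passed in [False, True, False, True]:
    status = next_status(status, passed)
    print(status)
```

Because issues are grouped by the monitored URL, one such status would be tracked per URL.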
14 changes: 14 additions & 0 deletions docs/security-legal-pii/security/ip-ranges.mdx
Original file line number Diff line number Diff line change
@@ -122,3 +122,17 @@ All email is delivered from [SendGrid](https://sendgrid.com/) from the following
```

These IP addresses are only for Sentry use.

## Uptime Monitoring

Sentry uses the following IP addresses for uptime checks:

US Data Storage Location
```
TODO
```

EU Data Storage Location
```
TODO
```
3 changes: 3 additions & 0 deletions includes/feature-stage-alpha-uptime.mdx
Original file line number Diff line number Diff line change
@@ -0,0 +1,3 @@
<Note>
This feature is only available if your organization has enabled [early adopter features](/organization/early-adopter-features/). Early adopter features are still in progress and may have bugs. We recognize the irony. If you’re interested in participating, enable early adopter features in [organization settings](https://sentry.io/orgredirect/organizations/:orgslug/settings/organization).
</Note>