Commit

v1.14.0
joeyorlando authored Jan 6, 2025
2 parents b8afe1b + 152d5f7 commit 95ad2f2
Showing 29 changed files with 732 additions and 111 deletions.
32 changes: 24 additions & 8 deletions docs/make-docs
@@ -6,14 +6,28 @@
# [Semantic versioning](https://semver.org/) is used to help the reader identify the significance of changes.
# Changes are relevant to this script and the support docs.mk GNU Make interface.
#
# ## 8.3.0 (2024-12-27)
#
# ### Added
#
# - Debug output of the final command when DEBUG=true.
#
# Useful to inspect if the script is correctly constructing the final command.
#
# ## 8.2.0 (2024-12-22)
#
# ### Removed
#
# - Special cases for Oracle and Datadog plugins now that they exist in the plugins monorepo.
#
# ## 8.1.0 (2024-08-22)
#
# ### Added
#
# - Additional website mounts for projects that use the website repository.
#
# Mounts are required for `make docs` to work in the website repository or with the website project.
# The Makefile is also mounted for convenient development of the procedure that repository.
# The Makefile is also mounted for convenient development of the procedure in that repository.
#
# ## 8.0.1 (2024-07-01)
#
@@ -355,8 +369,6 @@ SOURCES_grafana_cloud_frontend_observability_faro_web_sdk='faro-web-sdk'
SOURCES_helm_charts_mimir_distributed='mimir'
SOURCES_helm_charts_tempo_distributed='tempo'
SOURCES_opentelemetry='opentelemetry-docs'
SOURCES_plugins_grafana_datadog_datasource='datadog-datasource'
SOURCES_plugins_grafana_oracle_datasource='oracle-datasource'
SOURCES_resources='website'

VERSIONS_as_code='UNVERSIONED'
@@ -367,8 +379,6 @@ VERSIONS_grafana_cloud_k6='UNVERSIONED'
VERSIONS_grafana_cloud_data_configuration_integrations='UNVERSIONED'
VERSIONS_grafana_cloud_frontend_observability_faro_web_sdk='UNVERSIONED'
VERSIONS_opentelemetry='UNVERSIONED'
VERSIONS_plugins_grafana_datadog_datasource='latest'
VERSIONS_plugins_grafana_oracle_datasource='latest'
VERSIONS_resources='UNVERSIONED'
VERSIONS_technical_documentation='UNVERSIONED'
VERSIONS_website='UNVERSIONED'
@@ -378,8 +388,6 @@ PATHS_grafana_cloud='content/docs/grafana-cloud'
PATHS_helm_charts_mimir_distributed='docs/sources/helm-charts/mimir-distributed'
PATHS_helm_charts_tempo_distributed='docs/sources/helm-charts/tempo-distributed'
PATHS_mimir='docs/sources/mimir'
PATHS_plugins_grafana_datadog_datasource='docs/sources'
PATHS_plugins_grafana_oracle_datasource='docs/sources'
PATHS_resources='content'
PATHS_tempo='docs/sources/tempo'
PATHS_website='content'
@@ -631,7 +639,7 @@ POSIX_HERESTRING

case "${_project}" in
# Workaround for arbitrary mounts where the version field is expected to be the local directory
# and the repo field is expected to be the container directory.
arbitrary)
echo "${_project}^${_version}^${_repo}^" # TODO
;;
@@ -801,6 +809,10 @@ case "${image}" in
| sed "s#$(proj_dst "${proj}")#sources#"
EOF

if [ -n "${DEBUG}" ]; then
debg "${cmd}"
fi

case "${OUTPUT_FORMAT}" in
human)
if ! command -v jq >/dev/null 2>&1; then
@@ -837,6 +849,10 @@ EOF
/hugo/content/docs
EOF

if [ -n "${DEBUG}" ]; then
debg "${cmd}"
fi

case "${OUTPUT_FORMAT}" in
human)
${cmd} --output=line \
40 changes: 37 additions & 3 deletions docs/sources/configure/integrations/references/manual/index.md
@@ -89,13 +89,47 @@ to the team's ChatOps channels and start an appropriate escalation chain.

## Set up direct paging for a team

By default all teams will have a direct paging integration created for them. However, these are not configured by default.
If a team does not have their direct paging integration configured, such that it is "contactable" (ie. it has an
escalation chain assigned to it, or has at least one Chatops integration connected to send notifications to), you will
By default all teams will have a direct paging integration created for them. Each direct paging integration will be
created with two routes:

- a non-default route which has a Jinja2 filtering term of `{{ payload.oncall.important }}`
(see [Important Escalations](#important-escalations) below for more details)
- a default route to capture all other alerts

However, these integrations are not "contactable" by default (i.e. their routes have no escalation chains
assigned to them, nor any ChatOps integrations connected to send notifications to).
If a team has not configured its direct paging integration to be "contactable", you will
not be able to direct page this team. If this happens, consider completing the following steps for the team (or reach out
to the relevant team and suggest doing so).

Navigate to the **Integrations** page and find the "Direct paging" integration for the team in question. From the
integration's detail page, you can customize its settings, link it to an escalation chain, and configure associated
ChatOps channels. To confirm that the integration is functioning as intended, [create a new alert group](#page-a-team)
and select the same team for a test run.

### Important escalations

Sometimes you really need to get the attention of a particular team. When directly paging a team, you can
page them using an "important escalation". In practice, this creates an alert through the specified team's
direct paging integration, with a payload such as:

```json
{
  "oncall": {
    "title": "IRM is paging Network team to join escalation",
    "message": "I really need someone from your team to come take a look! The k8s cluster is down!",
    "uid": "8a20b8d1-56fd-482e-824e-43fbd1bd7b10",
    "author_username": "irm",
    "permalink": null,
    "important": true
  }
}
```

When you are directly paging a team, either via the web UI, ChatOps apps, or the API, you can specify that this
escalation be "important", which sets the value of `oncall.important` to `true`. As mentioned above in
[Set up direct paging for a team](#set-up-direct-paging-for-a-team), direct paging integrations come pre-configured with
two routes, with the non-default route having a Jinja2 filtering term of `{{ payload.oncall.important }}`.

This allows teams to be contacted via different escalation chains, depending on whether or not the user paging them
believes that this is an "important escalation".
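
The two-route behaviour described above can be sketched in a few lines of Python. This is a minimal illustration, not Grafana OnCall's implementation: `Route`, `select_route`, and the chain names are hypothetical, and the Jinja2 term `{{ payload.oncall.important }}` is approximated by a plain truthiness check on the payload rather than rendered with Jinja2.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Route:
    """Hypothetical stand-in for a direct paging route."""

    order: int
    escalation_chain: str
    # None marks the default route, which accepts every alert.
    matches: Optional[Callable[[dict], bool]] = None


def is_important(payload: dict) -> bool:
    # Approximates `{{ payload.oncall.important }}` with a truthiness check.
    return bool(payload.get("oncall", {}).get("important"))


def select_route(routes: list, payload: dict) -> Route:
    # Routes are checked in ascending order; the first match wins.
    for route in sorted(routes, key=lambda r: r.order):
        if route.matches is None or route.matches(payload):
            return route
    raise LookupError("no default route configured")


routes = [
    Route(order=0, escalation_chain="important-chain", matches=is_important),
    Route(order=1, escalation_chain="default-chain"),
]

important_page = {"oncall": {"title": "Cluster down", "important": True}}
regular_page = {"oncall": {"title": "Quick question", "important": False}}

print(select_route(routes, important_page).escalation_chain)
print(select_route(routes, regular_page).escalation_chain)
```

Under these assumptions, a page flagged as important reaches the chain attached to the non-default route, and every other page falls through to the default route.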
13 changes: 12 additions & 1 deletion docs/sources/oncall-api-reference/escalation.md
@@ -18,6 +18,11 @@ refs:
destination: /docs/oncall/<ONCALL_VERSION>/configure/integrations/references/manual
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/configure/integrations/references/manual
manual-paging-team-important:
- pattern: /docs/oncall/
destination: /docs/oncall/<ONCALL_VERSION>/configure/integrations/references/manual#important-escalations
- pattern: /docs/grafana-cloud/
destination: /docs/grafana-cloud/configure/integrations/references/manual#important-escalations
---

# Escalation HTTP API
@@ -90,7 +95,8 @@ curl "{{API_URL}}/api/v1/escalation/" \
"title": "We are seeing a network outage in the datacenter",
"message": "I need help investigating, can you join the investigation?",
"source_url": "https://github.com/myorg/myrepo/issues/123",
"team": "TI73TDU19W48J"
"team": "TI73TDU19W48J",
"important_team_escalation": true
}'
```

@@ -176,6 +182,7 @@ The above command returns JSON structured in the following way:
| `team` | No | Yes (see [Things to Note](#things-to-note)) | Grafana OnCall team ID. If specified, will use the "Direct Paging" Integration associated with this Grafana OnCall team, to create the Alert Group. |
| `users` | No | Yes (see [Things to Note](#things-to-note)) | List of user(s) to escalate to. See above request example for object schema. `id` represents the Grafana OnCall user's ID. `important` is a boolean representing whether to escalate the Alert Group using this user's default or important personal notification policy. |
| `alert_group_id` | No | No | If specified, will escalate the specified users for this Alert Group. |
| `important_team_escalation` | No | No | Sets the value of `payload.oncall.important` to the value specified here (default is `false`; see [Things to Note](#things-to-note) for more details). |

## Things to note

@@ -186,6 +193,10 @@ existing Alert Group
if you are trying to escalate to a set of users on an existing Alert Group, you cannot update the `title`, `message`, or
`source_url` of that Alert Group
- If escalating to a set of users for an existing Alert Group, the Alert Group cannot be in a resolved state
- `important_team_escalation` can be used to send an "important" escalation to the specified team.
Teams can configure their Direct Paging Integration to route to different escalation chains based on the value of
`payload.oncall.important`. See [Manual paging integration - important escalations](ref:manual-paging-team-important)
for more details.
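
As a rough illustration of an "important" team escalation request, the following Python sketch builds (but does not send) the HTTP request using only the standard library. The helper name `build_escalation_request`, the host, and the token are placeholders; the `POST` method and the `Authorization` header are assumptions, since the request headers are not shown in this excerpt.

```python
import json
import urllib.request


def build_escalation_request(api_url, token, team_id, title, message, important=False):
    """Hypothetical helper that prepares a team escalation request."""
    body = {
        "title": title,
        "message": message,
        "team": team_id,
        # Ends up as payload.oncall.important on the created alert.
        "important_team_escalation": important,
    }
    return urllib.request.Request(
        url=f"{api_url}/api/v1/escalation/",
        data=json.dumps(body).encode("utf-8"),
        # Assumed headers; check the API reference for the exact auth scheme.
        headers={"Authorization": token, "Content-Type": "application/json"},
        method="POST",
    )


request = build_escalation_request(
    api_url="https://oncall.example.com",  # placeholder host
    token="placeholder-token",
    team_id="TI73TDU19W48J",
    title="We are seeing a network outage in the datacenter",
    message="I need help investigating, can you join the investigation?",
    important=True,
)
print(request.full_url)
print(request.data.decode("utf-8"))
```

Passing the prepared object to `urllib.request.urlopen` would send it; that step is left out so the sketch runs without network access.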

**HTTP request**

@@ -0,0 +1,84 @@
# Generated by Django 4.2.17 on 2024-12-20 14:19

import logging

from django.db import migrations
from django.db.models import Count

logger = logging.getLogger(__name__)


def upsert_direct_paging_integration_routes(apps, schema_editor):
    AlertReceiveChannel = apps.get_model("alerts", "AlertReceiveChannel")
    ChannelFilter = apps.get_model("alerts", "ChannelFilter")

    DIRECT_PAGING_INTEGRATION_TYPE = "direct_paging"
    IMPORTANT_FILTERING_TERM = "{{ payload.oncall.important }}"

    # Fetch all direct paging integrations
    logger.info("Fetching direct paging integrations which have not had their routes updated.")

    # Ignore updating Direct Paging integrations that have > 1 route, as this means that users have
    # gone ahead and created their own routes. We don't want to overwrite these.
    unedited_direct_paging_integrations = (
        AlertReceiveChannel.objects
        .filter(integration=DIRECT_PAGING_INTEGRATION_TYPE)
        .annotate(num_routes=Count("channel_filters"))
        .filter(num_routes=1)
    )

    integration_count = unedited_direct_paging_integrations.count()
    if integration_count == 0:
        logger.info("No integrations found which meet this criteria. No routes will be upserted.")
        return

    logger.info(f"Found {integration_count} direct paging integrations that meet this criteria.")

    # Direct Paging Integrations are currently created with a single default route (order=0)
    # see AlertReceiveChannelManager.create_missing_direct_paging_integrations
    #
    # we first need to update this route to be order=1, and then we will subsequently bulk-create the
    # non-default route (order=0) which will have a filtering term set
    routes = ChannelFilter.objects.filter(
        alert_receive_channel__in=unedited_direct_paging_integrations,
        is_default=True,
        order=0,
    )

    logger.info(
        f"Swapping the order=0 value to order=1 for {routes.count()} Direct Paging Integrations default routes"
    )

    updated_rows = routes.update(order=1)
    logger.info(f"Swapped order=0 to order=1 for {updated_rows} Direct Paging Integrations default routes")

    # Bulk create the new non-default routes
    logger.info(
        f"Creating new non-default routes for {len(unedited_direct_paging_integrations)} Direct Paging Integrations"
    )
    created_objs = ChannelFilter.objects.bulk_create(
        [
            ChannelFilter(
                alert_receive_channel=integration,
                filtering_term=IMPORTANT_FILTERING_TERM,
                filtering_term_type=1,  # 1 = ChannelFilter.FILTERING_TERM_TYPE_JINJA2
                is_default=False,
                order=0,
            ) for integration in unedited_direct_paging_integrations
        ],
        batch_size=5000,
    )
    logger.info(f"Created {len(created_objs)} new non-default routes for Direct Paging Integrations")

    logger.info("Migration for direct paging integration routes completed.")


class Migration(migrations.Migration):

    dependencies = [
        ("alerts", "0071_migrate_labels"),
    ]

    operations = [
        migrations.RunPython(upsert_direct_paging_integration_routes, migrations.RunPython.noop),
    ]
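
The route upsert performed by this migration can be illustrated without Django. The sketch below is an in-memory approximation under stated assumptions: integrations are plain dictionaries mapping a hypothetical team name to a list of route dictionaries, and `upsert_routes` is an illustrative name rather than a function from the codebase.

```python
IMPORTANT_FILTERING_TERM = "{{ payload.oncall.important }}"


def upsert_routes(integrations: dict) -> int:
    """Add the important route to integrations that still have a single route.

    Returns the number of routes created.
    """
    created = 0
    for routes in integrations.values():
        # More than one route means the routes were customised; leave them alone.
        if len(routes) != 1:
            continue
        only_route = routes[0]
        # Move the existing default route from order=0 to order=1.
        if only_route["is_default"] and only_route["order"] == 0:
            only_route["order"] = 1
        # The new non-default route takes order=0 so it is evaluated first.
        routes.insert(
            0,
            {"filtering_term": IMPORTANT_FILTERING_TERM, "is_default": False, "order": 0},
        )
        created += 1
    return created


integrations = {
    "team-a": [{"filtering_term": None, "is_default": True, "order": 0}],
    "team-b": [
        {"filtering_term": "{{ payload.custom }}", "is_default": False, "order": 0},
        {"filtering_term": None, "is_default": True, "order": 1},
    ],
}

created = upsert_routes(integrations)
print(created, [route["order"] for route in integrations["team-a"]])
```

Because the single-route check excludes integrations that already have two routes, running the sketch a second time creates nothing, which mirrors why the migration skips integrations with more than one route.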
51 changes: 40 additions & 11 deletions engine/apps/alerts/models/alert_receive_channel.py
@@ -126,6 +126,8 @@ class AlertReceiveChannelManager(models.Manager):
def create_missing_direct_paging_integrations(organization: "Organization") -> None:
from apps.alerts.models import ChannelFilter

logger.info(f"Starting create_missing_direct_paging_integrations for organization: {organization.id}")

# fetch teams without direct paging integration
teams_missing_direct_paging = list(
organization.teams.exclude(
@@ -134,10 +136,17 @@ def create_missing_direct_paging_integrations(organization: "Organization") -> N
).values_list("team_id", flat=True)
)
)
number_of_teams_missing_direct_paging = len(teams_missing_direct_paging)
logger.info(
f"Found {number_of_teams_missing_direct_paging} teams missing direct paging integrations.",
)

if not teams_missing_direct_paging:
logger.info("No missing direct paging integrations found. Exiting.")
return

# create missing integrations
logger.info(f"Creating missing direct paging integrations for {number_of_teams_missing_direct_paging} teams.")
AlertReceiveChannel.objects.bulk_create(
[
AlertReceiveChannel(
@@ -151,29 +160,49 @@ def create_missing_direct_paging_integrations(organization: "Organization") -> N
batch_size=5000,
ignore_conflicts=True, # ignore if direct paging integration already exists for team
)
logger.info("Missing direct paging integrations creation step completed.")

# fetch integrations for teams (some of them are created above, but some may already exist previously)
alert_receive_channels = organization.alert_receive_channels.filter(
team__in=teams_missing_direct_paging, integration=AlertReceiveChannel.INTEGRATION_DIRECT_PAGING
)
logger.info(f"Fetched {alert_receive_channels.count()} direct paging integrations for the specified teams.")

# we create two routes for each Direct Paging Integration
# 1. route for important alerts (using the payload.oncall.important alert field value) - non-default
# 2. route for all other alerts - default
routes_to_create = []
for alert_receive_channel in alert_receive_channels:
routes_to_create.extend(
[
ChannelFilter(
alert_receive_channel=alert_receive_channel,
filtering_term="{{ payload.oncall.important }}",
filtering_term_type=ChannelFilter.FILTERING_TERM_TYPE_JINJA2,
is_default=False,
order=0,
),
ChannelFilter(
alert_receive_channel=alert_receive_channel,
filtering_term=None,
is_default=True,
order=1,
),
]
)

# create default routes
logger.info(f"Creating {len(routes_to_create)} channel filter routes.")
ChannelFilter.objects.bulk_create(
[
ChannelFilter(
alert_receive_channel=alert_receive_channel,
filtering_term=None,
is_default=True,
order=0,
)
for alert_receive_channel in alert_receive_channels
],
routes_to_create,
batch_size=5000,
ignore_conflicts=True, # ignore if default route already exists for integration
ignore_conflicts=True, # ignore if routes already exist for integration
)
logger.info("Direct paging routes creation completed.")

# add integrations to metrics cache
logger.info("Adding integrations to metrics cache.")
metrics_add_integrations_to_cache(list(alert_receive_channels), organization)
logger.info("Integrations have been added to the metrics cache.")

def get_queryset(self):
return AlertReceiveChannelQueryset(self.model, using=self._db).filter(