DOCS-235 - Historical baselines in Cloud SIEM #5194

Open
wants to merge 12 commits into
base: main
17 changes: 17 additions & 0 deletions blog-cse/2025-05-13-application.md
@@ -0,0 +1,17 @@
---
title: May 13, 2025 - Application Update
image: https://help.sumologic.com/img/sumo-square.png
keywords:
- outlier rules
- first seen rules
- baseline
hide_table_of_contents: true
---

import useBaseUrl from '@docusaurus/useBaseUrl';

### New method for building baselines

We're happy to announce that now when you create or update a first seen or outlier rule, the baseline starts building immediately using existing system data. Typically, the baseline is ready within minutes. You no longer need to wait days for a baseline learning period to complete before it becomes usable. This change enables you to gain insights faster and iterate on your first seen and outlier rules rapidly, reducing tuning time from weeks to minutes.

To learn more, see our information about baselines for [first seen rules](/docs/cse/rules/write-first-seen-rule/) and [outlier rules](/docs/cse/rules/write-outlier-rule/#baselines-for-outlier-rules).
32 changes: 26 additions & 6 deletions docs/cse/rules/rules-status.md
@@ -33,23 +33,43 @@ Following are the different kinds of rule status. A rule's status can change dep
| Status | Description | Action required |
| :-- | :-- | :-- |
| **Active** | The rule is executing normally. | No action required. |
| **Degraded** | The rule is approaching a rule limit and it is removed from execution for one hour to allow processing to catch up. At the end of the hour, the rule is allowed to execute again and its status changes back to Active. | Click the information button <img src={useBaseUrl('img/cse/rule-status-information-button.png')} alt="Rule status information button" width="20"/> on the **Degraded** label for details. Depending on the information provided, you may want to edit the rule to reduce the chance it will become degraded again later. See [Degraded rules](#degraded-rules) below for more information. |
| **Degraded** | The rule encountered a problem during processing and is removed from execution until the problem is resolved. | Click the information button <img src={useBaseUrl('img/cse/rule-status-information-button.png')} alt="Rule status information button" width="20"/> on the **Degraded** label for details. Depending on the information provided, you may need to edit the rule to reduce the chance it will become degraded again later. See [Degraded rules](#degraded-rules) below for more information. |
| **Disabled** | The rule was manually disabled using the toggle in the UI, or was disabled with the API. | Enable the rule with the toggle in the UI, or enable the rule with the [API](https://api.sumologic.com/docs/sec/#operation/UpdateRuleEnabled). |
| **Failed** | The rule exceeded a rule limit and was automatically disabled. | Click the information button <img src={useBaseUrl('img/cse/rule-status-information-button.png')} alt="Rule status information button" width="20"/> on the **Failed** label for details about the failure. Depending on the reasons provided in the details, you may need to edit the rule to prevent it from failing again in the future. <br/><br/>After addressing the reasons for the failure, enable the rule with the toggle in the UI, or enable the rule with the [API](https://api.sumologic.com/docs/sec/#operation/UpdateRuleEnabled). |
| **Failed** | The rule encountered a problem that resulted in its being automatically disabled. For example, processing the rule caused the system to exceed a rule limit. | Click the information button <img src={useBaseUrl('img/cse/rule-status-information-button.png')} alt="Rule status information button" width="20"/> on the **Failed** label for details about the failure. Depending on the reason provided in the details, you may need to edit the rule to prevent it from failing again in the future. <br/><br/>After addressing the reason for the failure, enable the rule with the toggle in the UI, or enable the rule with the [API](https://api.sumologic.com/docs/sec/#operation/UpdateRuleEnabled). |
| **Pending Baseline** | The baseline for the [first seen rule](/docs/cse/rules/write-first-seen-rule/#baselines-for-first-seen-rules) or [outlier rule](/docs/cse/rules/write-outlier-rule/#baselines-for-outlier-rules) is being generated. | Click the information button <img src={useBaseUrl('img/cse/rule-status-information-button.png')} alt="Rule status information button" width="20"/> on the **Pending Baseline** label for details. If data exists in the system to build the baseline, baseline generation typically takes only minutes to complete, and then the rule's status changes to "Active". However, if there is not enough data in the system, the pending status can last longer. See [Troubleshoot baseline problems](#troubleshoot-baseline-problems) below. |

<!-- For DOCS-72 - Rule limits
| **Warning** | The rule is approaching a rule limit and risks being disabled. | Click the information button <img src={useBaseUrl('img/cse/rule-warning-info-button.png')} alt="Rule warning information button" width="20"/> on the **Warning** label for details about the warning. Depending on the reasons provided in the details, you may need to edit the rule to prevent it from being disabled. |
-->

### Degraded rules

A degraded rule is one that has been temporarily shut off to prevent it from exceeding a processing limit. If you write a [custom rule](/docs/cse/rules/before-writing-custom-rule/) that becomes degraded, you must tune the rule to correct the problem.
A degraded rule is one that has been temporarily removed from execution because a problem was encountered during rule processing. After the problem is resolved, the rule returns to execution.

For example, rules have a limit on the number of records per second they can evaluate. If there is a value used in the "group by" field that causes the rule to exceed that threshold, Cloud SIEM might display a message like this:
Rules can be degraded for many reasons, such as a failure to parse the rule. If the rule is degraded because it is approaching a rule limit, it is removed from execution for one hour to allow processing to catch up. At the end of the hour, the rule is allowed to execute again and its status changes back to Active.

If you write a [custom rule](/docs/cse/rules/before-writing-custom-rule/) that becomes degraded, you must tune the rule to correct the problem. Create a [rule tuning expression](/docs/cse/rules/rule-tuning-expressions/) to address the portion of the rule causing the rule degradation.

Following are some situations in which a rule can become degraded:
* When a rule cannot be parsed, a message like this can appear when you click the information button on the "Degraded" rule status:
<br/>`Failure to parse rule: Line 1:2 mismatched input 'Unknown' expecting {<EOF>, '[', '.', AND, BETWEEN, IN, IS, LIKE, MATCHES, NOT, OR, RLIKE, EQ, '<=>', '<>', '!=', '<', LTE, '>', GTE, '+', '-', '*', '/', '%', WS}`
* Rules have a limit on the number of records per second they can evaluate. If there is a value used in the "group by" field that causes the rule to exceed that threshold, Cloud SIEM might display a message like this when you click the information button on the "Degraded" rule status:
<br/>`The aggregation on the group key '[email protected]' has a record volume exceeding the supported limit, and has been disabled. Consider tuning the rule to exclude records producing this group key.`

### Troubleshoot baseline problems

Sometimes there may be a problem creating a baseline for a [first seen rule](/docs/cse/rules/write-first-seen-rule/#baselines-for-first-seen-rules) or [outlier rule](/docs/cse/rules/write-outlier-rule/#baselines-for-outlier-rules). In these cases, the rule might enter a Degraded, Failed, or Pending Baseline state. Clicking the information button <img src={useBaseUrl('img/cse/rule-status-information-button.png')} alt="Rule status information button" width="20"/> on the status label in most cases will provide enough information to resolve the problem. But if not, you can do additional troubleshooting:
* Check the [Sumo Logic status](https://status.sumologic.com/) page to see if there’s an outage in your deployment. If the system is down, it cannot generate the baseline.
* If the rule has a Degraded status because it failed to parse, fix the rule so that it parses correctly. A baseline cannot be built if the rule does not successfully parse. One way to ensure that a matching expression for the rule parses correctly is to use the compatible [core platform literals](/docs/cse/rules/cse-rules-syntax/#sumo-logic-core-platform-literals-supported-in-cloud-siem).
* If the rule has a Failed status, clicking the information button might show that the amount of data requested is too large to return (see [Rule limits](#rule-limits)). In this case, create a more filtered baseline focusing on the exact activity you want to capture.
* If the rule has a persistent Pending Baseline status, there might not be enough data in the system to build the baseline:
    * Check the ingest configuration of your Cloud SIEM data sources and confirm the appropriate records are being added to the system.
    * The matching expression may not be using the right fields. Cloud SIEM records are normalized to a defined [schema](/docs/cse/schema/schema-attributes/). The matching expression and all other fields should use that schema and not the raw log field names.
    * There may not be enough activity to build a baseline. Expand the baseline retention period to gather more activity.
    * Make sure that the Sumo Logic system has been active and ingesting data for the full baseline retention period. For example, if the rule has a default baseline retention period of 90 days, but your company only started using Sumo Logic a few days ago, then the rule will remain in the Pending Baseline state until 90 days have passed. To resolve the issue, change the baseline retention period window.


`The aggregation on the group key '[email protected]' has a record volume exceeding the supported limit, and has been disabled. Consider tuning the rule to exclude records producing this group key.`

To resolve a degraded rule issue, create a [rule tuning expression](/docs/cse/rules/rule-tuning-expressions/) to address the portion of the rule causing the rule degradation.

## Rule limits

48 changes: 29 additions & 19 deletions docs/cse/rules/write-first-seen-rule.md
@@ -20,23 +20,14 @@ If you are new to writing rules, see [About Cloud SIEM Rules](/docs/cse/rules/ab
:::

## About first seen rules

First seen rules allow you to generate a signal when behavior by an entity (such as a user) is encountered that hasn't been seen before. For example, a first seen rule might look for events like the following:

* First time a user logged in from a new geographic location (geolocation)
* Newly created or added admin accounts
* High severity EDR alert seen for the first time
* MFA acceptance from first seen device

A first seen rule is different from other Cloud SIEM rule types in that you don’t define the criteria for firing a signal. Instead, the rule expression in a first seen rule is simply a filter condition that defines what incoming records the rule will apply to. For each first seen rule, Cloud SIEM automatically creates a baseline model of normal behavior evidenced by records that match the Rule Expression. After the baseline learning period is completed, when an incoming record includes matching activity not seen during the baseline learning period, the rule creates a signal.

For example, for the “First time a user logged in from a new geographic location” use case, Cloud SIEM will build a baseline model of all the geolocations from where a logon event is seen for the entity (user). Once the baselining period is complete, Cloud SIEM will create a signal for every new geolocation detected and incrementally add to the baseline.

:::tip
Sumo Logic ensures that rule processing does not impact the reliability of production environments through the implementation of "circuit breakers." If a rule matches too many records in too short a period of time, the circuit breaker will trip and the rule will move to a degraded state, and first seen rules are no exception.

On the rule detail page, if you hover over the degraded message, you will usually see more details about what tripped the circuit breaker and how to resolve the problem. Generally speaking, a rule that is degraded probably needs to be tuned for your specific environment.
:::

:::sumo Micro Lesson

Watch this micro lesson to learn more about first seen rules.
@@ -68,10 +59,29 @@ Watch this micro lesson to learn more about first seen rules.

:::

## Baselines for first seen rules

A first seen rule is different from other Cloud SIEM rule types in that you don’t define the criteria for firing a signal. Instead, the rule expression in a first seen rule is simply a filter condition that defines what incoming records the rule will apply to. For each first seen rule, Cloud SIEM automatically creates a baseline model of normal behavior, evidenced by records that match the rule expression, for a defined time period (by default, the last 90 days). The activity found during this period is considered normal behavior and will not be alerted on. As soon as you save or update a first seen rule, the baseline is built using data already collected. If data exists in the system to build the baseline, baseline creation typically takes only minutes to complete.

Once the baseline is created, when an incoming record includes matching activity not seen during the baseline retention period, the rule creates a signal that identifies the activity as *first seen*:

<img src={useBaseUrl('img/cse/first-seen-signal-example.png')} alt="First seen signal example" style={{border: '1px solid gray'}} width="600"/>

For example, for the “First time a user logged in from a new geographic location” use case, Cloud SIEM will build a baseline model of all the geolocations from where a logon event is seen for the entity (user). Once the baseline is created, Cloud SIEM will create a signal for every new geolocation detected and incrementally add to the baseline.
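The behavior described above can be illustrated with a minimal Python sketch. This is a toy model written for illustration only, not Cloud SIEM's implementation: the class name `FirstSeenBaseline`, its methods, and the in-memory dictionary are all hypothetical. It seeds a baseline from existing records, flags values not in the baseline, adds each new value incrementally, and drops data points older than the retention period.

```python
from datetime import datetime, timedelta


class FirstSeenBaseline:
    """Toy model of a first seen baseline (hypothetical, for illustration)."""

    def __init__(self, retention_days=90):
        self.retention = timedelta(days=retention_days)
        self.last_seen = {}  # value -> most recent timestamp it was seen

    def build(self, historical_records):
        """Seed the baseline from existing (timestamp, value) records."""
        for ts, value in historical_records:
            prev = self.last_seen.get(value)
            if prev is None or ts > prev:
                self.last_seen[value] = ts

    def observe(self, ts, value):
        """Return True if the value is first seen, then add it to the baseline."""
        cutoff = ts - self.retention
        # Expire data points that fall outside the retention period.
        self.last_seen = {v: t for v, t in self.last_seen.items() if t >= cutoff}
        is_new = value not in self.last_seen
        self.last_seen[value] = ts
        return is_new


now = datetime(2025, 5, 13)
baseline = FirstSeenBaseline()
baseline.build([(now - timedelta(days=10), "US"), (now - timedelta(days=3), "CA")])

# "US" is already in the baseline; "FR" is new once, then part of the baseline.
signals = [baseline.observe(now, country) for country in ["US", "FR", "FR"]]
```

In this sketch, only the first `"FR"` login would produce a signal, because the value is added to the baseline as soon as it is observed.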

:::tip
Sumo Logic ensures that rule processing does not impact the reliability of production environments through the implementation of "circuit breakers." If a rule matches too many records in too short a period of time, the circuit breaker will trip and the rule will move to a degraded state, and first seen rules are no exception.

On the rule detail page, if you view the degraded message, you will usually see more details about what tripped the circuit breaker and how to resolve the problem. Generally speaking, a rule that is degraded probably needs to be tuned for your specific environment.

For more information, see [Troubleshoot baseline problems](/docs/cse/rules/rules-status/#troubleshoot-baseline-problems).
:::

## Example rule

The screenshot below shows a first seen rule in the Cloud SIEM rules editor. For an explanation of the configuration options, see [Create a first seen rule](#create-a-first-seen-rule), below.
<img src={useBaseUrl('img/cse/first-seen-rule.png')} alt="Example first seen rule definition" style={{border: '1px solid gray'}} width="700"/>

<img src={useBaseUrl('img/cse/first-seen-rule.png')} alt="Example first seen rule definition" style={{border: '1px solid gray'}} width="700"/>

## Create a first seen rule

@@ -98,11 +108,9 @@ The settings in the **If Triggered** section determine what records the rule wil
:::note
For more information about how to select the type of baseline, see the [Use case](#use-case-monitor-login-from-first-seen-geolocation), below.
:::
1. Set the baseline and retention settings:
1. **Baseline Retention Period (days)**. The number of days after which the data points in the baseline will expire (be dropped from the baseline). The default is 90 days. You can decrease this period, but not increase it.
1. **Baseline Learning Period (days)**. The minimum amount of time for which data points should be collected before firing a signal. The default is 30 days.
1. **Baseline Retention Period (days)**. The number of days after which the data points in the baseline will expire (be dropped from the baseline). The minimum is 1, and the maximum is 90. The default is 90 days.
:::note
The **Baseline Learning Period** must be shorter than the **Baseline Retention Period**. Also be aware that short baseline learning periods can potentially generate false positive signals.
If the [retention period for logs](/docs/cse/administration/cse-data-retention/) is less than the baseline retention period, then the baseline will be created based on the logs retention time only.
:::
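The interaction between the two retention settings can be sketched as a small calculation. The function name and constants below are hypothetical and only restate the limits given above (minimum 1, maximum 90, capped by the log retention period); they are not part of any Sumo Logic API.

```python
MIN_RETENTION_DAYS = 1
MAX_RETENTION_DAYS = 90


def effective_baseline_days(baseline_retention_days, log_retention_days):
    """Return how many days of data the baseline can actually cover."""
    if not MIN_RETENTION_DAYS <= baseline_retention_days <= MAX_RETENTION_DAYS:
        raise ValueError(
            f"Baseline retention must be between {MIN_RETENTION_DAYS} "
            f"and {MAX_RETENTION_DAYS} days"
        )
    # The baseline cannot look further back than the logs that are retained.
    return min(baseline_retention_days, log_retention_days)
```

For example, with the default 90-day baseline retention period and a 30-day log retention period, the sketch returns 30.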

### Configure "Then Create a Signal" settings
@@ -123,13 +131,15 @@ The settings in the **If Triggered** section determine what records the rule wil

## When the baseline is reset for a first seen rule

The baseline learning period begins again when the following fields on the rule are updated or overridden:
Baseline creation begins again when the following fields on the rule are updated or overridden:
* **If Triggered**:
    * **When a Record matching the expression**
    * **Has a new value for the field(s)**
* **Then Create a Signal**:
    * **On Entity**

If data exists in the system to build the baseline, baseline creation typically takes only minutes to complete.
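As a rough illustration of the reset condition, the following sketch compares two versions of a rule and reports whether any baseline-affecting field changed. The dictionary keys (`expression`, `new_value_fields`, `entity`) are hypothetical stand-ins for the three UI fields listed above, not actual Cloud SIEM field names.

```python
# Hypothetical keys standing in for the three UI fields that reset the baseline.
BASELINE_FIELDS = ("expression", "new_value_fields", "entity")


def needs_baseline_rebuild(old_rule, new_rule):
    """Return True if an update touches a field that resets the baseline."""
    return any(old_rule.get(f) != new_rule.get(f) for f in BASELINE_FIELDS)


rule = {
    "name": "First seen geolocation",
    "expression": "metadata_deviceEventId = 'login'",
    "new_value_fields": ["srcDeviceIP_countryName"],
    "entity": "user_username",
}

renamed = {**rule, "name": "First seen country"}
retargeted = {**rule, "entity": "srcDevice_ip"}
```

Under these assumptions, renaming the rule leaves the baseline alone, while changing the entity triggers a rebuild.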

## Use case: Monitor login from first seen geolocation

This section shows how the same first seen rule would function with each of the two baselining strategies.
@@ -142,12 +152,12 @@ with **has a new value for the field(s)** set to `srcDeviceIP_countryName`

### With a global baseline

With a global baseline, and the default baseline learning period of 30 days, the rule will baseline all geolocations that users are logging in for a period of 30 days. After the 30 day baseline is completed, if a new geolocation is detected, a signal will be created. Then, if a new hire (that wasn’t part of the 30 day baseline) logs in from any geolocation, a signal
will be created. As a global baseline, the 30 day baseline is shared across all entity.
With a global baseline, and the default baseline retention period of the last 90 days, the rule creates a baseline of all geolocations that users logged in from during the last 90 days. If a new geolocation is detected, a signal will be created. Then, if a new hire (that wasn’t part of the 90-day baseline) logs in from any geolocation, a signal will be created. As a global baseline, the 90-day baseline is shared across all entities.

### With per-entity baselines

With a per-entity baseline, and the default baseline learning period of 30 days, the rule will baseline all geolocations on a per-entity basis for 30 days. It will generate a signal when a new geolocation is not part of a user’s historic baseline. On a new hire’s first login, a 30 day baseline will begin building. After the 30 day baseline is created, if that user logs on from a new geolocation, the rule will create a signal.
With a per-entity baseline, and the default baseline retention period of the last 90 days, the rule creates a baseline of all geolocations on a per-entity basis for the last 90 days. It will generate a signal when a new geolocation is not part of a user’s historic baseline. On a new hire’s first login, a baseline for the last 90 days will begin building. If that user logs on from a new geolocation, the rule will create a signal.
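The difference between the two strategies can be sketched in a few lines of Python. This is a simplified model, not the product's logic: the function `first_seen_signals` and its arguments are hypothetical, retention is ignored, and an entity with no history simply gets an empty baseline, which is a simplification rather than documented behavior.

```python
from collections import defaultdict


def first_seen_signals(history, incoming, per_entity):
    """Return the (user, country) logins that would be flagged as first seen.

    history and incoming are lists of (user, country) tuples.
    """
    baseline = defaultdict(set)

    def key(user):
        # Per-entity: one baseline per user. Global: one shared baseline.
        return user if per_entity else "__global__"

    for user, country in history:
        baseline[key(user)].add(country)

    signals = []
    for user, country in incoming:
        seen = baseline[key(user)]
        if country not in seen:
            signals.append((user, country))
            seen.add(country)  # incrementally add to the baseline
    return signals


history = [("alice", "US"), ("bob", "FR")]
incoming = [("alice", "FR")]

global_signals = first_seen_signals(history, incoming, per_entity=False)
entity_signals = first_seen_signals(history, incoming, per_entity=True)
```

Here, the login by `alice` from `FR` is not flagged with a global baseline because another user already logged in from that country, but it is flagged with per-entity baselines because it is new for `alice`.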

:::tip
If you are unsure whether to use a per-entity or a global baseline, consider your use case. If you’re inclined to select `user_username` in the **Has a new value for the field(s)** prompt, you’re better off creating a global baseline for that behavior. Alternatively, if you want to track a new value for a non-entity record field, a per-entity baseline is appropriate.