[8.x](backport #41446) [filebeat] Elasticsearch state storage for httpjson and cel inputs #42451

Merged: orestisfl merged 3 commits into 8.x from mergify/bp/8.x/pr-41446 on Jan 29, 2025

mergify[bot]

@mergify mergify bot commented Jan 28, 2025

Proposed commit message

[filebeat] Elasticsearch state storage for httpjson input

This is a POC for using Elasticsearch as the state store backend for Security integrations in the Agentless solution.

The scope of this change was narrowed down to supporting only httpjson inputs, in order to support the Okta integration for the initial release. All other integration inputs still use file storage as before.
This is a short-term solution for state storage in k8s environments.

This is the first cut and the details can change depending on the feedback.

The feature can currently be enabled with the AGENTLESS_ELASTICSEARCH_STATE_STORE_ENABLED environment variable; how it will be configured in k8s is still to be decided.

This change currently contains a hacky approach to overriding the API key through AGENTLESS_ELASTICSEARCH_APIKEY. It lets the user provide an ApiKey with the elevated permissions required to create, write, and read the state index per input. THIS IS FOR DEVELOPMENT/TESTING ONLY. REMOVE BEFORE THE MERGE.

The existing code relied on the inputs' state storage being fully configurable before the main beat manager runs. This change delays the configuration of the httpjson input until the actual configuration is received from the Agent.
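
The deferred configuration described above can be pictured as a small publish/subscribe hook. The following is only a sketch under stated assumptions: `Config`, `Notifier`, `Subscribe`, and `Notify` are hypothetical names chosen for illustration and are not taken from the code in this PR.

```go
package main

import (
	"fmt"
	"sync"
)

// Config is a hypothetical stand-in for the settings received from the Agent.
type Config struct {
	Hosts []string
}

// Notifier fans out configuration updates to subscribers, so a store can be
// configured late, once the real configuration arrives.
type Notifier struct {
	mu   sync.Mutex
	subs []func(Config)
}

// Subscribe registers a callback that is invoked on every Notify call.
func (n *Notifier) Subscribe(fn func(Config)) {
	n.mu.Lock()
	defer n.mu.Unlock()
	n.subs = append(n.subs, fn)
}

// Notify delivers the configuration to all subscribers. The subscriber list
// is copied first so callbacks run without holding the lock.
func (n *Notifier) Notify(c Config) {
	n.mu.Lock()
	subs := append([]func(Config){}, n.subs...)
	n.mu.Unlock()
	for _, fn := range subs {
		fn(c)
	}
}

func main() {
	var n Notifier
	// The state store subscribes at startup but stays unconfigured...
	n.Subscribe(func(c Config) {
		fmt.Println("configuring state store with hosts", c.Hosts)
	})
	// ...until the Agent delivers the actual configuration.
	n.Notify(Config{Hosts: []string{"https://localhost:9200"}})
}
```

With this shape, the input manager can be constructed early while the store connection is only established inside the callback.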

This assumes that the index template for the state storage indices is already in place before the storage is used:

PUT _index_template/agentless_state_template
{
  "index_patterns": [
    "agentless-state-*"
  ],
  "priority": 300,
  "template": {
    "mappings": {
      "properties": {
        "v": {
          "type": "object",
          "enabled": false
        },
        "updated_at": {
          "type": "date",
          "format": "strict_date_optional_time||epoch_millis"
        }
      }
    },
    "settings": {
      "number_of_shards": 1
    }
  }
}

Example of the state storage index content for Okta integration:

{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "agentless-state-httpjson-okta.system-028ecf4b-babe-44c6-939e-9e3096af6959",
        "_id": "httpjson::httpjson-okta.system-028ecf4b-babe-44c6-939e-9e3096af6959::https://dev-36006609.okta.com/api/v1/logs",
        "_seq_no": 39,
        "_primary_term": 1,
        "_score": 1,
        "_source": {
          "v": {
            "ttl": 1800000000000,
            "updated": "2024-10-24T20:21:22.032Z",
            "cursor": {
              "published": "2024-10-24T20:19:53.542Z"
            }
          }
        }
      }
    ]
  }
}
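
For readers who want to inspect such a document programmatically, here is a minimal Go sketch that decodes the `_source` shown above. The `stateDoc` type and `decodeState` helper are hypothetical. Treating `ttl` as a Go `time.Duration` (nanoseconds) is an assumption; under it, the value in the example corresponds to 30 minutes.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// stateDoc mirrors the "_source" of the example hit. The field layout is
// inferred from the example only, not from the PR's code.
type stateDoc struct {
	V struct {
		TTL     time.Duration   `json:"ttl"`     // assumed to be nanoseconds
		Updated time.Time       `json:"updated"` // RFC 3339 timestamp
		Cursor  json.RawMessage `json:"cursor"`  // input-specific, kept opaque
	} `json:"v"`
}

// decodeState parses the raw "_source" JSON of a state document.
func decodeState(source []byte) (stateDoc, error) {
	var doc stateDoc
	err := json.Unmarshal(source, &doc)
	return doc, err
}

func main() {
	src := []byte(`{"v":{"ttl":1800000000000,"updated":"2024-10-24T20:21:22.032Z","cursor":{"published":"2024-10-24T20:19:53.542Z"}}}`)
	doc, err := decodeState(src)
	if err != nil {
		panic(err)
	}
	fmt.Println("ttl:", doc.V.TTL)
	fmt.Println("updated:", doc.V.Updated.Format(time.RFC3339Nano))
	fmt.Println("cursor:", string(doc.V.Cursor))
}
```

The cursor is left as raw JSON because its shape depends on the input; here it only carries the Okta `published` timestamp.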

The naming convention for every state store index is agentless-state-<input id>, because for agentless we expect only one agent per policy and the agents are ephemeral.
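
As an illustration of this convention, the sketch below derives the index name and a document ID for the Okta example. Both helpers are hypothetical. The `::`-joined document ID format is inferred from the `_id` in the example above and is an assumption, not a documented contract.

```go
package main

import (
	"fmt"
	"strings"
)

// statePrefix is the index prefix from the naming convention above.
const statePrefix = "agentless-state-"

// stateIndexName returns the state index name for an input ID.
func stateIndexName(inputID string) string {
	return statePrefix + inputID
}

// stateDocID joins input type, input ID, and source with "::".
// This layout is inferred from the example "_id" and is an assumption.
func stateDocID(inputType, inputID, source string) string {
	return strings.Join([]string{inputType, inputID, source}, "::")
}

func main() {
	id := "httpjson-okta.system-028ecf4b-babe-44c6-939e-9e3096af6959"
	fmt.Println(stateIndexName(id))
	fmt.Println(stateDocID("httpjson", id, "https://dev-36006609.okta.com/api/v1/logs"))
}
```

Because the index is keyed by input ID alone, a replacement agent running the same policy would resolve to the same index.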

Currently, running the agent with Elasticsearch state storage requires two environment variables:

sudo AGENTLESS_ELASTICSEARCH_STATE_STORE_ENABLED=1 AGENTLESS_ELASTICSEARCH_APIKEY=xxxxxxxx-xvpDXfB:jVMRsW7SRIxxxxxxxxx ./elastic-agent -e

where the ApiKey in the

DEPENDENCIES / TODOS:

  • Approval of teams for this approach
  • A Kibana (?) side change is required for bootstrapping the agentless-state index template
  • A change in Kibana or the integration package (or both) is required to include the permissions for agentless-state- in the Elasticsearch ApiKey (removing the hack). I suspect that the Kibana Fleet code could be modified to recognize integrations that support agentless and include the proper agentless-state index name in the ApiKey permissions.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

Disruptive User Impact

The change should have no impact: without the feature enabled, Filebeat should work as before, using file system storage for the state.


This is an automatic backport of pull request #41446 done by [Mergify](https://mergify.com).

@mergify mergify bot added backport conflicts There is a conflict in the backported pull request labels Jan 28, 2025
@mergify mergify bot requested review from a team as code owners January 28, 2025 15:27

mergify bot commented Jan 28, 2025

Cherry-pick of 8180f23 has failed:

On branch mergify/bp/8.x/pr-41446
Your branch is up to date with 'origin/8.x'.

You are currently cherry-picking commit 8180f23fb.
  (fix conflicts and run "git cherry-pick --continue")
  (use "git cherry-pick --skip" to skip this patch)
  (use "git cherry-pick --abort" to cancel the cherry-pick operation)

Changes to be committed:
	modified:   filebeat/beater/store.go
	new file:   filebeat/features/features.go
	new file:   filebeat/features/features_test.go
	modified:   filebeat/input/filestream/environment_test.go
	modified:   filebeat/input/filestream/input_test.go
	modified:   filebeat/input/filestream/internal/input-logfile/manager.go
	modified:   filebeat/input/filestream/internal/input-logfile/store.go
	modified:   filebeat/input/filestream/internal/input-logfile/store_test.go
	modified:   filebeat/input/journald/environment_test.go
	modified:   filebeat/input/journald/input_filtering_test.go
	modified:   filebeat/input/v2/input-cursor/manager.go
	modified:   filebeat/input/v2/input-cursor/store.go
	modified:   filebeat/input/v2/input-cursor/store_test.go
	modified:   filebeat/registrar/registrar.go
	modified:   libbeat/statestore/backend/backend.go
	new file:   libbeat/statestore/backend/es/error.go
	new file:   libbeat/statestore/backend/es/notifier.go
	new file:   libbeat/statestore/backend/es/notifier_test.go
	new file:   libbeat/statestore/backend/es/registry.go
	new file:   libbeat/statestore/backend/es/store.go
	modified:   libbeat/statestore/backend/memlog/store.go
	modified:   libbeat/statestore/mock_test.go
	modified:   libbeat/statestore/store.go
	modified:   libbeat/statestore/storetest/storetest.go
	modified:   x-pack/filebeat/input/awss3/states.go
	modified:   x-pack/filebeat/input/awss3/states_test.go
	modified:   x-pack/filebeat/input/salesforce/input_manager_test.go
	modified:   x-pack/filebeat/tests/integration/managerV2_test.go

Unmerged paths:
  (use "git add <file>..." to mark resolution)
	both modified:   filebeat/beater/filebeat.go

To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally

@mergify mergify bot requested review from rdner and khushijain21 and removed request for a team January 28, 2025 15:27
@botelastic botelastic bot added the needs_team Indicates that the issue/PR needs a Team:* label label Jan 28, 2025

cla-checker-service bot commented Jan 28, 2025

💚 CLA has been signed

@github-actions github-actions bot added enhancement Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team Team:Security-Deployment and Devices Deployment and Devices Team in Security Solution labels Jan 28, 2025
@elasticmachine

Pinging @elastic/elastic-agent-data-plane (Team:Elastic-Agent-Data-Plane)

@elasticmachine

Pinging @elastic/sec-deployment-and-devices (Team:Security-Deployment and Devices)

@botelastic botelastic bot removed the needs_team Indicates that the issue/PR needs a Team:* label label Jan 28, 2025
@orestisfl orestisfl enabled auto-merge (squash) January 28, 2025 20:52
…41446)

This enables Elasticsearch as State Store Backend for Security Integrations for
the Agentless solution.

The scope of this change was narrowed down to supporting only `httpjson` inputs,
in order to support the Okta integration for the initial release. All other
integration inputs still use file storage as before.
This is a short-term solution for state storage in k8s.

The feature currently can only be enabled with the
`AGENTLESS_ELASTICSEARCH_STATE_STORE_INPUT_TYPES` env var.
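
A minimal sketch of how such a per-input-type switch could be evaluated is shown below. It assumes the variable holds a comma-separated list of input types, which the text above does not specify; `parseInputTypes` is a hypothetical helper, not the function used in this PR.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// envInputTypes is the environment variable named in the commit message.
const envInputTypes = "AGENTLESS_ELASTICSEARCH_STATE_STORE_INPUT_TYPES"

// parseInputTypes turns a comma-separated list into a set of input types.
// The comma-separated format is an assumption made for this sketch.
func parseInputTypes(raw string) map[string]bool {
	enabled := make(map[string]bool)
	for _, t := range strings.Split(raw, ",") {
		if t = strings.TrimSpace(t); t != "" {
			enabled[t] = true
		}
	}
	return enabled
}

func main() {
	enabled := parseInputTypes(os.Getenv(envInputTypes))
	for _, inputType := range []string{"httpjson", "cel", "filestream"} {
		backend := "file" // default: file system storage, as before
		if enabled[inputType] {
			backend = "elasticsearch"
		}
		fmt.Printf("%s -> %s\n", inputType, backend)
	}
}
```

With the variable unset, every input type falls back to file storage, which matches the stated intent that behavior is unchanged when the feature is not enabled.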

The existing code relied on the inputs' state storage being fully configurable
before the main beat manager runs. This change delays the configuration of the
`httpjson` input until the actual configuration is received from the
Agent.

Example of the state storage index content for Okta integration:
```
{
  "took": 6,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 1,
      "relation": "eq"
    },
    "max_score": 1,
    "hits": [
      {
        "_index": "agentless-state-httpjson-okta.system-028ecf4b-babe-44c6-939e-9e3096af6959",
        "_id": "httpjson::httpjson-okta.system-028ecf4b-babe-44c6-939e-9e3096af6959::https://dev-36006609.okta.com/api/v1/logs",
        "_seq_no": 39,
        "_primary_term": 1,
        "_score": 1,
        "_source": {
          "v": {
            "ttl": 1800000000000,
            "updated": "2024-10-24T20:21:22.032Z",
            "cursor": {
              "published": "2024-10-24T20:19:53.542Z"
            }
          }
        }
      }
    ]
  }
}
```

The naming convention for every state store index is `agentless-state-<input id>`,
because for agentless we expect only one agent per policy and
the agents are ephemeral.

Closes elastic/security-team#11101

Co-authored-by: Aleksandr Maus <[email protected]>
Co-authored-by: Orestis Floros <[email protected]>
(cherry picked from commit 8180f23)
@orestisfl orestisfl force-pushed the mergify/bp/8.x/pr-41446 branch from 0cde111 to 5074501 Compare January 29, 2025 10:56
@orestisfl orestisfl disabled auto-merge January 29, 2025 10:56
@orestisfl orestisfl enabled auto-merge (squash) January 29, 2025 10:57
@orestisfl orestisfl merged commit 911f527 into 8.x Jan 29, 2025
141 checks passed
@orestisfl orestisfl deleted the mergify/bp/8.x/pr-41446 branch January 29, 2025 12:36
Labels
backport conflicts There is a conflict in the backported pull request enhancement Team:Elastic-Agent-Data-Plane Label for the Agent Data Plane team Team:Security-Deployment and Devices Deployment and Devices Team in Security Solution
2 participants