Feature: validation of detections against cms_main #303

Merged Mar 26, 2025 (35 commits)

@cmcginley-splunk cmcginley-splunk commented Oct 8, 2024

Context

  • We need to validate that our detections not only work in ES, but are also being parsed correctly by the content versioning service

Code change

  • Added ContentVersioningService
    • Enables content versioning
    • Activates CMS parser
    • Validates detections against CMS events, reports any missing detections or other validation issues. Validations include:
      • The number of events in cms_main matches the number of detections derived from the YAMLs
      • Each detection can be matched to a cms_event
      • Each cms_event can be matched to a detection
      • No cms_event is repeated
      • UUIDs match between cms/detection pairs
      • Versions match between cms/detection pairs
      • Full search name matches between cms/detection pairs
      • Correlationsearch label matches between cms/detection pairs
    • Returns a list of setup functions which are optionally inserted into the main setup of DetectionTestingInfrastructure
  • Errors are now aggregated across all threads (setup, testing, view, view shutdown) and reported en masse
  • Broke up setup functions list to allow for optional inclusion of the content versioning setup functions
  • NOTE: Now we raise any exceptions that crop up during setup, instead of just logging them
  • Added logic to DetectionTestingInfrastructure to query installed ES version, so we can determine when it is appropriate to run content versioning validation
    • Content versioning validation is only run when 1) integration testing is enabled, 2) ES 8.0.0+ is installed, and 3) the testing mode is All
  • Moved logging utility in CorrelationSearch (useful for debugging) to helper.utils.Utils so it could be re-used in ContentVersioningService
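
For readers skimming the description, here is a minimal sketch of the gating condition described above (integration testing enabled, ES 8.0.0+ installed, testing mode All). It is not the code from this PR: `RunConfig`, `parse_version`, and `should_validate_content_versioning` are hypothetical names, and the mode is modelled as the plain string "all". It also assumes ES versions are dotted integers with three parts.

```python
from dataclasses import dataclass
from typing import Optional

# Minimum ES version named in the PR description.
MIN_ES_VERSION = (8, 0, 0)


def parse_version(version: str) -> tuple:
    """Turn '8.0.0' into (8, 0, 0) so that comparison is numeric, not lexical."""
    return tuple(int(part) for part in version.split("."))


@dataclass
class RunConfig:
    # Hypothetical stand-in for the real test configuration object.
    integration_testing_enabled: bool
    mode: str  # assumed to be "all" when every detection is tested


def should_validate_content_versioning(
    config: RunConfig, es_version: Optional[str]
) -> bool:
    """Return True only when all three conditions from the description hold."""
    if es_version is None:
        # ES is not installed, so there is nothing to validate against.
        return False
    return (
        config.integration_testing_enabled
        and config.mode == "all"
        and parse_version(es_version) >= MIN_ES_VERSION
    )
```

Comparing parsed tuples rather than raw strings matters once a two-digit major version appears: as strings, "10.0.0" sorts before "8.0.0".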

Testing

  • I've tested locally w/ some quick and dirty edits to the savedsearches.conf file and all seems to work well
  • Ran against escu-production and all results passed (see attached truncated log)
[Screenshot 2024-10-08 at 12:38:55 PM]

content_versioning_service.log

Caveats

  • Note that we now raise exceptions from setup, instead of logging and dropping them; I believe this is desirable
  • I went back and forth as to whether this validation should happen as part of setup, OR if I should create some new type of test case. I opted for the former as this type of testing doesn't fit our current testing paradigm; validation against the cms_main index should happen in aggregate, a test of the whole package, not individual detections. We do not have an easy way to insert testing at that level presently.
  • Additionally, this validation probably only needs to be run against a single instance and not necessarily all instances in a single run, but I don't see an easy or clean way to do that either, and there's no harm or impact on execution time as far as I can tell
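
To illustrate the first caveat, here is a rough sketch of collecting exceptions from several worker threads and raising them together once every thread has finished, rather than logging and dropping them. This is an assumption about the general shape, not the PR's implementation: `AggregateError` and `run_all` are hypothetical names.

```python
import threading


class AggregateError(Exception):
    """Hypothetical container carrying every exception raised by the workers."""

    def __init__(self, errors):
        self.errors = list(errors)
        summary = "; ".join(f"{type(e).__name__}: {e}" for e in self.errors)
        super().__init__(f"{len(self.errors)} error(s): {summary}")


def run_all(tasks):
    """Run each (name, callable) pair in its own thread and raise once at the end."""
    errors = []
    lock = threading.Lock()

    def worker(name, func):
        try:
            func()
        except Exception as exc:
            # Record the failure instead of letting it die with the thread.
            with lock:
                errors.append(RuntimeError(f"[{name}] {exc}"))

    threads = [
        threading.Thread(target=worker, args=(name, func)) for name, func in tasks
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

    if errors:
        # Report everything en masse rather than only the first failure.
        raise AggregateError(errors)
```

The caller sees a single exception that lists every failure, so a problem during setup is no longer hidden behind a log line.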

TODOs

  • disable logging
  • remove TODOs or create follow up issues
  • Test in CI when able

@cmcginley-splunk cmcginley-splunk changed the title Feature/validation against cms main Feature: validation of detections against cms_main Oct 8, 2024
@cmcginley-splunk cmcginley-splunk marked this pull request as ready for review October 8, 2024 19:50
@pyth0n1c pyth0n1c added the Draft label Nov 6, 2024
f"[{self.infrastructure.instance_name}] {offset}: Matching cms_main entry "
f"'{cms_entry_name}' against detections"
)
ptrn = re.compile(r"^" + self.global_config.app.label + r" - (?P<stripped_cms_entry_name>.+) - Rule$")

Contributor

I wonder if it's better to go in reverse now that we have the following function to generate a search name:

def get_action_dot_correlationsearch_dot_label(self, app:CustomApp, max_stanza_length:int=ES_MAX_STANZA_LENGTH)->str:

This also means if we change the template used in the link above to generate the detection name, we don't have to change it again in this function.

Contributor Author

Ahhhhh I don't think I knew about this function, I'll take a look!

Contributor

This one is FAIRLY new. In the future, it might be a nice idea to actually associate the app config with a detection so that we don't have to carry/pass it around, making action_dot_correlationsearch_dot_label a cached_property instead of a function that takes an argument.
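
A minimal sketch of what that suggestion might look like, assuming the detection holds a reference to its app config. The `App` and `Detection` classes here are simplified, hypothetical stand-ins, and the label template is inferred from the regex in the diff above ("<app label> - <name> - Rule"). The `max_stanza_length` handling of the real function is deliberately left out.

```python
from dataclasses import dataclass
from functools import cached_property


@dataclass(frozen=True)
class App:
    # Hypothetical, simplified app config: only the label is modelled.
    label: str


class Detection:
    """Simplified stand-in for a detection that carries its own app config."""

    def __init__(self, name: str, app: App):
        self.name = name
        self.app = app

    @cached_property
    def action_dot_correlationsearch_dot_label(self) -> str:
        # Computed on first access, then stored on the instance.
        # Template assumed from the regex in the diff; truncation is omitted.
        return f"{self.app.label} - {self.name} - Rule"
```

With the app attached, callers read `detection.action_dot_correlationsearch_dot_label` as an attribute and no longer need to pass the app config around.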

Contributor Author

Updated this block to use this function for matching instead of regex
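
Roughly, the "reverse" direction discussed in this thread amounts to generating the expected label for each detection and looking the cms_main entry name up directly, instead of stripping the entry name with a regex. The sketch below is illustrative only: `expected_label` and `build_label_lookup` are hypothetical helpers, and the template is assumed from the original regex rather than taken from the real label function.

```python
def expected_label(app_label: str, detection_name: str) -> str:
    """Build the label the way the original regex implies it is laid out."""
    return f"{app_label} - {detection_name} - Rule"


def build_label_lookup(app_label: str, detection_names) -> dict:
    """Map each expected label back to the detection name it was built from."""
    return {expected_label(app_label, name): name for name in detection_names}


# Usage: resolve a cms_main entry name with a dictionary lookup.
lookup = build_label_lookup("ESCU", ["Detect Foo", "Detect Bar"])
matched = lookup.get("ESCU - Detect Foo - Rule")  # "Detect Foo"
unmatched = lookup.get("Other App - Detect Foo - Rule")  # None
```

Because the label is produced in one place, a change to the template does not have to be mirrored in a separate pattern.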

@pyth0n1c pyth0n1c left a comment

Left some comments, but out of curiosity, not because changes are required.
I will do a bit more local testing of these changes then get them merged and do a new contentctl release!

)
self.logger.error(msg)
return Exception(msg)
elif cms_event["version"] != f"{detection.version}.1":

Contributor

Are there any other fields we can/should be validating as well? For example, the search field itself, given that we have seen some consistency issues there in the past? Although we can argue that things which actually impact the search at runtime are caught during a later phase of the integration test.

Or is this an issue of "yes, we could probably check this, but then should we check EVERY field?"

Contributor Author

So, yes! Hahah I have an open ticket PEX-509 (see comment on line 473) to look into this. I think there's a lot of room to validate more here

@pyth0n1c pyth0n1c left a comment

Re-approving after additional runtime testing. Nice, descriptive errors, and I like that testing bails as SOON as we detect this error - it means we don't waste time on a full integration test (which is costly in both time AND resources) when we know, early on, that there has been a CMS issue.

These errors were triggered by running a test (with the latest security_content) and all apps, then rerunning the test after making some changes to YML. Since the changes were made only in YML, they should not be reflected in cms_main and SHOULD show up as errors:
  • create new detection with new name - NOT FOUND IN CMS MAIN
  • create new detection with new name - CMS MAIN CONTAINS 1 LESS OBJECT THAN EXPECTED
  • change the version of an existing detection - WRONG VERSION IN CMS MAIN
  • change the uuid of an existing detection - UUID MISMATCH WARNING

These errors are exactly what we would expect to find. Great work!
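
For anyone reading this thread later, the mismatch cases above (missing detection, count mismatch, wrong version, UUID mismatch) could be surfaced by checks along the following lines. This is a toy model, not the PR's code: the `Detection` fields, the event keys "name" and "uuid", and the `validate` helper are assumptions. Only the "version" key and the `.1` suffix come from the diff, and the separate search-name and correlationsearch-label checks are collapsed into a single label lookup.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    # Simplified, hypothetical model of a detection loaded from YML.
    name: str
    id: str
    version: int


def validate(detections, cms_events, app_label="ESCU"):
    """Return a list of human-readable validation errors (empty when all is well)."""
    errors = []
    expected = {f"{app_label} - {d.name} - Rule": d for d in detections}

    # Counts must agree between cms_main and the detections.
    if len(cms_events) != len(expected):
        errors.append(
            f"count mismatch: {len(cms_events)} cms_main events, "
            f"{len(expected)} detections"
        )

    seen = set()
    for event in cms_events:
        name = event["name"]
        if name in seen:
            errors.append(f"repeated cms_main event: {name}")
            continue
        seen.add(name)

        detection = expected.get(name)
        if detection is None:
            errors.append(f"no detection matches cms_main event: {name}")
            continue
        if event["uuid"] != detection.id:
            errors.append(f"UUID mismatch for {name}")
        # The diff compares against the detection version with a '.1' suffix.
        if event["version"] != f"{detection.version}.1":
            errors.append(f"wrong version in cms_main for {name}")

    # Every detection must also appear in cms_main.
    for name in sorted(expected.keys() - seen):
        errors.append(f"not found in cms_main: {name}")
    return errors
```

Returning a list instead of raising on the first problem lets a run report every mismatch at once.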

@pyth0n1c pyth0n1c merged commit 8f322aa into main Mar 26, 2025
16 checks passed
@cmcginley-splunk

Thanks for the thorough testing Eric!!
