diff --git a/docs/kubernetes.md b/docs/kubernetes.md index 69a966683..0569f1181 100644 --- a/docs/kubernetes.md +++ b/docs/kubernetes.md @@ -428,6 +428,8 @@ Lookup extracted file hashes with VirusTotal? (y / N): n Download updated file scanner signatures periodically? (y / N): y +Configure pulling from threat intelligence feeds for Zeek intelligence framework? (y / N): n + Should Malcolm run and maintain an instance of NetBox, an infrastructure resource modeling tool? (y / N): y Should Malcolm enrich network traffic using NetBox? (Y / n): y diff --git a/docs/malcolm-config.md b/docs/malcolm-config.md index dc2e85812..a361a9743 100644 --- a/docs/malcolm-config.md +++ b/docs/malcolm-config.md @@ -128,9 +128,9 @@ Although the configuration script automates many of the following configuration - `ZEEK_DISABLE_ICS_ALL` and `ZEEK_DISABLE_ICS_…` - if set to `true`, these variables can be used to disable Zeek's protocol analyzers for Operational Technology/Industrial Control Systems (OT/ICS) protocols - `ZEEK_DISABLE_BEST_GUESS_ICS` - see ["Best Guess" Fingerprinting for ICS Protocols](ics-best-guess.md#ICSBestGuess) - `ZEEK_EXTRACTOR_MODE` – determines the file extraction behavior for file transfers detected by Zeek; see [Automatic file extraction and scanning](file-scanning.md#ZeekFileExtraction) for more details - - `ZEEK_INTEL_FEED_SINCE` - when querying a [TAXII](zeek-intel.md#ZeekIntelSTIX), [MISP](zeek-intel.md#ZeekIntelMISP), or [Mandiant](zeek-intel.md#ZeekIntelMandiant) threat intelligence feed, only process threat indicators created or modified since the time represented by this value; it may be either a fixed date/time (`01/01/2021`) or relative interval (`30 days ago`) + - `ZEEK_INTEL_FEED_SINCE` - when querying a [TAXII](zeek-intel.md#ZeekIntelSTIX), [MISP](zeek-intel.md#ZeekIntelMISP), or [Mandiant](zeek-intel.md#ZeekIntelMandiant) threat intelligence feed, only process threat indicators created or modified since the time represented by this value; it may be 
either a fixed date/time (`01/01/2025`) or relative interval (`7 days ago`) - `ZEEK_INTEL_ITEM_EXPIRATION` - specifies the value for Zeek's [`Intel::item_expiration`](https://docs.zeek.org/en/current/scripts/base/frameworks/intel/main.zeek.html#id-Intel::item_expiration) timeout as used by the [Zeek Intelligence Framework](zeek-intel.md#ZeekIntel) (default `-1min`, which disables item expiration) - - `ZEEK_INTEL_REFRESH_CRON_EXPRESSION` - specifies a [cron expression](https://en.wikipedia.org/wiki/Cron#CRON_expression) indicating the refresh interval for generating the [Zeek Intelligence Framework](zeek-intel.md#ZeekIntel) files (defaults to empty, which disables automatic refresh) + - `ZEEK_INTEL_REFRESH_CRON_EXPRESSION` - specifies a [cron expression](https://en.wikipedia.org/wiki/Cron#CRON_expression) (using [`cronexpr`](https://github.com/aptible/supercronic/tree/master/cronexpr#implementation)-compatible syntax) indicating the refresh interval for generating the [Zeek Intelligence Framework](zeek-intel.md#ZeekIntel) files (defaults to empty, which disables automatic refresh) - `ZEEK_JA4SSH_PACKET_COUNT` - the Zeek [JA4+ plugin](https://github.com/FoxIO-LLC/ja4) calculates the JA4SSH value once for every *x* SSH packets; *x* is set here (default `200`) - `ZEEK_LIVE_CAPTURE` - if set to `true`, Zeek will monitor live traffic on the local interface(s) defined by `PCAP_IFACE` + See [**Tuning Zeek**](live-analysis.md#LiveAnalysisTuningZeek) for other variables related to managing Zeek's performance and resource utilization. 
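As an illustration of the two value forms that `ZEEK_INTEL_FEED_SINCE` accepts, the sketch below resolves either a fixed date or a relative interval to a cutoff `datetime`. It uses only the Python standard library. The helper name `resolve_since`, the supported units, and the `MM/DD/YYYY` reading of the fixed date are assumptions made for illustration; this diff does not show how Malcolm itself parses the value.

```python
import re
from datetime import datetime, timedelta

# Hypothetical helper (not part of Malcolm): map a "since" value to a cutoff datetime.
# Matches relative intervals such as "7 days ago" or "2 weeks ago".
_RELATIVE = re.compile(r"^\s*(\d+)\s+(minute|hour|day|week)s?\s+ago\s*$", re.IGNORECASE)


def resolve_since(value, now=None):
    """Return the datetime before which indicators would be ignored."""
    now = now or datetime.now()
    match = _RELATIVE.match(value)
    if match:
        amount, unit = int(match.group(1)), match.group(2).lower()
        # timedelta accepts minutes=, hours=, days= and weeks= keyword arguments
        return now - timedelta(**{f"{unit}s": amount})
    # Assumption: fixed dates follow the MM/DD/YYYY form used in the documentation
    return datetime.strptime(value.strip(), "%m/%d/%Y")


reference = datetime(2025, 1, 15, 12, 0, 0)
print(resolve_since("7 days ago", now=reference))  # 2025-01-08 12:00:00
print(resolve_since("01/01/2025"))                 # 2025-01-01 00:00:00
```

A relative value moves forward with every refresh, whereas a fixed date keeps the same cutoff for the life of the deployment; that difference is worth weighing when choosing between the two forms.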
diff --git a/docs/malcolm-hedgehog-e2e-iso-install.md b/docs/malcolm-hedgehog-e2e-iso-install.md index 708948b63..3ac4019f1 100644 --- a/docs/malcolm-hedgehog-e2e-iso-install.md +++ b/docs/malcolm-hedgehog-e2e-iso-install.md @@ -239,39 +239,49 @@ The [configuration and tuning](malcolm-config.md#ConfigAndTuning) wizard's quest + Users should answer **N** unless they plan to use SFTP/SCP to [upload](upload.md#Upload) PCAP files to Malcolm; answering **Y** will expose TCP port 8022 in Malcolm's firewall for SFTP/SCP connections * **Enable file extraction with Zeek?** - Answer **Y** to indicate that Zeek should [extract files](file-scanning.md#ZeekFileExtraction) transferred in observed network traffic. -* **Select file extraction behavior** - - This determines which files Zeek should extract for scanning: - + `none`: no file extraction - + `interesting`: extraction of files with mime types of common attack vectors - + `mapped`: extraction of files with recognized mime types - + `known`: extraction of files for which any mime type can be determined - + `all`: extract all files - + `notcommtxt`: extract all files except common plain text files -* **Select file preservation behavior** - - This determines the behavior for preservation of Zeek-extracted files: - + `quarantined`: preserve only flagged files in `./zeek-logs/extract_files/quarantine` - + `all`: preserve flagged files in `./zeek-logs/extract_files/quarantine` and all other extracted files in `./zeek-logs/extract_files/preserved` - + `none`: preserve no extracted files -* **Enter maximum allowed space for Zeek-extracted files (e.g., 250GB) or file system fill threshold (e.g., 90%)** - - Files [extracted by Zeek](file-scanning.md#ZeekFileExtraction) can be periodically pruned to ensure the disk storage they consume does not exceed a user-specified threshold. See the documentation on [managing Malcolm's disk usage](malcolm-config.md#DiskUsage) for more information. 
-* **Expose web interface for downloading preserved files?** - - Answering **Y** enables access to the Zeek-extracted files path through the means of a simple HTTPS directory server at **https:///extracted-files/**. Beware that Zeek-extracted files may contain malware. -* **ZIP downloaded preserved files?** - - Answering **Y** will cause that Zeek-extracted files downloaded as described under the previous question will be archived using the ZIP file format. -* **Enter ZIP archive password for downloaded preserved files (or leave blank for unprotected)** and **Enter AES-256-CBC encryption password for downloaded preserved files (or leave blank for unencrypted)** - - A non-blank value will be used as either the ZIP archive file password (if the previous question was answered **Y**) or as the encryption key for the file to be AES-256-CBC-encrypted in an `openssl enc`-compatible format (e.g., `openssl enc -aes-256-cbc -d -in example.exe.encrypted -out example.exe`). -* **Scan extracted files with ClamAV?** - - Answer **Y** to scan extracted files with [ClamAV](https://www.clamav.net/), an antivirus engine. -* **Scan extracted files with Yara?** - - Answer **Y** to scan extracted files with [Yara](https://github.com/VirusTotal/yara), a tool used to identify and classify malware samples. -* **Scan extracted PE files with Capa?** - - Answer **Y** to scan extracted executable files with [Capa](https://github.com/fireeye/capa), a tool for detecting capabilities in executable files. -* **Lookup extracted file hashes with VirusTotal?** - - Answer **Y** to be prompted for a [**VirusTotal**](https://www.virustotal.com/en/#search) API key, which will be used for submitting the hashes of extracted files. Only specify this option if the Malcolm instance has Internet connectivity. 
-* **Enter VirusTotal API key** - - Specify the [**VirusTotal**](https://www.virustotal.com/en/#search) [API key](https://support.virustotal.com/hc/en-us/articles/115002100149-API) as indicated under the previous question. -* **Download updated file scanner signatures periodically?** - - If the Malcolm instance has Internet connectivity, answer **Y** to enable periodic downloads of signatures used by ClamAV and YARA. + - **Select file extraction behavior** + + This determines which files Zeek should extract for scanning: + * `none`: no file extraction + * `interesting`: extraction of files with mime types of common attack vectors + * `mapped`: extraction of files with recognized mime types + * `known`: extraction of files for which any mime type can be determined + * `all`: extract all files + * `notcommtxt`: extract all files except common plain text files + - **Select file preservation behavior** + + This determines the behavior for preservation of Zeek-extracted files: + * `quarantined`: preserve only flagged files in `./zeek-logs/extract_files/quarantine` + * `all`: preserve flagged files in `./zeek-logs/extract_files/quarantine` and all other extracted files in `./zeek-logs/extract_files/preserved` + * `none`: preserve no extracted files + - **Enter maximum allowed space for Zeek-extracted files (e.g., 250GB) or file system fill threshold (e.g., 90%)** + + Files [extracted by Zeek](file-scanning.md#ZeekFileExtraction) can be periodically pruned to ensure the disk storage they consume does not exceed a user-specified threshold. See the documentation on [managing Malcolm's disk usage](malcolm-config.md#DiskUsage) for more information. + - **Expose web interface for downloading preserved files?** + + Answering **Y** enables access to the Zeek-extracted files path through the means of a simple HTTPS directory server at **https:///extracted-files/**. Beware that Zeek-extracted files may contain malware. 
+ - **ZIP downloaded preserved files?** + + Answering **Y** causes Zeek-extracted files downloaded as described under the previous question to be archived using the ZIP file format. + - **Enter ZIP archive password for downloaded preserved files (or leave blank for unprotected)** and **Enter AES-256-CBC encryption password for downloaded preserved files (or leave blank for unencrypted)** + + A non-blank value will be used as either the ZIP archive file password (if the previous question was answered **Y**) or as the encryption key for the file to be AES-256-CBC-encrypted in an `openssl enc`-compatible format (e.g., `openssl enc -aes-256-cbc -d -in example.exe.encrypted -out example.exe`). + - **Scan extracted files with ClamAV?** + + Answer **Y** to scan extracted files with [ClamAV](https://www.clamav.net/), an antivirus engine. + - **Scan extracted files with Yara?** + + Answer **Y** to scan extracted files with [Yara](https://github.com/VirusTotal/yara), a tool used to identify and classify malware samples. + - **Scan extracted PE files with Capa?** + + Answer **Y** to scan extracted executable files with [Capa](https://github.com/fireeye/capa), a tool for detecting capabilities in executable files. + - **Lookup extracted file hashes with VirusTotal?** + + Answer **Y** to be prompted for a [**VirusTotal**](https://www.virustotal.com/en/#search) API key, which will be used for submitting the hashes of extracted files. Only specify this option if the Malcolm instance has Internet connectivity. + - **Enter VirusTotal API key** + + Specify the [**VirusTotal**](https://www.virustotal.com/en/#search) [API key](https://support.virustotal.com/hc/en-us/articles/115002100149-API) as indicated under the previous question. + - **Download updated file scanner signatures periodically?** + + If the Malcolm instance has Internet connectivity, answer **Y** to enable periodic downloads of signatures used by ClamAV and YARA. 
+* **Configure pulling from threat intelligence feeds for Zeek intelligence framework?** + - Answer **Y** to configure pulling from threat intelligence feeds to populate the [Zeek intelligence framework](zeek-intel.md#ZeekIntel). Answer **N** to leave settings for pulling from threat intelligence feeds unmodified. + - **Pull from threat intelligence feeds on startup?** + + Answer **Y** for Malcolm to pull from threat intelligence feeds when the `zeek-offline` container starts up. + - **Cron expression for scheduled pulls from threat intelligence feeds** + + Specifies a [cron expression](https://en.wikipedia.org/wiki/Cron#CRON_expression) (using [`cronexpr`](https://github.com/aptible/supercronic/tree/master/cronexpr#implementation)-compatible syntax) indicating the refresh interval for generating the [Zeek Intelligence Framework](zeek-intel.md#ZeekIntel) files. + - **Threat indicator "since" period** + + When querying a [TAXII](zeek-intel.md#ZeekIntelSTIX), [MISP](zeek-intel.md#ZeekIntelMISP), or [Mandiant](zeek-intel.md#ZeekIntelMandiant) threat intelligence feed, only process threat indicators created or modified since the time represented by this value; it may be either a fixed date/time (`01/01/2025`) or relative interval (`7 days ago`). + - **`Intel::item_expiration` timeout for intelligence items (`-1min` to disable)** + + Specifies the value for Zeek's [`Intel::item_expiration`](https://docs.zeek.org/en/current/scripts/base/frameworks/intel/main.zeek.html#id-Intel::item_expiration) timeout as used by the [Zeek Intelligence Framework](zeek-intel.md#ZeekIntel) (default `-1min`, which disables item expiration). * **Should Malcolm run and maintain an instance of NetBox, an infrastructure resource modeling tool?** - Answer **Y** to enable [NetBox](https://netbox.dev/), a tool for modeling networks and documenting network assets. 
* **Should Malcolm enrich network traffic using NetBox?** diff --git a/docs/ubuntu-install-example.md b/docs/ubuntu-install-example.md index 1bbccc16e..2124f650e 100644 --- a/docs/ubuntu-install-example.md +++ b/docs/ubuntu-install-example.md @@ -190,6 +190,8 @@ Lookup extracted file hashes with VirusTotal? (y / N): n Download updated file scanner signatures periodically? (Y / n): n +Configure pulling from threat intelligence feeds for Zeek intelligence framework? (y / N): n + Should Malcolm run and maintain an instance of NetBox, an infrastructure resource modeling tool? (y / N): n 1: no diff --git a/docs/zeek-intel.md b/docs/zeek-intel.md index 84010e202..8e9136aeb 100644 --- a/docs/zeek-intel.md +++ b/docs/zeek-intel.md @@ -20,6 +20,8 @@ docker compose exec --user $(id -u) zeek /usr/local/bin/docker_entrypoint.sh tru As multiple instances of this container may be running in a Malcolm deployment (i.e., a `zeek-live` container for [monitoring local network interfaces](live-analysis.md#LocalPCAP) and a `zeek` container for scanning [uploaded PCAPs](upload.md#Upload)), only the non-live container is responsible for creating and managing the Zeek intel files, which are then shared and used by both types of container instances. +Additional settings governing Malcolm's behavior when pulling from threat intelligence feeds may be specified during Malcolm configuration (see the [**end-to-end Malcolm installation example**](malcolm-hedgehog-e2e-iso-install.md#MalcolmConfig)). + For a public example of Zeek intelligence files, see Critical Path Security's [repository](https://github.com/CriticalPathSecurity/Zeek-Intelligence-Feeds), which aggregates data from various other threat feeds into Zeek's format. 
## STIX™ and TAXII™ diff --git a/scripts/install.py b/scripts/install.py index d9bd17b23..e539bb96c 100755 --- a/scripts/install.py +++ b/scripts/install.py @@ -116,6 +116,7 @@ ################################################################################################### args = None +raw_args = None requests_imported = None yaml_imported = None kube_imported = None @@ -152,10 +153,11 @@ class ConfigOptions(IntEnum): Enrichment = 19 OpenPorts = 20 FileCarving = 21 - NetBox = 22 - Capture = 23 - DarkMode = 24 - PostConfig = 25 + ZeekIntel = 22 + NetBox = 23 + Capture = 24 + DarkMode = 25 + PostConfig = 26 ################################################################################################### @@ -520,6 +522,7 @@ def install_malcolm_files(self, malcolm_install_file, default_config_dir): # ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ def tweak_malcolm_runtime(self, malcolm_install_path): global args + global raw_args global dotenv_imported configFiles = [] @@ -622,6 +625,11 @@ def tweak_malcolm_runtime(self, malcolm_install_path): indexSnapshotCompressed = False behindReverseProxy = False dockerNetworkExternalName = "" + zeekIntelParamsProvided = False + zeekIntelCronExpression = '0 0 * * *' + zeekIntelFeedSince = '7 days ago' + zeekIntelItemExipration = '-1min' + zeekIntelOnStartup = True prevStep = None currentStep = ConfigOptions.Preconfig @@ -1761,6 +1769,73 @@ def tweak_malcolm_runtime(self, malcolm_install_path): if (vtotApiKey is None) or (len(vtotApiKey) <= 1): vtotApiKey = '0' + ################################################################################### + elif currentStep == ConfigOptions.ZeekIntel: + if zeekIntelParamsProvided := InstallerYesOrNo( + 'Configure pulling from threat intelligence feeds for Zeek intelligence framework?', + default=any( + [ + x in raw_args + for x in [ + '--zeek-intel-on-startup', + '--zeek-intel-feed-since', + '--zeek-intel-cron-expression', + 
'--zeek-intel-item-expiration', + ] + ] + ), + extraLabel=BACK_LABEL, + ): + zeekIntelOnStartup = InstallerYesOrNo( + 'Pull from threat intelligence feeds on startup?', + default=args.zeekIntelOnStartup, + extraLabel=BACK_LABEL, + ) + + # https://stackoverflow.com/a/67419837 + cronRegex = re.compile( + r"(^((\*\/)?([0-5]?[0-9])((\,|\-|\/)([0-5]?[0-9]))*|\*)\s+((\*\/)?((2[0-3]|1[0-9]|[0-9]|00))((\,|\-|\/)(2[0-3]|1[0-9]|[0-9]|00))*|\*)\s+((\*\/)?([1-9]|[12][0-9]|3[01])((\,|\-|\/)([1-9]|[12][0-9]|3[01]))*|\*)\s+((\*\/)?([1-9]|1[0-2])((\,|\-|\/)([1-9]|1[0-2]))*|\*|(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec))\s+((\*\/)?[0-6]((\,|\-|\/)[0-6])*|\*|00|(sun|mon|tue|wed|thu|fri|sat))\s*$)|@(annually|yearly|monthly|weekly|daily|hourly)" + ) + zeekIntelCronExpression = '_invalid_' + loopBreaker = CountUntilException(MaxAskForValueCount, 'Invalid cron expression') + while loopBreaker.increment(): + zeekIntelCronExpression = InstallerAskForString( + 'Cron expression for scheduled pulls from threat intelligence feeds', + default=args.zeekIntelCronExpression, + extraLabel=BACK_LABEL, + ) + if len(zeekIntelCronExpression) == 0: + if InstallerYesOrNo( + 'An empty cron expression will disable scheduled threat intelligence updates, are you sure?', + default=False, + extraLabel=BACK_LABEL, + ): + break + elif cronRegex.match(zeekIntelCronExpression): + break + + zeekIntelFeedSince = '' + loopBreaker = CountUntilException(MaxAskForValueCount, 'Invalid "since" period') + while (len(zeekIntelFeedSince) <= 0) and loopBreaker.increment(): + zeekIntelFeedSince = InstallerAskForString( + 'Threat indicator "since" period', + default=args.zeekIntelFeedSince, + extraLabel=BACK_LABEL, + ) + + zeekIntelItemExipration = '' + loopBreaker = CountUntilException(MaxAskForValueCount, 'Invalid Intel::item_expiration timeout') + while (len(zeekIntelItemExipration) <= 0) and loopBreaker.increment(): + zeekIntelItemExipration = InstallerAskForString( + "Intel::item_expiration timeout for intelligence 
items (-1min to disable)", + default=args.zeekIntelItemExipration, + extraLabel=BACK_LABEL, + ) + + InstallerDisplayMessage( + f'Place feed definitions in\n\n * TAXII - {os.path.join(malcolm_install_path, "zeek/intel/STIX/taxii.yaml")}\n * MISP - {os.path.join(malcolm_install_path, "zeek/intel/MISP/misp.yaml")}\n * Mandiant - {os.path.join(malcolm_install_path, "zeek/intel/Mandiant/mandiant.yaml")}\n\nSee Zeek Intelligence Framework in Malcolm documentation.', + ) + ################################################################################### elif currentStep == ConfigOptions.NetBox: # NetBox @@ -1963,520 +2038,633 @@ def tweak_malcolm_runtime(self, malcolm_install_path): shutil.copyfile(envExampleFile, envFile) # define environment variables to be set in .env files - EnvValue = namedtuple("EnvValue", ["envFile", "key", "value"], rename=False) + EnvValue = namedtuple("EnvValue", ["provided", "envFile", "key", "value"], rename=False) EnvValues = [ # Whether or not Arkime is allowed to delete uploaded/captured PCAP EnvValue( + True, os.path.join(args.configDir, 'arkime.env'), 'MANAGE_PCAP_FILES', TrueOrFalseNoQuote(arkimeManagePCAP), ), # Threshold for Arkime PCAP deletion EnvValue( + True, os.path.join(args.configDir, 'arkime.env'), 'ARKIME_FREESPACEG', arkimeFreeSpaceG, ), # live traffic analysis with Arkime capture (only available with remote opensearch or elasticsearch) EnvValue( + True, os.path.join(args.configDir, 'arkime-live.env'), 'ARKIME_LIVE_CAPTURE', TrueOrFalseNoQuote(liveArkime), ), # capture source "node host" for live Arkime capture EnvValue( + True, os.path.join(args.configDir, 'arkime-live.env'), 'ARKIME_LIVE_NODE_HOST', liveArkimeNodeHost, ), # rotated captured PCAP analysis with Arkime (not live capture) EnvValue( + True, os.path.join(args.configDir, 'arkime-offline.env'), 'ARKIME_ROTATED_PCAP', TrueOrFalseNoQuote(autoArkime and (not liveArkime)), ), # automatic uploaded pcap analysis with Arkime EnvValue( + True, 
os.path.join(args.configDir, 'arkime-offline.env'), 'ARKIME_AUTO_ANALYZE_PCAP_FILES', TrueOrFalseNoQuote(autoArkime), ), # Should Arkime use an ILM policy? EnvValue( + True, os.path.join(args.configDir, 'arkime.env'), 'INDEX_MANAGEMENT_ENABLED', TrueOrFalseNoQuote(indexManagementPolicy), ), # Should Arkime use a hot/warm design in which non-session data is stored in a warm index? (see https://arkime.com/faq#ilm) EnvValue( + True, os.path.join(args.configDir, 'arkime.env'), 'INDEX_MANAGEMENT_HOT_WARM_ENABLED', TrueOrFalseNoQuote(indexManagementHotWarm), ), # Time in hours/days before (moving Arkime indexes to warm) and force merge (number followed by h or d), default 30 EnvValue( + True, os.path.join(args.configDir, 'arkime.env'), 'INDEX_MANAGEMENT_OPTIMIZATION_PERIOD', indexManagementOptimizationTimePeriod, ), # Time in hours/days before deleting Arkime indexes (number followed by h or d), default 90 EnvValue( + True, os.path.join(args.configDir, 'arkime.env'), 'INDEX_MANAGEMENT_RETENTION_TIME', indexManagementSpiDataRetention, ), # Number of replicas for older sessions indices in the ILM policy, default 0 EnvValue( + True, os.path.join(args.configDir, 'arkime.env'), 'INDEX_MANAGEMENT_OLDER_SESSION_REPLICAS', indexManagementReplicas, ), # Number of weeks of history to keep, default 13 EnvValue( + True, os.path.join(args.configDir, 'arkime.env'), 'INDEX_MANAGEMENT_HISTORY_RETENTION_WEEKS', indexManagementHistoryInWeeks, ), # Number of segments to optimize sessions to in the ILM policy, default 1 EnvValue( + True, os.path.join(args.configDir, 'arkime.env'), 'INDEX_MANAGEMENT_SEGMENTS', indexManagementOptimizeSessionSegments, ), # authentication method: basic (true), ldap (false) or no_authentication EnvValue( + True, os.path.join(args.configDir, 'auth-common.env'), 'NGINX_BASIC_AUTH', allowedAuthModes.get(authMode, TrueOrFalseNoQuote(True)), ), # StartTLS vs. 
ldap:// or ldaps:// EnvValue( + True, os.path.join(args.configDir, 'auth-common.env'), 'NGINX_LDAP_TLS_STUNNEL', TrueOrFalseNoQuote(('ldap' in authMode.lower()) and ldapStartTLS), ), # Logstash host and port EnvValue( + True, os.path.join(args.configDir, 'beats-common.env'), 'LOGSTASH_HOST', logstashHost, ), # OpenSearch Dashboards URL EnvValue( + True, os.path.join(args.configDir, 'dashboards.env'), 'DASHBOARDS_URL', dashboardsUrl, ), # turn on dark mode, or not EnvValue( + True, os.path.join(args.configDir, 'dashboards-helper.env'), 'DASHBOARDS_DARKMODE', TrueOrFalseNoQuote(dashboardsDarkMode), ), # OpenSearch index state management snapshot compression EnvValue( + True, os.path.join(args.configDir, 'dashboards-helper.env'), 'ISM_SNAPSHOT_COMPRESSED', TrueOrFalseNoQuote(indexSnapshotCompressed), ), # delete based on index pattern size EnvValue( + True, os.path.join(args.configDir, 'dashboards-helper.env'), 'OPENSEARCH_INDEX_SIZE_PRUNE_LIMIT', indexPruneSizeLimit, ), # delete based on index pattern size (sorted by name vs. 
creation time) EnvValue( + True, os.path.join(args.configDir, 'dashboards-helper.env'), 'OPENSEARCH_INDEX_SIZE_PRUNE_NAME_SORT', TrueOrFalseNoQuote(indexPruneNameSort), ), # expose a filebeat TCP input listener EnvValue( + True, os.path.join(args.configDir, 'filebeat.env'), 'FILEBEAT_TCP_LISTEN', TrueOrFalseNoQuote(filebeatTcpOpen), ), # log format expected for events sent to the filebeat TCP input listener EnvValue( + True, os.path.join(args.configDir, 'filebeat.env'), 'FILEBEAT_TCP_LOG_FORMAT', filebeatTcpFormat, ), # source field name to parse for events sent to the filebeat TCP input listener EnvValue( + True, os.path.join(args.configDir, 'filebeat.env'), 'FILEBEAT_TCP_PARSE_SOURCE_FIELD', filebeatTcpSourceField, ), # target field name to store decoded JSON fields for events sent to the filebeat TCP input listener EnvValue( + True, os.path.join(args.configDir, 'filebeat.env'), 'FILEBEAT_TCP_PARSE_TARGET_FIELD', filebeatTcpTargetField, ), # field to drop in events sent to the filebeat TCP input listener EnvValue( + True, os.path.join(args.configDir, 'filebeat.env'), 'FILEBEAT_TCP_PARSE_DROP_FIELD', filebeatTcpDropField, ), # tag to append to events sent to the filebeat TCP input listener EnvValue( + True, os.path.join(args.configDir, 'filebeat.env'), 'FILEBEAT_TCP_TAG', filebeatTcpTag, ), # logstash memory allowance EnvValue( + True, os.path.join(args.configDir, 'logstash.env'), 'LS_JAVA_OPTS', re.sub(r'(-Xm[sx])(\w+)', fr'\g<1>{lsMemory}', LOGSTASH_JAVA_OPTS_DEFAULT), ), # automatic local reverse dns lookup EnvValue( + True, os.path.join(args.configDir, 'logstash.env'), 'LOGSTASH_REVERSE_DNS', TrueOrFalseNoQuote(reverseDns), ), # automatic MAC OUI lookup EnvValue( + True, os.path.join(args.configDir, 'logstash.env'), 'LOGSTASH_OUI_LOOKUP', TrueOrFalseNoQuote(autoOui), ), # logstash pipeline workers EnvValue( + True, os.path.join(args.configDir, 'logstash.env'), 'pipeline.workers', lsWorkers, ), # freq.py string randomness calculations EnvValue( + True, 
os.path.join(args.configDir, 'lookup-common.env'), 'FREQ_LOOKUP', TrueOrFalseNoQuote(autoFreq), ), # enrich network traffic metadata via NetBox API calls EnvValue( + True, os.path.join(args.configDir, 'netbox-common.env'), 'NETBOX_ENRICHMENT', TrueOrFalseNoQuote(netboxLogstashEnrich), ), # create missing NetBox subnet prefixes based on observed network traffic EnvValue( + True, os.path.join(args.configDir, 'netbox-common.env'), 'NETBOX_AUTO_CREATE_PREFIX', TrueOrFalseNoQuote(netboxLogstashAutoSubnets), ), # populate the NetBox inventory based on observed network traffic EnvValue( + True, os.path.join(args.configDir, 'netbox-common.env'), 'NETBOX_AUTO_POPULATE', TrueOrFalseNoQuote(netboxAutoPopulate), ), # NetBox default site name EnvValue( + True, os.path.join(args.configDir, 'netbox-common.env'), 'NETBOX_DEFAULT_SITE', netboxSiteName, ), # enable/disable netbox EnvValue( + True, os.path.join(args.configDir, 'netbox-common.env'), 'NETBOX_DISABLED', TrueOrFalseNoQuote(not netboxEnabled), ), # enable/disable netbox (postgres) EnvValue( + True, os.path.join(args.configDir, 'netbox-common.env'), 'NETBOX_POSTGRES_DISABLED', TrueOrFalseNoQuote(not netboxEnabled), ), # enable/disable netbox (redis) EnvValue( + True, os.path.join(args.configDir, 'netbox-common.env'), 'NETBOX_REDIS_DISABLED', TrueOrFalseNoQuote(not netboxEnabled), ), # HTTPS (nginxSSL=True) vs unencrypted HTTP (nginxSSL=False) EnvValue( + True, os.path.join(args.configDir, 'nginx.env'), 'NGINX_SSL', TrueOrFalseNoQuote(nginxSSL), ), # OpenSearch primary instance is local vs. 
remote EnvValue( + True, os.path.join(args.configDir, 'opensearch.env'), 'OPENSEARCH_PRIMARY', DATABASE_MODE_LABELS[opensearchPrimaryMode], ), # OpenSearch primary instance URL EnvValue( + True, os.path.join(args.configDir, 'opensearch.env'), 'OPENSEARCH_URL', opensearchPrimaryUrl, ), # OpenSearch primary instance needs SSL verification EnvValue( + True, os.path.join(args.configDir, 'opensearch.env'), 'OPENSEARCH_SSL_CERTIFICATE_VERIFICATION', TrueOrFalseNoQuote(opensearchPrimarySslVerify), ), # OpenSearch secondary instance URL EnvValue( + True, os.path.join(args.configDir, 'opensearch.env'), 'OPENSEARCH_SECONDARY_URL', opensearchSecondaryUrl, ), # OpenSearch secondary instance needs SSL verification EnvValue( + True, os.path.join(args.configDir, 'opensearch.env'), 'OPENSEARCH_SECONDARY_SSL_CERTIFICATE_VERIFICATION', TrueOrFalseNoQuote(opensearchSecondarySslVerify), ), # OpenSearch secondary remote instance is enabled EnvValue( + True, os.path.join(args.configDir, 'opensearch.env'), 'OPENSEARCH_SECONDARY', DATABASE_MODE_LABELS[opensearchSecondaryMode], ), # OpenSearch memory allowance EnvValue( + True, os.path.join(args.configDir, 'opensearch.env'), 'OPENSEARCH_JAVA_OPTS', re.sub(r'(-Xm[sx])(\w+)', fr'\g<1>{osMemory}', OPENSEARCH_JAVA_OPTS_DEFAULT), ), # capture pcaps via netsniff-ng EnvValue( + True, os.path.join(args.configDir, 'pcap-capture.env'), 'PCAP_ENABLE_NETSNIFF', TrueOrFalseNoQuote(pcapNetSniff), ), # capture pcaps via tcpdump EnvValue( + True, os.path.join(args.configDir, 'pcap-capture.env'), 'PCAP_ENABLE_TCPDUMP', TrueOrFalseNoQuote(pcapTcpDump and (not pcapNetSniff)), ), # disable NIC hardware offloading features and adjust ring buffers EnvValue( + True, os.path.join(args.configDir, 'pcap-capture.env'), 'PCAP_IFACE_TWEAK', TrueOrFalseNoQuote(tweakIface), ), # capture interface(s) EnvValue( + True, os.path.join(args.configDir, 'pcap-capture.env'), 'PCAP_IFACE', pcapIface, ), # capture filter EnvValue( + True, os.path.join(args.configDir, 
'pcap-capture.env'), 'PCAP_FILTER', pcapFilter, ), # process UID EnvValue( + True, os.path.join(args.configDir, 'process.env'), 'PUID', puid, ), # process GID EnvValue( + True, os.path.join(args.configDir, 'process.env'), 'PGID', pgid, ), # Container runtime engine (e.g., docker, podman) EnvValue( + True, os.path.join(args.configDir, 'process.env'), CONTAINER_RUNTIME_KEY, 'kubernetes' if (self.orchMode is OrchestrationFramework.KUBERNETES) else args.runtimeBin, ), # Malcolm run profile (malcolm vs. hedgehog) EnvValue( + True, os.path.join(args.configDir, 'process.env'), PROFILE_KEY, malcolmProfile, ), # Suricata signature updates (via suricata-update) EnvValue( + True, os.path.join(args.configDir, 'suricata.env'), 'SURICATA_UPDATE_RULES', TrueOrFalseNoQuote(suricataRuleUpdate), ), # disable/enable ICS analyzers EnvValue( + True, os.path.join(args.configDir, 'suricata.env'), 'SURICATA_DISABLE_ICS_ALL', TrueOrFalseNoQuote(not malcolmIcs), ), # live traffic analysis with Suricata EnvValue( + True, os.path.join(args.configDir, 'suricata-live.env'), 'SURICATA_LIVE_CAPTURE', TrueOrFalseNoQuote(liveSuricata), ), # live capture statistics for Suricata EnvValue( + True, os.path.join(args.configDir, 'suricata-live.env'), 'SURICATA_STATS_ENABLED', TrueOrFalseNoQuote(captureStats), ), EnvValue( + True, os.path.join(args.configDir, 'suricata-live.env'), 'SURICATA_STATS_EVE_ENABLED', TrueOrFalseNoQuote(captureStats), ), # rotated captured PCAP analysis with Suricata (not live capture) EnvValue( + True, os.path.join(args.configDir, 'suricata-offline.env'), 'SURICATA_ROTATED_PCAP', TrueOrFalseNoQuote(autoSuricata and (not liveSuricata)), ), # automatic uploaded pcap analysis with suricata EnvValue( + True, os.path.join(args.configDir, 'suricata-offline.env'), 'SURICATA_AUTO_ANALYZE_PCAP_FILES', TrueOrFalseNoQuote(autoSuricata), ), # capture source "node name" for locally processed PCAP files EnvValue( + True, os.path.join(args.configDir, 'upload-common.env'), 'PCAP_NODE_NAME', 
pcapNodeName, ), # zeek file extraction mode EnvValue( + True, os.path.join(args.configDir, 'zeek.env'), 'ZEEK_EXTRACTOR_MODE', fileCarveMode, ), # zeek file preservation mode EnvValue( + True, os.path.join(args.configDir, 'zeek.env'), 'EXTRACTED_FILE_PRESERVATION', filePreserveMode, ), # total disk fill threshold for pruning zeek extracted files EnvValue( + True, os.path.join(args.configDir, 'zeek.env'), 'EXTRACTED_FILE_PRUNE_THRESHOLD_TOTAL_DISK_USAGE_PERCENT', extractedFileMaxPercentThreshold, ), # zeek extracted files maximum consumption threshold EnvValue( + True, os.path.join(args.configDir, 'zeek.env'), 'EXTRACTED_FILE_PRUNE_THRESHOLD_MAX_SIZE', extractedFileMaxSizeThreshold, ), # HTTP server for extracted files EnvValue( + True, os.path.join(args.configDir, 'zeek.env'), 'EXTRACTED_FILE_HTTP_SERVER_ENABLE', TrueOrFalseNoQuote(fileCarveHttpServer), ), # ZIP HTTP server for extracted files EnvValue( + True, os.path.join(args.configDir, 'zeek.env'), 'EXTRACTED_FILE_HTTP_SERVER_ZIP', TrueOrFalseNoQuote(fileCarveHttpServerZip), ), # key for encrypted HTTP-served extracted files (' -> '' for escaping in YAML) EnvValue( + True, os.path.join(args.configDir, 'zeek-secret.env'), 'EXTRACTED_FILE_HTTP_SERVER_KEY', fileCarveHttpServeEncryptKey, ), # virustotal API key EnvValue( + True, os.path.join(args.configDir, 'zeek-secret.env'), 'VTOT_API2_KEY', vtotApiKey, ), # file scanning via yara EnvValue( + True, os.path.join(args.configDir, 'zeek.env'), 'EXTRACTED_FILE_ENABLE_YARA', TrueOrFalseNoQuote(yaraScan), ), # PE file scanning via capa EnvValue( + True, os.path.join(args.configDir, 'zeek.env'), 'EXTRACTED_FILE_ENABLE_CAPA', TrueOrFalseNoQuote(capaScan), ), # file scanning via clamav EnvValue( + True, os.path.join(args.configDir, 'zeek.env'), 'EXTRACTED_FILE_ENABLE_CLAMAV', TrueOrFalseNoQuote(clamAvScan), ), # rule updates (yara/capa via git, clamav via freshclam) EnvValue( + True, os.path.join(args.configDir, 'zeek.env'), 'EXTRACTED_FILE_UPDATE_RULES', 
                 TrueOrFalseNoQuote(fileScanRuleUpdate),
             ),
             # disable/enable ICS analyzers
             EnvValue(
+                True,
                 os.path.join(args.configDir, 'zeek.env'),
                 'ZEEK_DISABLE_ICS_ALL',
                 '' if malcolmIcs else TrueOrFalseNoQuote(not malcolmIcs),
             ),
             # disable/enable ICS best guess
             EnvValue(
+                True,
                 os.path.join(args.configDir, 'zeek.env'),
                 'ZEEK_DISABLE_BEST_GUESS_ICS',
                 '' if zeekICSBestGuess else TrueOrFalseNoQuote(not zeekICSBestGuess),
             ),
             # live traffic analysis with Zeek
             EnvValue(
+                True,
                 os.path.join(args.configDir, 'zeek-live.env'),
                 'ZEEK_LIVE_CAPTURE',
                 TrueOrFalseNoQuote(liveZeek),
             ),
             # live capture statistics for Zeek
             EnvValue(
+                True,
                 os.path.join(args.configDir, 'zeek-live.env'),
                 'ZEEK_DISABLE_STATS',
                 TrueOrFalseNoQuote(not captureStats),
             ),
             # rotated captured PCAP analysis with Zeek (not live capture)
             EnvValue(
+                True,
                 os.path.join(args.configDir, 'zeek-offline.env'),
                 'ZEEK_ROTATED_PCAP',
                 TrueOrFalseNoQuote(autoZeek and (not liveZeek)),
             ),
             # automatic uploaded pcap analysis with Zeek
             EnvValue(
+                True,
                 os.path.join(args.configDir, 'zeek-offline.env'),
                 'ZEEK_AUTO_ANALYZE_PCAP_FILES',
                 TrueOrFalseNoQuote(autoZeek),
             ),
+            # Pull from threat intelligence feeds on container startup
+            EnvValue(
+                zeekIntelParamsProvided,
+                os.path.join(args.configDir, 'zeek-offline.env'),
+                'ZEEK_INTEL_REFRESH_ON_STARTUP',
+                TrueOrFalseNoQuote(zeekIntelOnStartup),
+            ),
+            # Cron expression for scheduled pulls from threat intelligence feeds
+            EnvValue(
+                zeekIntelParamsProvided,
+                os.path.join(args.configDir, 'zeek-offline.env'),
+                'ZEEK_INTEL_REFRESH_CRON_EXPRESSION',
+                zeekIntelCronExpression,
+            ),
+            # Threat indicator "since" period
+            EnvValue(
+                zeekIntelParamsProvided,
+                os.path.join(args.configDir, 'zeek.env'),
+                'ZEEK_INTEL_FEED_SINCE',
+                zeekIntelFeedSince,
+            ),
+            # Intel::item_expiration timeout for intelligence items
+            EnvValue(
+                zeekIntelParamsProvided,
+                os.path.join(args.configDir, 'zeek.env'),
+                'ZEEK_INTEL_ITEM_EXPIRATION',
+                zeekIntelItemExipration,
+            ),
             # Use polling for file watching vs. native
             EnvValue(
+                True,
                 os.path.join(args.configDir, 'zeek.env'),
                 'EXTRACTED_FILE_WATCHER_POLLING',
                 TrueOrFalseNoQuote(self.orchMode is OrchestrationFramework.KUBERNETES),
             ),
             EnvValue(
+                True,
                 os.path.join(args.configDir, 'upload-common.env'),
                 'PCAP_PIPELINE_POLLING',
                 TrueOrFalseNoQuote(self.orchMode is OrchestrationFramework.KUBERNETES),
             ),
             EnvValue(
+                True,
                 os.path.join(args.configDir, 'filebeat.env'),
                 'FILEBEAT_WATCHER_POLLING',
                 TrueOrFalseNoQuote(self.orchMode is OrchestrationFramework.KUBERNETES),
             ),
         ]
-        # now, go through and modify the values in the .env files
-        for val in EnvValues:
+        # now, go through and modify the provided values in the .env files
+        for val in [v for v in EnvValues if v.provided]:
             try:
                 touch(val.envFile)
             except Exception:
@@ -3769,6 +3957,7 @@ def install_docker(self):
 # main
 def main():
     global args
+    global raw_args
     global requests_imported
     global kube_imported
     global yaml_imported
@@ -4413,6 +4602,45 @@ def main():
         help='Perform string randomness scoring on some fields',
     )
 
+    zeekIntelGroup = parser.add_argument_group('Threat intelligence feed options')
+    zeekIntelGroup.add_argument(
+        '--zeek-intel-on-startup',
+        dest='zeekIntelOnStartup',
+        type=str2bool,
+        metavar="true|false",
+        nargs='?',
+        const=True,
+        default=True,
+        help='Pull from threat intelligence feeds on container startup',
+    )
+    zeekIntelGroup.add_argument(
+        '--zeek-intel-feed-since',
+        dest='zeekIntelFeedSince',
+        required=False,
+        metavar='',
+        type=str,
+        default='7 days ago',
+        help=f"When pulling from threat intelligence feeds, only process indicators created or modified since the time represented by this value; either a fixed date (01/01/2021) or relative interval (7 days ago)",
+    )
+    zeekIntelGroup.add_argument(
+        '--zeek-intel-cron-expression',
+        dest='zeekIntelCronExpression',
+        required=False,
+        metavar='',
+        type=str,
+        default='0 0 * * *',
+        help=f'Cron expression for scheduled pulls from threat intelligence feeds',
+    )
+    zeekIntelGroup.add_argument(
+        '--zeek-intel-item-expiration',
+        dest='zeekIntelItemExipration',
+        required=False,
+        metavar='',
+        type=str,
+        default='-1min',
+        help=f"Specifies the value for Zeek's Intel::item_expiration timeout (-1min to disable)",
+    )
+
     fileCarveArgGroup = parser.add_argument_group('File extraction options')
     fileCarveArgGroup.add_argument(
         '--file-extraction',
@@ -4680,6 +4908,10 @@ def main():
         help="Extra environment variables to set (e.g., foobar.env:VARIABLE_NAME=value)",
     )
 
+    try:
+        raw_args = sys.argv[1:]
+    except Exception:
+        pass
     try:
         parser.error = parser.exit
         args = parser.parse_args()