From 1894319bd46608a888d01e93516e90a4b21e6ff0 Mon Sep 17 00:00:00 2001
From: Robert Fekete
Date: Sat, 9 Nov 2024 12:20:48 +0100
Subject: [PATCH 01/18] [4.9][filterx] Adds the CEF and LEEF parsers

---
 content/filterx/filterx-parsing/cef/_index.md | 54 ++++++++++++++++++
 .../cef/cef-parser-options/_index.md          | 15 +++++
 .../filterx/filterx-parsing/leef/_index.md    | 57 +++++++++++++++++++
 .../leef/leef-parser-options/_index.md        | 15 +++++
 4 files changed, 141 insertions(+)
 create mode 100644 content/filterx/filterx-parsing/cef/_index.md
 create mode 100644 content/filterx/filterx-parsing/cef/cef-parser-options/_index.md
 create mode 100644 content/filterx/filterx-parsing/leef/_index.md
 create mode 100644 content/filterx/filterx-parsing/leef/leef-parser-options/_index.md

diff --git a/content/filterx/filterx-parsing/cef/_index.md b/content/filterx/filterx-parsing/cef/_index.md
new file mode 100644
index 00000000..b825b521
--- /dev/null
+++ b/content/filterx/filterx-parsing/cef/_index.md
@@ -0,0 +1,54 @@
+---
+title: "CEF"
+weight: 100
+---
+
+
+{{< include-headless "chunk/filterx-experimental-banner.md" >}}
+
+Available in {{< product >}} 4.9 and later.
+
+The `parse_cef` FilterX function parses messages formatted in the [Common Event Format (CEF)](https://www.microfocus.com/documentation/arcsight/arcsight-smartconnectors-8.3/cef-implementation-standard/Content/CEF/Chapter%201%20What%20is%20CEF.htm).
+
+## Declaration
+
+Usage: `parse_cef(msg, value_separator="=", pair_separator="|")`
+
+The first argument is the input message. Optionally, you can set the `pair_separator` and `value_separator` arguments to override their default values.
+
+The `value_separator` must be a single-character string. The `pair_separator` can be a regular string.
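+
+As a minimal sketch of overriding both defaults, the following call assumes a hypothetical, nonstandard input that uses `:` between keys and values and `;` as the pair separator. The separator values are illustrative only and are not taken from a real device:
+
+```shell
+filterx {
+  # Illustrative separators only: adjust them to the actual input format
+  ${PARSED_MESSAGE} = json(parse_cef(${MESSAGE}, value_separator=":", pair_separator=";"));
+};
+```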
+
+## Example
+
+The following is a CEF-formatted message including mandatory and custom (extension) fields:
+
+```shell
+CEF:0|KasperskyLab|SecurityCenter|13.2.0.1511|KLPRCI_TaskState|Completed successfully|1|foo=foo=bar bar=bar=baz baz=test
+```
+
+The following FilterX expression parses it and converts it into JSON format:
+
+```shell
+filterx {
+  ${PARSED_MESSAGE} = json(parse_cef(${MESSAGE}));
+};
+```
+
+The content of the JSON object for this message will be:
+
+```json
+{
+"version":"0",
+"device_vendor":"KasperskyLab",
+"device_product":"SecurityCenter",
+"device_version":"13.2.0.1511",
+"device_event_class_id":"KLPRCI_TaskState",
+"name":"Completed successfully",
+"agent_severity":"1",
+"extensions": {
+    "foo":"foo=bar",
+    "bar":"bar=baz",
+    "baz":"test"
+  }
+}
+```
diff --git a/content/filterx/filterx-parsing/cef/cef-parser-options/_index.md b/content/filterx/filterx-parsing/cef/cef-parser-options/_index.md
new file mode 100644
index 00000000..0cc18477
--- /dev/null
+++ b/content/filterx/filterx-parsing/cef/cef-parser-options/_index.md
@@ -0,0 +1,15 @@
+---
+title: "Options of CEF parsers"
+weight: 100
+---
+
+
+The `parse_cef` FilterX function has the following options.
+
+## pair_separator
+
+Specifies the character or string that separates the CEF fields from each other. Default value: `|`.
+
+## value_separator
+
+Specifies the character that separates the keys from the values in the extensions. Default value: `=`.
diff --git a/content/filterx/filterx-parsing/leef/_index.md b/content/filterx/filterx-parsing/leef/_index.md
new file mode 100644
index 00000000..0b1b2de5
--- /dev/null
+++ b/content/filterx/filterx-parsing/leef/_index.md
@@ -0,0 +1,57 @@
+---
+title: "LEEF"
+weight: 1100
+---
+
+
+{{< include-headless "chunk/filterx-experimental-banner.md" >}}
+
+Available in {{< product >}} 4.9 and later.
+
+The `parse_leef` FilterX function parses messages formatted in the [Log Event Extended Format (LEEF)](https://www.ibm.com/docs/en/dsm?topic=overview-leef-event-components).
+
+Currently, only LEEF version 1.0 is supported.
+
+## Declaration
+
+Usage: `parse_leef(msg, value_separator="=", pair_separator="|")`
+
+The first argument is the input message. Optionally, you can set the `pair_separator` and `value_separator` arguments to override their default values.
+
+The `value_separator` must be a single-character string. The `pair_separator` can be a regular string.
+
+## Example
+
+The following is a LEEF-formatted message including mandatory and custom (extension) fields:
+
+```shell
+LEEF:1.0|Microsoft|MSExchange|4.0 SP1|15345|src=192.0.2.0 dst=172.50.123.1 sev=5cat=anomaly srcPort=81 dstPort=21 usrName=john.smith
+```
+
+The following FilterX expression parses it and converts it into JSON format:
+
+```shell
+filterx {
+  ${PARSED_MESSAGE} = json(parse_leef(${MESSAGE}));
+};
+```
+
+The content of the JSON object for this message will be:
+
+```json
+{
+"version":"1.0",
+"vendor":"Microsoft",
+"product_name":"MSExchange",
+"product_version":"4.0 SP1",
+"event_id":"15345",
+"extensions": {
+    "src":"192.0.2.0",
+    "dst":"172.50.123.1",
+    "sev":"5cat=anomaly",
+    "srcPort":"81",
+    "dstPort":"21",
+    "usrName":"john.smith"
+  }
+}
+```
diff --git a/content/filterx/filterx-parsing/leef/leef-parser-options/_index.md b/content/filterx/filterx-parsing/leef/leef-parser-options/_index.md
new file mode 100644
index 00000000..b06de366
--- /dev/null
+++ b/content/filterx/filterx-parsing/leef/leef-parser-options/_index.md
@@ -0,0 +1,15 @@
+---
+title: "Options of LEEF parsers"
+weight: 100
+---
+
+
+The `parse_leef` FilterX function has the following options.
+ +## pair_separator + +Specifies the character or string that separates the LEEF fields from each other. Default value: `|` . + +## value_separator + +Specifies the character that separates the keys from the values in the extensions. Default value: `=`. From dcfb682004924e951c4373f565d55bbfc9c826b3 Mon Sep 17 00:00:00 2001 From: Robert Fekete Date: Sat, 9 Nov 2024 12:32:22 +0100 Subject: [PATCH 02/18] [4.9][filterx] Adds drop and done https://github.com/axoflow/axosyslog/pull/269 --- content/filterx/_index.md | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/content/filterx/_index.md b/content/filterx/_index.md index 221fdfd5..a466d976 100644 --- a/content/filterx/_index.md +++ b/content/filterx/_index.md @@ -64,6 +64,10 @@ FilterX statements can be one of the following: - Existence of a variable of field. For example, the `${HOST};` expression is true only if the `${HOST}` macro exists and isn't empty. - A conditional statement ( `if (expr) { ... } elif (expr) {} else { ... };`) which allows you to evaluate complex decision trees. - A declaration of a [pipeline variable](#variable-scope), for example, `declare my_pipeline_variable = "something";`. +- A FilterX action. This can be one of the following: + + - `drop;`: Intentionally drop the message. This means that the message was successfully processed, but discarded. Processing the dropped message stops at the `drop` statement, subsequent sections or other branches of the FilterX block won't process the message. For example, you can use this to discard unneeded messages, like debug logs. + - `done;`: Return truthy and don't execute the rest of the FilterX block, returns with true. This is an early return that you can use to avoid unnecessary processing, for example, when the message matches an early classification in the block. 
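+
+A minimal sketch of the two actions, assuming that the `${LEVEL}` and `${PROGRAM}` macros are set for the message. The `sshd` match and the `${CLASS}` name-value pair are hypothetical:
+
+```shell
+filterx {
+  # Discard debug logs: processing of the message stops at the drop statement
+  if (${LEVEL} == "debug") {
+    drop;
+  };
+  # Early return: the message is already classified, skip the rest of the block
+  if (${PROGRAM} == "sshd") {
+    ${CLASS} = "auth";
+    done;
+  };
+  # Further, more expensive processing would follow here
+};
+```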
{{% alert title="Note" color="info" %}} From 1737476aaf184bb54809d550a1209fb5de6d950f Mon Sep 17 00:00:00 2001 From: Robert Fekete Date: Sat, 9 Nov 2024 12:43:10 +0100 Subject: [PATCH 03/18] [4.9][filterx] Minor updates --- content/filterx/filterx-parsing/csv/_index.md | 2 +- .../filterx/filterx-parsing/csv/reference-parsers-csv/_index.md | 2 +- content/filterx/function-reference.md | 2 +- 3 files changed, 3 insertions(+), 3 deletions(-) diff --git a/content/filterx/filterx-parsing/csv/_index.md b/content/filterx/filterx-parsing/csv/_index.md index a1a0ea58..1fb97198 100644 --- a/content/filterx/filterx-parsing/csv/_index.md +++ b/content/filterx/filterx-parsing/csv/_index.md @@ -64,7 +64,7 @@ block filterx p_apache() { "CONTENT_LENGTH", "REFERER", "USER_AGENT", "PROCESS_TIME", "SERVER_NAME" ]; - ${APACHE} = parse_csv(${MESSAGE}, columns=cols, delimiter=(" "), strip_whitespaces=true, dialect="escape-double-char"); + ${APACHE} = parse_csv(${MESSAGE}, columns=cols, delimiter=(" "), strip_whitespace=true, dialect="escape-double-char"); # Set the important elements as name-value pairs so they can be referenced in the destination template ${APACHE_USER_NAME} = ${APACHE.USER_NAME}; diff --git a/content/filterx/filterx-parsing/csv/reference-parsers-csv/_index.md b/content/filterx/filterx-parsing/csv/reference-parsers-csv/_index.md index ee29b87b..6c0b6d1a 100644 --- a/content/filterx/filterx-parsing/csv/reference-parsers-csv/_index.md +++ b/content/filterx/filterx-parsing/csv/reference-parsers-csv/_index.md @@ -76,7 +76,7 @@ my-parsed-values = parse_csv(${MESSAGE}, columns=["COLUMN1", "COLUMN2", "COLUMN3 | Synopsis: | `strip_whitespace=true` | | Default value: | `false` | -*Description:* Remove leading and trailing whitespaces from all columns. The `strip_whitespaces` option is an alias for `strip_whitespace`. +*Description:* Remove leading and trailing whitespaces from all columns. The `strip_whitespace` option is an alias for `strip_whitespace`. 
## string_delimiters {#string-delimiters} diff --git a/content/filterx/function-reference.md b/content/filterx/function-reference.md index c1c90072..1828a0c2 100644 --- a/content/filterx/function-reference.md +++ b/content/filterx/function-reference.md @@ -69,7 +69,7 @@ Usually, you use the [strptime](#strptime) FilterX function to create datetime v Flattens the nested elements of an object using the specified separator, similarly to the [`format-flat-json()` template function]({{< relref "/chapter-manipulating-messages/customizing-message-format/reference-template-functions/_index.md#template-function-format-flat-json" >}}). For example, you can use it to flatten nested JSON objects in the output if the receiving application cannot handle nested JSON objects. -Usage: `flatten(dict, separator=".")` +Usage: `flatten(dict_or_list, separator=".")` You can use multi-character separators, for example, `=>`. If you omit the separator, the default dot (`.`) separator is used. From 765fe86c95336f2c14f2145520062f5054d59e90 Mon Sep 17 00:00:00 2001 From: Robert Fekete Date: Sat, 9 Nov 2024 14:13:13 +0100 Subject: [PATCH 04/18] [4.9][filterx] Adds new unset_empties options https://github.com/axoflow/axosyslog/pull/275 --- content/filterx/function-reference.md | 34 +++++++++------------------ 1 file changed, 11 insertions(+), 23 deletions(-) diff --git a/content/filterx/function-reference.md b/content/filterx/function-reference.md index 1828a0c2..cbffe272 100644 --- a/content/filterx/function-reference.md +++ b/content/filterx/function-reference.md @@ -358,34 +358,22 @@ See also {{% xref "/filterx/_index.md#delete-values" %}}. ## unset_empties {#unset-empties} -Deletes ([unsets](#unset)) the empty fields of an object, for example, a JSON object or list. Use the `recursive=true` parameter to delete empty values of inner dicts' and lists' values. +Deletes ([unsets](#unset)) the empty fields of an object, for example, a JSON object or list. 
By default, the object is processed recursively, so the empty values are deleted from inner dicts and lists as well. If you set the `replacement` option, you can also use this function to replace fields of the object to custom values. -Usage: `unset_empties(object, recursive=true)` +Usage: `unset_empties(object, options)` - +```shell +unset_empties(input_object, targets=["-", "N/A"], ignorecase=false); +``` ## upper From b2258dee89b944e33792d3267638752f9c85e52a Mon Sep 17 00:00:00 2001 From: Robert Fekete Date: Sat, 9 Nov 2024 14:13:34 +0100 Subject: [PATCH 05/18] [4.9][filterx] drop and done are 4.9 only --- content/filterx/_index.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/content/filterx/_index.md b/content/filterx/_index.md index a466d976..a495cf5f 100644 --- a/content/filterx/_index.md +++ b/content/filterx/_index.md @@ -66,8 +66,8 @@ FilterX statements can be one of the following: - A declaration of a [pipeline variable](#variable-scope), for example, `declare my_pipeline_variable = "something";`. - A FilterX action. This can be one of the following: - - `drop;`: Intentionally drop the message. This means that the message was successfully processed, but discarded. Processing the dropped message stops at the `drop` statement, subsequent sections or other branches of the FilterX block won't process the message. For example, you can use this to discard unneeded messages, like debug logs. - - `done;`: Return truthy and don't execute the rest of the FilterX block, returns with true. This is an early return that you can use to avoid unnecessary processing, for example, when the message matches an early classification in the block. + - `drop;`: Intentionally drop the message. This means that the message was successfully processed, but discarded. Processing the dropped message stops at the `drop` statement, subsequent sections or other branches of the FilterX block won't process the message. 
For example, you can use this to discard unneeded messages, like debug logs. Available in {{< product >}} 4.9 and later. + - `done;`: Return truthy and don't execute the rest of the FilterX block, returns with true. This is an early return that you can use to avoid unnecessary processing, for example, when the message matches an early classification in the block. Available in {{< product >}} 4.9 and later. {{% alert title="Note" color="info" %}} From c81ba855ff7ece7f3803c98e205db036abfc32a1 Mon Sep 17 00:00:00 2001 From: Robert Fekete Date: Sat, 9 Nov 2024 14:50:45 +0100 Subject: [PATCH 06/18] [4.9][filterx] Adds sdata-related functions https://github.com/axoflow/axosyslog/pull/242 --- content/filterx/filterx-sdata/_index.md | 48 +++++++++++++++++++++++++ content/filterx/function-reference.md | 12 +++++++ 2 files changed, 60 insertions(+) create mode 100644 content/filterx/filterx-sdata/_index.md diff --git a/content/filterx/filterx-sdata/_index.md b/content/filterx/filterx-sdata/_index.md new file mode 100644 index 00000000..6206dc6f --- /dev/null +++ b/content/filterx/filterx-sdata/_index.md @@ -0,0 +1,48 @@ +--- +title: "Handle SDATA in RFC5424 log records" +linkTitle: "SDATA in syslog" +weight: 900 +--- + + +{{< include-headless "chunk/filterx-experimental-banner.md" >}} + +Available in {{< product >}} 4.9 and later. + +{{< product >}} FilterX has a few functions to handle the [structured data (SDATA) part of RFC5424-formatted log messages]({{< relref "/chapter-concepts/concepts-message-structure/concepts-message-ietfsyslog/_index.md#the-structured-data-message-part" >}}). These functions allow you to filter messages based on their SDATA fields. 
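+
+As a minimal sketch of such filtering, the following block matches only messages whose SDATA comes from a given enterprise ID, then copies one SDATA field into a name-value pair. The enterprise ID, the `Originator@6876` and `opID` keys, and the `${OP_ID}` name are illustrative assumptions:
+
+```shell
+filterx {
+  # The block matches only if the SDATA comes from the (hypothetical) enterprise ID 6876
+  is_sdata_from_enterprise("6876");
+  # get_sdata() returns a two-level dictionary: the SD-ID first, then the parameter name
+  sdata = get_sdata();
+  ${OP_ID} = sdata["Originator@6876"]["opID"];
+};
+```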
+ + + +## get_sdata() + +Extracts the SDATA part of the message into a two-level dictionary, for example: + +```json +{"Originator@6876": {"sub": "Vimsvc.ha-eventmgr", "opID": "esxui-13c6-6b16"}} +``` + +```shell +filterx { + sdata_json = get_sdata(); +}; +``` + +## has_sdata() + +Returns `true` if the SDATA field of the current message is not empty: + +```shell +filterx { + has_sdata(); +}; +``` + +## is_sdata_from_enterprise + +Filter messages based on enterprise ID in the SDATA field. For example: + +```shell +filterx { + is_sdata_from_enterprise("6876"); +}; +``` diff --git a/content/filterx/function-reference.md b/content/filterx/function-reference.md index cbffe272..21d0813d 100644 --- a/content/filterx/function-reference.md +++ b/content/filterx/function-reference.md @@ -118,10 +118,22 @@ Formats any value into a raw JSON string. Usage: `format_json($data)` +## get_sdata + +See {{% xref "/filterx/filterx-sdata/_index.md" %}}. + +## has_sdata + +See {{% xref "/filterx/filterx-sdata/_index.md" %}}. + ## isodate Parses a string as a date in ISODATE format: `%Y-%m-%dT%H:%M:%S%z` +## is_sdata_from_enterprise() + +See {{% xref "/filterx/filterx-sdata/_index.md" %}}. + ## isset Returns true if the argument exists and its value is not empty or null. 
From 6251aca4781daf7a03af4d94d5143723cc8e3244 Mon Sep 17 00:00:00 2001 From: Robert Fekete Date: Sat, 9 Nov 2024 15:46:29 +0100 Subject: [PATCH 07/18] [4.9][filterx] Adds the xml and windows_eventlog parsers --- .../windows-eventlog/_index.md | 103 ++++++++++++++++++ content/filterx/filterx-parsing/xml/_index.md | 48 +++++++- .../xml/xml-parser-options/_index.md | 8 -- 3 files changed, 148 insertions(+), 11 deletions(-) create mode 100644 content/filterx/filterx-parsing/windows-eventlog/_index.md delete mode 100644 content/filterx/filterx-parsing/xml/xml-parser-options/_index.md diff --git a/content/filterx/filterx-parsing/windows-eventlog/_index.md b/content/filterx/filterx-parsing/windows-eventlog/_index.md new file mode 100644 index 00000000..6177167d --- /dev/null +++ b/content/filterx/filterx-parsing/windows-eventlog/_index.md @@ -0,0 +1,103 @@ +--- +title: "Windows Event Log" +weight: 1100 +--- + + +{{< include-headless "chunk/filterx-experimental-banner.md" >}} + +Available in {{< product >}} 4.9 and later. + +The `parse_windows_eventlog_xml()` FilterX function parses Windows Event Logs XMLs. It's a specialized version of the [`parse_xml()` parser]({{< relref "/filterx/filterx-parsing/xml/_index.md" >}}) that: + +- validates that the data matches the Windows Event Log schema, and +- automatically handles named `Data` elements. 
+
+For example, the following converts the input XML into a JSON object:
+
+```shell
+filterx {
+  xml = "";
+  $MSG = json(parse_windows_eventlog_xml(xml));
+};
+```
+
+Given the following input:
+
+```xml
+<Event xmlns='http://schemas.microsoft.com/win/2004/08/events/event'>
+  <System>
+    <Provider Name='EventCreate'/>
+    <EventID Qualifiers='0'>999</EventID>
+    <Version>0</Version>
+    <Level>2</Level>
+    <Task>0</Task>
+    <Opcode>0</Opcode>
+    <Keywords>0x80000000000000</Keywords>
+    <TimeCreated SystemTime='2024-01-12T09:30:12.1566754Z'/>
+    <EventRecordID>934</EventRecordID>
+    <Correlation/>
+    <Execution ProcessID='0' ThreadID='0'/>
+    <Channel>Application</Channel>
+    <Computer>DESKTOP-2MBFIV7</Computer>
+    <Security UserID='S-1-5-21-3714454296-2738353472-899133108-1001'/>
+  </System>
+  <RenderingInfo Culture='en-US'>
+    <Message>foobar</Message>
+    <Level>Error</Level>
+    <Task></Task>
+    <Opcode>Info</Opcode>
+    <Channel></Channel>
+    <Provider></Provider>
+    <Keywords>
+      <Keyword>Classic</Keyword>
+    </Keywords>
+  </RenderingInfo>
+  <EventData>
+    <Data Name='param1'>foo</Data>
+    <Data Name='param2'>bar</Data>
+  </EventData>
+</Event>
+```
+
+The parser creates the following JSON object:
+
+```json
+{
+  "Event": {
+    "@xmlns": "http://schemas.microsoft.com/win/2004/08/events/event",
+    "System": {
+      "Provider": {"@Name": "EventCreate"},
+      "EventID": {"@Qualifiers": "0", "#text": "999"},
+      "Version": "0",
+      "Level": "2",
+      "Task": "0",
+      "Opcode": "0",
+      "Keywords": "0x80000000000000",
+      "TimeCreated": {"@SystemTime": "2024-01-12T09:30:12.1566754Z"},
+      "EventRecordID": "934",
+      "Correlation": "",
+      "Execution": {"@ProcessID": "0", "@ThreadID": "0"},
+      "Channel": "Application",
+      "Computer": "DESKTOP-2MBFIV7",
+      "Security": {"@UserID": "S-1-5-21-3714454296-2738353472-899133108-1001"}
+    },
+    "RenderingInfo": {
+      "@Culture": "en-US",
+      "Message": "foobar",
+      "Level": "Error",
+      "Task": "",
+      "Opcode": "Info",
+      "Channel": "",
+      "Provider": "",
+      "Keywords": {"Keyword": "Classic"}
+    },
+    "EventData": {
+      "Data": {
+        "param1": "foo",
+        "param2": "bar"
+      }
+    }
+  }
+}
+```
diff --git a/content/filterx/filterx-parsing/xml/_index.md b/content/filterx/filterx-parsing/xml/_index.md
index 90e823d1..0386e572 100644
--- a/content/filterx/filterx-parsing/xml/_index.md
+++ b/content/filterx/filterx-parsing/xml/_index.md
@@ -1,10 +1,52 @@
 ---
 title: "XML"
-weight: 1100
-draft: true
+weight: 1300
 ---
 
 
 {{< include-headless "chunk/filterx-experimental-banner.md" >}}
 
-The `parse_xml` FilterX function can ...
\ No newline at end of file
+Available in {{< product >}} 4.9 and later.
+
+The `parse_xml()` FilterX function parses raw XMLs into dictionaries.
For example:
+
+```shell
+my_structured_data = parse_xml(raw_xml);
+```
+
+There is no standardized way of converting XML into a dict. {{< product >}} creates the most compact dict possible. This means certain nodes will have different types and structures depending on the input XML element. Note the following points:
+
+1. Empty XML elements become empty strings.
+
+    ```
+    XML:  <foo></foo>
+    JSON: {"foo": ""}
+    ```
+
+1. Attributes are stored in `@attr` key-value pairs, similarly to other converters (like Python's xmltodict).
+
+    ```
+    XML:  <foo bar="123" baz="bad"/>
+    JSON: {"foo": {"@bar": "123", "@baz": "bad"}}
+    ```
+
+1. If an XML element has both attributes and a value, we need to store them in a dict, and the value needs a key. We store the text value under the `#text` key.
+
+    ```
+    XML:  <foo bar="123">baz</foo>
+    JSON: {"foo": {"@bar": "123", "#text": "baz"}}
+    ```
+
+1. An XML element can have both a value and inner elements. We use the `#text` key here, too.
+
+    ```
+    XML:  <foo>bar<baz>123</baz></foo>
+    JSON: {"foo": {"#text": "bar", "baz": "123"}}
+    ```
+
+1. An XML element can have multiple values separated by inner elements. In that case we concatenate the values.
+
+    ```
+    XML:  <foo>bar<a></a>baz</foo>
+    JSON: {"foo": {"#text": "barbaz", "a": ""}}
+    ```
diff --git a/content/filterx/filterx-parsing/xml/xml-parser-options/_index.md b/content/filterx/filterx-parsing/xml/xml-parser-options/_index.md
deleted file mode 100644
index ef1d6738..00000000
--- a/content/filterx/filterx-parsing/xml/xml-parser-options/_index.md
+++ /dev/null
@@ -1,8 +0,0 @@
----
-title: "Options of key=value parsers"
-weight: 100
----
-
-
-The `parse_xml` FilterX function has the following options.
-
From 372df56d65dd2a946fd5edb3a5880b11f9fb4c65 Mon Sep 17 00:00:00 2001
From: Robert Fekete
Date: Sun, 10 Nov 2024 10:54:28 +0100
Subject: [PATCH 08/18] [4.9][filterx] Adds includes, endswith, startswith

https://github.com/axoflow/axosyslog/pull/297
---
 .../filterx/filterx-string-search/_index.md | 36 +++++++++++++++++
 content/filterx/function-reference.md       | 39 +++++++++++++++++++
 2 files changed, 75 insertions(+)
 create mode 100644 content/filterx/filterx-string-search/_index.md

diff --git a/content/filterx/filterx-string-search/_index.md b/content/filterx/filterx-string-search/_index.md
new file mode 100644
index 00000000..a5465590
--- /dev/null
+++ b/content/filterx/filterx-string-search/_index.md
@@ -0,0 +1,36 @@
+---
+title: "String search in FilterX"
+linkTitle: "String search"
+weight: 550
+---
+
+
+{{< include-headless "chunk/filterx-experimental-banner.md" >}}
+
+Available in {{< product >}} 4.9 and later.
+
+You can check if a string contains a specified string using the `includes` FilterX function. The `startswith` and `endswith` functions check the beginning and the end of the string, respectively. For example, the following expression checks if the message (`$MESSAGE`) begins with the `%ASA-` string:
+
+```shell
+startswith($MESSAGE, '%ASA-')
+```
+
+By default, matches are case sensitive. For case-insensitive matches, use the `ignorecase=true` option:
+
+```shell
+startswith($MESSAGE, '%ASA-', ignorecase=true)
+```
+
+All three functions (`includes`, `startswith`, and `endswith`) can take a list with multiple search strings and return true if any of them match. This is equivalent to combining the individual searches with logical OR operators.
For example: + +```shell +${MESSAGE} = "%ASA-5-111010: User ''john'', running ''CLI'' from IP 0.0.0.0, executed ''dir disk0:/dap.xml" +includes($MESSAGE, ['%ASA-','john','CLI']) + +includes($MESSAGE, ['%ASA-','john','CLI']) +includes($MESSAGE, '%ASA-') or includes($MESSAGE, 'john') or includes($MESSAGE, 'CLI') +``` + +For more complex searches, or if you need to match a regular expression, use the [`regexp_search` FilterX function]({{< relref "/filterx/filterx-string-search/_index.md#regexp-search" >}}). + + \ No newline at end of file diff --git a/content/filterx/function-reference.md b/content/filterx/function-reference.md index 21d0813d..6616cd4b 100644 --- a/content/filterx/function-reference.md +++ b/content/filterx/function-reference.md @@ -65,6 +65,19 @@ Usually, you use the [strptime](#strptime) FilterX function to create datetime v - When casting from a double, the double is the number of seconds elapsed since the UNIX epoch (00:00:00 UTC on 1 January 1970). (The part before the floating points is the seconds, the part after the floating point is the microseconds.) - When casting from a string, the string (for example, `1701350398.123000+01:00`) is interpreted as: `.+` +## endswith + +Available in {{< product >}} 4.9 and later. + +Returns true if the input string ends with the specified substring. By default, matches are case sensitive. Usage: + +```shell +endswith(input-string, substring); +endswith(input-string, [substring_1, substring_2], ignorecase=true); +``` + +For details, see {{% xref "/filterx/filterx-string-search/_index.md" %}}. + ## flatten Flattens the nested elements of an object using the specified separator, similarly to the [`format-flat-json()` template function]({{< relref "/chapter-manipulating-messages/customizing-message-format/reference-template-functions/_index.md#template-function-format-flat-json" >}}). 
For example, you can use it to flatten nested JSON objects in the output if the receiving application cannot handle nested JSON objects. @@ -126,6 +139,19 @@ See {{% xref "/filterx/filterx-sdata/_index.md" %}}. See {{% xref "/filterx/filterx-sdata/_index.md" %}}. +## includes + +Available in {{< product >}} 4.9 and later. + +Returns true if the input string contains the specified substring. By default, matches are case sensitive. Usage: + +```shell +includes(input-string, substring); +includes(input-string, [substring_1, substring_2], ignorecase=true); +``` + +For details, see {{% xref "/filterx/filterx-string-search/_index.md" %}}. + ## isodate Parses a string as a date in ISODATE format: `%Y-%m-%dT%H:%M:%S%z` @@ -320,6 +346,19 @@ You can use the following flags with the `regexp_subst` function: - `utf8=true`: {{< include-headless "chunk/regex-flag-utf8.md" >}} +## startswith + +Available in {{< product >}} 4.9 and later. + +Returns true if the input string begins with the specified substring. By default, matches are case sensitive. Usage: + +```shell +startswith(input-string, substring); +startswith(input-string, [substring_1, substring_2], ignorecase=true); +``` + +For details, see {{% xref "/filterx/filterx-string-search/_index.md" %}}. + ## string Cast a value into a string. Note that currently {{< product >}} evaluates strings and executes [template functions]({{< relref "/filterx/_index.md#template-functions" >}}) and template expressions within the strings. In the future, template evaluation will be moved to a separate FilterX function. 
From d082af582473e3147124eee6d0622c6cd43f347e Mon Sep 17 00:00:00 2001 From: Robert Fekete Date: Mon, 11 Nov 2024 09:55:45 +0100 Subject: [PATCH 09/18] [4.9][filterx] Plus operator updates --- content/filterx/_index.md | 16 ++++------------ content/filterx/operator-reference.md | 6 +----- 2 files changed, 5 insertions(+), 17 deletions(-) diff --git a/content/filterx/_index.md b/content/filterx/_index.md index a495cf5f..fc70842b 100644 --- a/content/filterx/_index.md +++ b/content/filterx/_index.md @@ -239,13 +239,9 @@ To unset every empty field of an object, use the [`unset-empties`]({{< relref "/ {{< include-headless "chunk/filterx-unset-hard-macros.md" >}} -## Concatenate strings +## Add two values -You can concatenate strings by adding them with the `+` operator. Note that if you want to have spaces between the added elements, you have to add them manually, like in Python, for example: - -```shell -${MESSAGE} = ${HOST} + " first part of the message," + " second part of the message" + "\n"; -``` +{{< include-headless "chunk/filterx-plus-operator.md" >}} ## Complex types: lists, dicts, and JSON {#json} @@ -335,11 +331,7 @@ Within a FilterX block, you can access the fields of complex data types by using When referring to the field of a name-value pair (which begins with the `$` character), place the dot or the square bracket outside the curly bracket surrounding the name of the name-value pair, for example: `${MY-LIST}[2]` or `${MY-OBJECT}.mykey`. If the name of the key contains characters that are not permitted in FilterX variable names, for example, a hyphen (`-`), use the bracketed syntax and enclose the key in double quotes: `${MY-LIST}["my-key-name"]`. - - - +You can add two lists or two dicts using the {{% xref "/filterx/operator-reference.md#plus-operator" %}}. + +{{< include-headless "chunk/filterx-experimental-banner.md" >}} + +Available in {{< product >}} 4.9 and later. 
+
+Updates a labeled metric counter, similarly to the [`metrics-probe()` parser]({{< relref "/chapter-parsers/metrics-probe/_index.md" >}}). For details, see {{% xref "/filterx/filterx-metrics/_index.md" %}}.
+
+You can use `update_metric` to count the processed messages, and create labeled metric counters based on the fields of the processed messages.
+
+You can configure the name of the counter to update and the labels to add. The name of the counter is an unnamed, mandatory option. Note that the name is automatically prefixed with the `syslogng_` string. For example:
+
+```shell
+update_metric(
+  "my_counter_name",
+  labels={
+    "host": ${HOST},
+    "app": ${PROGRAM},
+    "id": ${SOURCE}
+  }
+);
+```
+
+This results in counters like:
+
+```shell
+syslogng_my_counter_name{app="example-app", host="localhost", id="s_local_1"} 3
+```
+
+## Options
+
+### increment
+
+|          |         |
+| -------- | ------- |
+| Type:    | integer or variable |
+| Default: | 1 |
+
+An integer, or an expression that resolves to an integer, that defines the increment of the counter. The following example defines a counter called `syslogng_input_event_bytes_total`, and increases its value by the size of the incoming message (in bytes).
+
+```shell
+update_metric(
+  "input_event_bytes_total",
+  labels={
+    "host": ${HOST},
+    "app": ${PROGRAM},
+    "id": ${SOURCE}
+  },
+  increment=${RAWMSG_SIZE}
+);
+```
+
+### labels
+
+|          |         |
+| -------- | ------- |
+| Type:    | dict |
+| Default: | `{}` |
+
+The labels used to create separate counters, based on the fields of the messages processed by `update_metric`. Use the following format:
+
+```shell
+labels={
+  "name-of-label1": "value-of-the-label1",
+  ... ,
+  "name-of-labelx": "value-of-the-labelx"
+}
+```
+
+### level
+
+|          |         |
+| -------- | ------- |
+| Type:    | integer (0-3) |
+| Default: | 0 |
+
+Sets the stats level of the generated metrics.
+
+> Note: Drivers configured with `internal(yes)` register their metrics on level 3.
That way if you are creating an SCL, you can disable the built-in metrics of the driver, and create metrics manually using `update_metrics`. diff --git a/content/filterx/function-reference.md b/content/filterx/function-reference.md index 6616cd4b..1d4b8990 100644 --- a/content/filterx/function-reference.md +++ b/content/filterx/function-reference.md @@ -426,6 +426,10 @@ For example, to remove the fields with `-` and `N/A` values, you can use unset_empties(input_object, targets=["-", "N/A"], ignorecase=false); ``` +## update_metric {#update-metric} + +Updates a labeled metric counter, similarly to the [`metrics-probe()` parser]({{< relref "/chapter-parsers/metrics-probe/_index.md" >}}). For details, see {{% xref "/filterx/filterx-metrics/_index.md" %}}. + ## upper Converts all characters of a string uppercase characters. From 3d6c16a37ef2c7f3664059b375cb3e9812a4cede Mon Sep 17 00:00:00 2001 From: Robert Fekete Date: Mon, 11 Nov 2024 13:16:11 +0100 Subject: [PATCH 14/18] [4.9][filterx] Updates filter function reference lists Signed-off-by: Robert Fekete --- content/filterx/_index.md | 11 ++++++++++- content/filterx/function-reference.md | 24 ++++++++++++++++++++++++ 2 files changed, 34 insertions(+), 1 deletion(-) diff --git a/content/filterx/_index.md b/content/filterx/_index.md index fc70842b..0535d8f9 100644 --- a/content/filterx/_index.md +++ b/content/filterx/_index.md @@ -361,23 +361,32 @@ For details, see {{% xref "/filterx/operator-reference.md" %}}. FilterX has the following built-in functions. - [`cache_json_file`]({{< relref "/filterx/function-reference.md#cache-json-file" >}}): Loads an external JSON file to lookup contextual information. +- [`endswith`]({{< relref "/filterx/filterx-string-search/_index.md" >}}): Checks if a string ends with the specified value. - [`flatten`]({{< relref "/filterx/function-reference.md#flatten" >}}): Flattens the nested elements of an object. 
 - [`format_csv`]({{< relref "/filterx/function-reference.md#format-csv" >}}): Formats a dictionary or a list into a comma-separated string.
 - [`format_json`]({{< relref "/filterx/function-reference.md#format-json" >}}): Dumps a JSON object into a string.
 - [`format_kv`]({{< relref "/filterx/function-reference.md#format-kv" >}}): Formats a dictionary into key=value pairs.
+- [`get_sdata`]({{< relref "/filterx/filterx-sdata/_index.md" >}}): Returns the SDATA part of an RFC5424-formatted syslog message as a JSON object.
+- [`has_sdata`]({{< relref "/filterx/filterx-sdata/_index.md" >}}): Checks if the SDATA part of the message is not empty.
+- [`includes`]({{< relref "/filterx/filterx-string-search/_index.md" >}}): Checks if a string contains a specific substring.
 - [`isodate`]({{< relref "/filterx/function-reference.md#isodate" >}}): Parses a string as a date in ISODATE format.
+- [`is_sdata_from_enterprise`]({{< relref "/filterx/filterx-sdata/_index.md" >}}): Checks if the SDATA of the message contains the specified enterprise ID.
 - [`isset`]({{< relref "/filterx/function-reference.md#isset" >}}): Checks that argument exists and its value is not empty or null.
 - [`istype`]({{< relref "/filterx/function-reference.md#istype" >}}): Checks the type of an object.
 - [`len`]({{< relref "/filterx/function-reference.md#len" >}}): Returns the length of an object.
 - [`lower`]({{< relref "/filterx/function-reference.md#lower" >}}): Converts a string into lowercase characters.
 - [`parse_csv`]({{< relref "/filterx/filterx-parsing/csv/_index.md" >}}): Parses a comma-separated or similar string.
 - [`parse_kv`]({{< relref "/filterx/filterx-parsing/key-value-parser/_index.md" >}}): Parses a string consisting of whitespace or comma-separated `key=value` pairs.
-
+- [`parse_leef`]({{< relref "/filterx/filterx-parsing/leef/_index.md" >}}): Parses a LEEF-formatted string.
+- [`parse_xml`]({{< relref "/filterx/filterx-parsing/xml/_index.md" >}}): Parses an XML object into a JSON object.
+- [`parse_windows_eventlog_xml`]({{< relref "/filterx/filterx-parsing/windows-eventlog/_index.md" >}}): Parses a Windows Event Log XML object into a JSON object. - [`regexp_search`]({{< relref "/filterx/function-reference.md#regexp-search" >}}): Searches a string using regular expressions. - [`regexp_subst`]({{< relref "/filterx/function-reference.md#regexp-subst" >}}): Rewrites a string using regular expressions. +- [`startswith`]({{< relref "/filterx/filterx-string-search/_index.md" >}}): Checks if a string begins with the specified value. - [`strptime`]({{< relref "/filterx/function-reference.md#strptime" >}}): Converts a string containing a date/time value, using a specified format string. - [`unset`]({{< relref "/filterx/function-reference.md#unset" >}}): Deletes a name-value pair, or a field from an object. - [`unset_empties`]({{< relref "/filterx/function-reference.md#unset-empties" >}}): Deletes empty fields from an object. +- [`update_metric`]({{< relref "/filterx/filterx-metrics/_index.md" >}}): Updates a labeled metric counter. - [`upper`]({{< relref "/filterx/function-reference.md#upper" >}}): Converts a string into uppercase characters. - [`vars`]({{< relref "/filterx/function-reference.md#vars" >}}): Lists the variables defined in the FilterX block. diff --git a/content/filterx/function-reference.md b/content/filterx/function-reference.md index 1d4b8990..b8bcbbe2 100644 --- a/content/filterx/function-reference.md +++ b/content/filterx/function-reference.md @@ -258,6 +258,30 @@ The `value_separator` must be a single character. The `pair_separator` can consi For details, see {{% xref "/filterx/filterx-parsing/key-value-parser/_index.md" %}}. +## parse_leef {#parse-leef} + +Parses a LEEF-formatted string. + +Usage: `parse_leef(msg)` + +For details, see {{% xref "/filterx/filterx-parsing/leef/_index.md" %}}. + +## parse_xml {#parse-xml} + +Parses an XML object into a JSON object. 
+ +Usage: `parse_xml(msg)` + +For details, see {{% xref "/filterx/filterx-parsing/xml/_index.md" %}}. + +## parse_windows_eventlog_xml {#parse-windows} + +Parses a Windows Event Log XML object into a JSON object. + +Usage: `parse_windows_eventlog_xml(msg)` + +For details, see {{% xref "/filterx/filterx-parsing/windows-eventlog/_index.md" %}}. + ## regexp_search {#regexp-search} Searches a string and returns the matches of a regular expression as a list or a dictionary. If there are no matches, the list or dictionary is empty. From 19285f66ec55bdc8eb093b348c656cafb985bc70 Mon Sep 17 00:00:00 2001 From: Robert Fekete Date: Mon, 11 Nov 2024 13:24:46 +0100 Subject: [PATCH 15/18] [4.9][filterx] [] and {} are now aliases for json_array() and json() --- content/filterx/function-reference.md | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/content/filterx/function-reference.md b/content/filterx/function-reference.md index b8bcbbe2..aed98b36 100644 --- a/content/filterx/function-reference.md +++ b/content/filterx/function-reference.md @@ -193,7 +193,13 @@ Usage: `json()` For example: ```shell -js = json({"key": "value"}); +js_dict = json({"key": "value"}); +``` + +Starting with version 4.9, you can use `{}` without the `json()` keyword as well. For example, the following creates an empty JSON object: + +```shell +js_dict = {}; ``` ## json_array {#json-array} @@ -205,7 +211,13 @@ Usage: `json_array()` For example: ```shell -list = json_array(["first_element", "second_element", "third_element"]); +js_list = json_array(["first_element", "second_element", "third_element"]); +``` + +Starting with version 4.9, you can use `[]` without the `json_array()` keyword as well. 
For example, the following creates an empty JSON list: + +```shell +js_list = []; ``` ## len From 0ff3f9580b28837253a363b4931833ea627c1ded Mon Sep 17 00:00:00 2001 From: Robert Fekete Date: Mon, 11 Nov 2024 13:54:00 +0100 Subject: [PATCH 16/18] Synchronize hugo versions for the builds --- .github/workflows/publish.yaml | 2 +- .github/workflows/staging.yaml | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-) diff --git a/.github/workflows/publish.yaml b/.github/workflows/publish.yaml index f169107d..8adba1f8 100644 --- a/.github/workflows/publish.yaml +++ b/.github/workflows/publish.yaml @@ -38,7 +38,7 @@ jobs: - name: Set up Hugo uses: peaceiris/actions-hugo@v2 with: - hugo-version: '0.119.0' + hugo-version: '0.122.0' extended: true - name: Set up Node diff --git a/.github/workflows/staging.yaml b/.github/workflows/staging.yaml index f40e6fc2..98fd3ca3 100644 --- a/.github/workflows/staging.yaml +++ b/.github/workflows/staging.yaml @@ -33,7 +33,7 @@ jobs: - name: Set up Hugo uses: peaceiris/actions-hugo@v2 with: - hugo-version: '0.110.0' + hugo-version: '0.122.0' extended: true - name: Set up Node From dd693370c93b343f25b77a5657c0ff069eeb8976 Mon Sep 17 00:00:00 2001 From: Robert Fekete Date: Mon, 11 Nov 2024 13:58:38 +0100 Subject: [PATCH 17/18] Update hugo version in ci workflow --- .github/workflows/ci.yaml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/ci.yaml b/.github/workflows/ci.yaml index 5b726e60..cc6b55e4 100644 --- a/.github/workflows/ci.yaml +++ b/.github/workflows/ci.yaml @@ -18,7 +18,7 @@ jobs: - name: Set up Hugo uses: peaceiris/actions-hugo@v2 with: - hugo-version: '0.110.0' + hugo-version: '0.122.0' extended: true - name: Set up Node From bc93e9409a7c89c5f8825e5deb5c39ed0a6d74ae Mon Sep 17 00:00:00 2001 From: Robert Fekete Date: Mon, 11 Nov 2024 15:10:06 +0100 Subject: [PATCH 18/18] [4.9][filterx] Review fixes --- content/filterx/filterx-metrics/_index.md | 6 +++--- 
.../filterx-parsing/cef/cef-parser-options/_index.md | 2 +- .../filterx-parsing/leef/leef-parser-options/_index.md | 4 +++- content/filterx/filterx-parsing/windows-eventlog/_index.md | 2 +- content/filterx/function-reference.md | 2 +- content/headless/chunk/filterx-plus-operator.md | 6 +++--- 6 files changed, 12 insertions(+), 10 deletions(-) diff --git a/content/filterx/filterx-metrics/_index.md b/content/filterx/filterx-metrics/_index.md index eda23770..ce7515c3 100644 --- a/content/filterx/filterx-metrics/_index.md +++ b/content/filterx/filterx-metrics/_index.md @@ -50,7 +50,7 @@ update_metric( "app": ${PROGRAM}, "id": ${SOURCE} }, - increment("${RAWMSG_SIZE}") + increment=${RAWMSG_SIZE} ); ``` @@ -61,7 +61,7 @@ update_metric( | Type: | dict | | Default: | `{}` | -The labels used to create separate counters, based on the fields of the messages processed by `update_metrics`. Use the following format: +The labels used to create separate counters, based on the fields of the messages processed by `update_metric`. Use the following format: ```shell labels( @@ -82,4 +82,4 @@ labels( Sets the stats level of the generated metrics. -> Note: Drivers configured with `internal(yes)` register their metrics on level 3. That way if you are creating an SCL, you can disable the built-in metrics of the driver, and create metrics manually using `update_metrics`. +> Note: Drivers configured with `internal(yes)` register their metrics on level 3. That way if you are creating an SCL, you can disable the built-in metrics of the driver, and create metrics manually using `update_metric`. diff --git a/content/filterx/filterx-parsing/cef/cef-parser-options/_index.md b/content/filterx/filterx-parsing/cef/cef-parser-options/_index.md index 0cc18477..66f0c44c 100644 --- a/content/filterx/filterx-parsing/cef/cef-parser-options/_index.md +++ b/content/filterx/filterx-parsing/cef/cef-parser-options/_index.md @@ -8,7 +8,7 @@ The `parse_cef` FilterX function has the following options. 
## pair_separator -Specifies the character or string that separates the CEF fields from each other. Default value: `|` . +Specifies the character or string that separates the key-value pairs in the extensions. Default value: ` ` (space). ## value_separator diff --git a/content/filterx/filterx-parsing/leef/leef-parser-options/_index.md b/content/filterx/filterx-parsing/leef/leef-parser-options/_index.md index b06de366..03e4215d 100644 --- a/content/filterx/filterx-parsing/leef/leef-parser-options/_index.md +++ b/content/filterx/filterx-parsing/leef/leef-parser-options/_index.md @@ -8,7 +8,9 @@ The `parse_leef` FilterX function has the following options. ## pair_separator -Specifies the character or string that separates the LEEF fields from each other. Default value: `|` . +Specifies the character or string that separates the key-value pairs in the extensions. Default value: `\t` (tab). + +LEEF v2 messages can specify their own separator. If you omit this option, the parser uses the separator provided in the LEEF v2 message. Setting this option overrides that separator during parsing. ## value_separator diff --git a/content/filterx/filterx-parsing/windows-eventlog/_index.md b/content/filterx/filterx-parsing/windows-eventlog/_index.md index 13bba33f..a38d726a 100644 --- a/content/filterx/filterx-parsing/windows-eventlog/_index.md +++ b/content/filterx/filterx-parsing/windows-eventlog/_index.md @@ -13,7 +13,7 @@ The `parse_windows_eventlog_xml()` FilterX function parses Windows Event Logs XM The parser returns false in the following cases: - The input isn't valid XML. -- The root element doesn't references the [Windows Event Log schema](https://learn.microsoft.com/en-us/windows/win32/wes/eventschema-schema) (``). Note that the parser doesn't validate the input data to the schema. +- The root element doesn't reference the [Windows Event Log schema](https://learn.microsoft.com/en-us/windows/win32/wes/eventschema-schema) (``). 
Note that the parser doesn't validate the input data to the schema. 
For example, the following converts the input XML into a JSON object: diff --git a/content/filterx/function-reference.md b/content/filterx/function-reference.md index aed98b36..375fc1a1 100644 --- a/content/filterx/function-reference.md +++ b/content/filterx/function-reference.md @@ -451,7 +451,7 @@ Usage: `unset_empties(object, options)` The `unset_empties()` function has the following options: -- `ignorecase`: Set to `false` to perform case-insensitive matching. Default value: `true`. Available in Available in {{< product >}} 4.9 and later. +- `ignorecase`: Set to `false` to perform case-sensitive matching. Default value: `true`. Available in {{< product >}} 4.9 and later. - `recursive`: Enables recursive processing of nested dictionaries. Default value: `true` - `replacement`: Replace the target elements with the value of `replacement` instead of removing them. Available in {{< product >}} 4.9 and later. - `targets`: A list of elements to remove or replace. Default value: `["", null, [], {}]`. Available in {{< product >}} 4.9 and later. diff --git a/content/headless/chunk/filterx-plus-operator.md b/content/headless/chunk/filterx-plus-operator.md index 038eb192..064fbd21 100644 --- a/content/headless/chunk/filterx-plus-operator.md +++ b/content/headless/chunk/filterx-plus-operator.md @@ -11,9 +11,9 @@ The plus operator (`+`) adds two arguments, if possible. (For example, you can't - Adding two dicts updates the dict with the values of the second operand. For example: ```shell - x = {"element1", "element2", "element3"}; - y = {"element3", "element4", "element5"}; - ${MESSAGE} = x + y; # ${MESSAGE} value is {"element1", "element2", "element3", "element4", "element5"} + x = {"key1": "value1", "key2": "value1"}; + y = {"key3": "value1", "key2": "value2"}; + ${MESSAGE} = x + y; # ${MESSAGE} value is {"key1": "value1", "key3": "value1", "key2": "value2"}; ``` Available in {{< product >}} 4.9 and later.