From b2967ee75fd83aabcae7a0ce006bd87db65428a8 Mon Sep 17 00:00:00 2001
From: Tom Sellers
Date: Thu, 23 Feb 2023 10:26:35 -0600
Subject: [PATCH 1/6] mirror doc changes from recog-ruby

---
 README.md | 50 ++++++++++++++++++++++++++++--
 features/data/successful_tests.xml | 7 +++++
 2 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index f9541f65..a534495f 100644
--- a/README.md
+++ b/README.md
@@ -127,8 +127,6 @@ At least one `example` element should be present, however multiple `example` ele
 tests that `RomSShell_4.62` matches the provided regular expression and that the value of `service.version` is 4.62.
-The `param` elements contain a `pos` attribute, which indicates what capture field from the `pattern` should be extracted, or `0` for a static string. The `name` attribute is the key that will be reported in the case of a successful match and the `value` will either be a static string for `pos` values of `0` or missing and taken from the captured field.
-
 The `example` string can be base64 encoded to permit the use of unprintable characters. To signal this to Recog an `_encoding` attribute with the value of `base64` is added to the `example` element. Based64 encoded text that is longer than 80 characters may be wrapped with newlines as shown below to aid in readability.
 ```xml
@@ -155,6 +153,54 @@ They can then be loaded using the `_filename` attribute:
 This is useful for long examples.
+The `param` elements contain a `pos` attribute, which indicates what capture field
+from the `pattern` should be extracted, or `0` for a static string. The `name` attribute
+is the key that will be reported in the case of a successful match and the `value`
+will either be a static string for `pos` values of `0` or missing and taken from the
+captured field.
+
+The `value` attribute supports interpolation of data from other fields. This is
+often useful when capturing the value for `hw.product` via regex and re-using this
+value in `os.product`.
+
+Here is an example from`http_servers.xml` where `hw.product` is captured and reused.
+
+```xml
+
+ Eltex TAU model VoIP gateway
+ Eltex TAU-72
+ Eltex TAU-1.IP
+
+
+
+
+
+
+
+```
+
+There is special handling for temporary `name` attributes starting with `_tmp.` such
+that they can be used for interpolation but are not emitted in the output. This is
+useful when a particular product name is inconsistent in various banners, vendor
+marketing, or with NIST values when trying to generated CPEs. In these cases the useful
+parts of the banner can be extracted and a new value crafted without cluttering the
+data emitted by a match.
+
+```xml
+
+ NetCorp NX series switches
+ foo baz switchThing-8200
+
+
+
+
+```
+
+In order to reduce churn in the `identifiers/fields.txt` file any `names` values starting
+with `_tmp.` should be followed by three digits such as `_tmp.001`, `_tmp.002`, etc. These
+only need to be unique within a specific fingerprint and so there shouldn't generally be
+a need for many of them.
+
 [^back to top](#recog-a-recognition-framework)
 ## Contributing
diff --git a/features/data/successful_tests.xml b/features/data/successful_tests.xml
index 23772467..85875320 100755
--- a/features/data/successful_tests.xml
+++ b/features/data/successful_tests.xml
@@ -15,4 +15,11 @@
+
+ test of temp params
+ foo sb-1.0
+
+
+
+

From e2dce3b96f608bd9214993b578d2a5e8589f5d4f Mon Sep 17 00:00:00 2001
From: Tom Sellers
Date: Thu, 2 Mar 2023 08:11:22 -0600
Subject: [PATCH 2/6] Adjust tracking of _tmp attributes

---
 README.md | 17 +++++++----------
 bin/recog_standardize | 4 +++-
 2 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/README.md b/README.md
index a534495f..1b495ba6 100644
--- a/README.md
+++ b/README.md
@@ -179,12 +179,12 @@ Here is an example from`http_servers.xml` where `hw.product` is captured and reu
 ```
-There is special handling for temporary `name` attributes starting with `_tmp.` such
-that they can be used for interpolation but are not emitted in the output. This is
-useful when a particular product name is inconsistent in various banners, vendor
-marketing, or with NIST values when trying to generated CPEs. In these cases the useful
-parts of the banner can be extracted and a new value crafted without cluttering the
-data emitted by a match.
+There is special handling for temporary attributes that have a name starting with
+`_tmp.`. These attributes can be used for interpolation but are not emitted in the
+output. This is useful when a particular product name is inconsistent in various
+banners, vendor marketing, or with NIST values when trying to generated CPEs. In
+these cases the useful parts of the banner can be extracted and a new value
+crafted without cluttering the data emitted by a match.
 ```xml
@@ -196,10 +196,7 @@ data emitted by a match.
 ```
-In order to reduce churn in the `identifiers/fields.txt` file any `names` values starting
-with `_tmp.` should be followed by three digits such as `_tmp.001`, `_tmp.002`, etc.
These
-only need to be unique within a specific fingerprint and so there shouldn't generally be
-a need for many of them.
+These temporary attributes are not tracked in the `identifiers/fields.txt`.
 [^back to top](#recog-a-recognition-framework)
diff --git a/bin/recog_standardize b/bin/recog_standardize
index 4cb6f28a..99582c55 100755
--- a/bin/recog_standardize
+++ b/bin/recog_standardize
@@ -59,7 +59,7 @@ end
 # @param current [Hash] Indentifiers extracted from fingerprints
 # @param original [Hash] Indentifiers loaded from the existing identifiers file
-# param msg [String] Context to include in messaging to user
+# @param msg [String] Context to include in messaging to user
 # @param ident_type [String] Key used to get the identifier file path
 # @param write [Boolean] Indicate if changes should be written to disk
 def handle_changes(current, original, msg, ident_type, write)
@@ -151,6 +151,8 @@ ARGV.each do |arg|
 ndb.fingerprints.each do |f|
 f.params.each do |k, v|
+ # Don't track temporary attributes.
+ next if k.start_with?("_tmp.")
 curr_fields[k] = true
 param_index, val = v

From 865b5590fa9a8c4881f2935dce82437d72904801 Mon Sep 17 00:00:00 2001
From: Tom Sellers
Date: Thu, 2 Mar 2023 08:30:57 -0600
Subject: [PATCH 3/6] Update result count in verify.feature

---
 features/verify.feature | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/features/verify.feature b/features/verify.feature
index 1675e372..843d9856 100644
--- a/features/verify.feature
+++ b/features/verify.feature
@@ -12,7 +12,7 @@ Feature: Verify
 When I run `recog_verify successful_tests.xml`
 Then it should pass with exactly:
 """
- successful_tests.xml: SUMMARY: Test completed with 4 successful, 0 warnings, and 0 failures
+ successful_tests.xml: SUMMARY: Test completed with 5 successful, 0 warnings, and 0 failures
 """
 @no-clobber

From 75bb1e60430e26529f698258aab719faa6fa477e Mon Sep 17 00:00:00 2001
From: Tom Sellers
Date: Thu, 2 Mar 2023 08:53:32 -0600
Subject: [PATCH 4/6] Require recog-ruby 3.0.5 or higher

---
 Gemfile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Gemfile b/Gemfile
index 2890fd83..d4c9541b 100644
--- a/Gemfile
+++ b/Gemfile
@@ -2,7 +2,7 @@ source 'https://rubygems.org'
 gemspec name: 'recog-content'
-gem 'recog', '~>3.0'
+gem 'recog', '~>3.0.5'
 group :test do
 gem 'rake'

From 673b37c1916979e77f9445d793c77a46b770723b Mon Sep 17 00:00:00 2001
From: Tom Sellers
Date: Fri, 3 Mar 2023 07:37:30 -0600
Subject: [PATCH 5/6] fix typo

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 1b495ba6..52bd4a64 100644
--- a/README.md
+++ b/README.md
@@ -182,7 +182,7 @@ Here is an example from`http_servers.xml` where `hw.product` is captured and reu
 There is special handling for temporary attributes that have a name starting with
 `_tmp.`. These attributes can be used for interpolation but are not emitted in the
 output.
This is useful when a particular product name is inconsistent in various
-banners, vendor marketing, or with NIST values when trying to generated CPEs. In
+banners, vendor marketing, or with NIST values when trying to generate CPEs. In
 these cases the useful parts of the banner can be extracted and a new value
 crafted without cluttering the data emitted by a match.

From f30480f1402fe7d8c1a583c55e0b8afe7474cd5f Mon Sep 17 00:00:00 2001
From: Tom Sellers
Date: Fri, 3 Mar 2023 07:40:15 -0600
Subject: [PATCH 6/6] Correct issue with requirements

---
 Gemfile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Gemfile b/Gemfile
index d4c9541b..be07f2bc 100644
--- a/Gemfile
+++ b/Gemfile
@@ -2,7 +2,7 @@ source 'https://rubygems.org'
 gemspec name: 'recog-content'
-gem 'recog', '~>3.0.5'
+gem 'recog', '~>3.1'
 group :test do
 gem 'rake'
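
Note on patches 1/6 and 2/6: the README text describes temporary `_tmp.` attributes that can feed interpolation but are not emitted. The behaviour can be illustrated with a minimal Ruby sketch. This is not the recog-ruby implementation. The helper name `resolve_params`, the `{field}` placeholder syntax, and the sample pattern are assumptions made for illustration; only the `name => [pos, value]` shape and the `start_with?("_tmp.")` check mirror what the `bin/recog_standardize` hunk shows.

```ruby
# Hypothetical helper (not part of recog): resolves params for one match,
# interpolates "{field}" placeholders, then drops temporary "_tmp." fields.
#
# params   - Hash of name => [pos, value]; pos 0 means a static value.
# captures - MatchData (or Array) indexed by capture position.
def resolve_params(params, captures)
  # Step 1: take the static value for pos 0, otherwise the captured group.
  resolved = {}
  params.each do |name, (pos, value)|
    resolved[name] = pos.zero? ? value : captures[pos]
  end

  # Step 2: single-pass interpolation of "{field}" placeholders
  # (assumed syntax); unknown fields become an empty string.
  interpolated = resolved.transform_values do |value|
    value.to_s.gsub(/\{([^}]+)\}/) { resolved.fetch(Regexp.last_match(1), '').to_s }
  end

  # Step 3: temporary attributes are usable above but never emitted.
  interpolated.reject { |name, _| name.start_with?('_tmp.') }
end

# Illustrative pattern and params, loosely based on the README example banner.
match = /^foo \S+ switchThing-(\d+)$/.match('foo baz switchThing-8200')
params = {
  '_tmp.001'   => [1, nil],             # captured, temporary
  'hw.product' => [0, 'NX{_tmp.001}']   # static, built from the temporary
}
result = resolve_params(params, match)
# result is { 'hw.product' => 'NX8200' }
```

The interpolation here is a single pass, so a value built from another interpolated value would not be expanded twice; whether recog-ruby behaves the same way is not stated in this series.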