Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add extract_readable_strings function with advanced filtering support and reorganize string extraction code #2720

Open
wants to merge 1 commit into
base: dev
Choose a base branch
from

Conversation

fabiodepin
Copy link
Contributor

@fabiodepin fabiodepin commented Feb 13, 2025

Please sign (check) the below before submitting the Pull Request:

Link to the related issue:
References #2708

Describe changes:

  • Introduces a new function extract_readable_strings to retrieve multiple human-readable substrings from a payload, with an optional filter callback for advanced matching (e.g., FQDN detection).
  • Moves the old string extraction function (ndpi_has_human_readable_string) from nfdpi_utils.c to readable_string.c, consolidating all string extraction logic into one location.

This commit complements or replaces the existing ndpi_has_human_readable_string by providing:

  1. Multi-substring extraction.
  2. More flexible handling of textual data.
  3. An optional user-defined filter for fine-tuned processing.

…upport and reorganize string extraction code

- Introduces a new function `extract_readable_strings` to retrieve multiple 
  human-readable substrings from a payload, with an optional filter callback 
  for advanced matching (e.g., FQDN detection).
- Moves the old string extraction function (`ndpi_has_human_readable_string`) 
  from nfdpi_utils.c to readable_string.c, consolidating all string extraction
  logic into one location.

This commit complements or replaces the existing `ndpi_has_human_readable_string`
by providing:
1) Multi-substring extraction.
2) More flexible handling of textual data.
3) An optional user-defined filter for fine-tuned processing.

References ntop#2708
Copy link
Collaborator

@IvanNardi IvanNardi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a preliminary and very quick review, looking mainly at the CI results.

It is possible to keep the current ndpi_has_human_readeable_string prototype, but implement it on top of your new, more generic function?

char **items; ///< Array of strings
size_t count; ///< Number of strings currently stored
size_t capacity; ///< Allocated capacity of the list
} string_list_t;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use ndpi_ as prefix for all public functions and structures.
Here ndpi_string_list_t

size_t capacity; ///< Allocated capacity of the list
} string_list_t;

void string_list_free(string_list_t *list);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here...


void string_list_free(string_list_t *list);

string_list_t* extract_readable_strings(const unsigned char *buffer, size_t buffer_len,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... and here

@@ -0,0 +1,305 @@
/*
* ndpi_utils.c
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please rename this file in ndpi_readable_string.c and update this comment

/*
* ndpi_utils.c
*
* Copyright (C) 2011-24 - ntop.org and contributors
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copyright (C) 2025 - ntop.org and contributors

* @return A pointer to the newly allocated string_list_t structure, or NULL on failure.
*/
static string_list_t *string_list_create(size_t initial_capacity) {
string_list_t *list = calloc(1, sizeof(string_list_t));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use the internal wrapper for memory allocations, free and strdup

@fabiodepin
Copy link
Contributor Author

Just a preliminary and very quick review, looking mainly at the CI results.

It is possible to keep the current ndpi_has_human_readeable_string prototype, but implement it on top of your new, more generic function?

Yes, it is possible. I will do it soon and, while at it, I will also remove the deprecated functions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants