Commit

Merge pull request #218 from network-intelligence/dev
Merging dev into trunk
davidmcgrew authored and GitHub Enterprise committed Nov 21, 2023
2 parents 6b181c7 + 27274a6 commit 75f6ad6
Showing 19 changed files with 298 additions and 41 deletions.
1 change: 1 addition & 0 deletions .github/workflows/docker-image.yml
@@ -1,6 +1,7 @@
name: Docker Image Build and Publish

on:
+workflow_dispatch:
push:
branches:
- main
2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
-2.5.22
+2.5.23
5 changes: 5 additions & 0 deletions doc/CHANGELOG.md
@@ -1,5 +1,10 @@
# CHANGELOG for Mercury

+## Version 2.5.23
+
+* (Significantly) improved the encrypted/compressed archive reader speed.
+* Process attributes now reported through the `mercury_packet_processor_get_attributes` function.
+
## Version 2.5.22

* JSON records created from incomplete TCP segments are now highlighted with `"reassembly_properties": { "truncated": true }`.
19 changes: 16 additions & 3 deletions doc/batch-gcd.md
@@ -29,14 +29,25 @@ Authors: Brandon Enright, Andrew Chi, David McGrew

## Installation

-The easiest way to try Batch GCD is by building a Docker image of the `mercury`
-package. We also provide instructions for building the mercury package from source.
+The easiest way to try Batch GCD is by pulling or building a Docker image of the
+`mercury` package. We also provide instructions for building the mercury
+package from source.

### Docker

The Docker image for `mercury` also includes the tools `batch_gcd`,
`cert_analyze`, and `tls_scanner`. Note that depending on your Docker
configuration, the docker command may require sudo.
+
+**Option 1.** Pull a pre-built Docker image from GitHub's container repository.
+```
+$ docker pull ghcr.io/cisco/mercury:latest
+$ alias batch_gcd='docker run --rm -i --entrypoint /usr/local/bin/batch_gcd --volume .:/root ghcr.io/cisco/mercury:latest'
+$ alias cert_analyze='docker run --rm -i --entrypoint /usr/local/bin/cert_analyze --volume .:/root ghcr.io/cisco/mercury:latest'
+$ alias tls_scanner='docker run --rm -i --entrypoint /usr/local/bin/tls_scanner --volume .:/root ghcr.io/cisco/mercury:latest'
+```
+
+**Option 2.** Build the Docker image using the source code.
```
$ git clone https://github.com/cisco/mercury.git
$ cd mercury
@@ -45,7 +56,9 @@ $ alias batch_gcd='docker run --rm -i --entrypoint /usr/local/bin/batch_gcd --vo
$ alias cert_analyze='docker run --rm -i --entrypoint /usr/local/bin/cert_analyze --volume .:/root mercury:latest'
$ alias tls_scanner='docker run --rm -i --entrypoint /usr/local/bin/tls_scanner --volume .:/root mercury:latest'
```
-You can then run the commands with `--help`. For example: `batch_gcd --help`.
+
+For usage information, run each command with `--help`. For example: `batch_gcd
+--help`.

### Build from source

2 changes: 1 addition & 1 deletion doc/npf.md
@@ -2,7 +2,7 @@

**Draft**

-January 4, 2022
+December 6, 2022

David McGrew (Editor)

2 changes: 1 addition & 1 deletion src/Makefile.in
@@ -290,7 +290,7 @@ uninstall:
@echo "local captures not removed; to do that, run 'rm -rf $(localstatedir)'"

.PHONY: uninstall-certtools
-uninstall:
+uninstall-certtools:
rm -f $(bindir)/batch_gcd
rm -f $(bindir)/cert_analyze
rm -f $(bindir)/tls_scanner
56 changes: 36 additions & 20 deletions src/archive_reader.cc
@@ -44,14 +44,14 @@ int main(int argc, char *argv[]) {

const char summary[] =
"usage:\n"
-" archive_reader <archive> [OPTIONS]\n"
+" archive_reader [OPTIONS]\n"
"\n"
"OPTIONS\n"
;
class option_processor opt({
-{ argument::positional, "archive", "read file <archive>" },
+{ argument::required, "--archive", "read file <archive>" },
{ argument::required, "--directory", "set the directory to <arg>" },
-{ argument::required, "--decrypt", "decrypt using key <arg>" },
+{ argument::required, "--decrypt", "decrypt using key from file <arg>" },
{ argument::none, "--extract", "extract archive" },
{ argument::none, "--list", "list archive entries" },
{ argument::none, "--dump", "dump archive entries" },
@@ -62,21 +62,27 @@ int main(int argc, char *argv[]) {
return EXIT_FAILURE;
}

-auto [ archive_is_set, archive ] = opt.get_value("archive");
+auto [ archive_is_set, archive ] = opt.get_value("--archive");
auto [ dir_is_set, directory ] = opt.get_value("--directory");
auto [ key_is_set, key_str ] = opt.get_value("--decrypt");
bool list = opt.is_set("--list");
bool dump = opt.is_set("--dump");
bool extract = opt.is_set("--extract");
bool print_help = opt.is_set("--help");

+if (!archive_is_set) {
+fprintf(stderr, "error: no archive specified on command line\n");
+opt.usage(stdout, argv[0], summary);
+return EXIT_FAILURE;
+}
+
if (!list && !dump && !extract && !print_help) {
fprintf(stderr, "warning: no actions specified on command line\n");
}

if (print_help) {
opt.usage(stdout, argv[0], summary);
-return 0;
+return EXIT_SUCCESS;
}

if (dir_is_set) {
@@ -86,21 +92,31 @@ int main(int argc, char *argv[]) {
}
}

-// uint8_t key[32];
-// uint8_t iv[32];
-// if (key_is_set) {
-unsigned char key[16] = {
-0xa4, 0x65, 0xd1, 0xd6, 0xed, 0xaa, 0xb0, 0xbb,
-0x01, 0x69, 0xe0, 0xf8, 0xb0, 0x10, 0x8e, 0xfd
-};
-// unsigned char iv[16] = {
-// 0x3d, 0x13, 0xe0, 0x0e, 0x90, 0xc4, 0x08, 0x2d,
-// 0x02, 0x1c, 0x5a, 0xa9, 0x34, 0xdb, 0x57, 0x5a
-// };
-// hex_to_raw(key, sizeof(key), key_str.c_str());
-// hex_to_raw(iv, sizeof(key), key_str.c_str()+32);
-
-uint8_t *k = key_is_set ? key : nullptr;
+// set the key k to that provided in the file specified in the
+// --decrypt option, if present, or to nullptr otherwise
+//
+uint8_t *k = nullptr;
+unsigned char key[16] = { 0x00, };
+if (key_is_set) {
+FILE *keyfile = fopen(key_str.c_str(), "r");
+if (keyfile == nullptr) {
+fprintf(stderr, "error: could not open key file %s\n", key_str.c_str());
+return EXIT_FAILURE;
+}
+char raw_key[32];
+size_t bytes_read = fread(raw_key, sizeof(char), sizeof(raw_key), keyfile);
+if (bytes_read != sizeof(raw_key)) {
+fprintf(stderr, "error: could not read key from file %s (got %zu)\n", key_str.c_str(), bytes_read);
+return EXIT_FAILURE;
+}
+ssize_t raw_bytes = hex_to_raw(key, sizeof(key), raw_key);
+if (raw_bytes != 16) {
+fprintf(stderr, "error: could not convert input string into raw key (expected 32 hex chars)\n");
+opt.usage(stderr, argv[0], summary);
+return EXIT_FAILURE;
+}
+k = key;
+}

const char *archive_file_name = archive.c_str();
class encrypted_compressed_archive tar{archive_file_name, k};
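Review note: the hunk above replaces a hard-coded key with one read from the file named by `--decrypt`: 32 hex characters are read and converted into a 16-byte key. The following is a minimal standalone sketch of that flow, not mercury's code. `hex_digit`, `parse_hex_key`, and `read_key_file` are hypothetical names introduced here for illustration, and the hex decoding is a stand-in for mercury's `hex_to_raw`, whose exact behavior is not shown in this diff.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper (not mercury's hex_to_raw): decode one hex digit,
// returning -1 for any character that is not a hex digit.
static int hex_digit(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Decode exactly 32 hex characters into a 16-byte key.
// Returns false if the length is wrong or a non-hex character is found.
bool parse_hex_key(const std::string &hex, std::array<uint8_t, 16> &key) {
    if (hex.size() != 2 * key.size()) {
        return false;
    }
    for (size_t i = 0; i < key.size(); i++) {
        int hi = hex_digit(hex[2 * i]);
        int lo = hex_digit(hex[2 * i + 1]);
        if (hi < 0 || lo < 0) {
            return false;
        }
        key[i] = static_cast<uint8_t>((hi << 4) | lo);
    }
    return true;
}

// Read the first 32 characters of the file at path and decode them.
// Unlike the hunk above, this sketch closes the file before returning.
bool read_key_file(const char *path, std::array<uint8_t, 16> &key) {
    FILE *f = std::fopen(path, "r");
    if (f == nullptr) {
        return false;
    }
    char buf[32];
    size_t n = std::fread(buf, 1, sizeof(buf), f);
    std::fclose(f);
    if (n != sizeof(buf)) {
        return false;   // file shorter than 32 characters
    }
    return parse_hex_key(std::string(buf, n), key);
}
```

Keeping the decoding separate from the file I/O makes the length and character checks testable without touching the filesystem.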
6 changes: 5 additions & 1 deletion src/libmerc/analysis.h
@@ -641,7 +641,7 @@ class fingerprint_data {
attr_prob[common->doh_idx] = 1.0;
}

-attribute_result attr_res{attr_tags, attr_prob, &common->attr_name.value()};
+attribute_result attr_res{attr_tags, attr_prob, &common->attr_name.value(), common->attr_name.get_names_char()};

// set os_info (to NULL if unavailable)
//
@@ -865,6 +865,10 @@ class classifier {

rapidjson::Document fp;
fp.Parse(line_str.c_str());
+if(!fp.IsObject()) {
+printf_err(log_warning, "invalid JSON line in resource file\n");
+return;
+}

std::string fp_string;
if (fp.HasMember("str_repr") && fp["str_repr"].IsString()) {
73 changes: 64 additions & 9 deletions src/libmerc/archive.h
@@ -335,6 +335,9 @@ class deflate_header {

class gz_file {
unsigned char file_buffer[512];
+std::string remaining_file_buffer; // this buffer is used to cache the extra characters read.
+ssize_t remaining_file_buffer_len;
+ssize_t characters_to_read = 512;
z_stream_s z = {};
encrypted_file enc_file;

@@ -379,6 +382,9 @@ class gz_file {
//close(fd);
//fd = 0;
}
+
+remaining_file_buffer = "";
+remaining_file_buffer_len = 0;
}

~gz_file() {
@@ -488,22 +494,71 @@ class gz_file {
return z.total_out;
}

+// getline read the characters until \n is found
+// and returns the length of characters in current line
ssize_t getline(std::string &s, ssize_t read_len) {
s.clear();
-ssize_t length = 0;
-while (length < read_len) {
-char c;
-ssize_t bytes_read = read((uint8_t *)&c, sizeof(c));
-if (bytes_read <= 0) {
+ssize_t characters_in_s = 0;
+int i;
+bool newline_found = false;
+
+std::string backup = remaining_file_buffer; // backup remaining_file_buffer
+ssize_t backup_len = remaining_file_buffer_len;
+
+remaining_file_buffer = ""; // reset remaining_file_buffer
+remaining_file_buffer_len = 0;
+
+// processing the remaining characters from last read()
+for (i = 0; i < backup_len; i++) {
+if (backup[i] == '\n') { // backup string had an entire line
+newline_found = true;
+break;
+}
+s += backup[i];
+characters_in_s += 1;
+}
+
+if (newline_found && characters_in_s != 0) {
+i += 1; // skip \n
+remaining_file_buffer.assign(backup, i); // update remaining_file_buffer
+remaining_file_buffer_len = backup_len - i;
+return characters_in_s;
+}
+
+ssize_t characters_read_in_this_iteration = 0;
+while (characters_read_in_this_iteration < read_len) {
+
+if (characters_to_read > read_len - characters_read_in_this_iteration) {
+characters_to_read = read_len - characters_read_in_this_iteration;
+}
+
+char c[characters_to_read+1] = "";
+ssize_t characters_read = read((uint8_t *)&c, characters_to_read);
+characters_read_in_this_iteration += characters_read;
+c[characters_to_read] = '\0';
+if (characters_read <= 0) {
break;
}
-if (c == '\n') {
+
+newline_found = false;
+for (i = 0; i < characters_read; i++) {
+if (c[i] == '\n') {
+newline_found = true;
+break;
+}
+s += c[i];
+characters_in_s += 1;
+}
+
+if (newline_found) {
+i += 1; // skip \n
+
+remaining_file_buffer.assign(c+i); // update remaining_file_buffer
+remaining_file_buffer_len = characters_read-i;
break;
}
-s += c;
-length++;
}
-return length;
+return characters_in_s;
}
};
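Review note: the `getline` rewrite above replaces one-byte-at-a-time reads with chunked reads, caching whatever follows the newline for the next call. The sketch below shows that general technique in portable C++. It is not the `gz_file` implementation and differs from it in several ways: it has no `read_len` limit, returns a `bool` instead of a length, and uses a fixed-size buffer rather than a variable-length array. `buffered_line_reader` and its `read_fn` callback are hypothetical names introduced for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>

// Chunked line reader: pulls up to 512 bytes at a time from read_fn and
// keeps the bytes that follow a newline in `pending` for the next call.
class buffered_line_reader {
    // read_fn copies up to len bytes into buf and returns the number of
    // bytes copied; a return value of 0 means end of input.
    std::function<size_t(char *, size_t)> read_fn;
    std::string pending;   // bytes already read but not yet returned
    bool eof = false;

public:
    explicit buffered_line_reader(std::function<size_t(char *, size_t)> f)
        : read_fn(std::move(f)) {}

    // Stores the next line (without its trailing '\n') in `line`.
    // Returns false once the input is exhausted.
    bool getline(std::string &line) {
        line.clear();
        for (;;) {
            size_t pos = pending.find('\n');
            if (pos != std::string::npos) {
                line.assign(pending, 0, pos);   // the line itself
                pending.erase(0, pos + 1);      // drop the line and its '\n'
                return true;
            }
            if (eof) {
                break;
            }
            char chunk[512];
            size_t n = read_fn(chunk, sizeof(chunk));
            if (n == 0) {
                eof = true;
                break;
            }
            pending.append(chunk, n);   // length-based, so NUL bytes survive
        }
        if (pending.empty()) {
            return false;
        }
        line.swap(pending);   // final line with no trailing newline
        return true;
    }
};
```

Appending with an explicit length, rather than treating the chunk as a NUL-terminated string, keeps the cache correct when a read returns fewer bytes than requested or the data contains zero bytes.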

2 changes: 1 addition & 1 deletion src/libmerc/dns.h
@@ -817,7 +817,7 @@ struct dns_packet : public base_protocol {
if (header == NULL) {
return;
}
-const char *key = encoded<uint16_t>{header->flags}.bit<0>() ? "response" : "query";
+const char *key = encoded<uint16_t>{ntoh(header->flags)}.bit<0>() ? "response" : "query";
struct json_object dns_json{o, key};
//dns_json.print_key_uint("qdcount", qdcount);
//dns_json.print_key_uint("ancount", ancount);
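Review note: the one-line fix above converts `header->flags` from network to host byte order before testing the query/response bit. The sketch below illustrates why the conversion matters. It assumes that `bit<0>()` in mercury's `encoded` type refers to the most significant bit, which is where RFC 1035 places the QR flag; that numbering is not shown in this diff. `read_be16`, `read_le16`, and `is_response` are hypothetical helpers, and the byte values are illustrative.

```cpp
#include <cassert>
#include <cstdint>

// Assemble a 16-bit value from two wire bytes in network (big-endian) order.
inline uint16_t read_be16(const uint8_t *p) {
    return static_cast<uint16_t>((p[0] << 8) | p[1]);
}

// What a little-endian host gets when it loads the same two bytes
// directly, without any byte-order conversion.
inline uint16_t read_le16(const uint8_t *p) {
    return static_cast<uint16_t>((p[1] << 8) | p[0]);
}

// QR is the most significant bit of the 16-bit DNS flags field
// (RFC 1035, section 4.1.1): 0 for a query, 1 for a response.
inline bool is_response(uint16_t flags_host_order) {
    return (flags_host_order & 0x8000) != 0;
}
```

For the wire bytes `0x80 0x00` (QR set), the big-endian read reports a response while the unconverted little-endian read reports a query; for `0x00 0x80` the error goes the other way.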
14 changes: 14 additions & 0 deletions src/libmerc/libmerc.cc
@@ -307,6 +307,20 @@ bool mercury_write_stats_data(mercury_context mc, const char *stats_data_file_pa
return true;
}

+const struct attribute_context *mercury_packet_processor_get_attributes(mercury_packet_processor processor) {
+try {
+if (processor && processor->analysis.result.attr.is_valid()){
+return processor->analysis.result.attr.get_attributes();
+}
+return NULL;
+}
+catch (std::exception &e) {
+printf_err(log_err, "%s\n", e.what());
+}
+
+return NULL;
+
+}

const char license_string[] =
"Copyright (c) 2019-2020 Cisco Systems, Inc.\n"
43 changes: 43 additions & 0 deletions src/libmerc/libmerc.h
@@ -151,6 +151,30 @@ struct libmerc_config {
#endif
};

+/**
+* @brief struct attribute_context represents the attribute result given
+* by the libmerc library. It comprises a list of attributes, the corresponding
+* probability scores, and the maximum number of attributes (the length of those lists)
+*
+* To fetch the attribute_context, call mercury_packet_processor_get_analysis_context,
+* followed by mercury_packet_processor_get_attributes
+*/
+struct attribute_context {
+
+#ifdef __cplusplus
+// default values, for c++ only
+const char * const* tag_names = nullptr; /* List of attribute names as an array of char* */
+const long double* prob_scores = nullptr; /* Probability scores of the corresponding attributes above, as an array */
+size_t attributes_len = 0; /* Length of the above arrays */
+
+#else
+
+const char * const* tag_names; /* List of attribute names as an array of char* */
+const long double* prob_scores; /* Probability scores of the corresponding attributes above, as an array */
+size_t attributes_len; /* Length of the above arrays */
+#endif
+};
+
/**
* libmerc_config_init() initializes a libmerc_config structure to a
* minimal, default configuration.
@@ -714,4 +738,23 @@ extern "C" LIBMERC_DLL_EXPORTED
#endif
size_t get_stats_aggregator_num_entries(mercury_context mc);

+//
+// start of libmerc version 6 API
+//
+
+/**
+* mercury_packet_processor_get_attributes() is a follow-up call to mercury_packet_processor_get_analysis_context
+* and returns a pointer to a struct attribute_context containing zero or more attributes, such as MITRE techniques
+* and mercury-specific tags, for a given analysis_context.
+*
+* @param processor (input) is a packet processor context to be used.
+*
+* @return a pointer to a struct attribute_context if the call succeeds, otherwise NULL
+*
+*/
+#ifdef __cplusplus
+extern "C" LIBMERC_DLL_EXPORTED
+#endif
+const struct attribute_context *mercury_packet_processor_get_attributes(mercury_packet_processor processor);
+
#endif /* LIBMERC_H */
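Review note: the new `attribute_context` struct exposes two parallel arrays, `tag_names` and `prob_scores`, each `attributes_len` entries long. The sketch below shows how a caller might walk them after obtaining the pointer from `mercury_packet_processor_get_attributes`. So that it compiles on its own, it declares a local copy of the struct that mirrors the C++ branch of the header; real code would include `libmerc.h` instead. `format_attributes` is a hypothetical helper, not part of libmerc.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Local mirror of struct attribute_context from libmerc.h (C++ branch),
// repeated here only so that this sketch is self-contained.
struct attribute_context {
    const char * const *tag_names = nullptr;   // attribute names
    const long double *prob_scores = nullptr;  // matching probability scores
    size_t attributes_len = 0;                 // length of both arrays
};

// Hypothetical helper: render the attributes as "name=score, name=score".
// Returns an empty string for a NULL context, which is what the getter
// returns when no valid attribute result is available.
std::string format_attributes(const attribute_context *ctx) {
    std::string out;
    if (ctx == nullptr || ctx->tag_names == nullptr || ctx->prob_scores == nullptr) {
        return out;
    }
    for (size_t i = 0; i < ctx->attributes_len; i++) {
        char score[64];
        std::snprintf(score, sizeof(score), "%.2Lf", ctx->prob_scores[i]);
        if (!out.empty()) {
            out += ", ";
        }
        out += ctx->tag_names[i];
        out += "=";
        out += score;
    }
    return out;
}
```

Because the function returns a `const` pointer, the arrays presumably belong to the packet processor, so a caller that needs the values later would copy them out as done here. The diff does not state how long the pointer stays valid.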