PersistorObjectInterface

The Persistor Object Interface

Create an object implementing this interface to store information about previously-analyzed items and information collated for reports in a data store other than the built-in SQLite one.

Your object must implement all methods defined in this reference.

All methods should handle the exceptions that can reasonably be anticipated and, if one occurs, return the specified "empty" return value. Serious exceptions, such as those that indicate bugs in the code, may be allowed to propagate.
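The exception-handling rule above can be illustrated with a minimal sketch. The names StoreUnavailable, FlakyStore, and SketchPersistor are hypothetical and are not part of the toolkit; they stand in for whatever data store and error types your implementation uses.

```python
import logging

log = logging.getLogger(__name__)


class StoreUnavailable(Exception):
    """Hypothetical error raised by the backing store; not part of the toolkit."""


class FlakyStore:
    """Hypothetical stand-in for an external data store that is offline."""

    def completed_hashes(self, engine_name):
        raise StoreUnavailable("connection refused")


class SketchPersistor:
    """Hypothetical persistor fragment showing only the error-handling pattern."""

    def __init__(self, store):
        self._store = store

    def get_previous_hashes(self, engine_name):
        # Unanticipated errors (TypeError, AttributeError, ...) are not caught
        # here and propagate, since they most likely indicate a bug.
        try:
            return sorted(self._store.completed_hashes(engine_name))
        except StoreUnavailable:
            # Anticipated failure: log it and return the documented "empty" value.
            log.exception("could not read completed hashes for %s", engine_name)
            return []
```

Catching only the specific, expected exception type keeps programming errors visible instead of silently turning them into empty results.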

Your implementation can inherit from the class cbc_binary_toolkit.state.manager.BasePersistor to ensure that the protocol is properly implemented.

force_close method

def force_close(self):

This method is currently called only from test code. It should cause the persistor to close and release all external resources (such as a database connection). It may be omitted when implementing a new persistor; the default implementation in cbc_binary_toolkit.state.manager.BasePersistor does nothing.

Parameters: None.

Returns: Nothing.

set_checkpoint method

def set_checkpoint(self, binary_hash, engine_name, checkpoint_name, checkpoint_time=None):

This method is called by the toolkit multiple times per item analyzed, per analysis engine used, to record its progress through the analysis process. Each time it is called for any given item and analysis engine name, a different "checkpoint name" is recorded. The special checkpoint name of "DONE" is used to indicate that an item has completed analysis.

Implementations may choose to store a complete history of checkpoints, or they may choose to store only the most recent checkpoint for any given combination of item and analysis engine name.

Parameters:

binary_hash (type: str) - The hash value of the item being analyzed.
engine_name (type: str) - The name of the analysis engine being run on the particular item.
checkpoint_name (type: str) - The name of the associated checkpoint that has been reached. The special checkpoint name of "DONE" is used to indicate that an item has completed analysis.
checkpoint_time (type: str, optional) - The date and time value to be saved as "when" the specified checkpoint was reached. This parameter is not normally used, except in test code. If it is specified, it should be a string value representing a date and time in ISO 8601 format. This value is assumed to be in the local timezone. If it is not specified (if it is None), the current date and time should be used.

Returns: Nothing.
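As an illustration, here is a minimal in-memory sketch of set_checkpoint that keeps only the most recent checkpoint for each combination of item and engine name. The class name InMemoryCheckpoints and its _latest attribute are hypothetical, and the sketch assumes that checkpoint_time is a string that datetime.fromisoformat can parse, which covers common ISO 8601 forms such as "2020-01-01T10:00:00".

```python
from datetime import datetime


class InMemoryCheckpoints:
    """Hypothetical store keeping only the latest checkpoint per (hash, engine)."""

    def __init__(self):
        # (binary_hash, engine_name) -> (checkpoint_name, datetime)
        self._latest = {}

    def set_checkpoint(self, binary_hash, engine_name, checkpoint_name,
                       checkpoint_time=None):
        if checkpoint_time is None:
            # No explicit time given: use the current local date and time.
            when = datetime.now()
        else:
            # Explicit time (normally only from test code), treated as local time.
            when = datetime.fromisoformat(checkpoint_time)
        # Overwrite any earlier checkpoint for this item and engine.
        self._latest[(binary_hash, engine_name)] = (checkpoint_name, when)
```

An implementation that stores the complete history would append a record on each call instead of overwriting the previous one.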

get_previous_hashes method

def get_previous_hashes(self, engine_name):

This method is called by the toolkit once during processing of a batch of items, as part of the process of deduplicating the hash values and only analyzing those that haven't already been analyzed. It should return a list of all the hashes with history saved in the data store that have previously completed analysis, corresponding to a checkpoint of "DONE" having been set with set_checkpoint.

If implementations store the history of multiple checkpoints per item and analysis engine name, only the most recent of those checkpoints should be considered for the purposes of whether the item hash is returned by this method.

Parameters:

engine_name (type: str) - The name of the analysis engine being run on the batch of items.

Returns:

(type: list) A list of all the item hashes, as string values, that have been marked as previously completed for the specified engine name. This list must be returned in sorted order. If no such item hashes exist, or if a runtime exception is handled, the empty list must be returned.
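The selection and sorting described above might look like the following sketch. It assumes a hypothetical in-memory mapping from (binary_hash, engine_name) to the most recent checkpoint name; the function name previous_hashes and the sample checkpoint name "ANALYZING" are illustrative only.

```python
def previous_hashes(latest, engine_name):
    """Return the sorted hashes whose most recent checkpoint is "DONE".

    `latest` is a hypothetical mapping of
    (binary_hash, engine_name) -> most recent checkpoint name.
    """
    return sorted(
        binary_hash
        for (binary_hash, engine), checkpoint in latest.items()
        if engine == engine_name and checkpoint == "DONE"
    )


# Illustrative data only; "ANALYZING" is a made-up checkpoint name.
latest = {
    ("ffff", "engine_a"): "DONE",
    ("0000", "engine_a"): "DONE",
    ("1234", "engine_a"): "ANALYZING",
    ("abcd", "engine_b"): "DONE",
}
completed = previous_hashes(latest, "engine_a")
```

Because sorted always returns a list, the "empty" return value for an engine with no completed items falls out naturally.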

get_unfinished_hashes method

def get_unfinished_hashes(self, engine_name):

This method is called by the toolkit when started in "restart" mode, to identify which items were in the middle of being processed in a previous run and resume processing those items. It should return a list of all the hashes with history saved in the data store that have not yet completed analysis, that is, those whose checkpoint was set with set_checkpoint to a value other than "DONE".

If implementations store the history of multiple checkpoints per item and analysis engine name, only the most recent of those checkpoints should be considered for the purposes of whether the item hash is returned by this method.

Parameters:

engine_name (type: str) - The name of the analysis engine being run on the batch of items.

Returns:

(type: list) A list of all the item hashes that have been previously registered with the data store, but have not been marked as completed. Each element of the list is a tuple of two strings: the first is the hash value, and the second is the name of the most recently recorded checkpoint for that hash value. If no such item hashes exist, or if a runtime exception is handled, the empty list must be returned.
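For an implementation that stores the full checkpoint history, the "most recent checkpoint only" rule can be sketched as follows. The record layout (binary_hash, engine_name, checkpoint_name, ISO 8601 timestamp) and the function name unfinished_hashes are assumptions made for this example, as are any checkpoint names other than "DONE".

```python
from datetime import datetime


def unfinished_hashes(history, engine_name):
    """Return (hash, latest checkpoint) tuples for items not yet "DONE".

    `history` is a hypothetical list of
    (binary_hash, engine_name, checkpoint_name, iso_timestamp) records.
    """
    latest = {}  # binary_hash -> (checkpoint_name, datetime)
    for binary_hash, engine, checkpoint, when in history:
        if engine != engine_name:
            continue
        stamp = datetime.fromisoformat(when)
        # Keep only the most recent checkpoint seen for each hash.
        if binary_hash not in latest or stamp >= latest[binary_hash][1]:
            latest[binary_hash] = (checkpoint, stamp)
    return [(h, cp) for h, (cp, _) in latest.items() if cp != "DONE"]
```

An item that passed through an intermediate checkpoint and later reached "DONE" is excluded, because only its latest record is considered.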

prune method

def prune(self, timestamp):

This method is called by the toolkit when asked to clear the database. It erases all entries from the data store with a timestamp older than a specified value.

Parameters:

timestamp (type: str) - The timestamp acting as the base time for this operation, specified in ISO 8601 format. This value is assumed to be in the local timezone. All item hash/engine name/checkpoint records with timestamps previous to this one shall be erased from the data store by this method.

Returns: Nothing.
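A minimal sketch of the pruning rule, assuming checkpoint records are held as a list of dicts with a "timestamp" key in ISO 8601 format. The function name prune_records and the record layout are hypothetical; a real persistor would issue the equivalent delete against its own data store.

```python
from datetime import datetime


def prune_records(records, timestamp):
    """Remove, in place, every record older than `timestamp`.

    `records` is a hypothetical list of dicts, each with a "timestamp" key
    holding an ISO 8601 string in local time.
    """
    cutoff = datetime.fromisoformat(timestamp)
    # Keep records stamped at or after the cutoff; erase strictly older ones.
    records[:] = [
        r for r in records
        if datetime.fromisoformat(r["timestamp"]) >= cutoff
    ]
```

Parsing both sides into datetime values avoids relying on string comparison, which is only safe when every timestamp uses exactly the same format.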

add_report_item method

def add_report_item(self, severity, engine_name, data):

This method is called by the toolkit when a new report item (containing IOC records) is to be recorded for later transmission to the Carbon Black Cloud server. It should save this report item in the data store, in a list segregated by engine name and severity level.

The report data received by this method must be treated as opaque data by the persistor. For example, a persistor might serialize it to JSON format before adding it to the data store in a field capable of storing long strings.

Parameters:

severity (type: int) - The severity level associated with the report item. This value is always between 1 and 10, inclusive.
engine_name (type: str) - The analysis engine name associated with this report item.
data (type: dict) - The report item to be stored. This item must be treated as opaque data by the persistor.

Returns: Nothing.
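The JSON approach mentioned above might be sketched like this. The class name ReportItemStore and its in-memory dictionary are hypothetical stand-ins for a real data store, and the sketch assumes the report item is JSON-serializable.

```python
import json
from collections import defaultdict


class ReportItemStore:
    """Hypothetical store that files report items by engine name and severity."""

    def __init__(self):
        # (engine_name, severity) -> list of JSON strings
        self._items = defaultdict(list)

    def add_report_item(self, severity, engine_name, data):
        # The dict is serialized as-is; its contents are never inspected.
        self._items[(engine_name, severity)].append(json.dumps(data))
```

Storing the serialized string rather than individual fields keeps the persistor independent of the report item's internal layout.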

get_current_report_items method

def get_current_report_items(self, severity, engine_name):

This method is called by the toolkit when formatting an actual report to be sent to the Carbon Black Cloud server. It should return all saved report items that are currently in the data store for the given engine name and severity level.

The report data retrieved and returned by this method must be treated as opaque data by the persistor, and must be returned in exactly the same format as it was stored with add_report_item. For example, a persistor might retrieve the item as a string from the data store, then deserialize it as JSON data before returning it.

Parameters:

severity (type: int) - The severity level associated with the report items to be retrieved. This value is always between 1 and 10, inclusive.
engine_name (type: str) - The analysis engine name associated with the report items to be retrieved.

Returns:

(type: list) A list of dict values, each of which represents a report item that was previously saved with add_report_item. This list may be returned in arbitrary order. If no such report items exist, or if a runtime exception is handled, the empty list must be returned.
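The retrieval side can be sketched in the same spirit. The class name ReportItemReader and the layout of its rows argument are assumptions for this example: rows maps (engine_name, severity) to a list of JSON strings, as if they had been written earlier by add_report_item.

```python
import json


class ReportItemReader:
    """Hypothetical reader over previously stored, JSON-serialized report items."""

    def __init__(self, rows):
        # rows: (engine_name, severity) -> list of JSON strings
        self._rows = rows

    def get_current_report_items(self, severity, engine_name):
        try:
            stored = self._rows.get((engine_name, severity), [])
            # Deserialize each item back into the dict that was originally saved.
            return [json.loads(text) for text in stored]
        except json.JSONDecodeError:
            # Anticipated failure (corrupt data): return the documented empty list.
            return []
```

Deserializing with the inverse of the function used for storage is what allows the items to come back in the same form in which they were saved.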

clear_report_items method

def clear_report_items(self, severity, engine_name):

This method is called by the toolkit after a report has been sent to the Carbon Black Cloud server, to erase the report data just sent from the data store so that it is not mistakenly sent again. Report items are retained until this point so that, if the toolkit crashes, they are sent to the server the next time the toolkit is started. This method must remove all the saved report items that are currently in the data store for the given engine name and severity level.

Parameters:

severity (type: int) - The severity level associated with the report items to be cleared. This value is always between 1 and 10, inclusive.
engine_name (type: str) - The analysis engine name associated with the report items to be cleared.

Returns: Nothing.
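The send-then-clear ordering described for this method can be illustrated with a small sketch. PendingReports and send_and_clear are hypothetical names, and send is any callable that raises an exception on failure; the toolkit's actual reporting code is not shown here.

```python
class PendingReports:
    """Hypothetical in-memory holder for report items awaiting transmission."""

    def __init__(self):
        self._items = {}  # (engine_name, severity) -> list of report items

    def add_report_item(self, severity, engine_name, data):
        self._items.setdefault((engine_name, severity), []).append(data)

    def get_current_report_items(self, severity, engine_name):
        return list(self._items.get((engine_name, severity), []))

    def clear_report_items(self, severity, engine_name):
        # Removing a key that is absent is not an error.
        self._items.pop((engine_name, severity), None)


def send_and_clear(persistor, severity, engine_name, send):
    """Hypothetical driver: clear the items only after `send` succeeds."""
    items = persistor.get_current_report_items(severity, engine_name)
    if items:
        send(items)  # if this raises, the items stay in the store for a retry
        persistor.clear_report_items(severity, engine_name)
```

Clearing only after a successful send means a failure or crash in between leaves the items in place, at the cost of a possible duplicate transmission on the next run.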

Information here is current as of Carbon Black Cloud Binary Toolkit 1.0.0
