Handle timeout errors when calling fact_extractor, related to #820 #852

Merged (2 commits) on Sep 22, 2022
Handle timeout errors when calling fact_extractor, related to #820
Timeouts are handled by catching all requests exceptions, because requests raises ConnectionError on a timeout.
Oren Isacson committed Sep 8, 2022
commit 9fc393cccb47013320c0d76b339967025109d76d
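The pattern described in the commit message can be sketched as follows. This is a minimal stand-alone illustration, not FACT's code: `FakeRequestException` and `run_container` are hypothetical stand-ins for `requests.exceptions.RequestException` (the common base class of requests' `ConnectionError` and `Timeout`) and FACT's `run_docker_container`:

```python
import logging

class FakeRequestException(Exception):
    """Stand-in for requests.exceptions.RequestException."""

def run_container():
    # Stand-in for run_docker_container; simulates a client-side timeout.
    raise FakeRequestException('read timed out')

def extract_files():
    try:
        return run_container()
    except FakeRequestException as err:
        # Mirror the PR: log a warning and signal failure to the caller
        # by returning None instead of letting the exception propagate.
        logging.warning('Request exception executing docker extractor:\n%s', err)
        return None

print(extract_files())  # prints: None
```

The caller can then treat `None` as "unpacking failed" and store the skip info, exactly as `unpack()` does in the diff below.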
13 changes: 13 additions & 0 deletions src/unpacker/unpack.py
@@ -36,6 +36,10 @@ def unpack(self, current_fo: FileObject):
with TemporaryDirectory(prefix='fact_unpack_', dir=self.config['data-storage']['docker-mount-base-dir']) as tmp_dir:
file_path = self._generate_local_file_path(current_fo)
extracted_files = self.extract_files_from_file(file_path, tmp_dir)
if extracted_files is None:
self._store_unpacking_error_skip_info(current_fo)
return []

extracted_file_objects = self.generate_and_store_file_objects(extracted_files, Path(tmp_dir) / 'files', current_fo)
extracted_file_objects = self.remove_duplicates(extracted_file_objects, current_fo)
self.add_included_files_to_object(extracted_file_objects, current_fo)
@@ -44,6 +48,15 @@ def unpack(self, current_fo: FileObject):

return extracted_file_objects

@staticmethod
def _store_unpacking_error_skip_info(file_object: FileObject):
file_object.processed_analysis['unpacker'] = {
'plugin_used': 'None', 'number_of_unpacked_files': 0, 'plugin_version': '0.0', 'analysis_date': time(),
'info': 'Unpacking stopped because extractor raised a exception (possible timeout)',
}
tag_dict = {'unpacker': {'extractor error': {'value': 'possible extractor timeout', 'color': TagColor.ORANGE, 'propagate': False}}}
file_object.analysis_tags.update(tag_dict)
Collaborator:
The tags don't seem to be saved correctly in the database. But something like this should work:

Suggested change
file_object.processed_analysis['unpacker'] = {
'plugin_used': 'None', 'number_of_unpacked_files': 0, 'plugin_version': '0.0', 'analysis_date': time(),
'info': 'Unpacking stopped because extractor raised a exception (possible timeout)',
}
tag_dict = {'unpacker': {'extractor error': {'value': 'possible extractor timeout', 'color': TagColor.ORANGE, 'propagate': False}}}
file_object.analysis_tags.update(tag_dict)
file_object.processed_analysis['unpacker'] = {
'plugin_used': 'None', 'number_of_unpacked_files': 0, 'plugin_version': '0.0', 'analysis_date': time(),
'info': 'Unpacking stopped because extractor raised a exception (possible timeout)',
'tags': {'extractor error': {'value': 'possible extractor timeout', 'color': TagColor.ORANGE, 'propagate': False}},
}

This also applies to the _store_unpacking_depth_skip_info() method below
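A minimal sketch of the structure the reviewer suggests, using a plain dict in place of FACT's `FileObject.processed_analysis` and a string in place of `TagColor.ORANGE` (both are assumptions for illustration): the tags travel inside the `processed_analysis` entry instead of a separate `analysis_tags` dict, so they are persisted together with the analysis result.

```python
from time import time

# Plain dict standing in for FileObject.processed_analysis.
processed_analysis = {
    'unpacker': {
        'plugin_used': 'None',
        'number_of_unpacked_files': 0,
        'plugin_version': '0.0',
        'analysis_date': time(),
        'info': 'Unpacking stopped because extractor raised an exception (possible timeout)',
        # Tags nested under the same analysis entry, per the suggestion.
        'tags': {
            'extractor error': {
                'value': 'possible extractor timeout',
                'color': 'orange',  # stand-in for TagColor.ORANGE
                'propagate': False,
            }
        },
    }
}

print(sorted(processed_analysis['unpacker']['tags']))  # prints: ['extractor error']
```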

Author:
I pushed the suggested change (untested).

Author:
Converting to draft, as the changes are going to cause unneeded timeouts when using fact_extractor in standalone mode. I think the correct way to implement this is to get the timeout value from fact_core as a parameter.

Collaborator:
> Converting to draft, as the changes are going to cause unneeded timeouts when using fact_extractor in standalone mode.

I'm not sure I follow. The changes in this PR should not affect the extractor. Did you perhaps mean fkie-cad/fact_extractor#94?

I just tested the changes again and everything seems to work fine (with one exception: analysis tags are not always rendered correctly for firmware containers but that is something that is out of scope for this PR).

Author:
You are correct, I meant to mark the other pull request as a draft. This pull request is fine (I cannot remove the draft status myself).


@staticmethod
def _store_unpacking_depth_skip_info(file_object: FileObject):
file_object.processed_analysis['unpacker'] = {
28 changes: 17 additions & 11 deletions src/unpacker/unpack_base.py
@@ -7,6 +7,7 @@
from common_helper_files import safe_rglob
from docker.types import Mount

from requests import exceptions
from helperFunctions.docker import run_docker_container


@@ -23,17 +24,22 @@ def extract_files_from_file(self, file_path, tmp_dir):
self._initialize_shared_folder(tmp_dir)
shutil.copy2(file_path, str(Path(tmp_dir, 'input', Path(file_path).name)))

result = run_docker_container(
'fkiecad/fact_extractor',
combine_stderr_stdout=True,
privileged=True,
mem_limit=f"{self.config.get('unpack', 'memory-limit', fallback='1024')}m",
mounts=[
Mount('/dev/', '/dev/', type='bind'),
Mount('/tmp/extractor', tmp_dir, type='bind'),
],
command=f'--chown {getuid()}:{getgid()}'
)
try:
result = run_docker_container(
'fkiecad/fact_extractor',
combine_stderr_stdout=True,
privileged=True,
mem_limit=f"{self.config.get('unpack', 'memory-limit', fallback='1024')}m",
mounts=[
Mount('/dev/', '/dev/', type='bind'),
Mount('/tmp/extractor', tmp_dir, type='bind'),
],
command=f'--chown {getuid()}:{getgid()}'
)
except exceptions.RequestException as err:
warning = f'Request exception executing docker extractor:\n{err}'
logging.warning(warning)
return None

try:
result.check_returncode()
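Assuming the object returned by `run_docker_container` behaves like Python's `subprocess.CompletedProcess` (which exposes the same `check_returncode()` method), the follow-up check works like this:

```python
import subprocess

# check_returncode() is a no-op on success...
ok = subprocess.CompletedProcess(args=['extractor'], returncode=0)
ok.check_returncode()

# ...and raises CalledProcessError on a nonzero exit code, which is why
# the diff wraps it in its own try block separate from the request handling.
failed = subprocess.CompletedProcess(args=['extractor'], returncode=1)
try:
    failed.check_returncode()
except subprocess.CalledProcessError as err:
    print(f'extractor exited with code {err.returncode}')  # prints: extractor exited with code 1
```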