
why are there missing files #820

Closed
Oren-i opened this issue Jul 13, 2022 · 11 comments

Comments

@Oren-i

Oren-i commented Jul 13, 2022

For some firmware images, the analysis never finishes, and there are missing files in the Admin/Find Missing Analysis tab.

Is this expected? If not, how can I find out what went wrong so that I can fix it? There are no messages on the Logs tab, and I am using FACT_docker.

@jstucke
Collaborator

jstucke commented Jul 15, 2022

Hi,

For some firmware images, the analysis never finishes

Do you mean that entries in "Currently analyzed firmware" (/system_health) never complete? That could happen if there are errors during analysis or unpacking and the file gets lost during scheduling (but it should obviously not happen). Since there don't seem to be any helpful log messages, it could be complicated to debug the problem. Did you maybe see any errors or stack traces in the terminal output (docker logs could help here since you are using FACT_docker)? There could be unexpected errors that don't result in log messages.

@Oren-i
Author

Oren-i commented Jul 15, 2022

Do you mean that entries in "Currently analyzed firmware" (/system_health) never complete?

Correct.

I reran the test and reproduced the error. I now see a timeout exception that does not seem to be handled correctly, possibly because other exceptions were raised while it was being handled. The exception log is attached.
fact_log.txt

According to ps, the extractor that runs for a long time before the exception is binwalk.

@jstucke
Collaborator

jstucke commented Jul 18, 2022

The error is indeed not handled correctly. Nevertheless, it is also not clear what caused the error in the first place. Was it a particularly large or in some other way unusual file? Running binwalk usually takes some time for large files (which may be the cause of the timeout).
You could also try to run the extractor manually on the file as documented here to maybe see what causes the error.

@Oren-i
Author

Oren-i commented Sep 2, 2022

I think the issue here is that binwalk does indeed take too long for some files, and that FACT_core does not correctly handle timeouts in FACT_extractor. In some cases binwalk extracts bogus data, and because FACT_extractor is called recursively, a very large file can be sent to binwalk for further extraction.

Feel free to assign this to me.

@jstucke
Collaborator

jstucke commented Sep 2, 2022

We are always happy to receive external contributions and will try to support you, so feel free to try to improve this. Some things to note:

  • binwalk is only used in two cases for extraction:
    1. when the file format is not known (e.g. the file is a binary blob without headers)
    2. when the extraction for the file's format with the designated plugin fails (as a fall-back option)
  • the output of binwalk is already (partly) filtered: we try to sort out bogus archives by verifying if the output is really an archive of the type
  • the extractor runs as a docker container and is called from unpack_base.py
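
As a rough illustration of the filtering idea in the list above (checking that a carved file really is an archive of the type it claims to be), here is a minimal sketch. It is not FACT's actual filter: the function name `filter_bogus_archives`, the `VALIDATORS` mapping, and the suffix-based dispatch are assumptions made for this example. Only the standard library checks `zipfile.is_zipfile` and `tarfile.is_tarfile` are real.

```python
import tarfile
import zipfile
from pathlib import Path

# Hypothetical mapping from a claimed archive suffix to a stdlib content check.
VALIDATORS = {
    ".zip": zipfile.is_zipfile,
    ".tar": tarfile.is_tarfile,
}


def filter_bogus_archives(paths):
    """Keep files unless their suffix claims an archive type the content does not match."""
    kept = []
    for path in map(Path, paths):
        validator = VALIDATORS.get(path.suffix.lower())
        # Files with no known validator are passed through unchanged.
        if validator is None or validator(str(path)):
            kept.append(path)
    return kept
```

A filter like this only removes files that fail the check; it does not stop a large but valid-looking blob from being unpacked again in the next recursion step.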

Oren-i pushed a commit to Oren-i/FACT_core that referenced this issue Sep 8, 2022
…d#820

Timeouts are handled by catching all requests exceptions because
requests raise ConnectionError on a timeout.
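
The approach described in this commit message can be sketched as follows. This is an illustration only, not the code from the commit: the function name `call_extractor`, its parameters, and the injectable `post` argument are assumptions made for this example. What it relies on is that `requests.exceptions.RequestException` is the base class of `ConnectionError`, `Timeout`, and `HTTPError` in the third-party `requests` library, so one `except` clause covers a timeout whichever form it takes.

```python
import requests


def call_extractor(url, timeout_s, post=requests.post):
    """Return (response, error_message); exactly one of the two is None.

    `post` is injectable only so the sketch can be exercised without a
    running extractor service (an assumption of this example).
    """
    try:
        response = post(url, timeout=timeout_s)
        response.raise_for_status()
    except requests.exceptions.RequestException as error:
        # Catches ConnectionError, Timeout (ReadTimeout, ConnectTimeout)
        # and HTTPError alike, since all derive from RequestException.
        return None, f"extraction failed: {type(error).__name__}: {error}"
    return response, None
```

The caller can then mark the file as failed and continue scheduling instead of losing it to an unhandled exception.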
@Oren-i
Author

Oren-i commented Sep 8, 2022

I submitted a merge request to fix any timeout in fact_extractor. I tested this on v3.3. Unfortunately I could not test this on main, but I think it should still work.

@Oren-i
Author

Oren-i commented Sep 8, 2022

I also submitted a patch to fact_extractor to try to get partial results in case binwalk does not finish.
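
For readers unfamiliar with the idea, a minimal sketch of "partial results on timeout" using only the Python standard library is shown here. It is not the patch itself: the function name `run_with_partial_results` and its parameters are assumptions made for this example, and the command that is run is whatever the caller passes in.

```python
import subprocess
from pathlib import Path


def run_with_partial_results(command, output_dir, timeout_s):
    """Run `command`; on timeout, keep whatever files it already wrote.

    Returns (files, timed_out), where `files` is a sorted list of paths
    found below `output_dir` after the command ended or was killed.
    """
    timed_out = False
    try:
        subprocess.run(
            command,
            timeout=timeout_s,
            check=False,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child and waits for it before re-raising.
        timed_out = True
    files = sorted(p for p in Path(output_dir).rglob("*") if p.is_file())
    return files, timed_out
```

Note that files written just before the kill may be truncated, so anything collected after a timeout should be treated as possibly incomplete.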

@dgutson

dgutson commented Sep 9, 2022

I think it's worth mentioning the PR: #852

@Oren-i
Author

Oren-i commented Sep 9, 2022

And the other PR is fkie-cad/fact_extractor#94

@rhelmke
Collaborator

rhelmke commented Sep 14, 2022

Hello!

First of all: thank you guys so much for the contributions here.

Unfortunately, our lead developer @jstucke and his right hand @maringuu are pretty busy this week, which is why probably nothing will happen until the 19th.

Just giving you a heads up - normally both PRs would've already been considered :-)

Oren-i pushed a commit to Oren-i/FACT_core that referenced this issue Sep 21, 2022
jstucke added a commit that referenced this issue Sep 22, 2022
Handle timeout errors when calling fact_extractor, related to #820
@jstucke
Collaborator

jstucke commented Nov 13, 2024

Since this should hopefully be fixed by #852 and fkie-cad/fact_extractor#94, I will close this issue. Feel free to reopen if this is still a problem.

@jstucke jstucke closed this as completed Nov 13, 2024