Extracting zip with generic carver produces weird results #123

Open
maringuu opened this issue Oct 3, 2023 · 4 comments

Comments

Collaborator

maringuu commented Oct 3, 2023

Running fact_extractor on 0.zip gives me weird results.
Here is the output of tree in the respective extraction directory:

.
├── files
│   └── 0.zip
├── input
│   └── 0.zip
└── reports
    └── meta.json

The report says that one file was extracted, and that file is the input itself.
The two even have the same hashes.
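
For anyone who wants to reproduce the check, here is a minimal sketch that lists extracted files that are byte-identical to the input. The helper names `sha256_of` and `find_self_copies` are made up for illustration and are not part of fact_extractor; the directory layout is assumed to match the tree above.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_self_copies(input_file: Path, files_dir: Path) -> list[Path]:
    """List extracted files whose content is identical to the input file.

    Hypothetical helper: it only compares hashes and knows nothing about
    how the extraction was done.
    """
    input_hash = sha256_of(input_file)
    return sorted(
        candidate
        for candidate in files_dir.rglob("*")
        if candidate.is_file() and sha256_of(candidate) == input_hash
    )
```

Called as `find_self_copies(Path("input/0.zip"), Path("files"))` from the extraction directory, it should return `files/0.zip` for the case described here.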

What happened here?

Collaborator Author

maringuu commented Oct 3, 2023

Looking at the code I just noticed that this is a binwalk issue.

maringuu closed this as completed Oct 3, 2023
Collaborator Author

maringuu commented Oct 4, 2023

Actually I think this is our issue. Adding "--rm" to the binwalk invocation might be the solution (and works for my limited test cases).
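
A rough sketch of what the changed invocation could look like. The exact command line that fact_extractor passes to binwalk is not shown in this thread, so the argument list below is an assumption based on the binwalk 2.x options `--extract`, `--rm` and `--directory`; `build_binwalk_command` and `run_binwalk` are hypothetical names.

```python
import subprocess
from pathlib import Path


def build_binwalk_command(input_file: Path, output_dir: Path, remove_carved: bool = True) -> list[str]:
    """Assemble a binwalk command line (assumed flags, binwalk 2.x style)."""
    command = ["binwalk", "--extract"]
    if remove_carved:
        # --rm asks binwalk to delete carved files after extraction
        command.append("--rm")
    command += ["--directory", str(output_dir), str(input_file)]
    return command


def run_binwalk(input_file: Path, output_dir: Path, remove_carved: bool = True) -> subprocess.CompletedProcess:
    """Run binwalk and return the completed process without raising on failure."""
    return subprocess.run(
        build_binwalk_command(input_file, output_dir, remove_carved),
        capture_output=True,
        text=True,
        check=False,
    )
```

Keeping `remove_carved` as a parameter makes it easy to compare the results with and without `--rm` on the same file.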

maringuu reopened this Oct 4, 2023
Collaborator

jstucke commented Oct 9, 2023

It seems to me that this 0.zip being unpacked by the generic_carver (or rather, not being detected as MIME type ZIP) is a bug in itself. As far as I can tell, the file starts with the usual magic string PK\x03\x04, but for whatever reason file detects it as application/octet-stream.
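
The header check can be reproduced with a few lines of Python. This is only a sketch: `starts_with_zip_magic` is a name made up here, and it looks at the first four bytes only, which is far less than what file's magic rules evaluate.

```python
from pathlib import Path

# Signature of a ZIP local file header, as mentioned above
ZIP_LOCAL_FILE_HEADER = b"PK\x03\x04"


def starts_with_zip_magic(path: Path) -> bool:
    """Return True if the file begins with the ZIP local file header signature."""
    with path.open("rb") as handle:
        return handle.read(len(ZIP_LOCAL_FILE_HEADER)) == ZIP_LOCAL_FILE_HEADER
```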

Collaborator

jstucke commented Oct 9, 2023

> Actually I think this is our issue. Adding "--rm" to the binwalk invocation might be the solution (and works for my limited test cases).

I don't think this is (entirely) our fault. Extracting a file from itself is binwalk's fault, IMHO. Adding --rm works for this file, but I tried it with a different file (which binwalk previously unpacked successfully), and there it causes the file not to be unpacked at all. The problem is probably that binwalk does not recognize the file as zip either: it simply tries to carve files out of it and finds a zip file at offset 0 (the file itself).

We could also try to handle this specific case in "fact_helper_file" and force the file to be detected as application/zip (the default application/zip unpacker has no problem unpacking the file). The file actually seems to be an OOXML file, but that type does not come with a MIME definition in the standard file magic.
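
One possible shape for such a fallback, sketched with the standard library only. How fact_helper_file is structured internally is not shown in this thread, so `refine_mime` and `looks_like_ooxml` are hypothetical helpers and not its API. The OOXML check relies on the fact that OOXML packages carry a `[Content_Types].xml` part at the archive root.

```python
import zipfile
from pathlib import Path

OCTET_STREAM = "application/octet-stream"
ZIP_MIME = "application/zip"


def refine_mime(detected_mime: str, path: Path) -> str:
    """Override a generic MIME type with application/zip if the file opens as a zip.

    Hypothetical fallback: any specific type reported by the detector is kept.
    """
    if detected_mime == OCTET_STREAM and zipfile.is_zipfile(path):
        return ZIP_MIME
    return detected_mime


def looks_like_ooxml(path: Path) -> bool:
    """Heuristic: a zip archive with a [Content_Types].xml entry at its root."""
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        return "[Content_Types].xml" in archive.namelist()
```

Note that `zipfile.is_zipfile` looks for the end-of-central-directory record, so a truncated archive that still starts with PK\x03\x04 would not be reclassified by this sketch.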

But is this a general problem with binwalk or is this a special case? Does this only affect zip files that are not detected as zip or also other files?
