Undetected PDF that can still be opened in pdf reader. #25

Open
simmisj opened this issue Dec 12, 2019 · 0 comments

Hi.
Recently I received a PDF document that was not corrupt and could be opened in a PDF reader, but Mime-Detective did not detect it as a PDF.
The PDF standard says that a PDF document should start with the magic number and a version number (see 'Technical overview - File structure' at https://en.wikipedia.org/wiki/PDF). The document I received instead started with a newline and the characters òÀ�, followed by the magic number and version number. You can reproduce this by opening any working PDF document in a text editor and adding those characters to the beginning of the file. Setting the PDF type's offset to 4 makes Mime-Detective detect the file as a PDF, since it then skips the added gibberish.
The question is: since PDF readers can safely open such documents, shouldn't Mime-Detective detect them as valid PDF documents?
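The reproduction described above can be sketched in a few lines. This is a language-neutral illustration in Python, not Mime-Detective's code: `matches_at` is a hypothetical helper that mimics a fixed-offset signature check, and the junk bytes are placeholders, because the exact bytes in the original file are not known.

```python
PDF_MAGIC = b"%PDF-"


def matches_at(data: bytes, signature: bytes, offset: int = 0) -> bool:
    """Return True if `signature` appears in `data` exactly at `offset`."""
    return data[offset:offset + len(signature)] == signature


# A tiny stand-in for a real PDF: only the header matters for this check.
clean = b"%PDF-1.4\n% rest of the file...\n"

# Four junk bytes in front of the header: a newline plus three arbitrary
# bytes (placeholders, not the actual bytes from the reported file).
junk = b"\n\xf2\xc0\xff"
prefixed = junk + clean

print(matches_at(clean, PDF_MAGIC))        # True: header at offset 0
print(matches_at(prefixed, PDF_MAGIC))     # False: junk bytes come first
print(matches_at(prefixed, PDF_MAGIC, 4))  # True: offset 4 skips the junk
```

A fixed offset of 4 only helps for this particular file. A file with a different amount of leading junk would still be missed.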
The problem seems to be in the GetFileMatchingCount method of the MimeTypes class. It expects the header to be the first thing it sees and breaks out immediately if it is not.
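One possible alternative to a fixed offset is to search for the magic number anywhere within a small window at the start of the file. The sketch below is in Python and is not a patch for GetFileMatchingCount. The names `find_pdf_header` and `looks_like_pdf` are hypothetical, and the 1024-byte window is an assumption, chosen to mirror the leniency that the implementation notes in Adobe's PDF Reference describe for Acrobat.

```python
PDF_MAGIC = b"%PDF-"
SEARCH_WINDOW = 1024  # assumption: how far into the file to look


def find_pdf_header(data: bytes, window: int = SEARCH_WINDOW) -> int:
    """Return the offset of the PDF magic number, or -1 if it is absent.

    The signature must lie entirely within the first `window` bytes.
    """
    return data.find(PDF_MAGIC, 0, window)


def looks_like_pdf(data: bytes) -> bool:
    """Tolerant check: accept leading junk before the header."""
    return find_pdf_header(data) != -1


print(find_pdf_header(b"%PDF-1.7\n"))              # 0
print(find_pdf_header(b"\n\xf2\xc0\xff%PDF-1.4"))  # 4
print(looks_like_pdf(b"just some plain text"))     # False
```

The trade-off is a higher chance of false positives: any file that happens to contain `%PDF-` in its first kilobyte would match, so a detector might want to rank such a match below an exact match at offset 0.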
Cheers!
