Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix JP2/WEBP detection, Link header, CORS origin #97

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

dpb587
Copy link

@dpb587 dpb587 commented Oct 14, 2023

Hi - I ran into a few errors when running this - both locally and from iiif.io, so I wanted to propose changes from my findings. It's not clear to me if this repository is still maintained so kept them in a single PR for my own simplicity, but can resubmit to split them up for acceptance or discussion if you prefer.

More technical details are documented in the commit messages, so, to summarize:

  • WEBP format (bug fix) - correctly identify magic header, seemingly related to a Python3 upgrade (previously the test result was a false negative)
  • JP2 format (bug fix) - correctly invoke magic library, unsure the cause of this regression (previously it was an uncaught exception)
  • Link header (bug fix) - allow semicolons in URIs (previously it was an uncaught exception)
  • CORS (behavior change) - expanded support to accept origin-specific response headers (i.e. http://iiif.io) instead of only a wildcard (previously this case was unsupported)

Also, I'm not too sure about the testing practices for discovering if I'm breaking any expectations or simply misunderstood an expected failure. I'd be happy to amend this for test coverage after more guidance.

I would appreciate any feedback you might have on the changes - thanks!

Wildcards are recommended for broadest access, but request-based origins
are still valid.

https://iiif.io/api/image/3.0/#71-cors
https://www.w3.org/TR/cors/
The `>` character is a more canonical delimiter for the end of a URI, and
fixes the original issue of failed parses of URIs containing a valid
semicolon.

The original use of semicolon and backtracking it seems intentional, but
not sure what the original intent or test case might have been.
Python3 changed behavior of strings vs bytes such that a WEBP string
constant is not considered equivalent to thel bytes from reading.

Change summarized in: https://blog.feabhas.com/2019/02/python-3-unicode-and-byte-strings/
This appears to have been a confusion between filemagic and python-magic,
but not sure where the regression would have been introduced. It was erroring
with:

    exception: __enter__

Which was a suppression of the original error:

    AttributeError: 'Magic' object has no attribute 'id_buffer'
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant