Create/update file values in API v2 #998

Closed
benjamingeer opened this issue Sep 20, 2018 · 23 comments
Labels
API/V2 clientapi frontend: Salsah, DSP-APP, BEOL, MLS, etc. enhancement improve existing code or new feature

Comments

@benjamingeer

benjamingeer commented Sep 20, 2018

How API v1 does this

  • Each resource can have at most one file, and the property IRI depends solely on the file type (in SipiConstants), so we need only the resource IRI and the file itself.
  • Knora's Lua scripts for Sipi allow anyone to upload a file.
  • There are two use cases:
    • The "GUI case", in which SALSAH uploads the file directly to Sipi, then sends a confirmation to Knora.
    • The "non-GUI case", in which the client uploads the file directly to Knora, which sends it to Sipi.

Goals for API v2

  • Allow creating a resource containing multiple files.
  • Provide a way for Sipi to determine if the user has permission to upload a file.

Design

Akka HTTP can process a request with multiple files:

https://doc.akka.io/docs/akka-http/current/routing-dsl/directives/file-upload-directives/fileUploadAll.html

To authorise an upload in Sipi, I guess we're going to use JWT (#520)? How will this work in SALSAH?

@benjamingeer
Author

Is there actually any reason for the client to submit files to Knora? Couldn't it just submit them to Sipi, then just tell Knora about the metadata?

@lrosenth
Contributor

lrosenth commented Sep 21, 2018 via email

@loicjaouen
Contributor

For the user, it is simpler to send a request to Knora and then let the Knora/Sipi magic happen behind the scenes. This is very much how the bulk import works (I just don't like the implementation, which expects a shared directory between Knora and Sipi).

Again, for the user, the current GUI case is more convoluted (send a file to Sipi, keep the returned IRI, send it to Knora).

@benjamingeer
Author

benjamingeer commented Sep 21, 2018

For the user, it is simpler to send a request to Knora and then let the Knora/Sipi magic happen behind the scenes.

I'm not sure it's really simpler. To make this work in API v2, you would need to send a single multipart/form-data request to Knora, containing:

  • A JSON-LD document with a special magic name (e.g. request.jsonld), with content referring to filenames.
  • One or more files, whose filenames are referred to in the JSON-LD document.

So in any case this would be different and more complex than a normal POST/PUT request.
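To make the comparison concrete, here is a rough sketch of what a client would have to assemble for such a request. It is only an illustration: the form part names (`request`, `file`), the helper `build_multipart`, and the JSON-LD property names are assumptions, not anything Knora defines.

```python
import json
import uuid


def build_multipart(jsonld: dict, files: dict) -> tuple:
    """Build a multipart/form-data body: one JSON-LD part plus one part per file.

    Returns (Content-Type header value, body bytes).
    """
    boundary = uuid.uuid4().hex
    parts = []

    def add_part(name: str, filename: str, content_type: str, payload: bytes) -> None:
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"; filename="{filename}"\r\n'
            f"Content-Type: {content_type}\r\n\r\n"
        )
        parts.append(header.encode("utf-8") + payload + b"\r\n")

    # The JSON-LD document travels under the agreed "magic" filename.
    add_part("request", "request.jsonld", "application/ld+json",
             json.dumps(jsonld).encode("utf-8"))

    # Each file is a separate part; the JSON-LD refers to it by filename.
    for filename, payload in files.items():
        add_part("file", filename, "application/octet-stream", payload)

    body = b"".join(parts) + f"--{boundary}--\r\n".encode("utf-8")
    return f"multipart/form-data; boundary={boundary}", body


# Hypothetical request document: the property names are made up.
request_doc = {"@type": "ex:Page", "ex:hasFile": {"ex:filename": "page1.tif"}}
content_type, body = build_multipart(request_doc, {"page1.tif": b"II*\x00fake-tiff-bytes"})
```

The server would then have to locate `request.jsonld` among the parts and match every filename it mentions against the remaining parts.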

If you send the files to Sipi first, you will have to construct exactly the same JSON-LD request (referring to the files that you sent to Sipi). And you can make a normal POST/PUT to Knora instead of a special multipart/form-data.

In any case, this will all be handled by a Salsah component, so users won't need to care about it.

If Knora doesn't have to handle files itself, the advantage for Knora is that it takes load off the Knora server, simplifies the code (no need to deal with temporary files and multipart/form-data), and makes testing and debugging easier (instead of dealing with a request to Knora that makes a request to Sipi, you just need to look at a request to Knora).

@subotic
Collaborator

subotic commented Sep 21, 2018

I’m for sending files directly to SIPI (#535).

@subotic
Collaborator

subotic commented Sep 21, 2018

Also, dasch-swiss/sipi#254 should allow Knora to have any information that it needs about a file.

@lrosenth
Contributor

lrosenth commented Sep 21, 2018 via email

@tobiasschweizer
Contributor

Couldn't it just submit them to Sipi, then just tell Knora about the metadata

When I wrote the v1 Knora-Sipi integration, I wanted to make sure that the metadata stored by Knora (MIME type, original filename with extension) is checked by Sipi against the submitted image file. For v2, this could be achieved by Knora asking Sipi about the metadata, or about the validity of user-provided metadata, after the upload to Sipi. But it should not be the client that sends a request to Sipi first, gets the metadata, and then sends it to Knora.

My rationale is that the DaSCH could be asked to recreate the original file format from JPEG 2000. So we need to be sure that this information is actually correct. If someone submits a JPEG file and claims it is a TIFF, the request has to be rejected.

When I speak about metadata, I mean the data stored with a file value in Knora. Probably by now some metadata is also stored in the JPEG 2000 itself.

@benjamingeer
Author

For v2, this could be achieved by Knora asking Sipi about the metadata

Can Sipi already do this? If not, would this require C++ code? Or just Lua code?

We only have to make SIPI get the permission for upload from Knora.

Currently when you log into Knora, you get a JWT token. But as far as I can see, Knora's Lua scripts for Sipi don't use it. How should we use a JWT token to authenticate a file upload? Just put it in the URL when posting the file to Sipi?
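Whichever way the token travels, the receiving side would have to check it before accepting the file. Here is a minimal sketch of such a check, written in Python for brevity rather than Lua. It assumes an HS256-signed token and a secret shared between Knora and Sipi, neither of which is confirmed here, and `sign_hs256` and `verify_hs256` are hypothetical helper names.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def b64url_decode(segment: str) -> bytes:
    # JWT segments drop the base64 padding, so restore it before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def sign_hs256(claims: dict, secret: bytes) -> str:
    """Issue a token (what the login side would do)."""
    header = b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode("utf-8"))
    payload = b64url_encode(json.dumps(claims).encode("utf-8"))
    signature = hmac.new(secret, f"{header}.{payload}".encode("ascii"), hashlib.sha256).digest()
    return f"{header}.{payload}.{b64url_encode(signature)}"


def verify_hs256(token: str, secret: bytes, now: float) -> Optional[dict]:
    """Return the claims if the signature is valid and the token has not expired."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
        header = json.loads(b64url_decode(header_b64))
        claims = json.loads(b64url_decode(payload_b64))
        signature = b64url_decode(signature_b64)
    except ValueError:
        return None  # wrong number of segments, bad base64, or bad JSON
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        return None
    expected = hmac.new(secret, f"{header_b64}.{payload_b64}".encode("ascii"),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(expected, signature):
        return None
    exp = claims.get("exp") if isinstance(claims, dict) else None
    if not isinstance(exp, (int, float)) or exp <= now:
        return None
    return claims


# Hypothetical secret and claims, for illustration only.
token = sign_hs256({"sub": "http://example.org/users/demo", "exp": 2000}, b"shared-secret")
claims = verify_hs256(token, b"shared-secret", now=1000)
```

Sending the token in a header rather than in the URL would keep it out of access logs, but that is a separate decision from how it is verified.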

@benjamingeer
Author

@tobiasschweizer In the current GUI case, where the user submits the file directly to Sipi, how does Knora ensure that the file metadata is correct?

@lrosenth
Contributor

lrosenth commented Sep 21, 2018 via email

@tobiasschweizer
Contributor

Ah, now I remember: https://github.com/dhlab-basel/Knora/blob/develop/sipi/scripts/make_thumbnail.lua

this is done first :-)

@benjamingeer
Author

make_thumbnail.lua

But the client gets the response from make_thumbnail.lua, containing the metadata. Then the client sends that metadata to Knora. So how does Knora check that this metadata is correct?

@benjamingeer
Author

It looks to me like even in the GUI case, SipiResponder.callSipiConvertRoute is used, which always gets all the file metadata from Sipi. Or am I misunderstanding how this works?

@subotic
Collaborator

subotic commented Sep 21, 2018

If possible, I would like to be part of this discussion. Since I’m on vacation, I would suggest to postpone it to when I’m back.

@benjamingeer
Author

benjamingeer commented Sep 21, 2018

@tobiasschweizer and I talked about this, and now it's clear what happens in the GUI case in API v1:

  1. The client submits the file to the make_thumbnail.lua route in Sipi, which stores the submitted file in a temporary location with a randomly generated filename. The client gets back a thumbnail and the randomly generated filename.
  2. The client sends the file's metadata, including the randomly generated filename and the original filename, to Knora.
  3. Knora calls Sipi's convert_from_file.lua route, which checks that the submitted content type matches the file's real type (as determined by its magic number), and converts the file to JPEG2000.

I think this is basically what should happen in API v2, whether or not the GUI is used. The only difference is that a GUI will want a thumbnail, but a non-GUI client doesn't need a thumbnail.

Knora can get all the file's metadata from Sipi (as it does in API v1). The only thing it can't get from Sipi is the file's original filename, because Sipi doesn't store that. So the client has to give Knora the original filename, and Knora just needs to check that the file extension corresponds to the file's actual type (as determined by Sipi).
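As an illustration of the two checks involved (type detection by magic number, then comparing the claimed file extension against the detected type), here is a small Python sketch. It is not how Sipi or Knora actually implement this: the function names and the short table of formats are assumptions, and in the design above the first check would run in Sipi and the second in Knora.

```python
import os
from typing import Optional

# Leading bytes ("magic numbers") of a few common image formats.
MAGIC_NUMBERS = [
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"II*\x00", "image/tiff"),   # little-endian TIFF
    (b"MM\x00*", "image/tiff"),   # big-endian TIFF
    (b"\x00\x00\x00\x0cjP  \r\n\x87\n", "image/jp2"),
]

# File extensions considered consistent with each detected type.
EXTENSIONS = {
    "image/jpeg": {".jpg", ".jpeg"},
    "image/png": {".png"},
    "image/tiff": {".tif", ".tiff"},
    "image/jp2": {".jp2"},
}


def sniff_mime_type(data: bytes) -> Optional[str]:
    """Determine the real type from the file's first bytes, ignoring its name."""
    for magic, mime_type in MAGIC_NUMBERS:
        if data.startswith(magic):
            return mime_type
    return None


def extension_matches(original_filename: str, detected_mime_type: str) -> bool:
    """Check that the client-supplied filename agrees with the detected type."""
    extension = os.path.splitext(original_filename)[1].lower()
    return extension in EXTENSIONS.get(detected_mime_type, set())


jpeg_bytes = b"\xff\xd8\xff\xe0" + b"\x00" * 16
detected = sniff_mime_type(jpeg_bytes)                 # "image/jpeg"
honest = extension_matches("scan.JPG", detected)       # True
mislabeled = extension_matches("scan.tif", detected)   # False: a JPEG claiming to be a TIFF
```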

@benjamingeer
Author

If possible, I would like to be part of this discussion.

That would be great.

Since I’m on vacation, I would suggest to postpone it to when I’m back.

No problem for me.

@subotic
Collaborator

subotic commented Sep 21, 2018

The only thing it can't get from Sipi is the file's original filename, because Sipi doesn't store that.

It does, or at least the functionality is there, when Sipi is used on the command line.

@subotic
Collaborator

subotic commented Sep 21, 2018

It is returned in the info.json IIIF call.

@benjamingeer
Author

It is returned in the info.json IIIF call.

OK that's good to know.

@subotic
Collaborator

subotic commented Sep 21, 2018

No problem for me.

Thanks 😀

@benjamingeer
Author

benjamingeer commented Oct 3, 2018

After discussion with @lrosenth, our plan is:

  1. Client sends a POST request to Sipi, containing one or more images, along with the JWT token that the client got from Knora when it logged in.
  2. If the JWT token is valid, Sipi knows that the client has permission to upload images.
  3. Sipi converts the images to JPEG 2000 files with randomly generated filenames, and stores them in a temporary directory, reporting any errors. If no errors occur, it returns those randomly generated filenames and corresponding IIIF URLs for the images, which the client can use to display thumbnails if needed.
  4. The client sends a POST or PUT request to Knora (e.g. to create a resource or to add/modify a value), providing the image filenames it got from Sipi. There is no need for the client to send the image metadata.
  5. Knora makes GET requests to Sipi, one request per image, providing the image filenames, to get the metadata for each image:
    • original filename
    • original MIME type
    • converted MIME type (typically JPEG 2000)
    • dimensions
  6. If the client's request to Knora is OK and the user has permission, Knora sends PUT requests to Sipi, one request per image, providing the project code, to tell Sipi to store the images in its permanent storage directory for the project. If Knora doesn't accept the client's request, it sends DELETE requests to Sipi, one per image, to delete the files from the temporary directory. In any case, Sipi should periodically delete unused temporary files.
  7. Knora stores the image metadata in the triplestore.

If the files are not images, the same thing happens, except that there is no conversion to JPEG 2000 and no image dimensions.
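Steps 5 to 7 can be sketched roughly as follows, in Python and with an in-memory stand-in for Sipi so that the control flow is visible without any HTTP. Everything named here is hypothetical: `FakeSipi`, `handle_file_values`, the metadata field names, and the project code `0001` are illustrations, not the actual routes or data model.

```python
from dataclasses import dataclass, field


@dataclass
class FakeSipi:
    """In-memory stand-in for Sipi's temporary and permanent storage."""
    temp: dict = field(default_factory=dict)       # temp filename -> metadata
    permanent: dict = field(default_factory=dict)  # (project code, filename) -> metadata

    def get_metadata(self, filename: str) -> dict:
        # Stands in for the GET request in step 5.
        return self.temp[filename]

    def move_to_permanent(self, filename: str, project_code: str) -> None:
        # Stands in for the PUT request in step 6.
        self.permanent[(project_code, filename)] = self.temp.pop(filename)

    def delete_temp(self, filename: str) -> None:
        # Stands in for the DELETE request in step 6.
        self.temp.pop(filename, None)


def handle_file_values(sipi, filenames, project_code, request_ok, triplestore) -> bool:
    """Fetch metadata per file, then either commit or discard all files."""
    try:
        metadata = {name: sipi.get_metadata(name) for name in filenames}
    except KeyError:
        request_ok = False  # the client referred to a file Sipi does not have

    if not request_ok:
        for name in filenames:
            sipi.delete_temp(name)
        return False

    for name in filenames:
        sipi.move_to_permanent(name, project_code)
        triplestore[name] = metadata[name]  # step 7
    return True


sipi = FakeSipi(temp={
    "x7Gq.jp2": {"originalFilename": "page1.tif", "originalMimeType": "image/tiff",
                 "mimeType": "image/jp2", "width": 1200, "height": 1800},
    "k2Lm.jp2": {"originalFilename": "page2.tif", "originalMimeType": "image/tiff",
                 "mimeType": "image/jp2", "width": 1200, "height": 1800},
})
triplestore = {}
accepted = handle_file_values(sipi, ["x7Gq.jp2"], "0001", True, triplestore)
rejected = handle_file_values(sipi, ["k2Lm.jp2"], "0001", False, triplestore)
```

In this sketch a rejected request removes its files from temporary storage immediately, while the periodic cleanup in Sipi would cover clients that upload files and never follow up with a request to Knora.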

Currently, knora-base has these properties, which will not be used in API v2 and will be deprecated:

  • knora-base:isPreview
  • knora-base:qualityLevel

It would be good to change Salsah 1.5 so that it doesn't need these anymore. Or we could change API v1 to provide simulated values for those properties.

benjamingeer pushed a commit that referenced this issue Nov 28, 2018
* feature (sipi): Add Lua code for uploading a file to Sipi as per #998

* refactor (api-v2): Refactor file value classes.

- Remove deprecated properties isPreview and qualityLevel from API v2.

* feature (api-v2): Implement file value creation (ongoing).

* feature (api-v2): Implement file value creation (ongoing).

* test (api-v2): Test creating and updating still image file values with mock Sipi.

* feature (api-v2): Have Sipi move a temporary file to permanent storage (ongoing).

- Fix broken tests.

* feature (api-v2): Have Sipi move a temporary file to permanent storage (ongoing).

* refactor (Authenticator): Start replacing deprecated JWT library (#1044).

* refactor (Authenticator): Continue replacing deprecated JWT library (#1044).

* refactor (Authenticator): Finish replacing deprecated JWT library (#1044).

- Add more JWT tests.

* feature (sipi): Improve Lua script for Sipi file upload (ongoing).

- Delete old temporary files each time an upload is processed.
- Fix incorrect log level names in Lua scripts.

* feature (sipi): Validate JWT token in file upload.

* feature (sipi): Improve JWT token validation.

* test (api-v2): Add integration test for creating a file value with Sipi (ongoing).

- Fix some bugs.
- Fix broken authentication message package structure.

* fix (sipi): Fix lots of Lua script bugs.

- Improve error-handling in Lua scripts.
- Fix test data for knora-api:fileValueAsUrl.

* feature (api-v2): If a file value triplestore update fails, have Sipi delete the temp file (ongoing).

* feature (sipi): Make SipiImage directly from uploaded file.

- Take into account directory hashing in Lua scripts.
- Recursively clean up temporary directory.
- Clean up Lua scripts.

* test (sipi): Test deleting temp file if file value creation fails.

* test (sipi): Activate subdirs in temp dir for tests.

* refactor (sipi): Use simple form of SipiImage.new.

* test (api-v2): Add tests of file uploads.

- Test creating a resource with a file value.
- Test Knora's handling of Sipi errors.

* test (sipi): Test creating a resource with multiple file values.

- Fix upload.lua to preserve the order of uploaded files.
- Give knora-api:stillImageFileValueHasIIIFBaseUrl a type of xsd:anyURI.
- Fix test data.

* feature (sipi): Have Sipi return original filename in response to upload.

* docs (api-v2): Add API and design docs about file uploads.

* docs (release-notes): Update release notes.

* test (sipi): Use new login request format.

* docs (api-v2): Add warning about #1068.

* fix (api-v1): Adapt API v1 to ignore preview image file values in triplestore.

- Update tests.

* fix (sipi): Use a JPEG 2000 image to generate previews for testing.

* feature (sipi): Clean up a few things.

* test (api-v1): Test resource context response when there's no preview image in the triplestore.
@subotic subotic added this to the Backlog milestone Feb 7, 2020
@flavens flavens added clientapi frontend: Salsah, DSP-APP, BEOL, MLS, etc. and removed frontend (salsah) labels Sep 24, 2020

6 participants