Knora integration #267
I've just tested this by uploading a file using the upload.lua route we wrote.
I used this command to upload a TIFF file:
curl -v -F [email protected] 'http://localhost:1024/upload?token=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.Onul6wAxhPDrV9KJh9qEpKBE4K9a3RXvtw_pgYODWic'
The upload route responded with:
{
"B7bfRzyuFIm-D7EfW8LQnWn.jp2": "http://localhost:1024/tmp/B7bfRzyuFIm-D7EfW8LQnWn.jp2"
}
I then tried to use that URL with the knora.json route:
http://localhost:1024/tmp/B7bfRzyuFIm-D7EfW8LQnWn.jp2/knora.json
But it only returned width and height, not mimetype or origname:
{
"width": 771,
"height": 720
}
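The check described above can be sketched in a few lines of Python. This is only an illustration of the test, not part of Sipi: the function name `missing_fields` is hypothetical, and treating exactly these four fields as required is an assumption based on the comment above.

```python
import json

# Field names come from the comment above; treating all four as required
# is an assumption made for this sketch.
EXPECTED_FIELDS = ("width", "height", "mimetype", "origname")


def missing_fields(response_body: str) -> list:
    """Return the expected fields that are absent from a knora.json response."""
    data = json.loads(response_body)
    return [name for name in EXPECTED_FIELDS if name not in data]


# The response quoted above:
body = '{"width": 771, "height": 720}'
print(missing_fields(body))  # ['mimetype', 'origname']
```

An empty list would mean that the route returned everything this sketch expects.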
I’ll have a look at it!
Lukas
|
What we found out: this actually works, but when you convert the file to JPEG 2000 from Lua, the metadata is not added to the .jp2 file.
I think that when Sipi converts the file, it should also use SipiImage::checkMimeTypeConsistency to make sure that the submitted file extension matches the file’s real MIME type.
If we use the complex form of SipiImage.new, it works, but if we use the simple form, the original file metadata is not included in the converted file:
https://dhlab-basel.github.io/Sipi/documentation/lua.html#sipiimage-new-filename
Shouldn't the metadata always be included, even if we use the simple form of SipiImage.new?
Mmmhh, that does not make sense, because we do not have the original name directly available. In an upload the file gets a random name, and the original name is in the POST parameters. Therefore we have to retrieve the POST parameter and pass it to the SipiImage.new() constructor. I could only think of making it a mandatory parameter, but this would reduce the versatility of the Lua object...
It gets even more complicated if we have an upload of multiple files. Therefore I think it's best to stay with the current solution (but document it better!)
|
But what about the original MIME type? |
When reading (SipiImage.new()) the file, the mimetype is determined (using the magic library) and inserted. I'm not using the file extension, but the bitstream (e.g. magic number). The mimetype can be extracted from the file data, but unfortunately not the original name...
|
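I don't think so. Using the complex form of SipiImage.new, the resulting jp2 file contains SIPI:density.tiff|image/tiff|sha256|6b188905b8def1f1158e8ef0263475c98dea512af4dc14dfca6e8f6cc1e53cb1|IGNORE_ICC|NULL. Using the simple form of SipiImage.new, the resulting jp2 file doesn't contain SIPI: or image/tiff. |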
Yes, that's true. I can make sure that the mimetype is added automatically in any case, even if the original name is not known. Shall we Skype?
On 31.10.2018 at 12:15, Benjamin Geer wrote:
When reading (SipiImage.new()) the file, the mimetype is determined (using the magic library) and inserted
I don't think so. Using the complex form of SipiImage.new, the resulting jp2 file contains SIPI:density.tiff|image/tiff|sha256|6b188905b8def1f1158e8ef0263475c98dea512af4dc14dfca6e8f6cc1e53cb1|IGNORE_ICC|NULL.
Using the simple form of SipiImage.new, the resulting jp2 file doesn't contain SIPI: or image/tiff.
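To make the embedded string easier to read, here is a minimal Python sketch that splits it into fields. The labels for the first four fields (original name, MIME type, hash type, hash value) are inferred from the example in this thread and are an assumption, not a documented format; the remaining fields are kept uninterpreted. The function name `parse_sipi_metadata` is hypothetical.

```python
def parse_sipi_metadata(raw: str) -> dict:
    """Split a 'SIPI:'-prefixed metadata string into labelled fields.

    The labels for the first four fields are inferred from the example in
    this thread (an assumption); any further fields are returned as-is.
    """
    prefix = "SIPI:"
    if not raw.startswith(prefix):
        raise ValueError("not a SIPI metadata string")
    fields = raw[len(prefix):].split("|")
    if len(fields) < 4:
        raise ValueError("expected at least four '|'-separated fields")
    origname, mimetype, hash_type, hash_value = fields[:4]
    return {
        "origname": origname,
        "mimetype": mimetype,
        "hash_type": hash_type,
        "hash_value": hash_value,
        "extra": fields[4:],  # meaning not stated in this thread
    }


example = (
    "SIPI:density.tiff|image/tiff|sha256|"
    "6b188905b8def1f1158e8ef0263475c98dea512af4dc14dfca6e8f6cc1e53cb1"
    "|IGNORE_ICC|NULL"
)
info = parse_sipi_metadata(example)
print(info["origname"], info["mimetype"])  # density.tiff image/tiff
```

A file converted with the simple form would have no such string at all, which is the case the `ValueError` for a missing prefix stands in for.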
|
After going through the code and thinking it over, I am a bit reluctant to automatically generate the mimetype and checksum when reading a file that does not include this information in its header. The reason is as follows: the main application of SIPI is to serve images using the IIIF standard. The master image the server accesses for this purpose should be J2K, but can also be a JPG, TIFF or even PNG. If such a master image has not been produced by SIPI (and we, as well as others, will have such images), it will not contain the additional, non-standard information. Thus, while reading each fragment, SIPI would have to determine the mimetype (which means an additional full file access [open, read, analyze, close]) as well as the checksum (which can be very time-consuming for large files), and this would reduce efficiency when serving IIIF-conformant images. On the other hand, requiring every master file to carry this additional metadata would certainly reduce the versatility and usefulness of SIPI.
Therefore I believe the best way is to enforce the addition of this metadata whenever SIPI is used to convert an image (which is already the case on the command line, and can be done for uploads in the Lua script, as it is done now), while SIPI can still deal with images that do not have this information. In the special case of the integration with Knora, we have to oblige users to use SIPI for image conversion (even if I'm a bit hesitant, I think this is the best solution), because Knora/SIPI must have this information present. By the way, this implies that we have to reconvert all the old J2K images from the old salsah, because at that time the J2K images did not include this information. It's in the MySQL database. But I can live with this ;-)
So I would suggest that we leave SIPI as it is in this respect: command-line conversion and uploads via the knora-upload route add this information to the header, but SIPI (standalone) can deal with images without it. For Knora we will require that the images are produced using the upload route or the sipi command line. J2K files produced with Kakadu and other J2K libraries will have to be run through SIPI to add this metadata in order to be integrated into SIPI. What do you think? |
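The cost argument above comes from the fact that a checksum has to read every byte of the file. A minimal Python sketch of a chunked SHA-256 computation illustrates this; it is not Sipi's C++ implementation, and the function name and chunk size are assumptions chosen for the example.

```python
import hashlib


def sha256_of_file(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks.

    Every byte of the file is read once, so the running time grows
    linearly with file size, which is the overhead discussed above.
    """
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"".
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Doing this once at conversion time and storing the result in the header avoids repeating the full read on every request.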
If I understand correctly, when you convert an image from Lua, you must use the Lua function I agree that we should enforce the addition of this metadata, and I think the best way to do this would be to have the simple form of the Lua function Or maybe I'm not understanding what you're saying. |
Wait, do you mean that every time Sipi serves a IIIF image, the Lua function If so, then perhaps we should separate the "simple" and "complex" forms of that function into two different functions, which could be called To me, "enforcing" means that it's not possible for the programmer to do it wrong. So I think that it should not be possible, in Lua, to convert an image without including as much metadata as possible. My understanding has been that one of Sipi's goals is to always preserve image metadata when converting images. |
In the Lua case: I would rather add a flag to the "simple" form, with a default of "true", to add this metadata (so if the flag is omitted, the metadata will be added). But we should give the option NOT to include it, for performance reasons.
What do you think about this?
|
Can you clarify when that would happen? |
Ok 😀
It's complex... I thought of adding those features to the C++ method Sipi.read() first. That was a bad idea of mine... 😱
Usually Sipi serves the images without having to use the Lua function. But in some special cases this could be necessary. In these cases the Lua function SipiImage.new() must be used in the preflight script. Therefore I would add the additional metadata as an option with default ‘true’ (= add it). So in these very special cases it is possible to skip this step.
Sent from my iPhone
On 31.10.2018 at 14:09, Benjamin Geer wrote:
But we should give the option to NOT include it for performance reasons.
Can you clarify when that would happen?
|
OK, that makes sense to me, as long as this is clear in the documentation. |
Ok - I’ll do it tonight...
|
For dasch-swiss/dsp-api#1011, I need a Lua function like |
Yes!!
|
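Sorry, I have another request. Sipi needs to be able to delete old temporary files. So I'll need two more Lua functions:
* A function to get the last modification date of a file (in seconds since the epoch)
* A function to get the current time (in seconds since the epoch)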
|
Done. I'll add some unit tests for it, and then this should work (as well as moving a file...)
On 31.10.2018 at 20:42, Benjamin Geer wrote:
Sorry, I have another request. Sipi needs to be able to delete old temporary files. So I'll need two more Lua functions:
* A function to get the last modification date of a file (in seconds since the epoch)
* A function to get the current time (in seconds since the epoch)
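The two requested functions are enough to decide which temporary files are too old. The following Python sketch shows the idea with the standard library only; it is not the Lua code in Sipi, and the function names and the `max_age_seconds` parameter are hypothetical.

```python
import os
import time
from typing import List, Optional


def expired_files(directory: str, max_age_seconds: float,
                  now: Optional[float] = None) -> List[str]:
    """List regular files whose last modification is older than the limit."""
    if now is None:
        now = time.time()  # current time, in seconds since the epoch
    expired = []
    for entry in os.scandir(directory):
        if not entry.is_file():
            continue  # leave subdirectories alone in this sketch
        mtime = entry.stat().st_mtime  # last modification, seconds since the epoch
        if now - mtime > max_age_seconds:
            expired.append(entry.path)
    return sorted(expired)


def remove_expired_files(directory: str, max_age_seconds: float,
                         now: Optional[float] = None) -> int:
    """Delete the expired files and return how many were removed."""
    paths = expired_files(directory, max_age_seconds, now)
    for path in paths:
        os.remove(path)
    return len(paths)
```

Passing `now` explicitly is only there to make the behaviour easy to test; in normal use it would be left at its default.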
|
Done! |
- Add server.fs.readdir C++ function for Lua.
- Add config setting for maximum temp file age.
- Remove unimplemented log levels TRACE and OFF from docs and configs.
- Fix incorrect log level names in Lua scripts.
- Fix inconsistencies in log level names (use syslog's names everywhere).
- Add e2e test of clean_tempdir().
Two questions:
|
1. In our upload.lua script for Knora, we first save the uploaded file using server.copyTmpfile. Then we load it into a SipiImage using SipiImage.new, and finally we convert it to JPEG 2000, saving it in another file. Do we really need to save the upload to a file first? If the upload is already in memory, would it be possible to construct a SipiImage directly from the uploaded data in memory? For example, could SipiImage.new take an element of server.uploads as a parameter?
I have to look at this. It could be possible to create the SipiImage object directly. I’ll check, and if it is possible, I'll implement it!
2. I would like to have upload.lua check the uploaded file's MIME type before converting it. If the file is not an image, we shouldn't try to convert it to JPEG 2000. And I'd like to reject, say, Windows .EXE files and not even save them in the temporary directory. Could we make a C++ function that we could call from Lua, which would use libmagic and return the file's real MIME type? If we do what I'm suggesting in (1), this would have to use magic_buffer (to look at the uploaded data in memory) instead of magic_file.
This should be rather simple. I’ll do it – maybe tonight….
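The idea of checking the upload in memory before anything is written to disk can be sketched as follows. This Python example uses a small hand-written signature table as a stand-in for libmagic's magic_buffer, so it recognises far fewer formats than libmagic does; the function names and the choice of accepted types are assumptions for illustration.

```python
from typing import Optional

# Leading byte signatures ("magic numbers") of a few image formats.
# A hand-written stand-in for libmagic, covering only these formats.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"II*\x00", "image/tiff"),   # little-endian TIFF
    (b"MM\x00*", "image/tiff"),   # big-endian TIFF
    (b"\x00\x00\x00\x0cjP  \r\n\x87\n", "image/jp2"),
]


def sniff_mime_type(data: bytes) -> Optional[str]:
    """Guess the MIME type from the first bytes of an in-memory buffer."""
    for signature, mime_type in SIGNATURES:
        if data.startswith(signature):
            return mime_type
    return None  # unknown content


def is_acceptable_upload(data: bytes) -> bool:
    """Accept only buffers recognised as one of the image types above."""
    return sniff_mime_type(data) is not None


print(sniff_mime_type(b"II*\x00" + b"\x00" * 16))       # image/tiff
print(is_acceptable_upload(b"MZ\x90\x00" + b"\x00" * 16))  # False
```

Because the decision depends only on the bytes, a file named `picture.tif` that actually starts with the `MZ` header of a Windows executable is rejected before it reaches the temporary directory.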
|
- Log all internal server errors in send_error.
- Fix documentation syntax.
…knora-integration # Conflicts: # scripts/upload.lua # test/_test_data/scripts/upload.lua
…knora-integration
See https://tools.ietf.org/html/rfc7231#section-4.3.5: "A payload within a DELETE request message has no defined semantics; sending a payload body on a DELETE request might cause some existing implementations to reject the request."
Everything seems to work now, thank you! I think we can merge this if it looks OK to you.
Sorry, actually just one more thing:
Usually Sipi serves the images without having to use the Lua function. But in some special cases this could be necessary. In these cases the Lua function SipiImage.new() must be used in the preflight script. Therefore I would add the additional metadata as an option with default ‘true’ (= add it). So in these very special cases it is possible to skip this step.
Did you change your mind about this? I can't find it in the code. From the documentation, it looks as if the simple form always includes the additional metadata if the image comes from an upload, but not if it comes from a file. Is that right? (So I don't have to use the complex form if I pass an upload index to SipiImage.new?)
Did you change your mind about this? I can't find it in the code. From the documentation, it looks as if the simple form always includes the additional metadata if the image comes from an upload, but not if it comes from a file. Is that right? (So I don't have to use the complex form if I pass an upload index to SipiImage.new?)
Exactly. SipiImage.new(index), which takes the file from an upload, automatically adds the necessary metadata (original name, mimetype and SHA256 checksum) using the simple form!
The simple form reading from a file (not an upload index) doesn't do this. It is meant to read a J2K image internally, transform it and write it to the connection (e.g. to add some special image processing, tricks etc.) before delivering the image to the user...
|
OK, great, I just clarified the docs about that a little. Now I think this is really OK to merge, if it seems OK to you.
I added the URL http://iiiif.server/directory/imgid.jp2/knora.json, which returns JSON containing the required image information:
I added unit tests for this feature and for getting info.json
Required for dasch-swiss/dsp-api#1011.