-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(api-v2): Add support for PDF files #1206
Conversation
I am in the office next week. So let me know if I can help. |
This pull request has been mentioned on Discuss DaSCH. There might be relevant details there: https://discuss.dasch.swiss/t/support-non-image-files-in-knora-api-v2/33/1 |
# Conflicts: # webapi/src/test/scala/org/knora/webapi/e2e/v2/OntologyV2R2RSpec.scala
@subotic There's a test failure here because So I increased the Redis connection timeout: |
@tobiasschweizer Can you review this? |
Ok, will do it now! |
by adding IIIF parameters to `temporaryBaseIIIFUrl`. For example, to get | ||
In the case of an image file, the client may now wish to get a thumbnail of each | ||
uploaded image, to allow the user to confirm that the correct files have been uploaded. | ||
This can be done by adding IIIF parameters to `temporaryBaseIIIFUrl`. For example, to get |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does the mapping between Sipi's response and SipiUploadResponse
happen? Is this a spray.json
config?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SipiUploadResponse
is defined in KnoraSipiIntegrationV2ITSpec
and is only used there. The client receives this response, and Knora never sees it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, thx.
override def toOntologySchema(targetSchema: OntologySchema): DocumentFileValueContentV2 = copy(ontologySchema = targetSchema) | ||
|
||
override def toJsonLDValue(targetSchema: ApiV2Schema, projectADM: ProjectADM, settings: SettingsImpl, schemaOptions: Set[SchemaOption]): JsonLDValue = { | ||
val fileUrl: String = s"${settings.externalSipiBaseUrl}/${projectADM.shortcode}/${fileValue.internalFilename}" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could Knora also create the fragment identifier? Is it possible to refer to a part of a PDF in an annotation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not currently possible, but something could be added to knora-base
for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you make an issue for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -1298,31 +1298,31 @@ class ResourcesResponderV2(responderData: ResponderData) extends ResponderWithSt | |||
resource: ReadResourceV2 = resources.toResource(gravsearchTemplateIri) | |||
|
|||
_ = if (resource.resourceClassIri.toString != OntologyConstants.KnoraBase.TextRepresentation) { | |||
throw BadRequestException(s"Resource $gravsearchTemplateIri is not a ${OntologyConstants.KnoraBase.XSLTransformation}") | |||
throw BadRequestException(s"Resource $gravsearchTemplateIri is not a text file") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this context, a Gravsearch template is expected.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Clarified the error message in b4acacf.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, thx.
success, newFilePath = helper.filename_hash(fullImgName); | ||
if not success then | ||
server.sendStatus(500) | ||
server.log(gaga, server.loglevel.LOG_ERR) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was gaga
even defined? ;-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course not. :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything else wold have disappointed me ;-)
return | ||
end | ||
|
||
--success, check = fullImg:mimetype_consistency(originalMimeType, originalFilename) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does Sipi check the actual mime type of the file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't in this Lua script; this code was already commented out, but I don't know why.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it should be commented in again. Otherwise, the check is not performed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then the integration test won't work anymore, I think. I commented it out. You are of course welcome to find a solution for the test that allows for the mime-type check to be added :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there will also be situations where a problem cannot be fixed at once. But we should keep track of such problems by creating issues.
|
||
server.log("upload.lua: wrote image file to " .. tmp_storage_file_path, server.loglevel.LOG_DEBUG) | ||
else | ||
-- It's not an image file. Just move it to its temporary storage location. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can Sipi check that the mime type is correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is only implemented for SipiImage
in Sipi. Is there a more generic implementation needed for any file type? The current implementation relies on libmagic
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be better to have a more generic implementation. We could open an issue for this in Sipi.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please do so and assign it to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the logs (https://github.com/dasch-swiss/knora-api/pull/1206/checks?check_run_id=324265543) I see two 500 errors. Does this mean that |
Yes. Sipi says:
I guess that's why that code was commented out. |
The file with the |
sipi/scripts/convert_from_path.lua
Outdated
@@ -124,6 +124,20 @@ if media_type == IMAGE then | |||
return | |||
end | |||
|
|||
local check | |||
success, check = fullImg:mimetype_consistency(originalMimeType, originalFilename) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe you need to add
ocal submitted_mimetype
success, submitted_mimetype = server.parse_mimetype(originalMimeType)
local check
success, check = fullImg:mimetype_consistency(submitted_mimetype.mimetype, originalFilename)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've added that in a1a37cc, but Sipi returns the same HTTP 500 error. I actually think that since this is an API v1 or Sipi bug, and this PR only adds functionality to API v2, there's no reason to hold back this PR because of it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok , and you are going to change the v1 behaviour anyway. If your fix doesn't solve the problem, could you open an issue in Sipi?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In other words, this problem was there before, and this PR doesn't affect it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand, but the code shouldn't have been just commented out in https://github.com/dasch-swiss/knora-api/pull/1459/files#diff-4a1a2aedd2d8b7dee2f824e33b243ca1R123
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's ok for me to make separate issues (Knora and/or Sipi) and move on with this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps, but that's a different PR. You can't expect every PR to fix every bug that already existed before the PR. You're reviewing this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps, but that's a different PR. You can't expect every PR to fix every bug that already existed before the PR. You're reviewing this PR.
I agree. But this is kind of a critical issue (Sipi is expected to recreate the image file in its original format), this is why I kept insisting. If it had been a detail, it would not have mattered that much.
It was commented out here: https://github.com/dasch-swiss/knora-api/pull/1459/files#diff-4a1a2aedd2d8b7dee2f824e33b243ca1R123 |
I think this is this the file that Knora temporarily stores and informs Sipi about so it can read the file. So this doesn't work any more as of Sipi 2.0.1. I do not understand why. |
Thanks for checking this very boring PR. :) |
.> Thanks for checking this very boring PR. :) I made me feel nostalgic ...good old api v1 times ... |
This pull request has been mentioned on Discuss DaSCH. There might be relevant details there: https://discuss.dasch.swiss/t/support-non-image-files-in-knora-api-v2/33/2 |
knora-base
:pageCount
,dimX
, anddimY
toDocumentFileValue
.originalFilename
andoriginalMimeType
optional,because we only get them for images.
knora-base
version.knora-api
complex ontology:documentFileValueHasPageCount
,documentFileValueHasDimX
,and
documentFileValueHasDimY
.internalSipiFileServerGetUrl
andexternalSipiFileServerGetUrl
from
Settings
, because all file types can now be stored together in Sipi.StandoffResponderV2
to look for XSL transformations atinternalSipiBaseUrl
(Sipi'sconfig.imgroot
).get_mediatype.lua
with a more flexible MIME-type-checkingframework in
file_info.lua
.upload.lua
.config.imgroot
instead ofconfig.docroot
).KnoraSipiIntegrationV2ITSpec
:Resolves #1202.
Fixes #1430.
Needs dasch-swiss/sipi#283.
Needs dasch-swiss/sipi#309.