Phaidra importer accepts PDF files as pages for a book import #7

Open
gonter opened this issue Jun 23, 2017 · 0 comments

The content model for "Book" assumes that each page is an image, e.g. a TIFF or JPEG file. All pages are combined into a single PDF, which is stored as the OCTETS datastream of the Book object. The BOOKINFO datastream lists all Page objects, each of which is assumed to be an image file.

On an internal, non-public Phaidra instance, a book was created [1] whose first page is a PDF file rather than an image:

<foxml:datastream ID="RELS-EXT" STATE="A" CONTROL_GROUP="X" VERSIONABLE="false">
<foxml:datastreamVersion ID="RELS-EXT.1" LABEL="Relationships" CREATED="2017-06-06T12:46:17.607Z" MIMETYPE="application/rdf+xml" FORMAT_URI="info:fedora/fedora-system:FedoraRELSExt-1.0" SIZE="362">
<foxml:xmlContent>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">

<rdf:Description rdf:about="info:fedora/o:197863">
        <hasModel xmlns="info:fedora/fedora-system:def/model#" rdf:resource="info:fedora/cmodel:Page"></hasModel>
        <itemID xmlns="http://www.openarchives.org/OAI/2.0/" rdf:resource="oai:univie.ac.at:o:197863"></itemID>
</rdf:Description>

</rdf:RDF>
</foxml:xmlContent>
</foxml:datastreamVersion>
</foxml:datastream>

<foxml:datastream ID="OCTETS" STATE="A" CONTROL_GROUP="M" VERSIONABLE="true">
<foxml:datastreamVersion ID="OCTETS.0" LABEL="\\UB-DOM\U_Dom\Users M-R\Pospichal\Documents\Phaidra-Versuch_01\AT-UAW_Index_R_57_2_Titelblatt.pdf" CREATED="2017-06-06T12:46:18.031Z" MIMETYPE="application/pdf" SIZE="69613">
<foxml:contentLocation TYPE="INTERNAL_ID" REF="o:197863+OCTETS+OCTETS.0"/>
</foxml:datastreamVersion>
</foxml:datastream>

The Phaidra Importer should validate the media type of each page and refuse to build a Book object if the validation fails.
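
A rough sketch of what such a check could look like, written in Python purely for illustration (this is not the importer's actual code). It reads the MIMETYPE of the newest OCTETS datastream version from each page's FOXML and reports every page whose type is not in an allowed set. The names `ALLOWED_PAGE_TYPES`, `octets_mimetype` and `invalid_pages` are hypothetical, and the allowed set is an assumption based on the TIFF/JPEG examples above.

```python
import xml.etree.ElementTree as ET

# FOXML namespace as used by Fedora 3 object exports.
FOXML_NS = {"foxml": "info:fedora/fedora-system:def/foxml#"}

# Assumption: media types acceptable for a Page object; adjust to the
# real content model of the instance.
ALLOWED_PAGE_TYPES = {"image/tiff", "image/jpeg"}


def octets_mimetype(foxml_text):
    """Return the MIMETYPE of the last listed OCTETS version, or None."""
    root = ET.fromstring(foxml_text)
    versions = root.findall(
        "foxml:datastream[@ID='OCTETS']/foxml:datastreamVersion", FOXML_NS
    )
    if not versions:
        return None
    # Versions are assumed to be listed oldest first in the export.
    return versions[-1].get("MIMETYPE")


def invalid_pages(pages):
    """Given (pid, foxml_text) pairs, return (pid, mimetype) for each bad page."""
    failures = []
    for pid, foxml_text in pages:
        mimetype = octets_mimetype(foxml_text)
        if mimetype not in ALLOWED_PAGE_TYPES:
            failures.append((pid, mimetype))
    return failures
```

With a check like this, the importer could refuse to build the Book object whenever `invalid_pages` returns a non-empty list, and report the offending PIDs and media types to the user. For the page shown above, the result would contain `("o:197863", "application/pdf")`.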
