discoverygarden/islandora_paged_content_pdf_batch

Islandora Paged Content PDF Batch

Introduction

This module extends the Islandora Batch framework to ingest a ZIP file or directory containing one or more PDFs, with their associated XML metadata files, as paged content objects and individual page objects.

The ingest is a two-step process:

  • Preprocessing: The data is scanned and a number of entries are created in the Drupal database. There is minimal processing done at this point, so preprocessing can be completed outside of a batch process.
  • Ingest: The data is actually processed and ingested. This happens inside of a Drupal batch.

Requirements

This module requires the following modules/libraries:

Installation

Install as usual; see this for further information.

Usage

The base ZIP preprocessor can be called as a drush script (see drush help islandora_paged_content_pdf_batch_preprocess for additional parameters):

As of Drush 7, target is a reserved option, so Drush 7 and above take the archive path through --scan_target instead. The original --target option is preserved for backwards compatibility with Drush 6 and below. The target option requires the full path to your archive from the root directory, e.g. /var/www/drupal/sites/archive.zip

The parent_relationship_pred option defaults to isMemberOfCollection. When ingesting newspaper issues into a Newspaper parent object, it must be set to isMemberOf.

Drush 7 and above:

For books: drush -v -u 1 --uri=http://localhost islandora_paged_content_pdf_batch_preprocess --scan_target=/path/to/archive.zip --content_model=islandora:bookCModel --parent=islandora:bookCollection

For newspaper issues: drush -v -u 1 --uri=http://localhost islandora_paged_content_pdf_batch_preprocess --scan_target=/path/to/archive.zip --content_model=islandora:newspaperIssueCModel --parent=islandora:my_newspaper --parent_relationship_pred=isMemberOf

Drush 6 and below:

For books: drush -v -u 1 --uri=http://localhost islandora_paged_content_pdf_batch_preprocess --target=/path/to/archive.zip --content_model=islandora:bookCModel --parent=islandora:bookCollection

For newspaper issues: drush -v -u 1 --uri=http://localhost islandora_paged_content_pdf_batch_preprocess --target=/path/to/archive.zip --content_model=islandora:newspaperIssueCModel --parent=islandora:my_newspaper --parent_relationship_pred=isMemberOf
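Because the option name differs between Drush versions, a script that has to run on both can choose it at runtime. The following is a minimal sketch and not part of this module: scan_option_for_drush is a hypothetical helper name, and the caller is assumed to supply the Drush major version as a plain number.

```shell
# Hypothetical helper (not shipped with this module): print the option name
# that carries the archive path for a given Drush major version.
scan_option_for_drush() {
  major="$1"
  case "$major" in
    ''|*[!0-9]*)
      # Reject empty or non-numeric input instead of guessing.
      echo "expected a numeric Drush major version, got '$major'" >&2
      return 1
      ;;
  esac
  if [ "$major" -ge 7 ]; then
    echo "--scan_target"   # "target" is reserved as of Drush 7
  else
    echo "--target"        # Drush 6 and below
  fi
}

# Example use (commented out so that nothing runs on load):
#   opt="$(scan_option_for_drush 8)"
#   drush -v -u 1 --uri=http://localhost islandora_paged_content_pdf_batch_preprocess \
#     "$opt=/path/to/archive.zip" --content_model=islandora:bookCModel \
#     --parent=islandora:bookCollection
```

The version number is passed in rather than detected, since the output format of drush --version varies between releases.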

This will populate the queue (stored in the Drupal database) with base entries.

The queue of preprocessed items can then be processed:

drush -v -u 1 --uri=http://localhost islandora_batch_ingest
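The two steps can be chained so that the ingest only starts if preprocessing succeeded. This is a sketch under stated assumptions, not a script provided by the module: run_pdf_batch, DRUSH, and SITE_URI are hypothetical names, the Drush 7 and above option --scan_target is used, and the user ID and URI mirror the examples above.

```shell
# Assumed settings; override them in the environment as needed.
DRUSH="${DRUSH:-drush}"                   # Drush executable (or a stub for dry runs)
SITE_URI="${SITE_URI:-http://localhost}"  # site URI, as in the examples above

# run_pdf_batch ARCHIVE CONTENT_MODEL PARENT [PARENT_RELATIONSHIP_PRED]
# Hypothetical wrapper: preprocess the archive, then process the queue.
run_pdf_batch() {
  if [ "$#" -lt 3 ]; then
    echo "usage: run_pdf_batch ARCHIVE CONTENT_MODEL PARENT [PRED]" >&2
    return 2
  fi
  archive="$1"
  content_model="$2"
  parent="$3"
  pred="${4:-}"

  # Step 1: preprocessing populates the queue in the Drupal database.
  if [ -n "$pred" ]; then
    "$DRUSH" -v -u 1 --uri="$SITE_URI" islandora_paged_content_pdf_batch_preprocess \
      --scan_target="$archive" --content_model="$content_model" \
      --parent="$parent" --parent_relationship_pred="$pred" || return 1
  else
    "$DRUSH" -v -u 1 --uri="$SITE_URI" islandora_paged_content_pdf_batch_preprocess \
      --scan_target="$archive" --content_model="$content_model" \
      --parent="$parent" || return 1
  fi

  # Step 2: process the queue of preprocessed items.
  "$DRUSH" -v -u 1 --uri="$SITE_URI" islandora_batch_ingest
}

# Example use (commented out so that nothing runs on load):
#   run_pdf_batch /path/to/archive.zip islandora:newspaperIssueCModel \
#     islandora:my_newspaper isMemberOf
```

Keeping the Drush executable in a variable makes it possible to point DRUSH at a stub that only prints its arguments, which gives a dry run of both commands without touching the site.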

Currently, the ingester has only been tested with the Book and Newspaper solution packs. Other paged content types may require extending and customizing the batch, as noted below.

Customization

Custom ingests can be written by extending any of the existing preprocessors and batch object implementations.

Troubleshooting/Issues

Having problems or solved a problem? Contact discoverygarden.

Maintainers/Sponsors

This project has been sponsored by:

Development

If you would like to contribute to this module, please see our Documentation for Developers and the Developers section on Islandora.ca, and contact discoverygarden.

License

GPLv3
