This module extends the Islandora Batch framework to ingest a ZIP archive or directory containing one or more PDFs, with their associated XML metadata files, as paged content objects and individual page objects.
The ingest is a two-step process:
- Preprocessing: The data is scanned and a number of entries are created in the Drupal database. There is minimal processing done at this point, so preprocessing can be completed outside of a batch process.
- Ingest: The data is actually processed and ingested. This happens inside of a Drupal batch.
This module requires the following modules/libraries:
Install as usual, see this for further information.
The base ZIP preprocessor can be called as a drush script (see drush help islandora_paged_content_pdf_batch_preprocess for additional parameters):
Drush reserved the target parameter as of Drush 7, so use scan_target with Drush 7 and above. For backwards compatibility, target is still supported with Drush 6 and below.
The target (or scan_target) option requires the full path to your archive, starting from the root directory, e.g. /var/www/drupal/sites/archive.zip
The parent_relationship_pred option defaults to isMemberOfCollection. If ingesting newspaper issues into a Newspaper parent object, this must be set to isMemberOf.
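Because the archive path must be a full path from the root directory, it can help to resolve a relative path before handing it to drush. The following is a minimal sketch, not part of the module; the function name absolute_archive_path is hypothetical, and it assumes a POSIX shell.

```shell
#!/bin/sh
# Hypothetical helper (not provided by the module): print the absolute path
# of an archive or directory so it can be passed to the preprocessor.
absolute_archive_path() {
  archive="$1"
  if [ ! -e "$archive" ]; then
    echo "No such file or directory: $archive" >&2
    return 1
  fi
  # Resolve the containing directory to its physical path.
  dir="$(cd "$(dirname "$archive")" && pwd -P)" || return 1
  # Avoid a doubled slash when the archive sits directly under /.
  if [ "$dir" = "/" ]; then
    dir=""
  fi
  echo "$dir/$(basename "$archive")"
}
```

For example, running absolute_archive_path archive.zip from /var/www/drupal/sites would print /var/www/drupal/sites/archive.zip, provided that directory is not reached through a symlink.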
Drush 7 and above:
For books:
drush -v -u 1 --uri=http://localhost islandora_paged_content_pdf_batch_preprocess --scan_target=/path/to/archive.zip --content_model=islandora:bookCModel --parent=islandora:bookCollection
For newspaper issues:
drush -v -u 1 --uri=http://localhost islandora_paged_content_pdf_batch_preprocess --scan_target=/path/to/archive.zip --content_model=islandora:newspaperIssueCModel --parent=islandora:my_newspaper --parent_relationship_pred=isMemberOf
Drush 6 and below:
For books:
drush -v -u 1 --uri=http://localhost islandora_paged_content_pdf_batch_preprocess --target=/path/to/archive.zip --content_model=islandora:bookCModel --parent=islandora:bookCollection
For newspaper issues:
drush -v -u 1 --uri=http://localhost islandora_paged_content_pdf_batch_preprocess --target=/path/to/archive.zip --content_model=islandora:newspaperIssueCModel --parent=islandora:my_newspaper --parent_relationship_pred=isMemberOf
This will populate the queue (stored in the Drupal database) with base entries.
The queue of preprocessed items can then be processed:
drush -v -u 1 --uri=http://localhost islandora_batch_ingest
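The two steps can be wrapped in a small script that picks the right options. The sketch below only prints the commands rather than running them. The function names are hypothetical, the Drush major version is passed in by hand instead of being detected, and choosing the relationship predicate from the content model is an assumption that matches the book and newspaper examples above.

```shell
#!/bin/sh
# Sketch only: prints the drush commands instead of executing them.
# All function names here are hypothetical and not part of the module.

# Drush 7 and above reserve --target, so --scan_target is used there.
target_flag() {
  if [ "$1" -ge 7 ]; then
    echo "--scan_target"
  else
    echo "--target"
  fi
}

# Assumption: newspaper issues go into a Newspaper parent (isMemberOf);
# any other content model keeps the default isMemberOfCollection.
relationship_pred() {
  case "$1" in
    islandora:newspaperIssueCModel) echo "isMemberOf" ;;
    *) echo "isMemberOfCollection" ;;
  esac
}

# Arguments: Drush major version, archive path, content model, parent PID.
preprocess_command() {
  drush_major="$1"; archive="$2"; cmodel="$3"; parent="$4"
  echo "drush -v -u 1 --uri=http://localhost islandora_paged_content_pdf_batch_preprocess" \
    "$(target_flag "$drush_major")=$archive" \
    "--content_model=$cmodel --parent=$parent" \
    "--parent_relationship_pred=$(relationship_pred "$cmodel")"
}

# The ingest step takes no archive-specific options.
ingest_command() {
  echo "drush -v -u 1 --uri=http://localhost islandora_batch_ingest"
}

preprocess_command 8 /path/to/archive.zip islandora:newspaperIssueCModel islandora:my_newspaper
ingest_command
```

Printing the commands first makes it easy to review them before piping the output to sh. For books the script passes the default predicate explicitly, which should behave the same as omitting the option.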
Currently, the ingester has only been tested with the Book and Newspaper solution packs. Other paged content may need to extend and customize the batch as noted below.
Custom ingests can be written by extending any of the existing preprocessors and batch object implementations.
Having problems or solved a problem? Contact discoverygarden.
This project has been sponsored by:
If you would like to contribute to this module, please see the Documentation for Developers and the Developers section on Islandora.ca, and contact discoverygarden.