This repository is a static metadata export of works from Northwestern University Library's (NUL) Digital Collections site. Metadata is exported from the Digital Collections API weekly. Use these collection-based datasets if you do not need the full power of dynamic search and IIIF-based responses. They are useful for anyone who wants to analyze the entire set of NUL's digitized works, import the metadata into other systems for retrieval, or do other kinds of bulk processing.
Data is broken out by format (serialization) and by collection. Metadata for each format lives in its own directory, data/[format].
Files are named slugged-collection-name-[collectionID].[format].gz.
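The naming convention above can be sketched as a small path helper. This is only an illustration: the function name `export_path`, the `root` default, and the sample slug and ID are hypothetical, and the real slugging rules are not described here.

```python
from pathlib import Path


def export_path(slug: str, collection_id: str, fmt: str, root: str = "data") -> Path:
    """Build data/[format]/[slug]-[collectionID].[format].gz.

    Assumes the slug is already in its final, slugged form.
    """
    return Path(root) / fmt / f"{slug}-{collection_id}.{fmt}.gz"


# Hypothetical slug and ID, used only to show the resulting shape.
print(export_path("example-collection", "1234", "json").as_posix())
```

Look up real file names in the data directory rather than deriving them, since the exact slug for a collection may not be predictable from its title.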
json files are a straight export from the API.
csv files are a human-readable version, favoring labels and using pipes to separate multi-value fields.
xml files are a conversion of the json using the Python dict2xml library.
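As a rough sketch of working with the csv exports, the snippet below reads a gzipped csv file with the Python standard library and splits a pipe-separated cell into a list. The helper names are hypothetical, and it assumes the separator is a bare `|` (surrounding whitespace is stripped, so ` | ` also works) and that the files are UTF-8 encoded.

```python
import csv
import gzip


def read_csv_export(path):
    """Read a gzipped csv export into a list of dicts keyed by column header."""
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        return list(csv.DictReader(handle))


def split_multi(value, sep="|"):
    """Split a multi-value cell on the separator, dropping empty pieces."""
    return [part.strip() for part in value.split(sep) if part.strip()]
```

Calling `split_multi` only on columns known to hold multiple values avoids splitting a single value that happens to contain a pipe character.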
Exports run every Sunday at midnight via a GitHub Action. The action uses the nuldc CLI to search for any collection that has an "indexed_at" date later than the last update, and re-exports only those collections. You can check when the metadata was last updated by looking at the commit history. If a fresh export is needed, a committer on this repository can remove the upadated_at file, which forces a full re-export on the next run.
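The incremental selection described above can be illustrated with a short sketch. This is not the action's actual code: the function name and the input shape (a list of dicts with an "indexed_at" key) are assumptions, as is the use of ISO 8601 timestamps that all carry the same kind of timezone information.

```python
from datetime import datetime


def collections_to_export(collections, last_update):
    """Return the collections indexed after last_update.

    last_update is an ISO 8601 string, or None when the timestamp file
    is missing, in which case every collection is selected (full re-export).
    """
    if last_update is None:
        return list(collections)
    cutoff = datetime.fromisoformat(last_update)
    return [
        c for c in collections
        if datetime.fromisoformat(c["indexed_at"]) > cutoff
    ]
```

Treating a missing timestamp as "export everything" mirrors the behavior described for removing the timestamp file.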