This repository contains a live cut of the ArchivesSpace-exported EADs describing finding aids from various NYU-associated special collections. This data is harvested into a Solr index for the finding aids discovery portal.
The EAD publishing tool has a hook that pushes changes here, so this repository stays a single, up-to-date copy of the EADs.
The full documentation for the K8s jobs now lives in a private repository.
A K8s job is triggered every time this repository is updated. This job calls a reindex task, which looks up the previous commit and updates the Solr index with all the changed files.
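As a rough illustration of the "changed files since the previous commit" step, here is a minimal Python sketch. It is not the actual reindex task (that code lives with the private K8s documentation). The function names, the `.xml` extension filter, and the separate handling of deleted files are all assumptions made for the example.

```python
import subprocess


def parse_name_status(output):
    """Split `git diff --name-status` output into (to_index, to_delete).

    Assumption for this sketch: EADs are the files ending in ".xml".
    """
    to_index, to_delete = [], []
    for line in output.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        status = fields[0]
        if status.startswith(("R", "C")):
            # Renames and copies list the old path, then the new path.
            old, new = fields[1], fields[2]
            if status.startswith("R") and old.endswith(".xml"):
                to_delete.append(old)
            if new.endswith(".xml"):
                to_index.append(new)
        elif status == "D":
            if fields[1].endswith(".xml"):
                to_delete.append(fields[1])
        elif fields[1].endswith(".xml"):
            # Added, modified, or type-changed files.
            to_index.append(fields[1])
    return to_index, to_delete


def changed_since_previous_commit(repo_dir="."):
    """Hypothetical helper: diff HEAD against its parent in a local clone."""
    result = subprocess.run(
        ["git", "-C", repo_dir, "diff", "--name-status", "HEAD~1", "HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return parse_name_status(result.stdout)
```

Keeping the parsing separate from the `git` call means the parsing can be checked without a clone. Whether the real task removes deleted EADs from Solr is not stated here, so treat the `to_delete` list as illustrative only.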
In addition to real-time updates, we run a nightly job that reindexes any files changed in commits over the past 24 hours and a weekly job that reindexes files changed in commits over the past 7 days. These jobs serve as a failsafe for any failed rebuilds.
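To see which files a nightly or weekly window would cover, something like the following Python sketch can be run against a local clone. It relies only on standard `git log` options (`--since`, `--name-only`, `--pretty=format:`). The function names are hypothetical, and the actual jobs may select files differently.

```python
import subprocess


def log_command(since, repo_dir="."):
    """Build a `git log` command that prints only the paths touched since `since`."""
    return [
        "git", "-C", repo_dir, "log",
        f"--since={since}",
        "--name-only",
        "--pretty=format:",  # empty format: suppress commit headers
    ]


def unique_paths(log_output):
    """Collapse the log output to unique paths, keeping first-seen order."""
    seen = {}
    for line in log_output.splitlines():
        path = line.strip()
        if path:
            seen.setdefault(path, None)
    return list(seen)


def files_changed_since(since, repo_dir="."):
    """Hypothetical helper, e.g. since="24 hours ago" or since="7 days ago"."""
    result = subprocess.run(
        log_command(since, repo_dir),
        check=True,
        capture_output=True,
        text=True,
    )
    return unique_paths(result.stdout)
```

A file changed in several commits within the window appears only once in the result, so it would be reindexed a single time.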
There is also a job for running a full reindex in case things get out of sync. Note that these full reindex jobs can take up to 5 days.
We can also run single index jobs for a named EAD.
The commits in the range 5a67a80 to bfb2f0e (inclusive) are known to corrupt the state of the repo on case-insensitive filesystems. Note that macOS and Windows filesystems are case-insensitive by default. For full details on this issue, see Jira ticket DLFA-155: Duplicate finding aids and filename collision in findingaids_eads Github repo.
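The kind of collision described above can be checked for with a short Python sketch like the one below, which groups paths that differ only by letter case. The function name is hypothetical, and `str.casefold()` only approximates how a real filesystem compares names, so this is a rough check rather than a guarantee.

```python
from collections import defaultdict


def case_collisions(paths):
    """Return groups of distinct paths that are equal when case is ignored."""
    groups = defaultdict(list)
    for path in paths:
        groups[path.casefold()].append(path)
    # Keep only groups with more than one distinct spelling.
    return [sorted(set(group)) for group in groups.values() if len(set(group)) > 1]
```

One way to get the input is to list the tracked files (for example with `git ls-files`) and pass the resulting lines to `case_collisions`. An empty result means no two tracked paths collide under this approximation.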
The special collections content owners can create issues here to report data errors in the EADs.