workflow
Heidi Frank edited this page Feb 5, 2014
·
6 revisions
KARMS Workflow for processing ACO MARCXML records:
- Retrieve the individual MARCXML files from the "work" folder on GitHub and copy them to the folder [003]_[batchDate]/[003]_[batchDate]_marc_files/ (e.g., COO_20131122/COO_20131122_marc_files/)
- Convert the individual MARCXML files to individual .mrc files using MarcEdit, saving them to the folder [003]_[batchDate]/[003]_[batchDate]_marc_files/processed_files/
- Join the individual .mrc files into a single .mrc file using MarcEdit, saving to filename = [003]_[batchDate]/[003]_[batchDate]_all.mrc
- Convert the single .mrc file to a human-readable .mrk file using MarcEdit
- Separate single .mrk file into 2 separate files - one containing records with OCLC numbers, the other containing records without OCLC numbers
- Extract the OCLC numbers from the first .mrk file (the one containing records with OCLC numbers) using MarcEdit and save them to a .txt file with filename = [003]_[batchDate]/[003]_[batchDate]_all_OCLCnums.txt
- Batch export all of the matching OCLC records from OCLC Connexion, saving to filename = [003]_[batchDate]/[003]_[batchDate]_OCLC_all.mrc
- Batch export only the linked matching OCLC records from OCLC Connexion, saving to filename = [003]_[batchDate]/[003]_[batchDate]_OCLC_linked1.mrc
- Deduplicate the "all" records against the "linked1" records using a Python script to get the set of "unlinked" records, generating filenames = [003]_[batchDate]/[003]_[batchDate]_OCLC_unlinked.mrc and [003]_[batchDate]/[003]_[batchDate]_OCLCnums_unlinked.txt
- Look up each of the OCLC numbers in the unlinked.txt file and link/fix them in OCLC Connexion
- Re-export the fixed set of "unlinked" OCLC records from OCLC Connexion, saving to filename = [003]_[batchDate]/[003]_[batchDate]_OCLC_linked2.mrc
- Combine the linked1 and linked2 sets of records using the MarcEdit Join function, saving to filename = [003]_[batchDate]/[003]_[batchDate]_OCLC_linked_all.mrc
- Compare the quality of the original records from the partner (filename: [003]_[batchDate]/[003]_[batchDate]_all.mrc) to the matched OCLC records (filename: [003]_[batchDate]/[003]_[batchDate]_OCLC_linked_all.mrc)
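The Python script used in the deduplication step is not shown on this page. As a rough illustration only, the sketch below shows one way the "all" versus "linked1" comparison could work. It assumes both exports are binary MARC (ISO 2709) files and that records can be matched on the 001 control field, which is an assumption about where the OCLC number lives in the Connexion export. The function names (`split_records`, `control_number`, `find_unlinked`, `write_unlinked`) and the `prefix` parameter are hypothetical and are not taken from the actual script.

```python
from pathlib import Path

RECORD_TERMINATOR = b"\x1d"  # ends every ISO 2709 record
FIELD_TERMINATOR = b"\x1e"   # ends the directory and every field


def split_records(data: bytes) -> list:
    """Split the bytes of a binary MARC file into individual records."""
    return [chunk + RECORD_TERMINATOR
            for chunk in data.split(RECORD_TERMINATOR) if chunk.strip()]


def control_number(record: bytes) -> str:
    """Return the 001 field of one record, or "" if it has none."""
    base = int(record[12:17])        # leader 12-16: base address of data
    directory = record[24:base - 1]  # 12-byte entries after the leader
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag, length, start = entry[0:3], int(entry[3:7]), int(entry[7:12])
        if tag == b"001":
            field = record[base + start:base + start + length]
            return field.rstrip(FIELD_TERMINATOR).decode("utf-8", "replace").strip()
    return ""


def find_unlinked(all_data: bytes, linked_data: bytes):
    """Return (records, control numbers) present in "all" but not in "linked"."""
    linked_ids = {control_number(r) for r in split_records(linked_data)}
    unlinked = [r for r in split_records(all_data)
                if control_number(r) not in linked_ids]
    return unlinked, [control_number(r) for r in unlinked]


def write_unlinked(batch_dir: Path, prefix: str) -> None:
    """Read the two exports for one batch and write the "unlinked" outputs."""
    all_data = (batch_dir / f"{prefix}_OCLC_all.mrc").read_bytes()
    linked_data = (batch_dir / f"{prefix}_OCLC_linked1.mrc").read_bytes()
    records, numbers = find_unlinked(all_data, linked_data)
    (batch_dir / f"{prefix}_OCLC_unlinked.mrc").write_bytes(b"".join(records))
    (batch_dir / f"{prefix}_OCLCnums_unlinked.txt").write_text(
        "\n".join(numbers) + "\n", encoding="utf-8")
```

For the example batch above, this would be called as `write_unlinked(Path("COO_20131122"), "COO_20131122")`. If the OCLC number needs to be read from a different field, such as 035, only `control_number` would need to change.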
Sample File Structure: