Data Loading Pipeline

Jeremy Nelson edited this page Apr 3, 2023 · 4 revisions

High-level Data Loading Pipeline

flowchart LR
vendorApp([Vendor Management App]) -. Queries REST API? .- daily(Daily DAG)
daily(Daily DAG) --> DynamicDAG1(Gobi YANKEE DAG Run) -.-> FOLIO{{Okapi API}}
daily(Daily DAG) --> DynamicDAG2(Gobi YANKEE_APPR DAG Run) -.-> FOLIO{{Okapi API}}
daily(Daily DAG) --> DynamicDAG3(Gobi YANKEE_EBOOK DAG Run) -.-> FOLIO{{Okapi API}}
daily(Daily DAG) --> DynamicDAG4(Harrassowitz HARRAS_E DAG Run) -.-> FOLIO{{Okapi API}}
daily(Daily DAG) --> DynamicDAG5(Harrassowitz HARRAS_ESUB DAG Run) -.-> FOLIO{{Okapi API}}
daily(Daily DAG) --> DynamicDAG6(Casalini CASALINI_EDAG DAG Run) -.-> FOLIO{{Okapi API}}
daily(Daily DAG) --> DynamicDAG7(Casalini CAS_ILIBRI DAG Run) -.-> FOLIO{{Okapi API}}
daily(Daily DAG) --> DynamicDAG8(Amalivre AMALIV_E DAG Run) -.-> FOLIO{{Okapi API}}
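
The fan-out in the flowchart above, one DAG run per vendor interface, could be planned along the following lines. This is only a sketch in plain Python: the `VendorInterface` class, the `plan_dag_runs` function, and the `data_fetcher` DAG id are hypothetical names, and the interface list is hard-coded here because how the Vendor Management App would be queried is still undecided.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class VendorInterface:
    vendor: str     # vendor name, e.g. "Gobi"
    interface: str  # interface code, e.g. "YANKEE"


# Interfaces taken from the flowchart above. In practice this list would
# come from the Vendor Management App rather than being hard-coded.
INTERFACES = [
    VendorInterface("Gobi", "YANKEE"),
    VendorInterface("Gobi", "YANKEE_APPR"),
    VendorInterface("Gobi", "YANKEE_EBOOK"),
    VendorInterface("Harrassowitz", "HARRAS_E"),
    VendorInterface("Harrassowitz", "HARRAS_ESUB"),
    VendorInterface("Casalini", "CASALINI_EDAG"),
    VendorInterface("Casalini", "CAS_ILIBRI"),
    VendorInterface("Amalivre", "AMALIV_E"),
]


def plan_dag_runs(interfaces, run_date):
    """Return one trigger request (a plain dict) per vendor interface."""
    return [
        {
            "dag_id": "data_fetcher",  # hypothetical per-interface DAG id
            "run_id": f"{item.vendor}_{item.interface}_{run_date.isoformat()}",
            "conf": {"vendor": item.vendor, "interface": item.interface},
        }
        for item in interfaces
    ]


plan = plan_dag_runs(INTERFACES, date(2023, 4, 3))
```

Each dict in `plan` carries what the daily DAG would need to hand to whatever mechanism triggers the individual runs; the trigger mechanism itself is left out of this sketch.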

Example Gobi YANKEE DAG Run

flowchart TD
daily(Daily DAG Run) -- Triggers Gobi DAG Run --> GobiYANKEE(Gobi YANKEE DAG Run)
GobiYANKEE(Gobi YANKEE DAG Run) --> FTPTask[SFTP Task] -- Connects and downloads file --> GOBI(Gobi FTP Site)
FTPTask[SFTP Task] --> MARCBibTask[MARC file Backup]
FTPTask[SFTP Task] --> MARCModTask[MARC Modification]
MARCModTask[MARC Modification] --> MARCBatchFileTask[Create MARC record Batches]
MARCBatchFileTask[Create MARC record Batches] -- Uses Okapi APIs --> FOLIODataImportTasks[FOLIO Data Import]
FOLIODataImportTasks[FOLIO Data Import] -- Generates Report --> ReportTask[Report Task]
FOLIODataImportTasks[FOLIO Data Import] -- REST API --> VendorManagementApp(Updates Vendor Management App)
ReportTask[Report Task] --> EmailUsersTask[Emails Users]
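
As an illustration of the "Create MARC record Batches" step in the flowchart above, the sketch below splits any iterable of records into fixed-size batches. The function name `batch_records` and the batch size are assumptions; the page does not say how large a batch should be or which MARC library reads the file.

```python
from itertools import islice


def batch_records(records, batch_size):
    """Yield lists of at most batch_size records, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(records)
    while True:
        # Take the next batch_size items; an empty list means we are done.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Stand-in for MARC records read from the downloaded file.
batches = list(batch_records(range(7), 3))
```

Because the function only consumes an iterator, it never needs to hold the whole file in memory, which may matter for large vendor loads.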

Open questions:

  • Do we need a daily DAG to query the Vendor Management App?
  • Can the individual DAG runs be scheduled based on Vendor Management App information?
  • Can the Vendor Management App just update an Airflow Connection for each FOLIO Interface when that interface changes?
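
For the last question, one possible route is the Airflow 2 stable REST API, which exposes a `PATCH /api/v1/connections/{connection_id}` endpoint. The sketch below only builds such a request and does not send it. The base URL, the `build_connection_update` helper, the `sftp` connection type, and the example connection id are all assumptions, and authentication is omitted.

```python
import json
import urllib.request

# Hypothetical Airflow base URL; replace with the real deployment.
AIRFLOW_API = "https://airflow.example.org/api/v1"


def build_connection_update(conn_id, host, login, port=22):
    """Build (but do not send) a PATCH request updating one Connection."""
    payload = {
        "connection_id": conn_id,
        "conn_type": "sftp",  # assumed connection type for a vendor interface
        "host": host,
        "login": login,
        "port": port,
    }
    return urllib.request.Request(
        f"{AIRFLOW_API}/connections/{conn_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PATCH",
    )


request = build_connection_update("gobi_yankee", "ftp.example.org", "vendor_user")
```

If this approach were adopted, the Vendor Management App would send a request like this whenever an interface's settings are saved, and the per-interface DAG would look up the Connection by id at run time.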