DCP Technical Architecture
Madison Dunitz edited this page Oct 21, 2019 · 2 revisions
Component Specific Documentation
Component | Description | Repo(s) | API documentation | Charter | Contact | Internal Architecture | Data Model | State | Inputs | Transformation | Outputs | Dependencies | Other important documentation |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Metadata | The Human Cell Atlas (HCA) metadata schemas are JSON-format schemas designed to capture and structure the descriptive scientific metadata associated with HCA datasets. They aim to ensure the FAIRness of the HCA data. | Metadata Schema, Metadata Schema Publisher | N/A | Metadata Charter | Laura Clarke, Norman Morrison, Mark Diekhans | Link | The metadata component produces three kinds of JSON schema: core schemas, which define the core entities (project, biomaterial, protocol, process and file); type schemas, which declare specific subtypes of those entities (a biomaterial can be a donor_organism, specimen_from_organism, cell_suspension, cell_line, organoid or imaged_specimen); and schema modules, which provide attributes for a specific use case (e.g. mouse-specific or 10x-specific fields). Each schema is versioned individually using semantic versioning with major, minor and patch numbers; e.g. biomaterial_core is currently 8.2.0. Which changes trigger which type of version increment is documented in https://github.com/HumanCellAtlas/metadata-schema/blob/master/docs/evolution.md#schema-versioning. The basic rule: the major version changes when a change breaks backwards compatibility, the minor version changes for attribute changes that do not break backwards compatibility, and the patch version changes for documentation changes or bug fixes. Metadata release process: schema changes are made via PRs into develop. Releases from integration to develop do not follow a fixed schedule; they happen on Thursdays and Fridays so as not to interfere with the DCP-wide release schedule. Releases from integration to staging and from staging to prod follow the DCP-wide release schedule. The metadata release process is documented here: https://github.com/HumanCellAtlas/metadata-schema/blob/master/docs/release_process.md#steps-of-the-pre-release-process The DCP-wide release SOP is here: https://allspark.dev.data.humancellatlas.org/dcp-ops/docs/wikis/SOP:%20Releasing%20new%20Versions%20of%20DCP%20Software | This service does not need to track state. | This service does not accept input. | This service does not transform its data. | This service distributes JSON schemas (draft 7). The schemas themselves are stored in git, but they are released via https://schema.humancellatlas.org/ and the publishing process is operated via code in https://github.com/HumanCellAtlas/metadata-schema-publisher | Any other DCP service that collects, processes, stores, queries or presents HCA data will create or read instance data that uses the JSON schemas. | https://github.com/HumanCellAtlas/metadata-schema/blob/master/README.md is a good starting point. It isn't 100% complete; please reach out to Laura, Norman or Mark if you identify clear gaps. |
Ingest | The Ingest Service is responsible for the intake and validation of metadata and data (through the Upload Service) and for persisting them into the Data Storage System (DSS) of the Human Cell Atlas (HCA) Data Coordination Platform (DCP). | Ingest Core, Ingest State Tracking, Ingest Validator, Ingest Client, Ingest UI, Ingest Broker, Ingest Staging Manager, Ingest Exporter, Ingest Deployment | Ingest HAL Browser, Primary Submission, Secondary Submission | Missing | Missing | Missing | Lucid Chart Data Model | Missing | Missing | Missing | Missing | Missing | Missing. Note: Ingest creates the graph immediately from any user-supplied content. It uses the graph to calculate the contents that must be added to any created bundles, and it serializes the graph into links.json by walking through the Ingest API after validation and upon submission. |
Upload | |||||||||||||
DataStore (DSS) | |||||||||||||
Secondary Analysis | |||||||||||||
Azul | |||||||||||||
DataBrowser | |||||||||||||
Matrix Service | |||||||||||||
Query Service | |||||||||||||
Authentication and Authorization |
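
The versioning rule described in the Metadata row's Data Model cell can be illustrated with a short Python sketch. This is not code from the metadata-schema repository: the `Change` enum and the `bump_version` function are hypothetical names introduced here, and resetting the lower-order numbers to zero follows the general semantic versioning convention rather than anything stated on this page.

```python
from enum import Enum


class Change(Enum):
    """Hypothetical change categories paraphrasing the rule in the Metadata row."""

    BREAKING = "breaking"                 # breaks backwards compatibility
    COMPATIBLE_ATTRIBUTE = "attribute"    # attribute change that stays compatible
    DOCS_OR_BUGFIX = "docs_or_bugfix"     # documentation change or bug fix


def bump_version(version: str, change: Change) -> str:
    """Return the next 'major.minor.patch' string for the given kind of change."""
    major, minor, patch = (int(part) for part in version.split("."))
    if change is Change.BREAKING:
        # Major increment; minor and patch reset (semantic versioning convention).
        return f"{major + 1}.0.0"
    if change is Change.COMPATIBLE_ATTRIBUTE:
        # Minor increment; patch resets.
        return f"{major}.{minor + 1}.0"
    # Documentation change or bug fix: patch increment only.
    return f"{major}.{minor}.{patch + 1}"


if __name__ == "__main__":
    # Starting from the biomaterial_core version quoted in the table.
    for change in Change:
        print(change.name, bump_version("8.2.0", change))
```

The authoritative list of which changes count as major, minor or patch remains the evolution.md document linked in the table; the three categories above only mirror the one-sentence summary given there.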