What is this?

This repository includes pipelines that read FHIR-formatted data from a FHIR server (such as HAPI, a GCP FHIR store, or even OpenMRS) and transform it into either a data warehouse based on Apache Parquet files or another FHIR server. It also includes a Python query library that simplifies working with FHIR-based data warehouses.

These tools are intended to be generic and eventually work with any FHIR-based data source and data warehouse. Here is the list of main directories with a brief description of their content:

pipelines/ *START HERE*: Batch and streaming pipelines to transform data from a FHIR-based source to an analytics-friendly data warehouse or another FHIR store.
docker/: Docker configurations for various servers/pipelines.
doc/: Documentation for project contributors. See the pipelines README and wiki for usage documentation.
utils/: Various artifacts for setting up an initial database, running pipelines, etc.
dwh/: Query library for working with distributed FHIR-based data warehouses.
bunsen/: A fork of a subset of the Bunsen project, used to transform FHIR JSON resources into Avro records with a SQL-on-FHIR schema.
e2e-tests/: Scripts for testing pipelines end-to-end.
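
As a rough illustration of what "analytics-friendly" means for the pipelines described above, the sketch below flattens a nested FHIR Patient resource into a single flat row. It is not code from this repository: the function name `flatten_patient` and the output column names are hypothetical, and the actual pipelines write Parquet files using a SQL-on-FHIR schema rather than the simplified columns shown here.

```python
from typing import Any, Dict


def flatten_patient(resource: Dict[str, Any]) -> Dict[str, Any]:
    """Flatten a FHIR Patient resource into one flat row.

    Hypothetical helper for illustration only; the column names
    are assumptions, not the schema used by the pipelines.
    """
    # A Patient may carry several names; use the first one, if any.
    names = resource.get("name") or [{}]
    first_name = names[0]
    return {
        "id": resource.get("id"),
        "family": first_name.get("family"),
        # "given" is a list of strings in FHIR; join it into one column.
        "given": " ".join(first_name.get("given", [])),
        "gender": resource.get("gender"),
        "birth_date": resource.get("birthDate"),
    }


# A minimal Patient resource, made up for this example.
patient = {
    "resourceType": "Patient",
    "id": "example",
    "name": [{"family": "Chalmers", "given": ["Peter", "James"]}],
    "gender": "male",
    "birthDate": "1974-12-25",
}

row = flatten_patient(patient)
print(row)
```

Rows of this shape are what a columnar format such as Parquet stores efficiently, which is why a flattening step of some kind sits between the FHIR source and the warehouse.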

NOTE: This was originally started as a collaboration between Google and the OpenMRS community.
